- 30 Aug, 2021 2 commits
-
-
tavit ohanian authored
-
tavit ohanian authored
-
- 24 Aug, 2021 2 commits
-
-
tavit ohanian authored
-
tavit ohanian authored
-
- 23 Aug, 2021 2 commits
-
-
tavit ohanian authored
-
tavit ohanian authored
-
- 16 Aug, 2021 1 commit
-
-
tavit ohanian authored
-
- 14 Aug, 2021 1 commit
-
-
tavit ohanian authored
-
- 11 Aug, 2021 2 commits
-
-
Masih H. Derkani authored
Since `CopyFileRange` performance is OS-dependant, we cannot guarantee that `ExtractV1File` will keep copies out of user-space. For example, on Linux with sufficiently old Kernel the current behaviour will fall back to user-space copy. Document this on the function so that it is made clear. Improve benchmarking determinism and error messages. The benchmark numbers that Daniel obtained on his laptop, running Linux 5.13 on an i5-8350U with /tmp being tmpfs, are as follows. The "old" results are BenchmarkExtractV1UsingReader, and the "new" are BenchmarkExtractV1File. name old time/op new time/op delta ExtractV1-8 1.33ms ± 1% 1.11ms ± 2% -16.48% (p=0.000 n=8+8) name old speed new speed delta ExtractV1-8 7.88GB/s ± 1% 9.43GB/s ± 2% +19.74% (p=0.000 n=8+8) name old alloc/op new alloc/op delta ExtractV1-8 34.0kB ± 0% 1.0kB ± 0% -97.09% (p=0.000 n=8+8) name old allocs/op new allocs/op delta ExtractV1-8 26.0 ± 0% 23.0 ± 0% -11.54% (p=0.000 n=8+8) So, at least in the case where the filesystem is very fast, we can see that the benefit is around 10-20%, as well as fewer allocs thanks to not needing a user-space buffer. The performance benefit will likely be smaller on slower disks. For the Linux syscall logic, see: - https://cs.opensource.google/go/go/+/refs/tags/go1.16.7:src/internal/poll/copy_file_range_linux.go;drc=refs%2Ftags%2Fgo1.16.7;l=54
-
Masih H. Derkani authored
Implement `ExtractV1File` where the function takes path to a CARv2 file and efficiently extracts its inner CARv1 payload. Note, the implementation only supports CARv2 as input and returns a dedicated error if the supplied input is already in CARv1 format. Implement benchmarks comparing extraction using `Reader` vs `ExtractV1File`. Implement tests that assert in-place extraction as well as invalid input and both v1/v2 input Fixes #207
-
- 10 Aug, 2021 1 commit
-
-
Daniel Martí authored
This allows closing a read-write blockstore without doing the extra work to finalize its header and index. Can be useful if an entire piece of work is cancelled, and also simplifies the tests. Also make ReadOnly error in a straightforward way if it is used after being closed. Before, this could lead to panics, as we'd attempt to read the CARv1 file when it's closed. Both mechanisms now use a "closed" boolean, which is consistent and simpler than checking a header field. Finally, add tests that ensure both ReadOnly and ReadWrite behave as intended once they have been closed. The tests also uncovered that AllKeysChan would not release the mutex if it encountered an error early on. Fix that, too. While at it, fix some now-obsolete references to panics on unsupported or after-close method calls. Fixes #205.
-
- 06 Aug, 2021 1 commit
-
-
Daniel Martí authored
-
- 05 Aug, 2021 3 commits
-
-
Daniel Martí authored
The last commit did a mv. I intended to do a cp.
-
Daniel Martí authored
Go modules has to support many OSs and filesystems, so it completely ignores symlinks when creating module zips. And parts of the Go ecosystem, like pkg.go.dev, consume those zips. So this symlink actually makes the module technically non-OSS, since it completely lacks a LICENSE file when downloaded. pkg.go.dev thus refuses to show its documentation. One option would be to rename the root LICENSE.md file to just LICENSE, since cmd/go treats that file in a special way, including it in subdirectory-module zips. Unfortunately, we do want the ".md" extension, since the file is formatted to outline the double license. As such, simply copy the file. It should be fine as it's mostly static.
-
Masih H. Derkani authored
Add examples link to the README and refine its structure. List features of CARv2. Use stronger language to encourage users to use `v2` over `v0`.
-
- 04 Aug, 2021 2 commits
-
-
Daniel Martí authored
While at it, make v2's license a symlink, so the two stay in sync. The reason we want the v2 module to have its own license is so that its module zip includes the file as well. Besides mentioning v0 and v2, the root README now also links to pkgsite and lists Masih and myself as default maintainers, given that we're the ones that did major work on this library last. Fixes #124. Fixes #179.
-
Daniel Martí authored
It has unintended side effects, as we leak the inner ReadOnly value as well as its methods to the parent ReadWrite API. One unintended consequence is that it was possible to "close" a ReadWrite, when we only wanted to support finalizing it. The resumption tests do want to close without finalization, for the sake of emulating partially-written CARv2 files. Those can hook into the unexported API via export_test.go. If a downstream really wants to support closing a ReadWrite blockstore without finalizing it, two alternatives are outlined in export_test.go.
-
- 03 Aug, 2021 1 commit
-
-
Masih H. Derkani authored
Implement an iterator over CAR blocks that accepts both v1 and v2. The reader only requires io.Reader interface to operate and exposes the version and roots of the underlaying CAR file as fields. Remove `Next` from CARv2 reader now that the iteration functionality is provided using `BlockReader`. Update the benchmarks to reflect the changes and add example usage for `NewBlockReader`. Add tests that assert `BlockReader` behaviour in erroneous and successful cases for both underlying CARv1 and CARv2 payload.
-
- 02 Aug, 2021 1 commit
-
-
Daniel Martí authored
It's technically correct to use panics in these situations, as the downstream user is clearly breaking the contract documented in the API's godoc. However, all these methods return errors already, so panicking is not our only option here. There's another reason that makes a panic unfortunate. ReadWrite.Finalize enables the panic behavior on other methods, which makes it much easier to run into in production even when the code and its tests work normally. Finally, there's some precedent for IO interfaces using errors rather than panics when they are used after closing. For example, os.File.Read returns an error if the file is closed. Note that this change doesn't make panics entirely impossible. The blockstore could still run into exceptional and unexpected panics, such as those caused by memory corruption or internal bugs.
-
- 30 Jul, 2021 1 commit
-
-
Daniel Martí authored
BenchmarkReadBlocks uses carv2.OpenReader with its Roots and Next method. From six runs on a laptop with a i5-8350U and benchstat: name time/op ReadBlocks-8 633µs ± 2% name speed ReadBlocks-8 824MB/s ± 2% name alloc/op ReadBlocks-8 1.32MB ± 0% name allocs/op ReadBlocks-8 13.5k ± 0% OpenReadOnlyV1 uses blockstore.OpenReadOnly with its Get method. The method is used on all blocks in a shuffled order, to ensure that the index is working as intended. The input file lacks an index, so we also generate that. name time/op OpenReadOnlyV1-8 899µs ± 1% name speed OpenReadOnlyV1-8 534MB/s ± 1% name alloc/op OpenReadOnlyV1-8 1.52MB ± 0% name allocs/op OpenReadOnlyV1-8 27.2k ± 0% Both benchmarks use the "sample" v1/v2 CAR files, which weigh about half a megabyte. It seems like a good size; a significantly larger CAR file would make each benchmark iteration take tens or hundreds of milliseconds, making it much slower to obtain many benchmark results, since we want at least thousands of iterations to avoid noise.
-
- 25 Jul, 2021 1 commit
-
-
Masih H. Derkani authored
Implement the mechanism to iterate over blocks in CARv1 or CARv2 via the `carv2.Reader` API. Implement tests both for the iterator and the rest of the reader API. Add test files with corrupt sections etc. Fixes #189
-
- 23 Jul, 2021 2 commits
-
-
Masih H. Derkani authored
Implement a mechanism that allows the user to hook an error handling function to the context passed to `AllKeysChan`. The function is then notified when an error occurs during asynchronous traversal of data payload. Add tests that assert the success and failure cases when error handler is set. Fixes #177
-
Masih H. Derkani authored
The CARv2 implementation uses an internal fork of CARv1 due to upstream dependency issues captured in #104. Propagate the options set in CARv2 APIs for treatment of zero-lenth sections onto internal packages so that APIs using the internal CARv1 reader behave consistently. Use a `bool` for setting the option, since it is the only option needed in CARv1. Update tests to reflect changes. Add additional tests to internal CARv1 package and ReadOnly blockstore that assert option is propagated. Fixes #190
-
- 21 Jul, 2021 2 commits
-
-
Masih H. Derkani authored
When error is not expected in tests, assert that there is indeed no error.
-
Masih H. Derkani authored
Fix a bug in read options where the value of zero-len option is always set to `true` even if the flag passed is false. This meant if the option was present it would have always enabled the zero-len as EOF. Allow the user to specify read options on `ReadOrGenerateIndex`. Pass the options onto downstream reader/generators. Generate a CARv1 file with null padding and check it into `testdata`. In addition check in another such file from ignite just to have two different samples of this case. Enhance index generation tests to expect error when zero-len option is not set, and generate index when it is for files with null padding. Make RW blockstore test use the fixed test files from `testdata` instead of generating every time.
-
- 20 Jul, 2021 3 commits
-
-
Masih H. Derkani authored
Use existing SDK to create temp file in index example.
-
Masih H. Derkani authored
Avoid writing to files in `testdata` through examples. This is because when tests that use those file are run in parallel will fail if files are modified. Instead write to a temporary file in examples, and whenever opening testdata files in `RW` mode, make a temporary copy. Thanks to @mvdan for pointing this out. Fixes #175
-
Daniel Martí authored
Update the tests too, as a few expected whole CIDs. For now, just make them use the option. This required index.Index to gain a new method, GetAll, so that when using whole CIDs, methods like Get can return true if any of the matching indexed CIDs are an exact whole-CID match, and not just looking at the first matching indexed CID. GetAll is akin to an iteration over all matching indexed CIDs, and its callback returns a boolean to say if the iteration should continue. This allows stopping as soon as we're done. We also remove Index.Get, instead replacing it with a helper called GetFirst, which simply makes simple uses of GetAll a single line. We remove the non-specced and unused index implementations, too. They were left in place in case they were useful again, but they haven't been so far, and their code is still in git. Keeping them around just means updating more code when refactoring. While at it, make ZeroLengthSectionAsEOF take a boolean and return an option, just like the other boolean options, for consistency. Fixes #130.
-
- 16 Jul, 2021 12 commits
-
-
Daniel Martí authored
We've been working on CARv2 in the wip-v2 branch for some weeks. In preparation for the first v2 beta, merge all of those commits into master. The commits are mostly as they were originally, minus some minor squashing to clean up the history.
-
Daniel Martí authored
-
Masih H. Derkani authored
Implement examples that show how to open a read-only blockstore, and a read-write blockstore along with resumption from the same file. The example surfaced a race condition where if the `AllKeysChan` is partially consumed and file is closed then the gorutie started by chan population causes the race condition. Locking on Close will make the closure hang. Better solution is needed but choices are limited due to `AllKeysChan` signature. Fixes: - https://github.com/ipld/go-car/issues/124
-
Masih H. Derkani authored
It's two lines of code extra and we already have read from and write to APIs for index. Remove `hydrate` cli since the assumptions it makes about appending index are no longer true. header needs updating for example. Removing to be re-added when needed as part of a propper CARv2 CLI.
-
Masih H. Derkani authored
If appendage is needed there is already `car.AttachIndex`.
-
Masih H. Derkani authored
Rename terminology to match what the spec uses to describe the inner CARv1 payload, i.e. Data payload. Update docs to use CARv1 and CARv2 consistently. Fix typo in Options API. See: - https://ipld.io/specs/transport/car/carv2/
-
Masih H. Derkani authored
Remove the memory intensive CARv2 writer implementation since it needs to be reimplemented anyway and not used. We will release writers if needed when it comes to SelectiveCar writer implementation. Just now it is not clear if we want a CARv2 writer that works with the deprecated `format.NodeGetter`. Add tests for V1 wrapping functionality. Reformat code in repo for consistency using `gofumpt`.
-
Masih H. Derkani authored
Add examples that show how to read and write an index to/from file. Test marshalling and unmarshalling index files. Run `gofumt` on repo.
-
Daniel Martí authored
We've agreed to have unified options, since many will be shared between the root and blockstore packages. Include docs, and update the tests.
-
Masih H. Derkani authored
Improve reader type conversion by checking if type satisfies ReaderAt to avoid unnecessary wrapping. Move io converters into one place. Fixes: - https://github.com/ipld/go-car/issues/145
-
Daniel Martí authored
Fixes #169.
-
Masih H. Derkani authored
Remove the duplicate test CARv1 file that existed both in `testdata` and `blockstore/testdata` in favour of the one in root. Reflect change in tests. Duplicity checked using `md5` digest: ```sh $ md5 testdata/sample-v1.car MD5 (testdata/sample-v1.car) = 14fcffc271e50bc56cfce744d462a9bd $ md5 blockstore/testdata/test.car MD5 (blockstore/testdata/test.car) = 14fcffc271e50bc56cfce744d462a9bd ```
-