- 16 Aug, 2021 1 commit
tavit ohanian authored
- 29 Jul, 2021 1 commit
tavit ohanian authored
- 25 Dec, 2020 1 commit
Daniel Martí authored
As discussed on the issue thread, ipld.Kind and schema.TypeKind are more intuitive, closer to the spec wording, and just generally better in the long run.

The changes are almost entirely automated via the commands below. Very minor changes were needed in some of the generators, and then gofmt.

    sed -ri 's/\<Kind\(\)/TypeKind()/g' **/*.go
    git checkout fluent # since it uses reflect.Value.Kind
    sed -ri 's/\<Kind_/TypeKind_/g' **/*.go
    sed -i 's/\<Kind\>/TypeKind/g' **/*.go
    sed -i 's/ReprKind/Kind/g' **/*.go

Plus manually undoing a few renames, as per Eric's review.

Fixes #94.
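For orientation, here is a minimal sketch of how the renamed identifiers relate after this change. The declarations are illustrative assumptions, not the library's exact source:

    package sketch

    // Data Model kinds: what was ReprKind is now simply Kind.
    type Kind uint8

    const (
        Kind_Invalid Kind = iota
        Kind_Map
        Kind_List
        Kind_Int
        Kind_Link
        // ... remaining Data Model kinds elided
    )

    // Schema-level kinds: what was Kind at the schema layer is now TypeKind.
    type TypeKind uint8

    // A Node reports its Data Model kind via Kind() (formerly ReprKind()).
    type Node interface {
        Kind() Kind
    }

    // A schema Type reports its kind via TypeKind() (formerly Kind()).
    type Type interface {
        TypeKind() TypeKind
    }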
- 16 Dec, 2020 1 commit
Daniel Martí authored
We only supported representing Int nodes as Go's "int" builtin type. This is fine on 64-bit, but on 32-bit, it limited those node values to just 32 bits. This is a problem in practice, because it's reasonable to want more than 32 bits for integers.

Moreover, this meant that IPLD would change behavior if built for a 32-bit platform; it would not be able to decode large integers, for example, when in fact that was just a software limitation that 64-bit builds did not have.

To fix this problem, consistently use int64 for AsInt and AssignInt. A lot more functions are part of this rewrite as well; mainly, those revolving around collections and iterating. Some might never need more than 32 bits in practice, but consistency and portability are preferred. Moreover, many are interfaces, and we want IPLD interfaces to be flexible, which will be important for ADLs.

Below are some GNU sed lines which can be used to quickly update function signatures to use int64:

    sed -ri 's/(func.* AsInt.*)\<int\>/\1int64/g' **/*.go
    sed -ri 's/(func.* AssignInt.*)\<int\>/\1int64/g' **/*.go
    sed -ri 's/(func.* Length.*)\<int\>/\1int64/g' **/*.go
    sed -ri 's/(func.* LookupByIndex.*)\<int\>/\1int64/g' **/*.go
    sed -ri 's/(func.* Next.*)\<int\>/\1int64/g' **/*.go
    sed -ri 's/(func.* ValuePrototype.*)\<int\>/\1int64/g' **/*.go

Note that the function bodies, as well as the code that calls said functions, may need to be manually updated with the integer type change. That cannot be automated, because it's possible that an automated fix would silently introduce potential overflows not being handled.

Some TODOs and FIXMEs for overflow checks are removed, since we remove some now-unnecessary int64->int conversions. On the other hand, the older codecs based on refmt need to gain some overflow-check TODOs, since refmt uses ints. That is okay for now, since we'll phase out refmt pretty soon.

While at it, update codectools to use int64 for token Length fields, so that it properly supports full IPLD integers without machine-dependent behavior and overflow checks. The budget integer is also updated to be int64, since the lengths it uses are now int64.

Note that this refactor needed changes to the Go code generator as well as some of the tests, for the purpose of updating all the code.

Finally, note that the code-generated iterator structs do not use int64 fields internally, even though they must return int64 numbers to implement the interface. This is because they use the numeric fields to count up to a small finite amount (such as the number of fields in a Go struct), or up to the length of a map/slice. Neither of them can ever outgrow "int".

Fixes #124.
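The resulting interface surface looks roughly like this. This is a sketch under assumed shapes (the actual interfaces carry more methods), but the signatures track the sed lines above:

    package sketch

    // Before this change, these signatures used Go's "int", whose width
    // is machine-dependent; now they use int64 everywhere.
    type Node interface {
        AsInt() (int64, error)                 // was: AsInt() (int, error)
        Length() int64                         // was: Length() int
        LookupByIndex(idx int64) (Node, error) // was: LookupByIndex(idx int)
    }

    type NodeAssembler interface {
        AssignInt(int64) error // was: AssignInt(int) error
    }

    // Callers converting back down to int must now check for overflow
    // explicitly rather than truncating silently, e.g.:
    //
    //     n := node.Length()
    //     if n > math.MaxInt32 { return errOverflow }
    //     i := int(n)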
- 01 Dec, 2020 2 commits
Eric Myhre authored
I dearly wish this wasn't such a dark art. But I really want these tests, too.
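For reference, the standard library offers one common way to pin allocation counts in Go tests; whether this commit uses it or a different harness isn't shown here, and the operation under test below is a stand-in:

    package sketch

    import "testing"

    // Asserting an exact allocation count with testing.AllocsPerRun.
    // The closure body here is a placeholder for a small-read operation.
    func TestZeroAllocs(t *testing.T) {
        var sink byte
        buf := []byte("hello")
        allocs := testing.AllocsPerRun(100, func() {
            sink += buf[0] // must not allocate
        })
        if allocs != 0 {
            t.Fatalf("expected 0 allocations, got %v", allocs)
        }
        _ = sink
    }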
Eric Myhre authored
The docs in the diff should cover it pretty well. It's a reader-wrapper that does a lot of extremely common buffering and small-read operations that parsers tend to need.

This emerges from some older generation of code in refmt with similar purpose: https://github.com/polydawn/refmt/blob/master/shared/reader.go

Unlike those antecedents, this one is a single concrete implementation, rather than using interfaces to allow switching between the two major modes of use. This is surely uglier code, but I think the result is more optimizable.

The tests include aggressive checks that operations take exactly as many allocations as planned -- and mostly, that's *zero*.

In the next couple of commits, I'll be adding parsers which use this. Benchmarks are still forthcoming. My recollection from the previous bout of this in refmt was that microbenchmarking this type wasn't a great use of time, because when we start benchmarking codecs built *upon* it, and especially, when looking at the pprof reports from that, we'll see this reader showing up plain as day there, and nicely contextualized... so, we'll just save our efforts for that point.
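A rough sketch of the shape being described, with names and details assumed for illustration rather than taken from the actual API:

    package sketch

    import "io"

    // A single concrete reader-wrapper: buffers an io.Reader and offers
    // the small-read operations parsers lean on, with no interface
    // indirection in the hot path.
    type Reader struct {
        r   io.Reader
        buf []byte
        n   int // bytes of buf currently valid
        off int // read position within buf[:n]
    }

    func NewReader(r io.Reader) *Reader {
        return &Reader{r: r, buf: make([]byte, 4096)}
    }

    // Readn1 returns the next single byte, refilling the buffer as needed.
    func (z *Reader) Readn1() (byte, error) {
        if z.off >= z.n {
            if err := z.fill(); err != nil {
                return 0, err
            }
        }
        b := z.buf[z.off]
        z.off++
        return b, nil
    }

    // fill replaces the buffer contents with fresh bytes from the source.
    func (z *Reader) fill() error {
        n, err := z.r.Read(z.buf)
        z.n, z.off = n, 0
        if n > 0 {
            return nil // surface any error on the next refill instead
        }
        if err == nil {
            err = io.ErrNoProgress
        }
        return err
    }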
- 14 Nov, 2020 6 commits
Eric Myhre authored
These aren't exercised yet -- and this is accordingly still highly subject to change -- but so far in developing this package, the pattern has been: every time I say "maybe this should have X", it's turned out it indeed should have X. So let's just do that and then try it out, and have the experimental code instead of the comments.
Eric Myhre authored
Useful for tests that do deep equality tests on structures. Same caveat about current placement of this method as in the previous commit: this might be worth detaching and shifting to a 'codectest' or 'tokentest' package. But let's see how it shakes out.
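A sketch of the idea; the Token shape here is hypothetical and much smaller than the real one:

    package sketch

    import "fmt"

    // A deliberately tiny stand-in for the token type in question.
    type Token struct {
        Kind rune   // e.g. 's' for string, 'i' for int
        Str  string // valid when Kind == 's'
        Int  int64  // valid when Kind == 'i'
    }

    // String makes failed deep-equality assertions print readably,
    // instead of dumping opaque struct internals.
    func (tk Token) String() string {
        switch tk.Kind {
        case 's':
            return fmt.Sprintf("Token{Str:%q}", tk.Str)
        case 'i':
            return fmt.Sprintf("Token{Int:%d}", tk.Int)
        default:
            return fmt.Sprintf("Token{Kind:%q}", tk.Kind)
        }
    }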
Eric Myhre authored
This is far too useful in testing to reproduce in each package that needs something like it. It's already shown up as desirable again as soon as I start implementing even a little bit of even one codec tokenizer, and that's gonna keep happening. This might be worth moving to some kind of a 'tokentest' or 'codectest' package instead of cluttering up this one, but... we'll see; I've got a fair amount more code to flush into commits, and after that we can reshake things and see if packages settle out differently.
Eric Myhre authored
There were already comments about how this would be "probably" necessary; I don't know why I wavered, it certainly is.
Eric Myhre authored
You can write a surprising amount of code where the compiler will shrug and silently coerce things for you. Right up until you can't. (Some test cases that'll be coming down the commit queue shortly happened to end up checking the type of the constants, and, well. Suddenly this was noticeable.)
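The gist, as a small illustration with assumed names rather than the commit's actual diff:

    package main

    // An untyped constant coerces to whatever type context demands...
    const MapOpen = '{' // untyped rune constant

    type TokenKind uint8

    func takeKind(k TokenKind) {}

    func main() {
        takeKind(MapOpen) // fine: silently coerces to TokenKind

        // ...but anything that inspects the type sees the untyped
        // constant's default type (rune), not TokenKind:
        var x interface{} = MapOpen
        _, isKind := x.(TokenKind) // false!
        _ = isKind

        // Declaring the constant with an explicit type removes the ambiguity:
        // const MapOpen TokenKind = '{'
    }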
Eric Myhre authored
The tokenization system may look familiar to refmt's tokens -- and indeed it surely is inspired by and in the same pattern -- but it hews a fair bit closer to the IPLD Data Model definitions of kinds, and it also includes links as a token kind.

Presence of link as a token kind means if we build codecs around these, the handling of links will be better and most consistently abstracted (the current dagjson and dagcbor implementations are instructive for what an odd mess it is when you have most of the tokenization happen before you get to the level that figures out links; I think we can improve on that code greatly by moving the barriers around a bit).

I made both all-at-once and pumpable versions of both the token producers and the token consumers. Each is useful in different scenarios. The pumpable versions are probably generally a bit slower, but they're also more composable. (The all-at-once versions can't be glued to each other; only to pumpable versions.)

Some new and much-reduced contracts for codecs are added, but not yet implemented by anything in this commit. The comments on them are lengthy and detail the ways I'm thinking that codecs should be (re)implemented in the future to maximize usability and performance and also allow some configurability. (The current interfaces "work", but irritate me a great deal every time I use them; to be honest, I just plain guessed badly at what the API here should be the first time I did it. Configurability should be both easy to *not* engage in, but also easier if you do (and in particular, not require reaching to *another* library's packages to do it!).)

More work will be required to bring this to fruition.

It may be particularly interesting to notice that the tokenization systems also allow complex keys -- maps and lists can show up as the keys to maps! This is something not allowed by the data model (and for dare I say obvious reasons)... but it's something that's possible at the schema layer (e.g. structs with representation strategies that make them representable as strings can be used as map keys), so, these functions support it.
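For flavor, a pared-down sketch of the shape being described. All names here are assumptions for illustration; the real declarations live in the repository:

    package sketch

    // One token kind per Data Model kind, plus Link as a first-class
    // token, so link handling doesn't get bolted on after tokenization.
    type TokenKind uint8

    const (
        TokenKind_MapOpen TokenKind = iota
        TokenKind_MapClose
        TokenKind_ListOpen
        TokenKind_ListClose
        TokenKind_Null
        TokenKind_Bool
        TokenKind_Int
        TokenKind_Float
        TokenKind_String
        TokenKind_Bytes
        TokenKind_Link // absent from refmt's tokens; present here
    )

    type Token struct {
        Kind TokenKind
        // ... one value field per kind elided
    }

    // A pumpable consumer: accepts one token per call, so producers and
    // consumers can be glued together. All-at-once variants can drive a
    // pumpable counterpart, but can't be glued to each other directly.
    type TokenSink interface {
        Step(next *Token) (done bool, err error)
    }

    // A pumpable producer: yields one token per call.
    type TokenSource interface {
        Step() (next *Token, done bool, err error)
    }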