- 25 Jun, 2019 1 commit
-
-
Eric Myhre authored
In thinking about how to make a 'bind' (aka, reflect and atlases) Node implementation, some interesting stuff comes up: despite being all one concrete Node implementation, it needs specialization (for the reflect.Value bound inside, specifically); and 'bind' nodes don't *necessarily* have a schema and a typed.Node associated with them, so the existing comments about specializing using Type info don't actually apply. So! What do? These comments are a tad hypothetical (and have been on my uncommitted working tree for a while, so hopefully they're not *too* stale; I just want to get them in history somewhere rather than keep dancing my patches around them)... but there's almost certainly something to address somewhere in this area.
-
- 29 Mar, 2019 2 commits
-
-
Eric Myhre authored
Including one interesting fix for dagjson. Since json can include whitespace -- and especially since our implementation currently uses prettyprinted json with quite a bit of said whitespace -- it's important to handle it consistently. We had a fun issue here: the json would be emitted with a trailing linebreak (as is generally what you want for printing to a terminal, etc!)... and thus hashed with it. Then, when loading the object, our parser will load exactly every byte needed to parse the object, then stop. Which... will cause it to return right before consuming that trailing linebreak. Which would cause that trailing linebreak to not be fed into the hasher, since we've carefully used a system which tees exactly the bytes consumed by the parser into the hasher. So of course the link hash validation would fail. Woowee. I documented some of these details in an issue on the specs repo (https://github.com/ipld/specs/issues/108). There's not a super clear resolution over there as yet, but there seems to be a general agreement that whitespace should be tolerated, so... let's do so. As of this patch, the dagjson unmarshaller will consume all additional whitespace after finishing consumption of the json object itself. Dagcbor doesn't need a similar fix: there's no possibility of other nonsemantic bytes, so there's nothing to absorb; and if we don't reach the end of the reader... we technically don't *care*: given the same reader over the same data being used to load the same link, we'll behave consistently; and therefore it follows that any additional bytes in the reader are unobservable to our universe. An earlier (and badly broken) draft of this attempted to put the read-to-end behavior in the cidlink package, but in addition to being unnecessary for dagcbor as described above, it also would've been simply *wrong*: the whitespace slurp is specific to dagjson. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
The refmt step functions expose "done" a little more eagerly than this code figured on, so that's been corrected. Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 21 Mar, 2019 1 commit
-
-
Eric Myhre authored
We now have both MapIterator and ListIterator interfaces. Both return key-value (or index-value) pairs, rather than just keys. List iterators may seem a tad redundant: you just loop over the length, right? Well, sure. But there's one place a list iterator shines: selecting only a subset of elements. And indeed, we'll be doing exactly that in the traversal/selector package; therefore, we definitely need list iterators. We might want keys-only iterators again in the future, but at present, I'm deferring that. It's definitely true that we should have iterators returning values as a core feature, since they're likely to be more efficiently supportable than "random" access (especially when we get to some Advanced Layout data systems), so we'll implement those first. Additionally, note that MapIterator now returns a Node for the key. This is to account for the fact that when using the schema system and typed nodes, map keys can be more *specific* types. Such nodes are still required to be kind==ReprKind_String, but string might not be their *preferred* native format (think: tuples serialized as delimiter-separated strings); we need to account for that. (The MapBuilder.Insert method already takes a Node parameter for similar reasons: so it can take *typed* nodes. Node.TraverseField accepting a plain string is the oddball out here, and should be rectified.) Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 19 Mar, 2019 1 commit
-
-
Eric Myhre authored
Having a function called "Kind" return a "ReprKind" was inconsistent. Also, we want to introduce a "Kind" method on `typed.Node` in the future. No logical content to this change: you can safely refactor with sed. Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 16 Mar, 2019 3 commits
-
-
Eric Myhre authored
And fixes to all the encoding systems for checking lengths when they're provided by map and list start tokens. Inconsistencies there are now errors. And some consistency changes across all the encoders to keep the diff of the dag-json system as minimal as possible. (Dag-json needs to refer to the last handful of tokens sometimes when parsing a mapClose, so we keep their values outside of the loop body now.) Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
We now have CID support! You can create links backed by CIDs, and marshal them with dag-cbor; and you can unmarshal cbor data with dag-cbor and expect things with the CID link tag to be parsed into CIDs and exposed as IPLD Links. Yay! (Dag-json is lagging. The parse for those links is... more involved. When supported, it'll similarly have its own unmarshal and marshal just like the ones this diff introduces for dag-cbor.) Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
All of which now explicitly confess their cid-specificness. Some things in the 'repose' draft were *really* twisted; I'm glad that first draft is getting replaced before anything actually used it... For example, NodeBuilderChooser was just a ridiculously misplaced abstraction. When doing a traversal, you have the local type information (if any) already in hand, and can just... pick an appropriate NodeBuilder already. Now that the NodeBuilder is simply a parameter to Link.Load, everything shakes out much, much more clearly as a result. The cidlink package contains all concrete references to Cids. This implements the ipld.Link and ipld.LinkBuilder interfaces... but if you don't import the cidlink package in your program, you won't find any of the cid packages (nor their numerous transitive dependencies) in your dependency set. Multicodecs are now a registry which is confined in scope to the cidlink package. (It's global, but I think in practice this will be fine: it's a plugin system, and there's no good cause for allowing variations in how those magic bytes of cids are interpreted.) There are now dagcbor and dagjson packages for encoding. These explicitly refer to the cidlink package (and register themselves on package init). While these refer to cidlink, you could imagine we might also introduce other encoding packages which *don't*. Finally, note that the dagcbor and dagjson packages are in fact still not done. This is the same logic/completeness they had before this diff... which does not include actual parsing of cids! However, it's now clear where to introduce that and at what scope. (It will probably require more duplication of unmarshalling code than desirable, but, alas, that might simply be the cost of doing business. Dagjson in particular has topologically "interesting" things to handle that I'd be loath to make a sufficiently pluggable unmarshal traversal to support; it would be possible, but likely a noticeable slowdown to continuously check and then promptly disregard the interesting case.) Some work remains, but this is now pretty close to sanity. Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 21 Feb, 2019 1 commit
-
-
Eric Myhre authored
We have both generic marshal and unmarshal -- they should work for any current or future ipld.Node implementation, and for any encoding mechanism that can be bridged to via refmt tokens. Tests are also updated to use builders rather than the ancient "mutable node" nonsense, which removes... I think nearly the last instance of that stuff; maybe we can remove it entirely soon. As when we moved the unmarshal code into its generic form, most of this code already existed and needed only minor modification. Git even correctly detects it as a rename this time since the diff is so small. And as when we moved the unmarshal code, now we also remove the whole PushTokens interface; we've gotten to something better now. Finally we're getting to the point we can look at wiring these up together with all the multicodec glue and get link loading wizardry at full voltage. Yesss. Sooon. Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 20 Feb, 2019 1 commit
-
-
Eric Myhre authored
This unmarshal works for any NodeBuilder implementation, tada! Old ipldfree.Node-specific unmarshal dropped... as well as that entire system of interfaces. They were first-pass stuff, and I think now it's pretty clear that it was barking up the wrong tree, and we've got better ideas to go with now instead. (Also, as is probably obvious from a skim, the old code flipped pretty clearly into the new code.) Turns out refmt tokens aren't a very relevant interface in IPLD. I'm still using them... internally, to wire up the CBOR and JSON parsers without writing those again. But the rest of IPLD is more like a full-on and principled alternative to refmt/obj and all its reflection code, and that's... pretty great. Earlier, I had a suspicion that we would want more interfaces for token handling on each Node implementation directly, and in particular I suspected we might use those token-based interfaces for doing transcription features that flip data from one concrete Node implementation into another. (That's why we had this ipldfree.Node-specialized impl in the first place.) **This turns out to have been wrong!** Instead, now that we have the ipld.NodeBuilder interface standard, that turns out to be much better suited to solving the same needs, and it can do so: - without adding tokens to the picture (simpler), - without requiring tokenization-based interfaces be implemented per concrete ipld.Node implementation (OH so much simpler), - and arguably NodeBuilder is even doing it *better* because it doesn't need to force linearization (and while usually that doesn't matter... one can perhaps imagine it coming up if we wanted to do a data transcription in memory into a Node implementation which has an ordering requirement). So yeah, this is a nice thing to have been wrong about. Much simpler now. Old ipldfree.Node-specialized 'PushTokens' is still around. Not for long, though; it just hasn't finished being ported to the new properly generalized style quite yet. Note, this is not the *whole* story, still, either. For example, I still expect to have an ipldcbor.Node which someday has a *significantly* different set of marshal and unmarshal methods -- it may eschew refmt entirely, and take the (very) different strategy of building a skiplist over raw byte slices! -- that will exist *in addition* to the generic implementations we're doing here and now. More on that soon. Yeah. A lot of interfaces to get lined up, here. Some of them tug in such different directions that picking the right ones to make it all possible seems roughly like solving one of the NP-hard satisfiability problems. (Good thing it's actually with a small enough number of choices that it's tractable; on the other hand, enumerating those choices isn't fast, and the 'verifier' function here ain't fast either, and being a "design" thing, it can only be evaluated on human wetware. So yeah, an NP problem on a tractable domain but slow setup and slow verifier. Sounds about right.) (uh, I'm going to write a book "Design: It's Hard: The Novel" after this.) Tests are patched enough to keep working where they are; I think it's possible that a reshuffle of some of them to be more closely focused on the marshal code rather than the node implementation packages might be in order, but I'm going to let that be a future issue. (Oh, and they did shine a light on one quick issue about MapBuilder initialization, which is also now fixed.) Signed-off-by: Eric Myhre <hash@exultant.us>
-