1. 25 Jun, 2019 1 commit
    • Comments about future needs on nodebuilder. · 11bf0d4e
      Eric Myhre authored
      In thinking about how to make a 'bind' (aka, reflect and atlases) Node
      implementation, some interesting stuff comes up: despite being all
      one concrete Node implementation, it needs specialization (for the
      reflect.Value bound inside, specifically); and 'bind' nodes don't
      *necessarily* have a schema and a typed.Node associated with them,
      so the existing comments about specializing using Type info don't
      actually apply.  So!  What do?
      
      These comments are a tad hypothetical (and have been on my uncommitted
      working tree for a while, so hopefully they're not *too* stale; I just
      want to get them in history somewhere rather than keep dancing my
      patches around them)... but there's almost certainly something to
      address somewhere in this area.
  2. 29 Mar, 2019 2 commits
    • Test that dagcbor and dagjson roundtrip cidlinks. · 7106176e
      Eric Myhre authored
      Including one interesting fix for dagjson.
      
      Since json can include whitespace -- and especially since our
      implementation currently uses prettyprinted json with quite a bit of
      said whitespace -- it's important to handle it consistently.
      
      We had a fun issue here: the json would be emitted with a trailing
      linebreak (as is generally what you want for printing to a terminal,
      etc!)... and thus hashed with it.  Then, when loading the object,
      our parser will load exactly every byte needed to parse the object,
      then stop.  Which... will cause it to return right before consuming
      that trailing linebreak.
      
      Which would cause that trailing linebreak to not be fed into the
      hasher, since we've carefully used a system which tees exactly the
      bytes consumed by the parser into the hasher.
      
      So of course the link hash validation would fail.  Woowee.
      
      I documented some of these details in an issue on the specs repo:
      https://github.com/ipld/specs/issues/108
      There's not a super clear resolution over there as yet, but there seems
      to be a general agreement that whitespace should be tolerated, so...
      let's do so.
      
      As of this patch, the dagjson unmarshaller will consume all additional
      whitespace after finishing consumption of the json object itself.
      
      Dagcbor doesn't need a similar fix: cbor has no room for nonsemantic
      bytes between or after values, so there's nothing to absorb;
      and if we don't reach the end of the reader... we technically don't
      *care*: given the same reader over the same data being used to load
      the same link, we'll behave consistently; and therefore it follows that
      any additional bytes in the reader are unobservable to our universe.
      
      An earlier (and badly broken) draft of this attempted to put the
      read-to-end behavior in the cidlink package, but in addition to being
      unnecessary for dagcbor as described above, it also would've been
      simply *wrong*: the whitespace slurp is specific to dagjson.
      Signed-off-by: Eric Myhre <hash@exultant.us>
    • Encoding tests and fixtures and fixes. · 00a58f7a
      Eric Myhre authored
      The refmt step functions expose "done" a little more eagerly than this
      code figured on, so that's been corrected.
      Signed-off-by: Eric Myhre <hash@exultant.us>
  3. 21 Mar, 2019 1 commit
    • Iterator refactor: entry-based, for map and list. · b84e99cd
      Eric Myhre authored
      We now have both MapIterator and ListIterator interfaces.
      Both return key-value (or index-value) pairs, rather than just keys.
      
      List iterators may seem a tad redundant: you just loop over the length,
      right?  Well, sure.  But there's one place a list iterator shines:
      selecting only a subset of elements.  And indeed, we'll be doing
      exactly that in the traversal/selector package; therefore, we
      definitely need list iterators.
      
      We might want keys-only iterators again in the future, but at present,
      I'm deferring that.  It's definitely true that we should have iterators
      returning values as a core feature, since they're likely to be more
      efficiently supportable than "random" access (especially when we get to
      some Advanced Layout data systems), so we'll implement those first.
      
      Additionally, note that MapIterator now returns a Node for the key.
      This is to account for the fact that when using the schema system and
      typed nodes, map keys can be more *specific* types.  Such nodes are
      still required to be kind==ReprKind_String, but string might not be
      their *preferred* native format (think: tuples which are serialized as
      delimiter-separated strings); we need to account for that.
      (MapBuilder.Insert method already takes a Node parameter for similar
      reasons: so it can take *typed* nodes.  Node.TraverseField accepting
      a plain string is the oddball out here, and should be rectified.)
      Signed-off-by: Eric Myhre <hash@exultant.us>
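To make the entry-based iterator shape concrete, here is a small sketch. The method signatures are an assumption based on the description in the iterator refactor (pairs returned from each step, a Node for map keys); `Node`, `strNode`, `subsetListIterator`, and `collect` are hypothetical toy types, not the real package's.

```go
package main

import "fmt"

// Node is a toy stand-in for an IPLD node; here it only carries a string.
type Node interface{ AsString() string }

type strNode string

func (s strNode) AsString() string { return string(s) }

// MapIterator yields key-value pairs; the key is a Node rather than a
// plain string so that typed keys can be passed through.
type MapIterator interface {
	Next() (key Node, value Node, err error)
	Done() bool
}

// ListIterator yields index-value pairs.
type ListIterator interface {
	Next() (idx int, value Node, err error)
	Done() bool
}

// subsetListIterator visits only the chosen indexes of a list, which is
// the case where a list iterator beats looping over the length.
type subsetListIterator struct {
	items []Node
	picks []int
	pos   int
}

func (it *subsetListIterator) Done() bool { return it.pos >= len(it.picks) }

func (it *subsetListIterator) Next() (int, Node, error) {
	if it.Done() {
		return 0, nil, fmt.Errorf("iterator exhausted")
	}
	idx := it.picks[it.pos]
	it.pos++
	if idx < 0 || idx >= len(it.items) {
		return idx, nil, fmt.Errorf("index %d out of range", idx)
	}
	return idx, it.items[idx], nil
}

// collect drains any ListIterator into "idx=value" strings.
func collect(it ListIterator) ([]string, error) {
	var out []string
	for !it.Done() {
		idx, v, err := it.Next()
		if err != nil {
			return nil, err
		}
		out = append(out, fmt.Sprintf("%d=%s", idx, v.AsString()))
	}
	return out, nil
}

func main() {
	list := []Node{strNode("a"), strNode("b"), strNode("c"), strNode("d")}
	got, err := collect(&subsetListIterator{items: list, picks: []int{1, 3}})
	fmt.Println(got, err) // prints "[1=b 3=d] <nil>"
}
```

Because the consumer only depends on `ListIterator`, a selector can swap in an iterator that skips elements without the calling loop changing at all.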
  4. 19 Mar, 2019 1 commit
    • Naming: ReprKind. · fe099392
      Eric Myhre authored
      Having a function called "Kind" return a "ReprKind" was inconsistent.
      
      Also, we want to introduce a "Kind" method on `typed.Node` in the future.
      
      No logical content to this change: you can safely refactor with sed.
      Signed-off-by: Eric Myhre <hash@exultant.us>
  5. 16 Mar, 2019 3 commits
    • Dag-json marshal and unmarshal. · 10442b3c
      Eric Myhre authored
      And fixes to all the encoding systems for checking lengths when they're
      provided by map and list start tokens.  Inconsistencies there are now
      errors.
      
      And some consistency changes across all the encoders to keep the diff
      of the dag-json system as minimal as possible.  (Dag-json needs to
      refer to the last handful of tokens sometimes when parsing a mapClose,
      so we keep their values outside of the loop body now.)
      Signed-off-by: Eric Myhre <hash@exultant.us>
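The length check mentioned in the dag-json commit can be illustrated with a toy token stream. This is a sketch under assumptions: the `token` type, its `length` field, and the convention that -1 means "no length provided" are hypothetical stand-ins, and the function only handles a flat list.

```go
package main

import "fmt"

type tokenKind int

const (
	listOpen tokenKind = iota
	listClose
	value
)

// token is a hypothetical, simplified token. On listOpen, length is the
// declared entry count, or -1 when the encoding did not provide one.
type token struct {
	kind   tokenKind
	length int
}

// checkList walks a flat list token stream and reports an error when the
// number of values disagrees with the length declared on the open token.
func checkList(toks []token) error {
	if len(toks) == 0 || toks[0].kind != listOpen {
		return fmt.Errorf("expected listOpen")
	}
	declared, seen := toks[0].length, 0
	for _, t := range toks[1:] {
		switch t.kind {
		case value:
			seen++
			if declared >= 0 && seen > declared {
				return fmt.Errorf("list declared %d entries but has more", declared)
			}
		case listClose:
			if declared >= 0 && seen != declared {
				return fmt.Errorf("list declared %d entries but closed after %d", declared, seen)
			}
			return nil
		default:
			return fmt.Errorf("nested containers are not handled in this sketch")
		}
	}
	return fmt.Errorf("unexpected end of tokens")
}

func main() {
	ok := []token{{listOpen, 2}, {value, 0}, {value, 0}, {listClose, 0}}
	short := []token{{listOpen, 3}, {value, 0}, {listClose, 0}}
	fmt.Println(checkList(ok))
	fmt.Println(checkList(short))
}
```

A stream with no declared length passes regardless of how many values it carries, since there is nothing to be inconsistent with.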
    • Dag-cbor marshal/unmarshal. · a274b2db
      Eric Myhre authored
      We now have CID support!  You can create links backed by cids,
      and marshal them with dag-cbor; and you can unmarshal cbor data
      with dag-cbor and expect things with the CID link tag to be parsed
      into CIDs and exposed as IPLD Links.  Yay!
      
      (Dag-json is lagging.  The parse for those links is... more involved.
      When supported, it'll similarly have its own unmarshal and marshal
      just like the ones this diff introduces for dag-cbor.)
      Signed-off-by: Eric Myhre <hash@exultant.us>
    • Repose refactored into several packages. · 57832ad7
      Eric Myhre authored
      All of which now explicitly confess their cid-specificness.
      
      Some things in the 'repose' draft were *really* twisted; I'm glad that
      first draft is getting replaced before anything actually used it...
      
      For example, NodeBuilderChooser was just a ridiculously misplaced abstraction.
      When doing a traversal, you have the local type information (if any) already
      in hand, and can just... pick an appropriate NodeBuilder already.  Now that
      the NodeBuilder is simply a parameter to Link.Load, everything shakes out
      much, much more clearly as a result.
      
      The cidlink package contains all concrete references to Cids.  This implements
      the ipld.Link and ipld.LinkBuilder interfaces... but if you don't import
      the cidlink package in your program, you won't find any of the cid packages
      (nor their numerous transitive dependencies) in your dependency set.
      
      Multicodecs are now a registry which is confined in scope to the cidlink
      package.  (It's global, but I think in practice this will be fine: it's a
      plugin system, and there's no good cause for allowing variations in how
      those magic bytes of cids are interpreted.)
      
      There are now dagcbor and dagjson packages for encoding.  These explicitly
      refer to the cidlink package (and register themselves on package init).
      While these refer to cidlink, you could imagine we might also introduce
      other encoding packages which *don't*.
      
      Finally, note that the dagcbor and dagjson packages are in fact still not
      done.  This is the same logic/completeness they had before this diff...
      which does not include actual parsing of cids!  However, it's now clear
      where to introduce that and at what scope.  (It will probably require
      more duplication of unmarshalling code than desirable, but, alas, that
      might simply be the cost of doing business.  Dagjson in particular has
      topologically "interesting" things to handle that I'd be loath to make
      a sufficiently pluggable unmarshal traversal to support; it would be
      possible, but likely a noticeable slowdown to continuously check and
      then promptly disregard the interesting case.)
      
      Some work remains, but this is now pretty close to sanity.
      Signed-off-by: Eric Myhre <hash@exultant.us>