1. 16 Mar, 2019 2 commits
    • Eric Myhre's avatar
      Begin introduction of new linking interfaces. · 6950c5cf
      Eric Myhre authored
      This is going to be the start of a pretty hectic set of commits.
      
      1. The LinkLoader, LinkBuilder, and LinkContext types currently in
        the traversal package are hereby doomed (and will be deleted by
        the time this branch is ready to land).
      
      2. Cid itself will disappear from almost all remaining concrete uses.
        For example, traversal.TraversalProgress.LastBlock will use Link
        instead of Cid.  (You'll have to cast back to Cid if that detail
        is important to your application!)
      
      3. The 'repose' package is pretty much getting nuked.
      
      Instead: we have these new interfaces.  Link will be abstract.
      
      We'll add a linking package with a subpackage containing implementation
      with Cids.  Our encoding systems will then also live there: this makes
      sense since multicodec is definitely a detail associated with Cids.
      We'll also have dag-cbor and dag-json specific encodings in subpackages
      associated with this whole thing: those will be able to read and reify
      Link instances during their handling of serial data.  This too is
      parsimonious and correct (e.g. dag-json parsing is technically distinct
      from regular json parsing, even if it's experientially close).
      
      And if all goes according to plan, we will -- shockingly -- be able to
      use almost the entirety of the IPLD system... *without* forming an
      explicit import dependency on the Cid packages *until* we directly use
      any link loading features based on Cids.  That'll be neat.
      
      All these changes will be staged across a series of commits because
      the total diff will definitely be something fierce.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      6950c5cf
    • Eric Myhre's avatar
      Move Path to the root package. · d0edb867
      Eric Myhre authored
      Yes, this one has moved about a bunch now.  Hopefully this is the last time.
      It's true and elegant that paths really only emerge as descriptions of traversal
      progress; however, this misses a few other practicalities.
      There are other kinds of traversal (other than the traversal package, whoa!) out
      there: see the typesystem packages, which had grown a custom path implementation
      simply for error reporting messages!
      In general, we seem to want to have Path around for logging and errors, which
      will make it increasingly desirable to have available in the root package when
      we begin to clean up towards strongly typed errors.
      And we also need Path for LinkContext -- and wherever that comes to rest, it
      definitely need to not be an import cycle problem, which it *is* if Path is
      in traversal and we wanted LinkContext to be *anywhere* else.
      All of these point to moving Path back up to the root, and the errors concern
      in particular cinches it.
      
      Drop the Path.traverse method.  As has been noted in its comment for a while
      now, that method wasn't useful for much, having been replaced by features
      in the traversal package.
      
      (Also drop the tests specific to the Path.traverse method.  We should write
      more tests against the features now implemented by travesral.Focus... but at
      this point, it'd be easier to start over.  The tests we're dropping are against
      a different model of traversal (returns rather than visitors) and are also
      built against the old hacky ipldfree mutable model which is deprecated and
      soon to be dropped.)
      
      Also, a small docs fix: drop description of this Path implementation as a
      "merklepath" -- it is not.  These paths are all relative, and do not contain
      an innate understanding of hashed object identifiers at their first segment.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      d0edb867
  2. 13 Mar, 2019 5 commits
  3. 12 Mar, 2019 6 commits
  4. 08 Mar, 2019 1 commit
  5. 04 Mar, 2019 1 commit
  6. 27 Feb, 2019 2 commits
    • Eric Myhre's avatar
      Beginye another pass on docs. · d168547a
      Eric Myhre authored
      Starting with a new index, which enumerates lots of things which shall
      deserve at least one page or section.
      
      Big picture also includes links out to things which shall require their
      own (as yet unwritten) pages.
      
      Old ball-of-mud "dev.md" already being eroded into the new files.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      d168547a
    • Eric Myhre's avatar
      Merge branch 'traversals'. · aaea73ad
      Eric Myhre authored
      Incomplete, but progress.
      
      There are still many many todos in the content, but that branch
      has simply gotten long enough.  And I need to spend some time
      docuemnting (and docs have diverged on master), so it's time to
      reel it in.
      aaea73ad
  7. 21 Feb, 2019 6 commits
    • Eric Myhre's avatar
      Merge branch 'encoding' into traversals · cd841fb1
      Eric Myhre authored
      cd841fb1
    • Eric Myhre's avatar
      Wiring JSON and CBOR to multicodec in repose! Yas! · 749cf35c
      Eric Myhre authored
      Finally.
      
      This is "easy" now that we've got all the generic marshalling
      implemented over Node and unmarshalling onto NodeBuilder:
      we just glue those pieces together with the JSON and CBOR
      parsers and emitters of refmt, and we're off.
      
      These methods in the repose package further conform this to
      byte-wise reader and writer interfaces so we can squeeze them
      into Multicodec{En|De}codeTable... and thence feed them
      into ComposeLink{Loader|Builder} and USE them!
      
      And by use them, I mean not just use them, but use them transparently
      and without a fuss, even from the very high level Traverse and
      Transform methods, which let users manipulate Nodes and NodeBuilders
      with *no idea* of all the details of serialization down here.
      
      Which is awesome.
      
      (Okay, there's a few more todo entries to sort out in the link
      handling stuff.  Probably several.)
      
      But this should about wrap it up with encoding stuff!  I'm going
      to celebrate with a branch merge, and consider the main topic to
      be lifted back up to the Traverse stuff and how to integrate
      cleanly with link loaders and link builders.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      749cf35c
    • Eric Myhre's avatar
      New marshal implementation! Generic. Woo! · f150a81b
      Eric Myhre authored
      We have both generic marshal and unmarshal -- they should work for any
      current or future ipld.Node implementation, and for any encoding
      mechanism that can be bridged to via refmt tokens.
      
      Tests are also updated to use builders rather than the ancient
      "mutable node" nonsense, which removes... I think nearly the last
      incident of that stuff; maybe we can remove it entirely soon.
      
      As when we moved the unmarshal code into its generic form, most of
      this code already existed and needed minor modification.  Git even
      correctly detects it as a rename this time since the diff is so small.
      And as when we moved the unmarshal code, now we also remove the
      whole PushTokens interface; we've gotten to something better now.
      
      Finally we're getting to the point we can look at wiring these up
      together with all the multicodec glue and get link loading wizardry
      at full voltage.  Yesss.  Sooon.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      f150a81b
    • Eric Myhre's avatar
      Remove unused field in fluent.NodeBuilder. · 5f70f2e3
      Eric Myhre authored
      Totally unexported, never touched.  Baleet.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      5f70f2e3
    • Eric Myhre's avatar
      Remove Build from fluent.{Map|List}Builder view. · 0776b93b
      Eric Myhre authored
      Per comments in this diff as well as discussion in previous commit's
      message, there's no need for it to be exported anymore.
      
      In fact, the only possible way it could've been used would've been
      invalid -- calling those methods in the closure body would've caused
      the list/map builder to invalidate itself, and make the library's
      later Build call almost certainly error.  So, it's very much better
      not to have it at exported at all.
      
      This has brought us to an odd result: we could very nearly be dropping
      the fluent.{Map|List}Builder *interfaces* (though of course keeping
      their internal error-into-panic implementations) and just continuing
      to fit the existing ipld.{Map|List}Builder interfaces (but you'd be
      within your rights to ignore all error returns; they'd be redundant).
      Would this be helpful, though?  I dunno.  So let's not.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      0776b93b
    • Eric Myhre's avatar
      Significant changes to fluent.{List|Map}Builder. · f316102c
      Eric Myhre authored
      Two changes are combined here: these builders now work in tandem with
      closures (and call-chaining style is being backed away from entirely);
      and, these new closures also get NodeBuilder values supplied to them.
      
      This correctly prepares for NodeBuilder specializations to be viable
      in the future.  Although this need is not "yet" exercised,
      typed.NodeBuilder won't be able to keep its semantics properly
      without this -- It's similar to the existing comments in Unmarshal code
      which mention how we'll require specializations for typed.Node usage
      in the future around the recursion sites there.  And while this isn't
      exercised yet, getting the API correctly-shaped earlier than later
      will save a lot of refactoring fuss.
      
      I have also just plain been wanting a closure style syntax, because it
      nests nicely and more does the right things with visual alignment.
      So now we have one.  Nice.
      
      The diffs in the test code using this new style of builder should be
      a nice example of how this looks compared to the old API syntax.
      I think those diffs speak for themselves.
      
      (It absolutely would be even nicer if we had a syntax for declaring
      closures in golang that could infer the types of their arguments
      without putting such a heavy textual weight on the closure declaration
      site.  There's no way to do this in Golang currently, but I'm pretty
      certain it would be a feasible language enhancement without much
      in the way of theoretic stretches nor significant complier complexity
      nor additional user complexity.  This might be something to pursue.)
      
      ---
      
      One interesting design point to debate within the closure design here:
      should we really try to 'save' the closure from calling Build?
      (This code does.)
      
      On a first thought, perhaps not: doing the Build call in the library
      code deprives the author of the closure from being able to Recover
      errors from the final Build call if desired.
      Doubly problematically, stack traces from those Build errors would
      contain worrying less meaningful stack traces, since such errors would
      be raised from library code rather than the closure.
      
      On the second thought: both those issues seem easily addressed in
      in actual usage.  The stack trace will still capture the line of the
      `CreateFoo(fn(...` invokation, and that line should be sufficiently
      proximate to user (rather than library) code to be meaningful.
      (There's approximately no reason I can imagine to be passing any
      ListBuildingClosure functions from far away.  If we had ruby syntax,
      this stuff would all be a 'do' block at the end of line, and
      it would be syntactically unsupported to provide from a distance.)
      It's even possible to wrap exactly the `CreateFoo` call in a Recover,
      which makes it possible to get every bit as much control as if we
      made the Build call the closure's responsibility and wrapped it there.
      
      In light of those 'second' thoughts -- and also because having the
      {Map|List}BuildingClosure definitions include an 'ipld.Node' return
      type *in addition* to their already irritating length is a not
      insigificant exacerbation of usage friction -- I'm opting to keep
      the Build call in the fluent library code for now.
      
      ---
      
      (This fix comes here and now because writing additional test fixtures
      for the next-up marshalling code without these QoL improvements was
      just not any fun, and the more critical future issues jumped in front
      of my eyes at roughly the same time.)
      
      ---
      
      Also worth noting in this diff: the large REVIEW comment in the
      MapBuildingClosure docs about having a NodeBuilder for keys.
      
      I haven't added all of the notes I have on the strings-vs-Nodes
      questions around map key APIs to tracking in git, and probably should;
      it's turned out more fiddly than expected.  This comment here probably
      already gives a fair insight into both how and why and how much so.
      
      The NodeBuilder-for-keys param is included in MapBuildingClosure
      for now out of an abundance of caution.  But this is indeed worthy
      of review and might be elided in the future.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      f316102c
  8. 20 Feb, 2019 2 commits
    • Eric Myhre's avatar
      New unmarshal implementation! Generic. Woo! · be01e1e5
      Eric Myhre authored
      This unmarshal works for any NodeBuilder implementation, tada!
      
      Old ipldfree.Node-specific unmarshal dropped... as well as that entire
      system of interfaces.  They were first-pass stuff, and I think now it's
      pretty clear that it was barking up the wrong tree, and we've got better
      ideas to go with now instead.  (Also, as is probably obvious from a skim,
      the old code flipped pretty clearly into the new code.)
      
      Turns out refmt tokens aren't a very relevant interface in IPLD.
      I'm still using them... internally, to wire up the CBOR and JSON
      parsers without writing those again.  But the rest of IPLD is more
      like a full-on and principled alternative to refmt/obj and all its
      reflection code, and that's... pretty great.
      
      Earlier, I had a suspicion that we would want more interfaces for token
      handling on each Node implementation directly, and in particular I
      suspected we might use those token-based interfaces for doing transcription
      features that flip data from one concrete Node implementation into another.
      (That's why we had this ipldfree.Node-specialized impl in the first place.)
      **This turns out to have been wrong!**  Instead, now that we have the
      ipld.NodeBuilder interface standard, that turns out to be much better suited
      to solving the same needs, and it can do so:
      
      - without adding tokens to the picture (simpler),
      
      - without requiring tokenization-based interfaces be implemented per
      concrete ipld.Node implementation (OH so much simpler),
      
      - and arguably NodeBuilder is even doing it *better* because it doesn't
      need to force linearization (and while usually that doesn't matter... one
      can perhaps imagine it coming up if we wanted to do a data transcription
      in memory into a Node implementation which has an orderings requirement).
      
      So yeah, this is a nice thing to have been wrong about.  Much simpler now.
      
      Old ipldfree.Node-specialized 'PushTokens' is still around.  Not for long,
      though; it just hasn't finished being ported to the new properly generalized
      style quite yet.
      
      Note, this is not the *whole* story, still, either.  For example, still
      expect to have an ipldcbor.Node which someday has a *significantly* different
      set of marshal and unmarshal methods -- it may eschew refmt entirely,
      and take the (very) different strategy of building a skiplist over raw
      byte slices! -- that will exist *in addition* to the generic implementations
      we're doing here and now.  More on that soon.
      
      Yeah.  A lot of interfaces to get lined up, here.  Some of them tug in such
      different directions that picking the right ones to make it all possible
      seems roughly like solving one of the NP-hard satisfiability problems.
      (Good thing it's actually with a small enough number of choices that it's
      tractable; on the other hand, enumerating those choices isn't fast, and
      the 'verifier' function here ain't fast either, and being a "design" thing,
      it can only be evaluated on human wetware.  So yeah, an NP problem on a
      tractable domain but slow setup and slow verifier.  Sounds about right.)
      
      (uh, I'm going to write a book "Design: It's Hard: The Novel" after this.)
      
      Tests are patched enough to keep working where they are; I think it's
      possible that a reshuffle of some of them to be more closely focused on
      the marshal code rather than the node implementation packages might be
      in order, but I'm going to let that be a future issue.  (Oh, and they
      did shine a light on one quick issue about MapBuilder initialization,
      which is also now fixed.)
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      be01e1e5
    • Eric Myhre's avatar
      Fixing MapBuilder error exposure. · eafc200a
      Eric Myhre authored
      This is the first commit going down a long and somewhat dizzying prerequisite tree:
      
      - For graphsync (an out-of-repo consuming project) we need selectors
      - For Selectors we need traversal implemented
      - For Traversal implementations we need link loaders [‡]
      - For link loading we need all deserialization implemented
      - (and ideally, link creation is done at the same time, so we don't get surprised by any issues with the duals later)
      - and it turns out for deserialization, we now have some incongruities with the earlier draft at MapBuilder...
      
      So we're all the way at bugfixes in the core ipld.MapBuilder API.  Nice.
      
      ([‡] Some of those jumps are a little strained.  In particular, traversal doesn't
      *in general* need link loaders, so we might choose a very different implementation
      order (probably one that involves me having a lot less headaches)... *except*,
      since our overall driver for implementation order choices right now is graphsync,
      we particularly need traversals crossing block boundaries since we're
      interested in making sure selectors do what graphsync needs.  Uuf.)
      
      What's the MapBuilder design issue?  Well, not enough error returns, mostly.
      We tried to put the fluent call-chaining API in the wrong place.
      
      Why is this suddenly an issue now?  Well, it turns out that properly genericising
      the deserialization needs to be able to report error states like invalid
      repeated map keys promptly.
      
      Wait, didn't we even *have* deserialization code before?  Yes, yes we did.
      It's just that that draft was specialized to the ipldfree.Node implementation...
      and so it doesn't hold up anymore when we need it to work in general traversal.
      
      Okay, so.  That's the stack depth.
      
      With all that in mind...
      
      This diff adds more error return points to ipld.MapBuilder, and maintains the
      fluent variant more closely matching the old behavior (including the
      call-chaining style), and fixes tests that relied on this syntax.
      
      Duplicate keys rejection is also in this commit.  I thought about splitting it
      into further commits, but it's so small.  (We may actually need more work in
      the future to enable Amend+(updating)Insert, but that's for later; perhaps
      an Upsert method; whatever, I dunno, out of scope for thought at the moment.)
      
      And then we'll carry on, one step at a time, from there.  Whew.
      
      ---
      
      Sidebar: also dropping MapBuilder.InsertAll(map[Node]Node) from the interface.
      I think this could be better implemented as a functional feature that works
      over a MapBuilder than being a builtin, and we should prefer a trim MapBuilder.
      And might as well drop it now rather than bother fixing it up just to remove later.
      
      ---
      
      ipld.ListBuilder also updated to no longer do a call-chaining style API, while
      fluent.ListBuilder continues to do so.  This is mainly for consistency;
      we don't have the same potential for mid-build error conditions for lists
      as we do with maps, but ipld.ListBuilder and ipld.MapBuilder should be similar.
      
      ---
      
      Aaaaand one more!  NodeBuilder.{Create,Append}{Map,List}() have ALL been
      updated to also return errors.  Previously, the append methods had an error
      state if you used them when the NodeBuilder was bound to a predecessor node
      of an unmatching type, but they just swallowed them into the builder and
      regurgitated them (much) later; we're no longer doing this.  Additionally,
      it's occurred to me that *typed* builders -- while not so much a thing, yet,
      certainly a thing that's coming -- will even potentially error on CreateMap
      and CreateList methods, according to their type constraints.  So, jump that now.
      
      ...
      
      Yeah, basically a whole tangle of misplaced optimism about error paths in
      NodeBuilder and its whole set of siblings has been torn through at once here.
      Bandaid ripping sound.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      eafc200a
  9. 14 Feb, 2019 3 commits
    • Eric Myhre's avatar
      Introduce repose.NodeBuilderChooser. · dfd17c2e
      Eric Myhre authored
      Mea cupla for all these absolutely terrible placeholder names.
      
      Previous comment on MulticodecDecoder about it probably currying a
      NodeBuilder inside itself was wrong.  (Message of the previous commit
      explores this in further detail already, so I won't repeat that here.)
      
      We'll be making concrete implementations of MulticodecDecoder and
      MulticodecDecoder in this library for at least JSON and CBOR (for
      "batteries-included" operation on the most common uses); other
      implementations (like git, etc) will of course be possible to register,
      but probably won't be included in this repo directly (for the sake of
      limiting dependency sprawl).  When we get to them, those two core
      implementations should be a good example of the probing-for-interfaces
      described in the comments here.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      dfd17c2e
    • Eric Myhre's avatar
      Link loaders and their dual. · a50962ae
      Eric Myhre authored
      This is a first and incomplete draft, and also with placeholder names.
      
      The whole topic of loading links and writing nodes out to a hashable
      linkable form is surprisingly parameterizable.
      On the one hand, great, flexibility is good.
      On the other hand, this makes it intensely difficult to provide a
      simple design that focuses on the essense of user wants and needs
      rather than getting bogged down in a torrent of minutiae.
      
      We want to be able to support a choice of abstraction in storage,
      for example -- getting and putting of raw bytes -- without forcing
      simultaenously engaging with and reimplementing the usage of multihash,
      the encoding and multicodec tagging, and so on.
      This is easier said than done.
      
      The attempt here is to have some function interfaces which do the
      aggregated work -- and these are what we'll actually use in the rest of
      the codebase, such as in the TraversalConfig for example -- and have
      some functions to build them out of closures that bind together all
      the other details of codecs and hashing and so forth.
      In theory you could build your own LinkLoader entirely from whole
      cloth; in practice, we expect to use ComposeLinkLoader in almost 100%
      of all cases.
      
      I think this is on the overall right track, but there's at least a few
      goofs in this draft: specifically, at the moment, somehow the pick of
      Node impl got kicked to all the way inside the MulticodecDecoder.
      This is almost certainly wrong: we *absolutely* want to support users
      picking different Node implementations without having to reconfigure
      the wiring of marshal and unmarshal!  For example: we need to be able
      to load things into typed.Node implementations with concrete types
      that were produced by codegen; and we don't want to force rewriting
      the cbor parser every time a user needs to do this for a new type!
      Something further is required here.  We'll iterate in coming commits.
      
      The multicodec tables are an attempt to solve that parameter in a
      fairly straightforward plugin-supporting way.  This is important
      because IPLD has support for some interesting forms of serializaton
      such as "git" -- and while we want to support this, we also absolutely
      do not want to link that as a strict essential dependency of the core
      IPLD library.  Where exactly this is going should also become clearer
      in future commits -- and remember, as described in the previous
      paragraph, at least one thing is completely wrong here and needs
      a second round of draft to unkink it.
      
      There are some inline comments about deficiencies in the multihash
      interfaces that are currently exported.  Some improvements upstream
      would be nice; for now, I'm simply coding as if streaming use *was*
      supported, so everything will be that much easier to fix when we get
      the upstream straightened out.
      
      The overall number of parameters in LinkBuilder is still pretty
      overwhelming, frankly, but I don't know how to get it down any further.
      
      The saving grace of all this is that when it's put to work in the
      traversal package, things like traversal.Transform will simply stash
      the metadata for CID and multicodec and multihash from any links
      traversed... and then if an update is propagated through, it'll just
      pop that metadata back up for the saving of the new "updated" Node.
      The whole lifetime of that metadata is on the stack and the user
      never should be bothered by it as long as they're doing "updates".
      (Users creating *new* objects get the full facefull of choices,
      of course.  I don't know if anything can be done about that.)
      
      It's possible that the whole
      "multicodecType uint64, multihashType uint64, multihashLength int"
      suite of arguments to LinkBuilder should be conveyed in a tuple;
      it's not really likely that they'll ever vary independently, and in
      general whole applications have probably picked one set of values
      and will stick with it application-wide.  The `cid.Prefix` struct
      even matches this already.  But it's... not really clear what's
      going on in that package, frankly.  There seem to be several competing
      variations on builder patterns and I've got no idea which one we're
      "supposed" to be using.  Therefore, I'm not using cid.Prefix for
      this purpose until some more conversations are had about that.
      
      Also note ActualStorer (placeholder name) and StoreCommitter
      as a pairing of functions.  Previous generations of library have
      conjoined this, based on assumption that we'll have "blocks", and
      those are small enough that it's "fine" to have them completely
      in memory, and do a hashing pass on them before beginning to write.
      This is nonsense and the buck stops here.  A two-phase operation
      with commit to a hash at the *end* is a strictly more powerful
      model -- it's easy to implement buffering and all-at-once write
      when you have a two-phase interface; it's impossible to implement
      a useful two-phase interface when you're forced to start from
      an overbuffered single-step interface -- and lends itself better
      to efficiency and a lower high-water-mark for memory usage.
      Making this choice here in the IPLD libraries might make it
      moderately harder to connect this to existing IPFS blockstore code...
      but it'll also make it easier to write other correct storage and
      loaders (think: opening a single local file with a temp name, and
      atomically moving it to a correct CAS location in the commit call),
      as well as eventually be on the right path when we start fixing other
      IPFS library components to be more streaming friendly.
      
      tl;dr design work.  It's hard.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      a50962ae
    • Eric Myhre's avatar
      Outline of Traverse* funcs; comment about link loaders in TraversalConfig;... · 67abbce4
      Eric Myhre authored
      Outline of Traverse* funcs; comment about link loaders in TraversalConfig; confession of working draft of Selector interface; fixed typo in AdvVisitFn return types.
      
      Yes, that's lot of things heaped in one commit.
      
      Yes, (unfortunately,) they're all related.  Almost the entirety of this
      has to manifest *at once*, all seamlessly connected, for any part of
      it to get off the ground.
      
      Note how very much is handwaved in the comments about TraversalConfig
      having linkloader tools of some kind.  There's a lot to unpack there.
      Getting it to work well, and let users of the library supply functions
      that customize the parts that are interesting to the user, while *not*
      getting them hamstrung by a bunch of details about multicodecs and
      multihashing (unless they want to!)... requires careful design work.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      67abbce4
  10. 12 Feb, 2019 5 commits
    • Eric Myhre's avatar
      Merge branch 'docs-re-migrations' · f2bc16e3
      Eric Myhre authored
      f2bc16e3
    • Eric Myhre's avatar
      Moremore schema migration prose. · 94cf517c
      Eric Myhre authored
      I think this is as much as I can write about this for now.
      
      Regarding that last bit about not having total migration magic:
      I'd certainly be neato to offer more auto-migration tools, based on
      perhaps a "patch"ing approach as outlined in
      https://github.com/ipld/js-ipld/issues/66#issuecomment-266345127 ,
      or on generalized recursion schemes, or a combination.
      However... that's a tad downstream of the present ;)
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      94cf517c
    • Eric Myhre's avatar
      More schema migration prose. · deeeacc5
      Eric Myhre authored
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      deeeacc5
    • Eric Myhre's avatar
      49ebef9b
    • Eric Myhre's avatar
      traversal as a package! Big diff; docs!, etc. · 69f89f4b
      Eric Myhre authored
      All traversal/transform/selector/path code to date got a rethink.
      
      For the longest time, I was trying to keep all the traversal code in
      the main root IPLD package, thinking/hoping that the number of relevant
      functions would be minimal enough that this would be fine.
      
      Somewhere around the time where I conceded that yes, we *do* need
      separate methods for writable and readonly paths (it's true that
      readonly is a degenerate case of writable -- but they have different
      performance characteristics!), I gave up: the number of functions in
      play is too high to heap alltogether in the root package namespace.
      
      This leads to a number of other surprising results: namely, that
      ipld.Path *isn't* -- it's actually traversal.Path!  We *don't use*
      paths anywhere else except as guidance to some forms of traversal,
      and logs and progress markers during traversal.  Huh!
      Despite being initially surprising, I now regard this as a *good*
      code smell.
      
      ipld.Traversal as an interface and ipld.Path.Traverse are dropped.
      These turned out to be the wrong level of abstraction:
      it's both missing things, and not able to return enough things.
      By the time we would fix both, it's not an abstraction anymore.
      
      Path.Traverse didn't know how to use a link loader func -- and
      that's correct; it shouldn't -- and it also didn't retain and
      return a stack of Nodes it traversed -- which again, it shouldn't,
      because in general that would be useless overkill and resource waste...
      it's just that both these things are essential in some use cases.
      So!  Cut that gordian knot.
      
      The Focus/FocusTransform/Traverse/TraverseTransform methods will now
      do all recursion work themselves, internally.  The transform family
      will keep stacks of nodes encountered along the way, to aid in the
      (presumably) coming tree re-build; the focus family won't.
      All of these methods will be able to support link loading, though
      implementation of that is still upcoming.
      
      There's a few more fields in TraversalProgress, also preparing for when
      we have automatic link loading and traversal: we'll want to be able
      to inform the user's callbacks about load edges.  (I'm not a fan of the
      term 'block' here, but running with it for the moment.)
      
      TraversalConfig is expected to grow.  Namely, to include link
      loaders... and subsequently, probably block writers, as well!  This is
      a big subject, though.  It'll probably take several commits and
      possibly more than one iteration over time to hammer out a good API.
      
      One thing that's interesting to note that neither TraversalConfig nor
      TraversalProgress will grow to contain certain things: for example,
      a map of "seen" CIDs when doing link traversal.  Why?  Well, if you've
      been mentally modelling things as a graph, this is surprising; surely
      any graph traversal needs to remember a set of already-visited nodes.
      The trick is... we're *not* doing a graph traversal -- not exactly!
      Since our visitor gets a (Node,Path) *tuple*... it would be wrong to
      memoize on anything other than the complete tuple; and the Path part
      of the tuple means we're actually doing a *tree* walk rather than
      a graph walk.  (And of course, since we're operating on DAGs,
      termination guarantees are already a non-issue.)  Tada; no "seen" set.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      69f89f4b
  11. 08 Feb, 2019 1 commit
    • Eric Myhre's avatar
      Updated Traversal; introduce TraversalProgress. · 3aa48459
      Eric Myhre authored
      The policy on partial progress results is now explicitly "no".
      It wasn't useful.
      Tests have been updated accordingly.  (The relevant tests *do* still
      still assert that the traversal errors happen where they should; it's
      just via the... well, the error value.  Sense, neh?)
      
      We're introducing a TraversalProgress struct.  This is roughly
      following "Option 2" from the earlier "this composition is imperfect!"
      lamentation block: it allows us to keep carrying Path from existing
      moves across the graph at all times.
      
      (I suspect we're going to incur somewhat unfortunate memory allocation
      thrash in large-scale operation because of all these immutable Path
      objects, which (ultra)unfortunately can't even reuse the n+1 length
      arrays when their prior sibling already allocated them... However,
      I care a lot more about getting the composability and ergonomics of
      these interfaces right first; and we can discuss how to hack shortcuts
      back in later.  (Perhaps the more visit-oriented functions will do
      some fancy footwork, and combine it with warnings that the Path objects
      handed the visitor are only valid for its scope.  This is sharp-edged,
      but seems to be common in practice in this kind of library; basically
      every performant git library I've ever seen has caveats like this on
      its iterators, for example.  Anyway!  Not doing that yet.))
      
      Context is coming along for the ride on TraversalProgress!
      This is discussed in another one of the large comment blocks that's
      disappearing in this commit.
      
      Newly exported Path.Join method.  Does what it says on the tin.
      
      Usage of this whole new TraversalProgress system is limited: we're just
      cutting away the incomplete cruft of first-draft functions in this
      diff whenever they got in the way, and newly designed
      Traverse/Visit/Transform/etc functions which correctly use
      TraversalProgress will be coming up in future diffs.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      3aa48459
  12. 06 Feb, 2019 3 commits
    • Eric Myhre's avatar
      Some comments on Traversal design. · c209676a
      Eric Myhre authored
      Revision needed.  An idea I earlier discarded as too complex seems both
      less complex and more waranted after sleeping on it.
      
      Dunno if I'll get there tonight, but want this pushed for conversation.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      c209676a
    • Eric Myhre's avatar
      fluent.Node.Keys should panic if errors on route. · 9c13d7d6
      Eric Myhre authored
      Like all other methods on fluent.Node that panic the instant a concrete
      type of anything is asserted outside of traversal attempts.
      
      There's not much sense in holding off until trying to start an
      iteration that's most certainly going to fail; and we do assume that
      anyone making the iterator intends to use it.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      9c13d7d6
    • Eric Myhre's avatar
      ipldfree.NodeBuilder, fluent.NodeBuilder, tests! · 1a6107b5
      Eric Myhre authored
      We now have nice, immutable nodes... as soon as we deprecate and remove
      the MutableNode stuff.
      
      All the tests pass.
      
      A few methods for bulk update are still stubbed... as is delete;
      seealso the comment about that in the API file.
      
      NodeBuilders needed replication in the fluent package as well.
      This is perhaps a tad frustrating, as well as borderline surprising
      (because "how can sheer memset have errors anyway?"), but the ability
      for node building to reject invalid calls will become relevant when
      typed.Node is in play and has to enforce type rules while still
      supporting the generic methods.
      
      The builder syntax for maps and lists is based on chaining.
      This works... okay.  I'm not as satisfied with the visual effect of it
      as I'd hoped.  There'll probably be more experiments in this area.
      Things nest -- critical success; but the indentation level doesn't
      quite match what we'd contemporarily consider natural flow.
      
      But look at the diff between the mutableNode test specs and the new
      immutable&fluent nodeBuilder test specs!  The old ones were nuts:
      temp variables everywhere for construction, total visual disorder, etc.
      The new NodeBuilder is much much better for composition, and doesn't
      need named variables to save fragments during construction.
      More refinement on map/list builders still seems possible, but it's
      moving in the right direction.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      1a6107b5
  13. 05 Feb, 2019 3 commits
    • Eric Myhre's avatar
      Test specs for nested map unmarshal. · 5d19d364
      Eric Myhre authored
      Fortunately, that just already worked fine.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      5d19d364
    • Eric Myhre's avatar
      fluent support for KeysIterator, Length, etc. · 0d61265c
      Eric Myhre authored
      Allows uncommenting some fixme's in some tests.
      
      Also, changed free.keyIterator to be unexported.  Exporting that was
      a typo; there's simply no reason for that to be exported.
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      0d61265c
    • Eric Myhre's avatar
      ipld.NodeUnmarshaller func interface; 'free' impl. · 35323ff0
      Eric Myhre authored
      Hej, we can finally unmarshal things again.  Plug a cbor parser
      in from the refmt library and go go go!
      
      This doesn't use any of the mutablenode stuff because, as remarked
      upon a (quite a) few commits back, we want to replace that with
      NodeBuilder interfaces; but, furthermore, we actually don't *need*
      those interfaces for anything in an unmarshalling path (because
      I don't expect we'll seriously need to switch result Node impl
      in mid-unmarshalling stream); so.
      
      You can imagine that the cbor.Node system will have a *very* different
      implementation of this interface (and probably another method that
      doesn't match this interface at all, and is more directly byte stream
      based); but that's for later work.
      
      And tests!
      Signed-off-by: default avatarEric Myhre <hash@exultant.us>
      35323ff0