1. 13 Mar, 2019 1 commit
    • Update deps. · a64d28cd
      Eric Myhre authored
      A base32 library moved to a different GitHub org and changed its name.
      
      One small change to repose (refmt added a param to cbor decoders).
      
      The other updates have no impact.
      Signed-off-by: Eric Myhre <hash@exultant.us>
  2. 21 Feb, 2019 1 commit
    • Wiring JSON and CBOR to multicodec in repose! Yas! · 749cf35c
      Eric Myhre authored
      Finally.
      
      This is "easy" now that we've got all the generic marshalling
      implemented over Node and unmarshalling onto NodeBuilder:
      we just glue those pieces together with the JSON and CBOR
      parsers and emitters of refmt, and we're off.
      
      These methods in the repose package further conform this to
      byte-wise reader and writer interfaces so we can squeeze them
      into Multicodec{En|De}codeTable... and thence feed them
      into ComposeLink{Loader|Builder} and USE them!
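
To make the table idea concrete, here is a minimal sketch, not the actual repose code. It assumes hypothetical `Decoder`/`Encoder` function shapes and table types, and it swaps in the standard library's `encoding/json` for refmt so that it runs on its own. The code 0x0129 is the multicodec table entry for dag-json.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// Decoder and Encoder are hypothetical byte-wise function shapes;
// the real signatures in repose may differ.
type Decoder func(r io.Reader) (interface{}, error)
type Encoder func(v interface{}, w io.Writer) error

// DecodeTable and EncodeTable map a multicodec code to a codec function.
type DecodeTable map[uint64]Decoder
type EncodeTable map[uint64]Encoder

// codeDagJSON is the multicodec code for dag-json.
const codeDagJSON = 0x0129

// decodeJSON stands in for "refmt parser + unmarshal onto a NodeBuilder".
func decodeJSON(r io.Reader) (interface{}, error) {
	var v interface{}
	if err := json.NewDecoder(r).Decode(&v); err != nil {
		return nil, err
	}
	return v, nil
}

// encodeJSON stands in for "marshal a Node + refmt emitter".
func encodeJSON(v interface{}, w io.Writer) error {
	return json.NewEncoder(w).Encode(v)
}

var decoders = DecodeTable{codeDagJSON: decodeJSON}
var encoders = EncodeTable{codeDagJSON: encodeJSON}

// roundTrip looks up both directions by multicodec code, decodes the
// input, and re-encodes it, failing if the code is not registered.
func roundTrip(code uint64, input string) (string, error) {
	dec, ok := decoders[code]
	if !ok {
		return "", fmt.Errorf("no decoder registered for multicodec 0x%x", code)
	}
	enc, ok := encoders[code]
	if !ok {
		return "", fmt.Errorf("no encoder registered for multicodec 0x%x", code)
	}
	v, err := dec(strings.NewReader(input))
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := enc(v, &buf); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := roundTrip(codeDagJSON, `{"a":1}`)
	fmt.Print(out, err, "\n")
}
```

Registering another format is then just another map entry, which is what keeps optional codecs out of the core dependency set.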
      
      And by use them, I mean not just use them, but use them transparently
      and without a fuss, even from the very high level Traverse and
      Transform methods, which let users manipulate Nodes and NodeBuilders
      with *no idea* of all the details of serialization down here.
      
      Which is awesome.
      
      (Okay, there's a few more todo entries to sort out in the link
      handling stuff.  Probably several.)
      
      But this should about wrap it up with encoding stuff!  I'm going
      to celebrate with a branch merge, and consider the main topic to
      be lifted back up to the Traverse stuff and how to integrate
      cleanly with link loaders and link builders.
      Signed-off-by: Eric Myhre <hash@exultant.us>
  3. 14 Feb, 2019 2 commits
    • Introduce repose.NodeBuilderChooser. · dfd17c2e
      Eric Myhre authored
      Mea culpa for all these absolutely terrible placeholder names.
      
      Previous comment on MulticodecDecoder about it probably currying a
      NodeBuilder inside itself was wrong.  (Message of the previous commit
      explores this in further detail already, so I won't repeat that here.)
      
      We'll be making concrete implementations of MulticodecEncoder and
      MulticodecDecoder in this library for at least JSON and CBOR (for
      "batteries-included" operation on the most common uses); other
      implementations (like git, etc) will of course be possible to register,
      but probably won't be included in this repo directly (for the sake of
      limiting dependency sprawl).  When we get to them, those two core
      implementations should be a good example of the probing-for-interfaces
      described in the comments here.
      Signed-off-by: Eric Myhre <hash@exultant.us>
    • Link loaders and their dual. · a50962ae
      Eric Myhre authored
      This is a first, incomplete draft, again with placeholder names.
      
      The whole topic of loading links and writing nodes out to a hashable
      linkable form is surprisingly parameterizable.
      On the one hand, great, flexibility is good.
      On the other hand, this makes it intensely difficult to provide a
      simple design that focuses on the essence of user wants and needs
      rather than getting bogged down in a torrent of minutiae.
      
      We want to be able to support a choice of abstraction in storage,
      for example -- getting and putting of raw bytes -- without forcing
      the user to simultaneously engage with and reimplement the usage of
      multihash, the encoding and multicodec tagging, and so on.
      This is easier said than done.
      
      The attempt here is to have some function interfaces which do the
      aggregated work -- and these are what we'll actually use in the rest of
      the codebase, such as in the TraversalConfig for example -- and have
      some functions to build them out of closures that bind together all
      the other details of codecs and hashing and so forth.
      In theory you could build your own LinkLoader entirely from whole
      cloth; in practice, we expect to use ComposeLinkLoader in almost 100%
      of all cases.
      
      I think this is on the overall right track, but there's at least a few
      goofs in this draft: specifically, at the moment, somehow the pick of
      Node impl got kicked to all the way inside the MulticodecDecoder.
      This is almost certainly wrong: we *absolutely* want to support users
      picking different Node implementations without having to reconfigure
      the wiring of marshal and unmarshal!  For example: we need to be able
      to load things into typed.Node implementations with concrete types
      that were produced by codegen; and we don't want to force rewriting
      the cbor parser every time a user needs to do this for a new type!
      Something further is required here.  We'll iterate in coming commits.
      
      The multicodec tables are an attempt to solve that parameter in a
      fairly straightforward plugin-supporting way.  This is important
      because IPLD has support for some interesting forms of serialization
      such as "git" -- and while we want to support this, we also absolutely
      do not want to link that as a strict essential dependency of the core
      IPLD library.  Where exactly this is going should also become clearer
      in future commits -- and remember, as described in the previous
      paragraph, at least one thing is completely wrong here and needs
      a second round of draft to unkink it.
      
      There are some inline comments about deficiencies in the multihash
      interfaces that are currently exported.  Some improvements upstream
      would be nice; for now, I'm simply coding as if streaming use *was*
      supported, so everything will be that much easier to fix when we get
      the upstream straightened out.
      
      The overall number of parameters in LinkBuilder is still pretty
      overwhelming, frankly, but I don't know how to get it down any further.
      
      The saving grace of all this is that when it's put to work in the
      traversal package, things like traversal.Transform will simply stash
      the metadata for CID and multicodec and multihash from any links
      traversed... and then if an update is propagated through, it'll just
      pop that metadata back up for the saving of the new "updated" Node.
      The whole lifetime of that metadata is on the stack and the user
      never should be bothered by it as long as they're doing "updates".
      (Users creating *new* objects get the full faceful of choices,
      of course.  I don't know if anything can be done about that.)
      
      It's possible that the whole
      "multicodecType uint64, multihashType uint64, multihashLength int"
      suite of arguments to LinkBuilder should be conveyed in a tuple;
      it's not really likely that they'll ever vary independently, and in
      general whole applications have probably picked one set of values
      and will stick with it application-wide.  The `cid.Prefix` struct
      even matches this already.  But it's... not really clear what's
      going on in that package, frankly.  There seem to be several competing
      variations on builder patterns and I've got no idea which one we're
      "supposed" to be using.  Therefore, I'm not using cid.Prefix for
      this purpose until some more conversations are had about that.
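
For illustration only, the tuple could be as small as the struct below. `LinkParams`, `appDefaults`, and `describe` are hypothetical names and not part of this commit; the example values are the multicodec codes for dag-cbor (0x71) and sha2-256 (0x12) with a 32-byte digest.

```go
package main

import "fmt"

// LinkParams is a hypothetical bundle of the three values that LinkBuilder
// currently takes as separate arguments.
type LinkParams struct {
	MulticodecType  uint64
	MultihashType   uint64
	MultihashLength int
}

// appDefaults shows the "pick one set of values application-wide" usage.
var appDefaults = LinkParams{
	MulticodecType:  0x71, // dag-cbor
	MultihashType:   0x12, // sha2-256
	MultihashLength: 32,
}

// describe renders the parameters; a real builder would consume them instead.
func describe(p LinkParams) string {
	return fmt.Sprintf("codec=0x%x hash=0x%x len=%d",
		p.MulticodecType, p.MultihashType, p.MultihashLength)
}

func main() {
	fmt.Println(describe(appDefaults))
}
```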
      
      Also note ActualStorer (placeholder name) and StoreCommitter
      as a pairing of functions.  Previous generations of the library have
      conjoined these, based on the assumption that we'll have "blocks", and
      those are small enough that it's "fine" to have them completely
      in memory, and do a hashing pass on them before beginning to write.
      This is nonsense and the buck stops here.  A two-phase operation
      with commit to a hash at the *end* is a strictly more powerful
      model -- it's easy to implement buffering and all-at-once write
      when you have a two-phase interface; it's impossible to implement
      a useful two-phase interface when you're forced to start from
      an overbuffered single-step interface -- and lends itself better
      to efficiency and a lower high-water-mark for memory usage.
      Making this choice here in the IPLD libraries might make it
      moderately harder to connect this to existing IPFS blockstore code...
      but it'll also make it easier to write other correct storage and
      loaders (think: opening a single local file with a temp name, and
      atomically moving it to a correct CAS location in the commit call),
      as well as eventually be on the right path when we start fixing other
      IPFS library components to be more streaming friendly.
      
      tl;dr design work.  It's hard.
      Signed-off-by: Eric Myhre <hash@exultant.us>