Commits · a64d28cd83502015b75a56d6bc7ef832cab1b525 · ld / go-ld-prime

13 Mar, 2019 1 commit

Eric Myhre authored Mar 13, 2019

A base32 library moved GitHub orgs and changed its name.

One small change to repose (refmt added a param to cbor decoders).

Other updates unimpactful.
Signed-off-by: Eric Myhre <hash@exultant.us>

a64d28cd

21 Feb, 2019 1 commit

Wiring JSON and CBOR to multicodec in repose! Yas! · 749cf35c

Eric Myhre authored Feb 21, 2019

Finally.

This is "easy" now that we've got all the generic marshalling
implemented over Node and unmarshalling onto NodeBuilder:
we just glue those pieces together with the JSON and CBOR
parsers and emitters of refmt, and we're off.

These methods in the repose package further conform this to
byte-wise reader and writer interfaces so we can squeeze them
into Multicodec{En|De}codeTable... and thence feed them
into ComposeLink{Loader|Builder} and USE them!
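
As a rough illustration of the table shape described above, here is a minimal sketch. All names in it (Decoder, Encoder, DecodeTable, EncodeTable, roundTrip) are hypothetical stand-ins rather than the actual repose API, Node is reduced to an empty interface, and the standard library's encoding/json stands in for refmt. The only assumption about real-world values is that 0x0129 is the multicodec code for dag-json.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
)

// Node is a stand-in for the library's Node interface (assumption:
// reduced to an empty interface to keep the sketch self-contained).
type Node interface{}

// Decoder and Encoder are hypothetical byte-wise codec signatures.
type Decoder func(r io.Reader) (Node, error)
type Encoder func(n Node, w io.Writer) error

// DecodeTable and EncodeTable map a multicodec code to a codec function.
type DecodeTable map[uint64]Decoder
type EncodeTable map[uint64]Encoder

const codecDagJSON = 0x0129 // multicodec code for dag-json

// jsonDecode reads one JSON value from r (encoding/json stands in for refmt).
func jsonDecode(r io.Reader) (Node, error) {
	var n Node
	err := json.NewDecoder(r).Decode(&n)
	return n, err
}

// jsonEncode writes n to w as JSON followed by a newline.
func jsonEncode(n Node, w io.Writer) error {
	return json.NewEncoder(w).Encode(n)
}

// roundTrip looks up the codec pair registered under code, decodes data,
// and re-encodes the resulting Node.
func roundTrip(dt DecodeTable, et EncodeTable, code uint64, data []byte) ([]byte, error) {
	dec, okDec := dt[code]
	enc, okEnc := et[code]
	if !okDec || !okEnc {
		return nil, fmt.Errorf("multicodec 0x%x is not registered", code)
	}
	n, err := dec(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	if err := enc(n, &buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}
```

Registering a codec is then a matter of adding a map entry, e.g. `DecodeTable{codecDagJSON: jsonDecode}`; callers that only know a multicodec code never need to see which parser sits behind it.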

And by use them, I mean not just use them, but use them transparently
and without a fuss, even from the very high level Traverse and
Transform methods, which let users manipulate Nodes and NodeBuilders
with *no idea* of all the details of serialization down here.

Which is awesome.

(Okay, there's a few more todo entries to sort out in the link
handling stuff.  Probably several.)

But this should about wrap it up with encoding stuff!  I'm going
to celebrate with a branch merge, and consider the main topic to
be lifted back up to the Traverse stuff and how to integrate
cleanly with link loaders and link builders.
Signed-off-by: Eric Myhre <hash@exultant.us>


14 Feb, 2019 2 commits

Introduce repose.NodeBuilderChooser. · dfd17c2e

Eric Myhre authored Feb 14, 2019

Mea culpa for all these absolutely terrible placeholder names.

Previous comment on MulticodecDecoder about it probably currying a
NodeBuilder inside itself was wrong. (Message of the previous commit
explores this in further detail already, so I won't repeat that here.)

We'll be making concrete implementations of MulticodecEncoder and
MulticodecDecoder in this library for at least JSON and CBOR (for
"batteries-included" operation on the most common uses); other
implementations (like git, etc) will of course be possible to register,
but probably won't be included in this repo directly (for the sake of
limiting dependency sprawl). When we get to them, those two core
implementations should be a good example of the probing-for-interfaces
described in the comments here.
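
For readers unfamiliar with the idiom, "probing for interfaces" in Go generally means a type assertion against an optional interface. The sketch below is only a guess at the general shape: the decoder receives a NodeBuilder as an argument (rather than currying one inside itself) and checks whether that builder offers a faster path. Every name here (rawDecoder, decode, plainBuilder, fastBuilder, chooseByHint) and the chooser's signature are hypothetical and are not taken from the repose package.

```go
package main

// NodeBuilder is a stand-in for the library's builder interface
// (assumption: reduced to a single method for this sketch).
type NodeBuilder interface {
	CreateString(s string) (interface{}, error)
}

// rawDecoder is a hypothetical optional interface: a builder that can
// consume serialized bytes directly instead of going through generic calls.
type rawDecoder interface {
	DecodeRaw(data []byte) (interface{}, error)
}

// NodeBuilderChooser is a hypothetical shape for picking a builder per load.
type NodeBuilderChooser func(hint string) NodeBuilder

// decode takes the builder as a parameter and probes it for rawDecoder,
// falling back to the generic NodeBuilder method when the probe fails.
func decode(nb NodeBuilder, data []byte) (interface{}, error) {
	if fast, ok := nb.(rawDecoder); ok {
		return fast.DecodeRaw(data)
	}
	return nb.CreateString(string(data))
}

// plainBuilder implements only the generic interface.
type plainBuilder struct{}

func (plainBuilder) CreateString(s string) (interface{}, error) { return s, nil }

// fastBuilder additionally implements the optional rawDecoder interface.
type fastBuilder struct{ plainBuilder }

func (fastBuilder) DecodeRaw(data []byte) (interface{}, error) {
	return "fast:" + string(data), nil
}

// chooseByHint returns fastBuilder for the hint "typed", plainBuilder otherwise.
func chooseByHint(hint string) NodeBuilder {
	if hint == "typed" {
		return fastBuilder{}
	}
	return plainBuilder{}
}
```

The same decode function serves both builders, which is the point of keeping the Node implementation choice outside the decoder.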
Signed-off-by: Eric Myhre <hash@exultant.us>


Link loaders and their dual. · a50962ae

Eric Myhre authored Feb 14, 2019

This is a first and incomplete draft, and also with placeholder names.

The whole topic of loading links and writing nodes out to a hashable
linkable form is surprisingly parameterizable.
On the one hand, great, flexibility is good.
On the other hand, this makes it intensely difficult to provide a
simple design that focuses on the essence of user wants and needs
rather than getting bogged down in a torrent of minutiae.

We want to be able to support a choice of abstraction in storage,
for example -- getting and putting of raw bytes -- without forcing
users to simultaneously engage with and reimplement the usage of multihash,
the encoding and multicodec tagging, and so on.
This is easier said than done.

The attempt here is to have some function interfaces which do the
aggregated work -- and these are what we'll actually use in the rest of
the codebase, such as in the TraversalConfig for example -- and have
some functions to build them out of closures that bind together all
the other details of codecs and hashing and so forth.
In theory you could build your own LinkLoader entirely from whole
cloth; in practice, we expect to use ComposeLinkLoader in almost 100%
of all cases.

I think this is on the overall right track, but there's at least a few
goofs in this draft: specifically, at the moment, somehow the pick of
Node impl got kicked to all the way inside the MulticodecDecoder.
This is almost certainly wrong: we *absolutely* want to support users
picking different Node implementations without having to reconfigure
the wiring of marshal and unmarshal! For example: we need to be able
to load things into typed.Node implementations with concrete types
that were produced by codegen; and we don't want to force rewriting
the cbor parser every time a user needs to do this for a new type!
Something further is required here. We'll iterate in coming commits.

The multicodec tables are an attempt to solve that parameter in a
fairly straightforward plugin-supporting way. This is important
because IPLD has support for some interesting forms of serialization
such as "git" -- and while we want to support this, we also absolutely
do not want to link that as a strict essential dependency of the core
IPLD library. Where exactly this is going should also become clearer
in future commits -- and remember, as described in the previous
paragraph, at least one thing is completely wrong here and needs
a second round of draft to unkink it.

There are some inline comments about deficiencies in the multihash
interfaces that are currently exported. Some improvements upstream
would be nice; for now, I'm simply coding as if streaming use *was*
supported, so everything will be that much easier to fix when we get
the upstream straightened out.

The overall number of parameters in LinkBuilder is still pretty
overwhelming, frankly, but I don't know how to get it down any further.

The saving grace of all this is that when it's put to work in the
traversal package, things like traversal.Transform will simply stash
the metadata for CID and multicodec and multihash from any links
traversed... and then if an update is propagated through, it'll just
pop that metadata back up for the saving of the new "updated" Node.
The whole lifetime of that metadata is on the stack and the user
never should be bothered by it as long as they're doing "updates".
(Users creating *new* objects get the full faceful of choices,
of course. I don't know if anything can be done about that.)

It's possible that the whole
"multicodecType uint64, multihashType uint64, multihashLength int"
suite of arguments to LinkBuilder should be conveyed in a tuple;
it's not really likely that they'll ever vary independently, and in
general whole applications have probably picked one set of values
and will stick with it application-wide. The `cid.Prefix` struct
even matches this already. But it's... not really clear what's
going on in that package, frankly. There seem to be several competing
variations on builder patterns and I've got no idea which one we're
"supposed" to be using. Therefore, I'm not using cid.Prefix for
this purpose until some more conversations are had about that.
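
For illustration only, the tuple idea might look like the sketch below. LinkParams and appDefaults are hypothetical names and are not part of this library or of the cid package; the example values are the multicodec code for dag-cbor (0x71) and the multihash code for sha2-256 (0x12) with a 32-byte digest.

```go
package main

import "fmt"

// LinkParams is a hypothetical grouping of the three arguments that
// LinkBuilder currently takes separately.
type LinkParams struct {
	MulticodecType  uint64
	MultihashType   uint64
	MultihashLength int
}

// String renders the parameters for logging or debugging.
func (p LinkParams) String() string {
	return fmt.Sprintf("codec=0x%x hash=0x%x len=%d",
		p.MulticodecType, p.MultihashType, p.MultihashLength)
}

// appDefaults shows an application picking one set of values and reusing
// it everywhere: dag-cbor with sha2-256 and 32-byte digests.
var appDefaults = LinkParams{MulticodecType: 0x71, MultihashType: 0x12, MultihashLength: 32}
```

Passing one value such as appDefaults instead of three loose arguments would shorten the signature without removing any of the choices.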

Also note ActualStorer (placeholder name) and StoreCommitter
as a pairing of functions. Previous generations of the library have
conjoined these, based on the assumption that we'll have "blocks", and
those are small enough that it's "fine" to have them completely
in memory, and do a hashing pass on them before beginning to write.
This is nonsense and the buck stops here. A two-phase operation
with commit to a hash at the *end* is a strictly more powerful
model -- it's easy to implement buffering and all-at-once write
when you have a two-phase interface; it's impossible to implement
a useful two-phase interface when you're forced to start from
an overbuffered single-step interface -- and lends itself better
to efficiency and a lower high-water-mark for memory usage.
Making this choice here in the IPLD libraries might make it
moderately harder to connect this to existing IPFS blockstore code...
but it'll also make it easier to write other correct storage and
loaders (think: opening a single local file with a temp name, and
atomically moving it to a correct CAS location in the commit call),
as well as eventually be on the right path when we start fixing other
IPFS library components to be more streaming friendly.

tl;dr design work. It's hard.
Signed-off-by: Eric Myhre <hash@exultant.us>
