Commits · c38adf626dbdecc9c56807bacd49cb66c2f2ace6 · ld / go-ld-prime

12 Aug, 2019 3 commits

Add Node.Lookup; and add ErrInvalidKey. · 13e9a113
Eric Myhre authored Aug 12, 2019
```
Signed-off-by: Eric Myhre <hash@exultant.us>
```
13e9a113
Fix docs references missed by exact string match. · 40bacb4f
Eric Myhre authored Aug 12, 2019
```
Signed-off-by: Eric Myhre <hash@exultant.us>
```
40bacb4f

Node traversal(->lookup) method renames. · 2e3868c1

Eric Myhre authored Aug 12, 2019

Most important things first!  To follow this refactor:

```
sed s/TraverseField/LookupString/g
sed s/TraverseIndex/LookupIndex/g
```

It is *literally* a sed-refactor in complexity.

---

Now, details.

This has been pending for a while, and there is some discussion in
https://github.com/ipld/go-ipld-prime/issues/22 .

In short, "Traversal" seemed like a mouthful;
"Field" was always a misnomer here;
and we've discovered several other methods that we *should* have
in the area as well, which necessitated a thought about placement.

In this commit, only the renames are applied, but you can also see
the outlines of two new methods in the Node interface, as comments.
These will be added in future commits.
Signed-off-by: Eric Myhre <hash@exultant.us>

2e3868c1

11 Aug, 2019 2 commits

Partial typed.wrapnodeStruct implementation. · 99058684

Eric Myhre authored Aug 11, 2019

It covers the bases on reading, and has a NodeBuilder which works to
create new nodes based on the type-level structure.

(Quite a few other things about it are incomplete, and it might end
up getting merged this way, because the goal here was primarily to
scope out if any of our abstractions will shatter badly, and we've
now got information on that.)

This seems to indicate we are indeed on the right track, regarding
some of those comments in the codegeneration work in the previous
commit: check out the 'Representation' method.  Yep -- we've got
multiple nodebuilders in our future.
Signed-off-by: Eric Myhre <hash@exultant.us>

99058684

Correct a comment about performance in justString. · d0ce3ded

Eric Myhre authored Aug 11, 2019

There's actually a doozy of performance considerations in this commit.
Verifying (and then correcting) that comment kicked off a surprisingly
deep and demanding binge of research.

(Some of the considerations are still only "considerations",
unfortunately -- one key discovery is that (surprise!) conclusive
choices require more info than microbenchmarks alone can yield.)

First, the big picture:

One of the things we need to be really careful about throughout a
system like go-ipld-prime (where we're dealing with large amounts of
serialization) is the cost of garbage collection. Since we're often
inside of an application's "tight loop" or "hot loop" or whatever you
prefer to call it, if we lean on the garbage collector too heavily...
it's very, very likely to show up as a system-wide impact.

So, in essence, we want to call "malloc" less. This isn't always easy.
Sometimes it's downright impossible: we're building large tree
structures; it's flatly impossible to do this without allocating
some memory.
In other cases, there are avoidable things: and in particular, one
common undesirable source of allocations comes from "autoboxing" around
interfaces. (More specifically, the name of the enemy here will often
show up on profiling reports as "runtime.convT2I".) Sometimes this one
can be avoided; other times, not.

Now, a more detailed picture:

There are actually several functions in the runtime that relate to
memory allocation and garbage collection, and the less we use any of
them, the better; but also, they are not all created equal.

These are the functions that are of interest:

- runtime.convT2I / runtime.convT2E
- runtime.newobject
- runtime.writeBarrier / runtime.gcWriteBarrier
- runtime.convTstring / etc

Most of these functions call `runtime.mallocgc` internally, which is
why they're worthy of note. (writeBarrier/gcWriteBarrier are also
noteworthy, but are different beasts.)

Different kinds of implementations of something like `justString` will
cause the generated assembly to contain calls to various combinations
of these runtime functions when they're converted into a `Node`.

These are the variations considered:

- Variation 1: `type justString string`:
results in `runtime.convTstring`.

- Variation 2: `type justString struct { x string }`:
results in `runtime.convT2I`.

- Variation 3: as above, but returning `&justString{val}`:
results in `runtime.newobject` *and* its friends
`runtime.writeBarrier` and `runtime.gcWriteBarrier`.
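
For reference, the three variations can be sketched roughly as follows. This is a minimal sketch, not the code in the repo: the `Node` interface here is a hypothetical one-method stand-in for `ipld.Node`, and the numbered type and constructor names are invented so that all three shapes can sit in one file.

```go
package main

import "fmt"

// Node is a hypothetical, pared-down stand-in for ipld.Node.
type Node interface {
	AsString() (string, error)
}

// Variation 1: a named string type.
type justString1 string

func (n justString1) AsString() (string, error) { return string(n), nil }

// Variation 2: a struct wrapping a string, stored by value.
type justString2 struct{ x string }

func (n justString2) AsString() (string, error) { return n.x, nil }

// Variation 3: the same struct shape, but used through a pointer.
type justString3 struct{ x string }

func (n *justString3) AsString() (string, error) { return n.x, nil }

// Each constructor converts its concrete value into the Node interface;
// that conversion is where the runtime calls discussed here come from.
func String1(s string) Node { return justString1(s) }
func String2(s string) Node { return justString2{s} }
func String3(s string) Node { return &justString3{s} }

func main() {
	for _, n := range []Node{String1("a"), String2("b"), String3("c")} {
		s, _ := n.AsString()
		fmt.Println(s)
	}
}
```

Building with `go build -gcflags=-S` prints the generated assembly, which is one way to see which runtime functions each constructor ends up calling. Expect the exact set of calls to differ between Go versions.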

The actual performance of these... it depends.

In microbenchmarks, I've found the above examples are roughly:

- Variation 1: 23.9 ns/op 16 B/op 1 allocs/op
- Variation 2: 31.1 ns/op 16 B/op 1 allocs/op
- Variation 3: 23.0 ns/op 16 B/op 1 allocs/op
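
A harness along these lines is enough to produce that kind of figure. It is a sketch under assumptions: `Node` is again a hypothetical stand-in, only Variation 1 is shown, and the package-level `sink` plus the run-time-built string are my own precautions against the compiler optimizing the conversion away. The actual benchmark code behind the numbers above is not shown in this message.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// Node is a hypothetical, pared-down stand-in for ipld.Node.
type Node interface {
	AsString() (string, error)
}

type justString string

func (n justString) AsString() (string, error) { return string(n), nil }

// sink is package-level so the compiler cannot prove the result is unused.
var sink Node

func benchJustString(b *testing.B) {
	s := strings.Repeat("x", 8) // built at run time, so it is not a constant
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		sink = justString(s) // the string-to-interface conversion under test
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of `go test`.
	r := testing.Benchmark(benchJustString)
	fmt.Println(r.String(), r.MemString())
}
```

Swapping the loop body for the struct or pointer form gives the other two rows. Results will vary by machine and Go version.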

So, a couple things about this surprised me; and a couple things I'm
still pretty sure are just microbenchmarks being misleading.

First of all: *all* of these call out to `mallocgc` internally. And so
we see an alloc-per-op reported in all three of them. (This actually
kinda bugs me, because I feel like we should be able to fit all the
requisite information in the first case directly into the interface,
which is, if I understand correctly, already always two words; and
arguably, the compiler might be smart enough to do this in the second
case as well. But I'm wrong about this, for whatever reason, so let's
just accept this one and move along.) But they vary in time. Why?

Variation 2 seems to stand out as slower. Interestingly, it turns out
`convT2E` and `convT2I` are extra problematic because they involve a
call of `typedmemmove` internally -- as a comment in the source says,
there's an allocation, a zeroing, and then a copy here (at least,
as of go1.12); this is a big bummer. In addition, even before getting
that deep, if you check out the disassembly of just our functions:
for our second variation, as inlined into our microbenchmark, there are
9 instructions, plus 1 'CALL'; vs only 3+1 for the first variation.
This memmove and the extra instructions seem to be the explanation for why
our second variation (`struct{string}`) is significantly (~8ns) slower.

(And here I thought variation two would do well! A struct with one
field is the same size as the field itself; a string is one word
of pointer; and an interface has another word for type; and that's
our two words, so it should all pack, and on the stack! Alas: no.)

Now back up to Variation 1 (just a typedef of a string): this one
invokes `runtime.convTstring`, and while that does invoke `mallocgc`,
there's an interesting detail about how: it does it with an ask
for a small number of bytes. Specifically, it asks for... well,
`unsafe.Sizeof(string)`, so that varies by platform, but it's "tiny".
What's "tiny" mean? `mallocgc` has a specific definition of this, and
you can see it by grepping the runtime package source for
"maxTinySize": it's 16 bytes. Things under this size get special
treatment from a "tiny allocator"; this seems to be why
`runtime.convTstring` is relatively advantaged.

(You can see benchmarks relating to this in the runtime package itself:
try `go test -run=x -bench=Malloc runtime`. There's a *huge* cliff
between MallocLargeStruct versus the rest of its fellows.)

Variation 3 also appears competitive. This one surprises me, and this
is where I still feel like microbenchmarks must be hoodwinking.
The use of `runtime.newobject` seems to hit the same corners as
`runtime.convTstring` at runtime in our situation here: it's "tiny";
that's neat. More confusingly, though, `runtime.writeBarrier` and
`runtime.gcWriteBarrier` *should* be (potentially) very high cost
calls. And for some reason, they're not. This particular workload in
the microbenchmark must just-so-happen to tickle things in such a way
that these calls are free (literally; within noise levels), and I
suspect that's a happy coincidence in the benchmark that won't at all
hold in real usage -- as any amount of real memory contention appears,
the costs of these gc-related calls can be expected to rise.

I did a few more permutations upon Variations 2 and 3, just out of
curiosity and for future reference, adding extra fields to see if any
interesting step functions reveal themselves. Here's what I found:

- {str,int,int,int,int} is 48 bytes; &that allocs the same amount; in speed, & is faster; 33ns vs 42ns.
- {str,int,int,int} is 48 bytes; &that allocs the same amount; in speed, & is faster; 32ns vs 42ns.
- {str,int,int} is 32 bytes; &that allocs the same amount; in speed, & is faster; 32ns vs 39ns.
- {str,int} is 32 bytes; &that allocs the same amount; in speed, & is faster; 31ns vs 38ns.
- {str} is 16 bytes; &that allocs the same amount; in speed, & is faster; 24ns vs 32ns.

Both rise in time cost as the struct grows, but the non-pointer
variant grows faster, and it experiences a larger step of increase
each time the size changes (which in turn steps because of alignment).
The &{str} case is noticeably faster than the apparently linear
progression that starts upon adding a second field; since we see the
number 16 involved, it seems likely that this is the influence of the
"tiny allocator" in action, and the rest of the values are linear
relative to each other because they're all over the hump where the
tiny allocator special path disengages.

(One last note: there's also a condition about "noscan" which toggles
the "tiny allocator", and I don't fully understand this detail. I'd
have thought strings might count as a pointer, which would cause our
Variation 3 to not pass the `t.kind&kindNoPointers` check; but the
performance cliff observation described in the previous paragraph
seems to empirically say we're not getting kicked out by "noscan".
(Either that or there's some yet-other phenomenon I haven't sussed.))

(Okay, one even-laster note: in the course of diving around in the
runtime malloc code, I found an interesting comment about using memory
"from the P's arena" -- "P" being one of the letters used when talking
about goroutine internals -- and I wonder if that contributes to our
little mystery about how the `gcWriteBarrier` method seems so oddly
low-cost in these microbenchmarks: perhaps per-thread arenas combined
with lack of concurrency in the benchmark combined with quickly- and
sequentially-freed allocations means any gcWriteBarrier is essentially
reduced to nil. However, this is just a theory, and I won't claim to
totally understand the implications of this; commenting on it here
mostly to serve as a pointer to future reading.)

---

Okay. So what comes of all this?

- I have two choices: attempt to proceed further down a rabbithole of
microbenchmarking and assembly-splunking (and next, I think, patching
debug printfs into the compiler and runtime)... or, I can see that last
one as a step too far for today, pull up, commit this, and return to
this subject when there's better, less-toy usecases to test with.
I think the latter is going to be more productive.

- I'm going to use the castable variation here (Variation 1). This
won't always be the correct choice: it only flies here because strings
are immutable anyway, and because it's a generic storage implementation
rather than having any possibility of additional constraints, adjuncts
from the schema system, validators, etc; and therefore, I don't
actually care if it's possible to cast things directly in and out of
this type (since doing so can't break tree immutability, and it can't
break any of those other contracts because there aren't any).

- A dev readme file appears. It discusses what choices we might make
for other cases in the future. It varies by go native kind; and may
be different for codegen'd types vs general storage implementations.

- Someday, I'd like to look at this even further. I have a persistent,
nagging suspicion that it should be possible to make more steps in the
direction of "zero cost abstractions" in this vicinity. However, such
improvements would seem to be pretty deep in the compiler and runtime.
Someday, perhaps; but today... I started this commit in search of a
simple diff to a comment! Time to reel it in.

Whew.
Signed-off-by: Eric Myhre <hash@exultant.us>

d0ce3ded

08 Aug, 2019 1 commit

Add a 'justString' implementation. · 040f47d2

Eric Myhre authored Aug 05, 2019

This is meant to be about the cheapest implementation of Node possible.

It's important to have this given how often we need string nodes.
In the example of the moment: in writing some typed.Node systems, I
need some intermediate string nodes just to nudge around map processes.
Those nodes need to be handy and their implementation must be cheap.

It's also useful that we can have a constructor which doesn't return
errors, so it's easier to chain.  The NodeBuilder features are
restrained to needing to be able to yield errors for interface
satisfaction purposes, in turn needed for more complex features.
But when building a specific node, there's no reason we need to go
through NodeBuilder at all.
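
The difference in call-site ergonomics looks roughly like this. It is a sketch only: `Node`, `NodeBuilder`, `CreateString`, and the `builder` type are simplified, hypothetical stand-ins, not the real interfaces.

```go
package main

import "fmt"

// Node is a hypothetical, pared-down stand-in for ipld.Node.
type Node interface {
	AsString() (string, error)
}

type justString string

func (n justString) AsString() (string, error) { return string(n), nil }

// String is the direct constructor: no error return, so it can be used inline.
func String(s string) Node { return justString(s) }

// NodeBuilder is a hypothetical one-method slice of a builder interface.
// It has to return an error so that stricter implementations can reject input.
type NodeBuilder interface {
	CreateString(s string) (Node, error)
}

type builder struct{}

func (builder) CreateString(s string) (Node, error) { return justString(s), nil }

func describe(n Node) string {
	s, _ := n.AsString()
	return "node:" + s
}

func main() {
	// Through the builder: a temporary variable and an error check.
	var nb NodeBuilder = builder{}
	n, err := nb.CreateString("via builder")
	if err != nil {
		panic(err)
	}
	fmt.Println(describe(n))

	// Through the constructor: it nests directly into the call.
	fmt.Println(describe(String("direct")))
}
```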

I'm putting this in the ipldfree package at the moment.  I wouldn't
mind putting a constructor for this in the root ipld package as well,
but that would either mean a cyclic import or putting this
implementation as a blessed one in the main package.
(Putting this as a blessed default implementation in the main package
is possible, and can be considered later, but is a "not today" thing;
having these implementations in a subpackage is a good forcing-function
for development to make sure we don't accidentally leave any important
things unexported in the main package.)

040f47d2

20 Jul, 2019 2 commits

Add Node.IsUndefined, ipld.Undef and ipld.Null. · 6802ea10

Eric Myhre authored Jul 20, 2019

The comment in the ipld.Node type about IsUndefined has been there
quite a while; today we're finally using it.

With this set of changes, the generated code for structs now compiles
successfully again: the Undef and Null thunks are both used there.

6802ea10

Add typed.ErrNoSuchField and ipld.ErrNotExists. · 8c6af6f5

Eric Myhre authored Jul 20, 2019

The latter isn't used yet; that's not the main goal of this branch (and
I think I'd rather do the refactor to use ipld.ErrNotExists after
moving the PathSegment types from the selector package into the root...
which in turn I'm not going to do while some of the other current work
on Selectors is in progress in other branches. So! Latur.); I just
wanted it around for documentation and clarity of intent.

This also adds a couple of other dummy references to the 'typed'
package. I know keeping a set of packages-used-in-this-file is a
common step in the development of codegen systems targeting golang,
but I'm going to try to avoid it until the need is really forced.

8c6af6f5

10 Jul, 2019 2 commits

Update ipldfree.Node to use ErrWrongKind. · a32b481f
Eric Myhre authored Jul 02, 2019
```
Yayy typed errors and consistent messaging!
```
a32b481f

Finish ErrWrongKind; add typed.Node.Representation · ecad4261

Eric Myhre authored Jul 02, 2019

typed.Node.Representation(), which returns another Node, should address
most of the infelicities we've found so far in trying to plan nice code
that works over the schema layer.

Also added in this commit: ipld.ReprKindSet, primarily for use in the
ErrWrongKind error.  It comes up often enough we might as well formalize
the thing.

ecad4261

25 Jun, 2019 1 commit

Refactor of type/schema code. · 122c5338

Eric Myhre authored Jun 25, 2019

- `typed.Node` -> now lives in the `impl/typed` package, more like
other nodes.

- Most of the other essential parts of reasoning about types moved
to `schema` package.  ("typed.Type" seemed like a nasty stutter.)

- `typed.Universe` renamed to `schema.TypeSystem`.

- Current `Validate` method moved to `schema` package, but see new
comments about potential future homes of that code, and its
aspirational relationship to `typed.Node`.

Conspicuously *not* yet refactored or moved in this commit:

- The `typed/declaration` package -- though it will shortly be scrapped
and later reappear as `schema/ast`.  The dream is that it would be
neatest of all if we could generate it by codegen; but we'll see.
(This would seem to imply we'll have to make sufficient exported
methods for creating the `schema.Type` values... while we also want
to make those immutable.  We'll... see.)

- The `typed/gen` package is also untouched in this commit, but
should similarly move shortly.  The codegen really ought to be built
against the `schema.Type` reified interfaces.

Overall, this drops the sheer nesting depths of packages a fair bit,
which seems likely to be a good smell.

122c5338

24 Mar, 2019 1 commit

Node contained in MapBuilder and LinkBuilder don't need another separate... · 2d4e49e8

Eric Myhre authored Mar 24, 2019

Node contained in MapBuilder and LinkBuilder don't need another separate pointer and heap allocation.
Signed-off-by: Eric Myhre <hash@exultant.us>

2d4e49e8

21 Mar, 2019 2 commits

Package-level docs; and fix free/Node docs. · 6c3e2490
Eric Myhre authored Mar 21, 2019
```
Signed-off-by: Eric Myhre <hash@exultant.us>
```
6c3e2490

Iterator refactor: entry-based, for map and list. · b84e99cd

Eric Myhre authored Mar 21, 2019

We now have both MapIterator and ListIterator interfaces.
Both return key-value (or index-value) pairs, rather than just keys.

List iterators may seem a tad redundant: you just loop over the length,
right? Well, sure. But there's one place a list iterator shines:
selecting only a subset of elements. And indeed, we'll be doing
exactly that in the traversal/selector package; therefore, we
definitely need list iterators.

We might want keys-only iterators again in the future, but at present,
I'm deferring that. It's definitely true that we should have iterators
returning values as a core feature, since they're likely to be more
efficiently supportable than "random" access (especially when we get to
some Advanced Layout data systems), so we'll implement those first.

Additionally, note that MapIterator now returns a Node for the key.
This is to account for the fact that when using the schema system and
typed nodes, map keys can be more *specific* types. Such nodes are
still required to be kind==ReprKind_String, but string might not be
their *preferred* native format (think: tuples serialized as
delimiter-separated strings); we need to account for that.
(MapBuilder.Insert method already takes a Node parameter for similar
reasons: so it can take *typed* nodes. Node.TraverseField accepting
a plain string is the oddball out here, and should be rectified.)
Signed-off-by: Eric Myhre <hash@exultant.us>
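
To make the "subset of elements" point concrete, here is a sketch of an entry-based list iterator that visits only chosen indices. The interface shape (`Next` returning index, value, and error, plus `Done`) is an assumption for illustration, and `Node`, `strNode`, and `subsetIterator` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a hypothetical, pared-down stand-in for ipld.Node.
type Node interface {
	AsString() (string, error)
}

type strNode string

func (n strNode) AsString() (string, error) { return string(n), nil }

// ListIterator yields index-value pairs rather than bare values.
type ListIterator interface {
	Next() (idx int, value Node, err error)
	Done() bool
}

// subsetIterator walks only the indices listed in want, in order.
type subsetIterator struct {
	items []Node
	want  []int
	pos   int
}

func (it *subsetIterator) Done() bool { return it.pos >= len(it.want) }

func (it *subsetIterator) Next() (int, Node, error) {
	if it.Done() {
		return -1, nil, errors.New("iterator exhausted")
	}
	idx := it.want[it.pos]
	it.pos++
	if idx < 0 || idx >= len(it.items) {
		return idx, nil, fmt.Errorf("index %d out of range", idx)
	}
	return idx, it.items[idx], nil
}

func main() {
	list := []Node{strNode("a"), strNode("b"), strNode("c"), strNode("d")}
	var it ListIterator = &subsetIterator{items: list, want: []int{0, 2}}
	for !it.Done() {
		idx, v, err := it.Next()
		if err != nil {
			panic(err)
		}
		s, _ := v.AsString()
		fmt.Println(idx, s)
	}
}
```

A plain loop over the length would have to know the selection logic itself; with an iterator, the caller's loop stays the same whichever subset is selected.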

b84e99cd

19 Mar, 2019 1 commit

Naming: ReprKind. · fe099392

Eric Myhre authored Mar 19, 2019

Having a function called "Kind" return a "ReprKind" was inconsistent.

Also, we want to introduce a "Kind" method on `typed.Node` in the future.

No logical content to this change: you can safely refactor with sed.
Signed-off-by: Eric Myhre <hash@exultant.us>

fe099392

16 Mar, 2019 3 commits

Add NodeBuilder to Node interface. · 025fcf8a

Eric Myhre authored Mar 16, 2019

(... officially.  Lots of docs have probably already been stating that
this is there.  Now it actually... is.)
Signed-off-by: Eric Myhre <hash@exultant.us>

025fcf8a

Drop MutableNode interface. · 6428f14f

Eric Myhre authored Mar 16, 2019

This has been deprecated and replaced by the NodeBuilder system
for a good while now; time to scrape it into the dustbin completely.

Tests that were primarily on the mutable node system itself also
drop, so, this is a *very* large delete diff.

A few other tests used MutableNode just incidentally, and those are
quick fixed to use NodeBuilder.
Signed-off-by: Eric Myhre <hash@exultant.us>

6428f14f

Update Node interfaces to use Link instead of CID. · 694c6f3c

Eric Myhre authored Mar 16, 2019

As detailed in comments a few commits ago, this is part of a big, big
roll towards keeping linking details far enough off to one side that
one can actually use most of the IPLD system without forming an
explicit compile-time dependency on any linking features (until, of
course, one uses the linking features).

This is a surprisingly small diff, because... well, because most of
the *interesting* features around linking simply weren't implemented
yet, and at this point everything that is has already been isolated
in the new cidlink and related encoding packages.
"CID" was *already* just a semantic placeholder that meant "eh, link".
Signed-off-by: Eric Myhre <hash@exultant.us>

694c6f3c

21 Feb, 2019 1 commit

New marshal implementation! Generic. Woo! · f150a81b

Eric Myhre authored Feb 21, 2019

We have both generic marshal and unmarshal -- they should work for any
current or future ipld.Node implementation, and for any encoding
mechanism that can be bridged to via refmt tokens.

Tests are also updated to use builders rather than the ancient
"mutable node" nonsense, which removes... I think nearly the last
incident of that stuff; maybe we can remove it entirely soon.

As when we moved the unmarshal code into its generic form, most of
this code already existed and needed minor modification.  Git even
correctly detects it as a rename this time since the diff is so small.
And as when we moved the unmarshal code, now we also remove the
whole PushTokens interface; we've gotten to something better now.

Finally we're getting to the point we can look at wiring these up
together with all the multicodec glue and get link loading wizardry
at full voltage.  Yesss.  Sooon.
Signed-off-by: Eric Myhre <hash@exultant.us>

f150a81b

20 Feb, 2019 2 commits

New unmarshal implementation! Generic. Woo! · be01e1e5

Eric Myhre authored Feb 20, 2019

This unmarshal works for any NodeBuilder implementation, tada!

Old ipldfree.Node-specific unmarshal dropped... as well as that entire
system of interfaces.  They were first-pass stuff, and I think now it's
pretty clear that it was barking up the wrong tree, and we've got better
ideas to go with now instead.  (Also, as is probably obvious from a skim,
the old code flipped pretty clearly into the new code.)

Turns out refmt tokens aren't a very relevant interface in IPLD.
I'm still using them... internally, to wire up the CBOR and JSON
parsers without writing those again.  But the rest of IPLD is more
like a full-on and principled alternative to refmt/obj and all its
reflection code, and that's... pretty great.

Earlier, I had a suspicion that we would want more interfaces for token
handling on each Node implementation directly, and in particular I
suspected we might use those token-based interfaces for doing transcription
features that flip data from one concrete Node implementation into another.
(That's why we had this ipldfree.Node-specialized impl in the first place.)
**This turns out to have been wrong!**  Instead, now that we have the
ipld.NodeBuilder interface standard, that turns out to be much better suited
to solving the same needs, and it can do so:

- without adding tokens to the picture (simpler),

- without requiring tokenization-based interfaces be implemented per
concrete ipld.Node implementation (OH so much simpler),

- and arguably NodeBuilder is even doing it *better* because it doesn't
need to force linearization (and while usually that doesn't matter... one
can perhaps imagine it coming up if we wanted to do a data transcription
in memory into a Node implementation which has an orderings requirement).

So yeah, this is a nice thing to have been wrong about.  Much simpler now.

Old ipldfree.Node-specialized 'PushTokens' is still around.  Not for long,
though; it just hasn't finished being ported to the new properly generalized
style quite yet.

Note, this is not the *whole* story, still, either.  For example, still
expect to have an ipldcbor.Node which someday has a *significantly* different
set of marshal and unmarshal methods -- it may eschew refmt entirely,
and take the (very) different strategy of building a skiplist over raw
byte slices! -- that will exist *in addition* to the generic implementations
we're doing here and now.  More on that soon.

Yeah.  A lot of interfaces to get lined up, here.  Some of them tug in such
different directions that picking the right ones to make it all possible
seems roughly like solving one of the NP-hard satisfiability problems.
(Good thing it's actually with a small enough number of choices that it's
tractable; on the other hand, enumerating those choices isn't fast, and
the 'verifier' function here ain't fast either, and being a "design" thing,
it can only be evaluated on human wetware.  So yeah, an NP problem on a
tractable domain but slow setup and slow verifier.  Sounds about right.)

(uh, I'm going to write a book "Design: It's Hard: The Novel" after this.)

Tests are patched enough to keep working where they are; I think it's
possible that a reshuffle of some of them to be more closely focused on
the marshal code rather than the node implementation packages might be
in order, but I'm going to let that be a future issue.  (Oh, and they
did shine a light on one quick issue about MapBuilder initialization,
which is also now fixed.)
Signed-off-by: Eric Myhre <hash@exultant.us>

be01e1e5

Fixing MapBuilder error exposure. · eafc200a

Eric Myhre authored Feb 20, 2019

This is the first commit going down a long and somewhat dizzying prerequisite tree:

- For graphsync (an out-of-repo consuming project) we need selectors
- For Selectors we need traversal implemented
- For Traversal implementations we need link loaders [‡]
- For link loading we need all deserialization implemented
- (and ideally, link creation is done at the same time, so we don't get surprised by any issues with the duals later)
- and it turns out for deserialization, we now have some incongruities with the earlier draft at MapBuilder...

So we're all the way at bugfixes in the core ipld.MapBuilder API. Nice.

([‡] Some of those jumps are a little strained. In particular, traversal doesn't
*in general* need link loaders, so we might choose a very different implementation
order (probably one that involves me having a lot less headaches)... *except*,
since our overall driver for implementation order choices right now is graphsync,
we particularly need traversals crossing block boundaries since we're
interested in making sure selectors do what graphsync needs. Uuf.)

What's the MapBuilder design issue? Well, not enough error returns, mostly.
We tried to put the fluent call-chaining API in the wrong place.

Why is this suddenly an issue now? Well, it turns out that properly genericising
the deserialization needs to be able to report error states like invalid
repeated map keys promptly.

Wait, didn't we even *have* deserialization code before? Yes, yes we did.
It's just that that draft was specialized to the ipldfree.Node implementation...
and so it doesn't hold up anymore when we need it to work in general traversal.

Okay, so. That's the stack depth.

With all that in mind...

This diff adds more error return points to ipld.MapBuilder, and maintains the
fluent variant more closely matching the old behavior (including the
call-chaining style), and fixes tests that relied on this syntax.

Duplicate keys rejection is also in this commit. I thought about splitting it
into further commits, but it's so small. (We may actually need more work in
the future to enable Amend+(updating)Insert, but that's for later; perhaps
an Upsert method; whatever, I dunno, out of scope for thought at the moment.)
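
The split between an error-returning core builder and a chaining wrapper can be sketched like this. Everything here is simplified and hypothetical: keys and values are plain strings instead of Nodes, and having the fluent wrapper panic on error is my assumption about how chaining and error reporting get reconciled.

```go
package main

import "fmt"

// MapBuilder is a simplified stand-in: every step can report an error.
type MapBuilder interface {
	Insert(k, v string) error
	Build() (map[string]string, error)
}

type mapBuilder struct{ m map[string]string }

func NewMapBuilder() MapBuilder { return &mapBuilder{m: map[string]string{}} }

// Insert rejects repeated keys promptly instead of silently overwriting.
func (b *mapBuilder) Insert(k, v string) error {
	if _, dup := b.m[k]; dup {
		return fmt.Errorf("repeated map key %q", k)
	}
	b.m[k] = v
	return nil
}

func (b *mapBuilder) Build() (map[string]string, error) { return b.m, nil }

// FluentMapBuilder wraps a MapBuilder and keeps the call-chaining style.
// Assumption for this sketch: errors are converted to panics.
type FluentMapBuilder struct{ b MapBuilder }

func (f FluentMapBuilder) Insert(k, v string) FluentMapBuilder {
	if err := f.b.Insert(k, v); err != nil {
		panic(err)
	}
	return f
}

func (f FluentMapBuilder) Build() map[string]string {
	m, err := f.b.Build()
	if err != nil {
		panic(err)
	}
	return m
}

func main() {
	// Core API: each call is checked, which suits a deserializer.
	mb := NewMapBuilder()
	_ = mb.Insert("a", "1")
	fmt.Println("duplicate insert:", mb.Insert("a", "2"))

	// Fluent API: chaining for hand-written construction.
	m := FluentMapBuilder{NewMapBuilder()}.Insert("x", "1").Insert("y", "2").Build()
	fmt.Println(len(m))
}
```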

And then we'll carry on, one step at a time, from there. Whew.

---

Sidebar: also dropping MapBuilder.InsertAll(map[Node]Node) from the interface.
I think this could be better implemented as a functional feature that works
over a MapBuilder than being a builtin, and we should prefer a trim MapBuilder.
And might as well drop it now rather than bother fixing it up just to remove later.
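
For instance, a free function over the builder could look like the sketch below. `MapBuilder` is again a simplified, hypothetical interface with string keys and values; sorting the keys is my addition so the insertion order is deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// MapBuilder is a simplified stand-in with only the method InsertAll needs.
type MapBuilder interface {
	Insert(k, v string) error
}

// InsertAll lives outside the interface: it works with any MapBuilder
// and stops at the first error.
func InsertAll(mb MapBuilder, entries map[string]string) error {
	keys := make([]string, 0, len(entries))
	for k := range entries {
		keys = append(keys, k)
	}
	sort.Strings(keys) // Go map iteration order is random; fix it here
	for _, k := range keys {
		if err := mb.Insert(k, entries[k]); err != nil {
			return err
		}
	}
	return nil
}

// recordingBuilder is a toy implementation that remembers insertion order.
type recordingBuilder struct{ order []string }

func (b *recordingBuilder) Insert(k, v string) error {
	for _, seen := range b.order {
		if seen == k {
			return fmt.Errorf("repeated map key %q", k)
		}
	}
	b.order = append(b.order, k)
	return nil
}

func main() {
	b := &recordingBuilder{}
	err := InsertAll(b, map[string]string{"b": "2", "a": "1"})
	fmt.Println(b.order, err)
}
```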

---

ipld.ListBuilder also updated to no longer do a call-chaining style API, while
fluent.ListBuilder continues to do so. This is mainly for consistency;
we don't have the same potential for mid-build error conditions for lists
as we do with maps, but ipld.ListBuilder and ipld.MapBuilder should be similar.

---

Aaaaand one more! NodeBuilder.{Create,Append}{Map,List}() have ALL been
updated to also return errors. Previously, the append methods had an error
state if you used them when the NodeBuilder was bound to a predecessor node
of an unmatching type, but they just swallowed them into the builder and
regurgitated them (much) later; we're no longer doing this. Additionally,
it's occurred to me that *typed* builders -- while not so much a thing, yet,
certainly a thing that's coming -- will even potentially error on CreateMap
and CreateList methods, according to their type constraints. So, jump that now.

...

Yeah, basically a whole tangle of misplaced optimism about error paths in
NodeBuilder and its whole set of siblings has been torn through at once here.
Bandaid ripping sound.
Signed-off-by: Eric Myhre <hash@exultant.us>

eafc200a

06 Feb, 2019 1 commit

ipldfree.NodeBuilder, fluent.NodeBuilder, tests! · 1a6107b5

Eric Myhre authored Feb 06, 2019

We now have nice, immutable nodes... as soon as we deprecate and remove
the MutableNode stuff.

All the tests pass.

A few methods for bulk update are still stubbed... as is delete;
seealso the comment about that in the API file.

NodeBuilders needed replication in the fluent package as well.
This is perhaps a tad frustrating, as well as borderline surprising
(because "how can sheer memset have errors anyway?"), but the ability
for node building to reject invalid calls will become relevant when
typed.Node is in play and has to enforce type rules while still
supporting the generic methods.

The builder syntax for maps and lists is based on chaining.
This works... okay.  I'm not as satisfied with the visual effect of it
as I'd hoped.  There'll probably be more experiments in this area.
Things nest -- critical success; but the indentation level doesn't
quite match what we'd contemporarily consider natural flow.

But look at the diff between the mutableNode test specs and the new
immutable&fluent nodeBuilder test specs!  The old ones were nuts:
temp variables everywhere for construction, total visual disorder, etc.
The new NodeBuilder is much much better for composition, and doesn't
need named variables to save fragments during construction.
More refinement on map/list builders still seems possible, but it's
moving in the right direction.
Signed-off-by: Eric Myhre <hash@exultant.us>

1a6107b5

05 Feb, 2019 4 commits

fluent support for KeysIterator, Length, etc. · 0d61265c

Eric Myhre authored Feb 05, 2019

Allows uncommenting some fixme's in some tests.

Also, changed free.keyIterator to be unexported.  Exporting that was
a typo; there's simply no reason for that to be exported.
Signed-off-by: Eric Myhre <hash@exultant.us>

0d61265c

ipld.NodeUnmarshaller func interface; 'free' impl. · 35323ff0

Eric Myhre authored Feb 05, 2019

Hej, we can finally unmarshal things again.  Plug a cbor parser
in from the refmt library and go go go!

This doesn't use any of the mutablenode stuff because, as remarked
upon a (quite a) few commits back, we want to replace that with
NodeBuilder interfaces; but, furthermore, we actually don't *need*
those interfaces for anything in an unmarshalling path (because
I don't expect we'll seriously need to switch result Node impl
in mid-unmarshalling stream); so.

You can imagine that the cbor.Node system will have a *very* different
implementation of this interface (and probably another method that
doesn't match this interface at all, and is more directly byte stream
based); but that's for later work.

And tests!
Signed-off-by: Eric Myhre <hash@exultant.us>

35323ff0

Node.Keys method now returns iterators. · 5c3bd1af

Eric Myhre authored Feb 05, 2019

An "immediate" rather than generative function is also available;
the docs contain caveats.

At the moment, these distinctions are a bit forced, but we want to be
ready for when we start getting truly huge maps where the generative
usage actually *matters* for either memory reasons, latency reasons,
or both.

Separated out Length rather than making the Keys method pull
double duty; the previous combination was just utterly silly.

The implementations in the bind package are stubs; that package is
going to take a lot of work, and so the focus is going to be entirely
on keeping the 'free' package viable; and we'll revisit bind and such
when there's more dev time budget.
Signed-off-by: Eric Myhre <hash@exultant.us>

5c3bd1af
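
The split described in 5c3bd1af between a generative key iterator, an "immediate" key list, and a separate Length can be sketched roughly as follows. This is a minimal illustration only: `mapNode`, `keyIterator`, `KeysImmediate`, and `collectKeys` are hypothetical names chosen for the example and are not confirmed to match the library's actual types or signatures.

```go
package main

import "fmt"

// mapNode is a hypothetical stand-in for a map-kind node (not the library's type).
type mapNode struct {
	keys []string // keys in insertion order
}

// keyIterator yields keys one at a time (the "generative" style),
// so a caller never needs the whole key set in memory at once.
type keyIterator struct {
	keys []string
	pos  int
}

func (it *keyIterator) HasNext() bool { return it.pos < len(it.keys) }

func (it *keyIterator) Next() string {
	k := it.keys[it.pos]
	it.pos++
	return k
}

// Keys returns a fresh iterator over the node's keys.
func (n *mapNode) Keys() *keyIterator { return &keyIterator{keys: n.keys} }

// KeysImmediate returns a copy of all keys at once (the "immediate" style).
func (n *mapNode) KeysImmediate() []string {
	return append([]string(nil), n.keys...)
}

// Length is its own method instead of being derived from Keys.
func (n *mapNode) Length() int { return len(n.keys) }

// collectKeys drains an iterator into a slice, for demonstration.
func collectKeys(n *mapNode) []string {
	var out []string
	for it := n.Keys(); it.HasNext(); {
		out = append(out, it.Next())
	}
	return out
}

func main() {
	n := &mapNode{keys: []string{"a", "b", "c"}}
	fmt.Println(collectKeys(n), n.KeysImmediate(), n.Length())
}
```

For small maps the two styles are interchangeable; the iterator only pays off once a map is too large to list cheaply.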

Extract the interface for declaring a node to be self-tokenizing. · 23aec76e
Eric Myhre authored Feb 05, 2019
```
Signed-off-by: Eric Myhre <hash@exultant.us>
```
23aec76e

15 Jan, 2019 2 commits

Repair ipldbind implementing ipld.Node. · 828038d7

Eric Myhre authored Jan 15, 2019

With a panicky method.  This clearly needs to be done... eventually.
I'm currently just embarrassed I can't `go test ./...` and that
needs to be fixed *now*.
Signed-off-by: Eric Myhre <hash@exultant.us>

828038d7

Clarify behavior of traverse on an absent field. · ec0ac6d2

Eric Myhre authored Jan 15, 2019

It should return an error explicitly.

A pair of "(nil, nil)" responses would also be an unambiguous option for
representing an absent member, but seems more likely to create pitfalls
than save a significant amount of code for callers.
Signed-off-by: Eric Myhre <hash@exultant.us>

ec0ac6d2
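
The behavior chosen in ec0ac6d2 (an explicit error for an absent field, instead of a "(nil, nil)" pair) might look something like the sketch below. The names `node`, `traverseField`, `errNotFound`, and `isNotFound` are assumptions made for illustration, not the library's API.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error for an absent member.
var errNotFound = errors.New("no such field")

// node is a hypothetical map-like node holding string values.
type node struct {
	fields map[string]string
}

// traverseField returns an explicit error when the key is absent,
// so callers cannot mistake "absent" for a present-but-empty value.
func (n node) traverseField(key string) (string, error) {
	v, ok := n.fields[key]
	if !ok {
		return "", fmt.Errorf("traverse %q: %w", key, errNotFound)
	}
	return v, nil
}

// isNotFound reports whether err wraps the absent-member sentinel.
func isNotFound(err error) bool { return errors.Is(err, errNotFound) }

func main() {
	n := node{fields: map[string]string{"name": "example"}}
	_, err := n.traverseField("missing")
	fmt.Println(isNotFound(err))
}
```

With a "(nil, nil)" convention every caller would need a nil check on the returned node before using it, which is the pitfall the commit message alludes to.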

10 Jan, 2019 1 commit

Hej, we've got tokenization. · 027146d7

Eric Myhre authored Jan 10, 2019

You could now readily rig up the ipldfree.Node implementation to a
refmt cbor.TokenSink, for example, and go to town.  (At the moment,
doing so is left as an exercise to the reader.  We'll make that a
smooth built-in/one-word function at some point, but I'm not yet sure
exactly how that should look, so it's deferred for now.)

While working on this, a lot of other things are cooking on simmer:
I'm churning repeatedly over a lot of thoughts about A) API semantics
in general, B) how to cache CIDs of nodes, and C) how to memoize
serializations / reduce memcopies during partial tree updates...
And (unsurprisingly) I keep coming back to the conclusion that the API
for dang near everything should be immutable at heart in order to
keep things sane.  The problem is figuring out how to pursue this
A) efficiently, B) in tandem with reasonably low-friction nativeness
(i.e. I want autocompletion in a standard golang editor to be as useful
as possible!), and C) given the (as yet) lack of good builder or
mutation-applier patterns.  ipldbind was meant to be a solution to the
majority of the B and C issues there, but that rubs smack against the
grain of "let's be immutable" in golang >:/  So... a rock and a hard
place, in short.
Signed-off-by: Eric Myhre <hash@exultant.us>

027146d7
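
As a rough picture of what tokenization in 027146d7 means, the sketch below walks a small tree and pushes a flat stream of tokens into a sink function. Everything here is hypothetical: the `token` and `node` types, the token type strings, and the `tokenize` function are invented for the example and do not reflect refmt's token types or the ipldfree.Node implementation. Sorting the map keys is also just a choice made here to keep the output stable.

```go
package main

import (
	"fmt"
	"sort"
)

// token is a hypothetical flat token; typ is "map-open", "map-close", or "string".
type token struct {
	typ string
	str string
}

// node is a hypothetical tree node: a map if children is non-nil, else a string.
type node struct {
	str      string
	children map[string]*node
}

// tokenize walks the tree depth-first and emits tokens to sink.
func tokenize(n *node, sink func(token)) {
	if n.children == nil {
		sink(token{typ: "string", str: n.str})
		return
	}
	keys := make([]string, 0, len(n.children))
	for k := range n.children {
		keys = append(keys, k)
	}
	sort.Strings(keys) // assumption: sorted order, for determinism only
	sink(token{typ: "map-open"})
	for _, k := range keys {
		sink(token{typ: "string", str: k}) // the key
		tokenize(n.children[k], sink)      // then its value
	}
	sink(token{typ: "map-close"})
}

func main() {
	tree := &node{children: map[string]*node{"a": {str: "1"}}}
	tokenize(tree, func(t token) { fmt.Println(t.typ, t.str) })
}
```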

08 Jan, 2019 4 commits

More scalar test coverage. · 3e4d91d8

Eric Myhre authored Jan 08, 2019

These are all skullduggerish, but basic sanity checks are nice when
you've got >3 impls of a thing to keep an eye on.
Signed-off-by: Eric Myhre <hash@exultant.us>

3e4d91d8

More test specs. · 659e13d2
Eric Myhre authored Jan 08, 2019
```
Signed-off-by: Eric Myhre <hash@exultant.us>
```
659e13d2

Consistency pass on method order and fullness. · 919e159c

Eric Myhre authored Jan 08, 2019

ipldfree and ipldbound and fluent and all the interfaces now have all
the isNull methods, asBytes, setBytes, etc.  (Previously this was a
little swiss-cheesy.)

SMOP.
Signed-off-by: Eric Myhre <hash@exultant.us>

919e159c

Behavior of zero-value ipldfree.Node redefined. · 8c1a9602

Eric Myhre authored Jan 08, 2019

It's now "invalid" rather than A) documented as map, and B) actually
panicky.
Signed-off-by: Eric Myhre <hash@exultant.us>

8c1a9602
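
One way to read the change in 8c1a9602: if the kind enumeration reserves its zero value for "invalid", an uninitialized node reports itself as invalid instead of claiming to be a map. The sketch below assumes that layout; the `kind` constants, the `node` struct, and the error-returning `AsString` are illustrative guesses, not the actual ipldfree code.

```go
package main

import "fmt"

// kind is a hypothetical kind enumeration whose zero value means "invalid".
type kind uint8

const (
	kindInvalid kind = iota // zero value: an uninitialized node
	kindMap
	kindString
)

// node is a hypothetical node struct; its zero value has kind == kindInvalid.
type node struct {
	kind kind
	str  string
}

func (n node) Kind() kind { return n.kind }

// AsString reports an error (rather than panicking) for non-string nodes,
// including the zero-value node. Returning an error is an assumption here.
func (n node) AsString() (string, error) {
	if n.kind != kindString {
		return "", fmt.Errorf("cannot read string from node of kind %d", n.kind)
	}
	return n.str, nil
}

func main() {
	var zero node
	_, err := zero.AsString()
	fmt.Println(zero.Kind() == kindInvalid, err != nil)
}
```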

08 Dec, 2018 1 commit

Begin schema validation method. · b66f9261

Eric Myhre authored Dec 07, 2018

This will be for the active path -- if we also follow through on the
idea of having a just-in-time checked Node wrapper, a lot of these
checks might end up duplicated. We'll DRY that up when we get there.

Doing accumulation of errors. Might get loud. We can come back and
add early-halt parameters or other accumulation strategies later.

Added IsNull predicate to core Node interface.

Going to want enumerated error categories here for sure, but punting
on that until we get more examples of what all the errors *are*; then
I'll come back and tear all this 'fmt.Errorf' nonsense back out.
Signed-off-by: Eric Myhre <hash@exultant.us>

b66f9261
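
The error-accumulation approach mentioned in b66f9261 can be illustrated with a small sketch: validation keeps going after the first problem and returns every error it found. The `fieldSpec` type, `kindOf`, and `validate` are hypothetical and far simpler than a real schema system; the plain `fmt.Errorf` errors mirror the placeholder style the commit message says it intends to replace.

```go
package main

import "fmt"

// fieldSpec is a hypothetical, minimal schema entry: a field name and expected kind.
type fieldSpec struct {
	name string
	kind string // "string" or "int" in this sketch
}

// kindOf maps a Go value to a kind name for the purposes of this example.
func kindOf(v interface{}) string {
	switch v.(type) {
	case nil:
		return "null"
	case string:
		return "string"
	case int:
		return "int"
	default:
		return "unknown"
	}
}

// validate checks every field and accumulates all errors instead of
// halting at the first one.
func validate(data map[string]interface{}, spec []fieldSpec) []error {
	var errs []error
	for _, f := range spec {
		v, ok := data[f.name]
		if !ok {
			errs = append(errs, fmt.Errorf("field %q: missing", f.name))
			continue
		}
		if got := kindOf(v); got != f.kind {
			errs = append(errs, fmt.Errorf("field %q: want kind %s, got %s", f.name, f.kind, got))
		}
	}
	return errs
}

func main() {
	spec := []fieldSpec{{"name", "string"}, {"age", "int"}, {"email", "string"}}
	data := map[string]interface{}{"name": "x", "age": "forty"}
	for _, err := range validate(data, spec) {
		fmt.Println(err)
	}
}
```

An early-halt option, as the commit anticipates, would amount to returning as soon as `errs` is non-empty.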

06 Dec, 2018 1 commit

Add Kind and Keys methods to Node. · 5c32434e

Eric Myhre authored Dec 06, 2018

And ReprKind moves from the typed package to the ipld main package.
It's hard to get too much done without the standardization of ReprKind.

Between the Kind() and Keys() methods, it should now be possible to
perform traversals of unknown nodes.

This diff just worries about implementing all the Kind() methods.
Keys() has some additional questions to handle (namely, map ordering?).
Signed-off-by: Eric Myhre <hash@exultant.us>

5c32434e
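
To show why Kind() plus Keys() is enough to traverse a node of unknown shape (the claim in 5c32434e), here is a minimal sketch. The `node` interface, the `Child` method, and the `strNode`/`mapNode` types are assumptions for illustration and do not reproduce the real ipld.Node interface. Since map ordering was still an open question in the commit, this sketch simply sorts keys.

```go
package main

import (
	"fmt"
	"sort"
)

// kind is a hypothetical, reduced kind enumeration.
type kind int

const (
	kindMap kind = iota
	kindString
)

// node is a hypothetical minimal interface: enough to walk an unknown tree.
type node interface {
	Kind() kind
	Keys() []string
	Child(key string) node
}

type strNode string

func (strNode) Kind() kind        { return kindString }
func (strNode) Keys() []string    { return nil }
func (strNode) Child(string) node { return nil }

type mapNode map[string]node

func (mapNode) Kind() kind { return kindMap }

// Keys returns sorted keys; sorting is this sketch's answer to map ordering.
func (m mapNode) Keys() []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func (m mapNode) Child(key string) node { return m[key] }

// walk visits every leaf, using only Kind, Keys, and Child to discover structure.
func walk(n node, path string, visit func(path string, leaf node)) {
	if n.Kind() == kindMap {
		for _, k := range n.Keys() {
			walk(n.Child(k), path+"/"+k, visit)
		}
		return
	}
	visit(path, n)
}

func main() {
	tree := mapNode{"a": strNode("1"), "b": mapNode{"c": strNode("2")}}
	walk(tree, "", func(p string, leaf node) { fmt.Println(p, leaf) })
}
```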

26 Nov, 2018 1 commit

Mutable arrays in the ipldfree implementation! · f969ae41

Eric Myhre authored Nov 26, 2018

And a test of building an array and pulling a value back out.
Signed-off-by: Eric Myhre <hash@exultant.us>

f969ae41

10 Nov, 2018 4 commits

First test constructing then viewing a node. · 75ca4f32

Eric Myhre authored Nov 10, 2018

Since this uses a MutableNode factory method as a parameter, this test
is *generic* over our implementations.

Woot!

(Also, used the fluent wrapper interface to make things less
syntactically irritating on the reading side.)
Signed-off-by: Eric Myhre <hash@exultant.us>

75ca4f32
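
The idea in 75ca4f32 of a test that is generic over implementations, because it takes a factory function as a parameter, can be sketched like this. The `mutableNode` interface, the two toy implementations, and `checkRoundTrip` are hypothetical simplifications, not the repository's actual MutableNode or test code.

```go
package main

import "fmt"

// mutableNode is a hypothetical, reduced interface for this sketch.
type mutableNode interface {
	SetString(string)
	AsString() (string, error)
}

// freeNode stores its value directly.
type freeNode struct {
	set bool
	val string
}

func (n *freeNode) SetString(s string) { n.set, n.val = true, s }

func (n *freeNode) AsString() (string, error) {
	if !n.set {
		return "", fmt.Errorf("node holds no string")
	}
	return n.val, nil
}

// boundNode writes through to a Go variable it is bound to.
type boundNode struct {
	target *string
}

func (n *boundNode) SetString(s string)        { *n.target = s }
func (n *boundNode) AsString() (string, error) { return *n.target, nil }

// checkRoundTrip is the generic check: it knows only the interface and
// receives the concrete implementation through the factory parameter.
func checkRoundTrip(newNode func() mutableNode) error {
	n := newNode()
	n.SetString("asdf")
	got, err := n.AsString()
	if err != nil {
		return err
	}
	if got != "asdf" {
		return fmt.Errorf("want %q, got %q", "asdf", got)
	}
	return nil
}

func main() {
	factories := map[string]func() mutableNode{
		"free":  func() mutableNode { return &freeNode{} },
		"bound": func() mutableNode { var s string; return &boundNode{target: &s} },
	}
	for name, f := range factories {
		fmt.Println(name, checkRoundTrip(f))
	}
}
```

Adding another implementation to the same check then costs one more factory entry rather than a copy of the test body.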

Define MutableNode; ipldfree.Node implements it. · 7782d061

Eric Myhre authored Nov 10, 2018

(Mostly; all the scalars work; the composites are todo's.)

Heading in the direction of being able to construct stuff for testing.
Almost there.
Signed-off-by: Eric Myhre <hash@exultant.us>

7782d061

Add missing link fields to ipldfree.Node. · 6e449c0a

Eric Myhre authored Nov 10, 2018

Not sure why these weren't there already.

(Probably because of the ongoing discussion about "is link a kind at
the data model level?" -- to which I still opine "ideally, no".  But
that's something we're probably going to concede "yes" on *for now*,
and try to straighten out later once we get something ready for the
higher level type systems.)
Signed-off-by: Eric Myhre <hash@exultant.us>

6e449c0a

Drag a few things towards consistent order. · 1ac85e2d

Eric Myhre authored Nov 10, 2018

Sort float and bytes (less ubiquitous things) to the bottom.

Drop mention of uint from the ipldfree implementation.  So far all spec
discussions have tended towards mentioning "integers" as a type, and
definitely not "unsigned integers" or "positive-only integers" as a
distinct kind in the core Data Model.
Signed-off-by: Eric Myhre <hash@exultant.us>

1ac85e2d