GetRepresentationNodeGen, and impl of struct->map.
This resolves a *lot* of questions that were previously open.
(Progress will probably be faster after this.)
- It's now clear how GetRepresentationNodeGen works at all.
Turns out it really does just return a nodeGenerator, and that
works... really well.
- We've got the first example of an 'EmitTypedNodeMethodRepresentation'
method which generates a switch statement, so that's under the belt.
- Let's not bury the lede: this adds the entire suite of generation code for
emitting an ipld.Node for the representation of a struct as a map,
and emitting the entire corresponding ipld.NodeBuilder for building
a struct out of map entries! Includes validation for all required
fields being set, the usual type checks, support for rename mappings,
and also validation against repeated entries (this last bit is
a bit controversial, given that there may be other more efficient
places to do this check, but it's in for now; and see next bullets).
- The solution to the "what if there are multiple possible
representation implementations?" question is frankly to ignore it.
I had to think about this a long (long, long) time; time to move on.
See also the comments in the 'EmitNodebuilderMethodCreateMap' method
on 'generateStructReprMapNb' -- in short, this problem is too big
to tackle right now. We also, mostly, *don't need to* -- the
solution of "push it to the codec layer" can address the correctness
concerns in all cases I can think of, and the rest is hedging on
efficiency (for which we really need more complete implementations
and thereafter *benchmarks* in order to be conclusive anyway).
Endgame: the current course of action is to build things the way
that will operate correctly for the widest range of inputs.
- (Note to the future, regarding that last bullet point: some of
the trickiest bits in this choice matrix around efficiency are where
concerns would be mostly in the codec layer, but would get efficiency
boosts from knowledge that's only available from the schema layer.
But the future-planned feature of generating ultra-fastpath direct
marshal and unmarshal functions with codec specialization will have
enough information at hand to actually cut straight through all of
those concerns!)
- Not appearing in this commit, but: expect a fairly huge writeup about
all these map ordering choices to be coming up in an exploration
report document in the ipld/specs repo soon.
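For readers who want a feel for what the struct-as-map builder does, here
is a minimal hand-written sketch. It is not the generated output and does
not use the real ipld.NodeBuilder interface: the type 'Stuff', the builder
'stuffReprBuilder', its methods, and the rename of 'Name' to "n" are all
hypothetical names chosen for illustration. Rejecting unknown keys is also
an assumption of this sketch. What it does show is the shape of the checks
listed above: a switch with one case per field, type checks, a rename
mapping, rejection of repeated entries, and a required-fields check when
the build is finalized.

```go
package main

import "fmt"

// Stuff is a hypothetical struct type standing in for a generated type.
type Stuff struct {
	Name string
	Size int
}

// stuffReprBuilder is a hypothetical builder that assembles a Stuff from
// the entries of its map representation. It is NOT the real NodeBuilder API.
type stuffReprBuilder struct {
	v    Stuff
	seen map[string]bool // which struct fields have been assigned so far
}

func newStuffReprBuilder() *stuffReprBuilder {
	return &stuffReprBuilder{seen: map[string]bool{}}
}

// Insert handles one map entry, keyed by the serialized field name.
// The switch has one case per field, as a generator might emit it.
func (b *stuffReprBuilder) Insert(key string, val interface{}) error {
	switch key {
	case "n": // rename mapping: field Name is serialized under key "n"
		if b.seen["Name"] {
			return fmt.Errorf("repeated entry for key %q", key)
		}
		s, ok := val.(string)
		if !ok {
			return fmt.Errorf("key %q: want string, got %T", key, val)
		}
		b.v.Name, b.seen["Name"] = s, true
	case "Size": // no rename: serialized key equals the field name
		if b.seen["Size"] {
			return fmt.Errorf("repeated entry for key %q", key)
		}
		i, ok := val.(int)
		if !ok {
			return fmt.Errorf("key %q: want int, got %T", key, val)
		}
		b.v.Size, b.seen["Size"] = i, true
	default:
		// Assumption of this sketch: unknown keys are an error.
		return fmt.Errorf("unknown key %q", key)
	}
	return nil
}

// Build checks that every required field was set, then returns the value.
func (b *stuffReprBuilder) Build() (Stuff, error) {
	for _, f := range []string{"Name", "Size"} {
		if !b.seen[f] {
			return Stuff{}, fmt.Errorf("missing required field %q", f)
		}
	}
	return b.v, nil
}

func main() {
	b := newStuffReprBuilder()
	fmt.Println(b.Insert("n", "widget")) // first entry for Name is accepted
	fmt.Println(b.Insert("n", "again"))  // second entry for Name is rejected
	_, err := b.Build()                  // Size was never set
	fmt.Println(err)
}
```

Doing the repeated-entry check in the builder, as sketched here, is the
choice flagged above as controversial; a codec could do the same check
instead.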
The two commits coming before this one -- especially the "generality
of codegen helper mixins" one -- also were direct leadups for all this.
Several additional things remain todo:
- This all needs test coverage, and I haven't mustered that far yet.
Coming in the next commit or so. I won't be surprised if there's at
least one bug in this area until those are done. (I don't like
committing without tests included, but the current tests probably
need a small refactor in order to grow smoothly, and I'm not gonna
try to heap that onto the current diff. On the plus side: everything
in the generated output typechecks so far, and that's quite a bit.)
- Support for "implicit" values is missing. TODOs inline. They'll
interact with roughly the same parts of the code as optionals do.
- The representation gen for strings is, as you can see, a todo.
(It's probably an "easy" one -- but also, it would be nice to get
it to reuse as much code as possible, because afaict the
representation node and the type-semantics node are almost identical,
so that might turn out to be interesting.)
- Note that before we can rig unmarshal up to this and have it work
recursively and completely, we'll need to address the known todo of
nodebuilders-need-methods-to-propose-child-nodebuilders. I've been
putting that one off for a while, but I think we're coming up on
when it's correct to get that one done -- just before adding any more
generators or representations would be good timing.
- Several error paths are still using very stringy errors, and yearn to
be refactored into typed error structures. These are mostly the same
ones as have already appeared in other recent commits; we have
learned a few more things about which parts of the error message need
to be flexible, though... so the time to tackle these will also be
"soon". (Probably right after we do some more testing work, so we
can then immediately add tests for the unhappy paths right as we
upgrade the errors to typed constructions.)
Some other organizational open questions:
- Note that for the type-level node and nodebuilders, we're using two
separate files; and for the representation and its builder, I haven't
done so (yet). Would be good to move to one way or the other.
Undecided which one is more readable vs shocking yet.
- The names of the types we're using inside the generation aren't very
consistent right now either. It's evolving towards consistency as we
get more cases explored, and I think it's nearly at the mark now, but
I haven't been proactively refactoring the older stuff yet. Should;
but since it'll be roughly sed levels of complexity, not a blocker.
Things that look like tempting todos, but probably aren't:
- It *looks* at first glance like there's a lot of duplicated code
between the map representation of the struct and the struct itself.
I'm fairly sure this is a red herring and should not be pursued:
the places which are the same are many, it's true; but the places
that are different are wormed in all over the place, and trying to
extract the common features will likely result in templates which
are completely unreadable. This degree of almost-commonality is
also probably going to be unique in the entire set of kinds and
representation strategies that we'll deal with, making it further
unworthy of special attempts at "simplification". (The strings
case mentioned above as a todo is different from this, because there,
things are actually *identical*, not merely close (... I think!).)
I could be wrong about this, but if so, it'll be better to revisit
the question after several more kinds and representations get at
least their first draft.
Whew. Not sure what the hours-vs-sloc ratio is on this diff, but
it's *high*. Also worth it: a lot of the future course of development
is set out in the implications of the choices touched on here, and as
much as I'd like to develop iteratively, past experience (on refmt
in particular) tells me some of these will not be easy to revisit.
Signed-off-by: Eric Myhre <hash@exultant.us>