- 21 Feb, 2019 5 commits
-
-
Eric Myhre authored
Finally. This is "easy" now that we've got all the generic marshalling implemented over Node and unmarshalling onto NodeBuilder: we just glue those pieces together with the JSON and CBOR parsers and emitters of refmt, and we're off. These methods in the repose package further conform this to byte-wise reader and writer interfaces so we can squeeze them into Multicodec{En|De}codeTable... and thence feed them into ComposeLink{Loader|Builder} and USE them! And by use them, I mean not just use them, but use them transparently and without a fuss, even from the very high level Traverse and Transform methods, which let users manipulate Nodes and NodeBuilders with *no idea* of all the details of serialization down here. Which is awesome. (Okay, there's a few more todo entries to sort out in the link handling stuff. Probably several.) But this should about wrap it up with encoding stuff! I'm going to celebrate with a branch merge, and consider the main topic to be lifted back up to the Traverse stuff and how to integrate cleanly with link loaders and link builders. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
We have both generic marshal and unmarshal -- they should work for any current or future ipld.Node implementation, and for any encoding mechanism that can be bridged to via refmt tokens. Tests are also updated to use builders rather than the ancient "mutable node" nonsense, which removes... I think nearly the last incident of that stuff; maybe we can remove it entirely soon. As when we moved the unmarshal code into its generic form, most of this code already existed and needed minor modification. Git even correctly detects it as a rename this time since the diff is so small. And as when we moved the unmarshal code, now we also remove the whole PushTokens interface; we've gotten to something better now. Finally we're getting to the point we can look at wiring these up together with all the multicodec glue and get link loading wizardry at full voltage. Yesss. Sooon. Signed-off-by: Eric Myhre <hash@exultant.us>
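The "works for any Node implementation" property comes from marshalling purely through the read-side interface. A minimal sketch of that idea, with a deliberately tiny stand-in for the Node abstraction (the real ipld.Node interface is much richer, and the real code emits refmt tokens rather than strings):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Node is a toy stand-in: any map, list, string, or int counts.
type Node interface{}

// Marshal walks a Node generically -- it needs no knowledge of the
// concrete implementation, which is what makes the marshal generic.
func Marshal(n Node) string {
	switch v := n.(type) {
	case map[string]Node:
		keys := make([]string, 0, len(v))
		for k := range v {
			keys = append(keys, k)
		}
		sort.Strings(keys) // deterministic output for the sketch
		parts := make([]string, len(keys))
		for i, k := range keys {
			parts[i] = fmt.Sprintf("%q:%s", k, Marshal(v[k]))
		}
		return "{" + strings.Join(parts, ",") + "}"
	case []Node:
		parts := make([]string, len(v))
		for i, e := range v {
			parts[i] = Marshal(e)
		}
		return "[" + strings.Join(parts, ",") + "]"
	case string:
		return fmt.Sprintf("%q", v)
	case int:
		return fmt.Sprintf("%d", v)
	default:
		return "null"
	}
}

func main() {
	fmt.Println(Marshal(map[string]Node{"a": 1, "b": []Node{"x"}}))
}
```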
-
Eric Myhre authored
Totally unexported, never touched. Baleet. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Per comments in this diff as well as discussion in previous commit's message, there's no need for it to be exported anymore. In fact, the only possible way it could've been used would've been invalid -- calling those methods in the closure body would've caused the list/map builder to invalidate itself, and make the library's later Build call almost certainly error. So, it's very much better not to have it exported at all. This has brought us to an odd result: we could very nearly be dropping the fluent.{Map|List}Builder *interfaces* (though of course keeping their internal error-into-panic implementations) and just continuing to fit the existing ipld.{Map|List}Builder interfaces (but you'd be within your rights to ignore all error returns; they'd be redundant). Would this be helpful, though? I dunno. So let's not. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Two changes are combined here: these builders now work in tandem with closures (and call-chaining style is being backed away from entirely); and, these new closures also get NodeBuilder values supplied to them. This correctly prepares for NodeBuilder specializations to be viable in the future. Although this need is not "yet" exercised, typed.NodeBuilder won't be able to keep its semantics properly without this -- It's similar to the existing comments in Unmarshal code which mention how we'll require specializations for typed.Node usage in the future around the recursion sites there. And while this isn't exercised yet, getting the API correctly-shaped earlier than later will save a lot of refactoring fuss. I have also just plain been wanting a closure style syntax, because it nests nicely and does the right things with visual alignment. So now we have one. Nice. The diffs in the test code using this new style of builder should be a nice example of how this looks compared to the old API syntax. I think those diffs speak for themselves. (It absolutely would be even nicer if we had a syntax for declaring closures in golang that could infer the types of their arguments without putting such a heavy textual weight on the closure declaration site. There's no way to do this in Golang currently, but I'm pretty certain it would be a feasible language enhancement without much in the way of theoretic stretches nor significant compiler complexity nor additional user complexity. This might be something to pursue.) --- One interesting design point to debate within the closure design here: should we really try to 'save' the closure from calling Build? (This code does.) On a first thought, perhaps not: doing the Build call in the library code deprives the author of the closure from being able to Recover errors from the final Build call if desired. 
Doubly problematically, stack traces from those Build errors would be worryingly less meaningful, since such errors would be raised from library code rather than the closure. On second thought: both those issues seem easily addressed in actual usage. The stack trace will still capture the line of the `CreateFoo(fn(...` invocation, and that line should be sufficiently proximate to user (rather than library) code to be meaningful. (There's approximately no reason I can imagine to be passing any ListBuildingClosure functions from far away. If we had Ruby syntax, this stuff would all be a 'do' block at the end of line, and it would be syntactically unsupported to provide from a distance.) It's even possible to wrap exactly the `CreateFoo` call in a Recover, which makes it possible to get every bit as much control as if we made the Build call the closure's responsibility and wrapped it there. In light of those 'second' thoughts -- and also because having the {Map|List}BuildingClosure definitions include an 'ipld.Node' return type *in addition* to their already irritating length is a not insignificant exacerbation of usage friction -- I'm opting to keep the Build call in the fluent library code for now. --- (This fix comes here and now because writing additional test fixtures for the next-up marshalling code without these QoL improvements was just not any fun, and the more critical future issues jumped in front of my eyes at roughly the same time.) --- Also worth noting in this diff: the large REVIEW comment in the MapBuildingClosure docs about having a NodeBuilder for keys. I haven't added all of the notes I have on the strings-vs-Nodes questions around map key APIs to tracking in git, and probably should; it's turned out more fiddly than expected. This comment here probably already gives a fair insight into both how and why and how much so. The NodeBuilder-for-keys param is included in MapBuildingClosure for now out of an abundance of caution. 
But this is indeed worthy of review and might be elided in the future. Signed-off-by: Eric Myhre <hash@exultant.us>
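The "library calls Build, closure may Recover around CreateFoo" design point can be sketched concretely. This is a hypothetical miniature, not the fluent package's real code: `CreateMap` and `MapBuildingClosure` are placeholder shapes, and a plain Go map stands in for the built Node.

```go
package main

import "fmt"

// MapBuilder is a toy fluent-style builder: errors surface as panics,
// which the caller may Recover around the CreateMap call.
type MapBuilder struct {
	m map[string]interface{}
}

func (b *MapBuilder) Insert(k string, v interface{}) {
	if _, dup := b.m[k]; dup {
		panic(fmt.Errorf("repeated map key: %q", k))
	}
	b.m[k] = v
}

// MapBuildingClosure receives the builder; note it returns nothing --
// the library, not the closure, finishes the build.
type MapBuildingClosure func(*MapBuilder)

func CreateMap(fn MapBuildingClosure) map[string]interface{} {
	b := &MapBuilder{m: map[string]interface{}{}}
	fn(b)
	return b.m // the Build step happens here, in library code
}

func main() {
	// The closure nests naturally and never has to remember to Build.
	n := CreateMap(func(mb *MapBuilder) {
		mb.Insert("a", 1)
		mb.Insert("b", 2)
	})
	fmt.Println(len(n))
}
```

Wrapping the `CreateMap` call itself in a deferred `recover()` gives the caller exactly the error control discussed above, without the closure type needing a return value.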
-
- 20 Feb, 2019 2 commits
-
-
Eric Myhre authored
This unmarshal works for any NodeBuilder implementation, tada! Old ipldfree.Node-specific unmarshal dropped... as well as that entire system of interfaces. They were first-pass stuff, and I think now it's pretty clear that it was barking up the wrong tree, and we've got better ideas to go with now instead. (Also, as is probably obvious from a skim, the old code flipped pretty clearly into the new code.) Turns out refmt tokens aren't a very relevant interface in IPLD. I'm still using them... internally, to wire up the CBOR and JSON parsers without writing those again. But the rest of IPLD is more like a full-on and principled alternative to refmt/obj and all its reflection code, and that's... pretty great. Earlier, I had a suspicion that we would want more interfaces for token handling on each Node implementation directly, and in particular I suspected we might use those token-based interfaces for doing transcription features that flip data from one concrete Node implementation into another. (That's why we had this ipldfree.Node-specialized impl in the first place.) **This turns out to have been wrong!** Instead, now that we have the ipld.NodeBuilder interface standard, that turns out to be much better suited to solving the same needs, and it can do so: - without adding tokens to the picture (simpler), - without requiring tokenization-based interfaces be implemented per concrete ipld.Node implementation (OH so much simpler), - and arguably NodeBuilder is even doing it *better* because it doesn't need to force linearization (and while usually that doesn't matter... one can perhaps imagine it coming up if we wanted to do a data transcription in memory into a Node implementation which has an orderings requirement). So yeah, this is a nice thing to have been wrong about. Much simpler now. Old ipldfree.Node-specialized 'PushTokens' is still around. Not for long, though; it just hasn't finished being ported to the new properly generalized style quite yet. 
Note, this is not the *whole* story, still, either. For example, still expect to have an ipldcbor.Node which someday has a *significantly* different set of marshal and unmarshal methods -- it may eschew refmt entirely, and take the (very) different strategy of building a skiplist over raw byte slices! -- that will exist *in addition* to the generic implementations we're doing here and now. More on that soon. Yeah. A lot of interfaces to get lined up, here. Some of them tug in such different directions that picking the right ones to make it all possible seems roughly like solving one of the NP-hard satisfiability problems. (Good thing it's actually with a small enough number of choices that it's tractable; on the other hand, enumerating those choices isn't fast, and the 'verifier' function here ain't fast either, and being a "design" thing, it can only be evaluated on human wetware. So yeah, an NP problem on a tractable domain but slow setup and slow verifier. Sounds about right.) (uh, I'm going to write a book "Design: It's Hard: The Novel" after this.) Tests are patched enough to keep working where they are; I think it's possible that a reshuffle of some of them to be more closely focused on the marshal code rather than the node implementation packages might be in order, but I'm going to let that be a future issue. (Oh, and they did shine a light on one quick issue about MapBuilder initialization, which is also now fixed.) Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
This is the first commit going down a long and somewhat dizzying prerequisite tree: - For graphsync (an out-of-repo consuming project) we need selectors - For Selectors we need traversal implemented - For Traversal implementations we need link loaders [‡] - For link loading we need all deserialization implemented - (and ideally, link creation is done at the same time, so we don't get surprised by any issues with the duals later) - and it turns out for deserialization, we now have some incongruities with the earlier draft at MapBuilder... So we're all the way at bugfixes in the core ipld.MapBuilder API. Nice. ([‡] Some of those jumps are a little strained. In particular, traversal doesn't *in general* need link loaders, so we might choose a very different implementation order (probably one that involves me having a lot less headaches)... *except*, since our overall driver for implementation order choices right now is graphsync, we particularly need traversals crossing block boundaries since we're interested in making sure selectors do what graphsync needs. Uuf.) What's the MapBuilder design issue? Well, not enough error returns, mostly. We tried to put the fluent call-chaining API in the wrong place. Why is this suddenly an issue now? Well, it turns out that properly genericising the deserialization needs to be able to report error states like invalid repeated map keys promptly. Wait, didn't we even *have* deserialization code before? Yes, yes we did. It's just that that draft was specialized to the ipldfree.Node implementation... and so it doesn't hold up anymore when we need it to work in general traversal. Okay, so. That's the stack depth. With all that in mind... This diff adds more error return points to ipld.MapBuilder, and maintains the fluent variant more closely matching the old behavior (including the call-chaining style), and fixes tests that relied on this syntax. Duplicate keys rejection is also in this commit. 
I thought about splitting it into further commits, but it's so small. (We may actually need more work in the future to enable Amend+(updating)Insert, but that's for later; perhaps an Upsert method; whatever, I dunno, out of scope for thought at the moment.) And then we'll carry on, one step at a time, from there. Whew. --- Sidebar: also dropping MapBuilder.InsertAll(map[Node]Node) from the interface. I think this could be better implemented as a functional feature that works over a MapBuilder than being a builtin, and we should prefer a trim MapBuilder. And might as well drop it now rather than bother fixing it up just to remove later. --- ipld.ListBuilder also updated to no longer do a call-chaining style API, while fluent.ListBuilder continues to do so. This is mainly for consistency; we don't have the same potential for mid-build error conditions for lists as we do with maps, but ipld.ListBuilder and ipld.MapBuilder should be similar. --- Aaaaand one more! NodeBuilder.{Create,Append}{Map,List}() have ALL been updated to also return errors. Previously, the append methods had an error state if you used them when the NodeBuilder was bound to a predecessor node of an unmatching type, but they just swallowed them into the builder and regurgitated them (much) later; we're no longer doing this. Additionally, it's occurred to me that *typed* builders -- while not so much a thing, yet, certainly a thing that's coming -- will even potentially error on CreateMap and CreateList methods, according to their type constraints. So, jump that now. ... Yeah, basically a whole tangle of misplaced optimism about error paths in NodeBuilder and its whole set of siblings has been torn through at once here. Bandaid ripping sound. Signed-off-by: Eric Myhre <hash@exultant.us>
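The core-layer change -- MapBuilder methods returning errors promptly, including on repeated keys -- looks roughly like this. Again a toy: names echo the interfaces discussed, but a plain Go map stands in for the node being built.

```go
package main

import "fmt"

// MapBuilder sketches the core (non-fluent) builder: no call-chaining,
// and every insertion can report an error immediately.
type MapBuilder struct {
	m map[string]interface{}
}

func NewMapBuilder() *MapBuilder {
	return &MapBuilder{m: map[string]interface{}{}}
}

// Insert rejects duplicate keys promptly, rather than swallowing the
// error and regurgitating it at Build time.
func (b *MapBuilder) Insert(k string, v interface{}) error {
	if _, dup := b.m[k]; dup {
		return fmt.Errorf("repeated map key: %q", k)
	}
	b.m[k] = v
	return nil
}

func (b *MapBuilder) Build() (map[string]interface{}, error) {
	return b.m, nil
}

func main() {
	b := NewMapBuilder()
	fmt.Println(b.Insert("a", 1)) // first insert: no error
	fmt.Println(b.Insert("a", 2)) // duplicate key: error, right now
}
```

A fluent wrapper (as in the later commits above) can then convert these errors to panics; the generic deserializer is the consumer that genuinely needs them as return values.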
-
- 14 Feb, 2019 3 commits
-
-
Eric Myhre authored
Mea culpa for all these absolutely terrible placeholder names. Previous comment on MulticodecDecoder about it probably currying a NodeBuilder inside itself was wrong. (Message of the previous commit explores this in further detail already, so I won't repeat that here.) We'll be making concrete implementations of MulticodecEncoder and MulticodecDecoder in this library for at least JSON and CBOR (for "batteries-included" operation on the most common uses); other implementations (like git, etc) will of course be possible to register, but probably won't be included in this repo directly (for the sake of limiting dependency sprawl). When we get to them, those two core implementations should be a good example of the probing-for-interfaces described in the comments here. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
This is a first and incomplete draft, and also with placeholder names. The whole topic of loading links and writing nodes out to a hashable linkable form is surprisingly parameterizable. On the one hand, great, flexibility is good. On the other hand, this makes it intensely difficult to provide a simple design that focuses on the essence of user wants and needs rather than getting bogged down in a torrent of minutiae. We want to be able to support a choice of abstraction in storage, for example -- getting and putting of raw bytes -- without forcing simultaneously engaging with and reimplementing the usage of multihash, the encoding and multicodec tagging, and so on. This is easier said than done. The attempt here is to have some function interfaces which do the aggregated work -- and these are what we'll actually use in the rest of the codebase, such as in the TraversalConfig for example -- and have some functions to build them out of closures that bind together all the other details of codecs and hashing and so forth. In theory you could build your own LinkLoader entirely from whole cloth; in practice, we expect to use ComposeLinkLoader in almost 100% of all cases. I think this is on the overall right track, but there's at least a few goofs in this draft: specifically, at the moment, somehow the pick of Node impl got kicked all the way inside the MulticodecDecoder. This is almost certainly wrong: we *absolutely* want to support users picking different Node implementations without having to reconfigure the wiring of marshal and unmarshal! For example: we need to be able to load things into typed.Node implementations with concrete types that were produced by codegen; and we don't want to force rewriting the cbor parser every time a user needs to do this for a new type! Something further is required here. We'll iterate in coming commits. The multicodec tables are an attempt to solve that parameter in a fairly straightforward plugin-supporting way. 
This is important because IPLD has support for some interesting forms of serialization such as "git" -- and while we want to support this, we also absolutely do not want to link that as a strict essential dependency of the core IPLD library. Where exactly this is going should also become clearer in future commits -- and remember, as described in the previous paragraph, at least one thing is completely wrong here and needs a second round of draft to unkink it. There are some inline comments about deficiencies in the multihash interfaces that are currently exported. Some improvements upstream would be nice; for now, I'm simply coding as if streaming use *was* supported, so everything will be that much easier to fix when we get the upstream straightened out. The overall number of parameters in LinkBuilder is still pretty overwhelming, frankly, but I don't know how to get it down any further. The saving grace of all this is that when it's put to work in the traversal package, things like traversal.Transform will simply stash the metadata for CID and multicodec and multihash from any links traversed... and then if an update is propagated through, it'll just pop that metadata back up for the saving of the new "updated" Node. The whole lifetime of that metadata is on the stack and the user never should be bothered by it as long as they're doing "updates". (Users creating *new* objects get the full faceful of choices, of course. I don't know if anything can be done about that.) It's possible that the whole "multicodecType uint64, multihashType uint64, multihashLength int" suite of arguments to LinkBuilder should be conveyed in a tuple; it's not really likely that they'll ever vary independently, and in general whole applications have probably picked one set of values and will stick with it application-wide. The `cid.Prefix` struct even matches this already. But it's... not really clear what's going on in that package, frankly. 
There seem to be several competing variations on builder patterns and I've got no idea which one we're "supposed" to be using. Therefore, I'm not using cid.Prefix for this purpose until some more conversations are had about that. Also note ActualStorer (placeholder name) and StoreCommitter as a pairing of functions. Previous generations of library have conjoined this, based on assumption that we'll have "blocks", and those are small enough that it's "fine" to have them completely in memory, and do a hashing pass on them before beginning to write. This is nonsense and the buck stops here. A two-phase operation with commit to a hash at the *end* is a strictly more powerful model -- it's easy to implement buffering and all-at-once write when you have a two-phase interface; it's impossible to implement a useful two-phase interface when you're forced to start from an overbuffered single-step interface -- and lends itself better to efficiency and a lower high-water-mark for memory usage. Making this choice here in the IPLD libraries might make it moderately harder to connect this to existing IPFS blockstore code... but it'll also make it easier to write other correct storage and loaders (think: opening a single local file with a temp name, and atomically moving it to a correct CAS location in the commit call), as well as eventually be on the right path when we start fixing other IPFS library components to be more streaming friendly. tl;dr design work. It's hard. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Outline of Traverse* funcs; comment about link loaders in TraversalConfig; confession of working draft of Selector interface; fixed typo in AdvVisitFn return types. Yes, that's a lot of things heaped in one commit. Yes, (unfortunately,) they're all related. Almost the entirety of this has to manifest *at once*, all seamlessly connected, for any part of it to get off the ground. Note how very much is handwaved in the comments about TraversalConfig having linkloader tools of some kind. There's a lot to unpack there. Getting it to work well, and letting users of the library supply functions that customize the parts that are interesting to the user, while *not* getting them hamstrung by a bunch of details about multicodecs and multihashing (unless they want to!)... requires careful design work. Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 12 Feb, 2019 1 commit
-
-
Eric Myhre authored
All traversal/transform/selector/path code to date got a rethink. For the longest time, I was trying to keep all the traversal code in the main root IPLD package, thinking/hoping that the number of relevant functions would be minimal enough that this would be fine. Somewhere around the time where I conceded that yes, we *do* need separate methods for writable and readonly paths (it's true that readonly is a degenerate case of writable -- but they have different performance characteristics!), I gave up: the number of functions in play is too high to heap altogether in the root package namespace. This leads to a number of other surprising results: namely, that ipld.Path *isn't* -- it's actually traversal.Path! We *don't use* paths anywhere else except as guidance to some forms of traversal, and logs and progress markers during traversal. Huh! Despite being initially surprising, I now regard this as a *good* code smell. ipld.Traversal as an interface and ipld.Path.Traverse are dropped. These turned out to be the wrong level of abstraction: it's both missing things, and not able to return enough things. By the time we would fix both, it's not an abstraction anymore. Path.Traverse didn't know how to use a link loader func -- and that's correct; it shouldn't -- and it also didn't retain and return a stack of Nodes it traversed -- which again, it shouldn't, because in general that would be useless overkill and resource waste... it's just that both these things are essential in some use cases. So! Cut that Gordian knot. The Focus/FocusTransform/Traverse/TraverseTransform methods will now do all recursion work themselves, internally. The transform family will keep stacks of nodes encountered along the way, to aid in the (presumably) coming tree re-build; the focus family won't. All of these methods will be able to support link loading, though implementation of that is still upcoming. 
There's a few more fields in TraversalProgress, also preparing for when we have automatic link loading and traversal: we'll want to be able to inform the user's callbacks about load edges. (I'm not a fan of the term 'block' here, but running with it for the moment.) TraversalConfig is expected to grow. Namely, to include link loaders... and subsequently, probably block writers, as well! This is a big subject, though. It'll probably take several commits and possibly more than one iteration over time to hammer out a good API. One thing that's interesting to note that neither TraversalConfig nor TraversalProgress will grow to contain certain things: for example, a map of "seen" CIDs when doing link traversal. Why? Well, if you've been mentally modelling things as a graph, this is surprising; surely any graph traversal needs to remember a set of already-visited nodes. The trick is... we're *not* doing a graph traversal -- not exactly! Since our visitor gets a (Node,Path) *tuple*... it would be wrong to memoize on anything other than the complete tuple; and the Path part of the tuple means we're actually doing a *tree* walk rather than a graph walk. (And of course, since we're operating on DAGs, termination guarantees are already a non-issue.) Tada; no "seen" set. Signed-off-by: Eric Myhre <hash@exultant.us>
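The tree-walk-not-graph-walk point is worth seeing in code. A minimal sketch, with plain Go maps standing in for Nodes and a hypothetical `Walk` helper: because the visitor receives a (path, node) *tuple*, a value reachable via two paths is visited once per path, and no "seen" set ever appears.

```go
package main

import (
	"fmt"
	"sort"
)

// VisitFn receives the full (path, node) tuple -- memoizing on the node
// alone would be wrong, which is why there is no "seen" set.
type VisitFn func(path []string, node interface{})

func sortedKeys(m map[string]interface{}) []string {
	ks := make([]string, 0, len(m))
	for k := range m {
		ks = append(ks, k)
	}
	sort.Strings(ks)
	return ks
}

// Walk does a tree walk; on DAGs, termination is already guaranteed.
func Walk(node interface{}, path []string, visit VisitFn) {
	visit(path, node)
	if m, ok := node.(map[string]interface{}); ok {
		for _, k := range sortedKeys(m) {
			// Each child gets its own fresh path slice.
			childPath := append(append([]string{}, path...), k)
			Walk(m[k], childPath, visit)
		}
	}
}

func main() {
	shared := "shared-value"
	root := map[string]interface{}{"a": shared, "b": shared}
	count := 0
	Walk(root, nil, func(p []string, _ interface{}) { count++ })
	// root, /a, /b: the shared value is visited once per distinct path.
	fmt.Println(count)
}
```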
-
- 08 Feb, 2019 1 commit
-
-
Eric Myhre authored
The policy on partial progress results is now explicitly "no". It wasn't useful. Tests have been updated accordingly. (The relevant tests *do* still assert that the traversal errors happen where they should; it's just via the... well, the error value. Sense, neh?) We're introducing a TraversalProgress struct. This is roughly following "Option 2" from the earlier "this composition is imperfect!" lamentation block: it allows us to keep carrying Path from existing moves across the graph at all times. (I suspect we're going to incur somewhat unfortunate memory allocation thrash in large-scale operation because of all these immutable Path objects, which (ultra)unfortunately can't even reuse the n+1 length arrays when their prior sibling already allocated them... However, I care a lot more about getting the composability and ergonomics of these interfaces right first; and we can discuss how to hack shortcuts back in later. (Perhaps the more visit-oriented functions will do some fancy footwork, and combine it with warnings that the Path objects handed the visitor are only valid for its scope. This is sharp-edged, but seems to be common in practice in this kind of library; basically every performant git library I've ever seen has caveats like this on its iterators, for example. Anyway! Not doing that yet.)) Context is coming along for the ride on TraversalProgress! This is discussed in another one of the large comment blocks that's disappearing in this commit. Newly exported Path.Join method. Does what it says on the tin. Usage of this whole new TraversalProgress system is limited: we're just cutting away the incomplete cruft of first-draft functions in this diff whenever they got in the way, and newly designed Traverse/Visit/Transform/etc functions which correctly use TraversalProgress will be coming up in future diffs. Signed-off-by: Eric Myhre <hash@exultant.us>
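The immutable-Path-plus-Join shape, and the allocation behavior lamented above, can be sketched in a few lines. This is a toy, not the library's real Path type: note how every Join copies into a fresh array, which is exactly why sibling paths can't share storage.

```go
package main

import (
	"fmt"
	"strings"
)

// Path is immutable: segments are copied on construction and on Join.
type Path struct {
	segments []string
}

func NewPath(segs ...string) Path {
	p := Path{segments: make([]string, len(segs))}
	copy(p.segments, segs)
	return p
}

// Join returns a new Path; the receiver is untouched. Each call
// allocates a fresh len(p)+len(q) array -- the thrash discussed above.
func (p Path) Join(q Path) Path {
	segs := make([]string, 0, len(p.segments)+len(q.segments))
	segs = append(segs, p.segments...)
	segs = append(segs, q.segments...)
	return Path{segments: segs}
}

func (p Path) String() string { return strings.Join(p.segments, "/") }

func main() {
	a := NewPath("foo", "bar")
	b := a.Join(NewPath("baz"))
	fmt.Println(a, b) // a is unchanged by the Join
}
```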
-
- 06 Feb, 2019 3 commits
-
-
Eric Myhre authored
Revision needed. An idea I earlier discarded as too complex seems both less complex and more warranted after sleeping on it. Dunno if I'll get there tonight, but want this pushed for conversation. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Like all other methods on fluent.Node that panic the instant a concrete type of anything is asserted outside of traversal attempts. There's not much sense in holding off until trying to start an iteration that's most certainly going to fail; and we do assume that anyone making the iterator intends to use it. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
We now have nice, immutable nodes... as soon as we deprecate and remove the MutableNode stuff. All the tests pass. A few methods for bulk update are still stubbed... as is delete; see also the comment about that in the API file. NodeBuilders needed replication in the fluent package as well. This is perhaps a tad frustrating, as well as borderline surprising (because "how can sheer memset have errors anyway?"), but the ability for node building to reject invalid calls will become relevant when typed.Node is in play and has to enforce type rules while still supporting the generic methods. The builder syntax for maps and lists is based on chaining. This works... okay. I'm not as satisfied with the visual effect of it as I'd hoped. There'll probably be more experiments in this area. Things nest -- critical success; but the indentation level doesn't quite match what we'd contemporarily consider natural flow. But look at the diff between the mutableNode test specs and the new immutable&fluent nodeBuilder test specs! The old ones were nuts: temp variables everywhere for construction, total visual disorder, etc. The new NodeBuilder is much much better for composition, and doesn't need named variables to save fragments during construction. More refinement on map/list builders still seems possible, but it's moving in the right direction. Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 05 Feb, 2019 6 commits
-
-
Eric Myhre authored
Fortunately, that just already worked fine. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Allows uncommenting some fixme's in some tests. Also, changed free.keyIterator to be unexported. Exporting that was a typo; there's simply no reason for that to be exported. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Hej, we can finally unmarshal things again. Plug a cbor parser in from the refmt library and go go go! This doesn't use any of the mutablenode stuff because, as remarked upon a (quite a) few commits back, we want to replace that with NodeBuilder interfaces; but, furthermore, we actually don't *need* those interfaces for anything in an unmarshalling path (because I don't expect we'll seriously need to switch result Node impl in mid-unmarshalling stream); so. You can imagine that the cbor.Node system will have a *very* different implementation of this interface (and probably another method that doesn't match this interface at all, and is more directly byte stream based); but that's for later work. And tests! Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
An "immediate" rather than generative function is also available; the docs contain caveats. At the moment, these distinctions are a bit forced, but we want to be ready for when we start getting truly huge maps where the generative usage actually *matters* for either memory reasons, latency reasons, or both. Separated Length rather than trying to pull double-duty the Keys method; the previous combination was just utterly silly. The implementations in the bind package are stubs; that package is going to take a lot of work, and so the focus is going to be entirely on keeping the 'free' package viable; and we'll revisit bind and such when there's more dev time budget. Signed-off-by: Eric Myhre <hash@exultant.us>
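The immediate-versus-generative distinction, and the separated Length, look roughly like this. Toy types only -- the real Node interface differs -- but the memory/latency trade-off is the same: `Keys` materializes the whole slice up front, while the iterator yields one key at a time in O(1) extra memory.

```go
package main

import "fmt"

type mapNode struct {
	keys []string
}

// Length no longer pulls double duty on Keys.
func (n mapNode) Length() int { return len(n.keys) }

// Keys is the "immediate" form: fine for small maps, costly for huge ones.
func (n mapNode) Keys() []string {
	out := make([]string, len(n.keys))
	copy(out, n.keys)
	return out
}

// KeyIterator is the generative form.
type KeyIterator struct {
	n mapNode
	i int
}

func (n mapNode) KeysIterator() *KeyIterator { return &KeyIterator{n: n} }

func (it *KeyIterator) Next() (string, bool) {
	if it.i >= len(it.n.keys) {
		return "", false
	}
	k := it.n.keys[it.i]
	it.i++
	return k, true
}

func main() {
	n := mapNode{keys: []string{"a", "b"}}
	for k, ok := n.KeysIterator().Next(); ok; {
		fmt.Println(k)
		break // (demo: print just the first key)
	}
	fmt.Println(n.Length())
}
```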
-
Eric Myhre authored
Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Continues the work started in 2263295e, with the purpose of fixing undesired three-valued logic by disambiguating 'optional' from 'default'. Some comments in the schema now address this in detail, including description of the membership cardinality. Some incongruities in the JSON have been fixed; and a significant new set of definitions for struct-represented-as-map configuration details is introduced. I gave some brief consideration to eliding StructRepresentation_Map and instead simply using a typedef'd map, and it would be possible since it only has one field... however, I'm opting to keep it because I'm willing to bet we'll come up with more properties someday which are for the whole struct rather than just per field, and thus it's unlikely that StructRepresentation_Map will stay single-field forever. The 'default' value itself in the new StructRepresentation_Map_FieldDetails type is interesting: it's the first time we've had an excuse to use an "Any" cop-out. And even more interestingly, it's almost certainly invalid: as commented inline in the diff, that field would probably be even better represented as a kinded union which makes it clear that only the scalar primitives are welcome here. We'll almost certainly introduce that union in a future commit, but I'm gonna let that thought marinate for a while first. There are related interesting questions about how we want to handle the DSL parsing in this area, and those might be good to think about more deeply before finalizing this part of the schema. Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 02 Feb, 2019 1 commit
-
-
Eric Myhre authored
Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 31 Jan, 2019 3 commits
-
-
Eric Myhre authored
I'm using templates for this rather than AST libraries because after some review of existing systems, it seems that this is a popular choice; and I'm flat out going to assume those people have known what they were doing and made their choice rationally. Protobufs, graphql codegen libraries, the ugorji serialization libraries -- almost every codegen I've ever come across uses templating. So, we will too. And so far, the results seem promising. The tests here will be shoddy for some time. I'm outputting code to a directory and periodically eyeballing it and making sure the compiler likes it. My aim is to get some fair fraction of codegen bootstrapped this way; and after we start codegen'ing the typedeclaration package itself, *then* we'll start solidifying things with regression tests. There's a lot of discovery process to go, here. See for example the comments about ranging versus iterators (namely, that it's not possible, which... turns out is going to feed back into our design all the way back out at the layer of the ipld.Node interface itself). Signed-off-by: Eric Myhre <hash@exultant.us>
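A minimal sketch of the template-driven approach using Go's text/template (the template body and type names here are made up for illustration; the real generator's templates are far richer):

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// tmpl emits a trivial struct declaration from a name and field list.
var tmpl = template.Must(template.New("node").Parse(
	`// Code generated by a sketch; do not edit.
type {{.Name}} struct {
{{- range .Fields}}
	{{.}} string
{{- end}}
}
`))

// render executes the template into a string.
func render(name string, fields []string) string {
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, struct {
		Name   string
		Fields []string
	}{name, fields}); err != nil {
		panic(err)
	}
	return buf.String()
}

func main() {
	fmt.Print(render("TypeString", []string{"name"}))
}
```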
-
Eric Myhre authored
Removed the 'name' field from all of them. Doesn't match schema. While it's true that we'll want a name property on the reflective typesystem.Type objects, that's... well, that's over there. I'm starting to make fairly wild zigzags through here; with reason: I'm trying to get as quickly as possible to a bootstrap of the essence of codegen. The hope is to get there fast, and power it with a few hardcoded values; then get that to churn out a replacement for the hand-written declarations package types; and then teach the codegen to write more and more of the e.g. ipld.Node interfaces and then the serialization bridges and then the NodeBuilder stuff, etc. The core coverage needed to get the codegen going, though, is I think sufficiently smaller than all those parts that maybe this will be reasonable to pull off this way. Time will tell. (Shortly!) Dropping the typed/system/construct.go file temporarily, as the changes to the other typedeclaration files have broken it, and maintaining it through the shaking seems like it will generate inordinately more work than it's worth. It'll be revived soon; it's mostly on the right track, it's just not on the critical path atm. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
This change to the schema-schema in DSL form makes it accurately match the schema-schema in JSON form, *AND* fixes the undesired three-valued logic at the same time. Wunderbar. Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 28 Jan, 2019 1 commit
-
-
Eric Myhre authored
-
- 21 Jan, 2019 4 commits
-
-
Eric Myhre authored
And fix a few bugs detected in the DSL form (one union typo'd into an incorrect representation; and one just outright missing its representation declaration), but fortunately nothing too serious. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
-
Eric Myhre authored
The JSON representation of the schema shortens up noticeably if the properties for "nullable" and "optional" are themselves optional. Given that the hope is for the in-practice use of these properties to be the exceptional case rather than the norm, it's nicer to speak less of them. This edges around an interesting question: we don't actually *want* three-valued logic here... but that's precisely what "optional Bool" gives you: it may have the value "true", "false", or "undefined"! What we *mean* here is of course "true" or "undefined". It would be desirable to have a way to encode that. Perhaps we can find a way. One idea is that this is an issue in the representation of bool; and thus perhaps we should configure it in the struct's representation block rather than marking it optional in the struct's definition block. Let's give this more thought in the future. Signed-off-by: Eric Myhre <hash@exultant.us>
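The three-valued problem translates directly into Go terms: an `optional Bool` behaves like a `*bool`, which has three states where we only mean two (a hypothetical sketch, not actual schema code):

```go
package main

import "fmt"

// FieldDefn models an `optional Bool` property: nil means undefined,
// otherwise the pointed-to value is true or false.
type FieldDefn struct {
	Optional *bool
}

// describe spells out the three states the type admits.
func describe(f FieldDefn) string {
	switch {
	case f.Optional == nil:
		return "undefined"
	case *f.Optional:
		return "true"
	default:
		return "false" // a state we never actually want to express
	}
}

func main() {
	t, f := true, false
	fmt.Println(describe(FieldDefn{}))             // undefined
	fmt.Println(describe(FieldDefn{Optional: &t})) // true
	fmt.Println(describe(FieldDefn{Optional: &f})) // false
}
```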
-
Eric Myhre authored
(An alternative proposal, with the same example DSL but different choices in schema representation is in a sibling commit, and we'll merge and pick one or the other to continue with shortly.) This take on TypeTerm keeps things flatter and has fewer schema types: the optional and nullable properties are those of the containers in the recursive type (i.e., the map itself, list itself, or the fields in the case of structs). As with the alternative proposal, this seems to work and capture all of the concepts that we want: - the inline definitions are possible; - they recurse correctly - nullability can be inserted in all the correct places; - you *can't* have a "nullable nullable" nesting; - and bonus, the simple TypeName case is still a string repr. So, since all the same key properties exist in both approaches, in some ways, this is six-of-one/half-a-dozen-of-the-other. Both mechanisms of encoding clearly represent the same things. It's just a question of which is nicer to work with, which seems more self-documenting to a human reader, and other intangibles of that sort. In this approach, we have a less rightward-leaning tree when the schema is represented in JSON form; by comparison, the alternative which had "nullable" represented as a union experiences deeper nested objects. Reducing depth-of-tree is hardly a proof of simplicity, but it's usually a good heuristic, so we can count that as a win here. Another comparison we can make is that the number of types used in this diff is 2; the number used in the other approach was 4. We do this at the trade of more fields in existing types, of course, so it's less than perfectly clear cut as a win for simplicity... but I tend towards thinking it is. Especially certainly clearer is that we *don't* have a union type which immediately nests another different representation of union type in this approach!
So, while it's always hard to make absolutist statements about "simplicity", indeed, this seems simpler via several observations. In terms of what other people are doing... One interesting example from another endeavor I recently noticed -- the Atom project emits a JSON file describing their API: https://github.com/atom/atom/releases/download/v1.34.0/atom-api.json In this JSON file, they've taken a tack similar to the one embodied in this commit: the "isOptional" property is inline with the field defn's. So, we can count Atom's choice as effectively a +1 vote for that. (It's perhaps interesting they also have *names* in their field obj... however, the fields here are in an array, which is necessary since they're describing positional arguments. So, that's entirely self-consistent, but also different constraints than we have, and thus while interesting, not quite comparable to our choices on that matter.) Note that the JSON-formatted schema-schema file has not been updated yet. We'll update and finish that after merging and selecting a proposal here. Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 18 Jan, 2019 10 commits
-
-
Eric Myhre authored
This seems to work and capture all of the concepts that we want: - the inline definitions are possible; - they recurse correctly - nullability can be inserted in all the correct places; - you *can't* have a "nullable nullable" nesting; - and bonus, the simple TypeName case is still a string repr. The downsides are: - wow, this *looks* complicated on the schema side; - in part because we have a union-union nest, and conceptually we'd probably simply prefer to unify those, but it's a representational problem (can't clearly combine a keyed and a kinded); - the json repr looks to get deep fast. I wonder if we can do better than this. And also, we shifted the flagging of optional fields to StructRepresentation_Map, then eliding a struct for field properties entirely. This seemed potentially sensible given that we suppose a struct can be represented as a list, in which case optionality of fields won't work the same. However, on further thought, I'm also extremely dubious of this idea: whether or not a field can be traversed and then yield "undefined" is defined by whether it's optional; and I'm really suspecting it's going to create a lot of weirdness to make that feature conditional upon the StructRepresentation union. If anything, we should simply make invalid combinations of optionality and representation of that struct, well, erm, invalid. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Even though we haven't concretely specified TypeTerm yet, it's a pretty fair bet that map and array value types will both use it. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
StructRepresentation also offers at least two options; and in the future, we might anticipate adding more options as well (suppose for example some kind of struct-to-string conversion system, of which there could be many, but imagine for example one like matching a string like the "options" in your /etc/fstab: simple comma separated keys and values should be definable enough, no?). And now we have to wrangle the TypeTerm recursivity to completeness: how shall we concisely and correctly represent inline declarations of map and array kinded types? I've left this a big honking TODO for this commit; it's touchy. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Not having a "set" definition is kind of irritating. I'm wondering if we should propose to make a set kind; and simply transform these to a map with null values as necessary. Possibly it could even be argued that this could make sense at the Data Model level? I'm not sure. Something that should be discussed, though. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
To prove that we can. And now that we've got unions defined, we can actually describe most of the core concepts of "Type" itself, etc. Fixed one bug in the DSL-style schema; the union discriminator keyword for what kind of type a type is should be... "kind". Whew. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
It is indeed redundant with the maps that must necessarily exist in each of the representation union members. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
This is actually a fairly big improvement on what we'd written for the internal implementation in Go. The representation *should* be a union, in general. We'll do something similar for structs, and everything that has multiple options for representation strategy. One FIXME bugaboo: it's now absolutely unclear what to put as keys in the members map, as although there is a key, which one to use depends on the representation kind. It would feel odd to not have the membership be clear before reading through to the representation info, but I guess that is de facto where that information lives, so... Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Added some enums that should be useful, and documented the bejezus out of them. Seems to be going well so far. Documentation for TypeUnion. All the other interesting Type types are still todos, though; and we thus far haven't hoisted ourselves back up to the point where typeterm recursiveness needs to be fleshed out, which is one of the things we already know is going to be Fun. Fixed what's probably a typo with an extra equals sign. I'm not sure why my brain keeps wanting to do that for unions. And one interesting note: though I'm keeping TypeKind in the commit, see the todo comment about its potential irrelevance. It's amusing that we've now reproduced that misstep in parallel; it already happened once before, just in the golang code: TypeKind there was also not needed, because the Type union members already represented it. Signed-off-by: Eric Myhre <hash@exultant.us>
-