- 11 Aug, 2019 5 commits
-
-
Eric Myhre authored
It covers the bases on reading, and has a NodeBuilder which works to create new nodes based on the type-level structure. (Quite a few other things about it are incomplete, and it might end up getting merged this way, because the goal here was primarily to scope out if any of our abstractions will shatter badly, and we've now got information on that.) This seems to indicate we are indeed on the right track, regarding some of those comments in the codegeneration work in the previous commit: check out the 'Representation' method. Yep -- we've got multiple nodebuilders in our future. Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
This has been on the clipboard for a while now. When I first wrote it, the idea that we'd need multiple NodeBuilders per typed Node seemed a little staggering; now that the idea has had a while to cook, it seems clear and uncontrovertial, so this diff now looks like far less of a noodle-baker than it originally was. (I had been planning to switch to working on runtime (that is, wrapper-based, non-codegen) typed nodes for a while to validate this idea, and now I have some partial diffs from that to land too, but the validation returned "true", so now I regret having multiple irons in the fire... sigh! Software development is hard.) Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
-
Eric Myhre authored
There's actually a doozy of performance considerations in this commit. Verifying (and then correcting) that comment kicked off a surprisingly deep and demanding binge of research. (Some of the considerations are still only "considerations", unfortunately -- one key discovery is that (surprise!) conclusive choices require more info than microbenchmarks alone can yield.) First, the big picture: One of the things we need to be really careful about throughout a system like go-ipld-prime (where we're dealing with large amounts of serialization) is the cost of garbage collection. Since we're often inside of an application's "tight loop" or "hot loop" or whatever you prefer to call it, if we lean on the garbage collector too heavily... it's very, very likely to show up a system-wide impact. So, in essence, we want to call "malloc" less. This isn't always easy. Sometimes it's downright impossible: we're building large tree structures; it's flatly impossible to do this without allocating some memory. In other cases, there are avoidable things: and in particular, one common undesirable source of allocations comes from "autoboxing" around interfaces. (More specifically, the name of the enemy here will often show up on profiling reports as "runtime.convT2I".) Sometimes this one can be avoided; other times, not. Now, a more detailed picture: There are actually several functions in the runtime that relate to memory allocation and garbage collection, and the less we use any of them, the better; but also, they are not all created equal. These are the functions that are of interest: - runtime.convT2I / runtime.convT2E - runtime.newObject - runtime.writeBarrier / runtime.gcWriteBarrier - runtime.convTstring / etc Most of these functions call `runtime.mallocgc` internally, which is why they're worth of note. (writeBarrier/gcWriteBarrier are also noteworthy, but are different beasts.) Different kinds of implementations of something like `justString` will cause the generated assembly to contain calls to various combinations of these runtime functions when they're converted into a `Node`. These are the variations considered: - Variation 1: `type justString string`: results in `runtime.convTstring`. - Variation 2: `type justString struct { x string }`: results in `runtime.convT2I`. - Variation 3: as above, but returning `&justString{val}`: results in `runtime.newobject` *and* its friends `runtime.writeBarrier` and `runtime.gcWriteBarrier`. The actual performance of these... it depends. In microbenchmarks, I've found the above examples are roughly: - Variation 1: 23.9 ns/op 16 B/op 1 allocs/op - Variation 2: 31.1 ns/op 16 B/op 1 allocs/op - Variation 3: 23.0 ns/op 16 B/op 1 allocs/op So, a couple things about this surprised me; and a couple things I'm still pretty sure are just microbenchmarks being misleading. First of all: *all* of these call out to `mallocgc` internally. And so we see an alloc-per-op reported in all three of them. (This actually kinda bugs me, because I feel like we should be able to fit all the requisite information in the first case directly into the interface, which is, if I understand correctly, already always two words; and arguably, the compiler might be smart enough to do this in the second case as well. But I'm wrong about this, for whatever reason, so let's just accept this one and move along.) But they vary in time. Why? Variation 2 seems to stand out as slower. Interestingly, it turns out `convT2E` and `convT2I` are extra problematic because they involve a call of `typedmemmove` internally -- as a comment in the source says, there's both an allocation, a zeroing, and then a copy here (at least, as of go1.12); this is a big bummer. In addition, even before getting that deep, if you check out the disassembly of just our functions: for our second variation, as inlined into our microbenchmark, there are 9 instructions, plus 1 'CALL'; vs only 3+1 for the first variation. This memmove and extra instructions seems to be the explainer for why our second variation (`struct{string}`) is significantly (~8ns) slower. (And here I thought variation two would do well! A struct with one field is the same size as the field itself; a string is one word of pointer; and an interface has another word for type; and that's our two words, so it should all pack, and on the stack! Alas: no.) Now back up to Variation 1 (just a typedef of a string): this one invokes `runtime.convTstring`, and while that does invoke `mallocgc`, there's a detail about how that's interesting: it does it with an ask for a small number of bytes. Specifically, it asks for... well, `unsafe.Sizeof(string)`, so that varies by platform, but it's "tiny". What's "tiny" mean? `mallocgc` has a specific definition of this, and you can see it by grepping the runtime package source for "maxTinySize": it's 16 bytes. Things under this size get special treatment from a "tiny allocator"; this seems to be why `runtime.convTstring` is relatively advantaged. (You can see benchmarks relating to this in the runtime package itself: try `go test -run=x -bench=Malloc runtime`. There's a *huge* cliff between MallocLargeStruct versus the rest of its fellows.) Variation 3 also appears competitive. This one surprises me, and this is where I still feel like microbenchmarks must be hoodwinking. The use of `runtime.newobject` seems to hit the same corners as `runtime.convTstring` at runtime in our situation here: it's "tiny"; that's neat. More confusingly, though, `runtime.writeBarrier` and `runtime.gcWriteBarrier` *should* be (potentially) very high cost calls. And for some reason, they're not. This particular workload in the microbenchmark must just-so-happen to tickle things in such a way that these calls are free (literally; within noise levels), and I suspect that's a happy coincidence in the benchmark that won't at all hold in real usage -- as any amount of real memory contention appears, the costs of these gc-related calls can be expected to rise. I did a few more permutations upon Variations 2 and 3, just out of curiosity and for future reference, adding extra fields to see if any interesting step functions revel themselves. Here's what I found: - {str,int,int,int,int} is 48 bytes; &that allocs the same amount; in speed, & is faster; 33ns vs 42ns. - {str,int,int,int} is 48 bytes; &that allocs the same amount; in speed, & is faster; 32ns vs 42ns. - {str,int,int} is 32 bytes; &that allocs the same amount; in speed, & is faster; 32ns vs 39ns. - {str,int} is 32 bytes; &that allocs the same amount; in speed, & is faster; 31ns vs 38ns. - {str} is 16 bytes; &that allocs the same amount; in speed, & is faster; 24ns vs vs 32ns. Both rise in time cost as the struct grows, but the non-pointer variant grows faster, and it experiences a larger step of increase each time the size changes (which in turn steps because of alignment). The &{str} case is noticeably faster than the apparently linear progression that starts upon adding a second field; since we see the number 16 involved, it seems likely that this is the influence of the "tiny allocator" in action, and the rest of the values are linear relative to each other because they're all over the hump where the tiny allocator special path disengages. (One last note: there's also a condition about "noscan" which toggles the "tiny allocator", and I don't fully understand this detail. I'd have thought strings might count as a pointer, which would cause our Variation 3 to not pass the `t.kind&kindNoPointers` check; but the performance cliff observation described in the previous paragraph seems to empirically say we're not getting kicked out by "noscan". (Either that or there's some yet-other phenomenon I haven't sussed.)) (Okay, one even-laster note: in the course of diving around in the runtime malloc code, I found an interesting comment about using memory "from the P's arena" -- "P" being one of the letters used when talking about goroutine internals -- and I wonder if that contributes to our little mystery about how the `gcWriteBarrier` method seems so oddly low-cost in these microbenchmarks: perhaps per-thread arenas combined with lack of concurrency in the benchmark combined with quickly- and sequentially-freed allocations means any gcWriteBarrier is essentially reduced to nil. However, this is just a theory, and I won't claim to totally understand the implications of this; commenting on it here mostly to serve as a pointer to future reading.) --- Okay. So what comes of all this? - I have two choices: attempt to proceed further down a rabbithole of microbenchmarking and assembly-splunking (and next, I think, patching debug printfs into the compiler and runtime)... or, I can see that last one as a step too far for today, pull up, commit this, and return to this subject when there's better, less-toy usecases to test with. I think the latter is going to be more productive. - I'm going to use the castable variation here (Variation 1). This won't always be the correct choice: it only flies here because strings are immutable anyway, and because it's a generic storage implementation rather than having any possibility of additional constraints, adjuncts from the schema system, validators, etc; and therefore, I don't actually care if it's possible to cast things directly in and out of this type (since doing so can't break tree immutability, and it can't break any of those other contracts because there aren't any). - A dev readme file appears. It discusses what choices we might make for other cases in the future. It varies by go native kind; and may be different for codegen'd types vs general storage implementations. - Someday, I'd like to look at this even further. I have a persistent, nagging suspicion that it should be possible to make more steps in the direction of "zero cost abstractions" in this vicinity. However, such improvements would seem to be pretty deep in the compiler and runtime. Someday, perhaps; but today... I started this commit in search of a simple diff to a comment! Time to reel it in. Whew. Signed-off-by: Eric Myhre <hash@exultant.us>
-
- 08 Aug, 2019 5 commits
-
-
Eric Myhre authored
This is meant to be about the cheapest implementation of Node possible. It's important to have this given how often we need string nodes. In the example of the moment: in writing some typed.Node systems, I need some intermediate string nodes just to nudge around map processes. Those nodes need to be handy and their implementation must be cheap. It's also useful that we can have a constructor which doesn't return errors, so it's easier to chain. The NodeBuilder features are restrained to needing to be able to yield errors for interface satisfaction purposes, in turn needed for more complex features. But when building a specific node, there's no reason we need to go through NodeBuilder at all. I'm putting this in the ipldfree package at the moment. I wouldn't mind putting a constructor for this in the root ipld package as well, but that would either mean a cyclic import or putting this implementation as a blessed one in the main package. (Putting this as a blessed default implementation in the main package is possible, and can be considered later, but is a "not today" thing; having these implementations in a subpackage is a good forcing-function for development to make sure we don't accidentally leave any important things unexported in the main package.)
-
Eric Myhre authored
Fix interaction between ExploreRecursive and ExploreUnion
-
Eric Myhre authored
More complete selector traversal
-
Eric Myhre authored
Builder for selectors
-
Eric Myhre authored
Feat/recursive selectors
-
- 06 Aug, 2019 11 commits
-
-
hannahhoward authored
Fixes a bug where ExploreRecursive would not properly replace ExploreRecursiveEdge (and cause a panic when traversed) if ExploreUnion was part of the recursion tree.
-
hannahhoward authored
Properly record last block traversed
-
hannahhoward authored
refactoring functions in traverseInformatively to simply and optimize logic and also minimize inner loop function calls
-
hannahhoward authored
Improve the traverse function so that it only explores relevant fields for low cardinality selectors and traverses high cardinality selectors for lists correctly
-
hannahhoward authored
Add link loading to traverse informatively so that you can traverse across links
-
hannahhoward authored
Decoding of nodes that are just simple values was erroring for both CBOR & JSON. This fixes the issue as well as fixing it in the general unmarshal.
-
hannahhoward authored
Extract the selector buidler to a seperate package
-
hannahhoward authored
Adds a builder class to quickly assemble selectors nodes, using a DSL that enforces the underlying schema. This allows quick assembly of selector nodes. This will likely be removed once IPLD Schema code gen is finished, but for the time being, it provides a quick way to ensure mostly typesafe selector nodes.
-
hannahhoward authored
Rename SelectorContext to ParsedParent (more accurate) and put in a ParseContext state object which is tracked w/o var args
-
hannahhoward authored
Support validating recursive selectors to match ExploreRecursive with ExploreRecursiveEdge
-
hannahhoward authored
An initial implementation of recursive selectors that offers a way to traverse them. Not covered is validating ExploreRecursiveEdge being under an ExploreRecursive selector
-
- 05 Aug, 2019 4 commits
-
-
Eric Myhre authored
-
Eric Myhre authored
Mind, I don't know if all the details of this have been worked out. In particular, where to get the LinkStorer and other fun stuff passed through this interface might be a fun dance (if, indeed, such a thing ends up needed -- this hasn't been written yet!). We'll see. One thing is clear: we want this contract here for all the same already-known reasons: a traversal.Transform over a structure using an ADL should "just work". Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Prompted in part by https://github.com/ipld/go-ipld-prime/pull/21#issuecomment-518115245 . Turns out some of the stuff our earliest Node implementations do are actually pretty unusual. Many other features have more strictures. Therefore, it's important to specify the behavior such that the features that have to operate within more restrictions can still comply, and it's clear that anything else shouldn't be expected! (Perhaps we should just... make the ipldfree.Node implementations not do the flexible things they do right now. That package is starting to look in need of some renovation anyway. We'll see.) Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Schema and codegen fun
-
- 29 Jul, 2019 1 commit
-
-
Eric Myhre authored
fix(encoding): match go-ipld-cbor cid encoding
-
- 27 Jul, 2019 1 commit
-
-
hannahhoward authored
go-ipld-cbor encodes a 0 byte ahead of cid bytes for multibase support - to maintain compatibility of nodes encoded to CBOR, add this feature
-
- 21 Jul, 2019 1 commit
-
-
Eric Myhre authored
This one was particularly aimed at making sure I hit all the hardest situations as trying to write the generators for NodeBuilders. It certainly did so! I found enough conceptual/design work to do that I'm actually going to park this here for a while, probably land this branch back to master before it gets any longer, and then approach things from some other angles for a while. In particular: We're going to need several *different* NodeBuilders for schema-typed nodes. The astute reader probably already noticed this, as it's implied by the `typed.Node{}.Representation() Node` method: that adds a second Node instance, and that means a second NodeBuilder. What's a little more fun yet is there might be more than two: for example, if we want the serial mode to support variation in field order for structs, that's another branch we need to account for somewhere... and, ordered vs unordered modes might need different structures of memory to do their job efficiently, so that means wholly different implementations of the interface. Whee. So. I'm thinking I might switch tack to knocking around in the (non-codegen) typed Node package for a while and making sure nothing looks wildly amiss over there while exploring these same features Then, when coming back here: looks like we're due for that breakdown of `EmitNodeBuilder{Detail...}` methods: and perhaps that's actually going to belong on a new type rather than glommed on flatly to `typeGenerator`... since there may be a n:1 relationship there!
-
- 20 Jul, 2019 8 commits
-
-
Eric Myhre authored
A bit fun; wish there was a way to compress that block of ifdefs about optional vs nullable stuff, but so far nothing clearer has emerged. Added ErrIteratorOverread while at it. Should refactoring other node types to use it too, I suppose. But perhaps we'll just accumlate those todos for a while, since there's already one other error refactor with a chain of blocks in front of it.
-
Eric Myhre authored
The comment in the ipld.Node type about IsUndefined has been there quite a while; today we're finally using it. With this set of changes, the generated code for structs now compiles successfully again: the Undef and Null thunks are both used there.
-
Eric Myhre authored
The latter isn't used yet; that's not the main goal of this branch (and I think I'd rather do the refactor to use ipld.ErrNotExists after moving the PathSegment types from the selector package into the root... which in turn I'm not going to do while some of the other current work on Selectors is in progress in other branches. So! Latur.); I just wanted it around for documentation and clarity of intent. This also adds a couple of other dummy references to the 'typed' package. I know keeping a set of packages-used-in-this-file is a common step in the development of codegen systems targetting golang, but I'm going to try to avoid it until the need is really forced.
-
Eric Myhre authored
Generates the struct itself (including the variations of pointers and/or extra fields necessary), There are several major TODOs outstanding; and the generated code doesn't *quite* compile due to several missing references, namely: - ipld.ErrNoSuchField - ipld.Nil - ipld.Undefined The other issues remaining: - The reason ErrNoSuchField is missing is because it needs to be decided where an error like that should go. Is it meaningfully distinct from a general "key absent" error, like a map can return? If it is indeed a distinct thing, does it go in the schema package, or will that just make things annoying until the end of time since every occurence-site-sibling error is in the ipld main package? - We need to think about typed nil / typed undefined. I suspect they are an idea to be avoided (but doing so means using the NodeBuilder() method on the nil or undefined values from a struct will *not* "DTRT", which may be problematic; this can in turn be avoided by "knowing" you're working on a typed node, but, erm. It turns into a question of which of these things is less annoying). - We'll need to backtrack up to the main ipld.Node interface and add that 'IsUndefined() bool' method now! (Or, take a radical approach of using a magic const for that. But... no, that doesn't sound fun.) - Haven't even touched the NodeBuilder yet. (And the typeGenerator interface needs to have those bits broken down, as well, so that we can do good stuff with error path codesharing as with funcs for Node.) - Haven't filled in MapIterator. But that's probably just easy chug. Lots to do, in short. I just need a checkpoint. Note how we've had this on the clipboard for a while already, heh -- the info_test.go file and TestUnderstandingStructMemoryLayout have been aimed at this; notice how any extra bool fields are generated at the bottom of the emitted struct. Many yaks have been shaved in the meanwhile, and yet look at how many we have to go. Isn't this fun? Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Signed-off-by: Eric Myhre <hash@exultant.us>
-
Eric Myhre authored
Fields in these structs are now named. Embedding is reserved for things that are very explicitly supposed to carry their methods along. At some earlier point, I embedded the Type field in some of the generate structs because it saved me a few keystrokes in refactoring template strings. This was not important. Today I learned: if you have embeddings which cause a name to be shadowed -- as the generateKindedRejections_String.Type field could shadow the generateKindString.Type field, for example -- if those resolve "neutrally" (I don't know if there's a better technical term for this), then... in that case, the templating system will be like "yeah, fine, whatev", and it works. If you give one of those two fields a *separate type*, the templating system will now reject it as if no field of that name can be found. I encountered this while working on the generator for structs (which I *swear* I will commit any second now, when I stop finding prerequisite yaks to shave -- there have been many) and trying to use a more specialized TypeStruct instead of the Type interface. Whee. I get the logic of this. The error message could be better. Anyway: even though it technically didn't affect generateString (yet), making the change here for consistency with what's to come (including specializing the field even though there's currently no actual need for it).
-
Eric Myhre authored
This isn't a net negative size diff yet, but it certainly will have that effect momentarily when we add a generator for another kind. More importantly, it reduces the set of functions in the real generator file to *just* the relevant ones. We'll almost certainly extend this to cover the NodeBuilder half of things as well, later; I just haven't gotten there yet.
-
Eric Myhre authored
I wanted this for use in codegen work: in particular, as part of generating error messages. Note the comment on the case for unions. I still go back and forth on this topic; see some recent previous commit messages on this branch. (I think I'm slowing coming around to "unions act like maps", and it's just that most library code chooses to jump over them transparently, because that's the least-odd path for resolving NodeBuilder questions; but it remains the case that none of this is settled for sure yet.)
-
- 18 Jul, 2019 3 commits
-
-
Hannah Howard authored
Update to selector spec, minus recursive selectors
-
hannahhoward authored
update comment to clearly indicate return value correctly
-
hannahhoward authored
update the structure for ExploreIndex & ExploreRange to be slightly more clear and optimized
-
- 16 Jul, 2019 1 commit
-
-
hannahhoward authored
Adds unit test of each type of selector to check parsing and functionality. Also includes some fixes discovered while creating unit tests
-