Eric Myhre authored
Getting *enough* and sufficiently *organized* corpuses becomes a legitimate challenge. The docs outline some of the directions this will go while describing the naming convention. This naming convention has already been cropping up in an ad-hoc way in recent commits; this is a step towards documenting it consistently. There aren't many entries yet; expect it to grow.

Using JSON as the de facto format is a little aggressive, perhaps, because it creates a fairly wide dependency span. But since we've long had unmarshalling of JSON working, it seems viable in practice. It also means we get the marshalling output target corpuses for free for at least one of our formats, and we can readily compare against stdlib json, which is nice for having baselines to frame comparisons against.

It also has the interesting side effect of making these corpuses immune to change in the face of refactors to NodeBuilder (which will be an absurd concern at any time except... right now). (We'll see. Maybe I'll regret this after some time passes. But if so, this content probably just pivots to still being useful in JSON marshal and unmarshal tests.)

I'd like to put this to work in writing more traversal benchmarks... but that's going to have to wait a few commits, because I've found some import cycles that get very problematic when I try to proceed there, and it looks like they might take a few steps to sort out.
c0cd2769