1. 16 Aug, 2021 1 commit
  2. 29 Jul, 2021 1 commit
  3. 24 Jan, 2021 1 commit
    • schema: working to unify interfaces and dmt. Intermediate checkpoint commit. · f1859e77
      Eric Myhre authored
      This commit does not pass CI or even fully compile, and while I usually
      try to avoid those, A) I need a checkpoint!, and B) I think this one is
      interestingly illustrative, and I'll probably want to refer to this
      diff and the one that will follow it in the future as part of
      architecture design records (or even possibly experience reports about
      golang syntax).
      
      In this commit: we have three packages:
      
      - schema: full of interfaces (and only interfaces)
      - schema/compiler: creates values matching schema interfaces
      - schema/dmt: contains codegen'd types that parse schema documents.
      
      The dmt package feeds data to the compiler package, and the compiler
      package emits values matching the schema interface.
      This all works very nicely and avoids import cycles.
      
      (Avoiding import cycles has been nontrivial, here, unfortunately.
      The schema/schema2 package (which is still present in this commit,
      but will be removed shortly -- I've scraped most of it over into
      this new 'compiler' package already, just not a bunch of the validation
      rules stuff, yet) was a dream of making this all work by just having
      thin wrapper types around the dmt types.  This didn't fly...
      because codegen'd nodes comply with `schema.TypedNode`, and complying
      with `schema.TypedNode` means they have a function which references
      `schema.Type`... and that means we really must depend on that
      interface and the package it's in.  Ooof.)
      
      The big downer with this state, and why things are currently
      non-compiling at this checkpoint I've made here, is that we have to
      replicate a *lot* of methods into single-use interfaces in the schema
      package for this to work.  This belies the meaning of "interface".
      The reason we'd do this -- the reason to split 'compiler' into its own
      package -- is mostly because I wanted to keep all the constructor
      mechanisms for schema values out of the direct path of the user's eye,
      because most users shouldn't be using the compiler directly at all.
      
      But... I'm shifting to thinking this attempt to segregate the compiler
      details isn't worth it.  A whole separate package costs too much.
      Most concretely, it would make it impossible to make the `schema.Type`
      interface "closed" (e.g. by having an unexported method), and I think
      at that point we would be straying quite far from desired semantics.
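
A minimal sketch of what a "closed" interface means in Go, assuming a simplified, hypothetical `Type` interface (the real `schema.Type` has more methods). The unexported method can only be implemented by types declared in the same package, which is why moving the constructors out into a separate 'compiler' package would conflict with closing the interface.

```go
package main

import "fmt"

// Type is a hypothetical stand-in for schema.Type.
// Because sealed() is unexported, other packages cannot implement Type
// directly (short of embedding an existing implementation).
type Type interface {
	Name() string
	sealed()
}

// TypeString and TypeInt are illustrative implementations only.
type TypeString struct{ name string }

func (t TypeString) Name() string { return t.name }
func (TypeString) sealed()        {}

type TypeInt struct{ name string }

func (t TypeInt) Name() string { return t.name }
func (TypeInt) sealed()        {}

// describe can enumerate every implementation, since the set is closed.
func describe(t Type) string {
	switch t.(type) {
	case TypeString:
		return t.Name() + " is a string type"
	case TypeInt:
		return t.Name() + " is an int type"
	default:
		return t.Name() + " is an unknown type"
	}
}

func main() {
	fmt.Println(describe(TypeString{name: "Name"}))
	fmt.Println(describe(TypeInt{name: "Age"}))
}
```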
  4. 03 Jan, 2021 1 commit
  5. 25 Dec, 2020 1 commit
    • all: rename schema.Kind to TypeKind, ipld.ReprKind to Kind · 2d7d25c4
      Daniel Martí authored
      As discussed on the issue thread, ipld.Kind and schema.TypeKind are more
      intuitive, closer to the spec wording, and just generally better in the
      long run.
      
      The changes are almost entirely automated via the commands below. Very
      minor changes were needed in some of the generators, and then gofmt.
      
      	sed -ri 's/\<Kind\(\)/TypeKind()/g' **/*.go
      	git checkout fluent # since it uses reflect.Value.Kind
      
      	sed -ri 's/\<Kind_/TypeKind_/g' **/*.go
      	sed -i 's/\<Kind\>/TypeKind/g' **/*.go
      	sed -i 's/ReprKind/Kind/g' **/*.go
      
      Plus manually undoing a few renames, as per Eric's review.
      
      Fixes #94.
  6. 04 Aug, 2020 1 commit
  7. 30 Jul, 2020 1 commit
  8. 14 Jul, 2020 1 commit
  9. 09 Jul, 2020 2 commits
    • 08391195
    • Refactor the schema.Type info to support cycles. · 47abab45
      Eric Myhre authored
      This is full of mundane plonking away at adding stars and ampersands,
      of which approximately none are interesting.
      
      (I'm particularly frustrated by this because these are all placeholder
      types, and we're getting *very* close to replacing them, as we get
      closer and closer to self-hosting... at which point all of this bonking
      about will be made totally irrelevant.  And yet to close the last mile,
      these "small" fixes are surprisingly irritating.  Ah well.)
      
      The bits that *are* interesting:
      
      - the Spawn functions for type info now take strings rather than types
      (so that they don't provoke a cycle problem for the user when
      constructing the information to describe cyclic type info);
      
      - all of the Type info structures hold a pointer to the TypeSystem, and
      use that to look up reified Type info for related types, so that their
      methods don't force the caller to do that themselves.
      
      (The TypeSystem pointer was already there, amusingly; just never before
      initialized, because it hadn't turned out to be load-bearing yet.)
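
The two points above can be sketched as follows. Every name here (`FieldSpec`, `SpawnStruct`, `Accumulate`, and the field layout) is a hypothetical simplification for illustration, not the package's real API: fields are declared by type *name*, and the reified `Type` is resolved later through the `TypeSystem` pointer, so a type can refer to itself.

```go
package main

import "fmt"

// All names below are hypothetical stand-ins, not the library's real API.

type TypeName string

type Type interface {
	Name() TypeName
}

// TypeSystem maps names to reified type info.
type TypeSystem struct {
	named map[TypeName]Type
}

func NewTypeSystem() *TypeSystem {
	return &TypeSystem{named: make(map[TypeName]Type)}
}

func (ts *TypeSystem) Accumulate(t Type) { ts.named[t.Name()] = t }

// FieldSpec describes a field by the name of its type, so a struct can
// mention itself or a type that has not been constructed yet.
type FieldSpec struct {
	Name string
	Type TypeName
}

type StructField struct {
	name     string
	typeName TypeName
	parent   *TypeStruct
}

// Type resolves the field's type lazily through the parent's TypeSystem.
// It returns nil if that name has not been registered.
func (f StructField) Type() Type { return f.parent.ts.named[f.typeName] }

type TypeStruct struct {
	name   TypeName
	fields []StructField
	ts     *TypeSystem
}

func (t *TypeStruct) Name() TypeName        { return t.name }
func (t *TypeStruct) Fields() []StructField { return t.fields }

// SpawnStruct takes strings rather than Type values, avoiding the cycle.
func SpawnStruct(ts *TypeSystem, name TypeName, specs []FieldSpec) *TypeStruct {
	t := &TypeStruct{name: name, ts: ts}
	for _, s := range specs {
		t.fields = append(t.fields, StructField{name: s.Name, typeName: s.Type, parent: t})
	}
	return t
}

func main() {
	ts := NewTypeSystem()
	// A self-referential type: Node has a field "next" of type Node.
	node := SpawnStruct(ts, "Node", []FieldSpec{{Name: "next", Type: "Node"}})
	ts.Accumulate(node)
	fmt.Println(node.Fields()[0].Type().Name())
}
```

The caller never has to look anything up in the `TypeSystem` themselves; `StructField.Type` does it on their behalf.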
      
      It also would've been possible to just change all the methods on the
      Type types to return TypeName rather than full Type info.  That would
      avoid the need to use a TypeSystem pointer.  I didn't because:
      
      Overall, this was done in such a way as to minimize the diff that
      impacts within the templates.  This was a goal because updating
      templates is a fair bit more work than other code due to the weak
      compiler support.  And we'll end up reviewing and changing these
      methods when we get to our goal of self-hosting generation of the
      schema types from the schema-schema, so, it's not worth pushing around
      diffs in that same area when they'd be sure to be churned under soon.
  10. 04 Jul, 2020 1 commit
    • Union generation complete, and keyed repr; tests; and they pass. · f62d9445
      Eric Myhre authored
      I can't quite claim tests passed on the *first* shot... but,
      the first shot after mostly syntactical (rather than semantic) fixes?
      Yeah, actually.  That was pretty fun.
      
      Snuck in a bit of DRY'ing up.  The repetition in BeginMap methods
      got to me, and was low hanging fruit, so I extracted that from unions
      and also backported it to structs.
      
      Errors got some work in this commit, because it turns out I've
      straitjacketed myself by not allowing "fmt.Errorf" due to imports.
      There's a lot to do there, and I only tackled what was directly
      critical to get this commit about unions across the finish line,
      but there's a few remarks in comments about where to go next.
      
      Some more comments about future work on the type info holder types
      also appear; I'm starting to skid on those placeholders and their
      issues more and more.  I really hope we can get to replace those
      sooner than later.
      
      And... also, yes, the idea of not having a "focus" state field in
      assemblers really bit it, hard, as speculated in the previous
      commit message.  I ended up using 'ca' in more switches than I
      expected, simply because it's easier to use that than have the
      conditional templating branches that would be necessary to use the other
      tagging mechanisms that do also have sufficient info.
      
      One big fixme in the core interfaces for nodebuilders (wince):
      the ValuePrototype method can error sometimes, and that hasn't been
      accounted for.  Need to make a decision about what to do there.
      It's not really an exercised path in practice, but it shouldn't
      contain caltrops, regardless of how frequently used it is.
      (This probably would've come up earlier, except there's a bunch of
      stubs about ValuePrototype in other parts of codegen already; all of
      them need backfill, but haven't yet made it to top of the todo heap.)
      
      But despite all the fixmes accumulated, this does bring unions
      to be a usable thing, and definitively proves out that the design
      still works, even for what turns out to be one of the most complicated
      parts of the schema system!  It's very, *very* exciting to add the
      checkmarks to this part of the feature table -- it's one of the places
      I most feared "unknown unknowns", now it's put to bed.  Woot!
  11. 02 Jul, 2020 2 commits
    • Fix up the type structures for unions. · b60e5f7e
      Eric Myhre authored
      Also make some of the placeholder construction functions.
      (Not all of them.  Just the ones I intend to use first.)
      
      There was some very unfinished placeholder stuff going on there.
      (And ironically, it wasn't written in a very clear union-like way;
      it gets a lot less icky when it's rewritten so that it's uniony.)
    • Rename s/anyType/typeBase/. Internal. · 486a5dc8
      Eric Myhre authored
      The appearance of the word "any" there has started to perturb me;
      "any" is a concept that we also need to describe in schemas, and that
      type has nothing to do with it.  It's more of a base mixin.
  12. 19 Apr, 2020 1 commit
  13. 13 Apr, 2020 1 commit
    • Support for structs with stringjoin representation! · 2e79a3ea
      Eric Myhre authored
      This diff is pretty fun.  It's our first alternative representation,
      and it all works out nicely (no internal syntactical complications
      encountered or any other unexpected unpleasantness).
      
      It's also our first representation that involves potentially recursive
      destructuring work over a scalar!  So, you can see the new 'construct'
      method conventions appearing to handle that.  Works out neatly.
      This kind of recursive destructuring happens all within the span of
      processing one "token" so to speak, so naturally it can work without
      any stateful machinery.
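
A hand-written sketch of the idea behind a stringjoin representation, for illustration only: the types `Pair` and `Route`, their delimiters, and the function names are all assumptions, and this is not the generated code or its 'construct' method convention. It shows how a nested struct can be destructured recursively from one scalar string without keeping any state between calls.

```go
package main

import (
	"fmt"
	"strings"
)

// Pair and Route are hypothetical example types.
// Pair is represented as "A:B"; Route is represented as "From/To".
type Pair struct{ A, B string }
type Route struct{ From, To Pair }

func (p Pair) Repr() string  { return p.A + ":" + p.B }
func (r Route) Repr() string { return r.From.Repr() + "/" + r.To.Repr() }

// splitTwo splits s on sep and insists on exactly two parts.
func splitTwo(s, sep string) (string, string, error) {
	parts := strings.Split(s, sep)
	if len(parts) != 2 {
		return "", "", fmt.Errorf("expected 2 parts separated by %q, got %d", sep, len(parts))
	}
	return parts[0], parts[1], nil
}

func ParsePair(s string) (Pair, error) {
	a, b, err := splitTwo(s, ":")
	if err != nil {
		return Pair{}, err
	}
	return Pair{A: a, B: b}, nil
}

// ParseRoute destructures the outer string, then recurses into each half.
func ParseRoute(s string) (Route, error) {
	from, to, err := splitTwo(s, "/")
	if err != nil {
		return Route{}, err
	}
	f, err := ParsePair(from)
	if err != nil {
		return Route{}, err
	}
	t, err := ParsePair(to)
	if err != nil {
		return Route{}, err
	}
	return Route{From: f, To: t}, nil
}

func main() {
	r, err := ParseRoute("a:b/c:d")
	fmt.Println(r, err)
	fmt.Println(r.Repr())
}
```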
      
      Some utility functions are added to the mixins package.
      (It sort of betrays the name of the package, but, well,
      it's an extremely expedient choice.  See comments about import
      management.  tl;dr: consistency makes things easier.)
      
      Tests for a recursive case of this will be coming soon, but requires
      addressing some other design flaws with the AdjunctConfig maps.
      (StructField is... not a good key.  It's not always hashable... such
      as when the StructField.Type field is inhabited by TypeStruct.)
      Easy enough to fix, just not going to cram it into this commit.
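
To illustrate the hashability problem, here is a sketch under the assumption that `TypeStruct` is stored by value and contains a slice; the type definitions are simplified stand-ins, not the real ones. A struct with an interface field compiles fine as a map key, but Go panics at runtime when the interface holds a value of an unhashable type.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the schema types.
type Type interface{ Name() string }

type TypeString struct{ name string }

func (t TypeString) Name() string { return t.name }

// TypeStruct holds a slice, so TypeStruct values cannot be hashed.
type TypeStruct struct {
	name   string
	fields []StructField
}

func (t TypeStruct) Name() string { return t.name }

type StructField struct {
	Name string
	Type Type
}

// tryInsert stores v under k and converts the runtime panic caused by an
// unhashable key into an ordinary error.
func tryInsert(m map[StructField]string, k StructField, v string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("cannot use field %q as a map key: %v", k.Name, r)
		}
	}()
	m[k] = v
	return nil
}

func main() {
	cfg := make(map[StructField]string)
	fmt.Println(tryInsert(cfg, StructField{Name: "a", Type: TypeString{name: "String"}}, "ok"))
	fmt.Println(tryInsert(cfg, StructField{Name: "b", Type: TypeStruct{name: "Inner"}}, "boom"))
}
```

Keying such a map by something always comparable (a type name plus field name, for instance) would sidestep the panic; that is one possible fix, not necessarily the one adopted.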
      
      The way null and the 'fcb' are handled here differs from previous
      generators: here, we *always* have an 'fcb', and just make a special
      one for the root builder.  This is... well, it works.  It was an
      attempt to simplify things and generate fewer method overrides, and
      it did.  However, I'm not going to dwell on this too long, because...
      
      The 'fcb' system is not working out well overall for other reasons:
      it's causing costly allocations.  (Taking a function pointer from
      a method requires at least two words: one for the function pointer
      and one for the value to apply it on.  That means an allocation.)
      So a serious rewrite of anything involving the 'fcb' strategy is needed.
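
A small way to observe the allocation cost described above, using `testing.AllocsPerRun` from the standard library. The `assembler` type and `finish` method are hypothetical; the point is only that storing a bound method value where it escapes needs a heap-allocated closure holding the receiver, while a direct call does not.

```go
package main

import (
	"fmt"
	"testing"
)

// assembler is a hypothetical type standing in for a generated assembler.
type assembler struct{ state int }

func (a *assembler) finish() { a.state++ }

// sink is package-level so the stored func value must escape to the heap.
var sink func()

// allocsForMethodValue measures allocations when binding a.finish to a
// func value: the closure must carry both the code pointer and the receiver.
func allocsForMethodValue() float64 {
	a := &assembler{}
	return testing.AllocsPerRun(1000, func() {
		sink = a.finish
	})
}

// allocsForDirectCall measures allocations for a plain method call.
func allocsForDirectCall() float64 {
	a := &assembler{}
	return testing.AllocsPerRun(1000, func() {
		a.finish()
	})
}

func main() {
	fmt.Println("method value allocs per run:", allocsForMethodValue())
	fmt.Println("direct call allocs per run: ", allocsForDirectCall())
}
```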
      
      And that is going to be a little tricky.  The 'fcb' idea was itself
      a trying-slightly-too-hard-to-be-clever attempt to avoid an alternative
      solution that involves generating additional types per distinct child
      value slot (and I'm still unwilling to pursue that one -- it suggests
      far too many types).  The next best idea I have now seems to involve a
      lot of pointers into other assembler's state machines; this will surely
      be a bit touchy.  But no worse than any of the rest of all this, I
      suppose: it's *all* "a bit touchy".  Buckle up for fun diffs ahead.
      
      So, considering these 'fcb' notes, this feels a bit mixed...
      "Two steps forward, one step back", so to speak.  Still: progress!
      And it's very exciting that we got through several new categories of
      behavior without hitting any other new roadbumps.
  14. 10 Apr, 2020 1 commit
    • Emit multiple packages in codegen tests. Exercise as plugins. · 616051d2
      Eric Myhre authored
      Using golang's plugin feature, we can... well, *do* this.
      
      To date, testing codegen has involved running the "test" in the gen
      package to get it to emit code; and then switching to the emitted
      package and _manually_ running the tests there.
      
      Now, running `go test` in the gen package is sufficient to do
      *everything*: both the generation, and the result compiling,
      and we can even write tests against the interfaces and run those,
      all in one step.
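
A rough sketch of the loading half of that flow, assuming the generated package has already been compiled with `go build -buildmode=plugin`. The `Node` interface, the exported symbol name `NewString`, and its signature are hypothetical; only `plugin.Open` and `Plugin.Lookup` are real standard-library calls. The constructor returns `interface{}` so that the test side can assert the result against its own interface.

```go
package main

import (
	"fmt"
	"os"
	"plugin"
)

// Node is a hypothetical, cut-down interface that tests are written against.
type Node interface {
	AsString() (string, error)
}

// loadConstructor opens a compiled plugin and looks up an exported function.
// The symbol name and its signature are assumptions made for this sketch.
func loadConstructor(path, symbol string) (func(string) interface{}, error) {
	p, err := plugin.Open(path)
	if err != nil {
		return nil, fmt.Errorf("opening plugin %s: %w", path, err)
	}
	sym, err := p.Lookup(symbol)
	if err != nil {
		return nil, fmt.Errorf("looking up %s: %w", symbol, err)
	}
	ctor, ok := sym.(func(string) interface{})
	if !ok {
		return nil, fmt.Errorf("symbol %s has unexpected type %T", symbol, sym)
	}
	return ctor, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: pass the path to a compiled plugin (.so)")
		return
	}
	ctor, err := loadConstructor(os.Args[1], "NewString")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The generated type only needs the right method set to satisfy Node.
	n, ok := ctor("hello").(Node)
	if !ok {
		fmt.Println("error: generated value does not implement Node")
		return
	}
	fmt.Println(n.AsString())
}
```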
      
      There's also lots of stuff that becomes possible now that we can easily
      generate multiple separate packages with various codegen outputs:
      
      - Overall: everything is granular.  We can test selections of things,
        rather than needing to have everything fall into place at once.
      - Generally more organized results.
      - We can more easily inspect the size of generated code.
      - We can more easily inspect the size of the compiled result of gen!
        (Okay, not really.  I'm seeing a 4MB '.so' file coming out
        from 200sloc of "String".  I don't think that's exactly informative.
        Some constant factor is thoroughly obscuring the data of interest.
        Nice idea in theory though, and maybe we'll get there later.)
      - We can diff the generated type for variations in adjunct config!
        (Probably not something that will end up tested, but neat to be able
        to do during dev.)
      
      Doing this with go plugins seemed like the neatest way to do this.
      It's certainly not the only way, though.  And in particular, I will
      confess that this will probably make developing from a windows host
      pretty painful: go plugins aren't supported on windows.  Mind,
      this doesn't mean you can't *use* codegen or its results on windows.
      It just means the tests won't work.  So, someone doing development
      _on the codegen itself_ would have to wait for the CI server to run
      the tests on their behalf.  Hopefully this is survivable.
      
      (There's also a fun wee wiggle in that loading a plugin has the
      requirement that it be built with the same "runtime".  The definition
      of "runtime" here happens to include whether or not things have been
      built in "race" mode.  So, the '-race' flag disappears from our CI
      config file in this diff; otherwise, CI will get the (rather confusing)
      error "plugin was built with a different version of package runtime".
      This isn't really worrying to ditch, though.  I'm... not even sure why
      the '-race' was in our CI script in the first place.  Must've arrived
      via cargo cult; we don't _have_ any concurrency in this library.)
      
      An alternative way to go about all this would be to have the tests for
      gen invoke `go test` (rather than `go build` in plugin mode) on each of
      the generated packages.  It strikes me as similar but worse.
      We still have to invoke the go tools from inside the test;
      we'd have *more* work to do to either copy tests into the gen'd package
      or else generate calls back to the parent package for test functions
      (which still have to be written against interfaces, so that they can
      compile even when the gen isn't done, as it indeed isn't when you
      freshly check out the repo -- exact same as with the plugin approach);
      and in exchange for the extra work, we get markedly worse output
      ('go test' doesn't nest nicely, afaict), and we can't separate the
      compiling of the generated code from the evaluation of tests on it,
      and we'd have no possibility of passing information via closures should
      we wish to develop table-driven tests where this would be useful.
      tl;dr: Higher cost, uglier, and worse.
      
      No matter which way we go about this, there *is* a speed trade-off.
      Invoking the compiler per package adds at least a half second of time
      for each package, in practice.  Worth it, though.
      
      And on the off chance that this plugin approach does burn later,
      and we do want to switch to child process 'go test' invocations...
      the good news is: we shouldn't actually have to rewrite very much.
      The everything-starts-from-NodeStyle-and-tests-against-Node work is
      exactly the same for making the plugin approach work, and will
      trivially pivot to working fine for child 'go test' approaches,
      should we find it necessary to do so in the future.  So!  With our
      butts covered: a plugin'ing we shall go!
      
      Some of the code here still needs cleanup; this is a proof of concept
      checkpointing commit.  (The real thing probably won't have such
      function names as "TestFancier".)  But, we do get to see here:
      plugins work; more than one in the process works; and they work even
      when the same type names are in the generated packages.  All good.
  15. 07 Apr, 2020 1 commit
    • Support for struct field rename directives. · 5f93c298
      Eric Myhre authored
      Some touches to ImplicitValue came along for the ride, but aren't
      exercised yet.
      
      Tests are getting unwieldy.  I think something must be done about
      this before proceeding much further or it's going to start resulting
      in increasingly significant velocity loss.
  16. 29 Mar, 2020 1 commit
  17. 29 Jan, 2020 1 commit
  18. 24 Oct, 2019 2 commits
  19. 11 Aug, 2019 1 commit
  20. 15 Jul, 2019 1 commit
    • Cut gordian knot. · ff43f7af
      Eric Myhre authored
      I'm solving the decorum problem by making "temporary" builder methods for the
      schema.Type values as necessary.  This will allow the gengo package to use
      them freely for now.  When we're done with all this, I want to restrict building
      of schema.Type values to a single method in the schema package which takes the
      "ast" types -- the ones we're hoping to codegen (!) -- and do lots of checks
      and cross-linking; the temporary builder methods will do *none*, and *of course*
      they can't take the "ast" types yet since they don't exist yet.
      
      Boom, cycle broken.
      
      Also: stop generateStringKind from taking a redundant TypeName.