1. 08 Jun, 2021 1 commit
    • Daniel Martí's avatar
      node/tests: cover yet more interface methods · b832b762
      Daniel Martí authored
      Including LookupBySegment and LookupByNode for both maps and lists, but
      also As<T> for the representation form of all scalar kinds.
      
      While at it, expand the interface implementation assertions, too.
      
      Note that we had to make the node/basic tests external,
      for the sake of testing LookupByNode without inserting an import cycle.
      b832b762
  2. 07 Jun, 2021 1 commit
    • Daniel Martí's avatar
      node/tests: cover more error cases for scalar kinds · dab6ec7c
      Daniel Martí authored
      In particular, calling non-scalar Node and Assembler methods should
      always fail.
      
      While at it, remove some unused code, since unions always have
      representation strategies.
      
      Brings up code coverage in bindnode from 72.7% to 75%.
      dab6ec7c
  3. 03 Jun, 2021 2 commits
    • Daniel Martí's avatar
      node/tests: add more extensive scalar kind tests · b6e2b10f
      Daniel Martí authored
      It covers AssignKind, AssignNode, and AsKind for every combination of
      assembler kind and method.
      
      We also verify that a constructed scalar node behaves the same with
      AsKind when using its representation, like the old test.
      
      There's effectively a triple loop as a test table, so the subtest name
      has up to three components separated by dashes, such as:
      
      	TestSchema/Scalars/Bytes-AssignNode-String
      
      We also use this test as a demo of quicktest instead of go-wish.
      
      Finally, adapt bindnode to pass these tests just like codegen. This was
      mainly a bunch of TODOs in the relevant methods.
      b6e2b10f
    • Daniel Martí's avatar
      node/bindnode: start running all schema tests · 6042d4d8
      Daniel Martí authored
      We add node/tests.SchemaTestAll to simplify this task, meaning we don't
      need to duplicate all test func declarations in node/bindnode.
      SchemaTestAll is also flexible enough to allow running multiple
      sub-tests per schema test in the future.
      
      There were two remaining places in node/tests that still weren't using
      ipld.DeepEqual, so fix those.
      
      Finally, bindnode needed a couple of changes to fully support
      ipld.DeepEqual. Most notable is iteration over maps, which required a
      bit of a refactor to keep ordered keys.
      6042d4d8
  4. 02 Jun, 2021 1 commit
    • Daniel Martí's avatar
      node/tests: put most of the schema test cases here · 5b6e1d30
      Daniel Martí authored
      Along with a generic Engine interface, so that they can be reused for
      other ipld.Node implementations, such as bindnode.
      
      node/bindnode will start using these in a follow-up commit, since this
      one is large enough as is.
      
      Tested that all three forms of testing schema/gen/go still work:
      
      	go test
      
      	CGO_ENABLED=0 go test
      
      	go test -tags=skipgenbehavtests
      5b6e1d30
  5. 25 May, 2021 1 commit
    • Daniel Martí's avatar
      node/bindnode: start of a reflect-based Node implementation · 806e36fb
      Daniel Martí authored
      Lots of TODOs and polishing to do, but it passes the schema/gen/go tests
      with minimal changes.
      
      Follow-up commits will continue filling in the gaps and adding
      documentation, notably examples. We'll also make the tests run by
      default, after a bit of refactoring.
      806e36fb
  6. 09 Apr, 2021 2 commits
  7. 07 Apr, 2021 1 commit
    • Daniel Martí's avatar
      schema/gen/go: avoid Maybe pointers for small types · f3d42e04
      Daniel Martí authored
      If we know that a schema type can be represented in Go with a small
      amount of bytes, using a pointer to store its "maybe" is rarely a good
      idea. For example, an optional string only weighs twice as much as a
      pointer, so a pointer adds overhead and will barely ever save any
      memory.
      
      Add a function to work out the byte size of a schema.TypeKind, relying
      on reflection and the basicnode package. Debug prints are also present
      if one wants to double-check the numbers. As of today, they are:
      
      	sizeOf(small): 32 (4x pointer size)
      	sizeOf(Bool): 1
      	sizeOf(Int): 8
      	sizeOf(Float): 8
      	sizeOf(String): 16
      	sizeOf(Bytes): 24
      	sizeOf(List): 24
      	sizeOf(Map): 32
      	sizeOf(Link): 16
      
      Below is the result on go-merkledag's BenchmarkRoundtrip after
      re-generating go-codec-dagpb with this change. Note that the dag-pb
      schema contains multiple optional fields, such as strings.
      
      	name         old time/op    new time/op    delta
      	Roundtrip-8    4.24µs ± 3%    3.78µs ± 0%  -10.87%  (p=0.004 n=6+5)
      
      	name         old alloc/op   new alloc/op   delta
      	Roundtrip-8    6.38kB ± 0%    6.24kB ± 0%   -2.26%  (p=0.002 n=6+6)
      
      	name         old allocs/op  new allocs/op  delta
      	Roundtrip-8       103 ± 0%        61 ± 0%  -40.78%  (p=0.002 n=6+6)
      
      Schema typekinds which don't directly map to basicnode prototypes, such
      as structs and unions, are left as a TODO for now.
      
      I did not do any measurements to arrive at the magic number of 4x, which
      is documented in the code. We might well increase it in the future, with
      more careful benchmarking. For now, it seems like a conservative starting
      point that should cover all basic types.
      
      Finally, re-generate within this repo.
      f3d42e04
  8. 18 Jan, 2021 1 commit
  9. 10 Jan, 2021 1 commit
    • Daniel Martí's avatar
      schema/gen/go: please vet a bit more · 6796504d
      Daniel Martí authored
      In particular, this removes ~50 out of the 2.7k warnings in 'go vet
      ./...' in this repository. Mainly, the "unreachable code" ones.
      
      This was caused by edge cases in some of the generated code which caused
      an unconditional return or panic statement to be followed by other code.
      Fix all of them with a bit more template logic.
      
      Some of the Next methods go a bit further. If they serve no purpose as
      the switch has no cases to be matched, just unconditionally return an
      error. In the future we can perhaps reuse a single function for that.
      
      Finally, I was having a hard time actually following the logic in
      kindedUnionNodeAssemblerMethodTemplateMunge, so I've indented the code a
      bit to follow the template logic and scoping.
      
      These changes move us towards pleasing vet, which is nice, but also make
      the code waste a bit less space.
      6796504d
  10. 03 Jan, 2021 1 commit
  11. 31 Dec, 2020 1 commit
  12. 27 Dec, 2020 1 commit
  13. 25 Dec, 2020 1 commit
    • Daniel Martí's avatar
      all: rename schema.Kind to TypeKind, ipld.ReprKind to Kind · 2d7d25c4
      Daniel Martí authored
      As discussed on the issue thread, ipld.Kind and schema.TypeKind are more
      intuitive, closer to the spec wording, and just generally better in the
      long run.
      
      The changes are almost entirely automated via the commands below. Very
      minor changes were needed in some of the generators, and then gofmt.
      
      	sed -ri 's/\<Kind\(\)/TypeKind()/g' **/*.go
      	git checkout fluent # since it uses reflect.Value.Kind
      
      	sed -ri 's/\<Kind_/TypeKind_/g' **/*.go
      	sed -i 's/\<Kind\>/TypeKind/g' **/*.go
      	sed -i 's/ReprKind/Kind/g' **/*.go
      
      Plus manually undoing a few renames, as per Eric's review.
      
      Fixes #94.
      2d7d25c4
  14. 17 Dec, 2020 1 commit
    • Daniel Martí's avatar
      all: rename AssignNode to ConvertFrom · 6e6625bd
      Daniel Martí authored
      This should be more intuitive to Go programmers, since assignments are
      generally trivial operations, but conversions imply that extra work
      might be needed to adapt the value to fit in the recipient.
      
      The entire change is just:
      
      	sed -ri 's/AssignNode/ConvertFrom/g' **/*.go
      
      Downstream users can very likely use the same line to fix their function
      declarations and calls.
      
      Fixes #95.
      6e6625bd
  15. 16 Dec, 2020 1 commit
    • Daniel Martí's avatar
      all: rewrite interfaces and APIs to support int64 · f6e9a891
      Daniel Martí authored
      We only supported representing Int nodes as Go's "int" builtin type.
      This is fine on 64-bit, but on 32-bit, it limited those node values to
      just 32 bits. This is a problem in practice, because it's reasonable to
      want more than 32 bits for integers.
      
      Moreover, this meant that IPLD would change behavior if built for a
      32-bit platform; it would not be able to decode large integers, for
      example, when in fact that was just a software limitation that 64-bit
      builds did not have.
      
      To fix this problem, consistently use int64 for AsInt and AssignInt.
      
      A lot more functions are part of this rewrite as well; mainly, those
      revolving around collections and iterating. Some might never need more
      than 32 bits in practice, but consistency and portability is preferred.
      Moreover, many are interfaces, and we want IPLD interfaces to be
      flexible, which will be important for ADLs.
      
      Below are some GNU sed lines which can be used to quickly update
      function signatures to use int64:
      
      	sed -ri 's/(func.* AsInt.*)\<int\>/\1int64/g' **/*.go
      	sed -ri 's/(func.* AssignInt.*)\<int\>/\1int64/g' **/*.go
      	sed -ri 's/(func.* Length.*)\<int\>/\1int64/g' **/*.go
      	sed -ri 's/(func.* LookupByIndex.*)\<int\>/\1int64/g' **/*.go
      	sed -ri 's/(func.* Next.*)\<int\>/\1int64/g' **/*.go
      	sed -ri 's/(func.* ValuePrototype.*)\<int\>/\1int64/g' **/*.go
      
      Note that the function bodies, as well as the code that calls said
      functions, may need to be manually updated with the integer type change.
      That cannot be automated, because it's possible that an automated fix
      would silently introduce potential overflows not being handled.
      
      Some TODOs and FIXMEs for overflow checks are removed, since we remove
      some now unnecessary int64->int conversions. On the other hand, the
      older codecs based on refmt need to gain some overflow check TODOs,
      since refmt uses ints. That is okay for now, since we'll phase out refmt
      pretty soon.
      
      While at it, update codectools to use int64 for token Length fields, so
      that it properly supports full IPLD integers without machine-dependent
      behavior and overflow checks. The budget integer is also updated to be
      int64, since the lengths it uses are now int64.
      
      Note that this refactor needed changes to the Go code generator as well
      as some of the tests, for the purpose of updating all the code.
      
      Finally, note that the code-generated iterator structs do not use int64
      fields internally, even though they must return int64 numbers to
      implement the interface. This is because they use the numeric fields to
      count up to a small finite amount (such as the number of fields in a Go
      struct), or up to the length of a map/slice. Neither of them can ever
      outgrow "int".
      
      Fixes #124.
      f6e9a891
  16. 14 Dec, 2020 2 commits
  17. 04 Dec, 2020 2 commits
    • Daniel Martí's avatar
      all: fix a lot of "unkeyed literal" vet warnings · 354f194f
      Daniel Martí authored
      Reduces the output of 'go vet ./...' from 374 lines to 96. Many warnings
      remain, but I have lost my patience for today.
      
      Most of the changes below were automated, especially the single-line
      mixins expressions. Unfortunately, many of the Traits structs required
      manual copy-pasting.
      354f194f
    • Daniel Martí's avatar
      node/mixins: use simpler filenames · f5f8b0d7
      Daniel Martí authored
      This is the mixins package, so "Mixin" in filenames is redundant.
      f5f8b0d7
  18. 01 Dec, 2020 1 commit
    • Daniel Martí's avatar
      node/gendemo: use the new code generator · 9fe4b32b
      Daniel Martí authored
      The files in this codegen demo correspond to an older version of the Go
      code generator. That's not terrible in itself, but it did make repeated
      uses of 'go test' fail:
      
      	$ go test
      	testing: warning: no tests to run
      	PASS
      	ok  	github.com/ipld/go-ipld-prime/node/gendemo	0.112s
      	$ go test
      	# github.com/ipld/go-ipld-prime/node/gendemo [github.com/ipld/go-ipld-prime/node/gendemo.test]
      	./minima.go:10:2: midvalue redeclared in this block
      		previous declaration at ./ipldsch_minima.go:13:27
      	[...]
      9fe4b32b
  19. 10 Sep, 2020 1 commit
    • Daniel Martí's avatar
      all: don't use buffers where readers suffice · 8e26c7e2
      Daniel Martí authored
      Buffers are not a good option for tests if the other side expects a
      reader. Otherwise, the code being tested could build assumptions around
      the reader stream being a single contiguous chunk of bytes, such as:
      
      	_ = r.(*bytes.Buffer).Bytes()
      
      This kind of hack might seem unlikely, but it's an easy mistake to make,
      especially with APIs like fmt which automatically call String methods.
      
      With bytes.Reader and strings.Reader, the types are much more
      restricted, so the tests need to be more faithful.
      8e26c7e2
  20. 05 Sep, 2020 1 commit
    • Eric Myhre's avatar
      New testcase system for exercising typed nodes; Revamp struct tests with it. · 5d732d47
      Eric Myhre authored
      This new system focuses on table-driven tests, and leans heavily upon
      json as a shorthand for expressing fixtures.
      
      It also makes a great deal more effort to exercise the different
      features of nodes (and their paired representation nodes) from all
      directions at once for each test datum, rather than requring that all
      be written out manually.
      
      The result is that the struct tests we've renovated have a lovely
      diffstat shrinkage: 111 insertions, 299 deletions...
      
      And yet the smaller line count results in *more* coverage.
      
      (Okay, the linecount increase for the testcase structure and helper
      methods is much bigger than the savings in fixture size... but,
      only *so far*.  I assume this will continue to pay off in the future.)
      
      Relatedly: a bug in struct map representations has been fixed.
      (It was the sibling of 5f589653, embarassingly.)  Thank goodness we now
      get proper coverage of this area.
      
      There's a few TODOs left to further expand the exercises, but those
      can slot in easily in subsequent commits.  Same goes for further
      expansion of usage of this new system.
      5d732d47
  21. 30 Jul, 2020 2 commits
  22. 09 Jul, 2020 1 commit
    • Eric Myhre's avatar
      Refactor the schema.Type info to support cycles. · 47abab45
      Eric Myhre authored
      This is full of mundane plonking away at adding stars and ampersands,
      of which approximately none are interesting.
      
      (I'm particularly frustrated by this because these are all placeholder
      types, and we're getting *very* close to replacing them, as we get
      closer and closer to self-hosting... at which point all of this bonking
      about will be made totally irrelevant.  And yet to close the last mile,
      these "small" fixes are surprisingly irritating.  Ah well.)
      
      The bits that *are* interesting:
      
      - the Spawn functions for type info now take strings rather than types
      (so that they don't provoke a cycle problem for the user when
      constructing the information to describe cyclic type info);
      
      - all of the Type info structure hold a pointer to the TypeSystem, and
      use that to look up reified Type info for related types, so that their
      methods don't force the caller to do that themselves.
      
      (The TypeSystem pointer was already there, amusingly; just never before
      initialized, because it hadn't turned out to be load-bearing yet.)
      
      It also would've been possible to just change all the methods on the
      Type types to return TypeName rather than full Type info.  That would
      avoid the need to use a TypeSystem pointer.  I didn't because:
      
      Overall, this was done in such a way as to minimize the diff that
      impacts within the templates.  This was a goal because updating
      templates is a fair bit more work than other code due to the weak
      compiler support.  And we'll end up reviewing and changing these
      methods when we get to our goal of self-hosting generation of the
      schema types from the schema-schema, so, it's not worth pushing around
      diffs in that same area when they'd be sure to be churned under soon.
      47abab45
  23. 29 Jun, 2020 2 commits
  24. 26 Jun, 2020 2 commits
  25. 15 Jun, 2020 1 commit
  26. 03 Jun, 2020 1 commit
  27. 22 May, 2020 3 commits
    • Eric Myhre's avatar
      gendemo package is now real generation :3 · e9455cdc
      Eric Myhre authored
      Previously, it was manually written prototypes of what gen "would" look like.
      
      Now it's the real deal :3
      e9455cdc
    • Eric Myhre's avatar
      Regen the realgen package. · 0009613a
      Eric Myhre authored
      Going to move it over to replace the (currently hand-written) gendemo
      package shortly... but spread that over a few commits, in case the
      diffs turn out interesting to look at.
      0009613a
    • Eric Myhre's avatar
      Refresh HACKME documents. · a7e8b2f2
      Eric Myhre authored
      Some of them still lived over the "gendemo" package, and those are now
      moved over here to the proper gen package.
      
      Linked to more of the other documents from the main HACKME doc.
      a7e8b2f2
  28. 26 Apr, 2020 1 commit
  29. 24 Apr, 2020 1 commit
    • Eric Myhre's avatar
      List gen support, and tests. · 0827921d
      Eric Myhre authored
      These are getting increasingly straightforward as most of the hard
      problems have been ironed out already when getting the other recursives
      (maps and structs) sorted.
      
      Now, we just need to stamp out the rest of the scalars, and I think
      this codegen stuff is at least early-alpha level usable!
      
      Also in this commit: added some new error types and fixed up the
      basicnode.List implementation to use it, removing some todos there.
      0827921d
  30. 19 Apr, 2020 1 commit
    • Eric Myhre's avatar
      MapNStrMap3StrInt benchmarks on codegen. · 3b33e05c
      Eric Myhre authored
      Marshal is on par with basicnode.  Both basicnode and then gen stuff
      does a solid job of alloc amortization on reads, so the dominant cost
      remaining for both is in getting iterators.  Thus, they come out
      pretty comparable overall.
      
      Unmarshal is winning *nicely* over basicnode.  Roughly a third fewer
      allocations, and gen is about 125% faster on the clock.
      
      I haven't looked to see if unmarshal can be further improved with any
      low-hanging-fruit sorts of fixes.  Wouldn't be surprised if it can.
      
      We're gonna need more standard benchmarks... and in particular,
      need them working without the marshal/unmarshal indirections.
      Those are handy, but add a *lot* of noise from directions we're not
      necessarily interested in when looking at different node impls.
      3b33e05c
  31. 16 Apr, 2020 1 commit
    • Eric Myhre's avatar
      Remove finish callback. Much faster. Bench. · 6d31b15f
      Eric Myhre authored
      If you've been following along for a while now, you don't need to see
      the benchmarks to know what's coming.  The long story short is:
      allocations are the root of all evil, and we got rid of some, and now
      things are significantly faster.
      
      Here's the numbers:
      
      basicnode (just for a baseline to compare to):
      
      ```
      BenchmarkMapStrInt_3n_AssembleStandard-8         1988986               588 ns/op             520 B/op          8 allocs/op
      BenchmarkMapStrInt_3n_AssembleEntry-8            2158921               559 ns/op             520 B/op          8 allocs/op
      BenchmarkMapStrInt_3n_Iteration-8               19679841                67.0 ns/op            16 B/op          1 allocs/op
      BenchmarkSpec_Marshal_Map3StrInt-8               1377094               870 ns/op             544 B/op          7 allocs/op
      BenchmarkSpec_Marshal_Map3StrInt_CodecNull-8     4560031               278 ns/op             176 B/op          3 allocs/op
      BenchmarkSpec_Unmarshal_Map3StrInt-8              368763              3239 ns/op            1608 B/op         32 allocs/op
      ```
      
      realgen, previously, using fcb:
      
      ```
      BenchmarkMapStrInt_3n_AssembleStandard-8         4293072               278 ns/op             208 B/op          5 allocs/op
      BenchmarkMapStrInt_3n_AssembleEntry-8            4643892               259 ns/op             208 B/op          5 allocs/op
      BenchmarkMapStrInt_3n_Iteration-8               20307603                59.9 ns/op            16 B/op          1 allocs/op
      BenchmarkSpec_Marshal_Map3StrInt-8               1346115               913 ns/op             544 B/op          7 allocs/op
      BenchmarkSpec_Marshal_Map3StrInt_CodecNull-8     4606304               256 ns/op             176 B/op          3 allocs/op
      BenchmarkSpec_Unmarshal_Map3StrInt-8              425662              2793 ns/op            1160 B/op         27 allocs/op
      ```
      
      realgen, new, improved:
      
      ```
      BenchmarkMapStrInt_3n_AssembleStandard-8         6138765               183 ns/op             129 B/op          3 allocs/op
      BenchmarkMapStrInt_3n_AssembleEntry-8            7276795               176 ns/op             129 B/op          3 allocs/op
      BenchmarkMapStrInt_3n_Iteration-8               19593212                67.2 ns/op            16 B/op          1 allocs/op
      BenchmarkSpec_Marshal_Map3StrInt-8               1309916               912 ns/op             544 B/op          7 allocs/op
      BenchmarkSpec_Marshal_Map3StrInt_CodecNull-8     4579935               257 ns/op             176 B/op          3 allocs/op
      BenchmarkSpec_Unmarshal_Map3StrInt-8              465195              2599 ns/op            1080 B/op         25 allocs/op
      ```
      
      So!  About 150% improvement on assembly between gen with fcb and our new-improved no-callback system.
      
      And about 321% improvement in total now for codegen structs over the basicnode map.
      
      That's the kind of ratio I was looking for :)
      
      As with all of these measurements: these will also get much bigger on bigger corpuses.
      Some of the improvements here are O(n) -> O(1), and some apply even more heartily in deeper trees, etc.
      But it's telling that even on very small corpuses, the impact is already huge.
      6d31b15f