Commits · 6b5a471f3b760197bd1738011042bc8119e18a57 · ipld / go-ipld-prime

24 Jan, 2021 12 commits

schema compiler: enum validation rules. · 6b5a471f

Eric Myhre authored Jan 24, 2021

And I think this now ports all the rules we'd previously written
against the attempted schema2 (the wrapper style) API.
So I can now remove the rest of that in the next commit.

It will soon be time to start updating all the gengo stuff to
use this, including all of its tests.

6b5a471f

schema compiler: union validation rules. · ea79ea43

Eric Myhre authored Jan 24, 2021

Also, shift some error message text towards a more consistent phrasing.

ea79ea43

schema compiler: union rules. · 6897bb3c

Eric Myhre authored Jan 22, 2021

Commit to the strategy of having the first flunked rule for a type
result in short-circuit skipping of subsequent rules.  It's simple,
and it's sufficient.
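
A minimal sketch of that short-circuit strategy, not the actual compiler code: the `rule` type, `validate`, and the two example rules are hypothetical names chosen for illustration. Each rule may report several errors, and the first rule that reports any ends validation for that type, so later rules can assume the earlier ones passed.

```go
package main

import "fmt"

// rule is a hypothetical validation rule: it inspects a type name and
// may report zero or more errors.
type rule struct {
	name  string
	check func(typeName string) []error
}

// validate runs the rules in order and stops at the first rule that
// reports any errors; the remaining rules are skipped for that type.
func validate(typeName string, rules []rule) []error {
	for _, r := range rules {
		if errs := r.check(typeName); len(errs) > 0 {
			return errs // short-circuit: first flunked rule wins
		}
	}
	return nil
}

// exampleRules returns two illustrative rules. The second one indexes
// into the name, which is only safe because the first rule already
// rejected empty names.
func exampleRules() []rule {
	return []rule{
		{name: "has-name", check: func(n string) []error {
			if n == "" {
				return []error{fmt.Errorf("type has no name")}
			}
			return nil
		}},
		{name: "starts-uppercase", check: func(n string) []error {
			if n[0] < 'A' || n[0] > 'Z' {
				return []error{fmt.Errorf("type name %q must start with an uppercase letter", n)}
			}
			return nil
		}},
	}
}

func main() {
	for _, name := range []string{"", "foo", "Foo"} {
		fmt.Printf("%q: %v\n", name, validate(name, exampleRules()))
	}
}
```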

6897bb3c

schema: connect validation, Compile() now works. · 1b282cfa

Eric Myhre authored Jan 22, 2021

More rules are still to come.

1b282cfa

schema: clear up some things about TypeName vs TypeReference. · b6009032

Eric Myhre authored Jan 22, 2021

Both are now accessible.  Name is not always present.

Get rid of casts that are unnecessary.

Constructors for anonymous types are still upcoming;
all the current constructors dupe the name into the reference field.
Planning to add distinct methods on the Compiler for anon types.

b6009032

schema compiler: can accumulate more than one error per rule. · c6b9667d

Eric Myhre authored Jan 22, 2021

c6b9667d

schema: architecture design records for compiler. · 8777aae1

Eric Myhre authored Jan 22, 2021

It details a variety of considered approaches.

Spoiler: I'm not actually super pleased with the one I'm currently
pursuing.  The amount of boilerplate I'm grinding out for this is
really, really no fun at all.  It's possible that the reasoning leading
here is still sound.  It's just unpleasant.

8777aae1

schema: beginning (re)implementation of validation rules. · fe935c9c

Eric Myhre authored Jan 22, 2021

Carving out hunks of the schema2 implementation of them (which still
hoped to use the dmt more directly) as I port them.

As comments in the diff state: I had a hope here that I could make
this relatively table-driven, which would increase legibility and
make it easier to port these checks to other implementations as well.
We'll... see how that goes; it's not easy to flatten.

fe935c9c

schema unification progress: map representations, enums, etc. · 3da7e2ad

Eric Myhre authored Jan 22, 2021

3da7e2ad

schema: so much boilerplate for feeding information to the Compiler that I... · d74ecb3e

Eric Myhre authored Jan 17, 2021

schema: so much boilerplate for feeding information to the Compiler that I wrote another supplementary code generator.

(I'm getting very weary of golang.)

This new bit of codegen makes the compiler.go file fairly readable
again, though, so I'm satisfied with it.

The Compiler API is now complete enough that I can start repairing
other things to use it properly.  The schemadmt.Schema.Compile()
function and all of its helpers compile again now.  So does *most*
of the whole codegen system... with the notable exception of all
the hardcoded typesystem spawning which used the old placeholder
methods which have now been stricken.

TypeSystem now maintains order.  This allowed me to remove some
sort operations from the code generator.  This also means the next
time any existing codegen is re-run, the output file will shift
significantly.  However, it shouldn't do so again in the future.

d74ecb3e

schema/compiler: move into schema package. · de9e49b0

Eric Myhre authored Jan 14, 2021

As with parent commit: this is a checkpoint.  CI will not be passing.

de9e49b0

schema: working to unify interfaces and dmt. Intermediate checkpoint commit. · f1859e77

Eric Myhre authored Jan 14, 2021

This commit does not pass CI or even fully compile, and while I usually
try to avoid those, A) I need a checkpoint!, and B) I think this one is
interestingly illustrative, and I'll probably want to refer to this
diff and the one that will follow it in the future as part of
architecture design records (or even possibly experience reports about
golang syntax).

In this commit: we have three packages:

- schema: full of interfaces (and only interfaces)
- schema/compiler: creates values matching schema interfaces
- schema/dmt: contains codegen'd types that parse schema documents.

The dmt package feeds data to the compiler package, and the compiler
package emits values matching the schema interface.
This all works very nicely and avoids import cycles.

(Avoiding import cycles has been nontrivial, here, unfortunately.
The schema/schema2 package (which is still present in this commit,
but will be removed shortly -- I've scraped most of it over into
this new 'compiler' package already, just not a bunch of the validation
rules stuff, yet) was a dream of making this all work by just having
thin wrapper types around the dmt types.  This didn't fly...
because codegen'd nodes comply with `schema.TypedNode`, and complying
with `schema.TypedNode` means they have a function which references
`schema.Type`... and that means we really must depend on that
interface and the package it's in.  Ooof.)

The big downer with this state, and why things are currently
non-compiling at this checkpoint I've made here, is that we have to
replicate a *lot* of methods into single-use interfaces in the schema
package for this to work.  This belies the meaning of "interface".
The reason we'd do this -- the reason to split 'compiler' into its own
package -- is mostly because I wanted to keep all the constructor
mechanisms for schema values out of the direct path of the user's eye,
because most users shouldn't be using the compiler directly at all.

But... I'm shifting to thinking this attempt to segregate the compiler
details isn't worth it.  A whole separate package costs too much.
Most concretely, it would make it impossible to make the `schema.Type`
interface "closed" (e.g. by having an unexported method), and I think
at that point we would be straying quite far from desired semantics.
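
For readers unfamiliar with the "closed interface" idiom mentioned above, here is a rough sketch. The names `Type`, `TypeString`, `TypeInt`, and `isType` are illustrative stand-ins, not the real schema package API. Because the marker method is unexported, only types declared in the same package as the interface can satisfy it, which is why the constructors cannot live in a separate package.

```go
package main

import "fmt"

// Type is a "closed" interface: the unexported isType method can only
// be implemented by types declared in the same package as Type.
type Type interface {
	Name() string
	isType() // hypothetical marker method
}

// TypeString and TypeInt are the only implementations in this sketch.
type TypeString struct{ name string }

func (t TypeString) Name() string { return t.name }
func (TypeString) isType()        {}

type TypeInt struct{ name string }

func (t TypeInt) Name() string { return t.name }
func (TypeInt) isType()        {}

// describe can switch over the known implementations; no other package
// is able to add a new one.
func describe(t Type) string {
	switch t.(type) {
	case TypeString:
		return "string type " + t.Name()
	case TypeInt:
		return "int type " + t.Name()
	default:
		return "unknown type " + t.Name()
	}
}

func main() {
	fmt.Println(describe(TypeString{name: "Foo"}))
	fmt.Println(describe(TypeInt{name: "Bar"}))
}
```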

f1859e77

21 Jan, 2021 1 commit

schema/gen/go: cache genned code in os.TempDir · 0b3adb9d

Daniel Martí authored Jan 18, 2021

This means we no longer clutter the repository with lots of files, even
if they are git-ignored. It's always a bit of a red flag when you run
"go test ./..." and the result is a bunch of leftover files.

We still want to keep the files around, for the sake of Go's build
cache. And we still want their paths to be static between "go test"
runs. So put them in a static dir under os.TempDir.

This does mean that concurrent runs of these tests will likely not work
well. I don't imagine that's going to be a problem anytime soon, though.
If it really becomes a problem in the future, we could figure something
out like grabbing a file lock for the directory.

The idea behind using os.TempDir is that it will likely remain in place
between a number of "go test" runs within a hacking session, but it will
be eventually cleaned up by the system, such as when rebooting.

Note that we need to use globbing since one can't build "proper
packages" located outside a module. The only exception is building an
ad-hoc set of explicit Go files. While at it, use filepath.Join, to be
nice.
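
The approach described above might look roughly like the following sketch. The directory name `example-gengo-cache` and the helpers `genDir` and `goFiles` are assumptions made for illustration; the actual test code may differ. The path is built with `filepath.Join` under `os.TempDir` so it stays the same between runs, and the Go files are found by globbing so they can be passed to the go tool as an explicit file list.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// genDir returns a directory path that is stable across runs, located
// under the system temp dir. The "example-gengo-cache" name is made up.
func genDir(name string) string {
	return filepath.Join(os.TempDir(), "example-gengo-cache", name)
}

// goFiles lists the Go files directly inside dir, so they can be handed
// to the go tool as an ad-hoc set of explicit files.
func goFiles(dir string) ([]string, error) {
	return filepath.Glob(filepath.Join(dir, "*.go"))
}

func main() {
	dir := genDir("demo")
	if err := os.MkdirAll(dir, 0o755); err != nil {
		panic(err)
	}
	src := []byte("package main\n\nfunc main() {}\n")
	if err := os.WriteFile(filepath.Join(dir, "x.go"), src, 0o644); err != nil {
		panic(err)
	}
	files, err := goFiles(dir)
	if err != nil {
		panic(err)
	}
	fmt.Println(files)
}
```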

0b3adb9d

18 Jan, 2021 1 commit

schema/gen/go: prevent some unkeyed literal vet errors · bf0cbde7

Daniel Martí authored Jan 10, 2021

In particular, use keys for ipld error structs. These have one field, so
the changes are pretty simple.

Reduces 'go vet ./...' from 2647 lines of output to 2365.
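
As a small illustration of the kind of change involved (the `ErrMissingField` type here is a hypothetical stand-in, not one of the real ipld error structs): vet's composite-literal check flags unkeyed fields in literals of struct types imported from another package, and naming the field silences it.

```go
package main

import "fmt"

// ErrMissingField is a hypothetical one-field error struct, standing in
// for the error structs referred to above.
type ErrMissingField struct {
	Field string
}

func (e ErrMissingField) Error() string { return "missing field: " + e.Field }

// newErr builds the error with a keyed literal.
func newErr(field string) error {
	// Unkeyed form, which vet reports when the struct type comes from
	// another package:
	//   return ErrMissingField{field}
	return ErrMissingField{Field: field}
}

func main() {
	fmt.Println(newErr("members"))
}
```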

Updates #102.

bf0cbde7

10 Jan, 2021 2 commits

schema/gen/go: remove two common subtest levels · d02c3602

Daniel Martí authored Dec 01, 2020

Practically every subtest ends up at 7 or so levels of names, like:

TestMapsContainingMaybe/maybe-using-ptr/generate/compile/bhvtest/non-nullable/typed-create

However, note that the "generate" and "compile" levels are always there,
so their presence just adds verbosity in the output and makes the
developer's life more difficult.

Extremely nested sub-tests are already rare, so at least we can just
keep the components that add useful information in the output.

"bhvtest" is also pretty redundant, but that one actually matters - its
subtest can be skipped depending on build tags.

d02c3602

schema/gen/go: please vet a bit more · 6796504d

Daniel Martí authored Jan 06, 2021

In particular, this removes ~50 out of the 2.7k warnings in 'go vet
./...' in this repository. Mainly, the "unreachable code" ones.

This was caused by edge cases in some of the generated code which caused
an unconditional return or panic statement to be followed by other code.
Fix all of them with a bit more template logic.

Some of the Next methods go a bit further. If they serve no purpose as
the switch has no cases to be matched, just unconditionally return an
error. In the future we can perhaps reuse a single function for that.

Finally, I was having a hard time actually following the logic in
kindedUnionNodeAssemblerMethodTemplateMunge, so I've indented the code a
bit to follow the template logic and scoping.

These changes move us towards pleasing vet, which is nice, but also make
the code waste a bit less space.

6796504d

07 Jan, 2021 1 commit

Use more explicit field names in initializers. · bc66b1e3

Eric Myhre authored Jan 07, 2021

bc66b1e3

03 Jan, 2021 4 commits

gengo: support for unions with stringprefix representation. · 4da73fb9

Eric Myhre authored Jan 03, 2021

4da73fb9

target of opportunity DRY improvement: use more shared templates for structs... · 80a4a975

Eric Myhre authored Jan 03, 2021

target of opportunity DRY improvement: use more shared templates for structs with stringjoin representations.

(Encountered while working on support for unions with stringprefix representations.)

80a4a975

fix small consistency typo in gen function names. · 1f397b3b

Eric Myhre authored Jan 03, 2021

Should not affect most user code; though these are technically
exported symbols, they're very unlikely to be used directly.

1f397b3b

drop old generation mechanisms that were already deprecated. · 2f7f3437

Eric Myhre authored Jan 03, 2021

2f7f3437

31 Dec, 2020 1 commit

Revert "rename AssignNode to ConvertFrom" · f2323b3f

Eric Myhre authored Dec 31, 2020

This reverts commit 6e6625bd.

Discussed at
https://github.com/ipld/go-ipld-prime/pull/126#issuecomment-753003441

Long story short, the motivations of this rename are good,
but the new name also carries some connotations we're really not sure
about, and so we're going to undo this for now, and continue to think
about it in the future.

f2323b3f

27 Dec, 2020 1 commit

Update a few more lingering ReprKind references. · 8fa241ea

Eric Myhre authored Dec 27, 2020

8fa241ea

25 Dec, 2020 1 commit

all: rename schema.Kind to TypeKind, ipld.ReprKind to Kind · 2d7d25c4

Daniel Martí authored Dec 17, 2020

As discussed on the issue thread, ipld.Kind and schema.TypeKind are more
intuitive, closer to the spec wording, and just generally better in the
long run.

The changes are almost entirely automated via the commands below. Very
minor changes were needed in some of the generators, and then gofmt.

	sed -ri 's/\<Kind\(\)/TypeKind()/g' **/*.go
	git checkout fluent # since it uses reflect.Value.Kind

	sed -ri 's/\<Kind_/TypeKind_/g' **/*.go
	sed -i 's/\<Kind\>/TypeKind/g' **/*.go
	sed -i 's/ReprKind/Kind/g' **/*.go

Plus manually undoing a few renames, as per Eric's review.

Fixes #94.

2d7d25c4

17 Dec, 2020 1 commit

all: rename AssignNode to ConvertFrom · 6e6625bd

Daniel Martí authored Dec 16, 2020

This should be more intuitive to Go programmers, since assignments are
generally trivial operations, but conversions imply that extra work
might be needed to adapt the value to fit in the recipient.

The entire change is just:

	sed -ri 's/AssignNode/ConvertFrom/g' **/*.go

Downstream users can very likely use the same line to fix their function
declarations and calls.

Fixes #95.

6e6625bd

16 Dec, 2020 1 commit

all: rewrite interfaces and APIs to support int64 · f6e9a891

Daniel Martí authored Dec 14, 2020

We only supported representing Int nodes as Go's "int" builtin type.
This is fine on 64-bit, but on 32-bit, it limited those node values to
just 32 bits. This is a problem in practice, because it's reasonable to
want more than 32 bits for integers.

Moreover, this meant that IPLD would change behavior if built for a
32-bit platform; it would not be able to decode large integers, for
example, when in fact that was just a software limitation that 64-bit
builds did not have.

To fix this problem, consistently use int64 for AsInt and AssignInt.

A lot more functions are part of this rewrite as well; mainly, those
revolving around collections and iterating. Some might never need more
than 32 bits in practice, but consistency and portability are preferred.
Moreover, many are interfaces, and we want IPLD interfaces to be
flexible, which will be important for ADLs.

Below are some GNU sed lines which can be used to quickly update
function signatures to use int64:

	sed -ri 's/(func.* AsInt.*)\<int\>/\1int64/g' **/*.go
	sed -ri 's/(func.* AssignInt.*)\<int\>/\1int64/g' **/*.go
	sed -ri 's/(func.* Length.*)\<int\>/\1int64/g' **/*.go
	sed -ri 's/(func.* LookupByIndex.*)\<int\>/\1int64/g' **/*.go
	sed -ri 's/(func.* Next.*)\<int\>/\1int64/g' **/*.go
	sed -ri 's/(func.* ValuePrototype.*)\<int\>/\1int64/g' **/*.go

Note that the function bodies, as well as the code that calls said
functions, may need to be manually updated with the integer type change.
That cannot be automated, because it's possible that an automated fix
would silently introduce potential overflows not being handled.
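
As an example of the kind of manual fix a call site may need, here is a sketch of a checked narrowing conversion. The helper name `toInt` is hypothetical. It round-trips the value through `int` and compares, which detects truncation on platforms where `int` is 32 bits and is a no-op check where `int` is 64 bits.

```go
package main

import "fmt"

// toInt converts v to int, returning an error instead of silently
// truncating when v does not fit in the platform's int.
func toInt(v int64) (int, error) {
	i := int(v)
	if int64(i) != v {
		return 0, fmt.Errorf("value %d overflows int", v)
	}
	return i, nil
}

func main() {
	// Fits everywhere.
	fmt.Println(toInt(42))
	// Fits on 64-bit builds; reports an error on 32-bit builds.
	fmt.Println(toInt(1 << 40))
}
```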

Some TODOs and FIXMEs for overflow checks are removed, since we remove
some now unnecessary int64->int conversions. On the other hand, the
older codecs based on refmt need to gain some overflow check TODOs,
since refmt uses ints. That is okay for now, since we'll phase out refmt
pretty soon.

While at it, update codectools to use int64 for token Length fields, so
that it properly supports full IPLD integers without machine-dependent
behavior and overflow checks. The budget integer is also updated to be
int64, since the lengths it uses are now int64.

Note that this refactor needed changes to the Go code generator as well
as some of the tests, for the purpose of updating all the code.

Finally, note that the code-generated iterator structs do not use int64
fields internally, even though they must return int64 numbers to
implement the interface. This is because they use the numeric fields to
count up to a small finite amount (such as the number of fields in a Go
struct), or up to the length of a map/slice. Neither of them can ever
outgrow "int".
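
A simplified sketch of that pattern, using a hypothetical `listIterator` with a made-up `Next` signature rather than the real generated code: the counter is a plain `int` because it is bounded by the slice length, and it is widened to int64 only at the point of return.

```go
package main

import "fmt"

// listIterator is a hypothetical iterator over a slice. The position is
// a plain int: it can never exceed len(items), which is itself an int.
type listIterator struct {
	items []string
	idx   int
}

// Next returns the current index widened to int64, the value at that
// index, and false once the items are exhausted.
func (it *listIterator) Next() (int64, string, bool) {
	if it.idx >= len(it.items) {
		return -1, "", false
	}
	i := it.idx
	it.idx++
	return int64(i), it.items[i], true
}

func main() {
	it := &listIterator{items: []string{"a", "b"}}
	for {
		i, v, ok := it.Next()
		if !ok {
			break
		}
		fmt.Println(i, v)
	}
}
```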

Fixes #124.

f6e9a891

13 Dec, 2020 5 commits

cleanup: drop orphaned gitignore file. · 5f009262

Eric Myhre authored Dec 14, 2020

5f009262

comments on how to connect this test to full logical validation. · ad7aa3e4

Eric Myhre authored Dec 13, 2020

Cannot quite wire that up yet because of some other still incomplete features.

ad7aa3e4

fix parse test; the generated results it exercises have become stricter! · 8b35174d

Eric Myhre authored Dec 13, 2020

Since https://github.com/ipld/go-ipld-prime/pull/121, presence of
fields is actually checked... but that code also doesn't understand
implicit fields yet, which makes us need a lot of filler.

Also the lack of the "members" field for unions? That was just plain
wrong. Good thing we're catching things like that now.

8b35174d

regenerate. · 8681ab7a

Eric Myhre authored Dec 13, 2020

filenames change due to https://github.com/ipld/go-ipld-prime/pull/105 .

gofmt also applied for the first time.

from here on out: `go generate` should just cause these files to be
automagically updated and formatted.

8681ab7a

sane generation commands for the schemadmt package. · 083de9eb

Eric Myhre authored Dec 13, 2020

Move the existing setup from the schema-schema "demo" dir to here;
and rig it up with go generate conventions that I'm hoisting back from
mvdan's https://github.com/ipld/go-ipld-adl-hamt/blob/master/gen.go .

Move the parse tests with it.

083de9eb

04 Dec, 2020 3 commits

draft of schema types using codegen for data model, with a package for the... · 2e68fd35

Eric Myhre authored Nov 02, 2020

draft of schema types using codegen for data model, with a package for the fully validated data which is implemented by retaining and accessing into the raw data.

2e68fd35

codegen: assembler for struct with map representation now validates all... · f8d654da

Eric Myhre authored Dec 04, 2020

codegen: assembler for struct with map representation now validates all non-optional fields are present.

This continues what https://github.com/ipld/go-ipld-prime/pull/111/ did
and adds the same logic to the map representation. The actual state
tracking works the same way (and was mostly already there).

Rearranged the tests slightly.

Made error messages include both field name and serial key when they
differ due to a rename directive. (It's possible this error would get
nicer if it used a list of StructField instead of just strings, but it
would also get more complicated. Maybe revisit later.)
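
A rough sketch of the idea, not the generated assembler code: the `field` struct, the `missingFields` helper, and the exact message format are all assumptions for illustration. Required fields whose serial key was never seen are collected, and the serial key is mentioned only when it differs from the field name.

```go
package main

import (
	"fmt"
	"strings"
)

// field is a hypothetical description of one struct field.
type field struct {
	name     string // field name in the schema
	key      string // key in the map representation (differs after a rename)
	optional bool
}

// missingFields returns an error naming every non-optional field whose
// serial key is absent from seen, or nil if nothing is missing.
func missingFields(fields []field, seen map[string]bool) error {
	var missing []string
	for _, f := range fields {
		if f.optional || seen[f.key] {
			continue
		}
		if f.key != f.name {
			missing = append(missing, fmt.Sprintf("%s (serial: %q)", f.name, f.key))
		} else {
			missing = append(missing, f.name)
		}
	}
	if len(missing) == 0 {
		return nil
	}
	return fmt.Errorf("missing required fields: %s", strings.Join(missing, ", "))
}

func main() {
	fields := []field{
		{name: "foo", key: "foo"},
		{name: "bar", key: "b"}, // renamed in the representation
		{name: "baz", key: "baz", optional: true},
	}
	fmt.Println(missingFields(fields, map[string]bool{"foo": true}))
	fmt.Println(missingFields(fields, map[string]bool{"foo": true, "b": true}))
}
```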

f8d654da

all: fix a lot of "unkeyed literal" vet warnings · 354f194f

Daniel Martí authored Dec 01, 2020

Reduces the output of 'go vet ./...' from 374 lines to 96. Many warnings
remain, but I have lost my patience for today.

Most of the changes below were automated, especially the single-line
mixins expressions. Unfortunately, many of the Traits structs required
manual copy-pasting.

354f194f

30 Nov, 2020 1 commit

Allow overridden types (#116) · fe47b7f0

Will authored Nov 30, 2020

This change will look at the destination package that codegen is being built into, and will skip generation of types that are already declared by files not prefixed with `ipldsch_`.

This isn't the cleanest escape-hatch, but it's a start.
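
One way such a check could be written, offered as a sketch under assumptions rather than the code from the change itself: the `declaredTypes` helper is hypothetical and takes file contents in memory to stay self-contained. It parses each Go file that does not carry the `ipldsch_` prefix and collects the type names declared there, so a generator could skip emitting those.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// declaredTypes returns the names of types declared in the given Go
// sources (file name -> content), ignoring files whose name starts
// with skipPrefix.
func declaredTypes(files map[string]string, skipPrefix string) (map[string]bool, error) {
	found := map[string]bool{}
	fset := token.NewFileSet()
	for name, src := range files {
		if strings.HasPrefix(name, skipPrefix) {
			continue // generated file: its types do not count as overrides
		}
		f, err := parser.ParseFile(fset, name, src, 0)
		if err != nil {
			return nil, err
		}
		for _, decl := range f.Decls {
			gd, ok := decl.(*ast.GenDecl)
			if !ok || gd.Tok != token.TYPE {
				continue
			}
			for _, spec := range gd.Specs {
				if ts, ok := spec.(*ast.TypeSpec); ok {
					found[ts.Name.Name] = true
				}
			}
		}
	}
	return found, nil
}

func main() {
	files := map[string]string{
		"custom.go":        "package p\n\ntype Foo struct{}\n",
		"ipldsch_types.go": "package p\n\ntype Bar struct{}\n",
	}
	overridden, err := declaredTypes(files, "ipldsch_")
	if err != nil {
		panic(err)
	}
	fmt.Println(overridden)
}
```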

fe47b7f0

17 Nov, 2020 4 commits

add import to ipld in ipldsch_types.go · 9656675b

Will Scott authored Nov 17, 2020

cleanup from #105

9656675b

codegen: rename files. · 32e66f20

Eric Myhre authored Nov 17, 2020

An underscore; and less "gen", because reviewers indicated it felt redundant.

32e66f20

codegen: deterministic order for types in output. · 4f333954

Eric Myhre authored Oct 30, 2020

I'd still probably prefer to replace this with simply having a stable
order that is carried through consistently, but that remains blocked
behind getting self-hosted types, and while it so happens I also got
about 80% of the way there on those today, the second 80% may take
another day. Better make this stable rather than wait.
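
The stopgap can be pictured with the sketch below, which assumes the types are held in a Go map (iteration order is unspecified) and that sorting the names is an acceptable fixed order. The `sortedTypeNames` helper and the map shape are illustrative, not the generator's actual data structures.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedTypeNames returns the keys of types in lexical order, so that
// output produced by walking them is identical on every run.
func sortedTypeNames(types map[string]string) []string {
	names := make([]string, 0, len(types))
	for name := range types {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	types := map[string]string{
		"Widget": "struct",
		"Color":  "enum",
		"Shape":  "union",
	}
	for _, name := range sortedTypeNames(types) {
		fmt.Printf("type %s is a %s\n", name, types[name])
	}
}
```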

4f333954

codegen: rearrange output into finite number of files. · 76193e5d

Eric Myhre authored Oct 30, 2020

Also, emit some comments around the type definitions.

The old file layout is still available, but renamed to GenerateSplayed.
It will probably be removed in the future.

The new format does not currently have stable output order.
I'd like to preserve the original order given by the schema,
but our current placeholder types for schema data don't have this.
More work needed on this.

76193e5d

14 Nov, 2020 1 commit

Return error rather than panic · bfb8b49e

Will Scott authored Nov 14, 2020

bfb8b49e