Commit d3ccd3c3 authored by Eric Myhre, committed by GitHub

Merge pull request #64 from ipld/kinded-union-gen

Kinded union gen
parents d70a19d6 2ea4150a

HACKME: "don't repeat yourself": how-to (and, limits of that goal)
==================================================================

Which kind of extraction applies?
---------------------------------

Things vary in how identical they are.

- **Textually identical**: Some code is textually identical between different types,
  varying only in the most simple and obvious details, like the actual type name it's attached to.
  - These cases can often be extracted on the generation side...
    - We tend to put them in `genparts{*}.go` files.
  - But the output is still pretty duplicacious.
- **Textually identical minus simple variations**: Some code is textually *nearly* identical,
  but varies in relatively minor ways (such as whether or not the "Repr" is part of munges, and "Representation()" calls are made, etc).
  - These cases can often be extracted on the generation side...
    - We tend to put them in `genparts{*}.go` files.
    - There's just a bit more `{{ various templating }}` injected in them, compared to other textually identical templates.
  - But the output is still pretty duplicacious.
- **Linkologically identical**: When code is not _only_ textually identical,
  but also refers to identical types.
  - These cases can be extracted on the generation side...
    - but it may be questionable to do so: if it's terse enough in the output, there's that much less incentive to make a template-side shorthand for it.
  - The output in this case can actually be deduplicated!
    - It's possible we haven't bothered yet. **That doesn't mean it's not worth it**; we probably just haven't had time yet. PRs welcome.
    - How?
      - functions? This is the most likely to apply.
      - embedded types? We haven't seen many cases where this can help, yet (unfortunately?).
      - shared constants?
    - It's not always easy to do this.
      - We usually put something in the "minima" file.
      - We don't currently have a way to toggle whether whole features or shared constants are emitted in the minima file. Todo?
        - This requires keeping state that records what's necessary as we go, so that we can do them all together at the end.
        - May also require varying the imports at the top of the minima file. (But: by doing it only here, we can avoid that complexity in every other file.)
  - **This is actually pretty rare**. _Things that are textually identical are not necessarily linkologically identical_.
    - One can generally turn things that are textually identical into linkologically identical by injecting an interface into the types...
    - ... but this isn't always a *good* idea:
      - if this would cause more allocations? Yeah, very no.
      - even if this can be done without a heap allocation, it probably means inlining and other optimizations will become impossible for the compiler to perform, and often, we're not okay with the performance implications of that either.
- **Identical if it wasn't for debuggability**: In some cases, code varies only by some small constants...
  and really, those constants could be removed entirely. If... we didn't care about debugging. Which we do.
  - This is really the same as "textually identical minus simple variations", but worth talking about briefly just because of the user story around it.
  - A bunch of the error-thunking methods on Node and NodeAssemblers exemplify this.
    - It's really annoying that we can't remove this boilerplate entirely from the generated code.
    - It's also basically impossible, because we *want* information that varies per type in those error messages.
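
As a rough illustration of the "linkologically identical" and "identical if it wasn't for debuggability" cases, here is a minimal sketch. It is not the generator's actual output: `errWrongKind`, `Foo`, `Bar`, and the `demo.` prefix are all hypothetical names. The helper stands in for something that could be emitted once (say, in the "minima" file), while the per-type methods stay duplicated because their receivers differ and because the error text should name the type.

```go
package main

import "fmt"

// errWrongKind is a hypothetical shared helper, standing in for something
// emitted once (e.g. in the "minima" file). It refers to the same types no
// matter which generated type uses it, so it can be deduplicated in the output.
type errWrongKind struct {
	TypeName   string // varies per generated type; kept for debuggability
	MethodName string
}

func (e errWrongKind) Error() string {
	return fmt.Sprintf("%s called on a %s node", e.MethodName, e.TypeName)
}

// Foo and Bar are hypothetical generated types.
type Foo struct{}
type Bar struct{}

// The two methods below are textually identical except for the type name.
// Their receivers differ, so the methods themselves cannot be shared;
// only the error construction they delegate to can.
func (Foo) AsBool() (bool, error) {
	return false, errWrongKind{TypeName: "demo.Foo", MethodName: "AsBool"}
}

func (Bar) AsBool() (bool, error) {
	return false, errWrongKind{TypeName: "demo.Bar", MethodName: "AsBool"}
}

func main() {
	_, err := Foo{}.AsBool()
	fmt.Println(err)
	_, err = Bar{}.AsBool()
	fmt.Println(err)
}
```

The type name constant is exactly the part that cannot be factored away: dropping it would make the two methods collapse into one body, at the cost of error messages that no longer say which type was misused.
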

What mechanism of extraction should be used?
--------------------------------------------

- (for gen-side DRY) gen-side functions
  - this is most of what we've done so far
- (for gen-side DRY) sub-templates
  - we currently don't really use this at all
- (for gen-side DRY) template concatenation
  - some of this: kinded union representations do this
- (for output-side DRY) output-side functions
  - some of this: see the "minima" file.
- (for output-side DRY) output-side embeds
  - we currently don't really use this at all (it hasn't really turned out applicable in any cases yet).
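
For readers unfamiliar with the sub-template mechanism listed above, here is a minimal sketch of what it looks like with Go's standard `text/template` package. The package described here does not currently work this way; `typeInfo`, `tmplSrc`, `render`, and the `kindMethod` template name are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// typeInfo is a hypothetical stand-in for per-type generator data.
type typeInfo struct {
	Name string
	Kind string
}

// tmplSrc defines a named sub-template once ("kindMethod") and then
// invokes it for every element of the input slice.
const tmplSrc = `{{ define "kindMethod" }}func ({{ .Name }}) Kind() string { return "{{ .Kind }}" }{{ end }}{{ range . }}{{ template "kindMethod" . }}
{{ end }}`

// render executes the template against the given types and returns the
// generated source text.
func render(types []typeInfo) (string, error) {
	tmpl, err := template.New("gen").Parse(tmplSrc)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, types); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render([]typeInfo{
		{Name: "Foo", Kind: "map"},
		{Name: "Bar", Kind: "string"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Note that this only dries up the generation side: the rendered output still contains one near-identical method per type.
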

Don't overdo it
---------------

I'd rather have longer templates than harder-to-read and harder-to-maintain templates.
There's a balance to this and it's tricky to pick out.

A good heuristic to consider might be: are we extracting this thing because we can?
Or because if we made changes to this thing in the future, we'd expect to need to make that change in every single place we've extracted it from,
which therefore makes the extraction a net win for maintainability?
If it's the latter: then yes, extract it.
If it's not clear: maybe let it be.

(It may be the case that the preferable balance for DRYing changes over time as we keep maintaining things.
We'll see; but it's certainly the case that the first draft of this package has favored length heavily.
There was a lot of "it's not clear" when the maintainability heuristic was applied during the first writing of this;
that may change! If so, that's great.)

...@@ -95,7 +95,7 @@ Legend:
| ... type level | ✔ | ✔ |
| ... keyed representation | ✔ | ✔ |
| ... envelope representation | ✘ | ✘ |
| ... kinded representation | | |
| ... inline representation | ✘ | ✘ |
| ... byteprefix representation | ✘ | ✘ |
......
This diff is collapsed.
...@@ -50,6 +50,8 @@ func Generate(pth string, pkgName string, ts schema.TypeSystem, adjCfg *AdjunctC
	switch t2.RepresentationStrategy().(type) {
	case schema.UnionRepresentation_Keyed:
		EmitEntireType(NewUnionReprKeyedGenerator(pkgName, t2, adjCfg), f)
	case schema.UnionRepresentation_Kinded:
		EmitEntireType(NewUnionReprKindedGenerator(pkgName, t2, adjCfg), f)
	default:
		panic("unrecognized union representation strategy")
	}
......
...@@ -54,6 +54,54 @@ func doTemplate(tmplstr string, w io.Writer, adjCfg *AdjunctCfg, data interface{
				panic("invalid enumeration value!")
			}
		},
		"Kind": func(s string) ipld.ReprKind {
			switch s {
			case "map":
				return ipld.ReprKind_Map
			case "list":
				return ipld.ReprKind_List
			case "null":
				return ipld.ReprKind_Null
			case "bool":
				return ipld.ReprKind_Bool
			case "int":
				return ipld.ReprKind_Int
			case "float":
				return ipld.ReprKind_Float
			case "string":
				return ipld.ReprKind_String
			case "bytes":
				return ipld.ReprKind_Bytes
			case "link":
				return ipld.ReprKind_Link
			default:
				panic("invalid enumeration value!")
			}
		},
		"KindSymbol": func(k ipld.ReprKind) string {
			switch k {
			case ipld.ReprKind_Map:
				return "ipld.ReprKind_Map"
			case ipld.ReprKind_List:
				return "ipld.ReprKind_List"
			case ipld.ReprKind_Null:
				return "ipld.ReprKind_Null"
			case ipld.ReprKind_Bool:
				return "ipld.ReprKind_Bool"
			case ipld.ReprKind_Int:
				return "ipld.ReprKind_Int"
			case ipld.ReprKind_Float:
				return "ipld.ReprKind_Float"
			case ipld.ReprKind_String:
				return "ipld.ReprKind_String"
			case ipld.ReprKind_Bytes:
				return "ipld.ReprKind_Bytes"
			case ipld.ReprKind_Link:
				return "ipld.ReprKind_Link"
			default:
				panic("invalid enumeration value!")
			}
		},
		"add":   func(a, b int) int { return a + b },
		"title": func(s string) string { return strings.Title(s) },
	}).
......
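
The `Kind` and `KindSymbol` helpers in the hunk above are gen-side functions: they are registered in a template function map and called from template text. The sketch below shows how such helpers might be used. It is an illustration under assumptions, not code from this commit: `ReprKind` is a local stand-in for `ipld.ReprKind` (so the example runs without the library), only two kinds are covered, and `memberInfo`, `funcs`, and `render` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// ReprKind is a local stand-in for ipld.ReprKind, so that this sketch
// is self-contained.
type ReprKind uint8

const (
	ReprKind_Invalid ReprKind = iota
	ReprKind_Map
	ReprKind_String
)

// memberInfo is hypothetical per-member data handed to the template.
type memberInfo struct {
	Kind   ReprKind
	Member string
}

// funcs mirrors the shape of the helpers in the diff, reduced to two kinds.
var funcs = template.FuncMap{
	// Kind maps a schema-ish string to the enum value.
	"Kind": func(s string) ReprKind {
		switch s {
		case "map":
			return ReprKind_Map
		case "string":
			return ReprKind_String
		default:
			panic("invalid enumeration value!")
		}
	},
	// KindSymbol maps the enum value to the Go source text that names it.
	"KindSymbol": func(k ReprKind) string {
		switch k {
		case ReprKind_Map:
			return "ipld.ReprKind_Map"
		case ReprKind_String:
			return "ipld.ReprKind_String"
		default:
			panic("invalid enumeration value!")
		}
	},
}

// render parses tmplStr with the helper functions attached and executes it.
func render(tmplStr string, data interface{}) (string, error) {
	tmpl, err := template.New("demo").Funcs(funcs).Parse(tmplStr)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Emit a switch-case label for a member whose kind comes from the data.
	out, err := render(`case {{ .Kind | KindSymbol }}: // member {{ .Member }}`,
		memberInfo{Kind: ReprKind_String, Member: "Bar"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)

	// Chain both helpers: string -> enum value -> Go symbol text.
	out, err = render(`case {{ "map" | Kind | KindSymbol }}:`, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```
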
...@@ -2,6 +2,8 @@ package schema
import (
	"fmt"

	"github.com/ipld/go-ipld-prime"
)

// Everything in this file is __a temporary hack__ and will be __removed__.
...@@ -102,6 +104,9 @@ func SpawnUnion(name TypeName, members []TypeName, repr UnionRepresentation) *Ty
func SpawnUnionRepresentationKeyed(table map[string]TypeName) UnionRepresentation_Keyed {
	return UnionRepresentation_Keyed{table}
}
func SpawnUnionRepresentationKinded(table map[ipld.ReprKind]TypeName) UnionRepresentation_Kinded {
	return UnionRepresentation_Kinded{table}
}

// The methods relating to TypeSystem are also mutation-heavy and placeholdery.
......
...@@ -73,6 +73,22 @@ type Type interface {
	// can vary in representation kind based on their value (specifically,
	// kinded-representation unions have this property).
	Kind() Kind

	// RepresentationBehavior returns a description of how the representation
	// of this type will behave in terms of the IPLD Data Model.
	// This property varies based on the representation strategy of a type.
	//
	// In one case, the representation behavior cannot be known statically,
	// and varies based on the data: kinded unions have this trait.
	//
	// This property is used by kinded unions, which require that their members
	// all have distinct representation behavior.
	// (It follows that a kinded union cannot have another kinded union as a member.)
	//
	// You may also be interested in a related property that might have been called "TypeBehavior".
	// However, this method doesn't exist, because it's a deterministic property of `Kind()`!
	// You can use `Kind.ActsLike()` to get type-level behavioral information.
	RepresentationBehavior() ipld.ReprKind
}

var (
......
package schema

import (
	ipld "github.com/ipld/go-ipld-prime"
)

/* cookie-cutter standard interface stuff */

func (t *typeBase) _Type(ts *TypeSystem) {
...@@ -20,6 +24,47 @@ func (TypeUnion) Kind() Kind { return Kind_Union }
func (TypeStruct) Kind() Kind { return Kind_Struct }
func (TypeEnum) Kind() Kind { return Kind_Enum }

func (TypeBool) RepresentationBehavior() ipld.ReprKind { return ipld.ReprKind_Bool }
func (TypeString) RepresentationBehavior() ipld.ReprKind { return ipld.ReprKind_String }
func (TypeBytes) RepresentationBehavior() ipld.ReprKind { return ipld.ReprKind_Bytes }
func (TypeInt) RepresentationBehavior() ipld.ReprKind { return ipld.ReprKind_Int }
func (TypeFloat) RepresentationBehavior() ipld.ReprKind { return ipld.ReprKind_Float }
func (TypeMap) RepresentationBehavior() ipld.ReprKind { return ipld.ReprKind_Map }
func (TypeList) RepresentationBehavior() ipld.ReprKind { return ipld.ReprKind_List }
func (TypeLink) RepresentationBehavior() ipld.ReprKind { return ipld.ReprKind_Link }
func (t TypeUnion) RepresentationBehavior() ipld.ReprKind {
	switch t.representation.(type) {
	case UnionRepresentation_Keyed:
		return ipld.ReprKind_Map
	case UnionRepresentation_Kinded:
		return ipld.ReprKind_Invalid // you can't know with this one, until you see the value (and thus can see its inhabitant's behavior)!
	case UnionRepresentation_Envelope:
		return ipld.ReprKind_Map
	case UnionRepresentation_Inline:
		return ipld.ReprKind_Map
	default:
		panic("unreachable")
	}
}
func (t TypeStruct) RepresentationBehavior() ipld.ReprKind {
	switch t.representation.(type) {
	case StructRepresentation_Map:
		return ipld.ReprKind_Map
	case StructRepresentation_Tuple:
		return ipld.ReprKind_List
	case StructRepresentation_StringPairs:
		return ipld.ReprKind_String
	case StructRepresentation_Stringjoin:
		return ipld.ReprKind_String
	default:
		panic("unreachable")
	}
}
func (t TypeEnum) RepresentationBehavior() ipld.ReprKind {
	// TODO: this should have a representation strategy switch too; sometimes that will indicate int representation behavior.
	return ipld.ReprKind_String
}

/* interesting methods per Type type */

// beware: many of these methods will change when we successfully bootstrap self-hosting.
...@@ -103,6 +148,12 @@ func (r UnionRepresentation_Keyed) GetDiscriminant(t Type) string {
	panic("that type isn't a member of this union")
}

// GetMember returns the type name of the member matching the kind argument,
// or the zero value of TypeName if that kind is not mapped to a member of this union.
func (r UnionRepresentation_Kinded) GetMember(k ipld.ReprKind) TypeName {
	return r.table[k]
}

// Fields returns a slice of descriptions of the object's fields.
func (t TypeStruct) Fields() []StructField {
	a := make([]StructField, len(t.fields))
......
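
The doc comment on `RepresentationBehavior` states that members of a kinded union must all have distinct representation behavior, and that a kinded union cannot contain another kinded union. A minimal sketch of how that rule could be checked while building the kind-to-member table follows. This is not code from the commit: `ReprKind` and `TypeName` are local stand-ins (the latter assumed here to be a string type), and `member` and `buildKindedTable` are hypothetical names.

```go
package main

import "fmt"

// ReprKind is a local stand-in for ipld.ReprKind.
type ReprKind uint8

const (
	ReprKind_Invalid ReprKind = iota
	ReprKind_Map
	ReprKind_List
	ReprKind_String
	ReprKind_Int
)

// TypeName is a local stand-in for schema.TypeName (assumed string-based).
type TypeName string

// member pairs a type name with its statically known representation behavior.
type member struct {
	Name     TypeName
	Behavior ReprKind
}

// buildKindedTable builds the kind-to-member table for a kinded union,
// rejecting members whose representation kind is unknown or already taken.
func buildKindedTable(members []member) (map[ReprKind]TypeName, error) {
	table := make(map[ReprKind]TypeName, len(members))
	for _, m := range members {
		if m.Behavior == ReprKind_Invalid {
			// This is what a nested kinded union would report.
			return nil, fmt.Errorf("member %q has no statically known representation kind", m.Name)
		}
		if prev, ok := table[m.Behavior]; ok {
			return nil, fmt.Errorf("members %q and %q share a representation kind", prev, m.Name)
		}
		table[m.Behavior] = m.Name
	}
	return table, nil
}

func main() {
	table, err := buildKindedTable([]member{
		{Name: "Foo", Behavior: ReprKind_Map},
		{Name: "Bar", Behavior: ReprKind_String},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(table[ReprKind_Map])       // a mapped kind yields its member name
	fmt.Println(table[ReprKind_Int] == "") // an unmapped kind yields the zero value

	_, err = buildKindedTable([]member{
		{Name: "Foo", Behavior: ReprKind_Map},
		{Name: "Baz", Behavior: ReprKind_Map},
	})
	fmt.Println(err != nil) // colliding kinds are rejected
}
```

A lookup for an unmapped kind returns the map's zero value, which is the same behavior a plain `r.table[k]` lookup has in `GetMember`.
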