-**Textually identical**: Some code is textually identical between different types,
varying only in the most simple and obvious details, like the actual type name it's attached to.
- These cases can often be extracted on the generation side...
- We tend to put them in `genparts{*}.go` files.
- But the output is still pretty duplicacious.
-**Textually identical minus simple variations**: Some code is textually *nearly* identical,
but varies in relatively minor ways (such as whether or not the "Repr" is part of munges, and "Representation()" calls are made, etc).
- These cases can often be extracted on the generation side...
- We tend to put them in `genparts{*}.go` files.
- There's just a bit more `{{ various templating }}` injected in them, compared to other textually identical templates.
- But the output is still pretty duplicacious.
-**Linkologically identical**: When code is not _only_ textually identical,
but also refers to identical types.
- These cases can be extracted on the generation side...
- but it may be questionable to do so: if its terse enough in the output, there's that much less incentive to make a template-side shorthand for it.
- The output in this case can actually be deduplicated!
- It's possible we haven't bothered yet. **That doesn't mean it's not worth it**; we probably just haven't had time yet. PRs welcome.
- How?
- functions? This is the most likely to apply.
- embedded types? We haven't seen many cases where this can help, yet (unfortunately?).
- shared constants?
- It's not always easy to do this.
- We usually put something in the "minima" file.
- We don't currently have a way to toggle whether whole features or shared constants are emitted in the minima file. Todo?
- This requires keeping state that records what's necessary as we go, so that we can do them all together at the end.
- May also require varying the imports at the top of the minima file. (But: by doing it only here, we can avoid that complexity in every other file.)
-**This is actually pretty rare**. _Things that are textually identical are not necessarily linkologically identical_.
- One can generally turn things that are textually identical into linkologically identical by injecting an interface into the types...
- ... but this isn't always a *good* idea:
- if this would cause more allocations? Yeah, very no.
- even if this can be done without a heap allocation, it probably means inlining and other optimizations will become impossible for the compiler to perform, and often, we're not okay with the performance implications of that either.
-**Identical if it wasn't for debugability**: In some cases, code varies only by some small constants...
and really, those constants could be removed entirely. If... we didn't care about debugging. Which we do.
- This is really the same as "textually identical minus simple variations", but worth talking about briefly just because of the user story around it.
- A bunch of the error-thunking methods on Node and NodeAssemblers exemplify this.
- It's really annoying that we can't remove this boilerplate entirely from the generated code.
- It's also basically impossible, because we *want* information that varies per type in those error messages.
What mechanism of extraction should be used?
--------------------------------------------
- (for gen-side dry) gen-side functions
- this is most of what we've done so far
- (for gen-side dry) sub-templates
- we currently don't really use this at all
- (for gen-side dry) template concatenation
- some of this: kinded union representations do this
- (for output-side dry) output side functions
- some of this: see "minima" file.
- (for output-side dry) output side embeds
- we currently don't really use this at all (it hasn't really turned out applicable in any cases yet).
Don't overdo it
---------------
I'd rather have longer templates than harder-to-read and harder-to-maintain templates.
There's a balance to this and it's tricky to pick out.
A good heuristic to consider might be: are we extracting this thing because we can?
Or because if we made changes to this thing in the future, we'd expect to need to make that change in every single place we've extracted it from,
which therefore makes the extraction a net win for maintainability?
If it's the latter: then yes, extract it.
If it's not clear: maybe let it be.
(It may be the case that the preferable balance for DRYing changes over time as we keep maintaining things.
We'll see; but it's certainly the case that the first draft of this package has favored length heavily.
There was a lot of "it's not clear" when the maintainability heuristic was applied during the first writing of this;
someSwitchClausestring,// template condition for if *any* switch clause should be present
condClausestring,// template condition for the member this should match on when in the range
retClausestring,// clause returning the thing (how to delegate methodsig, generally)
appropriateKindstring,// for error messages
nopeSentinelstring,// for error return paths; generally the zero value for the first return type.
nopeSentinelOnlybool,// true if this method has no error return, just the sentinel.
)string{
// We really could just... call the methods directly (and elide the switch entirely all the time), in the case of the "interface" implementation strategy.
// We don't, though, because that would deprive us of getting the union type's name in the wrong-kind errors...
// and in addition to that being sadface in general, it would be downright unacceptable if that behavior varied based on implementation strategy.
//
// This error text doesn't tell us what the member kind is. This might read weirdly.
// It's possible we could try to cram a description of the inhabitant into the "TypeName" since it's stringy; but unclear if that's a good idea either.
// These template concatenations have evolved into a mess very quickly. This entire thing should be replaced.
// String concatenations of template clauses is an atrociously unhygenic way to compose things;
// it looked like we could limp by with it for a while, but it's gotten messier faster than expected.
// Much of this is familiar: the 'w', the 'm' are all as usual.
// Some things may look a little odd here compared to all other assemblers:
// we're kinda halfway between what's conventionally seen for a scalar and what's conventionally seen for a recursive.
// There's no 'maState' or 'laState'-typed fields (which feels like a scalar) because even if we end up acting like a map or list, that state is in the relevant child assembler.
// We don't even have a 'cm' field, because we can get away with something really funky: we can just copy our own 'm' _pointer_ into children; our doneness and their doneness is the same.
// We never have to worry about maybeism of our children; the nullable and optional modifiers aren't possible on union members.
// (We *do* still have to consider null values though, as null is still a kind, and thus can be routed to one of our members!)
// 'ca' is as it is in the type-level assembler: technically, not super necessary, except that it allows minimizing the amount of work that resetting needs to do.
doTemplate(`
type _{{ .Type | TypeSymbol }}__ReprAssembler struct {
func (na *_{{ .Type | TypeSymbol }}__ReprAssembler) reset() {
switch na.ca {
case 0:
return
{{- range $i, $member := .Type.Members }}
case {{ add $i 1 }}:
na.ca{{ add $i 1 }}.reset()
{{- end}}
default:
panic("unreachable")
}
}
`,w,g.AdjCfg,g)
}
funckindedUnionNodeAssemblerMethodTemplateMunge(
methodNamestring,// for error messages
methodSigstring,// output literally
condClausestring,// template condition for the member this should match on
retClausestring,// clause returning the thing (how to delegate methodsig, generally)
twoReturnsbool,// true if a nil should be returned as well as the error
)string{
// The value pointed to by `na.m` isn't modified here, because we're sharing it with the child, who should do so.
// This also means that value gets checked twice -- once by us, because we need to halt if we've already been used --
// and also a second time by the child when we delegate to it, which, unbeknownst to it, is irrelevant.
// I don't see a good way to remedy this shy of making more granular (unexported!) methods. (Might be worth it.)
// This probably also isn't the same for all of the assembler methods: the methods we delegate to aren't doing as many check branches when they're for scalars,
// because they expected to be used in contexts where many values of the 'm' enum aren't reachable -- an expectation we've suddenly subverted with this path!
//
// FUTURE: The error returns here are deeply questionable, and not as helpful as they could be.
// ErrNotUnionStructure is about the most applicable thing so far, but it's very freetext.
// ErrWrongKind wouldn't fit at all: assumes that we can say what kind of node we have, but this is the one case in the whole system where *we can't*; also, assumes there's one actual correct kind, and that too is false here!
maybeNilComma:=""
iftwoReturns{
maybeNilComma+="nil,"
}
return`
func (na *_{{ .Type | TypeSymbol }}__ReprAssembler) `+methodSig+` {
switch *na.m {
case schema.Maybe_Value, schema.Maybe_Null:
panic("invalid state: cannot assign into assembler that's already finished")
case midvalue:
panic("invalid state: cannot assign into assembler that's already working on a larger structure!")
}
{{- range $i, $member := .Type.Members }}
`+condClause+`
{{- if dot.Type | MaybeUsesPtr }}
if na.w == nil {
na.w = &_{{ dot.Type | TypeSymbol }}{}
}
{{- end}}
na.ca = {{ add $i 1 }}
{{- if (eq (dot.AdjCfg.UnionMemlayout dot.Type) "embedAll") }}
na.w.tag = {{ add $i 1 }}
na.ca{{ add $i 1 }}.w = &na.w.x{{ add $i 1 }}
na.ca{{ add $i 1 }}.m = na.m
return na.ca{{ add $i 1 }}`+retClause+`
{{- else if (eq (dot.AdjCfg.UnionMemlayout dot.Type) "interface") }}
return `+maybeNilComma+` schema.ErrNotUnionStructure{TypeName: "{{ .PkgName }}.{{ .Type.Name }}.Repr", Detail: "`+methodName+` called but is not valid for any of the kinds that are valid members of this union"}
// TODO: I think this may need some special handling to account for if our union is itself used in a nullable circumstance; that should overrule this behavior.