node.go 14.9 KB
Newer Older
tavit ohanian's avatar
tavit ohanian committed
1
package ld
Eric Myhre's avatar
Eric Myhre committed
2

3
// Node represents a value in LD.  Any point in a tree of data is a node:
4
// scalar values (like int64, string, etc) are nodes, and
5 6
// so are recursive values (like map and list).
//
7
// Nodes and kinds are described in the LD specs at
tavit ohanian's avatar
tavit ohanian committed
8
// https://gitlab.dms3.io/ld/specs/blob/master/data-model-layer/data-model.md .
9 10 11 12
//
// Methods on the Node interface cover the superset of all possible methods for
// all possible kinds -- but some methods only make sense for particular kinds,
// and thus will only make sense to call on values of the appropriate kind.
13
// (For example, 'Length' on an integer doesn't make sense,
14
// and 'AsInt' on a map certainly doesn't work either!)
15
// Use the Kind method to find out the kind of value before
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41
// calling kind-specific methods.
// Individual method documentation state which kinds the method is valid for.
// (If you're familiar with the stdlib reflect package, you'll find
// the design of the Node interface very comparable to 'reflect.Value'.)
//
// The Node interface is read-only.  All of the methods on the interface are
// for examining values, and implementations should be immutable.
// The companion interface, NodeBuilder, provides the matching writable
// methods, and should be use to create a (thence immutable) Node.
//
// Keeping Node immutable and separating mutation into NodeBuilder makes
// it possible to perform caching (or rather, memoization, since there's no
// such thing as cache invalidation for immutable systems) of computed
// properties of Node; use copy-on-write algorithms for memory efficiency;
// and to generally build pleasant APIs.
// Many library functions will rely on the immutability of Node (e.g.,
// assuming that pointer-equal nodes do not change in value over time),
// so any user-defined Node implementations should be careful to uphold
// the immutability contract.)
//
// There are many different concrete types which implement Node.
// The primary purpose of various node implementations is to organize
// memory in the program in different ways -- some in-memory layouts may
// be more optimal for some programs than others, and changing the Node
// (and NodeBuilder) implementations lets the programmer choose.
//
42
// For concrete implementations of Node, check out the "./node/" folder,
43
// and the packages within it.
44
// "node/basic" should probably be your first start; the Node and NodeBuilder
45 46 47 48 49 50
// implementations in that package work for any data.
// Other packages are optimized for specific use-cases.
// Codegen tools can also be used to produce concrete implementations of Node;
// these may be specific to certain data, but still conform to the Node
// interface for interoperability and to support higher-level functions.
//
51 52 53 54 55 56 57
// Nodes may also be *typed* -- see the 'schema' package and `schema.TypedNode`
// interface, which extends the Node interface with additional methods.
// Typed nodes have additional constraints and behaviors:
// for example, they may be a "struct" and have a specific type/structure
// to what data you can put inside them, but still behave as a regular Node
// in all ways this interface specifies (so you can traverse typed nodes, etc,
// without any additional special effort).
Eric Myhre's avatar
Eric Myhre committed
58
type Node interface {
59
	// Kind returns a value from the Kind enum describing what the
60
	// essential serializable kind of this node is (map, list, integer, etc).
61
	// Most other handling of a node requires first switching upon the kind.
62
	Kind() Kind
63

64
	// LookupByString looks up a child object in this node and returns it.
65
	// The returned Node may be any of the Kind:
66
	// a primitive (string, int64, etc), a map, a list, or a link.
67
	//
68
	// If the Kind of this Node is not Kind_Map, a nil node and an error
69 70 71
	// will be returned.
	//
	// If the key does not exist, a nil node and an error will be returned.
72
	LookupByString(key string) (Node, error)
73

74
	// LookupByNode is the equivalent of LookupByString, but takes a reified Node
75 76
	// as a parameter instead of a plain string.
	// This mechanism is useful if working with typed maps (if the key types
77
	// have constraints, and you already have a reified `schema.TypedNode` value,
78 79
	// using that value can save parsing and validation costs);
	// and may simply be convenient if you already have a Node value in hand.
80 81
	//
	// (When writing generic functions over Node, a good rule of thumb is:
82
	// when handling a map, check for `schema.TypedNode`, and in this case prefer
83
	// the LookupByNode(Node) method; otherwise, favor LookupByString; typically
84
	// implementations will have their fastest paths thusly.)
85
	LookupByNode(key Node) (Node, error)
86

87
	// LookupByIndex is the equivalent of LookupByString but for indexing into a list.
88
	// As with LookupByString, the returned Node may be any of the Kind:
89
	// a primitive (string, int64, etc), a map, a list, or a link.
90
	//
91
	// If the Kind of this Node is not Kind_List, a nil node and an error
92 93 94
	// will be returned.
	//
	// If idx is out of range, a nil node and an error will be returned.
95
	LookupByIndex(idx int64) (Node, error)
96

97
	// LookupBySegment is will act as either LookupByString or LookupByIndex,
98 99
	// whichever is contextually appropriate.
	//
100
	// Using LookupBySegment may imply an "atoi" conversion if used on a list node,
101 102
	// or an "itoa" conversion if used on a map node.  If an "itoa" conversion
	// takes place, it may error, and this method may return that error.
103
	LookupBySegment(seg PathSegment) (Node, error)
104 105 106

	// Note that when using codegenerated types, there may be a fifth variant
	// of lookup method on maps: `Get($GeneratedTypeKey) $GeneratedTypeValue`!
Eric Myhre's avatar
Eric Myhre committed
107

108 109
	// MapIterator returns an iterator which yields key-value pairs
	// traversing the node.
110
	// If the node kind is anything other than a map, nil will be returned.
111 112 113 114 115
	//
	// The iterator will yield every entry in the map; that is, it
	// can be expected that itr.Next will be called node.Length times
	// before itr.Done becomes true.
	MapIterator() MapIterator
116

117
	// ListIterator returns an iterator which traverses the node and yields indicies and list entries.
118
	// If the node kind is anything other than a list, nil will be returned.
119
	//
120 121 122
	// The iterator will yield every entry in the list; that is, it
	// can be expected that itr.Next will be called node.Length times
	// before itr.Done becomes true.
123 124
	//
	// List iteration is ordered, and indices yielded during iteration will range from 0 to Node.Length-1.
125
	// (The LD Data Model definition of lists only defines that it is an ordered list of elements;
126
	// the definition does not include a concept of sparseness, so the indices are always sequential.)
127
	ListIterator() ListIterator
128 129 130

	// Length returns the length of a list, or the number of entries in a map,
	// or -1 if the node is not of list nor map kind.
131
	Length() int64
132

133 134
	// Absent nodes are returned when traversing a struct field that is
	// defined by a schema but unset in the data.  (Absent nodes are not
135
	// possible otherwise; you'll only see them from `schema.TypedNode`.)
136
	// The absent flag is necessary so iterating over structs can
137 138
	// unambiguously make the distinction between values that are
	// present-and-null versus values that are absent.
139
	//
tavit ohanian's avatar
tavit ohanian committed
140
	// Absent nodes respond to `Kind()` as `ld.Kind_Null`,
141 142 143 144
	// for lack of any better descriptive value; you should therefore
	// always check IsAbsent rather than just a switch on kind
	// when it may be important to handle absent values distinctly.
	IsAbsent() bool
Eric Myhre's avatar
Eric Myhre committed
145

Eric Myhre's avatar
Eric Myhre committed
146
	IsNull() bool
Eric Myhre's avatar
Eric Myhre committed
147
	AsBool() (bool, error)
148
	AsInt() (int64, error)
149 150 151
	AsFloat() (float64, error)
	AsString() (string, error)
	AsBytes() ([]byte, error)
152
	AsLink() (Link, error)
153

154
	// Prototype returns a NodePrototype which can describe some properties of this node's implementation,
155 156
	// and also be used to get a NodeBuilder,
	// which can be use to create new nodes with the same implementation as this one.
157
	//
158
	// For typed nodes, the NodePrototype will also implement schema.Type.
159
	//
160
	// For Advanced Data Layouts, the NodePrototype will encapsulate any additional
161
	// parameters and configuration of the ADL, and will also (usually)
162
	// implement NodePrototypeSupportingAmend.
Eric Myhre's avatar
Eric Myhre committed
163
	//
164
	// Calling this method should not cause an allocation.
165
	Prototype() NodePrototype
166 167
}

Daniel Martí's avatar
Daniel Martí committed
168
// ADL represents an Advanced Data Layout, a special kind of Node which
169
// implements custom logic while still behaving like an LD node.
Daniel Martí's avatar
Daniel Martí committed
170 171
//
// For more details, see the docs at
tavit ohanian's avatar
tavit ohanian committed
172
// https://gitlab.dms3.io/ld/specs/blob/master/schemas/authoring-guide.md.
Daniel Martí's avatar
Daniel Martí committed
173 174 175 176 177 178 179 180
type ADL interface {
	Node

	// Substrate returns the underlying Data Model node, which can be used
	// to encode an ADL's raw layout.
	Substrate() Node
}

181 182
// NodePrototype describes a node implementation (all Node have a NodePrototype),
// and a NodePrototype can always be used to get a NodeBuilder.
183
//
184 185
// A NodePrototype may also provide other information about implementation;
// such information is specific to this library ("prototype" isn't a concept
186
// you'll find in the LD Specifications), and is usually provided through
187
// feature-detection interfaces (for example, see NodePrototypeSupportingAmend).
188
//
189
// Generic algorithms for working with LD Nodes make use of NodePrototype
190 191 192 193
// to get builders for new nodes when creating data, and can also use the
// feature-detection interfaces to help decide what kind of operations
// will be optimal to use on a given node implementation.
//
194 195
// Note that NodePrototype is not the same as schema.Type.
// NodePrototype is a (golang-specific!) way to reflect upon the implementation
196
// and in-memory layout of some LD data.
197 198
// schema.Type is information about how a group of nodes is related in a schema
// (if they have one!) and the rules that the type mandates the node must follow.
199 200
// (Every node must have a prototype; but schema types are an optional feature.)
type NodePrototype interface {
201
	// NewBuilder returns a NodeBuilder that can be used to create a new Node.
202
	//
203
	// Note that calling NewBuilder often performs an allocation
204
	// (while in contrast, getting a NodePrototype typically does not!) --
205 206 207 208
	// this may be consequential when writing high performance code.
	NewBuilder() NodeBuilder
}

209 210
// NodePrototypeSupportingAmend is a feature-detection interface that can be
// used on a NodePrototype to see if it's possible to build new nodes of this style
211 212 213 214 215 216
// while sharing some internal data in a copy-on-write way.
//
// For example, Nodes using an Advanced Data Layout will typically
// support this behavior, and since ADLs are often used for handling large
// volumes of data, detecting and using this feature can result in significant
// performance savings.
217
type NodePrototypeSupportingAmend interface {
218 219 220 221 222 223
	AmendingBuilder(base Node) NodeBuilder
	// FUTURE: probably also needs a `AmendingWithout(base Node, filter func(k,v) bool) NodeBuilder`, or similar.
	//  ("deletion" based APIs are also possible but both more complicated in interfaces added, and prone to accidentally quadratic usage.)
	// FUTURE: there should be some stdlib `Copy` (?) methods that automatically look for this feature, and fallback if absent.
	//  Might include a wide range of point `Transform`, etc, methods.
	// FUTURE: consider putting this (and others like it) in a `feature` package, if there begin to be enough of them and docs get crowded.
Eric Myhre's avatar
Eric Myhre committed
224
}
225

226 227 228
// MapIterator is an interface for traversing map nodes.
// Sequential calls to Next() will yield key-value pairs;
// Done() describes whether iteration should continue.
229
//
230 231 232 233 234 235 236 237 238 239 240 241
// Iteration order is defined to be stable: two separate MapIterator
// created to iterate the same Node will yield the same key-value pairs
// in the same order.
// The order itself may be defined by the Node implementation: some
// Nodes may retain insertion order, and some may return iterators which
// always yield data in sorted order, for example.
type MapIterator interface {
	// Next returns the next key-value pair.
	//
	// An error value can also be returned at any step: in the case of advanced
	// data structures with incremental loading, it's possible to encounter
	// cancellation or I/O errors at any point in iteration.
242 243 244 245
	// If an error will be returned by the next call to Next,
	// then the boolean returned by the Done method will be false
	// (meaning it's acceptable to check Done first and move on if it's true,
	// since that both means the iterator is complete and that there is no error).
246 247 248 249
	// If an error is returned, the key and value may be nil.
	Next() (key Node, value Node, err error)

	// Done returns false as long as there's at least one more entry to iterate.
Matthias Beyer's avatar
Fix doc  
Matthias Beyer committed
250
	// When Done returns true, iteration can stop.
251
	//
Eric Myhre's avatar
Eric Myhre committed
252 253
	// Note when implementing iterators for advanced data layouts (e.g. more than
	// one chunk of backing data, which is loaded incrementally): if your
254
	// implementation does any I/O during the Done method, and it encounters
Eric Myhre's avatar
Eric Myhre committed
255
	// an error, it must return 'false', so that the following Next call
256 257 258 259 260 261 262
	// has an opportunity to return the error.
	Done() bool
}

// ListIterator is an interface for traversing list nodes.
// Sequential calls to Next() will yield index-value pairs;
// Done() describes whether iteration should continue.
263
//
264 265 266 267
// ListIterator's Next method returns an index for convenience,
// but this number will always start at 0 and increment by 1 monotonically.
// A loop which iterates from 0 to Node.Length while calling Node.LookupByIndex
// is equivalent to using a ListIterator.
268 269 270 271 272 273
type ListIterator interface {
	// Next returns the next index and value.
	//
	// An error value can also be returned at any step: in the case of advanced
	// data structures with incremental loading, it's possible to encounter
	// cancellation or I/O errors at any point in iteration.
274 275 276 277
	// If an error will be returned by the next call to Next,
	// then the boolean returned by the Done method will be false
	// (meaning it's acceptable to check Done first and move on if it's true,
	// since that both means the iterator is complete and that there is no error).
278
	// If an error is returned, the key and value may be nil.
279
	Next() (idx int64, value Node, err error)
280 281 282 283

	// Done returns false as long as there's at least one more entry to iterate.
	// When Done returns false, iteration can stop.
	//
Eric Myhre's avatar
Eric Myhre committed
284 285
	// Note when implementing iterators for advanced data layouts (e.g. more than
	// one chunk of backing data, which is loaded incrementally): if your
286
	// implementation does any I/O during the Done method, and it encounters
Eric Myhre's avatar
Eric Myhre committed
287
	// an error, it must return 'false', so that the following Next call
288 289
	// has an opportunity to return the error.
	Done() bool
290 291 292 293 294
}

// REVIEW: immediate-mode AsBytes() method (as opposed to e.g. returning
// an io.Reader instance) might be problematic, esp. if we introduce
// AdvancedLayouts which support large bytes natively.
295 296 297 298 299 300 301 302 303 304
//
// Probable solution is having both immediate and iterator return methods.
// Returning a reader for bytes when you know you want a slice already
// is going to be high friction without purpose in many common uses.
//
// Unclear what SetByteStream() would look like for advanced layouts.
// One could try to encapsulate the chunking entirely within the advlay
// node impl... but would it be graceful?  Not sure.  Maybe.  Hopefully!
// Yes?  The advlay impl would still tend to use SetBytes for the raw
// data model layer nodes its composing, so overall, it shakes out nicely.