node.go 12.5 KB
Eric Myhre committed
package ipld

// Node represents a value in IPLD.  Any point in a tree of data is a node:
// scalar values (like int, string, etc) are nodes, and
// so are recursive values (like map and list).
//
// Nodes and kinds are described in the IPLD specs at
// https://github.com/ipld/specs/blob/master/data-model-layer/data-model.md .
//
// Methods on the Node interface cover the superset of all possible methods for
// all possible kinds -- but some methods only make sense for particular kinds,
// and thus will only make sense to call on values of the appropriate kind.
// (For example, 'Length' on an int doesn't make sense,
// and 'AsInt' on a map certainly doesn't work either!)
// Use the ReprKind method to find out the kind of value before
// calling kind-specific methods.
// Individual method documentation states which kinds the method is valid for.
// (If you're familiar with the stdlib reflect package, you'll find
// the design of the Node interface very comparable to 'reflect.Value'.)
//
// The Node interface is read-only.  All of the methods on the interface are
// for examining values, and implementations should be immutable.
// The companion interface, NodeBuilder, provides the matching writable
// methods, and should be used to create a (thence immutable) Node.
//
// Keeping Node immutable and separating mutation into NodeBuilder makes
// it possible to perform caching (or rather, memoization, since there's no
// such thing as cache invalidation for immutable systems) of computed
// properties of Node; to use copy-on-write algorithms for memory efficiency;
// and to generally build pleasant APIs.
// Many library functions will rely on the immutability of Node (e.g.,
// assuming that pointer-equal nodes do not change in value over time),
// so any user-defined Node implementations should be careful to uphold
// the immutability contract.
//
// There are many different concrete types which implement Node.
// The primary purpose of various node implementations is to organize
// memory in the program in different ways -- some in-memory layouts may
// be more optimal for some programs than others, and changing the Node
// (and NodeBuilder) implementations lets the programmer choose.
//
// For concrete implementations of Node, check out the "./impl/" folder,
// and the packages within it.
// "impl/free" should probably be your first stop; the Node and NodeBuilder
// implementations in that package work for any data.
// Other packages are optimized for specific use-cases.
// Codegen tools can also be used to produce concrete implementations of Node;
// these may be specific to certain data, but still conform to the Node
// interface for interoperability and to support higher-level functions.
//
// Nodes may also be *typed* -- see the 'schema' and 'impl/typed' packages.
// Typed nodes have additional constraints and behaviors (and have a
// `.Type().Kind()` in addition to their `.ReprKind()`!), but still behave
// as a regular Node in all the basic ways.
type Node interface {
	// ReprKind returns a value from the ReprKind enum describing what the
	// essential serializable kind of this node is (map, list, int, etc).
	// Most other handling of a node requires first switching upon the kind.
	ReprKind() ReprKind

	// LookupString looks up a child object in this node and returns it.
	// The returned Node may be any of the ReprKind:
	// a primitive (string, int, etc), a map, a list, or a link.
	//
	// If the Kind of this Node is not ReprKind_Map, a nil node and an error
	// will be returned.
	//
	// If the key does not exist, a nil node and an error will be returned.
	LookupString(key string) (Node, error)

	// Lookup is the equivalent of LookupString, but takes a reified Node
	// as a parameter instead of a plain string.
	// This mechanism is useful if working with typed maps (if the key types
	// have constraints, and you already have a reified `schema.TypedNode` value,
	// using that value can save parsing and validation costs);
	// and may simply be convenient if you already have a Node value in hand.
	//
	// (When writing generic functions over Node, a good rule of thumb is:
	// when handling a map, check for `schema.TypedNode`, and in this case prefer
	// the Lookup(Node) method; otherwise, favor LookupString; typically
	// implementations will have their fastest paths thusly.)
	Lookup(key Node) (Node, error)

	// LookupIndex is the equivalent of LookupString but for indexing into a list.
	// As with LookupString, the returned Node may be any of the ReprKind:
	// a primitive (string, int, etc), a map, a list, or a link.
	//
	// If the Kind of this Node is not ReprKind_List, a nil node and an error
	// will be returned.
	//
	// If idx is out of range, a nil node and an error will be returned.
	LookupIndex(idx int) (Node, error)

	// LookupSegment will act as either LookupString or LookupIndex,
	// whichever is contextually appropriate.
	//
	// Using LookupSegment may imply an "atoi" conversion if used on a list node,
	// or an "itoa" conversion if used on a map node.  If an "atoi" conversion
	// takes place, it may error, and this method may return that error.
	LookupSegment(seg PathSegment) (Node, error)

	// Note that when using codegenerated types, there may be a fifth variant
	// of lookup method on maps: `Get($GeneratedTypeKey) $GeneratedTypeValue`!

	// MapIterator returns an iterator which yields key-value pairs
	// traversing the node.
	// If the node kind is anything other than a map, the iterator will
	// yield error values.
	//
	// The iterator will yield every entry in the map; that is, it
	// can be expected that itr.Next will be called node.Length times
	// before itr.Done becomes true.
	MapIterator() MapIterator

	// ListIterator returns an iterator which yields index-value pairs
	// traversing the node.
	// If the node kind is anything other than a list, the iterator will
	// yield error values.
	//
	// The iterator will yield every entry in the list; that is, it
	// can be expected that itr.Next will be called node.Length times
	// before itr.Done becomes true.
	ListIterator() ListIterator

	// Length returns the length of a list, or the number of entries in a map,
	// or -1 if the node is of neither list nor map kind.
	Length() int

	// Undefined nodes are returned when traversing a struct field that is
	// defined by a schema but unset in the data.  (Undefined nodes are not
	// possible otherwise; you'll only see them from `schema.TypedNode`.)
	// The undefined flag is necessary so iterating over structs can
	// unambiguously make the distinction between values that are
	// present-and-null versus values that are absent.
	IsUndefined() bool

	IsNull() bool
	AsBool() (bool, error)
	AsInt() (int, error)
	AsFloat() (float64, error)
	AsString() (string, error)
	AsBytes() ([]byte, error)
	AsLink() (Link, error)

	// NodeBuilder returns a NodeBuilder which can be used to build
	// new nodes of the same implementation type as this one.
	//
	// For map and list nodes, the NodeBuilder's append-oriented methods
	// will work using this node's values as a base.
	// If this is a typed node, the NodeBuilder will carry the same
	// typesystem constraints as this Node.
	//
	// (This feature is used by the traversal package, especially in
	// e.g. traversal.Transform, for doing tree updates while keeping the
	// existing implementation preferences and doing as many operations
	// in copy-on-write fashions as possible.)
	//
	// ---
	//
	// More specifically, the contract of a NodeBuilder returned by this method
	// is that it should be able to "replace" this node with a new one of
	// similar properties.
	// E.g., for a string, the builder must be able to build a new string.
	// For a map, the builder must be able to build a new map.
	// For a *struct* (when using typed nodes), the builder must be able to
	// build new structs of the same type.
	// Note that the promise doesn't extend further: there's no requirement
	// that the builder be able to build maps if this node's kind is "string"
	// (you can see why this lack-of-contract is important when considering
	// typed nodes: if this node has a struct type, then should the builder
	// be able to build other structs of different types?  Of course not;
	// there'd be no way to define which other types to build!).
	// For nulls, this means the builder doesn't have to do much at all!
	//
	// (Some Nodes may return a NodeBuilder that can be used for much more
	// than replacing their own kind: for example, Node implementations from
	// the ipldfree package tend to return a NodeBuilder that can build any
	// other ipldfree.Node (e.g. even the builder obtained from a string node
	// will be able to build maps).  This is not required by the contract;
	// such packages only do so out of internal implementation convenience.)
	//
	// This "able to replace" behavior also has a specific application regarding
	// nodes implementing Advanced Data Layouts: it means that the NodeBuilder
	// returned by this method must produce a new Node using that same ADL.
	// For example, if a Node is a map implemented by some sort of HAMT, its
	// NodeBuilder must also produce a new HAMT.
	NodeBuilder() NodeBuilder
}
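
// Example (not part of this package): the advice to check ReprKind before
// calling kind-specific methods can be sketched as below. This is a minimal,
// self-contained illustration under stated assumptions: Kind, miniNode,
// intNode, stringNode, listNode and describe are hypothetical stand-ins
// invented for the example, not the real ipld types. Only the method names
// and the "Length returns -1 for non-recursive kinds" behavior mirror Node.

```go
package main

import (
	"errors"
	"fmt"
)

// Kind is a hypothetical stand-in for the ReprKind enum.
type Kind int

const (
	KindInt Kind = iota
	KindString
	KindList
)

// miniNode is a hypothetical, trimmed-down subset of the Node interface.
type miniNode interface {
	ReprKind() Kind
	Length() int
	AsInt() (int, error)
	AsString() (string, error)
}

var errWrongKind = errors.New("method not valid for this kind")

type intNode int

func (n intNode) ReprKind() Kind            { return KindInt }
func (n intNode) Length() int               { return -1 }
func (n intNode) AsInt() (int, error)       { return int(n), nil }
func (n intNode) AsString() (string, error) { return "", errWrongKind }

type stringNode string

func (n stringNode) ReprKind() Kind            { return KindString }
func (n stringNode) Length() int               { return -1 }
func (n stringNode) AsInt() (int, error)       { return 0, errWrongKind }
func (n stringNode) AsString() (string, error) { return string(n), nil }

type listNode []miniNode

func (n listNode) ReprKind() Kind            { return KindList }
func (n listNode) Length() int               { return len(n) }
func (n listNode) AsInt() (int, error)       { return 0, errWrongKind }
func (n listNode) AsString() (string, error) { return "", errWrongKind }

// describe switches on the kind first, then calls only the methods
// that are valid for that kind.
func describe(n miniNode) string {
	switch n.ReprKind() {
	case KindInt:
		v, err := n.AsInt()
		if err != nil {
			return "error: " + err.Error()
		}
		return fmt.Sprintf("int %d", v)
	case KindString:
		v, err := n.AsString()
		if err != nil {
			return "error: " + err.Error()
		}
		return fmt.Sprintf("string %q", v)
	case KindList:
		return fmt.Sprintf("list of %d", n.Length())
	default:
		return "unknown kind"
	}
}

func main() {
	fmt.Println(describe(intNode(7)))
	fmt.Println(describe(stringNode("hi")))
	fmt.Println(describe(listNode{intNode(1), intNode(2)}))
}
```

// Calling a method that does not match the kind (such as AsString on
// intNode above) yields an error rather than a panic in this sketch;
// whether a real implementation behaves the same way is up to that
// implementation.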

// MapIterator is an interface for traversing map nodes.
// Sequential calls to Next() will yield key-value pairs;
// Done() describes whether iteration should continue.
//
// Iteration order is defined to be stable: two separate MapIterators
// created to iterate the same Node will yield the same key-value pairs
// in the same order.
// The order itself may be defined by the Node implementation: some
// Nodes may retain insertion order, and some may return iterators which
// always yield data in sorted order, for example.
type MapIterator interface {
	// Next returns the next key-value pair.
	//
	// An error value can also be returned at any step: in the case of advanced
	// data structures with incremental loading, it's possible to encounter
	// cancellation or I/O errors at any point in iteration.
	// If an error is returned, the key and value may be nil, so check the
	// error before using either of them.
	Next() (key Node, value Node, err error)

	// Done returns false as long as there's at least one more entry to iterate.
	// When Done returns true, iteration can stop.
	//
	// Note when implementing iterators for advanced data layouts (e.g. more than
	// one chunk of backing data, which is loaded incrementally): if your
	// implementation does any I/O during the Done method, and it encounters
	// an error, it must return 'false', so that the following Next call
	// has an opportunity to return the error.
	Done() bool
}
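
// Example (not part of this package): the Done/Next contract described
// above can be sketched as below. This is a minimal illustration under
// stated assumptions: mapIterator, entry, sliceIterator, errSimulatedIO and
// collect are hypothetical names invented for the example, and keys and
// values are plain strings rather than Nodes to keep it self-contained.
// The failAt field simulates an I/O error partway through iteration: Done
// keeps returning false at that point, so the following Next call gets the
// opportunity to report the error.

```go
package main

import (
	"errors"
	"fmt"
)

// mapIterator is a hypothetical simplification of MapIterator that
// yields string keys and values instead of Nodes.
type mapIterator interface {
	Next() (key string, value string, err error)
	Done() bool
}

type entry struct{ key, value string }

var errSimulatedIO = errors.New("simulated I/O error")

// sliceIterator walks a fixed slice of entries in order, so two
// iterators over the same slice yield the same pairs in the same order.
type sliceIterator struct {
	entries []entry
	pos     int
	failAt  int // index at which Next fails; -1 means never
}

// Done reports true only once every entry has been yielded. It stays
// false at the failing index so that Next can surface the error.
func (it *sliceIterator) Done() bool {
	return it.pos >= len(it.entries)
}

func (it *sliceIterator) Next() (string, string, error) {
	if it.Done() {
		return "", "", errors.New("iterator exhausted")
	}
	if it.pos == it.failAt {
		return "", "", errSimulatedIO
	}
	e := it.entries[it.pos]
	it.pos++
	return e.key, e.value, nil
}

// collect drains an iterator, stopping at the first error.
func collect(it mapIterator) ([]string, error) {
	var out []string
	for !it.Done() {
		k, v, err := it.Next()
		if err != nil {
			return out, err
		}
		out = append(out, k+"="+v)
	}
	return out, nil
}

func main() {
	data := []entry{{"a", "1"}, {"b", "2"}}

	pairs, err := collect(&sliceIterator{entries: data, failAt: -1})
	fmt.Println(pairs, err)

	pairs, err = collect(&sliceIterator{entries: data, failAt: 1})
	fmt.Println(pairs, err)
}
```

// The caller-side loop (check Done, call Next, check the error before
// touching the key or value) is the part intended to carry over to real
// MapIterator implementations; the slice-backed iterator is only scaffolding.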

// ListIterator is an interface for traversing list nodes.
// Sequential calls to Next() will yield index-value pairs;
// Done() describes whether iteration should continue.
//
// A loop which iterates from 0 to Node.Length is a valid
// alternative to using a ListIterator.
type ListIterator interface {
	// Next returns the next index and value.
	//
	// An error value can also be returned at any step: in the case of advanced
	// data structures with incremental loading, it's possible to encounter
	// cancellation or I/O errors at any point in iteration.
	// If an error is returned, the value may be nil, so check the error
	// before using it.
	Next() (idx int, value Node, err error)

	// Done returns false as long as there's at least one more entry to iterate.
	// When Done returns true, iteration can stop.
	//
	// Note when implementing iterators for advanced data layouts (e.g. more than
	// one chunk of backing data, which is loaded incrementally): if your
	// implementation does any I/O during the Done method, and it encounters
	// an error, it must return 'false', so that the following Next call
	// has an opportunity to return the error.
	Done() bool
}
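
// Example (not part of this package): the claim that an index loop is a
// valid alternative to a ListIterator can be sketched as below. This is a
// minimal illustration under stated assumptions: list, listIterator,
// viaIterator and viaIndex are hypothetical names invented for the example,
// and values are plain strings rather than Nodes. Only the method names
// Length, LookupIndex, ListIterator, Next and Done mirror this file.

```go
package main

import (
	"errors"
	"fmt"
)

// list is a hypothetical stand-in for a list-kind Node.
type list []string

func (l list) Length() int { return len(l) }

// LookupIndex returns an error when idx is out of range.
func (l list) LookupIndex(idx int) (string, error) {
	if idx < 0 || idx >= len(l) {
		return "", fmt.Errorf("index %d out of range", idx)
	}
	return l[idx], nil
}

func (l list) ListIterator() *listIterator { return &listIterator{l: l} }

type listIterator struct {
	l   list
	pos int
}

func (it *listIterator) Done() bool { return it.pos >= len(it.l) }

func (it *listIterator) Next() (int, string, error) {
	if it.Done() {
		return -1, "", errors.New("iterator exhausted")
	}
	idx := it.pos
	it.pos++
	return idx, it.l[idx], nil
}

// viaIterator walks the list with the iterator.
func viaIterator(l list) ([]string, error) {
	var out []string
	for itr := l.ListIterator(); !itr.Done(); {
		_, v, err := itr.Next()
		if err != nil {
			return out, err
		}
		out = append(out, v)
	}
	return out, nil
}

// viaIndex walks the list with a plain loop from 0 up to Length.
func viaIndex(l list) ([]string, error) {
	var out []string
	for i := 0; i < l.Length(); i++ {
		v, err := l.LookupIndex(i)
		if err != nil {
			return out, err
		}
		out = append(out, v)
	}
	return out, nil
}

func main() {
	l := list{"x", "y", "z"}
	a, _ := viaIterator(l)
	b, _ := viaIndex(l)
	fmt.Println(a, b)
}
```

// Both helpers visit the same values in the same order for this
// slice-backed list. For implementations with incremental loading, the
// iterator form may still be preferable, since a single iterator can carry
// loading state between calls.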

// REVIEW: immediate-mode AsBytes() method (as opposed to e.g. returning
// an io.Reader instance) might be problematic, esp. if we introduce
// AdvancedLayouts which support large bytes natively.
//
// Probable solution is having both immediate and iterator return methods.
// Returning a reader for bytes when you know you want a slice already
// is going to be high friction without purpose in many common uses.
//
// Unclear what SetByteStream() would look like for advanced layouts.
// One could try to encapsulate the chunking entirely within the advlay
// node impl... but would it be graceful?  Not sure.  Maybe.  Hopefully!
// Yes?  The advlay impl would still tend to use SetBytes for the raw
// data model layer nodes it's composing, so overall, it shakes out nicely.