linksystem.go 13.4 KB
Newer Older
tavit ohanian's avatar
tavit ohanian committed
1
package ld
Eric Myhre's avatar
Eric Myhre committed
2 3 4 5 6 7 8 9

import (
	"fmt"
	"hash"
	"io"
)

// LinkSystem is a struct that composes all the individual functions
10
// needed to load and store content addressed data using LD --
Eric Myhre's avatar
Eric Myhre committed
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
// encoding functions, hashing functions, and storage connections --
// and then offers the operations a user wants -- Store and Load -- as methods.
//
// Typically, the functions which are fields of LinkSystem are not used
// directly by users (except to set them, when creating the LinkSystem),
// and it's the higher level operations such as Store and Load that user code then calls.
//
// The most typical way to get a LinkSystem is from the linking/cid package,
// which has a factory function called DefaultLinkSystem.
// The LinkSystem returned by that function will be based on CIDs,
// and use the multicodec registry and multihash registry to select encodings and hashing mechanisms.
// The BlockWriteOpener and BlockReadOpener must still be provided by the user;
// otherwise, only the ComputeLink method will work.
//
// Some implementations of BlockWriteOpener and BlockReadOpener may be
// found in the storage package.  Applications are also free to write their own.
// Custom wrapping of BlockWriteOpener and BlockReadOpener are also common,
// and may be reasonable if one wants to build application features that are block-aware.
type LinkSystem struct {
	EncoderChooser     func(LinkPrototype) (Encoder, error)
	DecoderChooser     func(Link) (Decoder, error)
	HasherChooser      func(LinkPrototype) (hash.Hash, error)
	StorageWriteOpener BlockWriteOpener
	StorageReadOpener  BlockReadOpener
35
	TrustedStorage     bool
36
	NodeReifier        NodeReifier
Eric Myhre's avatar
Eric Myhre committed
37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69
}

// The following two types define the two directions of transform that a codec can be expected to perform:
// from Node to serial stream, and from serial stream to Node (via a NodeAssembler).
//
// You'll find a couple of implementations matching this shape in subpackages of 'codec' in this module
// (these are the handful of encoders and decoders we ship as "batteries included").
// Other encoder and decoder implementations can be found in other repositories/modules.
// It should also be easy to implement encodecs and decoders of your own!
//
// Encoder and Decoder functions can be used on their own, but are also often used via the LinkSystem construction,
// which handles all the other related operations necessary for a content-addressed storage system at once.
//
// Encoder and Decoder functions can be registered in the multicodec table in the `codec` package
// if they're providing functionality that matches the expectations for a multicodec identifier.
// This table will be used by some common EncoderChooser and DecoderChooser implementations
// (namely, the ones in LinkSystems produced by the `linking/cid` package).
type (
	// Encoder defines the shape of a function which traverses a Node tree
	// and emits its data in a serialized form into an io.Writer.
	//
	// The dual of Encoder is a Decoder, which takes a NodeAssembler
	// and fills it with deserialized data consumed from an io.Reader.
	// Typically, Decoder and Encoder functions will be found in pairs,
	// and will be expected to be able to round-trip each other's data.
	//
	// Encoder functions can be used directly.
	// Encoder functions are also often used via a LinkSystem when working with content-addressed storage.
	// LinkSystem methods will helpfully handle the entire process of traversing a Node tree,
	// encoding this data, hashing it, streaming it to the writer, and committing it -- all as one step.
	//
	// An Encoder works with Nodes.
	// If you have a native golang structure, and want to serialize it using an Encoder,
tavit ohanian's avatar
tavit ohanian committed
70
	// you'll need to figure out how to transform that golang structure into an ld.Node tree first.
Eric Myhre's avatar
Eric Myhre committed
71 72
	//
	// It may be useful to understand "multicodecs" when working with Encoders.
73
	// In LD, a system called "multicodecs" is typically used to describe encoding foramts.
Eric Myhre's avatar
Eric Myhre committed
74
	// A "multicodec indicator" is a number which describes an encoding;
75
	// the Link implementations used in LD (CIDs) store a multicodec indicator in the Link;
Eric Myhre's avatar
Eric Myhre committed
76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115
	// and in this library, a multicodec registry exists in the `codec` package,
	// and can be used to associate a multicodec indicator number with an Encoder function.
	// The default EncoderChooser in a LinkSystem will use this multicodec registry to select Encoder functions.
	// However, you can construct a LinkSystem that uses any EncoderChooser you want.
	// It is also possible to have and use Encoder functions that aren't registered as a multicodec at all...
	// we just recommend being cautious of this, because it may make your data less recognizable
	// when working with other systems that use multicodec indicators as part of their communication.
	Encoder func(Node, io.Writer) error

	// Decoder defines the shape of a function which produces a Node tree
	// by reading serialized data from an io.Reader.
	// (Decoder doesn't itself return a Node directly, but rather takes a NodeAssembler as an argument,
	// because this allows the caller more control over the Node implementation,
	// as well as some control over allocations.)
	//
	// The dual of Decoder is an Encoder, which takes a Node and
	// emits its data in a serialized form into an io.Writer.
	// Typically, Decoder and Encoder functions will be found in pairs,
	// and will be expected to be able to round-trip each other's data.
	//
	// Decoder functions can be used directly.
	// Decoder functions are also often used via a LinkSystem when working with content-addressed storage.
	// LinkSystem methods will helpfully handle the entire process of opening block readers,
	// verifying the hash of the data stream, and applying a Decoder to build Nodes -- all as one step.
	//
	// A Decoder works with Nodes.
	// If you have a native golang structure, and want to populate it with data using a Decoder,
	// you'll need to either get a NodeAssembler which proxies data into that structure directly,
	// or assemble a Node as intermediate storage and copy the data to the native structure as a separate step.
	//
	// It may be useful to understand "multicodecs" when working with Decoders.
	// See the documentation on the Encoder function interface for more discussion of multicodecs,
	// the multicodec table, and how this is typically connected to linking.
	Decoder func(NodeAssembler, io.Reader) error
)

// The following three types are the key functionality we need from a "blockstore".
//
// Some libraries might provide a "blockstore" object that has these as methods;
// it may also have more methods (like enumeration features, GC features, etc),
116
// but LD doesn't generally concern itself with those.
Eric Myhre's avatar
Eric Myhre committed
117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141
// We just need these key things, so we can "put" and "get".
//
// The functions are a tad more complicated than "put" and "get" so that they have good mechanical sympathy.
// In particular, the writing/"put" side is broken into two phases, so that the abstraction
// makes it easy to begin to write data before the hash that will identify it is fully computed.
type (
	// BlockReadOpener defines the shape of a function used to
	// open a reader for a block of data.
	//
	// In a content-addressed system, the Link parameter should be only
	// determiner of what block body is returned.
	//
	// The LinkContext may be zero, or may be used to carry extra information:
	// it may be used to carry info which hints at different storage pools;
	// it may be used to carry authentication data; etc.
	// (Any such behaviors are something that a BlockReadOpener implementation
	// will needs to document at a higher detail level than this interface specifies.
	// In this interface, we can only note that it is possible to pass such information opaquely
	// via the LinkContext or by attachments to the general-purpose Context it contains.)
	// The LinkContext should not have effect on the block body returned, however;
	// at most should only affect data availability
	// (e.g. whether any block body is returned, versus an error).
	//
	// Reads are cancellable by cancelling the LinkContext.Context.
	//
142
	// Other parts of the LD library suite (such as the traversal package, and all its functions)
Eric Myhre's avatar
Eric Myhre committed
143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171
	// will typically take a Context as a parameter or piece of config from the caller,
	// and will pass that down through the LinkContext, meaning this can be used to
	// carry information as well as cancellation control all the way through the system.
	//
	// BlockReadOpener is typically not used directly, but is instead
	// composed in a LinkSystem and used via the methods of LinkSystem.
	// LinkSystem methods will helpfully handle the entire process of opening block readers,
	// verifying the hash of the data stream, and applying a Decoder to build Nodes -- all as one step.
	//
	// BlockReadOpener implementations are not required to validate that
	// the contents which will be streamed out of the reader actually match
	// and hash in the Link parameter before returning.
	// (This is something that the LinkSystem composition will handle if you're using it.)
	//
	// Some implementations of BlockWriteOpener and BlockReadOpener may be
	// found in the storage package.  Applications are also free to write their own.
	BlockReadOpener func(LinkContext, Link) (io.Reader, error)

	// BlockWriteOpener defines the shape of a function used to open a writer
	// into which data can be streamed, and which will eventually be "commited".
	// Committing is done using the BlockWriteCommitter returned by using the BlockWriteOpener,
	// and finishes the write along with requiring stating the Link which should identify this data for future reading.
	//
	// The LinkContext may be zero, or may be used to carry extra information:
	// it may be used to carry info which hints at different storage pools;
	// it may be used to carry authentication data; etc.
	//
	// Writes are cancellable by cancelling the LinkContext.Context.
	//
172
	// Other parts of the LD library suite (such as the traversal package, and all its functions)
Eric Myhre's avatar
Eric Myhre committed
173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209
	// will typically take a Context as a parameter or piece of config from the caller,
	// and will pass that down through the LinkContext, meaning this can be used to
	// carry information as well as cancellation control all the way through the system.
	//
	// BlockWriteOpener is typically not used directly, but is instead
	// composed in a LinkSystem and used via the methods of LinkSystem.
	// LinkSystem methods will helpfully handle the entire process of traversing a Node tree,
	// encoding this data, hashing it, streaming it to the writer, and committing it -- all as one step.
	//
	// BlockWriteOpener implementations are expected to start writing their content immediately,
	// and later, the returned BlockWriteCommitter should also be able to expect that
	// the Link which it is given is a reasonable hash of the content.
	// (To give an example of how this might be efficiently implemented:
	// One might imagine that if implementing a disk storage mechanism,
	// the io.Writer returned from a BlockWriteOpener will be writing a new tempfile,
	// and when the BlockWriteCommiter is called, it will flush the writes
	// and then use a rename operation to place the tempfile in a permanent path based the Link.)
	//
	// Some implementations of BlockWriteOpener and BlockReadOpener may be
	// found in the storage package.  Applications are also free to write their own.
	BlockWriteOpener func(LinkContext) (io.Writer, BlockWriteCommitter, error)

	// BlockWriteCommitter defines the shape of a function which, together
	// with BlockWriteOpener, handles the writing and "committing" of a write
	// to a content-addressable storage system.
	//
	// BlockWriteCommitter is a function which is will be called at the end of a write process.
	// It should flush any buffers and close the io.Writer which was
	// made available earlier from the BlockWriteOpener call that also returned this BlockWriteCommitter.
	//
	// BlockWriteCommitter takes a Link parameter.
	// This Link is expected to be a reasonable hash of the content,
	// so that the BlockWriteCommitter can use this to commit the data to storage
	// in a content-addressable fashion.
	// See the documentation of BlockWriteOpener for more description of this
	// and an example of how this is likely to be reduced to practice.
	BlockWriteCommitter func(Link) error
210 211 212 213 214 215 216 217 218 219 220 221 222 223 224

	// NodeReifier defines the shape of a function that given a node with no schema
	// or a basic schema, constructs Advanced Data Layout node
	//
	// The LinkSystem itself is passed to the NodeReifier along with a link context
	// because Node interface methods on an ADL may actually traverse links to other
	// pieces of context addressed data that need to be loaded with the Link system
	//
	// A NodeReifier return one of three things:
	// - original node, no error = no reification occurred, just use original node
	// - reified node, no error = the simple node was converted to an ADL
	// - nil, error = the simple node should have been converted to an ADL but something
	// went wrong when we tried to do so
	//
	NodeReifier func(LinkContext, Node, *LinkSystem) (Node, error)
Eric Myhre's avatar
Eric Myhre committed
225 226 227 228 229 230 231 232 233 234 235 236
)

// ErrLinkingSetup is returned by methods on LinkSystem when some part of the system is not set up correctly,
// or when one of the components refuses to handle a Link or LinkPrototype given.
// (It is not yielded for errors from the storage nor codec systems once they've started; those errors rise without interference.)
type ErrLinkingSetup struct {
	Detail string // Perhaps an enum here as well, which states which internal function was to blame?
	Cause  error
}

func (e ErrLinkingSetup) Error() string { return fmt.Sprintf("%s: %v", e.Detail, e.Cause) }
func (e ErrLinkingSetup) Unwrap() error { return e.Cause }