tag, in the absence of such a tag, the file represents a single document.
The indexer requires one metadata field for documents: _docno_, the value
of this field must be unique within the index, otherwise the document is
@@ -220,22 +264,268 @@ When an application generated (using the addString interface) document includes
first added by the parser, and the second added by the application. The
parser will use the last occurrence as the authoritative _docno_ record.
+### Immutable Index State
+
+Immutable index repository state enables mutable index recovery and sharing on the p2p network.
+
+The immutable index repository state is updated when a container is created, and when a document is added to a container.
+
+Immutable index repository state consists of the following DAG node tree:
+
+```mermaid
+erDiagram
+ Reposet-root ||--o{ Store-block : contains
+ Reposet-root {
+ Link-array-256 metastore-block
+ Link-array-256 infostore-block
+ }
+ Store-block ||--o{ RepoProps : contains
+ Store-block {
+ Link-array-65536 RepoProps
+ }
+ RepoProps {
+ String Type
+ String Kind
+ String Name
+ String Path
+ String CreatedAt
+ Link Params
+ Link-array-256 Corpus
+ Boolean PublishStatus
+ Integer PublishInterval
+ Map Stats
+ }
+ RepoProps ||--|| Params : links-to
+ Params {
+ Bytes Tagged-parameters-file
+ }
+ RepoProps ||--o{ Corpus : links-to
+ Corpus {
+ Link-array-65536 file-id
+ }
+```
+
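The node tree in the diagram can be mirrored as Go types. This is a minimal sketch, not the DMS3 implementation: the `Link` type (a plain string standing in for a CID), the constant names, and the `AddFile` helper are all hypothetical, and only the field names and link-array bounds come from the diagram.

```go
package main

import (
	"errors"
	"fmt"
)

// Link stands in for a content identifier (CID). Using a plain string is an
// assumption made for this sketch.
type Link string

// Link-array bounds taken from the diagram.
const (
	MaxStoreBlocks    = 256   // metastore-block / infostore-block links in Reposet-root
	MaxRepoPropsLinks = 65536 // RepoProps links in a Store-block
	MaxCorpusBlocks   = 256   // Corpus links in RepoProps
	MaxFileIDs        = 65536 // file-id links in a Corpus node
)

// ErrCorpusBlockFull is returned when a Corpus node has no free link slot.
var ErrCorpusBlockFull = errors.New("corpus block is full")

// Corpus holds links to the documents added to a container.
type Corpus struct {
	FileIDs []Link
}

// AddFile appends a document link and refuses to exceed the link-array bound.
func (c *Corpus) AddFile(id Link) error {
	if len(c.FileIDs) >= MaxFileIDs {
		return ErrCorpusBlockFull
	}
	c.FileIDs = append(c.FileIDs, id)
	return nil
}

// RepoProps mirrors the RepoProps entity of the diagram.
type RepoProps struct {
	Type            string
	Kind            string
	Name            string
	Path            string
	CreatedAt       string
	Params          Link   // link to the tagged parameters file
	Corpus          []Link // up to MaxCorpusBlocks links to Corpus nodes
	PublishStatus   bool
	PublishInterval int
	Stats           map[string]interface{}
}

func main() {
	c := &Corpus{}
	if err := c.AddFile("example-file-id"); err != nil {
		fmt.Println("add failed:", err)
		return
	}
	props := RepoProps{Type: "metastore", Kind: "blog", Name: "myblog20"}
	fmt.Println(props.Name, "holds", len(c.FileIDs), "document link(s)")
}
```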
+The initial design makes the following choices and assumptions:
+
+1) a maximum block size of 2 MiB
+
+2) a maximum corpus size of 10 GiB per index repo container
+
+3) an average document size of 1 KiB
+
+To accommodate the above assumptions, the following constraints can be computed:
+
+- max block size = 2 x 1024 x 1024 = 2097152 bytes
+- size of CID string = 32 bytes
+- CIDs per block = 2097152 / 32 = 65536
+- max corpus size = 10 x 1024 x 1024 x 1024 = 10737418240 bytes
+- max documents per container = 10737418240 / 1024 = 10485760
+- max CIDs per container = 10485760 (one CID per document)
+- max corpus blocks per container = 10485760 / 65536 = 160
+
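The sizing arithmetic can be recomputed with a few lines of Go. This sketch takes the byte counts from the list above (a 2 x 1024 x 1024 byte block, a 32-byte CID, a 10 x 1024 x 1024 x 1024 byte corpus, 1024-byte documents) and assumes that each document is referenced by exactly one CID, so the number of corpus blocks is the document count divided by the CIDs that fit in one block. The `Limits` type and function names are hypothetical.

```go
package main

import "fmt"

// Sizing inputs, in bytes, as used in the constraint list.
const (
	maxBlockSize  = 2 * 1024 * 1024
	cidSize       = 32
	maxCorpusSize = 10 * 1024 * 1024 * 1024
	avgDocSize    = 1024
)

// Limits groups the derived capacity figures.
type Limits struct {
	CIDsPerBlock int64 // links that fit in one block
	MaxDocs      int64 // documents (and therefore CIDs) per container
	CorpusBlocks int64 // blocks needed to hold all document CIDs
}

// ceilDiv divides a by b and rounds up.
func ceilDiv(a, b int64) int64 {
	return (a + b - 1) / b
}

// computeLimits derives the capacity figures from the sizing inputs,
// assuming one CID per document.
func computeLimits(blockSize, cid, corpusSize, docSize int64) Limits {
	cidsPerBlock := blockSize / cid
	maxDocs := corpusSize / docSize
	return Limits{
		CIDsPerBlock: cidsPerBlock,
		MaxDocs:      maxDocs,
		CorpusBlocks: ceilDiv(maxDocs, cidsPerBlock),
	}
}

func main() {
	l := computeLimits(maxBlockSize, cidSize, maxCorpusSize, avgDocSize)
	fmt.Printf("CIDs per block: %d\n", l.CIDsPerBlock)
	fmt.Printf("max documents:  %d\n", l.MaxDocs)
	fmt.Printf("corpus blocks:  %d\n", l.CorpusBlocks)
}
```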
+_describe repoprops and store blocks_
+
+
+### Managing Index Repositories
+
+#### Configure
+
+Index configuration property structure is shown below. Notes on current implementation constraints and limitations are discussed here. Configuration rules are likely to change as the functionality evolves.
+
+Once an index of a specific kind is created, its configuration parameters must not be modified. Otherwise the search engine may return incorrect query results or, worse, crash. Modifying the index structure renumbers terms such as metadata and fields, effectively corrupting the configuration.
+
+When recovering a corrupted index from immutable state, the documents re-added to the index will need to be re-encoded, to avoid incorrect time information. Recovery tools will be developed at a later time.
+
+The DMS3 configuration file allows configuring various index repository properties. The search engine supports its own set of configuration properties via the parameters file. The lifecycle management library imposes additional conventions when mapping the DMS3 index configuration to the search engine parameters file. The following summarizes the current mapping conventions, which will likely evolve over time:
+
+index and corpus parameter configuration
+ params[index] = cfg Indexer.Path * code overrides configured value *
+ params[corpus][annotation] = cfg Indexer.Corpus.Annotation * not used *
+ params[corpus] = cfg Indexer.Corpus
+ params[corpus][path] = cfg Indexer.Corpus.Path * code overrides configured value *
+ params[corpus][class] = cfg Indexer.Corpus.Class
+ params[corpus][metadata] = cfg Indexer.Corpus.Metadata * not used *
+
+optional parameter configuration
+ params[memory] = cfg Indexer.Memory
+ params[stemmer][name] = cfg Indexer.Stemmer
+ params[normalize] = cfg Indexer.Normalize
+ params[stopper][word] = cfg Indexer.Stopper[i]
+
+metadata and field parameter configuration is hard coded
+
+document kind-specific field parameter configuration
+ params[field][name][f] = cfg Metadata.Kind[i][f]
+
+note: the infospace interface can override some parameters at time of
+index creation (see MakeIndex).
+
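The mapping conventions listed above can be sketched in Go. This is an illustration only: the `IndexerConfig` and `CorpusConfig` types, the `buildParams` function, and the nested-map representation of the parameters file are hypothetical, and the computed index and corpus paths are passed in because the configured values are overridden by code. The unused `annotation` and `metadata` entries are left out.

```go
package main

import "fmt"

// CorpusConfig and IndexerConfig are hypothetical types that follow the
// "Indexer" section of the configuration file.
type CorpusConfig struct {
	Class string
	Path  string // configured value, overridden by the computed path
}

type IndexerConfig struct {
	Corpus    CorpusConfig
	Memory    string
	Normalize bool
	Path      string // configured value, overridden by the computed path
	Stemmer   string
	Stopper   []string
}

// buildParams maps the index configuration onto search engine parameters.
// indexPath and corpusPath are the code-computed paths that replace
// cfg.Path and cfg.Corpus.Path; fields are the kind-specific field names.
func buildParams(cfg IndexerConfig, fields []string, indexPath, corpusPath string) map[string]interface{} {
	return map[string]interface{}{
		"index": indexPath,
		"corpus": map[string]string{
			"path":  corpusPath,
			"class": cfg.Corpus.Class,
		},
		"memory":    cfg.Memory,
		"stemmer":   map[string]string{"name": cfg.Stemmer},
		"normalize": cfg.Normalize,
		"stopper":   map[string][]string{"word": cfg.Stopper},
		"field":     map[string][]string{"name": fields},
	}
}

func main() {
	cfg := IndexerConfig{
		Corpus:    CorpusConfig{Class: "html"},
		Memory:    "100M",
		Normalize: true,
		Stemmer:   "krovetz",
		Stopper:   []string{"a", "an", "the", "as"},
	}
	// The two paths below are placeholders for values computed by code.
	params := buildParams(cfg, []string{"Author", "Headline"}, "/computed/index", "/computed/corpus")
	fmt.Println(params["index"], params["memory"], params["normalize"])
}
```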
+TODO: remove from index configuration:
+ Indexer.Corpus:
+ annotations not used
+ path is overridden (computed)
+ metadata not used
+
+ Indexer.Path:
+ path is overridden (computed)
+
+ Indexer.Stopper: []
+ is overridden, or complemented by global stopwords file
+
+
+{
+ "Indexer": {
+ "Corpus": {
+ "Class": "html",
+ "Path": ""
+ },
+ "MaxDocs": "100M",
+ "Memory": "100M",
+ "Normalize": true,
+ "Path": "",
+ "Stemmer": "krovetz",
+ "Stopper": [
+ "a",
+ "an",
+ "the",
+ "as"
+ ]
+ },
+ "Metadata": {
+ "Kind": [
+ {
+ "Field": [
+ "About",
+ "Address",
+ "Affiliation",
+ "Author",
+ "Brand",
+ "Citation",
+ "Description",
+ "Email",
+ "Headline",
+ "Keywords",
+ "Language",
+ "Name",
+ "Telephone",
+ "Version"
+ ],
+ "Name": "blog"
+ }
+ ],
+ "Publisher": [
+ {
+ "Schedule": [
+ {
+ "Interval": "immediate",
+ "Status": "enabled",
+ "Name": [
+ "myblog20"
+ ]
+ },
+ {
+ "Interval": "daily",
+ "Status": "enabled",
+ "Name": [
+ "mideastern-foods"
+ ]
+ },
+ {
+ "Interval": "weekly",
+ "Status": "disabled",
+ "Name": [
+ ]
+ }
+ ]
+ }
+ ]
+ },
+ "Retriever": {
+ "MaxResultCount": 100
+ }
+}
+
+
+_Details to be documented at a later time..._
+
+#### Index
+
+_Details to be documented at a later time..._
+
+#### Query
+
+_Details to be documented at a later time..._
+
+#### Track
+
+A key-value pair is stored in the dms3 KVS data store when a new index repository is created.
+
+A key composed from the container namespace name of an index repository is used to look up repo statistics in the KVS.
+
+_Additional details to be documented at a later time..._
+
+
+#### Recover
+
+The immutable index state is used to reconstruct the mutable index state.
+
+Index container recovery will be performed on a per-container-instance basis.
+
+_Details to be documented at a later time..._
+
+#### Publish
+
+An index may be marked for publishing to share its content on the p2p network.
+
+The publish properties of an index are specified in the index configuration file.
+
+A number of mutually exclusive publishing schedules are supported. An index repository may be assigned to at most one schedule.
+
+Publishing properties are bound to the repository _name_ key of the container namespace, and affect all container instances within the named sub-space.
+
+The publish properties define:
+
+Status
+: Current publishing status. The value is _enabled_ or _disabled_. The default value is initialized to _disabled_ when the index repository is created.
+
+Interval
+: The interval duration at which index state updates are published. Valid interval values include: _immediate_, _daily_, _weekly_, _biweekly_, _monthly_, _quarterly_, _half-annual_, _annual_
+
+Name
+: The list of index repository names to be published.
+
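The publish rules described above (a status of _enabled_ or _disabled_, an interval from the listed values, and at most one schedule per repository name) can be checked with a small validator. This is a sketch under assumptions: the `Schedule` type and `validateSchedules` function are hypothetical, and the interval list is treated as exhaustive although the text only says the valid values "include" these.

```go
package main

import "fmt"

// Schedule is a hypothetical type following the publish properties:
// Status, Interval, and the list of repository names.
type Schedule struct {
	Interval string
	Status   string
	Name     []string
}

// validIntervals holds the interval values listed in the publish properties.
var validIntervals = map[string]bool{
	"immediate":   true,
	"daily":       true,
	"weekly":      true,
	"biweekly":    true,
	"monthly":     true,
	"quarterly":   true,
	"half-annual": true,
	"annual":      true,
}

// validateSchedules checks interval and status values and enforces that a
// repository name is assigned to at most one schedule.
func validateSchedules(schedules []Schedule) error {
	assigned := map[string]string{} // repository name -> interval
	for _, s := range schedules {
		if !validIntervals[s.Interval] {
			return fmt.Errorf("invalid interval %q", s.Interval)
		}
		if s.Status != "enabled" && s.Status != "disabled" {
			return fmt.Errorf("invalid status %q", s.Status)
		}
		for _, name := range s.Name {
			if prev, ok := assigned[name]; ok {
				return fmt.Errorf("repository %q is assigned to both %q and %q", name, prev, s.Interval)
			}
			assigned[name] = s.Interval
		}
	}
	return nil
}

func main() {
	schedules := []Schedule{
		{Interval: "daily", Status: "enabled", Name: []string{"mideastern-foods"}},
		{Interval: "weekly", Status: "disabled", Name: []string{}},
	}
	if err := validateSchedules(schedules); err != nil {
		fmt.Println("invalid publish configuration:", err)
		return
	}
	fmt.Println("publish configuration is valid")
}
```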
+Once an index publishing schedule is configured, you must run the daemon with the index publishing and subscription feature enabled:
+
+```
+dms3 daemon [--enable-index-pubsub]
+```
-## Network protocol stack
+## Network Protocol Stacks
DMS3 offers two classes of fault tolerant services.
-1. Centralized practical byzantine fault tolerant (PBFT) services provide protection for personal private data.
+1. Decentralized Information Blockchain protocol services to provide distribution and access services for shared public data.
- For personal information not publicly shared, DMS3 offers centralized PBFT services for maintaining immutable data integrity. The immutable state for a repository consists of the initial configuration parameters, documents added to the repository corpus, and other repository management state.
+2. Decentralized Financial Blockchain protocol services to provide smart contract based information trading services.
-2. Decentralized p2p protocol services to provide distribution and access services for shared public data.
+3. Centralized practical byzantine fault tolerant (PBFT) services provide data storage scaling, protection, access, and distribution services.
+
+4. Decentralized p2p protocol services to provide distribution and access services for shared public data.
- For shared public information, this service defrays the hosting and access costs of dedicated compute and storage resources.
The following sub-sections describe these services.
-### Practical byzantine fault tolerant (PBFT) services
+### Decentralized Information Blockchain Services
+
+_Details to be documented at a later time..._
+
+
+### Decentralized Financial Blockchain Services
+
+_Details to be documented at a later time..._
+
+
+### Practical Byzantine Fault Tolerant (PBFT) Services
DMS3 allows participants to offer centralized data protection services to protect the users' personal information.
@@ -252,6 +542,8 @@ dms3 implements two PBFS services.
DMS3 PBFS services automatically recover from a configured number of arbitrary simultaneous faults.
+_Additional details to be documented at a later time..._
+
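The text does not say how the replica set is sized for a configured fault count. As background, the standard PBFT bound requires at least 3f + 1 replicas to tolerate f arbitrary faults, with quorums of 2f + 1. The helpers below sketch that relationship only; the function names are hypothetical and nothing here is taken from the DMS3 code.

```go
package main

import "fmt"

// minReplicas returns the smallest replica count that tolerates f
// arbitrary (Byzantine) faults under the standard PBFT bound n = 3f + 1.
func minReplicas(f int) int {
	return 3*f + 1
}

// maxFaults returns the largest number of arbitrary faults that n replicas
// can tolerate: floor((n - 1) / 3).
func maxFaults(n int) int {
	if n < 1 {
		return 0
	}
	return (n - 1) / 3
}

// quorumSize returns the number of matching replies needed to make
// progress when up to f replicas may be faulty: 2f + 1.
func quorumSize(f int) int {
	return 2*f + 1
}

func main() {
	for f := 1; f <= 3; f++ {
		fmt.Printf("faults=%d replicas=%d quorum=%d\n", f, minReplicas(f), quorumSize(f))
	}
}
```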
#### Index fault recovery
@@ -268,13 +560,13 @@ There are two options to recover an index repository.
#### Publishing a repository
-Use the _index_ _publish_ command to enable other users on the p2p network to search and access the contents of a repository.
+Use the _index_ _config_ command to enable publishing of repositories to be shared with other users on the p2p network.
### Published information distribution and protection services
-High demand for published information can overburden participant's compute resources, dms3 enables optional paid p2p services to offload compute and storage onto other p2p nodes.
+High demand for published information can overburden a participant's compute resources. DMS3 enables optional paid p2p services to offload bandwidth, compute, and storage loads onto a proprietary fault tolerant centralized Data Cloud.
-p2p protocol network enables participants to contribute compute and storage resources to gain income in return to the use of their resources.
+The p2p protocol network also enables participants to contribute compute and storage resources and earn income in return for the use of their resources.
#### Information curators
diff --git a/docs/diagrams.md b/docs/diagrams.md
index 250a5e0815c5927292f1b848e0293de8deaeef63..82045cafc13e8063d39cf802187f1701c07a23b5 100644
--- a/docs/diagrams.md
+++ b/docs/diagrams.md
@@ -64,3 +64,23 @@ graph TD;
B-->D;
C-->D;
```
+
+```mermaid
+erDiagram
+ CUSTOMER ||--o{ ORDER : places
+ CUSTOMER {
+ string name
+ string custNumber
+ string sector
+ }
+ ORDER ||--|{ LINE-ITEM : contains
+ ORDER {
+ int orderNumber
+ string deliveryAddress
+ }
+ LINE-ITEM {
+ string productCode
+ int quantity
+ float pricePerUnit
+ }
+```
diff --git a/docs/img/dms3-2.jpg b/docs/img/dms3-2.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..823038807163c7c73b334315f7ddaee8ca20b4b6
Binary files /dev/null and b/docs/img/dms3-2.jpg differ
diff --git a/docs/notes.md b/docs/notes.md
index 96871d30317d28f831a2ab527e71aba8d77aef7c..c649990f066a5410347756e42220acac559c5c38 100644
--- a/docs/notes.md
+++ b/docs/notes.md
@@ -1,556 +1,25 @@
-## Implementation Notes
+## Appendix Notes
-### Preexisting Component behavior
+### Packaged Assets
-### Peer Node Builder and Command Execution Environment
+When DMS3 is initialized, a list of assets is added to the DMS3 UnixFS.
-egrep --color NewNode ../go-dms3-fs/core/builder.go
- implements NewNode, primary function used to build a new peer node object.
+The list of assets includes:
-Following programs reference NewNode to create a new peer node:
+Default stopwords
+: This file is used by the search engine indexer. It is referenced from the params file generated when creating a new index container. The params file assumes this file exists at a well-known path and file name in the dms3 repo index root (by default: ~/.dms3/index/stopwords).
-egrep --color NewNode ../go-dms3-fs/cmd/dms3fswatch/main.go
- Online: true
- watch and adds local filesystem objects to dms3fs
+Kind of Content schema
+: Metadata defining document field structure and semantics. The definitions are used to create index containers hosting documents of a certain kind.
-egrep --color NewNode ../go-dms3-fs/cmd/dms3fs/init.go
- Online: false
- adds default assets to dms3fs
- initializes dms3ns keyspace
+### Kind of Content Schema
-egrep --color NewNode ../go-dms3-fs/cmd/dms3fs/main.go
- Online: false
- implements the primary CLI binary for dms3fs
- command details specify necessary execution context (client or daemon)
- client is used when command does not use repo and can run on client
- daemon is used when running and command uses repo or details require it
- see [go-dms3-fs/cmd/dms3fs/main.go]commandShouldRunOnDaemon
+The index config file is initialized with document kind metadata from the pre-packaged asset list. The asset list kind schema is a subset sourced from a working group focused on defining schema standards: [schema.org](https://schema.org/).
-egrep --color NewNode ../go-dms3-fs/cmd/dms3fs/daemon.go
- Online: true
- Runs a network-connected DMS3FS node
+As a privacy-focused solution, the DMS3 subset excludes schema elements that are typically used for tracking and targeted advertising, or that are intended for features outside the scope of DMS3.
-CLI commands run by the primary CLI binary for dms3fs
-
-egrep --color NewNode ../go-dms3-fs/core/commands/add.go
- adds local filesystem objects to dms3fs
-
-egrep --color NewNode ../go-dms3-fs/core/commands/index/mk.go (move to new behavior section)
- make/add local index repository objects to dms3fs
-
-
-### Mutable File System (mfs)
-
-A merkle DAG tracks permanent filesystem objects. The Linked Data (LD) format allows creating relationships between DAG objects designed to support an object in a Unix filesystem that has a name and a hierarchical path. The DAG graph supports an extensible node data format allowing use of arbitrary data structures managed in the DAG.
-
-A mutable file system uses the well know key "/local/filesroot" to provide unix-like [path/name] filesystem operations as a convenience. The mfs root Cid is periodically published in the datastore to keep the mfs root Cid fresh.
-
-NewRoot references with publish function specified
-egrep --color \.NewRoot ../go-dms3-fs/fuse/dms3ns/dms3ns_unix.go
- implements fuse dms3ns filesystem. uses dms3ns publisher
-egrep --color \.NewRoot ../go-dms3-fs/core/core.go
- add command makes a new Dms3FsNode calling [core/builder.go]NewNode() builder [core/core.go]NewNode() calls setupNode(), which calls loadsFilesRoot(), which calls [../]go-mfs/system.go]mfs.NewRoot() to create or reuse a mfs root DAG node, which must be of type TDirectory or THAMTShard. NewRoot() starts a republisher for the mfs root, which republishes the mfs root cid to the datastore under key "/local/filesroot" (using short/long intervals: time.Millisecond*300, time.Second*3).
-
- Dms3FsNode node uses record.NamespacedValidator().
-
- Files or Directories added/managed using the files commands and with using builder will have the mfs root as parent. so when child list is updated, republisher will update mfs root cid in the datastore.
-
-NewRoot references without publish function specified
-egrep --color \.NewRoot ../go-dms3-fs/core/commands/add.go
-egrep --color \.NewRoot ../go-dms3-fs/core/commands/index/mk.go
-
-
-### Block, Exchange, and Bitswap Services
-
-Objects stored in the merkle DAG can be accessed by local and remote peer nodes. A number of path resolvers are used to effectively resolve a given path to a dms3fs path, where the last component of the path represents a Cid. The peer node object includes properties that enable access to the DAGservice and Blockservice interfaces. The Blockservice enables seamless retrieval of a block from the local node or a remote provider. When storing a block on the local node, the Blockservice informs the Exchange service of the existence of the Cid, which in turn informs the Bitswap service responsible for swapping blocks between peer nodes. The Bitswap service runs a pubsub protocol to manage WantLists of Cids to swap with peer nodes.
-
-#### Object Path Resolution
-
-egrep --color Resolve ./core/pathresolver.go
- Resolve first calls ResolveDMS3NS() to resolve dms3ns path to dms3fs path, then calls the provided path resolver to resolve to a DAG node. the provided path resolver is single hop resolver [go-path/resolver/resolver.go]Resolver
- r := &resolver.Resolver{
- DAG: n.DAG,
- ResolveOnce: uio.ResolveUnixfsOnce, // or resolver.ResolveOnce, aka [go-path/resolver/resolver.go]ResolveOnce
- }
-egrep --color ResolveToCid ../go-dms3-fs/core/pathresolver.go
- ResolveToCid is used by plumbing commands like Pin and
-
-core path resolver uses the [go-merkledag/merkledag.go]Get() to retrieve a DAG node from a resolved path. DAG Get() in turn uses the Dms3FsNode blockservice to fetch the DAG node block.
- b, err := n.Blocks.GetBlock(ctx, c)
-
-#### Object Access
-
-egrep --color bserv\.New ../go-dms3-fs/core/builder.go
-DAG node getter uses [dms3-fs/go-blockservice]New() to get Blocks
-the Dms3FsNode [core/builder.go]setupNode() creates the blockservice
- n.Blocks = bserv.New(n.Blockstore, n.Exchange)
- n.DAG = dag.NewDAGService(n.Blocks)
-[core/core.go]startOnlineServicesWithHost() creates the Dms3FsNode block exchange service.
- // setup exchange service
- bitswapNetwork := bsnet.NewFromDms3FsHost(n.PeerHost, n.Routing)
- n.Exchange = bitswap.New(ctx, bitswapNetwork, n.Blockstore)
-the blockservice GetBlock() seamlessly retrieves blocks from the local node or a peer node:
-
- block, err := bs.Get(c)
- if err == nil {
- return block, nil
- }
- if err == blockstore.ErrNotFound && f != nil {
- // TODO be careful checking ErrNotFound. If the underlying
- // implementation changes, this will break.
- log.Debug("Blockservice: Searching bitswap")
- blk, err := f.GetBlock(ctx, c)
- if err != nil {
- if err == blockstore.ErrNotFound {
- return nil, ErrNotFound
- }
- return nil, err
- }
- log.Event(ctx, "BlockService.BlockFetched", c)
- return blk, nil
- }
-
-go-fs-blockstore/blockstore.go - the blockstore hosts blocks in a flatfs datastore.
-go-block-format/blocks.go - defines the block format
-go-ds-flatfs/flatfs.go - implements a datastore that stores all objects in a two-level directory structure in the local file system, regardless of the hierarchy of the keys.
-go-fs-ds-help/key.go - provides conversion between Cid and datastore key
-the blockstore Get() retrieves the block from the flatfs datastore using a key converted from the Cid, and returns a formated basic block consisting of data read and the cid key.
-
-#### Locating Remote Objects
-
-So how does exchange find the block?
-egrep --color Fetcher ../go-fs-exchange-interface/interface.go
-Fetcher is an interface that includes GetBlock() and GetBlocks() functions implemented by the [go-bitswap/bitswap.go] service. the [go-bitswap/network/] exchanges DAG blocks with peer nodes using a [github.com/gxed/pubsub/] protocol. GetBlocks() adds cids of blocks into WantList. the [go-blockservice/blockservice.go]AddBlock() function announces to the exchange service that this node has a block (HasBlock()), which may trigger notification to peer node(s) that want the added cid block.
-
-
-
-## DMS3 Introduced Component behavior
-
-### Corpus Documents
-
-Corpus documents added to an index repository are also stored in the DMS3FS.
-
-Index repository and reposet state is stored in the DMS3FS UnixFS block store.
-
-The Merkle DAG is used to store repository properties and relationships in the UnixFS by managing index repository related directory and file nodes.
-
-The index datastore is used to track the Cid of a DAG reposet, and Cids of corpus documents in the DMS3FS for a specific repo. The tracking in index datastore applies a key convention that facilitates lookup of reposet and corpus documents given repository class, kind, name, and repo name.
-
-Notes on index documents:
-- index environment addFile
- - assumes metadata fields are stored within the file being added
- - returns -1, not docid. docid can subsequently be looked up via metadata fields
-- index environment addString
- - includes metadata vector as input parameter
- - returns -docid
-- the only indri required metadata is docno. all index documents have unique docno values
-- the only infospace required metadata is base-time
- - represents absolute start time of partition window
- - document abs-time fields are encoded as rel-time to partition window start time
-
-### Configuration (mutable)
-
-To Be Defined
-
-
-### Repository (immutable)
-
-Repositories provide a common framework for information organization, indexing, and query services. The framework is common to both infostore and metastore and differs only in the semantics of content in each state space.
-
-A repository set facilitates life-cycle management of growth in a specific kind of content.
-
-DMS3FS stores the following for repository properties:
-
-1. A DMS3FS UnixFS stored index properties
-
- - A reposet is tracked in the datastore using a key convention:
-
- "/local/filesroot/index/reposet/_kind_/_type_/_name_"
-
- where,
- _type_ is either the string "infostore" or the string "metastore" and
- _kind_ is a locally unique string for the kind of reposet,
- _name_ is a locally unique string for the name of reposet
-
- - a reposet direcory contains the following file entries:
-
- $ dms3fs ls QmSyVYKKQ3bH8EcNh9PUrQecwrcSHWXDBRt7yL1JsJbC5Z
- zb2rhek8ZLjWVXJdTxJhLo7d4o8oPTJZDLe1WwtcQtrDcqhQb 3128 params
- Qma1pjUPVL7Qu68bciyrqvz1doS6SHzo6qQfGRYysW4qma 129 reposetprops
- QmSbzdraUBhMfNuu6EMz4BqiPkyFgY99n9vSZJtdaVSm9Z 168 w1543348319-a1-c1-o0
-
- where,
- params - is a configuration file that informa search engine functions. This file is common for all repositories in a reposet. The contents are controlled via kind specific index configuration parameters.
- reposetprops - a JSON encoded reposet properties file
- w1543348319-a1-c1-o0 - a JSON encoded repo properties file for a repository in the reposet.
-
- Reposet properties are defined as follows:
-
- ```go
-type reposetProps struct {
- Type: string, // reposet class or type, "infostore" or "metastore"
- Kind: string, // locally unique reposet kind
- Name string, // locally unique reposet name
- CreatedAt int64, // create time
- MaxAreas uint8, // maximum number of areas
- MaxCats uint8, // maximum number of categories
- MaxDocs uint64, // maximum number of documents per repo
-}
-
-example:
- $ dms3fs cat Qma1pjUPVL7Qu68bciyrqvz1doS6SHzo6qQfGRYysW4qma
- {
- "Type":"metastore",
- "Kind":"blog",
- "Name":"myblog20",
- "CreatedAt":1543348319,
- "MaxAreas":64,
- "MaxCats":64,
- "MaxDocs":50000000
- }
-```
-
- Repo properties are defined as follows:
-
- ```go
-type repoProps struct {
- Type: string, // repo class or type, "infostore" or "metastore"
- Kind: string, // locally unique repo kind
- Name string, // locally unique repo name
- Offset: 0, // shard tag1 create time (seconds) since reposet create
- Area: 1, // shard tag2 area index
- Cat: 1, // shard tag3 category index
- "Path": string // local fs path to params file
-}
-
-example:
- $ dms3fs cat QmSbzdraUBhMfNuu6EMz4BqiPkyFgY99n9vSZJtdaVSm9Z
- {
- "Type":"metastore",
- "Kind":"blog",
- "Name":"w1543348319-a1-c1-o0",
- "Offset":0,
- "Area":1,
- "Cat":1,
- "Path":"/home/username/.dms3-fs/index/reposet/blog/myblog20/params"
- }
-
-```
-
-### Repository (mutable)
-
-Local filesystem stored repository mutable state:
-
- a. Directory Path
-
- A path is created when creating a new reposet that contains files and sub folders used by the indexer. A copy of the parameters file is placed in the local file system for the index server to read its configuration from. The path of a repository is composed of the following components:
-
- "_kind_/_type_/_name_", where
- _root_ is the index repository root
- _kind_ is a locally unique string for the kind of repository,
- _type_ is either the string "infostore" or the string "metastore", and
- _shard_ is a locally unique string for the name of repository
-
- For example, the very first "blog" kind repository created at Unix time of 1538751225 (seconds since 1970 epoch) will create the following folder structure by default:
-
- ```go
- //
- // local filesystem repository file folder hierarchy.
- //
- // index , cfg parameter Indexer.Path, must be relative path
- // -
- // reposet root
- // - /reposet
- // reposet kind root
- // - /reposet/
- // reposet root folder
- // - /reposet//
- // reponame, composed as:
- // - window: uint64, // creation time (Unix, seconds), sharding tag
- // - area: uint8, // area number, sharding tag
- // - cat: uint8, // category number, sharding tag
- // - offset: int64, // time since creation (seconds), recovery tag
- // repo specific files and sub-folders, cfg parameters are ignored for these
- // - /reposet////corpus, cfg parameter Indexer.Corpus.Path
- // - /reposet////metadata, cfg parameter Indexer.Corpus.Metadata
- // - /reposet//params, repo params file, no corresponding cfg parameter
- // params file is common for all repos in a reposet
- //
-```
-
-### index information space publisher/resolver
-
- current behavior/capability
-
- command execution process calls core.NewNode() that builds the peer node
- object with mode = {localmode, offlinemode, onlinemode}.
-
- network services are only enabled when the daemon is started.
-
- core.NewNode() calls setupNode() which uses startOnlineServises()
- dms3-fs/go-dms3-fs/core/builder.go setupNode() calls startOnlineServices()
- to initialize the routing subsystem. startOnlineServices() calls
- startOnlineServicesWithHost() to initialize the set of Host services:
- func NewNode(ctx context.Context, cfg *BuildCfg) (*Dms3FsNode, error) {
- n.RecordValidator = record.NamespacedValidator{
- "pk": record.PublicKeyValidator{},
- "dms3ns": dms3ns.Validator{KeyBook: n.Peerstore},
- }
- builder.go initializes BuildCfg.Host = DefaultHostOption = constructPeerHost
- and passes it to startOnlineServices() in core.go:
- constructPeerHost() {
- return dms3p2p.New(ctx, options...)
- dms3-p2p/go-p2p/config/config.go
- dms3-p2p/go-p2p/p2p.go
- When mode == OnlineMode(), following services are initialized:
- Online services initialized by core/core.go and builder.go
- // Online
- PeerHost p2phost.Host // the network host (server+client)
- Bootstrapper io.Closer // the periodic bootstrapper
- Routing routing.Dms3FsRouting // the routing system. recommend dms3fs-dht
- Exchange exchange.Interface // the block exchange + strategy (bitswap)
- Namesys namesys.NameSystem // the name system, resolves paths to hashes
- Ping *ping.PingService
- Reprovider *rp.Reprovider // the value reprovider system
- Dms3NsRepub *dms3nsrp.Republisher
-
- (PeerHost service)
- dms3-p2p/go-p2p/p2p/host/basic/basic_host.go - basic host service
- dms3-p2p/go-p2p/p2p/protocol/identify - IDService node ID protocol
- ID service check major/minor version to see if peer is compatible.
-
- (Bootstrapper service) - periodic check for known peers
- core/core.go
-
- (Routing service)
- routing.Dms3FsRouting // the routing system. recommend dms3fs-dht
- dms3-p2p/go-p2p-kad-dht (recommended, default)
- dms3-p2p/go-p2p-pubsub-router
- dms3-p2p/go-floodsub ()
- dms3-fs/go-fs-routing/none
- see bottom of core/core.go and startOnlineServicesWithHost()
- type RoutingOption func(context.Context, p2phost.Host, ds.Batching, record.Validator) (routing.Dms3FsRouting, error)
- var DHTOption RoutingOption = constructDHTRouting
- var DHTClientOption RoutingOption = constructClientDHTRouting
- var NilRouterOption RoutingOption = nilrouting.ConstructNilRouting
-
- the RoutingOption function is invoked by referencing one of:
- DHTOption, DHTClientOption, and NilRouterOption
- see cmd/dms3fs/daemon.go and ../go-dms3-fs/core/builder.go
-
- (Exchange service)
- dms3-fs/go-fs-exchange-interface/interface.go
- protocol used to exchange block with peer nodes
-
- (Namesys service)
- dms3ns publish use case
- package sequence
- dms3-fs/go-dms3-fs/core/commands/name/publish.go - publish command
- dms3-fs/go-dms3-fs/namesys/ - name publisher/resolve
- dms3-fs/go-dms3-fs/go-dms3ns/ - name record create/update/validate
- dms3-fs/go-datastore/ - local store
- dms3-p2p/go-p2p-routing/ - routing store
- runs local only and writes record entry to ds and routing
- // ds dms3-fs/go-datastore
- ds key: ds.NewKey("/dms3ns/" + base32.RawStdEncoding.EncodeToString([]byte(id)))
- // r routing.ValueStore, routing dms3-p2p/go-p2p-routing
- r key: "/dms3ns/"+h(pubkey)
- dms3-fs/go-dms3-fs/core/commands/name/publish.go
- err := n.Namesys.PublishWithEOL(ctx, k, ref, eol)
- // see dms3-fs/go-dms3-fs/namesys/ - dms3ns publish/resolve impl
- record, err := p.updateRecord(ctx, k, value, eol)
- // Create record
- entry, err := dms3ns.Create(k, []byte(value), seqno, eol)
- // Set the TTL
- entry.Ttl = proto.Uint64(uint64(ttl.Nanoseconds()))
- data, err := proto.Marshal(entry)
- // Put the new record.
- if err := p.ds.Put(Dms3NsDsKey(id), data); err != nil
- ds.NewKey("/dms3ns/" + base32.RawStdEncoding.EncodeToString([]byte(id)))
- return entry, nil
- return PutRecordToRouting(ctx, p.routing, k.GetPublic(), record)
- // Store dms3ns entry at "/dms3ns/"+h(pubkey)
- return r.PutValue(timectx, dms3nskey, data)
- dms3nskey = RecordKey(pid peer.ID) string
- return "/dms3ns/" + string(pid)
- // see dms3-fs/go-dms3-fs/go-dms3ns/record.go
- // r routing.ValueStore, see dms3-p2p/go-p2p-routing
-
- (Ping service)
- dms3-p2p/go-p2p/p2p/protocol/ping
-
- (Reprovider service)
- swarm exchange use case
- dms3-fs/go-dms3-fs/exchange/reprovide/providers.go
- provides set of pinned (DirectKeys and RecursiveKeys) cids into
- routing.ContentRouting - routes cids to swarm peers
-
- (Dms3NsRepub service)
- dms3-fs/go-dms3-fs/namesys/republisher
-
-
- dms3-p2p/go-p2p/p2p/protocol/index - need: lookup & query
- dms3-fs/go-dms3inf/dms3info.go - need: index info record management
- dms3-fs/go-dms3-fs/infosys/ - dms3ns name publisher/resolver implementation
-
-routing strategies in
-less ../../dms3-p2p/go-p2p-routing/routing.go
-
-routing clients
----------------
-egrep --color \.Dms3FsRouting ../go-fs-routing/offline/offline.go
-
-egrep --color \.Dms3FsRouting ../go-fs-routing/none/none_client.go
-
-egrep --color \.Dms3FsRouting ../go-dms3-fs/core/core.go
-egrep --color \.Dms3FsRouting ../go-dms3-fs/core/commands/dht.go
-
-egrep --color \.ContentRouting ../go-bitswap/network/dms3fs_impl.go
-
-egrep --color \.ContentRouting ../go-dms3-fs/exchange/reprovide/reprovide.go
-
-egrep --color \.ValueStore ../go-dms3-fs/namesys/publisher.go
-egrep --color \.ValueStore ../go-dms3-fs/namesys/routing.go
-egrep --color \.ValueStore ../go-dms3-fs/namesys/namesys.go
-
-routing providers
------------------
-type ContentRouting interface {
-	// Provide adds the given cid to the content routing system. If 'true' is
-	// passed, it also announces it, otherwise it is just kept in the local
-	// accounting of which objects are being provided.
-	Provide(context.Context, *cid.Cid, bool) error
-
-	// Search for peers who are able to provide a given key
-	FindProvidersAsync(context.Context, *cid.Cid, int) <-chan pstore.PeerInfo
-}
-
-// PeerRouting is a way to find information about certain peers.
-// This can be implemented by a simple lookup table, a tracking server,
-// or even a DHT.
-type PeerRouting interface {
-	// Find specific Peer
-	// FindPeer searches for a peer with given ID, returns a pstore.PeerInfo
-	// with relevant addresses.
-	FindPeer(context.Context, peer.ID) (pstore.PeerInfo, error)
-}
-
-// ValueStore is a basic Put/Get interface.
-type ValueStore interface {
-	// PutValue adds value corresponding to given Key.
-	PutValue(context.Context, string, []byte, ...ropts.Option) error
-
-	// GetValue searches for the value corresponding to given Key.
-	GetValue(context.Context, string, ...ropts.Option) ([]byte, error)
-}
-
-egrep --color \.Dms3FsRouting ../../dms3-p2p/go-p2p-routing-helpers/parallel.go
-egrep --color \.PeerRouting ../../dms3-p2p/go-p2p-routing-helpers/parallel.go
-egrep --color \.ContentRouting ../../dms3-p2p/go-p2p-routing-helpers/parallel.go
-egrep --color \.ValueStore ../../dms3-p2p/go-p2p-routing-helpers/parallel.go
-
-egrep --color \.ContentRouting ../../dms3-p2p/go-p2p-routing-helpers/composed.go
-egrep --color \.PeerRouting ../../dms3-p2p/go-p2p-routing-helpers/composed.go
-egrep --color \.Dms3FsRouting ../../dms3-p2p/go-p2p-routing-helpers/composed.go
-egrep --color \.ValueStore ../../dms3-p2p/go-p2p-routing-helpers/composed.go
-
-egrep --color \.ValueStore ../../dms3-p2p/go-p2p-routing-helpers/limited.go
-
-egrep --color \.Dms3FsRouting ../../dms3-p2p/go-p2p-routing-helpers/null.go
-
-egrep --color \.Dms3FsRouting ../../dms3-p2p/go-p2p-routing-helpers/tiered.go
-
-egrep --color \.ContentRouting ../../dms3-p2p/go-p2p-pubsub-router/pubsub.go
-../../dms3-p2p/go-p2p-pubsub-router/pubsub.go
-
-record validators
-egrep --color Validator ./core/builder.go
-egrep --color Validator ../go-dms3ns/record.go
-egrep --color Validator ../../dms3-p2p/go-p2p-kad-dht/opts/options.go
-egrep --color Validator ../../dms3-p2p/go-floodsub/pubsub.go
-egrep --color Validator ../../dms3-p2p/go-p2p-record/validator.go
-egrep --color Validator ../../dms3-p2p/go-p2p-record/pubkey.go
-egrep --color Validator ../../dms3-p2p/go-p2p-pubsub-router/pubsub.go
-
-for command examples of record get/put/validate, see:
-dms3fs name publish (and resolve)
-dms3fs name pubsub *
-dms3fs pubsub *
-
-
-## Index and Query Commands
-
-
-### Index Commands
-
-Example blog use case:
-
-```bash
-dms3fs index config show # show index configuration
-dms3fs index config --json Metadata '{}' # reset all index metadata
-dms3fs index config --json Metadata.Kind '[{}]' # reset kind metadata
-dms3fs index config --json Metadata.Kind '[{"Name": "blog", "Field": ["author","name"]}]' # set blog fields
-dms3fs index config --json Metadata.Kind '[{"Name": "blog", "Field": ["About", "Address", "Affiliation", "Author", "Brand", "Citation", "Description", "Email", "Headline", "Keywords", "Language", "Name", "Telephone", "Version"]}]' # set blog fields
-dms3fs index config --json Metadata.Kind # show metadata kinds
-
-dms3fs index mkidx -k=blog -n myblog # make blog kind infostore
-dms3fs index mkdoc -k=blog > b.xml # make empty blog template
- # edit the blog document: b.xml
-dms3fs index addoc b.xml # add blog to infostore
-dms3fs index rmdoc # remove doc from repository
-
-dms3fs index mkidx -k=blog -n myblog # make metastore for blog infostore
-dms3fs index mkdoc -k=blog > ns.xml # make empty blog metastore template
- # edit metastore document: ns.xml
-dms3fs index addoc ns.xml # add it to metastore infostore
-dms3fs index publish # publish path, infostore or metastore
-
-dms3fs index ls # list infostore
-dms3fs index stats # display infostore stats
-dms3fs index show # show service status
-dms3fs index start # start service
-dms3fs index stop # stop service
-dms3fs index restart # reset/restart service
-dms3fs index recover # rebuild index data repository
-```
-Notes:
-[cmd/dms3fs/daemon.go]serveHTTPApi registers http.ServeMux url handlers under "/api" and creates a goroutine to serve the url endpoints.
-
-[core/corehttp/corehttp.go] module defines most of the urls served by the daemon, and defines the Serve function called by the goroutine created in
-[cmd/dms3fs/daemon.go]serveHTTPApi.
-
-[core/corehttp/commands.go] module defines CommandsOption function that constructs a ServerOption for hooking commands into the HTTP server.
-
-All commands under [core/commands/root.go]Root and [core/commands/root.go]RootRO are hooked as daemon API URL endpoints and executed via the daemon. The CLI command Root is defined in [cmd/dms3fs/dms3fs.go] along with cmdDetailsMap, which restricts each command's execution environment and requirements.
-
-The cmdDetails entries for the index repository management commands should be updated in cmdDetailsMap to properly mark commands that cannot run on the daemon (such as index/config/edit).
-All index repository management commands are wired in core/commands/root.go.
-The commands are implemented in commands/index/*.go. The configuration command is implemented in commands/idxconfig.go because it depends on the private function unwrapOutput in package commands. Core functions are located in core/coreindex/; the core API is in core/coreapi/index.go, core/coreapi/interface/index.go, and core/coreapi/interface/options/index.go (the core API needs updates and is currently being tested in a local environment).
-
-### Query Commands
-
-```bash
-dms3fs index query -options... ... # find docs in infostore
-```
-
-### Kind of Content
-
-Documents in an infostore should be structured using common metadata fields to enable more refined matching when searching with a robust query language.
-
-Users are free to choose any preferred document metadata field structure. Standardized fields improve ease of discovery through common search query patterns.
-
-Significant collaborative community effort has been invested to create, maintain, and promote common [**_schema vocabulary_**](https://schema.org/) for structured data on the Internet.
-
-As an advocate for personal privacy, DMS3 takes a minimalist approach to limit use of metadata vocabulary to that which is relevant for the sharing endpoints. DMS3 is not an advocate for collecting and aggregating personal data for the promotion and placement of sponsored messages targeting individuals.
-
-DMS3 suggested subset vocabulary usage is discussed below. We encourage community discussions and suggestions for what may constitute appropriate common metadata for various kinds of content. Use of suggested vocabulary is strictly at the option of the information source.
-
-#### Blog Kind of Content
-
-This section defines the metadata fields for Blog content.
-
-##### [Blog](https://schema.org/Blog "Schema Org. Blog") metadata fields
+#### [Blog](https://schema.org/Blog "Schema Org. Blog") metadata fields
- About
- The subject matter of the content
@@ -569,7 +38,7 @@ This section defines the metadata fields for Blog content.
- Description property from [Thing](https://schema.org/thing "Schema Org. Thing")
- A description of the content
-##### [Person](https://schema.org/person "Schema Org. Person") metadata fields
+#### [Person](https://schema.org/person "Schema Org. Person") metadata fields
- Address
- Physical address of the person.
diff --git a/docs/why.md b/docs/why.md
index c5c6d7700f80c4de5abd3f636f0616fdbfde4d86..5b43aa05b57ef2ef88229d567aba3f62f8a9e87d 100644
--- a/docs/why.md
+++ b/docs/why.md
@@ -213,3 +213,9 @@ In "An Ugly Truth: Inside Facebook's Battle for Domination," New York Times repo
The new book explores in-depth the inner workings of the company and its top executives. The word ugly in the title comes from a memo written by one of Facebook's own vice presidents, with Frenkel and Kang’s reporting highlighting that many of the platform’s perceived flaws are deliberate design choices.
“So many of Facebook's problems are built into the way that they do business,” Frenkel says. “The very business model that they're premised on … is to keep you online.”
+
+##### [Apple says it will reject any government demands to use new child sexual abuse image detection system for surveillance](https://www.cnbc.com/2021/08/09/apple-will-reject-demands-to-use-csam-system-for-surveillance-.html)
+
+"Some cryptographers are worried about what could happen if a country such as China were to pass a law saying the system also has to include politically sensitive images. Apple CEO Tim Cook has previously said that the company follows laws in every country where it conducts business."
+
+“It’s truly disappointing that Apple got so hung up on its particular vision of privacy that it ended up betraying the fulcrum of user control: being able to trust that your device is truly yours,” technology commentator Ben Thompson wrote in a newsletter on Monday.
diff --git a/mkdocs.yml b/mkdocs.yml
index ec98454221b53bd55e033e88c114848aefb5f7fe..58de9ec7aac13b1dbc83e93240ee99fccfbdaea6 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,5 +1,6 @@
site_name: DMS3 documentation
+site_url: https://dms3.io/
nav:
- Home: index.md
@@ -11,14 +12,20 @@ nav:
- Architecture Overview: arch.md
- High-level Design: design.md
- Roadmap: roadmap.md
+ - Appendix Notes: notes.md
# - Example Math and Diagrams: diagrams.md
-# - Implementation Notes: notes.md
theme:
# name: readthedocs
name: rtd-dropdown
custom_dir: docs/
+plugins:
+  - search
+  - mermaid2:
+      arguments:
+        theme: 'dark'
+
markdown_extensions:
- footnotes
- pymdownx.arithmatex
@@ -39,11 +46,13 @@ markdown_extensions:
format: !!python/name:pymdownx.arithmatex.fence_mathjax_format
extra_css:
-# - https://unpkg.com/mermaid@7.1.2/dist/mermaid.css
+ - 'css/theme_extra.css'
+ - https://unpkg.com/mermaid@7.1.2/dist/mermaid.css
extra_javascript:
- 'js/converter.js'
- - 'js/mermaid.min.js'
- - 'js/mermaid.min.js.map'
+# - 'js/mermaid.min.js'
+ - https://unpkg.com/mermaid/dist/mermaid.min.js
+# - 'js/mermaid.min.js.map'
- 'https://cdnjs.cloudflare.com/ajax/libs/underscore.js/1.9.1/underscore-min.js'
- 'https://cdnjs.cloudflare.com/ajax/libs/underscore.js/1.9.1/underscore-min.js.map'
- 'https://cdnjs.cloudflare.com/ajax/libs/raphael/2.3.0/raphael.min.js'