Commit b0f337df authored by dirkmc, committed by GitHub

Add separate how bitswap works doc (#294)

* docs: add separate how bitswap works doc

* feat: update architecture diagram and add implementation description
parent 53af318c
......@@ -45,6 +45,8 @@ wants those blocks.
`go-bitswap` provides an implementation of the Bitswap protocol in go.
[Learn more about how Bitswap works](./docs/how-bitswap-works.md)
## Install
`go-bitswap` requires Go >= 1.11 and can be installed using Go modules
......@@ -75,8 +77,7 @@ exchange := bitswap.New(ctx, network, bstore)
Parameter Notes:
1. `ctx` is just the parent context for all of Bitswap
2. `network` is a network abstraction provided to Bitswap on top
of libp2p & content routing.
2. `network` is a network abstraction provided to Bitswap on top of libp2p & content routing.
3. `bstore` is an IPFS blockstore
### Get A Block Synchronously
......@@ -107,11 +108,11 @@ blockChannel, err := exchange.GetBlocks(ctx, cids)
Parameter Notes:
1. `ctx` is the context for this request, which can be cancelled to cancel the request
2. `cids` is an slice of content IDs for the blocks you're requesting
2. `cids` is a slice of content IDs for the blocks you're requesting
### Get Related Blocks Faster With Sessions
In IPFS, content blocks are often connected to each other through a MerkleDAG. If you know ahead of time that block requests are related, Bitswap can make several optimizations internally in how it requests those blocks in order to get them faster. Bitswap provides a mechanism called a Bitswap session to manage a series of block requests as part of a single higher level operation. You should initialize a bitswap session any time you intend to make a series of block requests that are related -- and whose responses are likely to come from the same peers.
In IPFS, content blocks are often connected to each other through a MerkleDAG. If you know ahead of time that block requests are related, Bitswap can make several optimizations internally in how it requests those blocks in order to get them faster. Bitswap provides a mechanism called a Bitswap Session to manage a series of block requests as part of a single higher level operation. You should initialize a Bitswap Session any time you intend to make a series of block requests that are related -- and whose responses are likely to come from the same peers.
```golang
var ctx context.Context
......@@ -125,7 +126,7 @@ var relatedCids []cids.cid
relatedBlocksChannel, err := session.GetBlocks(ctx, relatedCids)
```
Note that new session returns an interface with a GetBlock and GetBlocks method that have the same signature as the overall Bitswap exchange.
Note that `NewSession` returns an interface with `GetBlock` and `GetBlocks` methods that have the same signature as the overall Bitswap exchange.
### Tell bitswap a new block was added to the local datastore
......@@ -136,53 +137,6 @@ var exchange bitswap.Bitswap
err := exchange.HasBlock(blk)
```
## Implementation
The following diagram outlines the major tasks Bitswap handles, and their consituent components:
![Bitswap Components](./docs/go-bitswap.png)
### Sending Blocks
Internally, when a message with a wantlist is received, it is sent to the
decision engine to be considered. The decision engine checks the CID for
each block in the wantlist against local storage and creates a task for
each block it finds in the peer request queue. The peer request queue is
a priority queue that sorts available tasks by some metric. Currently,
that metric is very simple and aims to fairly address the tasks of each peer.
More advanced decision logic will be implemented in the future. Task workers
pull tasks to be done off of the queue, retrieve the block to be sent, and
send it off. The number of task workers is limited by a constant factor.
### Requesting Blocks
The want manager handles client requests for new blocks. The 'WantBlocks' method
is invoked for each block (or set of blocks) requested. The want manager ensures
that connected peers are notified of the new block that we want by sending the
new entries to a message queue for each peer. The message queue will loop while
there is work available and:
1. Ensure it has a connection to its peer
2. grab the message to be sent
3. Send the message
If new messages are added while the loop is in steps 1 or 3, the messages are
combined into one to avoid having to keep an actual queue and send multiple
messages. The same process occurs when the client receives a block and sends a
cancel message for it.
### Sessions
Sessions track related requests for blocks, and attempt to optimize transfer speed and reduce the number of duplicate blocks sent across the network. The basic optimization of sessions is to limit asks for blocks to the peers most likely to have that block and most likely to respond quickly. This is accomplished by tracking who responds to each block request, and how quickly they respond, and then optimizing future requests with that information. Sessions try to distribute requests amongst peers such that there is some duplication of data in the responses from different peers, for redundancy, but not too much.
### Finding Providers
When bitswap can't find a connected peer who already has the block it wants, it falls back to querying a content routing system (a DHT in IPFS's case) to try to locate a peer with the block.
Bitswap routes these requests through the ProviderQueryManager system, which rate-limits these requests and also deduplicates in-process requests.
### Providing
As a bitswap client receives blocks, by default it announces them on the provided content routing system (again, a DHT in most cases). This behaviour can be disabled by passing `bitswap.ProvideEnabled(false)` as a parameter when initializing Bitswap. IPFS currently has its own experimental provider system ([go-ipfs-provider](https://github.com/ipfs/go-ipfs-provider)) which will eventually replace Bitswap's system entirely.
## Contribute
PRs are welcome!
......
docs/go-bitswap.png (binary image replaced: 46.5 KB before, 82.9 KB after)
......@@ -3,15 +3,17 @@
node "Top Level Interface" {
[Bitswap]
}
node "Sending Blocks" {
node "Sending Blocks" {
[Bitswap] --* [Engine]
[Engine] -left-* [Ledger]
[Engine] -right-* [PeerTaskQueue]
[Engine] --> [TaskWorker (workers.go)]
}
[Bitswap] --* "Sending Blocks"
node "Requesting Blocks" {
[Bitswap] --* [WantManager]
[WantManager] --> [BlockPresenceManager]
[WantManager] --> [PeerManager]
[PeerManager] --* [MessageQueue]
}
......@@ -27,13 +29,16 @@ node "Finding Providers" {
node "Sessions (smart requests)" {
[Bitswap] --* [SessionManager]
[SessionManager] --> [SessionInterestManager]
[SessionManager] --o [Session]
[SessionManager] --o [SessionPeerManager]
[SessionManager] --o [SessionRequestSplitter]
[Session] --* [sessionWantSender]
[Session] --* [SessionPeerManager]
[Session] --* [SessionRequestSplitter]
[Session] --> [WantManager]
[SessionPeerManager] --> [ProvideQueryManager]
[Session] --> [ProvideQueryManager]
[Session] --* [sessionWants]
[Session] --> [SessionInterestManager]
[sessionWantSender] --> [BlockPresenceManager]
[sessionWantSender] --> [PeerManager]
}
node "Network" {
......
How Bitswap Works
=================
When a client requests blocks, Bitswap sends the CID of those blocks to its peers as "wants". When Bitswap receives a "want" from a peer, it responds with the corresponding block.
### Requesting Blocks
#### Sessions
Bitswap Sessions allow the client to make related requests to the same group of peers. For example, requests to fetch all the blocks in a file would typically be made with a single session.
#### Discovery
To discover which peers have a block, Bitswap broadcasts a `want-have` message to all peers it is connected to asking if they have the block.
Any peers that have the block respond with a `HAVE` message. They are added to the Session.
If no connected peers have the block, Bitswap queries the DHT to find peers that have the block.
#### Wants
When the client requests a block, Bitswap sends a `want-have` message with the block CID to all peers in the Session to ask who has the block.
Bitswap simultaneously sends a `want-block` message to one of the peers in the Session to request the block. If the peer does not have the block, it responds with a `DONT_HAVE` message. In that case Bitswap selects another peer and sends the `want-block` to that peer.
If no peers have the block, Bitswap broadcasts a `want-have` to all connected peers, and queries the DHT to find peers that have the block.
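The `want-block` / `DONT_HAVE` fallback described above can be sketched as a loop over ranked peers. This is a simplified, synchronous illustration only: real Bitswap messaging is asynchronous, and `askFunc` and `requestBlock` are hypothetical names that do not come from go-bitswap.

```go
package main

import "fmt"

// askFunc stands in for sending `want-block` to a peer and waiting for the
// reply. It returns the block data, or ok=false for a DONT_HAVE response.
// The name and signature are hypothetical.
type askFunc func(peer string) (data []byte, ok bool)

// requestBlock asks the ranked peers one at a time and stops at the first
// peer that returns the block. A false result means every peer answered
// DONT_HAVE, so the caller should broadcast want-have and query the DHT.
func requestBlock(ranked []string, ask askFunc) ([]byte, string, bool) {
	for _, peer := range ranked {
		if data, ok := ask(peer); ok {
			return data, peer, true
		}
	}
	return nil, "", false
}

func main() {
	// Only peerC holds the block in this toy setup.
	ask := func(peer string) ([]byte, bool) {
		if peer == "peerC" {
			return []byte("block bytes"), true
		}
		return nil, false
	}
	data, from, found := requestBlock([]string{"peerA", "peerB", "peerC"}, ask)
	fmt.Println(string(data), from, found)
}
```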
#### Peer Selection
Bitswap uses a probabilistic algorithm to select which peer to send `want-block` to, favouring peers that
- sent `HAVE` for the block
- were discovered as providers of the block in the DHT
- were first to send blocks to previous session requests
The selection algorithm includes some randomness so as to allow peers that are discovered later, but are more responsive, to rise in the ranking.
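One way to picture the selection is as a weighted random draw over the three signals listed above. The sketch below is an assumption-laden illustration, not the algorithm used by go-bitswap: the `peerScore` type, the `weight` function and its constants are all made up for the example.

```go
package main

import (
	"fmt"
	"math/rand"
)

// peerScore is a hypothetical per-peer record; the field names are
// illustrative and not taken from go-bitswap.
type peerScore struct {
	ID          string
	SentHave    bool // peer sent HAVE for the block
	IsProvider  bool // peer was found as a provider in the DHT
	FirstBlocks int  // times the peer was first to deliver a block
}

// weight turns the three signals into a positive selection weight.
// The constants are arbitrary assumptions chosen for the example.
func weight(p peerScore) float64 {
	w := 1.0 // every peer keeps a non-zero chance of being picked
	if p.SentHave {
		w += 4
	}
	if p.IsProvider {
		w += 2
	}
	return w + float64(p.FirstBlocks)
}

// selectPeer picks a peer with probability proportional to its weight.
// draw must be a uniform random number in [0, 1).
func selectPeer(draw float64, peers []peerScore) (string, bool) {
	if len(peers) == 0 {
		return "", false
	}
	total := 0.0
	for _, p := range peers {
		total += weight(p)
	}
	r := draw * total
	for _, p := range peers {
		r -= weight(p)
		if r < 0 {
			return p.ID, true
		}
	}
	// Guard against floating-point rounding at the upper edge.
	return peers[len(peers)-1].ID, true
}

func main() {
	peers := []peerScore{
		{ID: "peerA", SentHave: true, FirstBlocks: 3},
		{ID: "peerB", IsProvider: true},
		{ID: "peerC"},
	}
	id, _ := selectPeer(rand.Float64(), peers)
	fmt.Println("send want-block to", id)
}
```

Because every peer keeps a base weight, a newly discovered peer can still be chosen occasionally and then earn a higher weight by responding first.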
#### Periodic Search Widening
Periodically the Bitswap Session selects a random CID from the list of "pending wants" (wants that have been sent but for which no block has been received). Bitswap broadcasts a `want-have` to all connected peers and queries the DHT for the CID.
### Serving Blocks
#### Processing Requests
When Bitswap receives a `want-have` it checks if the block is in the local blockstore.
If the block is in the local blockstore, Bitswap responds with `HAVE`. If the block is small, Bitswap sends the block itself instead of `HAVE`.
If the block is not in the local blockstore, Bitswap checks the `send-dont-have` flag on the request. If `send-dont-have` is true, Bitswap sends `DONT_HAVE`. Otherwise it does not respond.
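The rules above for answering a `want-have` reduce to a small decision function. The following is a minimal sketch: the text does not say what counts as a "small" block, so `maxInlineBlockSize` is an assumed placeholder, and the function and type names are hypothetical.

```go
package main

import "fmt"

// response names the possible replies to a want-have.
type response string

const (
	respNone     response = "none"
	respHave     response = "HAVE"
	respBlock    response = "block"
	respDontHave response = "DONT_HAVE"
)

// maxInlineBlockSize is an assumed cut-off for "small" blocks; the real
// threshold is not given in this document.
const maxInlineBlockSize = 1024

// respondToWantHave decides how to answer a want-have request.
func respondToWantHave(inStore bool, blockSize int, sendDontHave bool) response {
	switch {
	case inStore && blockSize <= maxInlineBlockSize:
		return respBlock // small block: send it instead of HAVE
	case inStore:
		return respHave
	case sendDontHave:
		return respDontHave
	default:
		return respNone // block missing and the peer did not ask for DONT_HAVE
	}
}

func main() {
	fmt.Println(respondToWantHave(true, 200, false))
	fmt.Println(respondToWantHave(true, 256*1024, false))
	fmt.Println(respondToWantHave(false, 0, true))
	fmt.Println(respondToWantHave(false, 0, false))
}
```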
#### Processing Incoming Blocks
When Bitswap receives a block, it checks to see if any peers sent `want-have` or `want-block` for the block. If so it sends `HAVE` or the block itself to those peers.
#### Priority
Bitswap keeps requests from each peer in separate queues, ordered by the priority specified in the request message.
To select which peer to send the next response to, Bitswap chooses the peer with the least amount of data in its send queue. That way it will tend to "keep peers busy" by always keeping some data in each peer's send queue.
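Choosing the peer with the least queued data is a minimum search over per-peer byte counts. The sketch below assumes a plain map from peer ID to queued bytes; `nextPeer` is a hypothetical helper, and the tie-break on peer ID exists only to make the example deterministic.

```go
package main

import "fmt"

// nextPeer returns the peer with the fewest bytes waiting in its send queue.
// Ties are broken by peer ID so that the result is deterministic.
func nextPeer(queuedBytes map[string]int) (string, bool) {
	best, found := "", false
	for id, n := range queuedBytes {
		if !found || n < queuedBytes[best] || (n == queuedBytes[best] && id < best) {
			best, found = id, true
		}
	}
	return best, found
}

func main() {
	queued := map[string]int{"peerA": 4096, "peerB": 512, "peerC": 2048}
	id, _ := nextPeer(queued)
	fmt.Println("respond to", id, "next")
}
```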
Implementation
==============
![Bitswap Components](./docs/go-bitswap.png)
### Bitswap
The Bitswap type receives incoming messages and implements the Exchange API.
When a message is received, Bitswap
- Records some statistics about the message
- Informs the Engine of any new wants, so that the Engine can send responses to the wants
- Informs the Engine of any received blocks, so that the Engine can send the received blocks to any peers that want them
- Informs the WantManager of received blocks, HAVEs and DONT_HAVEs, so that the WantManager can inform interested sessions
When the client makes an API call, Bitswap creates a new Session and calls the corresponding method (e.g. `GetBlocks()`).
### Sending Blocks
When the Engine is informed of new wants it
- Adds the wants to the Ledger (peer A wants block with CID Qmhash...)
- Checks the blockstore for the corresponding blocks, and adds a task to the PeerTaskQueue
  - If the blockstore does not have a wanted block, adds a `DONT_HAVE` task
  - If the blockstore has the block
    - for a `want-have` adds a `HAVE` task
    - for a `want-block` adds a `block` task
When the Engine is informed of new blocks it checks the Ledger to see if any peers want information about those blocks.
- For each block
  - For each peer that sent a `want-have` for the corresponding block, adds a `HAVE` task to the PeerTaskQueue
  - For each peer that sent a `want-block` for the corresponding block, adds a `block` task to the PeerTaskQueue
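The per-block, per-peer loop above can be sketched with a simple lookup table. The `ledger` and `task` types here are hypothetical stand-ins that index wants by CID and then by peer; they are not the data structures used by the Engine.

```go
package main

import (
	"fmt"
	"sort"
)

type wantType int

const (
	wantHave wantType = iota
	wantBlock
)

// task is a hypothetical unit of work for the PeerTaskQueue.
type task struct {
	Peer string
	CID  string
	Kind string // "HAVE" or "block"
}

// ledger records, per CID, which peers asked for it and how.
type ledger map[string]map[string]wantType

// tasksForNewBlocks creates one task per peer that wants a newly received block.
func tasksForNewBlocks(l ledger, cids []string) []task {
	var tasks []task
	for _, c := range cids {
		peers := make([]string, 0, len(l[c]))
		for p := range l[c] {
			peers = append(peers, p)
		}
		sort.Strings(peers) // stable order for the example
		for _, p := range peers {
			kind := "HAVE"
			if l[c][p] == wantBlock {
				kind = "block"
			}
			tasks = append(tasks, task{Peer: p, CID: c, Kind: kind})
		}
	}
	return tasks
}

func main() {
	l := ledger{"cid1": {"peerA": wantHave, "peerB": wantBlock}}
	for _, t := range tasksForNewBlocks(l, []string{"cid1"}) {
		fmt.Printf("%s task for %s (%s)\n", t.Kind, t.Peer, t.CID)
	}
}
```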
The Engine periodically pops tasks off the PeerTaskQueue, and creates a message with `blocks`, `HAVEs` and `DONT_HAVEs`.
The PeerTaskQueue prioritizes tasks such that the peers with the least amount of data in their send queue are highest priority, so as to "keep peers busy".
### Requesting Blocks
When the WantManager is informed of a new message, it
- informs the SessionManager, which in turn informs the Sessions that are interested in the received blocks and wants
- informs the PeerManager of received blocks. The PeerManager checks whether any wants were sent to a peer for the received blocks and, if so, sends a `CANCEL` message to those peers.
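The cancel bookkeeping can be illustrated with a map from peer to the set of CIDs already requested from that peer. This is a sketch under that assumption; `sentWants` and `cancelsFor` are hypothetical names, not part of the PeerManager API.

```go
package main

import "fmt"

// sentWants maps a peer ID to the set of CIDs for which a want was sent.
type sentWants map[string]map[string]bool

// cancelsFor returns, per peer, the CIDs that should be cancelled because
// the blocks have arrived. Cancelled wants are removed from the tracking map.
func cancelsFor(sent sentWants, received []string) map[string][]string {
	cancels := make(map[string][]string)
	for peer, wants := range sent {
		for _, c := range received {
			if wants[c] {
				cancels[peer] = append(cancels[peer], c)
				delete(wants, c)
			}
		}
	}
	return cancels
}

func main() {
	sent := sentWants{
		"peerA": {"cid1": true, "cid2": true},
		"peerB": {"cid2": true},
	}
	fmt.Println(cancelsFor(sent, []string{"cid2"}))
}
```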
### Sessions
The Session starts in "discovery" mode. This means it doesn't have any peers yet, and needs to discover which peers have the blocks it wants.
When the client initially requests blocks from a Session, the Session
- informs the SessionInterestManager that it is interested in the want
- informs the sessionWantSender of the want
- tells the WantManager to broadcast a `want-have` to all connected peers so as to discover which peers have the block
- queries the ProviderQueryManager to discover which peers have the block
When the session receives a message with `HAVE` or a `block`, it informs the SessionPeerManager. The SessionPeerManager keeps track of all peers in the session.
When the session receives a message with a `block` it informs the SessionInterestManager.
Once the session has peers it is no longer in "discovery" mode. When the client requests subsequent blocks the Session informs the sessionWantSender. The sessionWantSender tells the PeerManager to send `want-have` and `want-block` to peers in the session.
For each block that the Session wants, the sessionWantSender decides which peer is most likely to have a block by checking with the BlockPresenceManager which peers have sent a `HAVE` for the block. If no peers or multiple peers have sent `HAVE`, a peer is chosen probabilistically according to how many times each peer was first to send a block in response to previous wants requested by the Session. The sessionWantSender sends a single "optimistic" `want-block` to the chosen peer, and sends `want-have` to all other peers in the Session.
When a peer responds with `DONT_HAVE`, the Session sends `want-block` to the next best peer, and so on until the block is received.
### PeerManager
The PeerManager creates a MessageQueue for each peer that connects to Bitswap. It remembers which `want-have` / `want-block` has been sent to each peer, and directs any new wants to the correct peer.
The MessageQueue groups together wants into a message, and sends the message to the peer. It monitors for timeouts and simulates a `DONT_HAVE` response if a peer takes too long to respond.
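The timeout behaviour can be sketched as a periodic sweep over outstanding wants. Everything below is an assumption made for illustration: the timeout value, the use of millisecond timestamps, and the `expiredWants` helper are not taken from the MessageQueue implementation.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// dontHaveTimeoutMillis is an assumed timeout; the real value is not
// stated in this document.
const dontHaveTimeoutMillis int64 = 5000

// expiredWants returns the CIDs whose want has been outstanding for at least
// the timeout and removes them from the pending map. The caller would then
// treat each returned CID as if the peer had answered DONT_HAVE.
func expiredWants(sentAtMillis map[string]int64, nowMillis int64) []string {
	var expired []string
	for c, sentAt := range sentAtMillis {
		if nowMillis-sentAt >= dontHaveTimeoutMillis {
			expired = append(expired, c)
			delete(sentAtMillis, c)
		}
	}
	sort.Strings(expired) // stable order for the example
	return expired
}

func main() {
	now := time.Now().UnixMilli()
	pending := map[string]int64{"cid1": now - 6000, "cid2": now - 1000}
	for _, c := range expiredWants(pending, now) {
		fmt.Println("treat as DONT_HAVE:", c)
	}
}
```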
### Finding Providers
When Bitswap can't find a connected peer who already has the block it wants, it falls back to querying a content routing system (a DHT in IPFS's case) to try to locate a peer with the block.
Bitswap routes these requests through the ProviderQueryManager system, which rate-limits these requests and also deduplicates in-process requests.
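Rate limiting and de-duplication of this kind are commonly built from a semaphore plus a table of in-flight requests. The sketch below shows that general pattern only; `queryManager`, `lookupFunc` and `FindProviders` are hypothetical names and do not reflect the actual ProviderQueryManager code.

```go
package main

import (
	"fmt"
	"sync"
)

// lookupFunc stands in for a content-routing query (for example a DHT
// lookup) that returns the IDs of peers providing a CID.
type lookupFunc func(cid string) []string

type queryManager struct {
	mu       sync.Mutex
	inFlight map[string][]chan []string // CID -> callers waiting for the result
	slots    chan struct{}              // limits concurrent lookups
	lookup   lookupFunc
}

// newQueryManager allows at most maxConcurrent lookups at once.
// maxConcurrent must be at least 1.
func newQueryManager(maxConcurrent int, lookup lookupFunc) *queryManager {
	return &queryManager{
		inFlight: make(map[string][]chan []string),
		slots:    make(chan struct{}, maxConcurrent),
		lookup:   lookup,
	}
}

// FindProviders returns the providers for cid. Callers that ask for a CID
// whose lookup is already running share that lookup's result.
func (m *queryManager) FindProviders(cid string) []string {
	ch := make(chan []string, 1)
	m.mu.Lock()
	waiters, running := m.inFlight[cid]
	m.inFlight[cid] = append(waiters, ch)
	m.mu.Unlock()
	if !running {
		go m.run(cid)
	}
	return <-ch
}

// run performs one lookup and hands the result to every waiting caller.
func (m *queryManager) run(cid string) {
	m.slots <- struct{}{} // wait for a free slot
	result := m.lookup(cid)
	<-m.slots

	m.mu.Lock()
	waiters := m.inFlight[cid]
	delete(m.inFlight, cid)
	m.mu.Unlock()
	for _, ch := range waiters {
		ch <- result
	}
}

func main() {
	m := newQueryManager(2, func(cid string) []string {
		return []string{"peerA", "peerB"}
	})
	fmt.Println(m.FindProviders("cid1"))
}
```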
### Providing
As a Bitswap client receives blocks, by default it announces them on the provided content routing system (again, a DHT in most cases). This behaviour can be disabled by passing `bitswap.ProvideEnabled(false)` as a parameter when initializing Bitswap. IPFS currently has its own experimental provider system ([go-ipfs-provider](https://github.com/ipfs/go-ipfs-provider)) which will eventually replace Bitswap's system entirely.