Unverified Commit 688dcf34 authored by Hannah Howard's avatar Hannah Howard Committed by GitHub

docs(architecture): update architecture docs (#154)

parent f39ea50a
...@@ -66,6 +66,7 @@ Having outlined all the steps to execute a single roundtrip Graphsync request, t
- minimize network traffic and not send duplicate data
- not get blocked if any one request or response becomes blocked for whatever reason
- have ways of protecting itself from getting overwhelmed by a malicious peer (i.e. be less vulnerable to Denial of Service attacks)
- manage memory effectively when executing large queries on either side
To do this, GraphSync maintains several independent threads of execution (i.e. goroutines). Specifically:
- On the requestor side:
...@@ -75,8 +76,7 @@ To do this, GraphSync maintains several independent threads of execution (i.e. g
4. Each outgoing request has an independent thread collecting and buffering final responses before they are returned to the caller. Graphsync returns responses to the caller through a channel. If the caller fails to immediately read the response channel, this should not block other requests from being processed.
- On the responder side:
1. We maintain an independent thread to receive incoming requests and track outgoing responses. As each incoming request is received, it's put into a prioritized queue.
2. We maintain a fixed number of threads that continuously pull the highest-priority request from the queue and perform the selector query for that request. We marshal and deduplicate outgoing responses and blocks before they are sent back. This minimizes data sent on the wire and allows queries to proceed without getting blocked by the network.
- At the messaging layer:
1. Each peer we send messages to has an independent thread collecting and buffering message data while waiting for the last message to finish sending. This allows higher-level operations to execute without getting blocked by a slow network.
...@@ -143,7 +143,7 @@ In addition, an optimized responder implementation accounts for the following co
* *Preserve Bandwidth* - Be efficient with network usage, deduplicate data, and buffer response output so that each new network message contains all response data we have at the time the pipe becomes free.
The responder implementation is managed by the Response Manager. The ResponseManager delegates to PeerTaskQueue to rate limit the number of in-progress selector traversals and ensure no one peer is given more priority than others. As data is generated from selector traversals, the ResponseManager uses the ResponseAssembler to aggregate response data for each peer and send compact messages over the network.
The following diagram outlines in greater detail go-graphsync's responder implementation, covering how it's initialized and how it responds to requests:
![Responding To A Request](responder-sequence.png)
...@@ -160,12 +160,21 @@ Meanwhile, the ResponseManager starts a fixed number of workers (currently 6), e
The net here is that no peer can have more than a fixed number of requests in progress at once, and even if a peer sends infinite requests, other peers will still jump ahead of it and get a chance to process their requests.
### ResponseAssembler -- Deduping blocks and data
Once a request is dequeued, we generate an intercepted loader and provide it to go-ipld-prime to execute a traversal. Each call to the loader will generate a block that we either have or don't. We need to transmit that information across the network. However, that information needs to be encoded in the GraphSync message format, and combined with any other responses we may be sending to the same peer at the same time, ideally without sending blocks more times than necessary.
These tasks are managed by the ResponseAssembler. The ResponseAssembler creates a LinkTracker for each peer to track which blocks have been sent. Responses are sent by calling Transaction on the ResponseAssembler, which provides a ResponseBuilder interface that can be used to assemble responses. Transaction is named as such because all data added to a response by calling methods on the provided ResponseBuilder is guaranteed to go out in the same network message.
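The atomicity guarantee behind Transaction can be sketched as follows. This is a simplified illustration of the pattern, not the real ResponseAssembler signatures; the builder methods and the message representation are invented for the example:

```go
package main

import (
	"fmt"
	"sync"
)

// ResponseBuilder (sketch) records operations for one transaction.
type ResponseBuilder struct {
	ops []string
}

func (b *ResponseBuilder) SendBlock(cid string) { b.ops = append(b.ops, "block:"+cid) }
func (b *ResponseBuilder) FinishRequest()       { b.ops = append(b.ops, "complete") }

// ResponseAssembler (sketch) turns each transaction into exactly one
// outgoing network message.
type ResponseAssembler struct {
	mu   sync.Mutex
	sent [][]string // each element is one network message
}

// Transaction runs fn with a builder; everything fn adds goes out
// together in the same message, never split across messages.
func (a *ResponseAssembler) Transaction(fn func(*ResponseBuilder)) {
	a.mu.Lock()
	defer a.mu.Unlock()
	b := &ResponseBuilder{}
	fn(b)
	if len(b.ops) > 0 {
		a.sent = append(a.sent, b.ops)
	}
}

func main() {
	a := &ResponseAssembler{}
	a.Transaction(func(b *ResponseBuilder) {
		b.SendBlock("cid-1")
		b.SendBlock("cid-2")
		b.FinishRequest()
	})
	// One message containing all three operations.
	fmt.Println(len(a.sent), len(a.sent[0]))
}
```

Holding the lock for the duration of the callback is what makes the guarantee hold even when many traversals write responses for the same peer concurrently.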
## Message Sending Layer
The message sending layer consists of a PeerManager which tracks peers, and a message queue for each peer. The PeerManager spins up new message queues on demand. When a new request is received, it spins up a queue as needed and delegates sending to the message queue, which collects message data until the network stream is ready for another message. It then encodes and sends the message to the network.
The message queue system contains a mechanism for applying backpressure to a query execution, to make sure that a slow network connection doesn't cause us to load all the blocks for the query into memory while we wait for messages to go over the network. Whenever you attempt to queue data into the message queue, you provide an estimated size for the data that will be held in memory until the message goes out. Internally, the message queue uses the Allocator to track memory usage, and the call to queue data will block if there is too much data buffered in memory. When messages are sent out, memory is released, which unblocks requests to queue data for the message queue.
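The blocking-allocation idea can be sketched with a condition variable. This is illustrative only and does not match go-graphsync's actual Allocator API; the method names and budget handling here are assumptions made for the example:

```go
package main

import (
	"fmt"
	"sync"
)

// Allocator (sketch) tracks how much response data is buffered in
// memory against a fixed budget.
type Allocator struct {
	mu       sync.Mutex
	cond     *sync.Cond
	total    uint64 // memory budget
	reserved uint64 // currently buffered
}

func NewAllocator(total uint64) *Allocator {
	a := &Allocator{total: total}
	a.cond = sync.NewCond(&a.mu)
	return a
}

// AllocateBlockMemory blocks while the budget is exhausted, applying
// backpressure to the traversal producing blocks.
func (a *Allocator) AllocateBlockMemory(size uint64) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for a.reserved+size > a.total {
		a.cond.Wait()
	}
	a.reserved += size
}

// ReleaseBlockMemory is called once a message has gone out on the wire.
func (a *Allocator) ReleaseBlockMemory(size uint64) {
	a.mu.Lock()
	a.reserved -= size
	a.mu.Unlock()
	a.cond.Broadcast()
}

func main() {
	a := NewAllocator(1 << 20) // 1 MiB budget
	a.AllocateBlockMemory(512 << 10)
	done := make(chan struct{})
	go func() {
		a.AllocateBlockMemory(768 << 10) // blocks: 512 KiB + 768 KiB > 1 MiB
		close(done)
	}()
	a.ReleaseBlockMemory(512 << 10) // message sent; frees budget and unblocks
	<-done
	fmt.Println("second allocation proceeded after release")
}
```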
## Hooks And Listeners
go-graphsync provides a variety of points in the request/response lifecycle where one can provide a hook to inspect the current state of the request/response and potentially take action. These hooks provide the core mechanisms for authenticating requests, processing graphsync extensions, pausing and resuming, and generally enabling a higher-level consumer of graphsync to precisely control the request/response lifecycle.
Graphsync also provides listeners that enable a caller to be notified when various asynchronous events happen in the request/response lifecycle. Currently graphsync contains an internal pubsub notification system (see [notifications](../notifications)) to escalate low-level asynchronous events back to high-level modules that pass them to external listeners. A future refactor might look for a way to remove this notification system, as it adds additional complexity.
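The shape of that internal pubsub layer, reduced to its essentials (illustrative only; see the notifications package for the real implementation, which also handles subscription lifecycles and asynchrony):

```go
package main

import "fmt"

// Event (sketch) is a low-level notification on a named topic.
type Event struct{ Topic, Payload string }

// Publisher (sketch) fans events out to all subscribers of a topic.
type Publisher struct {
	subs map[string][]func(Event)
}

func NewPublisher() *Publisher {
	return &Publisher{subs: map[string][]func(Event){}}
}

func (p *Publisher) Subscribe(topic string, fn func(Event)) {
	p.subs[topic] = append(p.subs[topic], fn)
}

func (p *Publisher) Publish(ev Event) {
	for _, fn := range p.subs[ev.Topic] {
		fn(ev)
	}
}

func main() {
	p := NewPublisher()
	// A high-level module relays the low-level event to an external listener.
	p.Subscribe("messageSent", func(ev Event) {
		fmt.Println("listener saw:", ev.Payload)
	})
	p.Publish(Event{Topic: "messageSent", Payload: "response for request 7"})
}
```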
docs/processes.png
...@@ -51,16 +51,6 @@ fork again
:ipld.Traverse;
end fork
}
}
endif
partition "Message Sending Layer" {
...
docs/responder-sequence.png
@startuml Responding To A Request
participant "GraphSync\nTop Level\nInterface" as TLI
participant ResponseManager
participant "Query Executor" as QW
participant PeerTaskQueue
participant PeerTracker
participant Traverser
participant ResponseAssembler
participant LinkTracker
participant ResponseBuilder
participant "Intercepted Loader" as ILoader
participant Loader
participant "Message Sending\nLayer" as Message
...@@ -37,63 +36,40 @@ ResponseManager -> ResponseManager : Cancel Request Context
end
end
else
loop until shutdown
note over QW: Request Processing Loop
QW -> PeerTaskQueue : Pop Request
PeerTaskQueue -> PeerTracker : Pop Request
PeerTracker -> PeerTaskQueue : Next Request\nTo Process
PeerTaskQueue -> QW : Next Request\nTo Process
QW -> QW : Process incoming request hooks
QW -> ILoader ** : Create w/ Request, Peer, and Loader
QW -> Traverser ** : Create to manage selector traversal
loop until traversal complete or request context cancelled
note over Traverser: Selector Traversal Loop
Traverser -> ILoader : Request to load blocks\nto perform traversal
ILoader -> Loader : Load blocks\nfrom local storage
Loader -> ILoader : Blocks From\nlocal storage or error
ILoader -> Traverser : Blocks to continue\n traversal or error
ILoader -> QW : Block or error to Send Back
QW -> QW : Process outgoing block hooks
QW -> ResponseAssembler : Add outgoing responses
activate ResponseAssembler
ResponseAssembler -> LinkTracker ** : Create for peer if not already present
ResponseAssembler -> LinkTracker : Notify block or\n error, ask whether\n block is duplicate
LinkTracker -> ResponseAssembler : Whether to\n send block
ResponseAssembler -> ResponseBuilder : Aggregate Response Metadata & Block
ResponseAssembler -> Message : Send aggregate response
deactivate ResponseAssembler
end
Traverser -> QW : Traversal Complete
QW -> ResponseAssembler : Request Finished
activate ResponseAssembler
ResponseAssembler -> LinkTracker : Query If Errors\n Were Present
LinkTracker -> ResponseAssembler : True/False\n if errors present
ResponseAssembler -> ResponseBuilder : Aggregate request finishing
ResponseAssembler -> Message : Send aggregate response
deactivate ResponseAssembler
end
deactivate QW
end
...