Unverified Commit e5c7a3e6 authored by Peter Goodspeed-Niklaus, committed by GitHub

Convert guide from single markdown file to mdbook (#1247)

* move old implementers' guide, add skeleton of new

* Split the old implementers' guide into the new one's sections

This is mostly a straightforward copying operation, moving the
appropriate sections from the old guide to the new. However, there
are certain differences between the old text and the new:

- removed horizontal rules between the sections
- promoted headers appropriately within each section
- deleted certain sections which were in the old guide's ToC but
  which were not actually present in the old guide.
- added Peer Set Manager to the new ToC

* remove description headers

They are redundant and unnecessary. Descriptions fall directly under the
top-level header for any given section.

* add stub description of the backing module

* add stub description for the availability module

* add stub description for collators

* add stub description for validity

* add stub description for utility

* highlight TODO and REVIEW comments

* add guide readme describing how to use mdbook

* fix markdownlint lints

* re-title parachains overview

* internal linking for types

* module and subsystem internal links

* .gitignore should have a trailing newline

* node does not have modules, just subsystems
# The Polkadot Parachain Host Implementers' Guide
The implementers' guide is compiled from several source files with [mdBook](https://github.com/rust-lang/mdBook). To view it live, locally, from the repo root:
```sh
cargo install mdbook
mdbook serve roadmap/implementors-guide
open http://localhost:3000
```
```toml
[book]
authors = ["Rob Habermeier", "Peter Goodspeed-Niklaus"]
language = "en"
multilingual = false
src = "src"
title = "The Polkadot Parachain Host Implementers' Guide"
```
# Preamble
This document aims to describe the purpose, functionality, and implementation of a host for Polkadot's _parachains_. It is not for the implementor of a specific parachain but rather for the implementor of the Parachain Host, which provides security and advancement for constituent parachains. In practice, this is for the implementors of Polkadot.
There are a number of other documents describing the research in more detail. All referenced documents will be linked here and should be read alongside this document for the best understanding of the full picture. However, this is the only document which aims to describe key aspects of Polkadot's particular instantiation of much of that research down to low-level technical details and software architecture.
# Summary
[Preamble](README.md)
- [Whence Parachains](whence-parachains.md)
- [Parachains Overview](parachains-overview.md)
- [Architecture Overview](architecture.md)
  - [Runtime Architecture](runtime/README.md)
    - [Initializer Module](runtime/initializer.md)
    - [Configuration Module](runtime/configuration.md)
    - [Paras Module](runtime/paras.md)
    - [Scheduler Module](runtime/scheduler.md)
    - [Inclusion Module](runtime/inclusion.md)
    - [InclusionInherent Module](runtime/inclusioninherent.md)
    - [Validity Module](runtime/validity.md)
  - [Node Architecture](node/README.md)
    - [Subsystems and Jobs](node/subsystems-and-jobs.md)
    - [Overseer](node/overseer.md)
    - [Backing Subsystems](node/backing/README.md)
      - [Candidate Selection](node/backing/candidate-selection.md)
      - [Candidate Backing](node/backing/candidate-backing.md)
      - [Statement Distribution](node/backing/statement-distribution.md)
      - [PoV Distribution](node/backing/pov-distribution.md)
    - [Availability Subsystems](node/availability/README.md)
      - [Availability Distribution](node/availability/availability-distribution.md)
      - [Bitfield Distribution](node/availability/bitfield-distribution.md)
      - [Bitfield Signing](node/availability/bitfield-signing.md)
    - [Collators](node/collators/README.md)
      - [Collation Generation](node/collators/collation-generation.md)
      - [Collation Distribution](node/collators/collation-distribution.md)
    - [Validity](node/validity/README.md)
    - [Utility Subsystems](node/utility/README.md)
      - [Availability Store](node/utility/availability-store.md)
      - [Candidate Validation](node/utility/candidate-validation.md)
      - [Provisioner](node/utility/provisioner.md)
      - [Network Bridge](node/utility/network-bridge.md)
      - [Misbehavior Arbitration](node/utility/misbehavior-arbitration.md)
      - [Peer Set Manager](node/utility/peer-set-manager.md)
- [Data Structures and Types](type-definitions.md)
[Glossary](glossary.md)
[Further Reading](further-reading.md)
# Architecture Overview
Our Parachain Host includes a blockchain known as the relay-chain. A blockchain is a Directed Acyclic Graph (DAG) of state transitions, where every block can be considered to be the head of a linked-list (known as a "chain" or "fork") with a cumulative state which is determined by applying the state transition of each block in turn. All paths through the DAG terminate at the Genesis Block. In fact, the blockchain is a tree, since each block can have only one parent.
```text
+----------------+     +----------------+
|    Block 4     |     |    Block 5     |
+----------------+     +----------------+
             \              /
              V            V
          +----------------+
          |    Block 3     |
          +----------------+
                  |
                  V
+----------------+     +----------------+
|    Block 1     |     |    Block 2     |
+----------------+     +----------------+
             \              /
              V            V
          +----------------+
          |    Genesis     |
          +----------------+
```
A blockchain network is composed of nodes. These nodes each have a view of many different forks of a blockchain and must decide which forks to follow and what actions to take based on the forks of the chain that they are aware of.
So in specifying an architecture to carry out the functionality of a Parachain Host, we have to answer two categories of questions:
1. What is the state-transition function of the blockchain? What is necessary for a transition to be considered valid, and what information is carried within the implicit state of a block?
1. Being aware of various forks of the blockchain as well as global private state such as a view of the current time, what behaviors should a node undertake? What information should a node extract from the state of which forks, and how should that information be used?
The first category of questions will be addressed by the Runtime, which defines the state-transition logic of the chain. Runtime logic only has to focus on the perspective of one chain, as each state has only a single parent state.
The second category of questions is addressed by Node-side behavior. Node-side behavior defines all activities that a node undertakes, given its view of the blockchain/block-DAG. Node-side behavior can take into account all or many of the forks of the blockchain, and only conditionally undertake certain activities based on which forks it is aware of, as well as the state of the head of those forks.
```text
      __________________________________
     /                                  \
     |             Runtime              |
     |                                  |
     \_________(Runtime API )___________/
                  |       ^
                  V       |
+----------------------------------------------+
|                                              |
|                     Node                     |
|                                              |
|                                              |
+----------------------------------------------+
                +            +
                |            |
----------------+            +-----------------
                  Transport
------------------------------------------------
```
It is also helpful to divide Node-side behavior into two further categories: Networking and Core. Networking behaviors relate to how information is distributed between nodes. Core behaviors relate to internal work that a specific node does. These two categories of behavior often interact, but can be heavily abstracted from each other. Core behaviors care that information is distributed and received, but not the internal details of how distribution and receipt function. Networking behaviors act on requests for distribution or fetching of information, but are not concerned with how the information is used afterwards. This allows us to create clean boundaries between Core and Networking activities, improving the modularity of the code.
```text
   _____________________                    ______________________
  /        Core         \                  /      Networking      \
  |                     |   Send "Hello"  |                      |
  |                     |---  to "foo" -->|                      |
  |                     |                 |                      |
  |                     |                 |                      |
  |                     |                 |                      |
  |                     |   Got "World"   |                      |
  |                     |<-- from "bar" --|                      |
  |                     |                 |                      |
  \_____________________/                 \______________________/
                                                  ______| |______
                                                   ___Transport___
```
Node-side behavior is split up into various subsystems. Subsystems are long-lived workers that perform a particular category of work. Subsystems can communicate with each other, and do so via an [Overseer](/node/overseer.html) that prevents race conditions.
Runtime logic is divided up into Modules and APIs. Modules encapsulate particular behavior of the system. Modules consist of storage, routines, and entry-points. Routines are invoked by entry points, by other modules, or upon block initialization or closing. Routines can read and alter the storage of the module. Entry-points are the means by which new information is introduced to a module and can limit the origins (user, root, parachain) that they accept being called by. Each block in the blockchain contains a set of Extrinsics. Each extrinsic specifies the entry point to trigger and the data to be passed to it. Runtime APIs provide a means for Node-side behavior to extract meaningful information from the state of a single fork.
These two aspects of the implementation are heavily dependent on each other. The Runtime depends on Node-side behavior to author blocks, and to include Extrinsics which trigger the correct entry points. The Node-side behavior relies on Runtime APIs to extract information necessary to determine which actions to take.
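To make the module shape described above concrete, here is a minimal, purely illustrative sketch (not the actual runtime code): a module owning its storage, a routine invoked on block initialization, and an entry point that restricts the origins allowed to call it. All names (`ConfigurationModule`, `set_config`, `Origin`) are assumptions for the sake of the example.

```rust
use std::collections::HashMap;

/// Hypothetical origins an entry point may accept.
enum Origin {
    User(u64),      // signed by an account
    Root,           // governance / sudo
    Parachain(u32), // a message from a parachain
}

/// A module encapsulates its own storage.
struct ConfigurationModule {
    /// storage: pending configuration values keyed by name
    pending: HashMap<&'static str, u32>,
    /// storage: the active configuration
    active: HashMap<&'static str, u32>,
}

impl ConfigurationModule {
    /// Routine: invoked at the start of every block, e.g. by an initializer.
    fn on_block_initialization(&mut self) {
        // promote pending values into the active configuration
        for (k, v) in self.pending.drain() {
            self.active.insert(k, v);
        }
    }

    /// Entry point: new information introduced to the module via an extrinsic.
    /// It limits which origins may call it, then alters storage.
    fn set_config(&mut self, origin: Origin, key: &'static str, value: u32) -> Result<(), &'static str> {
        match origin {
            Origin::Root => {
                self.pending.insert(key, value);
                Ok(())
            }
            _ => Err("only root may change the configuration"),
        }
    }
}

fn main() {
    let mut module = ConfigurationModule { pending: HashMap::new(), active: HashMap::new() };
    module.set_config(Origin::Root, "max_code_size", 1024).unwrap();
    module.on_block_initialization();
    assert_eq!(module.active.get("max_code_size"), Some(&1024));
}
```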
# Further Reading
- Polkadot Wiki on Consensus: <https://wiki.polkadot.network/docs/en/learn-consensus>
- Polkadot Runtime Spec: <https://github.com/w3f/polkadot-spec/tree/spec-rt-anv-vrf-gen-and-announcement/runtime-spec>
# Glossary
Here you can find definitions of a bunch of jargon, usually specific to the Polkadot project.
- BABE: (Blind Assignment for Blockchain Extension). The algorithm validators use to safely extend the Relay Chain. See [the Polkadot wiki][0] for more information.
- Backable Candidate: A Parachain Candidate which is backed by a majority of validators assigned to a given parachain.
- Backed Candidate: A Backable Candidate noted in a relay-chain block.
- Backing: A set of statements proving that a Parachain Candidate is backable.
- Collator: A node who generates Proofs-of-Validity (PoV) for blocks of a specific parachain.
- Extrinsic: An element of a relay-chain block which triggers a specific entry-point of a runtime module with given arguments.
- GRANDPA: (Ghost-based Recursive ANcestor Deriving Prefix Agreement). The algorithm validators use to guarantee finality of the Relay Chain.
- Inclusion Pipeline: The set of steps taken to carry a Parachain Candidate from authoring, to backing, to availability and full inclusion in an active fork of its parachain.
- Module: A component of the Runtime logic, encapsulating storage, routines, and entry-points.
- Module Entry Point: A recipient of new information presented to the Runtime. This may trigger routines.
- Module Routine: A piece of code executed within a module by block initialization, closing, or upon an entry point being triggered. This may execute computation, and read or write storage.
- Node: A participant in the Polkadot network, who follows the protocols of communication and connection to other nodes. Nodes form a peer-to-peer network topology without a central authority.
- Parachain Candidate, or Candidate: A proposed block for inclusion into a parachain.
- Parablock: A block in a parachain.
- Parachain: A constituent chain secured by the Relay Chain's validators.
- Parachain Validators: A subset of validators assigned during a period of time to back candidates for a specific parachain.
- Parathread: A parachain which is scheduled on a pay-as-you-go basis.
- Proof-of-Validity (PoV): A stateless-client proof that a parachain candidate is valid, with respect to some validation function.
- Relay Parent: A block in the relay chain, referred to in a context where work is being done in the context of the state at this block.
- Runtime: The relay-chain state machine.
- Runtime Module: See Module.
- Runtime API: A means for the node-side behavior to access structured information based on the state of a fork of the blockchain.
- Secondary Checker: A validator who has been randomly selected to perform secondary approval checks on a parablock which is pending approval.
- Subsystem: A long-running task which is responsible for carrying out a particular category of work.
- Validator: Specially-selected node in the network who is responsible for validating parachain blocks and issuing attestations about their validity.
- Validation Function: A piece of Wasm code that describes the state-transition function of a parachain.
Also of use is the [Substrate Glossary](https://substrate.dev/docs/en/overview/glossary).
[0]: https://wiki.polkadot.network/docs/en/learn-consensus
# Node Architecture
## Design Goals
* Modularity: Components of the system should be as self-contained as possible. Communication boundaries between components should be well-defined and mockable. This is key to creating testable, easily reviewable code.
* Minimizing side effects: Components of the system should aim to minimize side effects and to communicate with other components via message-passing.
* Operational Safety: The software will be managing signing keys where conflicting messages can lead to large amounts of value being slashed. Care should be taken to ensure that no messages are signed incorrectly or in conflict with each other.
The architecture of the node-side behavior aims to embody the Rust principles of ownership and message-passing to create clean, isolatable code. Each resource should have a single owner, with minimal sharing where unavoidable.
Many operations that need to be carried out involve the network, which is asynchronous. This asynchrony affects all core subsystems that rely on the network as well. The approach of hierarchical state machines is well-suited to this kind of environment.
We introduce a hierarchy of state machines consisting of an overseer supervising subsystems, where Subsystems can contain their own internal hierarchy of jobs. This is elaborated on in the next section on Subsystems.
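The following std-only sketch illustrates the message-passing shape this implies: a supervisor owns the channel to a long-lived worker and controls when it starts and stops work for a given relay-parent, with no shared mutable state between them. The signal names echo ones used elsewhere in this guide, but the code itself is an assumption-laden illustration, not the Overseer implementation.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Signals a supervisor sends to a subsystem (illustrative subset).
enum OverseerSignal {
    StartWork([u8; 32]), // relay-parent hash
    StopWork([u8; 32]),
    Conclude,
}

/// A subsystem is a long-lived worker that reacts to messages it receives.
fn run_subsystem(rx: Receiver<OverseerSignal>) {
    while let Ok(signal) = rx.recv() {
        match signal {
            OverseerSignal::StartWork(hash) => println!("start job for {:?}", &hash[..4]),
            OverseerSignal::StopWork(hash) => println!("stop job for {:?}", &hash[..4]),
            OverseerSignal::Conclude => break,
        }
    }
}

fn main() {
    let (tx, rx): (Sender<OverseerSignal>, Receiver<OverseerSignal>) = channel();
    let handle = thread::spawn(move || run_subsystem(rx));

    // The supervisor is the single owner of the sender and decides when the
    // subsystem begins and ends work for a given relay-parent.
    let head = [0u8; 32];
    tx.send(OverseerSignal::StartWork(head)).unwrap();
    tx.send(OverseerSignal::StopWork(head)).unwrap();
    tx.send(OverseerSignal::Conclude).unwrap();
    handle.join().unwrap();
}
```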
# Availability Subsystems
The availability subsystems are responsible for ensuring that Proofs of Validity of backed candidates are widely available within the validator set, without requiring every node to retain a full copy. They accomplish this by broadly distributing erasure-coded chunks of the PoV, keeping track of which validator has which chunk by means of signed bitfields. They are also responsible for reassembling a complete PoV when required, e.g. when a fisherman reports a potentially invalid block.
# Availability Distribution
Distribute availability erasure-coded chunks to validators.
After a candidate is backed, the availability of the PoV block must be confirmed by 2/3+ of all validators. Successfully validating a candidate and contributing to its becoming backable leads to the PoV and its erasure coding being stored in the [Availability Store](/node/utility/availability-store.html).
## Protocol
`ProtocolId`:`b"avad"`
Input:
- NetworkBridgeUpdate(update)
Output:
- NetworkBridge::RegisterEventProducer(`ProtocolId`)
- NetworkBridge::SendMessage(`[PeerId]`, `ProtocolId`, `Bytes`)
- NetworkBridge::ReportPeer(PeerId, cost_or_benefit)
- AvailabilityStore::QueryPoV(candidate_hash, response_channel)
- AvailabilityStore::StoreChunk(candidate_hash, chunk_index, inclusion_proof, chunk_data)
## Functionality
On startup, register an event producer with `NetworkBridge::RegisterEventProducer`.
For each relay-parent in our local view update, look at all backed candidates pending availability. Distribute via gossip to peers all erasure chunks that we have for those candidates.
We define an operation `live_candidates(relay_heads) -> Set<AbridgedCandidateReceipt>` which, for a given set of relay-chain heads, returns the set of candidates whose availability chunks should currently be gossiped. This is defined as all candidates pending availability in any of those relay-chain heads or any of their last `K` ancestors. We assume that state is not pruned within `K` blocks of the chain-head.
We will send any erasure-chunks that correspond to candidates in `live_candidates(peer_most_recent_view_update)`. Likewise, we only accept and forward messages pertaining to a candidate in `live_candidates(current_heads)`. Each erasure chunk should be accompanied by a merkle proof that it is committed to by the erasure trie root in the candidate receipt, and this gossip system is responsible for checking such proof.
We re-attempt to send anything live to a peer upon any view update from that peer.
On our view change, for all live candidates, we will check if we have the PoV by issuing a `QueryPoV` message and waiting for the response. If the query returns `Some`, we will perform the erasure-coding and distribute all messages to peers that will accept them.
If we are operating as a validator, we note our index `i` in the validator set and keep the `i`th availability chunk for any live candidate, as we receive it. We keep the chunk and its merkle proof in the [Availability Store](/node/utility/availability-store.html) by sending a `StoreChunk` command. This includes chunks and proofs generated as the result of a successful `QueryPoV`.
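The following sketch shows the `live_candidates` relation and the send/accept rule it implies, with the chain-state lookups stubbed out. The types, helper names, and the value of `K` are assumptions for illustration only (see the TODO below regarding `K`).

```rust
use std::collections::HashSet;

type Hash = [u8; 32];
const K: usize = 3; // assumed pruning window; see the TODO below

/// Stand-in for chain state access: candidates pending availability at a head.
fn pending_availability(_head: &Hash) -> Vec<Hash> { Vec::new() }
/// Stand-in for chain state access: the last K ancestors of a head.
fn last_k_ancestors(_head: &Hash, _k: usize) -> Vec<Hash> { Vec::new() }

/// All candidates whose chunks should currently be gossiped, given some heads.
fn live_candidates(relay_heads: &[Hash]) -> HashSet<Hash> {
    let mut live = HashSet::new();
    for head in relay_heads {
        live.extend(pending_availability(head));
        for ancestor in last_k_ancestors(head, K) {
            live.extend(pending_availability(&ancestor));
        }
    }
    live
}

/// Send a chunk to a peer only if it is live under the peer's view.
fn should_send(peer_view: &[Hash], candidate: &Hash) -> bool {
    live_candidates(peer_view).contains(candidate)
}

/// Accept and forward a chunk only if it is live under our own view.
fn should_accept(our_view: &[Hash], candidate: &Hash) -> bool {
    live_candidates(our_view).contains(candidate)
}

fn main() {
    let head = [1u8; 32];
    // With the stubbed chain state nothing is live, so nothing is sent or accepted.
    assert!(!should_send(&[head], &[2u8; 32]));
    assert!(!should_accept(&[head], &[2u8; 32]));
}
```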
> TODO: back-and-forth is kind of ugly but drastically simplifies the pruning in the availability store, as it creates an invariant that chunks are only stored if the candidate was actually backed
>
> K=3?
# Bitfield Distribution
Validators vote on the availability of a backed candidate by issuing signed bitfields, where each bit corresponds to a single candidate. These bitfields can be used to compactly determine which backed candidates are available or not based on a 2/3+ quorum.
## Protocol
`ProtocolId`: `b"bitd"`
Input:
- `DistributeBitfield(relay_parent, SignedAvailabilityBitfield)`: distribute a bitfield via gossip to other validators.
- `NetworkBridgeUpdate(NetworkBridgeUpdate)`
Output:
- `NetworkBridge::RegisterEventProducer(ProtocolId)`
- `NetworkBridge::SendMessage([PeerId], ProtocolId, Bytes)`
- `NetworkBridge::ReportPeer(PeerId, cost_or_benefit)`
- `BlockAuthorshipProvisioning::Bitfield(relay_parent, SignedAvailabilityBitfield)`
## Functionality
This is implemented as a gossip system. Register a [network bridge](/node/utility/network-bridge.html) event producer on startup and track peer connection, view change, and disconnection events. Only accept bitfields relevant to our current view and only distribute bitfields to other peers when relevant to their most recent view. Check bitfield signatures in this subsystem and accept and distribute only one bitfield per validator.
When receiving a bitfield either from the network or from a `DistributeBitfield` message, forward it along to the block authorship (provisioning) subsystem for potential inclusion in a block.
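As a rough illustration of the acceptance rule above, the sketch below tracks which validators we have already accepted a bitfield from, per relay-parent, and rejects bitfields outside our view. Signature checking is assumed to happen before this point; the type and field names are illustrative.

```rust
use std::collections::{HashMap, HashSet};

type Hash = [u8; 32];
type ValidatorIndex = u32;

struct SignedAvailabilityBitfield {
    relay_parent: Hash,
    validator_index: ValidatorIndex,
    bits: Vec<bool>,
    // signature omitted in this sketch
}

struct BitfieldTracker {
    our_view: HashSet<Hash>,
    /// validators we have already accepted a bitfield from, per relay-parent
    seen: HashMap<Hash, HashSet<ValidatorIndex>>,
}

impl BitfieldTracker {
    /// Returns true if the bitfield is fresh and relevant, and so should be
    /// forwarded to provisioning and gossiped onward.
    fn accept(&mut self, bitfield: &SignedAvailabilityBitfield) -> bool {
        if !self.our_view.contains(&bitfield.relay_parent) {
            return false; // not relevant to our current view
        }
        // assume the signature has already been verified here
        self.seen
            .entry(bitfield.relay_parent)
            .or_default()
            .insert(bitfield.validator_index) // false if already present
    }
}

fn main() {
    let head = [7u8; 32];
    let mut tracker = BitfieldTracker { our_view: [head].into_iter().collect(), seen: HashMap::new() };
    let b = SignedAvailabilityBitfield { relay_parent: head, validator_index: 0, bits: vec![true] };
    assert!(tracker.accept(&b));
    assert!(!tracker.accept(&b)); // a second bitfield from the same validator is dropped
}
```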
# Bitfield Signing
Validators vote on the availability of a backed candidate by issuing signed bitfields, where each bit corresponds to a single candidate. These bitfields can be used to compactly determine which backed candidates are available or not based on a 2/3+ quorum.
## Protocol
Output:
- BitfieldDistribution::DistributeBitfield: distribute a locally signed bitfield
- AvailabilityStore::QueryChunk(CandidateHash, validator_index, response_channel)
## Functionality
Upon the onset of a new relay-chain head with `StartWork`, launch a bitfield signing job for that head. Stop the job on `StopWork`.
## Bitfield Signing Job
Localized to a specific relay-parent `r`
If not running as a validator, do nothing.
- Determine our validator index `i`, the set of backed candidates pending availability in `r`, and which bit of the bitfield each corresponds to.
- > TODO: wait T time for availability distribution?
- Start with an empty bitfield. For each bit in the bitfield, if there is a candidate pending availability, query the [Availability Store](/node/utility/availability-store.html) for whether we have the availability chunk for our validator index.
- For all chunks we have, set the corresponding bit in the bitfield.
- Sign the bitfield and dispatch a `BitfieldDistribution::DistributeBitfield` message.
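A compact sketch of the steps in this list, with the availability-store query and the signing call stubbed out; the helper names are assumptions made for illustration.

```rust
type CandidateHash = [u8; 32];

/// Stand-in: do we hold our chunk (for validator index `i`) of this candidate?
fn have_our_chunk(_candidate: &CandidateHash, _validator_index: u32) -> bool { true }

/// Stand-in for signing the bitfield with our validator key.
fn sign_bitfield(bits: &[bool]) -> Vec<bool> { bits.to_vec() }

/// `pending` lists the candidates pending availability at relay-parent `r`,
/// in bitfield order: each bit corresponds to one candidate (or none).
fn bitfield_signing_job(pending: &[Option<CandidateHash>], our_index: u32) -> Vec<bool> {
    let mut bits = vec![false; pending.len()];
    for (bit, candidate) in bits.iter_mut().zip(pending) {
        if let Some(candidate) = candidate {
            // query the Availability Store for our chunk of this candidate
            *bit = have_our_chunk(candidate, our_index);
        }
    }
    // sign and hand off via `BitfieldDistribution::DistributeBitfield`
    sign_bitfield(&bits)
}

fn main() {
    let pending = [Some([1u8; 32]), None, Some([2u8; 32])];
    let signed = bitfield_signing_job(&pending, 0);
    assert_eq!(signed, vec![true, false, true]);
}
```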
# Backing Subsystems
The backing subsystems, when conceived as a black box, receive an arbitrary quantity of parablock candidates and associated proofs of validity from arbitrary untrusted collators. From these, they produce a bounded quantity of backable candidates which relay chain block authors may choose to include in a subsequent block.
In broad strokes, the flow operates like this:
- **Candidate Selection** winnows the field of parablock candidates, selecting up to one of them to second.
- **Candidate Backing** ensures that a seconding candidate is valid, then generates the appropriate `Statement`. It also keeps track of which candidates have received the backing of a quorum of other validators.
- **Statement Distribution** is the networking component which ensures that all validators receive each others' statements.
- **PoV Distribution** is the networking component which ensures that validators considering a candidate can get the appropriate PoV.
# Candidate Backing
The Candidate Backing subsystem ensures every parablock considered for relay block inclusion has been seconded by at least one validator, and approved by a quorum. Parablocks for which no validator will assert correctness are discarded. If the block later proves invalid, the initial backers are slashable; this gives Polkadot a rational threat model during subsequent stages.
Its role is to produce backable candidates for inclusion in new relay-chain blocks. It does so by issuing signed [`Statement`s](/type-definitions.html#statement-type) and tracking received statements signed by other validators. Once enough statements are received, they can be combined into backing for specific candidates.
Note that though the candidate backing subsystem attempts to produce as many backable candidates as possible, it does _not_ attempt to choose a single authoritative one. The choice of which actually gets included is ultimately up to the block author, by whatever metrics it may use; those are opaque to this subsystem.
Once a sufficient quorum has agreed that a candidate is valid, this subsystem notifies the [Overseer](/node/overseer.html), which in turn engages block production mechanisms to include the parablock.
## Protocol
The [Candidate Selection subsystem](/node/backing/candidate-selection.html) is the primary source of non-overseer messages into this subsystem. That subsystem generates appropriate [`CandidateBackingMessage`s](/type-definitions.html#candidate-backing-message), and passes them to this subsystem.
This subsystem validates the candidates and generates an appropriate [`Statement`](/type-definitions.html#statement-type). All `Statement`s are then passed on to the [Statement Distribution subsystem](/node/backing/statement-distribution.html) to be gossiped to peers. When this subsystem decides that a candidate is invalid, and that candidate was recommended for seconding by our own Candidate Selection subsystem, it sends a message to the Candidate Selection subsystem with the candidate's hash so that the collator which recommended it can be penalized.
## Functionality
The subsystem should maintain a set of handles to Candidate Backing Jobs that are currently live, as well as the relay-parent to which they correspond.
### On Overseer Signal
* If the signal is an [`OverseerSignal`](/type-definitions.html#overseer-signal)`::StartWork(relay_parent)`, spawn a Candidate Backing Job with the given relay parent, storing a bidirectional channel with the Candidate Backing Job in the set of handles.
* If the signal is an [`OverseerSignal`](/type-definitions.html#overseer-signal)`::StopWork(relay_parent)`, cease the Candidate Backing Job under that relay parent, if any.
### On `CandidateBackingMessage`
* If the message corresponds to a particular relay-parent, forward the message to the Candidate Backing Job for that relay-parent, if any is live.
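A minimal sketch of this bookkeeping: one job per live relay-parent, started on `StartWork`, stopped on `StopWork`, with messages forwarded to the matching job. For brevity this sketch uses a one-way channel rather than the bidirectional channel described above, and the job itself is not spawned; all names are illustrative.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};

type Hash = [u8; 32];

enum OverseerSignal { StartWork(Hash), StopWork(Hash) }
struct CandidateBackingMessage { relay_parent: Hash /* , .. */ }

struct CandidateBackingSubsystem {
    /// handle to the Candidate Backing Job for each live relay-parent
    jobs: HashMap<Hash, Sender<CandidateBackingMessage>>,
}

impl CandidateBackingSubsystem {
    fn on_overseer_signal(&mut self, signal: OverseerSignal) {
        match signal {
            OverseerSignal::StartWork(relay_parent) => {
                let (to_job, _job_receiver) = channel();
                // in the real subsystem a job task would be spawned here,
                // holding the receiving end of the channel
                self.jobs.insert(relay_parent, to_job);
            }
            OverseerSignal::StopWork(relay_parent) => {
                // dropping the sender lets the job wind down
                self.jobs.remove(&relay_parent);
            }
        }
    }

    fn on_message(&mut self, msg: CandidateBackingMessage) {
        // forward to the job for this relay-parent, if any is live
        if let Some(job) = self.jobs.get(&msg.relay_parent) {
            let _ = job.send(msg);
        }
    }
}

fn main() {
    let mut subsystem = CandidateBackingSubsystem { jobs: HashMap::new() };
    let head = [3u8; 32];
    subsystem.on_overseer_signal(OverseerSignal::StartWork(head));
    subsystem.on_message(CandidateBackingMessage { relay_parent: head });
    subsystem.on_overseer_signal(OverseerSignal::StopWork(head));
}
```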
> big TODO: "contextual execution"
>
> * At the moment we only allow inclusion of _new_ parachain candidates validated by _current_ validators.
> * Allow inclusion of _old_ parachain candidates validated by _current_ validators.
> * Allow inclusion of _old_ parachain candidates validated by _old_ validators.
>
> This will probably blur the lines between jobs, will probably require inter-job communication and a short-term memory of recently backable, but not backed candidates.
## Candidate Backing Job
The Candidate Backing Job represents the work a node does for backing candidates with respect to a particular relay-parent.
The goal of a Candidate Backing Job is to produce as many backable candidates as possible. This is done via signed [`Statement`s](/type-definitions.html#statement-type) by validators. If a candidate receives a majority of supporting Statements from the Parachain Validators currently assigned, then that candidate is considered backable.
### On Startup
* Fetch current validator set, validator -> parachain assignments from runtime API.
* Determine if the node controls a key in the current validator set. Call this the local key if so.
* If the local key exists, extract the parachain head and validation function for the parachain the local key is assigned to.
### On Receiving New Signed Statement
```rust
if let Statement::Seconded(candidate) = signed.statement {
    if candidate is unknown and in local assignment {
        spawn_validation_work(candidate, parachain head, validation function)
    }
}

// add `Seconded` statements and `Valid` statements to a quorum. If quorum reaches validator-group
// majority, send a `BlockAuthorshipProvisioning::BackableCandidate(relay_parent, Candidate, Backing)` message.
```
### Spawning Validation Work
```rust
fn spawn_validation_work(candidate, parachain head, validation function) {
    asynchronously {
        let pov = (fetch pov block).await

        // dispatched to sub-process (OS process) pool.
        let valid = validate_candidate(candidate, validation function, parachain head, pov).await;

        if valid {
            // make PoV available for later distribution. Send data to the availability store to keep.
            // sign and dispatch `valid` statement to network if we have not seconded the given candidate.
        } else {
            // sign and dispatch `invalid` statement to network.
        }
    }
}
```
### Fetch PoV Block
Create a `(sender, receiver)` pair.
Dispatch a `PovFetchSubsystemMessage(relay_parent, candidate_hash, sender)` and listen on the receiver for a response.
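A sketch of this request/response pattern, using a plain std channel as a stand-in for a one-shot channel; the message type and dispatch target are placeholders taken from the description above rather than real APIs.

```rust
use std::sync::mpsc::{channel, Sender};

type Hash = [u8; 32];
type PoV = Vec<u8>;

struct PovFetchSubsystemMessage {
    relay_parent: Hash,
    candidate_hash: Hash,
    sender: Sender<PoV>,
}

/// Stand-in for dispatching the message to whichever subsystem serves PoVs.
fn dispatch(msg: PovFetchSubsystemMessage) {
    // the responsible subsystem would look the PoV up and reply on the sender
    let _ = msg.sender.send(vec![0u8; 4]);
}

fn fetch_pov_block(relay_parent: Hash, candidate_hash: Hash) -> Option<PoV> {
    // create a `(sender, receiver)` pair
    let (sender, receiver) = channel();
    dispatch(PovFetchSubsystemMessage { relay_parent, candidate_hash, sender });
    // listen on the receiver for a response
    receiver.recv().ok()
}

fn main() {
    let pov = fetch_pov_block([0u8; 32], [1u8; 32]);
    assert!(pov.is_some());
}
```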
### On Receiving `CandidateBackingMessage`
* If the message is a `CandidateBackingMessage::RegisterBackingWatcher`, register the watcher and trigger it each time a new candidate is backable. Also trigger it once initially if there are any backable candidates at the time of receipt.
* If the message is a `CandidateBackingMessage::Second`, sign and dispatch a `Seconded` statement only if we have not seconded any other candidate and have not signed a `Valid` statement for the requested candidate. Signing both a `Seconded` and `Valid` message is a double-voting misbehavior with a heavy penalty, and this could occur if another validator has seconded the same candidate and we've received their message before the internal seconding request.
> TODO: send statements to Statement Distribution subsystem, handle shutdown signal from candidate backing subsystem
# Candidate Selection
The Candidate Selection Subsystem is run by validators, and is responsible for interfacing with Collators to select a candidate, along with its PoV, to second during the backing process relative to a specific relay parent.
This subsystem includes networking code for communicating with collators, and tracks which collations specific collators have submitted. This subsystem is responsible for disconnecting and blacklisting collators whose submitted collations are found by other subsystems to be invalid.
This subsystem is only ever interested in parablocks assigned to the particular parachain which this validator is currently handling.
New parablock candidates may arrive from a potentially unbounded set of collators. This subsystem chooses either 0 or 1 of them per relay parent to second. If it chooses to second a candidate, it sends an appropriate message to the [Candidate Backing subsystem](/node/backing/candidate-backing.html) to generate an appropriate [`Statement`](/type-definitions.html#statement-type).
In the event that a parablock candidate proves invalid, this subsystem will receive a message back from the Candidate Backing subsystem indicating so. If that parablock candidate originated from a collator, this subsystem will blacklist that collator. If that parablock candidate originated from a peer, this subsystem generates a report for the [Misbehavior Arbitration subsystem](/node/utility/misbehavior-arbitration.html).
## Protocol
Input: None
Output:
- Validation requests to Validation subsystem
- [`CandidateBackingMessage`](/type-definitions.html#candidate-backing-message)`::Second`
- Peer set manager: report peers (collators who have misbehaved)
## Functionality
Overarching network protocol + job for every relay-parent
> TODO The Candidate Selection network protocol is currently intentionally unspecified pending further discussion.
Several approaches have been suggested, but all have some issues:
- The most straightforward approach is for this subsystem to simply second the first valid parablock candidate which it sees per relay head. However, that protocol is vulnerable to a single collator which, as an attack or simply through chance, gets its block candidate to the node more often than its fair share of the time.
- It may be possible to do some BABE-like selection algorithm to choose an "Official" collator for the round, but that is tricky because the collator which produces the PoV does not necessarily actually produce the block.
- We could use relay-chain BABE randomness to generate some delay `D` on the order of 1 second, +- 1 second. The validator would then second the first valid parablock which arrives after `D`, or in case none has arrived by `2*D`, the last valid parablock which has arrived. This makes it very hard for a collator to game the system to always get its block nominated, but it reduces the maximum throughput of the system by introducing delay into an already tight schedule.
- A variation of that scheme would be to randomly choose a number `I`, and have a fixed acceptance window `D` for parablock candidates. At the end of the period `D`, count `C`: the number of parablock candidates received. Second the one with index `I % C`. Its drawback is the same: it must wait the full `D` period before seconding any of its received candidates, reducing throughput.
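Since the protocol above is explicitly still unspecified, the following is only a sketch of the fixed-window variation from the last bullet: collect candidates for the window `D`, draw an index `I` from relay-chain randomness, and second candidate `I % C`. The randomness source is stubbed out and all names are illustrative.

```rust
type CandidateHash = [u8; 32];

/// Stand-in for randomness derived from relay-chain BABE output.
fn relay_chain_randomness() -> u64 { 0xD1CE }

/// Called once the acceptance window `D` has elapsed, with every candidate
/// received during that window.
fn select_candidate(received: &[CandidateHash]) -> Option<CandidateHash> {
    let c = received.len() as u64;
    if c == 0 {
        return None;
    }
    let i = relay_chain_randomness();
    // second the candidate with index I % C
    received.get((i % c) as usize).copied()
}

fn main() {
    let candidates = [[1u8; 32], [2u8; 32], [3u8; 32]];
    assert!(select_candidate(&candidates).is_some());
    assert!(select_candidate(&[]).is_none());
}
```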
## Candidate Selection Job
- Aware of validator key and assignment
- One job for each relay-parent, which selects up to one collation for the Candidate Backing Subsystem
# PoV Distribution
This subsystem is responsible for distributing PoV blocks. For now, unified with [Statement Distribution subsystem](/node/backing/statement-distribution.html).
## Protocol
Handle requests for PoV block by candidate hash and relay-parent.
## Functionality
Implemented as a gossip system, where `PoV`s are not accepted unless we know of a corresponding `Seconded` message.
> TODO: this requires a lot of cross-contamination with statement distribution even if we don't implement this as a gossip system. In a point-to-point implementation, we still have to know _who to ask_, which means tracking who's submitted `Seconded`, `Valid`, or `Invalid` statements - by validator and by peer. One approach is to have the Statement gossip system just send us this information, and then we can separate the systems from the beginning instead of combining them.
# Statement Distribution
The Statement Distribution Subsystem is responsible for distributing statements about seconded candidates between validators.
## Protocol
`ProtocolId`: `b"stmd"`
Input:
- NetworkBridgeUpdate(update)
Output:
- NetworkBridge::RegisterEventProducer(`ProtocolId`)
- NetworkBridge::SendMessage(`[PeerId]`, `ProtocolId`, `Bytes`)
- NetworkBridge::ReportPeer(PeerId, cost_or_benefit)
## Functionality
Implemented as a gossip protocol. Register a network event producer on startup. Handle updates to our view and peers' views. Neighbor packets are used to inform peers which chain heads we are interested in data for.
Statement Distribution is the only backing subsystem which has any notion of peer nodes, who are any full nodes on the network. Validators will also act as peer nodes.
It is responsible for signing statements that we have generated and forwarding them, and for detecting a variety of Validator misbehaviors for reporting to [Misbehavior Arbitration](/node/utility/misbehavior-arbitration.html). During the Backing stage of the inclusion pipeline, it's the main point of contact with peer nodes, who distribute statements by validators. On receiving a signed statement from a peer, assuming the peer receipt state machine is in an appropriate state, it sends the Candidate Receipt to the [Candidate Backing subsystem](/node/backing/candidate-backing.html) to handle the validator's statement.
Track equivocating validators and stop accepting information from them. Forward double-vote proofs to the double-vote reporting system. Establish a data-dependency order:
- In order to receive a `Seconded` message, we must have the corresponding chain head in our view
- In order to receive an `Invalid` or `Valid` message we must have received the corresponding `Seconded` message.
And respect this data-dependency order from our peers by respecting their views. This subsystem is responsible for checking message signatures.
The Statement Distribution subsystem sends statements to peer nodes and detects double-voting by validators. When validators conflict with each other or themselves, the Misbehavior Arbitration system is notified.
## Peer Receipt State Machine
There is a very simple state machine which governs which messages we are willing to receive from peers. Not depicted in the state machine: on initial receipt of any [`SignedStatement`](/type-definitions.html#signed-statement-type), validate that the provided signature does in fact sign the included data. Note that each individual parablock candidate gets its own instance of this state machine; it is perfectly legal to receive a `Valid(X)` before a `Seconded(Y)`, as long as a `Seconded(X)` has been received.
A: Initial State. Receive `SignedStatement(Statement::Second)`: extract `Statement`, forward to Candidate Backing, proceed to B. Receive any other `SignedStatement` variant: drop it.
B: Receive any `SignedStatement`: extract `Statement`, forward to Candidate Backing. Receive `OverseerMessage::StopWork`: proceed to C.
C: Receive any message for this block: drop it.
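The following sketch captures this per-candidate state machine; the transition into the stopped state (on `StopWork`) is represented only by the `Stopped` variant, and signature checking is assumed to have already happened. Names are illustrative.

```rust
enum Statement { Seconded, Valid, Invalid }

enum PeerReceiptState {
    AwaitingSeconded, // A: initial state
    Accepting,        // B: a `Seconded` has been seen for this candidate
    Stopped,          // C: entered on `StopWork` for this block
}

/// Returns the new state and whether the statement should be forwarded to
/// Candidate Backing.
fn on_statement(state: PeerReceiptState, statement: &Statement) -> (PeerReceiptState, bool) {
    match (state, statement) {
        // A: only a `Seconded` statement moves us forward; anything else is dropped.
        (PeerReceiptState::AwaitingSeconded, Statement::Seconded) => (PeerReceiptState::Accepting, true),
        (PeerReceiptState::AwaitingSeconded, _) => (PeerReceiptState::AwaitingSeconded, false),
        // B: forward any statement about this candidate.
        (PeerReceiptState::Accepting, _) => (PeerReceiptState::Accepting, true),
        // C: drop everything for this block.
        (PeerReceiptState::Stopped, _) => (PeerReceiptState::Stopped, false),
    }
}

fn main() {
    let (state, forward) = on_statement(PeerReceiptState::AwaitingSeconded, &Statement::Valid);
    assert!(!forward); // `Valid` before `Seconded` is dropped
    let (state, forward) = on_statement(state, &Statement::Seconded);
    assert!(forward);
    let (_state, forward) = on_statement(state, &Statement::Valid);
    assert!(forward);
}
```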
## Peer Knowledge Tracking
The peer receipt state machine implies that for parsimony of network resources, we should model the knowledge of our peers, and help them out. For example, let's consider a case with peers A, B, and C, validators X and Y, and candidate M. A sends us a `Statement::Second(M)` signed by X. We've double-checked it, and it's valid. While we're checking it, we receive a copy of X's `Statement::Second(M)` from `B`, along with a `Statement::Valid(M)` signed by Y.
Our response to A is just the `Statement::Valid(M)` signed by Y. However, we haven't heard anything about this from C. Therefore, we send it everything we have: first a copy of X's `Statement::Second`, then Y's `Statement::Valid`.
This system implies a certain level of duplication of messages--we received X's `Statement::Second` from both our peers, and C may experience the same--but it minimizes the degree to which messages are simply dropped.
There are no jobs here; `StartWork` and `StopWork` pulses are used to control neighbor packets and what we are currently accepting.
# Collators
Collators are special nodes which bridge a parachain to the relay chain. They are simultaneously full nodes of the parachain, and at least light clients of the relay chain. Their overall contribution to the system is the generation of Proofs of Validity for parachain candidates.