From fdbbf14fa41f2f728c163502307fb957cadf61bf Mon Sep 17 00:00:00 2001
From: Pierre Krieger <pierre.krieger1708@gmail.com>
Date: Thu, 26 Mar 2020 12:26:30 +0100
Subject: [PATCH] Improve sc-network's documentation for network protocols
 (#5364)

* Improve sc-network's documentation for network protocols

* Add note about protocol id

* Apply suggestions from code review

Co-Authored-By: Max Inden <mail@max-inden.de>

Co-authored-by: Max Inden <mail@max-inden.de>
---
 substrate/client/network/src/lib.rs | 107 ++++++++++++++++++++++------
 1 file changed, 85 insertions(+), 22 deletions(-)

diff --git a/substrate/client/network/src/lib.rs b/substrate/client/network/src/lib.rs
index f29fb00a4fd..a5107a02559 100644
--- a/substrate/client/network/src/lib.rs
+++ b/substrate/client/network/src/lib.rs
@@ -90,23 +90,49 @@
 //! ## Substreams
 //!
 //! Once a connection has been established and uses multiplexing, substreams can be opened. When
-//! a substream is open, the **multistream-select** protocol is used to negotiate which protocol to
-//! use on that given substream. In practice, Substrate opens the following substreams:
-//!
-//! - We periodically open an ephemeral substream in order to ping the remote and check whether the
-//! connection is still alive. Failure for the remote to reply leads to a disconnection. This uses
-//! the libp2p ping protocol.
-//! - We periodically open an ephemeral substream in order to ask information from the remote. This
-//! is called [the `identify` protocol](https://github.com/libp2p/specs/tree/master/identify).
-//! - We periodically open ephemeral substreams for Kademlia random walk queries. Each Kademlia
-//! query is done in a new separate substream. This uses the
-//! [standard libp2p Kademlia protocol](https://github.com/libp2p/specs/pull/108).
-//! - We optionally keep a substream alive for all Substrate-based communications. The name of the
-//! protocol negotiated is based on the *protocol ID* passed as part of the network configuration.
-//! This protocol ID should be unique for each chain and prevents nodes from different chains from
-//! connecting to each other. More information below.
-//!
-//! ## The Substrate substream
+//! a substream is open, the **multistream-select** protocol is used to negotiate which protocol
+//! to use on that given substream.
+//!
+//! Protocols that are specific to a certain chain have a `<protocol-id>` in their name. This
+//! "protocol ID" is defined in the chain specifications. For example, the protocol ID of Polkadot
+//! is "dot". In the protocol names below, `<protocol-id>` must be replaced with the corresponding
+//! protocol ID.
+//!
+//! > **Note**: It is possible for the same connection to be used for multiple chains. For example,
+//! >           one can use both the `/dot/sync/2` and `/sub/sync/2` protocols on the same
+//! >           connection, provided that the remote supports them.
+//!
+//! Substrate uses the following standard libp2p protocols:
+//!
+//! - **`/ipfs/ping/1.0.0`**. We periodically open an ephemeral substream in order to ping the
+//! remote and check whether the connection is still alive. Failure for the remote to reply leads
+//! to a disconnection.
+//! - **[`/ipfs/id/1.0.0`](https://github.com/libp2p/specs/tree/master/identify)**. We
+//! periodically open an ephemeral substream in order to ask information from the remote.
+//! - **[`/ipfs/kad/1.0.0`](https://github.com/libp2p/specs/pull/108)**. We periodically open
+//! ephemeral substreams for Kademlia random walk queries. Each Kademlia query is done in a
+//! separate substream.
+//!
+//! Additionally, Substrate uses the following non-libp2p-standard protocols:
+//!
+//! - **`/substrate/<protocol-id>/<version>`** (where `<protocol-id>` must be replaced with the
+//! protocol ID of the targeted chain, and `<version>` is a number between 2 and 6). For each
+//! connection we optionally keep an additional substream for all Substrate-based communications alive.
+//! This protocol is considered legacy, and is progressively being replaced with alternatives.
+//! This is designated as "The legacy Substrate substream" in this documentation. See below for
+//! more details.
+//! - **`/<protocol-id>/sync/2`** is a request-response protocol (see below) that lets one perform
+//! requests for information about blocks. Each request is the encoding of a `BlockRequest` and
+//! each response is the encoding of a `BlockResponse`, as defined in the `api.v1.proto` file in
+//! this source tree.
+//! - **`/<protocol-id>/light/2`** is a request-response protocol (see below) that lets one perform
+//! light-client-related requests for information about the state. Each request is the encoding of
+//! a `light::Request` and each response is the encoding of a `light::Response`, as defined in the
+//! `light.v1.proto` file in this source tree.
+//! - Notifications protocols that are registered using the `register_notifications_protocol`
+//! method. For example: `/paritytech/grandpa/1`. See below for more information.
+//!
+//! ## The legacy Substrate substream
 //!
 //! Substrate uses a component named the **peerset manager (PSM)**. Through the discovery
 //! mechanism, the PSM is aware of the nodes that are part of the network and decides which nodes
@@ -119,8 +145,8 @@
 //! Note that at the moment there is no mechanism in place to solve the issues that arise where the
 //! two sides of a connection open the unique substream simultaneously. In order to not run into
 //! issues, only the dialer of a connection is allowed to open the unique substream. When the
-//! substream is closed, the entire connection is closed as well. This is a bug, and should be
-//! fixed by improving the protocol.
+//! substream is closed, the entire connection is closed as well. This is a bug that will be
+//! resolved by deprecating the protocol entirely.
 //!
 //! Within the unique Substrate substream, messages encoded using
 //! [*parity-scale-codec*](https://github.com/paritytech/parity-scale-codec) are exchanged.
@@ -137,9 +163,46 @@
 //! substream open with is chosen, and the information is requested from it.
 //! - Gossiping. Used for example by grandpa.
 //!
-//! It is intended that in the future each of these components gets more isolated, so that they
-//! are free to open and close their own substreams, and so that syncing and light client requests
-//! are able to communicate with nodes outside of the range of the PSM.
+//! ## Request-response protocols
+//!
+//! A so-called request-response protocol is defined as follow:
+//!
+//! - When a substream is opened, the opening side sends a message whose content is
+//! protocol-specific. The message must be prefixed with an
+//! [LEB128-encoded number](https://en.wikipedia.org/wiki/LEB128) indicating its length. After the
+//! message has been sent, the writing side is closed.
+//! - The remote sends back the response prefixed with a LEB128-encoded length, and closes its
+//! side as well.
+//!
+//! Each request is performed in a new separate substream.
+//!
+//! ## Notifications protocols
+//!
+//! A so-called notifications protocol is defined as follow:
+//!
+//! - When a substream is opened, the opening side sends a handshake message whose content is
+//! protocol-specific. The handshake message must be prefixed with an
+//! [LEB128-encoded number](https://en.wikipedia.org/wiki/LEB128) indicating its length. The
+//! handshake message can be of length 0, in which case the sender has to send a single `0`.
+//! - The receiver then either immediately closes the substream, or answers with its own
+//! LEB128-prefixed protocol-specific handshake response. The message can be of length 0, in which
+//! case a single `0` has to be sent back. The receiver is then encouraged to close its sending
+//! side.
+//! - Once the handshake has completed, the notifications protocol is unidirectional. Only the
+//! node which initiated the substream can push notifications. If the remote wants to send
+//! notifications as well, it has to open its own undirectional substream.
+//! - Each notification must be prefixed with an LEB128-encoded length. The encoding of the
+//! messages is specific to each protocol.
+//!
+//! The API of `sc-network` allows one to register user-defined notification protocols.
+//! `sc-network` automatically tries to open a substream towards each node for which the legacy
+//! Substream substream is open. The handshake is then performed automatically.
+//!
+//! For example, the `sc-finality-grandpa` crate registers the `/paritytech/grandpa/1`
+//! notifications protocol.
+//!
+//! At the moment, for backwards-compatibility, notification protocols are tied to the legacy
+//! Substrate substream. In the future, though, it will no longer be the case.
 //!
 //! # Usage
 //!
-- 
GitLab