Alexandru Gheorghe authored
There is a problem in the way we update `authority-discovery` next keys:
nodes that enter the active set are noticed at the start of the session
in which they become active, instead of at the start of the previous
session as intended. This is problematic because:

1. The node itself advertises its addresses on the DHT only when it
notices it should become active, in a loop that runs roughly every 10m,
so in this case it would notice only after it becomes active.
2. The other nodes won't be able to detect the new node's addresses at
the beginning of the session, so they won't add it to the reserved
set.

With 1 + 2, we end up in a situation where the new node won't be
able to properly connect to its peers because it won't be in its peers'
reserved set. Now, the nodes accept by default `MIN_GOSSIP_PEERS: usize =
25` connections to nodes that are not in the reserved set, but given
Kusama's size (> 1000 nodes) you could easily have more than `25` new
nodes entering the active set, or the nodes simply don't have slots
anymore because they already have connections to peers not in the active
set.
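
The slot exhaustion described above can be illustrated with a toy model. This is only a sketch: `PeerSlots`, `try_accept` and the use of `u32` as a peer id are hypothetical and do not mirror the real networking code; only the `MIN_GOSSIP_PEERS` value comes from the text.

```rust
use std::collections::HashSet;

/// Value quoted in the text; everything else here is a hypothetical model.
const MIN_GOSSIP_PEERS: usize = 25;

/// Toy peer set where `u32` stands in for a peer id.
struct PeerSlots {
    reserved: HashSet<u32>,
    non_reserved: HashSet<u32>,
}

impl PeerSlots {
    fn new(reserved: impl IntoIterator<Item = u32>) -> Self {
        Self { reserved: reserved.into_iter().collect(), non_reserved: HashSet::new() }
    }

    /// Returns true if the incoming connection from `peer` is accepted.
    fn try_accept(&mut self, peer: u32) -> bool {
        // Reserved peers and already connected peers are always fine.
        if self.reserved.contains(&peer) || self.non_reserved.contains(&peer) {
            return true;
        }
        // Everyone else competes for the limited non-reserved slots.
        if self.non_reserved.len() < MIN_GOSSIP_PEERS {
            self.non_reserved.insert(peer);
            return true;
        }
        false
    }
}
```

In this model, a new validator that is missing from `reserved` is rejected as soon as 25 other non-reserved peers are connected, which is the situation the bug creates.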

In the end what the node would notice is 0 backing rewards because it
wasn't directly connected to the peers in its backing group.

## Root-cause

The flow is like this:
1. At BAD_SESSION - 1, in `rotate_session` new nodes are added to
QueuedKeys
https://github.com/paritytech/polkadot-sdk/blob/02e1a7f4/substrate/frame/session/src/lib.rs#L609
```
<QueuedKeys<T>>::put(queued_amalgamated.clone());
<QueuedChanged<T>>::put(next_changed);
```
2. `AuthorityDiscovery::on_new_session` is called with `changed` being the
value of `<QueuedChanged<T>>` at BAD_SESSION - **2**, because it was
read before being updated
https://github.com/paritytech/polkadot-sdk/blob/02e1a7f4/substrate/frame/session/src/lib.rs#L613
3. At BAD_SESSION - 1, `AuthorityDiscovery::on_new_session` doesn't
update its next_keys because `changed` was false.
4. For the entire duration of `BAD_SESSION - 1`, everyone calling the
runtime api `authorities` (which should return past, present and future
authorities) won't discover the nodes that should become active.
5. At the beginning of BAD_SESSION, all nodes discover the new nodes are
authorities, but it is already too late because reserved_nodes are
updated only at the beginning of the session by `gossip-support`.
See above for why this is bad.
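
The stale flag in steps 1 and 2 can be reproduced with a small model. This is a simplified sketch, not the pallet code: `SessionState`, `Notification` and the `u32` validator ids are assumptions made for illustration, and `next_changed` is approximated as "the queued set differs from the set that just became active".

```rust
/// Toy stand-in for the session pallet storage (hypothetical, simplified).
struct SessionState {
    queued_keys: Vec<u32>,
    queued_changed: bool,
}

/// What a session handler would receive on a rotation.
#[derive(Debug, PartialEq)]
struct Notification {
    changed: bool,
    active: Vec<u32>,
    queued: Vec<u32>,
}

impl SessionState {
    fn rotate_session(&mut self, next_validators: Vec<u32>) -> Notification {
        // Read the flag before it is overwritten below, so it still
        // describes the previous rotation.
        let changed = self.queued_changed;
        // The previously queued set becomes the active set.
        let active = self.queued_keys.clone();
        let next_changed = next_validators != active;
        self.queued_keys = next_validators.clone();
        self.queued_changed = next_changed;
        // The handler sees the new queued set together with the old flag.
        Notification { changed, active, queued: next_validators }
    }
}
```

Starting from `queued_keys = [1, 2]` and `queued_changed = false`, rotating with `[1, 2, 3]` yields a notification whose `queued` already contains validator `3` while `changed` is still `false`. Only the following rotation reports `changed = true`.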

## Fix
Update next keys with the queued_validators at every session, no matter
the value of `changed`. This is the same way the babe pallet correctly
does it.
https://github.com/paritytech/polkadot-sdk/blob/02e1a7f4/substrate/frame/babe/src/lib.rs#L655
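
A minimal sketch of the difference between the two behaviours follows. The struct, the method names and the `u32` ids are hypothetical. It also assumes that the current keys stay gated on `changed`, which the description above does not state explicitly.

```rust
/// Toy stand-in for the authority-discovery storage (hypothetical).
struct AuthorityDiscovery {
    keys: Vec<u32>,
    next_keys: Vec<u32>,
}

impl AuthorityDiscovery {
    /// Behaviour described under "Root-cause": next keys are updated
    /// only when `changed` is true.
    fn on_new_session_gated(&mut self, changed: bool, validators: &[u32], queued: &[u32]) {
        if changed {
            self.keys = validators.to_vec();
            self.next_keys = queued.to_vec();
        }
    }

    /// Behaviour described under "Fix": next keys are refreshed at every
    /// session. Keeping `keys` gated is an assumption of this sketch.
    fn on_new_session_fixed(&mut self, changed: bool, validators: &[u32], queued: &[u32]) {
        if changed {
            self.keys = validators.to_vec();
        }
        self.next_keys = queued.to_vec();
    }

    /// Present and future authorities, deduplicated and sorted.
    fn authorities(&self) -> Vec<u32> {
        let mut all: Vec<u32> = self.keys.iter().chain(&self.next_keys).copied().collect();
        all.sort_unstable();
        all.dedup();
        all
    }
}
```

With `changed = false` and a queued set that already contains a new validator, the gated variant keeps returning the old authorities, while the fixed variant includes the new one a full session earlier.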

## Notes

- The issue doesn't reproduce on proof-of-authority chains like
`versi` because `changed` would always be true and `AuthorityDiscovery`
correctly updates its next_keys every time.
- Confirmed at session `37651` on kusama that this is exactly what
happens by looking at blocks with polkadot.js.

## TODO
- [ ] Move versi to proof of stake and properly test before and after
the fix to confirm there is no other issue.

---------

Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
Co-authored-by: Bastian Köcher <git@kchr.de>
8d0cd4ff

NOTE: We have recently made significant changes to our repository structure. In order to streamline our development process and foster better contributions, we have merged three separate repositories Cumulus, Substrate and Polkadot into this repository. Read more about the changes here.

Polkadot SDK

The Polkadot SDK repository provides all the resources needed to start building on the Polkadot network, a multi-chain blockchain platform that enables different blockchains to interoperate and share information in a secure and scalable way. The Polkadot SDK comprises three main pieces of software:

Polkadot

Implementation of a node for the https://polkadot.network in Rust, using the Substrate framework. This directory currently contains runtimes for the Westend and Rococo test networks. Polkadot, Kusama and their system chain runtimes are located in the runtimes repository maintained by the Polkadot Technical Fellowship.

Substrate

Substrate is the primary blockchain SDK used by developers to create the parachains that make up the Polkadot network. Additionally, it allows for the development of self-sovereign blockchains that operate completely independently of Polkadot.

Cumulus

Cumulus is a set of tools for writing Substrate-based Polkadot parachains.

Releases

[!NOTE]
Our release process is still Work-In-Progress and may not yet reflect the aspired outline here.

The Polkadot-SDK has two release channels: stable and nightly. Production software is advised to only use stable. nightly is meant for tinkerers to try out the latest features. The detailed release process is described in RELEASE.md.

Stable

stable releases have a support duration of three months. In this period, the release will not have any breaking changes. It will receive bug fixes, security fixes, performance fixes and new non-breaking features on a two-week cadence.

Nightly

nightly releases are released every night from the master branch, potentially with breaking changes. They have pre-release version numbers in the format major.0.0-nightlyYYMMDD.
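
As an illustration of the version format above, a small helper could build such a string. The function name and its parameters are hypothetical and are not part of the release tooling.

```rust
/// Builds a version string of the form `major.0.0-nightlyYYMMDD`.
/// Hypothetical helper for illustration; it does not validate the date.
fn nightly_version(major: u32, year: u32, month: u32, day: u32) -> String {
    // Two-digit year, month and day, each zero-padded.
    format!("{}.0.0-nightly{:02}{:02}{:02}", major, year % 100, month, day)
}
```

For example, `nightly_version(1, 2024, 3, 7)` gives `1.0.0-nightly240307`.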

Upstream Dependencies

Below are the primary upstream dependencies utilized in this project:

Security

The security policy and procedures can be found in docs/contributor/SECURITY.md.

Contributing & Code of Conduct

Ensure you follow our contribution guidelines. In every interaction and contribution, this project adheres to the Contributor Covenant Code of Conduct.

Additional Resources

  • For monitoring upcoming changes and current proposals related to the technical implementation of the Polkadot network, visit the Requests for Comment (RFC) repository. While it's maintained by the Polkadot Fellowship, the RFC process welcomes contributions from everyone.