- Jan 13, 2025
Alexandru Gheorghe authored
Reference hardware requirements have been bumped to at least 8 cores so we can now allocate 50% of that capacity to PVF execution. --------- Signed-off-by:
Alexandru Gheorghe <alexandru.gheorghe@parity.io>
- Nov 11, 2024
Nazar Mokrynskyi authored
# Description

This seems to be an old artifact of the long-closed https://github.com/paritytech/substrate/issues/6827 that I noticed when working on related code earlier.

## Integration

`NetworkStarter` was removed; simply remove its usage:
```diff
-let (network, system_rpc_tx, tx_handler_controller, start_network, sync_service) =
+let (network, system_rpc_tx, tx_handler_controller, sync_service) =
     build_network(BuildNetworkParams {
...
-start_network.start_network();
```

## Review Notes

The changes are trivial; the only reason for this not to be accepted is if it is desired to not start the network automatically for whatever reason, in which case the description of the network starter needs to change.

# Checklist

* [x] My PR includes a detailed description as outlined in the "Description" and its two subsections above.
* [ ] My PR follows the [labeling requirements](https://github.com/paritytech/polkadot-sdk/blob/master/docs/contributor/CONTRIBUTING.md#Process) of this project (at minimum one label for `T` required)
  * External contributors: ask maintainers to put the right label on your PR.

--------- Co-authored-by:
GitHub Action <action@github.com> Co-authored-by:
Bastian Köcher <git@kchr.de>
- Nov 07, 2024
Nazar Mokrynskyi authored
# Description

This is a continuation of https://github.com/paritytech/polkadot-sdk/pull/5666 that finally fixes https://github.com/paritytech/polkadot-sdk/issues/5333. This should allow developers to create custom syncing strategies, or even the whole syncing engine if they so desire. It also moves syncing engine creation and the addition of the corresponding protocol outside the `build_network_advanced` method, which is something Bastian expressed as desired in https://github.com/paritytech/polkadot-sdk/issues/5#issuecomment-1700816458

Here I replaced strategy-specific types and methods in the `SyncingStrategy` trait with generic ones. Specifically, `SyncingAction` is now used by all strategies instead of strategy-specific types with conversions. `StrategyKey` was an enum with a fixed set of options and is now replaced with an opaque type that strategies create privately and send to upper layers as an opaque type. Requests and responses are now handled in a generic way regardl...
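An illustrative sketch of the opaque-key idea described above; the types here are stand-ins, not the actual `sc-network-sync` definitions. Each strategy mints its key privately, and upper layers only ever compare keys:
```rust
// Illustrative sketch: an opaque key that a strategy creates privately and
// upper layers treat purely as an identifier, replacing a fixed central enum.
use std::any::TypeId;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct StrategyKey(TypeId);

impl StrategyKey {
    /// Each strategy derives its key from its own concrete type, so keys are
    /// unique per strategy without a central enum listing every variant.
    pub fn of<S: 'static>() -> Self {
        Self(TypeId::of::<S>())
    }
}

struct WarpSyncStrategy; // stand-in strategy type

fn main() {
    let key = StrategyKey::of::<WarpSyncStrategy>();
    // Upper layers can route requests/responses by comparing opaque keys.
    assert_eq!(key, StrategyKey::of::<WarpSyncStrategy>());
    println!("{key:?}");
}
```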
- Oct 25, 2024
Shoyu Vanilla (Flint) authored
Closes #4896
- Oct 24, 2024
Alexandru Gheorghe authored
The approval-voting-parallel subsystem introduced with https://github.com/paritytech/polkadot-sdk/pull/4849 has been tested on `versi` and for approximately 3 weeks on Parity's existing Kusama nodes (https://github.com/paritytech/devops/issues/3583). Things worked as expected, so enable it by default on all Kusama nodes in the next release. The next step will be enabling it by default on Polkadot if no issues arise while running on Kusama. --------- Signed-off-by:
Alexandru Gheorghe <alexandru.gheorghe@parity.io>
- Oct 15, 2024
Michal Kucharczyk authored
### Fork-Aware Transaction Pool Implementation

This PR introduces a fork-aware transaction pool (fatxpool), enhancing transaction management by maintaining a valid txpool state for different forks.

### High-level overview

The high-level overview was added to the [`sc_transaction_pool::fork_aware_txpool`](https://github.com/paritytech/polkadot-sdk/blob/3ad0a1b7/substrate/client/transaction-pool/src/fork_aware_txpool/mod.rs#L21) module. Use:
```
cargo doc --document-private-items -p sc-transaction-pool --open
```
to build the doc. It should give a good overview and a nice entry point into the new pool's mechanics.

<details> <summary>Quick overview (documentation excerpt)</summary>

#### View

For every fork, a view is created. The view is a persisted state of the transaction pool computed and updated at the tip of the fork. The view is built around the existing `ValidatedPool` structure. (A minimal sketch of the view idea follows this entry.)

A view is created on every new best block notification. To create a view, one of the existing views is chosen and cloned. When the chain progresses, the view is kept in the cache (`retracted_views`) to allow building blocks upon intermediary blocks in the fork. The views are deleted on finalization: views lower than the finalized block are removed.

The views are updated with the transactions from the mempool: all transactions are sent to the newly created views. A maintain process is also executed for the newly created views, basically resubmitting and pruning transactions from the appropriate tree route.

##### View store

The view store is the helper structure that acts as a container for all the views. It provides some convenient methods.

##### Submitting transactions

Every transaction is submitted to every view at the tips of the forks. Retracted views are not updated. Every transaction also goes into the mempool.

##### Internal mempool

In short, the main purpose of the internal mempool is to prevent a transaction from being lost, which could happen when a transaction is invalid on one fork and valid on another. It also allows the txpool to accept transactions when no blocks have been reported yet. The mempool removes its transactions when they get finalized. Transactions are also periodically verified on every finalized event and removed from the mempool if no longer valid.

#### Events

Transaction events from multiple views are merged and filtered to avoid duplicated events. `Ready` / `Future` / `InBlock` events originate in the views and are de-duplicated and forwarded to external listeners. `Finalized` events originate in the fork-aware-txpool logic. `Invalid` events require special care and can originate in both the view and the fork-aware-txpool logic.

#### Light maintain

Sometimes the transaction pool does not have enough time to prepare a fully maintained view with all retracted transactions revalidated. To avoid providing an empty ready-transaction set to the block builder (which would result in an empty block), light maintain was implemented. It simply removes the already-imported transactions from the ready iterator.

#### Revalidation

Revalidation is performed for every view. The revalidation process is started after a trigger is executed. The revalidation work is terminated just after a new best block / finalized event is notified to the transaction pool. The revalidation result is applied to the newly created view, which is built upon the revalidated view. Additionally, parts of the mempool are also revalidated to make sure that no transactions are stuck in the mempool.
#### Logs

The most important log line, allowing you to understand the state of the txpool, is:
```
maintain: txs:(0, 92) views:[2;[(327, 76, 0), (326, 68, 0)]] event:Finalized { hash: 0x8...f, tree_route: [] } took:3.463522ms
```
Reading the fields left to right:
- `txs:(0, 92)`: unwatched and watched txs in the mempool,
- `views:[2;...]`: number of views,
- `(327, 76, 0)`: 1st view's block #, number of ready txs, number of future txs,
- `(326, 68, 0)`: 2nd view's block #, number of ready txs, number of future txs,
- `event:...`: the event that triggered the maintenance,
- `took:...`: the duration of the maintenance.

It is logged after the maintenance is done. The `debug` level enables per-transaction logging, allowing you to keep track of all transaction-related actions that happened in the txpool.

</details>

### Integration notes

For teams having a custom node, the new txpool needs to be instantiated, typically in the `service.rs` file; here is an example: https://github.com/paritytech/polkadot-sdk/blob/9c547ff3/cumulus/polkadot-omni-node/lib/src/common/spec.rs#L152-L161

To enable the new transaction pool, the following CLI arg shall be specified: `--pool-type=fork-aware`. If it works, there shall be information printed in the log:
```
2024-09-20 21:28:17.528 INFO main txpool: [Parachain] creating ForkAware txpool.
```

For debugging, the following log targets shall be enabled:
```
"-lbasic-authorship=debug",
"-ltxpool=debug",
```
*note:* `trace` for txpool enables per-transaction logging.

### Future work

The current implementation seems to be stable; however, further improvements are required. Here is the umbrella issue for future work:
- https://github.com/paritytech/polkadot-sdk/issues/5472

Partially fixes: #1202

--------- Co-authored-by:
Bastian Köcher <git@kchr.de> Co-authored-by:
Sebastian Kunert <skunert49@gmail.com> Co-authored-by:
Iulian Barbu <14218860+iulianbarbu@users.noreply.github.com>
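A minimal sketch of the per-fork view idea from the overview above, with stand-in types (a real block hash is not ordered by height; the `u64` here doubles as a height purely for brevity):
```rust
// Illustrative sketch of the per-fork "view" idea (not the actual
// sc-transaction-pool types): one pool snapshot per fork tip, cloned from an
// existing view when a new best block is announced.
use std::collections::HashMap;

type BlockHash = u64; // stand-in: doubles as block height for simplicity
type Tx = String;     // stand-in for a real transaction

#[derive(Clone, Default)]
struct View {
    ready: Vec<Tx>, // transactions valid at this fork tip
}

#[derive(Default)]
struct ViewStore {
    views: HashMap<BlockHash, View>,
}

impl ViewStore {
    /// Create a view for a new best block by cloning the parent's view;
    /// fall back to an empty view if the parent is unknown.
    fn on_new_best_block(&mut self, parent: BlockHash, new_tip: BlockHash) {
        let view = self.views.get(&parent).cloned().unwrap_or_default();
        self.views.insert(new_tip, view);
    }

    /// Submit a transaction to every view (i.e. to every fork tip).
    fn submit(&mut self, tx: Tx) {
        for view in self.views.values_mut() {
            view.ready.push(tx.clone());
        }
    }

    /// On finalization, drop views at or below the finalized block.
    fn on_finalized(&mut self, finalized: BlockHash) {
        self.views.retain(|hash, _| *hash > finalized);
    }
}

fn main() {
    let mut store = ViewStore::default();
    store.on_new_best_block(0, 1); // new best block 1 on top of parent 0
    store.submit("xt1".into());
    store.on_finalized(0);
    println!("views: {}, ready at tip: {}", store.views.len(), store.views[&1].ready.len());
}
```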
- Oct 04, 2024
Alexandru Gheorghe authored
Jaeger tracing went mostly unused and created bigger problems, like wasted CPU and memory leaks, so remove it entirely. Fixes: https://github.com/paritytech/polkadot-sdk/issues/4995 --------- Signed-off-by:
Alexandru Gheorghe <alexandru.gheorghe@parity.io>
- Sep 26, 2024
Alexandru Gheorghe authored
This is the implementation of the approach described here: https://github.com/paritytech/polkadot-sdk/issues/1617#issuecomment-2150321612 & https://github.com/paritytech/polkadot-sdk/issues/1617#issuecomment-2154357547 & https://github.com/paritytech/polkadot-sdk/issues/1617#issuecomment-2154721395.

## Description of changes

The end goal is to have an architecture where we have a single subsystem (`approval-voting-parallel`) and multiple worker types that fulfill the work currently done by the `approval-distribution` and `approval-voting` subsystems. The main loop of the new subsystem just distributes work to the workers.

The new subsystem will have:
- N approval-distribution workers: these do the work currently done by the approval-distribution subsystem and, in addition, perform the crypto-checks that an assignment is valid and that a vote is correctly signed. Work is assigned via the following formula: `worker_index = msg.validator % WORKER_COUNT`, which guarantees that all assignments and approvals from the same validator reach the same worker. (A minimal sketch of this routing rule follows this entry.)
- 1 approval-voting worker: this receives an already-valid message and does everything the approval-voting subsystem currently does, except the crypto-checking, which has already been moved to the approval-distribution workers.

On the hot path of processing messages **no** synchronisation and waiting is needed between approval-distribution and approval-voting workers.

<img width="1431" alt="Screenshot 2024-06-07 at 11 28 08" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/a196199b-b705-4140-87d4-c6900ba8595e">

## Guidelines for reading

The full implementation is broken into 5 PRs, all of them self-contained, improving things incrementally even without the parallelisation being implemented/enabled. This approach was taken instead of a big-bang PR to make things easier to review and to reduce the risk of breaking these critical subsystems. After reading the full description of this PR, the changes should be read in the following order:
1. https://github.com/paritytech/polkadot-sdk/pull/4848, some micro-optimizations for networks with a high number of validators. This change gives us a speed-up by itself without any other changes.
2. https://github.com/paritytech/polkadot-sdk/pull/4845, containing only interface changes to decouple the subsystem from the `Context` and to be able to run multiple instances of the subsystem on different threads. **No functional changes**
3. https://github.com/paritytech/polkadot-sdk/pull/4928, moving the crypto checks from approval-voting into approval-distribution, so that approval-distribution no longer has any reason to wait on approval-voting. This change gives us a speed-up by itself without any other changes.
4. https://github.com/paritytech/polkadot-sdk/pull/4846, interface changes to make approval-voting runnable on a separate thread. **No functional changes**
5. This PR, where we instantiate an `approval-voting-parallel` subsystem that runs the logic currently in `approval-distribution` and `approval-voting` on different workers.
6. The next step after these changes get merged and deployed would be to bring all the files from approval-distribution, approval-voting, and approval-voting-parallel into a single Rust crate, to make the structure easier to maintain and understand.
## Results

Running subsystem-benchmarks with 1000 validators, 100 fully occupied cores, and triggering all assignments and approvals for all tranches.

#### Approval does not lag behind

Master
```
Chain selection approved after 72500 ms hash=0x0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a
```
With this PoC
```
Chain selection approved after 3500 ms hash=0x0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a
```

#### Gathering enough assignments

Enough assignments are gathered in less than 500 ms, which gives us a guarantee that unnecessary work does not get triggered; on master, on the same benchmark, that number goes above 32 seconds because the subsystems fall behind on work.

<img width="2240" alt="Screenshot 2024-06-20 at 15 48 22" src="https://github.com/paritytech/polkadot-sdk/assets/49718502/d2f2b29c-5ff6-44b4-a245-5b37ab8e58bc">

#### CPU usage

Master
```
CPU usage, seconds              total     per block
approval-distribution         96.9436       9.6944
approval-voting              117.4676      11.7468
test-environment              44.0092       4.4009
```
With this PoC
```
CPU usage, seconds              total     per block
approval-distribution          0.0014       0.0001   --- unused
approval-voting                0.0437       0.0044   --- unused
approval-voting-parallel       5.9560       0.5956
approval-voting-parallel-0    22.9073       2.2907
approval-voting-parallel-1    23.0417       2.3042
approval-voting-parallel-2    22.0445       2.2045
approval-voting-parallel-3    22.7234       2.2723
approval-voting-parallel-4    21.9788       2.1979
approval-voting-parallel-5    23.0601       2.3060
approval-voting-parallel-6    22.4805       2.2481
approval-voting-parallel-7    21.8330       2.1833
approval-voting-parallel-db   37.1954       3.7195   --- the approval-voting thread
```

# Enablement strategy

Because only some trivial plumbing is needed in approval-distribution and approval-voting to be able to run things in parallel, and because these subsystems play a critical part in the system, this PR proposes that we keep both ways of running the approval work: as separate subsystems, and as a single subsystem (`approval-voting-parallel`) with multiple workers for the distribution work and one worker for the approval-voting work, switching between them with a command-line flag. The benefit of this is twofold:
1. With the same polkadot binary we can easily switch just a few validators to the parallel approach and gradually make it the default way of running, if no issues arise.
2. In the worst-case scenario, where it becomes the default way of running things but we discover critical issues with it, we have a path to quickly disable it by asking validators to adjust their command-line flags.

# Next steps
- [x] Make sure through various testing we are not missing anything
- [x] Polish the implementations to make them production ready
- [x] Add unit tests for approval-voting-parallel.
- [x] Define and implement the strategy for rolling out this change, so that the blast radius is minimal (single validator) in case there are problems with the implementation.
- [x] Versi long-running tests.
- [x] Add relevant metrics.

@ordian @eskimor @sandreim @AndreiEres, let me know what you think.

--------- Signed-off-by:
Alexandru Gheorghe <alexandru.gheorghe@parity.io>
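A minimal sketch of the work-assignment rule quoted above (`worker_index = msg.validator % WORKER_COUNT`); the worker count here is an assumed example value:
```rust
// Routing by validator index guarantees all messages from one validator land
// on the same approval-distribution worker, so no cross-worker
// synchronisation is needed on the hot path.
const WORKER_COUNT: usize = 8; // assumed example value

fn worker_index(validator_index: u32) -> usize {
    validator_index as usize % WORKER_COUNT
}

fn main() {
    // Two messages from validator 42 always hit the same worker.
    assert_eq!(worker_index(42), worker_index(42));
    println!("validator 42 -> worker {}", worker_index(42));
}
```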
- Sep 22, 2024
Branislav Kontur authored
This is a first step toward switching CI to the `frame-omni-bencher`. This PR includes several changes related to generating chain specs, plus:
- [x] pallet `assigned_slots`: fix missing `#[serde(skip)]` for phantom
- [x] pallet `paras_inherent`: benchmark fix, cherry-picked from https://github.com/paritytech/polkadot-sdk/pull/5688
- [x] migrates `get_preset` to the relevant runtimes
- [x] fixes Rococo genesis presets (https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7317249 does not work)
- [x] fixes Rococo benchmarks for CI
- [x] migrate westend genesis
- [x] remove wococo stuff

Closes: https://github.com/paritytech/polkadot-sdk/issues/5680

## Follow-ups
- Fix for frame-omni-bencher: https://github.com/paritytech/polkadot-sdk/pull/5655
- Enable new short-benchmarking CI: https://github.com/paritytech/polkadot-sdk/pull/5706
- Remove gitlab pipelines for short benchmarking
- Refactor all Cumulus runtimes to use `get_preset`: https://github.com/paritytech/polkadot-sdk/issues/5704
- https://github.com/paritytech/polkadot-sdk/issues/5705
- https://github.com/paritytech/polkadot-sdk/issues/5700
- [ ] Backport to stable

--------- Co-authored-by: command-bot <> Co-authored-by:
ordian <noreply@reusable.software>
- Sep 17, 2024
Nazar Mokrynskyi authored
# Description

Follow-up to https://github.com/paritytech/polkadot-sdk/pull/5469, mostly covering https://github.com/paritytech/polkadot-sdk/issues/5333.

The primary change here is that the syncing strategy is no longer created inside the syncing engine; instead, the syncing strategy is an argument of the syncing engine, more specifically an argument to `build_network`, which most downstream users will use. This also extracts the addition of request-response protocols outside of network construction, making sure they are physically not present when they don't need to be (imagine a syncing strategy that uses none of Substrate's protocols in its implementation, for example).

This technically allows completely replacing the syncing strategy with whatever strategy a chain might need. There will be at least one follow-up PR that will simplify the `SyncingStrategy` trait and other public interfaces to remove mentions of block/state/warp sync requests, replacing them with generic APIs, such that strategies where warp sync is not applicable don't have to provide dummy method implementations, etc.

## Integration

Downstream projects will have to write a bit of boilerplate, calling the `build_polkadot_syncing_strategy` function to create the previously default syncing strategy. (An illustrative shape sketch follows this entry.)

## Review Notes

Please review the PR through individual commits rather than the final diff; it will be easier that way. The changes are mostly just moving code around one step at a time.

# Checklist

* [x] My PR includes a detailed description as outlined in the "Description" and its two subsections above.
* [x] My PR follows the [labeling requirements](https://github.com/paritytech/polkadot-sdk/blob/master/docs/contributor/CONTRIBUTING.md#Process) of this project (at minimum one label for `T` required)
  * External contributors: ask maintainers to put the right label on your PR.
* [x] I have made corresponding changes to the documentation (if applicable)
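An illustrative shape only; the names and signatures below are assumptions for the sketch, not the real `sc-service` API. The point is that the strategy is now constructed by the downstream node and handed to the network builder, instead of being created inside the syncing engine:
```rust
// Hypothetical, simplified shapes; not the actual polkadot-sdk signatures.
trait SyncingStrategy {
    fn on_tick(&mut self);
}

// Stand-in for the previously built-in default strategy.
struct PolkadotSyncingStrategy;
impl SyncingStrategy for PolkadotSyncingStrategy {
    fn on_tick(&mut self) {}
}

struct BuildNetworkParams {
    // ... other fields elided ...
    syncing_strategy: Box<dyn SyncingStrategy>,
}

fn build_network(params: BuildNetworkParams) {
    let mut strategy = params.syncing_strategy;
    strategy.on_tick(); // the engine drives whatever strategy it was given
}

fn main() {
    // Downstream boilerplate: build the default strategy explicitly...
    let strategy = Box::new(PolkadotSyncingStrategy);
    // ...or substitute a custom one without touching the engine.
    build_network(BuildNetworkParams { syncing_strategy: strategy });
}
```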
- Sep 12, 2024
Alexandru Gheorghe authored
This is part of the work to further optimize the approval subsystems. If you want to understand the full context, start with reading https://github.com/paritytech/polkadot-sdk/pull/4849#issue-2364261568.

# Description

This PR contains changes to make it possible to run a single approval-voting instance on a worker thread, so that it can be instantiated by the approval-voting-parallel subsystem. It does not contain any functional changes; it just decouples the subsystem from the subsystem `Context` and introduces more specific trait dependencies for each function instead of all of them requiring a context. This change can be merged independently of the follow-up PRs. --------- Signed-off-by:
Alexandru Gheorghe <alexandru.gheorghe@parity.io> Co-authored-by:
Andrei Sandu <54316454+sandreim@users.noreply.github.com>
- Sep 05, 2024
Alexandru Gheorghe authored
Fixes: https://github.com/paritytech/polkadot-sdk/issues/5122. This PR extends the existing single-core `benchmark_cpu` to also build a score for the entire processor by spawning `EXPECTED_NUM_CORES` (8) threads and averaging their throughput. This is better than simply checking the number of cores, because it also covers multi-tenant environments where the OS sees a high number of available CPUs but, because it has to share them with its neighbours, the total throughput does not satisfy the minimum requirements. (A sketch of this idea follows this entry.)

## TODO
- [x] Obtain reference values on the reference hardware.

--------- Signed-off-by:
Alexandru Gheorghe <alexandru.gheorghe@parity.io>
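A minimal sketch of the scoring idea, not the real `sc-sysinfo` code: run the same single-core workload on `EXPECTED_NUM_CORES` threads at once and average the per-thread throughput, which penalises oversubscribed multi-tenant hosts:
```rust
use std::thread;
use std::time::Instant;

const EXPECTED_NUM_CORES: usize = 8;

/// Stand-in CPU workload; returns iterations completed per second.
fn benchmark_cpu_once() -> f64 {
    let start = Instant::now();
    let mut x: u64 = 0;
    let mut iters = 0u64;
    while start.elapsed().as_millis() < 200 {
        x = x.wrapping_mul(6364136223846793005).wrapping_add(1);
        iters += 1;
    }
    std::hint::black_box(x); // keep the loop from being optimized away
    iters as f64 / start.elapsed().as_secs_f64()
}

/// Run the workload on all expected cores at once and average throughput.
fn parallel_score() -> f64 {
    let handles: Vec<_> = (0..EXPECTED_NUM_CORES)
        .map(|_| thread::spawn(benchmark_cpu_once))
        .collect();
    let total: f64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    total / EXPECTED_NUM_CORES as f64
}

fn main() {
    println!("avg per-core throughput: {:.0} iters/s", parallel_score());
}
```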
- Sep 02, 2024
Nazar Mokrynskyi authored
This improves the `sc-service` API by not requiring the whole `&Configuration`, using specific configuration options instead. `RpcConfiguration` was also extracted from `Configuration` to group all RPC options together. (An illustrative sketch follows this entry.)

We don't use Substrate's CLI and would rather not use `Configuration` either, but some key public functions require it even though they ignore most of the fields anyway. `RpcConfiguration` is very helpful not just for consolidating the fields, but also for finally making RPC optional for our use case; Substrate still runs an RPC server on localhost even if the listening address is explicitly set to `None`, which is annoying (and I suspect there is a reason for it, so I didn't want to change the default just yet).

While this is a breaking change, most developers will not notice it if they use higher-level APIs. Fixes https://github.com/paritytech/polkadot-sdk/issues/2897 --------- Co-authored-by:
Niklas Adolfsson <niklasadolfsson1@gmail.com>
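A sketch of the consolidation idea; the field names below are illustrative assumptions, not the exact `sc-service` definitions:
```rust
use std::net::SocketAddr;

// Hypothetical, simplified grouping of RPC options into one struct.
struct RpcConfiguration {
    addr: Option<Vec<SocketAddr>>, // `None` can now genuinely mean "no RPC server"
    max_connections: u32,
}

fn spawn_rpc_server(rpc: &RpcConfiguration) {
    match &rpc.addr {
        // Previously a localhost server was started even for `None`.
        None => println!("RPC disabled, nothing to bind"),
        Some(addrs) => println!("binding RPC on {addrs:?}, max {} connections", rpc.max_connections),
    }
}

fn main() {
    let rpc = RpcConfiguration { addr: None, max_connections: 100 };
    spawn_rpc_server(&rpc);
}
```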
- Aug 28, 2024
Niklas Adolfsson authored
rpc server: listen to `ipv6 socket` if available and `--experimental-rpc-endpoint` CLI option (#4792)

Close https://github.com/paritytech/polkadot-sdk/issues/3488, https://github.com/paritytech/polkadot-sdk/issues/4331

This changes/adds the following:
1. By default, substrate starts an RPC server that listens on localhost on both IPv4 and IPv6 on the same port. IPv6 is allowed to fail, because some platforms may not support it.
2. A new RPC CLI option `--experimental-rpc-endpoint`, which allows configuring arbitrary listen addresses including the port; if this is enabled, no other interfaces are enabled.
3. If the local address is not found for any of the sockets, the server is not started and an error is thrown.
4. Removes `deny_unsafe` from the RPC implementations; instead, this is an extension, allowing different policies for different interfaces/sockets, such that one may enable unsafe methods on a local interface and safe-only on the external interface. So, for instance, in this PR it's now possible to start up three RPC...
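A minimal sketch of point 1 above (not the actual jsonrpsee-based server): bind localhost on both IPv4 and IPv6 on the same port, tolerating an IPv6 failure on platforms that don't support it:
```rust
use std::net::TcpListener;

fn bind_local_rpc(port: u16) -> std::io::Result<Vec<TcpListener>> {
    let mut listeners = Vec::new();
    // The IPv4 bind must succeed...
    listeners.push(TcpListener::bind(("127.0.0.1", port))?);
    // ...but IPv6 is allowed to fail.
    match TcpListener::bind(("::1", port)) {
        Ok(l) => listeners.push(l),
        Err(e) => eprintln!("IPv6 localhost bind failed, continuing: {e}"),
    }
    Ok(listeners)
}

fn main() -> std::io::Result<()> {
    for l in bind_local_rpc(9944)? {
        println!("listening on {}", l.local_addr()?);
    }
    Ok(())
}
```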
- Aug 23, 2024
Nazar Mokrynskyi authored
I'm not sure if this is exactly what https://github.com/paritytech/polkadot-sdk/issues/3537 meant, but I think it should be fine to wait for the relay chain before fully initializing the parachain node, which removes the need for a background task and extra hacks throughout the stack just to know where warp sync should start. Previously there were both `WarpSyncParams` and `WarpSyncConfig`, but there was no longer any point in having two data structures, so I simplified it to just `WarpSyncConfig`. Fixes https://github.com/paritytech/polkadot-sdk/issues/3537
- Jul 23, 2024
Alexandru Vasile authored
This PR extends the metrics exposed by the peerstore with the total number of banned peers. The new metric is exposed under `substrate_sub_libp2p_peerset_num_banned_peers`.

To easily extend metrics in the future, `fn num_known_peers` is removed in favor of `fn status`. While at it, enable the metrics for litep2p:
- total number of peers from peerstore (needed to debug memory consumption)
- total number of banned peers from peerstore (needed to debug reputation bans and disconnects)

Added a couple of tests to validate that the number of banned peers is exposed properly. Part of: https://github.com/paritytech/polkadot-sdk/issues/4681

### Testing Done

Using [subp2p-explorer](https://github.com/lexnv/subp2p-explorer), I submitted random data on the tx protocol. The peer gets banned, the number of banned peers is incremented, then the peer is disconnected. cc @paritytech/networking --------- Signed-off-by:
Alexandru Vasile <alexandru.vasile@parity.io> Co-authored-by:
Dmitry Markin <dmitry@markin.tech>
- Jun 05, 2024
Oliver Tale-Yazdi authored
Inherited workspace dependencies cannot be renamed by the crate using them (see [1](https://github.com/rust-lang/cargo/issues/12546), [2](https://stackoverflow.com/questions/76792343/can-inherited-dependencies-in-rust-be-aliased-in-the-cargo-toml-file)). Since we want to use inherited workspace dependencies everywhere, we first need to unify all aliases used for a dependency throughout the workspace. The umbrella crate is currently excluded from this procedure, since it should be able to export the crates by their original names without much hassle.

For example: one crate may alias `parity-scale-codec` to `codec`, while another crate does not alias it at all. After this change, all crates have to use `codec` as the name. The problematic combinations were:
- conflicting aliases: most crates alias a dep as `A` but some use `B`.
- missing alias: most of the crates alias a dep but some don't.
- superfluous alias: most crates don't alias a dep but some do.

The script that I used first determines whether most crates opted to alias a dependency or not. From that info it decides whether to use an alias or not. If it decided to use an alias, the most common one is used everywhere. To reproduce, I used [this](https://github.com/ggwpez/substrate-scripts/blob/master/uniform-crate-alias.py) Python script in combination with [this](https://github.com/ggwpez/zepter/blob/38ad10585fe98a5a86c1d2369738bc763a77057b/renames.json) error output from Zepter. --------- Signed-off-by:
Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> Co-authored-by:
Bastian Köcher <git@kchr.de>
- May 30, 2024
drskalman authored
Revived version of https://github.com/paritytech/substrate/pull/13311, except `Signature` is not generic and is dictated by `AuthorityId`. --------- Co-authored-by:
Davide Galassi <davxy@datawok.net> Co-authored-by:
Robert Hambrock <roberthambrock@gmail.com> Co-authored-by:
Adrian Catangiu <adrian@parity.io>
- May 28, 2024
Alin Dima authored
**Don't look at the commit history, it's confusing, as this branch is based on another branch that was merged**

Fixes #598. Also implements [RFC #47](https://github.com/polkadot-fellows/RFCs/pull/47).

## Description

- Availability-recovery now first attempts to request the systematic chunks for large PoVs (which are the first ~n/3 chunks, and can recover the full data without the costly reed-solomon decoding process). This falls back to recovering from all chunks if for some reason the process fails. Additionally, backers are also used as a backup for requesting the systematic chunks if the assigned validator is not offering the chunk (each backer is only used for one systematic chunk, so as not to overload them).
- Quite obviously, recovering from systematic chunks is much faster than recovering from regular chunks (4000% faster, as measured on my Apple M2 Pro).
- Introduces a `ValidatorIndex` -> `ChunkIndex` mapping which is different for every core, in order to avoid querying only the first n/3 validators over and over again in the same session. The mapping is the one described in RFC 47. (A simplified illustration follows this entry.)
- The mapping is feature-gated by the [NodeFeatures runtime API](https://github.com/paritytech/polkadot-sdk/pull/2177) so that it can only be enabled via a governance call once a sufficient majority of validators have upgraded their client. If the feature is not enabled, the mapping will be the identity mapping and backwards compatibility will be preserved.
- Adds a new chunk request protocol version (v2), which adds the ChunkIndex to the response. This may or may not be checked against the expected chunk index: for av-distribution and systematic recovery it will be checked, but not for regular recovery. This is backwards compatible. First, a v2 request is attempted; if that fails during protocol negotiation, v1 is used.
- Systematic recovery is only attempted during approval-voting, where we have easy access to the core_index. For disputes and collator pov_recovery, regular chunk requests are used, just as before.

## Performance results

Some results from subsystem-bench:
- with regular chunk recovery: CPU usage per block 39.82s
- with recovery from backers: CPU usage per block 16.03s
- with systematic recovery: CPU usage per block 19.07s

End-to-end results here: https://github.com/paritytech/polkadot-sdk/issues/598#issuecomment-1792007099

#### TODO:
- [x] [RFC #47](https://github.com/polkadot-fellows/RFCs/pull/47)
- [x] merge https://github.com/paritytech/polkadot-sdk/pull/2177 and rebase on top of those changes
- [x] merge https://github.com/paritytech/polkadot-sdk/pull/2771 and rebase
- [x] add tests
- [x] preliminary performance measure on Versi: see https://github.com/paritytech/polkadot-sdk/issues/598#issuecomment-1792007099
- [x] Rewrite the implementer's guide documentation
- [x] https://github.com/paritytech/polkadot-sdk/pull/3065
- [x] https://github.com/paritytech/zombienet/issues/1705 and fix zombienet tests
- [x] security audit
- [x] final Versi test and performance measure

--------- Signed-off-by:
alindima <alin@parity.io> Co-authored-by:
Javier Viola <javier@parity.io>
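A simplified illustration only: the real mapping is the one specified in RFC 47 and feature-gated via `NodeFeatures`; this sketch merely shows the shape of a per-core rotation, so that different cores draw their systematic chunks from different validator subsets:
```rust
// NOT the RFC 47 formula; an illustrative per-core rotation only.
fn chunk_index(validator_index: u32, core_index: u32, n_validators: u32) -> u32 {
    // With the feature disabled, the mapping is the identity (core offset 0).
    (validator_index + core_index) % n_validators
}

fn main() {
    let n = 9; // 9 validators => the first ~n/3 chunks are the systematic ones
    // Systematic recovery fetches chunks 0..n/3; the per-core rotation means
    // those chunks live on different validators for different cores.
    for core in 0..3u32 {
        let holders: Vec<u32> =
            (0..n).filter(|v| chunk_index(*v, core, n) < n / 3).collect();
        println!("core {core}: systematic chunks held by validators {holders:?}");
    }
}
```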
- May 27, 2024
Michal Kucharczyk authored
This PR removes deprecated code:
- The `RuntimeGenesisConfig` generic type parameter in the `GenericChainSpec` struct.
- The `ChainSpec::from_genesis` method, which allowed creating a chain spec using a closure providing the runtime genesis struct.
- The `GenesisSource::Factory` variant, together with the no-longer-needed generic parameter `G` of `GenesisSource` (which was intended to be a runtime genesis struct).

https://github.com/paritytech/polkadot-sdk/blob/17b56fae/substrate/client/chain-spec/src/chain_spec.rs#L559-L563
- May 24, 2024
Andrei Sandu authored
availability-recovery: bump chunk fetch threshold to 1MB for Polkadot and 4MB for Kusama + testnets (#4399)

This change ensures that we minimize the CPU time spent in reed-solomon by only doing the re-encoding into chunks if the PoV size is less than 4MB (which currently means all PoVs). Based on subsystem benchmark results we concluded that it is safe to bump this number higher. (A sketch of the threshold decision follows this entry.)

In the worst-case scenario, the network pressure for a backing group of 5 is around 25% of the network bandwidth in the hardware specs. Assuming 6s block times (max_candidate_depth 3) and needed_approvals 30, the bandwidth usage of a backing group would hover above `30 * 4 * 3 = 360MB` per relay chain block. Given a backing group of 5, that gives 72MB per block per validator -> 12 MB/s.

<details> <summary>Reality check on Kusama PoV sizes (click for chart)</summary> <br> <img width="697" alt="Screenshot 2024-05-07 at 14 30 38" src="https://github.com/paritytech/polkadot-sdk/assets/54316454/bfed32d4-8623-48b0-9ec0-8b95dd2a9d8c"> </details>

--------- Signed-off-by:
Andrei Sandu <andrei-mihail@parity.io>
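A sketch of the threshold decision described above; the constants mirror the PR description (1MB Polkadot, 4MB Kusama + testnets), while the surrounding types are stand-ins:
```rust
const FETCH_FULL_THRESHOLD_POLKADOT: usize = 1024 * 1024;   // 1 MB
const FETCH_FULL_THRESHOLD_KUSAMA: usize = 4 * 1024 * 1024; // 4 MB

enum RecoveryStrategy {
    /// Fetch the whole PoV from backers, then re-encode into chunks to verify
    /// the erasure root (cheaper than a full reed-solomon decode).
    FullFromBackers,
    /// Fetch chunks from validators and reed-solomon decode.
    FromChunks,
}

fn pick_strategy(pov_size: usize, threshold: usize) -> RecoveryStrategy {
    if pov_size <= threshold {
        RecoveryStrategy::FullFromBackers
    } else {
        RecoveryStrategy::FromChunks
    }
}

fn main() {
    match pick_strategy(3 * 1024 * 1024, FETCH_FULL_THRESHOLD_KUSAMA) {
        RecoveryStrategy::FullFromBackers => println!("3 MB PoV on Kusama: fetch full PoV"),
        RecoveryStrategy::FromChunks => println!("3 MB PoV on Kusama: fetch chunks"),
    }
}
```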
- May 07, 2024
jimdssd authored
- Apr 24, 2024
Alexandru Gheorghe authored
Part of https://github.com/paritytech/polkadot-sdk/issues/4126. We want to safely increase `execute_workers_max_num` gradually from chain to chain and assess whether there are any negative impacts. This PR performs the necessary plumbing to be able to increase it based on the chain id. It increases the number of execution workers from 2 to 4 on test networks, but leaves Kusama and Polkadot unchanged until we gather more data. (A sketch follows below.) --------- Signed-off-by:
Alexandru Gheorghe <alexandru.gheorghe@parity.io>
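A sketch of the per-chain plumbing described above; the worker counts follow the PR text (4 on test networks, 2 kept on Kusama/Polkadot), while the chain-id matching is illustrative:
```rust
fn execute_workers_max_num(chain_id: &str) -> usize {
    match chain_id {
        // Production networks stay unchanged until more data is gathered.
        "polkadot" | "kusama" => 2,
        // Test networks get the bumped value.
        _ => 4,
    }
}

fn main() {
    println!("westend: {} execution workers", execute_workers_max_num("westend"));
    println!("polkadot: {} execution workers", execute_workers_max_num("polkadot"));
}
```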
- Apr 08, 2024
Aaro Altonen authored
[litep2p](https://github.com/altonen/litep2p) is a libp2p-compatible P2P networking library. It supports all the features of `rust-libp2p` that are currently utilized by the Polkadot SDK. Compared to `rust-libp2p`, `litep2p` has a quite different architecture, which is why the new `litep2p` network backend is only able to reuse a little of the existing code in `sc-network`. The design has been mainly influenced by how we'd wish to structure our networking-related code in the Polkadot SDK: independent higher-level protocols directly communicating with the network over links that support bidirectional backpressure. A good example would be the `NotificationHandle`/`RequestResponseHandle` abstractions which allow, e.g., `SyncingEngine` to directly communicate with peers to announce/request blocks.

I've tried running `polkadot --network-backend litep2p` with a few different peer configurations and there is a noticeable reduction in networking CPU usage. For high load (`--out-peers 200`), networking CPU usage goes down from ~110% to ~30% (80 pp) and for normal load (`--out-peers 40`), the usage goes down from ~55% to ~18% (37 pp). These should not be taken as final numbers because:
a) there are still some low-hanging optimization fruits, such as enabling [receive window auto-tuning](https://github.com/libp2p/rust-yamux/pull/176), integrating `Peerset` more closely with `litep2p`, or improving memory usage of the WebSocket transport
b) fixing bugs/instabilities that incorrectly cause `litep2p` to do less work will increase the networking CPU usage
c) verification in a more diverse set of tests/conditions is needed
Nevertheless, these numbers should give an early estimate for the CPU usage of the new networking backend.

This PR consists of three separate changes:
* introduce a generic `PeerId` (wrapper around `Multihash`) so that we don't have to use `NetworkService::PeerId` in every part of the code that uses a `PeerId`
* introduce a `NetworkBackend` trait, implement it for the libp2p network stack and make the Polkadot SDK generic over `NetworkBackend`
* implement `NetworkBackend` for litep2p

The new library should be considered experimental, which is why `rust-libp2p` will remain the default option for the time being. This PR currently depends on the master branch of `litep2p` but I'll cut a new release for the library once all review comments have been addressed. --------- Signed-off-by:
Alexandru Vasile <alexandru.vasile@parity.io> Co-authored-by:
Dmitry Markin <dmitry@markin.tech> Co-authored-by:
Alexandru Vasile <60601340+lexnv@users.noreply.github.com> Co-authored-by:
Alexandru Vasile <alexandru.vasile@parity.io>
- Apr 02, 2024
Adrian Catangiu authored
This outputs the following error log entry once every session, for nodes running with `Role::Authority` that have no public BEEFY key in their keystore:
```
2024-04-02 14:36:02.135 ERROR tokio-runtime-worker beefy: 🥩 for session starting at block 21990151 no BEEFY authority key found in store, you must generate valid session keys (https://wiki.polkadot.network/docs/maintain-guides-how-to-validate-polkadot#generating-the-session-keys)
```
--------- Co-authored-by:
Bastian Köcher <git@kchr.de>
- Mar 22, 2024
Dmitry Markin authored
Make sure public addresses explicitly set by the operator go first in the authority discovery DHT records. Also update the `Discovery` behaviour to eliminate duplicates in the returned addresses. This PR should improve the situation with https://github.com/paritytech/polkadot-sdk/issues/3519. Obsoletes https://github.com/paritytech/polkadot-sdk/pull/3657.
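An illustrative sketch (not the real authority-discovery code) of both changes: operator-supplied addresses go first in the record, and duplicates are dropped while preserving order:
```rust
use std::collections::HashSet;

fn addresses_for_record(explicit: Vec<String>, discovered: Vec<String>) -> Vec<String> {
    let mut seen = HashSet::new();
    explicit
        .into_iter()
        .chain(discovered) // explicit addresses keep priority by coming first
        .filter(|addr| seen.insert(addr.clone())) // drop duplicates, keep order
        .collect()
}

fn main() {
    let record = addresses_for_record(
        vec!["/dns/node.example/tcp/30333".into()],
        vec!["/ip4/10.0.0.1/tcp/30333".into(), "/dns/node.example/tcp/30333".into()],
    );
    assert_eq!(record.len(), 2);
    println!("{record:?}");
}
```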
- Feb 28, 2024
maksimryndin authored
Resolves https://github.com/paritytech/polkadot-sdk/issues/3116, a follow-up on https://github.com/paritytech/polkadot-sdk/pull/3061#pullrequestreview-1847530265:
- [x] reuse the collator overseer builder for polkadot-node and collator
- [x] run zombienet test (0001-parachains-smoke-test.toml)
- [x] make wasm build errors more user-friendly, for easier problem detection when using different Rust toolchains

--------- Co-authored-by:
ordian <write@reusable.software> Co-authored-by:
s0me0ne-unkn0wn <48632512+s0me0ne-unkn0wn@users.noreply.github.com>
- Jan 29, 2024
s0me0ne-unkn0wn authored
Currently, collators and their alongside nodes spin up a full-scale overseer running a bunch of subsystems that are not needed if the node is not a validator. That was considered to be harmless; however, we've got problems with unused subsystems getting stalled for a reason not currently known, resulting in the overseer exiting and bringing down the whole node. This PR aims to only run needed subsystems on such nodes, replacing the rest with `DummySubsystem`. It also enables collator-optimized availability recovery subsystem implementation. Partially solves #1730.
- Jan 12, 2024
Dmitry Markin authored
Extract `WarpSync` (and `StateSync` as part of warp sync) from `ChainSync` as an independent syncing strategy called by `SyncingEngine`. Introduce a `SyncingStrategy` enum as a proxy between `SyncingEngine` and specific syncing strategies. (A minimal sketch of this enum shape follows this entry.)

## Limitations

Gap sync is kept in `ChainSync` for now, because it shares the same set of peers as the block syncing implementation in `ChainSync`. Extraction of a common context responsible for peer management in syncing strategies able to run in parallel is planned for a follow-up PR.

## Further improvements

The possibility of converting `SyncingStrategy` into a trait should be evaluated. The main stopper for this is that different strategies need to communicate different actions to `SyncingEngine` and respond to different events / provide different APIs (e.g., requesting justifications is only possible via `ChainSync` and not through `WarpSync`; the `SendWarpProofRequest` action is only relevant to `WarpSync`, etc.)

--------- Co-authored-by:
Aaro Altonen <48052676+altonen@users.noreply.github.com>
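A minimal sketch of the proxy-enum shape described above; variant and method names are illustrative, not the actual `sc-network-sync` definitions:
```rust
struct ChainSync;
struct WarpSync;

// The proxy enum between SyncingEngine and the concrete strategies.
enum SyncingStrategy {
    ChainSync(ChainSync),
    WarpSync(WarpSync),
}

impl SyncingStrategy {
    // SyncingEngine calls one entry point; the enum dispatches to whichever
    // strategy is currently active.
    fn on_block_response(&mut self) {
        match self {
            SyncingStrategy::ChainSync(_) => println!("ChainSync handles block response"),
            SyncingStrategy::WarpSync(_) => println!("WarpSync handles block response"),
        }
    }
}

fn main() {
    for mut strategy in [SyncingStrategy::ChainSync(ChainSync), SyncingStrategy::WarpSync(WarpSync)] {
        strategy.on_block_response();
    }
}
```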
- Jan 04, 2024
Marcin S. authored
Please let me know if this would be a better UX. Should we have specific links in the error message?
- Dec 05, 2023
Marcin S. authored
Co-authored-by:
Javier Viola <javier@parity.io>
- Nov 28, 2023
Aaro Altonen authored
This commit introduces a new concept called `NotificationService`, which allows Polkadot protocols to communicate with the underlying notification protocol implementation directly, without routing events through `NetworkWorker`. This implies that each protocol has its own service which it uses to communicate with remote peers, and that each `NotificationService` is unique with respect to the underlying notification protocol, meaning the `NotificationService` for the transaction protocol can only be used to send and receive transaction-related notifications.

The `NotificationService` concept introduces two additional benefits:
* allow protocols to start using custom handshakes
* allow protocols to accept/reject inbound peers

Previously the validation of inbound connections was solely the responsibility of `ProtocolController`. This caused issues with light peers and `SyncingEngine`, as `ProtocolController` would accept more peers than `SyncingEngine` could accept, which caused peers to have differing views of their own states. `SyncingEngine` would reject excess peers, but these rejections were not properly communicated to those peers, causing them to assume they were accepted. With `NotificationService`, the local handshake is not sent to the remote peer if the peer is rejected, which allows it to detect that it was rejected.

This commit also deprecates the use of `NetworkEventStream` for all notification-related events; going forward, only DHT events are provided through `NetworkEventStream`. If protocols wish to follow each other's events, they must introduce additional abstractions, as is done for the GRANDPA and transactions protocols by following the syncing protocol through `SyncEventStream`.

Fixes https://github.com/paritytech/polkadot-sdk/issues/512
Fixes https://github.com/paritytech/polkadot-sdk/issues/514
Fixes https://github.com/paritytech/polkadot-sdk/issues/515
Fixes https://github.com/paritytech/polkadot-sdk/issues/554
Fixes https://github.com/paritytech/polkadot-sdk/issues/556

--- These changes are transferred from https://github.com/paritytech/substrate/pull/14197 but there are no functional changes compared to that PR --------- Co-authored-by:
Dmitry Markin <dmitry@markin.tech> Co-authored-by:
Alexandru Vasile <60601340+lexnv@users.noreply.github.com>
-
André Silva authored
This was never used and we probably don't need it anyway.
-
André Silva authored
Currently the polkadot node will back off from block authoring if finality starts lagging. This PR disables this mechanism on production networks (Polkadot and Kusama) and adds a flag to optionally force-enable it.
- Nov 23, 2023
Bastian Köcher authored
This moves the macro-related re-exports to `__private` to make it more obvious to downstream users that they are using an internal API. --------- Co-authored-by: command-bot <>
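A sketch of the convention, with a stand-in item; in the real crates the `__private` module re-exports things like `codec` that macro expansions need:
```rust
pub mod __private {
    // Items a macro expansion needs but users should not import directly.
    // A stand-in here; real crates re-export whole dependencies instead.
    pub fn encode_len(len: usize) -> u32 {
        len as u32
    }
}

// Macros refer to these items through the crate's `__private` path, so any
// downstream code that spells it out sees the warning sign in the path itself.
#[macro_export]
macro_rules! encoded_len {
    ($e:expr) => {
        $crate::__private::encode_len($e.len())
    };
}

fn main() {
    println!("{}", encoded_len!([1u8, 2, 3]));
}
```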
- Nov 03, 2023
Alexandru Gheorghe authored
The check_hardware functions do not give us much information as to what is failing, so let's return the list of failed metrics, so that callers can print it. This makes debugging easier, rather than trying to guess which dimension is actually failing. (A sketch of the changed return type follows below.) Signed-off-by:
Alexandru Gheorghe <alexandru.gheorghe@parity.io>
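A sketch of the API change described above: instead of a bare pass/fail, return which checks failed so callers can print them (names and thresholds below are illustrative):
```rust
#[derive(Debug)]
struct FailedMetric {
    name: &'static str,
    expected: f64,
    found: f64,
}

fn check_hardware(cpu_score: f64, disk_seq_write: f64) -> Result<(), Vec<FailedMetric>> {
    let mut failed = Vec::new();
    if cpu_score < 1000.0 {
        failed.push(FailedMetric { name: "cpu-score", expected: 1000.0, found: cpu_score });
    }
    if disk_seq_write < 950.0 {
        failed.push(FailedMetric { name: "disk-seq-write (MB/s)", expected: 950.0, found: disk_seq_write });
    }
    if failed.is_empty() { Ok(()) } else { Err(failed) }
}

fn main() {
    if let Err(failed) = check_hardware(800.0, 1200.0) {
        // Callers can now say exactly which dimension is below spec.
        for f in &failed {
            eprintln!("{}: expected >= {}, found {}", f.name, f.expected, f.found);
        }
    }
}
```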
- Oct 18, 2023
Serban Iorga authored
Fellowship companion: https://github.com/polkadot-fellows/runtimes/pull/65

This starts the BEEFY client by default for Polkadot nodes. A governance/sudo call is later required to enable/start consensus. Part of https://github.com/paritytech/parity-bridges-common/issues/2420
- Sep 27, 2023
Chris Sosnin authored
- Async-backing related primitives are stable (`primitives::v6`)
- The async-backing API is now part of `api_version(7)`
- It's enabled on the Rococo and Westend runtimes

--------- Signed-off-by:
Andrei Sandu <andrei-mihail@parity.io> Co-authored-by:
Andrei Sandu <54316454+sandreim@users.noreply.github.com>
- Sep 19, 2023
Bastian Köcher authored
This pull request removes the Polkadot and Kusama native runtimes from the polkadot node. This brings some implications with it:
- There are no more kusama/polkadot-dev chain specs available. We will need to write some tooling in the fellowship repo to provide them easily.
- The try-runtime job for Polkadot & Kusama is not available anymore, as we don't have the dev chain specs anymore.
- Certain benchmarking commands will also not work until we migrate them to use a runtime API.
- Some crates in utils still depend on the Polkadot/Kusama native runtime and will also need to be fixed.

Port of: https://github.com/paritytech/polkadot/pull/7467
- Sep 15, 2023
Rahul Subramaniyam authored
Submit the outstanding PRs from the old repos (these were already reviewed and approved before the repo reorg, but not yet submitted). Main PR: https://github.com/paritytech/substrate/pull/14014. Companion PRs: https://github.com/paritytech/polkadot/pull/7134, https://github.com/paritytech/cumulus/pull/2489

The changes in the PR:
1. `ChainSync` currently calls into the block request handler directly. Instead, move the block request handler behind a trait. This allows new protocols to be plugged into `ChainSync`.
2. `BuildNetworkParams` is changed so that custom relay protocol implementations can be (optionally) passed in during network creation time. If a custom protocol is not specified, it defaults to the existing block handler.
3. `BlockServer` and `BlockDownloader` traits are introduced for the protocol implementation. The existing block handler has been changed to implement these traits.
4. Other changes:
   - [x] Make `TxHash` serializable. This is needed for exchanging the serialized hash in the relay protocol messages.
   - [x] Clean up types no longer used (`OpaqueBlockRequest`, `OpaqueBlockResponse`).

--------- Co-authored-by:
Dmitry Markin <dmitry@markin.tech> Co-authored-by: command-bot <>