Commits · 85d0b2b1b0c91de59939477d87f233586c268f52 · parity / Mirrored projects / polkadot-sdk

Apr 19, 2024

Pvf refactor execute worker errors follow up (#4071) · 4eabe5e0

maksimryndin authored Apr 19, 2024

follow up of https://github.com/paritytech/polkadot-sdk/pull/2604
closes https://github.com/paritytech/polkadot-sdk/pull/2604

- [x] take relevant changes from Marcin's PR 
- [x] extract common duplicate code for workers (low-hanging fruits)

~Some unpassed ci problems are more general and should be fixed in
master (see https://github.com/paritytech/polkadot-sdk/pull/4074)~



Proposed labels: **T0-node**, **R0-silent**, **I4-refactor**

-----

kusama address: FZXVQLqLbFV2otNXs6BMnNch54CFJ1idpWwjMb3Z8fTLQC6

---------

Co-authored-by: s0me0ne-unkn0wn <[email protected]>

4eabe5e0

Use higher priority for PVF preparation in dispute/approval context (#4172) · 04a9071e

Andrei Sandu authored Apr 19, 2024

Related to https://github.com/paritytech/polkadot-sdk/issues/4126


discussion

Currently all preparations have the same priority and this is not ideal in
all cases. This change should improve the finality time in the context
of on-demand parachains and when `ExecutorParams` are updated on-chain
and a rebuild of all artifacts is required. The desired effect is to
speed up approval and dispute PVF executions which require preparation
and delay backing executions which require preparation.
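
The intended ordering can be sketched as a small priority queue. This is only an illustration of the idea, not the PVF host's code: `PreparePriority`, `PrepareJob` and `PrepareQueue` are hypothetical names, and the real queue may order jobs differently within one priority level.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical priority levels. Declaration order defines `Ord`,
/// so `Normal < Critical`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum PreparePriority {
    /// Preparation requested for backing; may be delayed.
    Normal,
    /// Preparation requested for approval or dispute work; served first.
    Critical,
}

/// One pending preparation request (hypothetical shape).
#[derive(Debug, PartialEq, Eq)]
pub struct PrepareJob {
    pub priority: PreparePriority,
    /// Insertion counter, used to keep FIFO order within one priority.
    pub seq: u64,
    pub artifact: String,
}

impl Ord for PrepareJob {
    fn cmp(&self, other: &Self) -> Ordering {
        // Higher priority wins; on a tie the older job (lower seq) wins.
        self.priority
            .cmp(&other.priority)
            .then_with(|| other.seq.cmp(&self.seq))
    }
}

impl PartialOrd for PrepareJob {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Max-heap of pending jobs.
#[derive(Default)]
pub struct PrepareQueue {
    heap: BinaryHeap<PrepareJob>,
    next_seq: u64,
}

impl PrepareQueue {
    pub fn push(&mut self, priority: PreparePriority, artifact: &str) {
        let seq = self.next_seq;
        self.next_seq += 1;
        self.heap.push(PrepareJob { priority, seq, artifact: artifact.to_string() });
    }

    /// Returns the next job to prepare, if any.
    pub fn pop(&mut self) -> Option<PrepareJob> {
        self.heap.pop()
    }
}
```

With this ordering, a backing job that was queued first is still prepared after any approval or dispute job that arrives later.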

---------

Signed-off-by: Andrei Sandu <[email protected]>

04a9071e

Apr 18, 2024

approval-voting: Make sure we always mark approved candidates approved in a... · 37e338f0

Alexandru Gheorghe authored Apr 18, 2024

approval-voting: Make sure we always mark approved candidates approved in a different relay chain context (#4153)

... see for more detail why this is needed

https://github.com/paritytech/polkadot-sdk/issues/4149#issuecomment-2058472444

## TODO:
- [x] Unittests
- [x] Replicate scenario from
https://github.com/paritytech/polkadot-sdk/issues/4149 and confirm this
fixes it: https://github.com/paritytech/polkadot-sdk/issues/4149

[Replicated on a zombienet with some hacked nodes: we can end up in
this state where no wake-up is scheduled and the nodes are pending new
assignments.]

---------

Signed-off-by: Alexandru Gheorghe <[email protected]>
Co-authored-by: Andrei Sandu <[email protected]>

37e338f0

chain-selection: allow reverting current block (#4103) · 91d4a207

ordian authored Apr 18, 2024

Block reversion of the current block is technically possible as can be
seen from

https://github.com/paritytech/polkadot-sdk/blob/39b1f50f1c251def87c1625d68567ed252dc6272/polkadot/runtime/parachains/src/disputes.rs#L1215-L1223

- [x] Fix the test

91d4a207

[ci] Update ci image with rust 1.77 and 2024-04-10 (#4077) · 76719da2

Alexander Samusev authored Apr 18, 2024

cc https://github.com/paritytech/ci_cd/issues/974



---------

Co-authored-by: command-bot <>
Co-authored-by: Bastian Köcher <[email protected]>

76719da2

Apr 16, 2024

move fragment_tree module to its own folder (#4148) · 4b5c3fd0

Alin Dima authored Apr 16, 2024

Will make https://github.com/paritytech/polkadot-sdk/pull/4035 easier to
review (the mentioned PR already does this move so the diff will be
clearer).

Also called out as part of:
https://github.com/paritytech/polkadot-sdk/pull/3233#discussion_r1490867383

4b5c3fd0

Apr 12, 2024

Runtime API: introduce `candidates_pending_availability` (#4027) · 2dfe5f74

Andrei Sandu authored Apr 12, 2024

Fixes https://github.com/paritytech/polkadot-sdk/issues/3576



Required by elastic scaling collators.
Deprecates old API: `candidate_pending_availability`.

TODO:
- [x] PRDoc

---------

Signed-off-by: Andrei Sandu <[email protected]>

2dfe5f74

Apr 08, 2024

Deprecate `para_id()` from `CoreState` in polkadot primitives (#3979) · 59f868d1

Tsvetomir Dimitrov authored Apr 08, 2024

With Coretime enabled we can no longer assume there is a static 1:1
mapping between core index and para id. This mapping should be obtained
from the scheduler/claimqueue on a block-by-block basis.

This PR modifies `para_id()` (from `CoreState`) to return the scheduled
`ParaId` for occupied cores and removes its usages in the code.

Closes https://github.com/paritytech/polkadot-sdk/issues/3948



---------

Co-authored-by: Andrei Sandu <[email protected]>

59f868d1

Fix some typos (#4018) · bd4471b4

HongKuang authored Apr 08, 2024

Signed-off-by: hongkuang <[email protected]>

bd4471b4

Apr 05, 2024

chore: fix some comments (#4004) · a0eed0a6

divdeploy authored Apr 06, 2024

Signed-off-by: divdeploy <[email protected]>

a0eed0a6

Apr 02, 2024

Align dependencies with `parity-bridges-common` (#3937) · 8e95a3e1

Serban Iorga authored Apr 02, 2024

Working towards migrating the `parity-bridges-common` repo inside
`polkadot-sdk`. This PR upgrades some dependencies in order to align
them with the versions used in `parity-bridges-common`

Related to
https://github.com/paritytech/parity-bridges-common/issues/2538

8e95a3e1

Apr 01, 2024

primitives: Move out of staging released APIs (#3925) · d6f68bb9

Alexandru Gheorghe authored Apr 01, 2024



Runtime release 1.2 bumps the ParachainHost APIs up to v10, so let's
move all the released APIs out of the vstaging folder. This PR does not
include any logic changes, only renaming of the modules and some
moving around.

Signed-off-by: Alexandru Gheorghe <[email protected]>

d6f68bb9

Mar 28, 2024

bugfix: request fragment tree membership for all candidates (#3874) · 6a0859eb

Alin Dima authored Mar 28, 2024

6a0859eb

Mar 26, 2024

fix regression in approval-voting introduced in #3747 (#3831) · 3fc5b826

ordian authored Mar 26, 2024



Fixes #3826.

The docs on the `candidates` field of `BlockEntry` were incorrectly
stating that they are sorted by core index. The (incorrect) optimization
was introduced in #3747 based on this assumption. The actual ordering is
based on `CandidateIncluded` events ordering in the runtime. We revert
this optimization here.

- [x] verify the underlying issue
- [x] add a regression test

---------

Co-authored-by: Bastian Köcher <[email protected]>

3fc5b826

Fix spelling mistakes across the whole repository (#3808) · 002d9260

Dcompoze authored Mar 26, 2024

**Update:** Pushed additional changes based on the review comments.

**This pull request fixes various spelling mistakes in this
repository.**

Most of the changes are contained in the first **3** commits:

- `Fix spelling mistakes in comments and docs`

- `Fix spelling mistakes in test names`

- `Fix spelling mistakes in error messages, panic messages, logs and
tracing`

Other source code spelling mistakes are separated into individual
commits for easier reviewing:

- `Fix the spelling of 'authority'`

- `Fix the spelling of 'REASONABLE_HEADERS_IN_JUSTIFICATION_ANCESTRY'`

- `Fix the spelling of 'prev_enqueud_messages'`

- `Fix the spelling of 'endpoint'`

- `Fix the spelling of 'children'`

- `Fix the spelling of 'PenpalSiblingSovereignAccount'`

- `Fix the spelling of 'PenpalSudoAccount'`

- `Fix the spelling of 'insufficient'`

- `Fix the spelling of 'PalletXcmExtrinsicsBenchmark'`

- `Fix the spelling of 'subtracted'`

- `Fix the spelling of 'CandidatePendingAvailability'`

- `Fix the spelling of 'exclusive'`

- `Fix the spelling of 'until'`

- `Fix the spelling of 'discriminator'`

- `Fix the spelling of 'nonexistent'`

- `Fix the spelling of 'subsystem'`

- `Fix the spelling of 'indices'`

- `Fix the spelling of 'committed'`

- `Fix the spelling of 'topology'`

- `Fix the spelling of 'response'`

- `Fix the spelling of 'beneficiary'`

- `Fix the spelling of 'formatted'`

- `Fix the spelling of 'UNKNOWN_PROOF_REQUEST'`

- `Fix the spelling of 'succeeded'`

- `Fix the spelling of 'reopened'`

- `Fix the spelling of 'proposer'`

- `Fix the spelling of 'InstantiationNonce'`

- `Fix the spelling of 'depositor'`

- `Fix the spelling of 'expiration'`

- `Fix the spelling of 'phantom'`

- `Fix the spelling of 'AggregatedKeyValue'`

- `Fix the spelling of 'randomness'`

- `Fix the spelling of 'defendant'`

- `Fix the spelling of 'AquaticMammal'`

- `Fix the spelling of 'transactions'`

- `Fix the spelling of 'PassingTracingSubscriber'`

- `Fix the spelling of 'TxSignaturePayload'`

- `Fix the spelling of 'versioning'`

- `Fix the spelling of 'descendant'`

- `Fix the spelling of 'overridden'`

- `Fix the spelling of 'network'`

Let me know if this structure is adequate.

**Note:** The usage of the words `Merkle`, `Merkelize`, `Merklization`,
`Merkelization`, `Merkleization`, is somewhat inconsistent but I left it
as it is.

~~**Note:** In some places the term `Receival` is used to refer to
message reception, IMO `Reception` is the correct word here, but I left
it as it is.~~

~~**Note:** In some places the term `Overlayed` is used instead of the
more acceptable version `Overlaid` but I also left it as it is.~~

~~**Note:** In some places the term `Applyable` is used instead of the
correct version `Applicable` but I also left it as it is.~~

**Note:** Some usage of British vs American english e.g. `judgement` vs
`judgment`, `initialise` vs `initialize`, `optimise` vs `optimize` etc.
are both present in different places, but I suppose that's
understandable given the number of contributors.

~~**Note:** There is a spelling mistake in `.github/CODEOWNERS` but it
triggers errors in CI when I make changes to it, so I left it as it
is.~~

002d9260

Mar 25, 2024

elastic scaling: preserve candidate ordering in provisioner (#3778) · c6f7ccf5

Alin Dima authored Mar 25, 2024

https://github.com/paritytech/polkadot-sdk/issues/3742

c6f7ccf5

[Bridges] Move chain definitions to separate folder (#3822) · 0711729d

Serban Iorga authored Mar 25, 2024

Related to
https://github.com/paritytech/parity-bridges-common/issues/2538

This PR doesn't contain any functional changes. 

The PR moves specific bridged chain definitions from
`bridges/primitives` to `bridges/chains` folder in order to facilitate
the migration of the `parity-bridges-repo` into `polkadot-sdk` as
discussed in https://hackmd.io/LprWjZ0bQXKpFeveYHIRXw?view

Apart from this it also includes some cosmetic changes to some
`Cargo.toml` files as a result of running `diener workspacify`.

0711729d

Mar 21, 2024

approval-voting: remove some inefficiences on startup (#3747) · 64a707a4

ordian authored Mar 21, 2024

Small refactoring to reduce the algorithmic complexity of the initial
message distribution in approval voting after a sync from
O(n_candidates^2) to O(n_candidates).

64a707a4

Mar 20, 2024

Expose `ClaimQueue` via a runtime api and use it in `collation-generation` (#3580) · e58e854a

Tsvetomir Dimitrov authored Mar 20, 2024

The PR does two things:
1. Adds a runtime API exposing the whole claim queue.
2. Consumes the API in `collation-generation` to fetch the next
scheduled `ParaEntry` for an occupied core.

Related to https://github.com/paritytech/polkadot-sdk/issues/1797

e58e854a

Mar 15, 2024

collator protocol changes for elastic scaling (validator side) (#3302) · 02e1a7f4

ordian authored Mar 15, 2024

Fixes #3128.

This introduces a new variant for the collation response from the
collator that includes the parent head data. For now, collators won't
send this new variant. We'll need to change the collator side of the
collator protocol to detect all the cores assigned to a para and send
the parent head data in the case when it's more than 1 core.
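
As a rough sketch of what such a response type could look like (the type aliases and the enum below are simplified, hypothetical stand-ins, not the actual network protocol definitions):

```rust
/// Hypothetical stand-ins for the real primitives, kept as raw bytes here.
pub type CandidateReceipt = Vec<u8>;
pub type PoV = Vec<u8>;
pub type HeadData = Vec<u8>;

/// Illustrative collation response with an extra variant that also
/// carries the parent head data.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CollationResponse {
    /// The existing form: receipt and PoV only.
    Collation { receipt: CandidateReceipt, pov: PoV },
    /// The new form: additionally includes the parent head data.
    CollationWithParentHeadData {
        receipt: CandidateReceipt,
        pov: PoV,
        parent_head_data: HeadData,
    },
}

impl CollationResponse {
    /// Parent head data, if the collator sent the new variant.
    pub fn parent_head_data(&self) -> Option<&HeadData> {
        match self {
            CollationResponse::Collation { .. } => None,
            CollationResponse::CollationWithParentHeadData { parent_head_data, .. } => {
                Some(parent_head_data)
            }
        }
    }
}
```

A validator handling both variants can keep its existing path for `Collation` and only verify the parent head data when it is present.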

- [x] validate approach
- [x] check head data hash

02e1a7f4

Mar 12, 2024

Add api-name in `cannot query the runtime API version` warning (#3653) · 1ead5977

Alexandru Gheorghe authored Mar 12, 2024



Sometimes we see nodes printing this warning:
```
cannot query the runtime API version: Api called for an unknown Block: State already discarded for
```

The log is harmless, but let's print the api we got this for, so that we
can track its call site and truly confirm it is harmless or fix it.

Signed-off-by: Alexandru Gheorghe <[email protected]>

1ead5977

Add a PolkaVM-based executor (#3458) · b0f34e4b

Koute authored Mar 12, 2024

This PR adds a new PolkaVM-based executor to Substrate.

- The executor can now be used to actually run a PolkaVM-based runtime,
and successfully produces blocks.
- The executor is always compiled-in, but is disabled by default.
- The `SUBSTRATE_ENABLE_POLKAVM` environment variable must be set to `1`
to enable the executor, in which case the node will accept both WASM and
PolkaVM program blobs (otherwise it'll default to WASM-only). This is
deliberately undocumented and not explicitly exposed anywhere (e.g. in
the command line arguments, or in the API) to disincentivize anyone from
enabling it in production. If/when we'll move this into production usage
I'll remove the environment variable and do it "properly".
- I did not use our legacy runtime allocator for the PolkaVM executor,
so currently every allocation inside of the runtime will leak guest
memory until that particular instance is destroyed. The idea here is
that I will work on the https://github.com/polkadot-fellows/RFCs/pull/4
which will remove the need for the legacy allocator under WASM, and that
will also allow us to use a proper non-leaking allocator under PolkaVM.
- I also did some minor cleanups of the WASM executor and deleted some
dead code.
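
A minimal sketch of the opt-in gate described above, assuming the check is an exact comparison against `1`. Only the variable name `SUBSTRATE_ENABLE_POLKAVM` comes from this PR; the function names and the `BlobKind` enum are hypothetical.

```rust
use std::env;

/// Pure helper: the executor counts as enabled only when the value is
/// exactly "1" (assumption: any other value leaves it disabled).
pub fn polkavm_enabled_from(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Reads the environment variable named in the PR description.
pub fn polkavm_enabled() -> bool {
    polkavm_enabled_from(env::var("SUBSTRATE_ENABLE_POLKAVM").ok().as_deref())
}

/// Hypothetical classification of a runtime program blob.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BlobKind {
    Wasm,
    PolkaVm,
}

/// WASM blobs are always accepted; PolkaVM blobs only when enabled.
pub fn accepts(kind: BlobKind, polkavm_enabled: bool) -> bool {
    match kind {
        BlobKind::Wasm => true,
        BlobKind::PolkaVm => polkavm_enabled,
    }
}
```

Keeping the parsing in a pure helper makes the gate testable without touching the process environment.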

No prdocs included since this is not intended to be an end-user feature,
but an unofficial experiment, and shouldn't affect any current
production user. Once this is production-ready a full Polkadot
Fellowship RFC will be necessary anyway.

b0f34e4b

Mar 08, 2024

fix some typos (#3587) · ea458d0b

cuinix authored Mar 09, 2024



Signed-off-by: cuinix <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>

ea458d0b

Mar 07, 2024

move substrate-bip39 into polkadot-sdk (#3579) · 30c32e3d

André Silva authored Mar 07, 2024

Moves [substrate-bip39](https://github.com/paritytech/substrate-bip39)
into substrate. All git history is preserved. Dependencies have been
updated to use the same version as the rest of the repo.

Fixes https://github.com/paritytech/polkadot-sdk/issues/1934

---------

Co-authored-by: Maciej Hirsz <[email protected]>
Co-authored-by: Maciej Hirsz <[email protected]>
Co-authored-by: Gav Wood <[email protected]>
Co-authored-by: Stanislav Tkach <[email protected]>
Co-authored-by: Robert Habermeier <[email protected]>
Co-authored-by: Pierre Krieger <[email protected]>
Co-authored-by: Demi M. Obenour <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>
Co-authored-by: NikVolf <[email protected]>
Co-authored-by: Bastian Köcher <[email protected]>
Co-authored-by: Benjamin Kampmann <[email protected]>
Co-authored-by: Maciej Hirsz <[email protected]>
Co-authored-by: cheme <[email protected]>
Co-authored-by: adoerr <[email protected]>
Co-authored-by: Jun Jiang <[email protected]>
Co-authored-by: Dan Shields <[email protected]>
Co-authored-by: Michal Kucharczyk <[email protected]>

30c32e3d

Mar 01, 2024

provisioner: allow multiple cores assigned to the same para (#3233) · 62b78a16

Alin Dima authored Mar 01, 2024

https://github.com/paritytech/polkadot-sdk/issues/3130

builds on top of https://github.com/paritytech/polkadot-sdk/pull/3160

Processes the availability cores and builds a record of how many
candidates it should request from prospective-parachains and their
predecessors.
Tries to supply as many candidates as the runtime can back. Note that
the runtime changes to back multiple candidates per para are not yet
done, but this paves the way for it.

The following backing/inclusion policy is assumed:
1. the runtime will never back candidates of the same para which don't
form a chain with the already backed candidates. Even if the others are
still pending availability. We're optimistic that they won't time out
and we don't want to back parachain forks (as the complexity would be
huge).
2. if a candidate is timed out of the core before being included, all of
its successors occupying a core will be evicted.
3. only the candidates which are made available and form a chain
starting from the on-chain para head may be included/enacted and cleared
from the cores. In other words, if para head is at A and the cores are
occupied by B->C->D, and B and D are made available, only B will be
included and its core cleared. C and D will remain on the cores waiting
for C to be made available or timed out. As point (2) above already
says, if C is timed out, D will also be dropped.
4. The runtime will deduplicate candidates which form a cycle. For
example if the provisioner supplies candidates A->B->A, the runtime will
only back A (as the state output will be the same)
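
Rules 2 and 3 can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the runtime's logic: candidates are identified by string labels, `Occupied`, `includable` and `evicted` are hypothetical names, and each candidate records the head it builds on.

```rust
/// A candidate occupying a core (toy model with string labels).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Occupied {
    pub name: &'static str,
    /// The head this candidate builds on.
    pub parent: &'static str,
    pub available: bool,
    pub timed_out: bool,
}

/// Rule 3: starting from the on-chain para head, include available
/// candidates only while they form an unbroken chain.
pub fn includable(para_head: &'static str, cores: &[Occupied]) -> Vec<&'static str> {
    let mut head = para_head;
    let mut included = Vec::new();
    // At most one step per occupied core, so a cycle cannot loop forever.
    for _ in 0..cores.len() {
        let next = cores.iter().find(|c| c.parent == head);
        match next {
            Some(c) if c.available => {
                included.push(c.name);
                head = c.name;
            }
            _ => break,
        }
    }
    included
}

/// Rule 2: a timed-out candidate is evicted together with all of its
/// successors that occupy a core.
pub fn evicted(cores: &[Occupied]) -> Vec<&'static str> {
    let mut out: Vec<&'static str> =
        cores.iter().filter(|c| c.timed_out).map(|c| c.name).collect();
    let mut i = 0;
    while i < out.len() {
        let parent = out[i];
        for c in cores {
            if c.parent == parent && !out.contains(&c.name) {
                out.push(c.name);
            }
        }
        i += 1;
    }
    out
}
```

For the example in rule 3 (para head at A, cores occupied by B->C->D, B and D available), `includable` yields only B, and if C times out `evicted` yields C and D.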

Note that if a candidate is timed out, we don't guarantee that in the
next relay chain block the block author will be able to fill all of the
timed out cores of the para. That increases complexity by a lot.
Instead, the provisioner will supply N candidates where N is the number
of candidates timed out, but doesn't include their successors which will
also be deleted by the runtime. This'll be backfilled in the next relay
chain block.

Adjacent changes:
- Also fixes: https://github.com/paritytech/polkadot-sdk/issues/3141
- For non prospective-parachains, don't supply multiple candidates per
para (we can't have elastic scaling without prospective parachains
enabled). paras_inherent should already sanitise this input but it's
more efficient this way.

Note: all of these changes are backwards-compatible with the
non-elastic-scaling scenario (one core per para).

62b78a16

Feb 29, 2024

Fix accidental no-shows on node restart (#3277) · 761937ec

Alexandru Gheorghe authored Feb 29, 2024



If approval was in progress we didn't actually restart it, so we end up
in a situation where we distribute our assignment, but we don't
distribute any approval.

---------

Signed-off-by: Alexandru Gheorghe <[email protected]>

761937ec

Feb 28, 2024

PVF: re-preparing artifact on failed runtime construction (#3187) · 42613667

maksimryndin authored Feb 28, 2024

resolve https://github.com/paritytech/polkadot-sdk/issues/3139

- [x] use a distinguishable error for `execute_artifact`
- [x] remove artifact in case of a `RuntimeConstruction` error during
the execution
- [x] augment the `validate_candidate_with_retry` of `ValidationBackend`
with the case of retriable `RuntimeConstruction` error during the
execution
- [x] update the book
(https://paritytech.github.io/polkadot-sdk/book/node/utility/pvf-host-and-workers.html#retrying-execution-requests)
- [x] add a test
- [x] run zombienet tests

---------

Co-authored-by: s0me0ne-unkn0wn <[email protected]>

42613667

Feb 23, 2024

Runtime: allow backing multiple candidates of same parachain on different cores (#3231) · 2431001e

Andrei Sandu authored Feb 23, 2024

Fixes https://github.com/paritytech/polkadot-sdk/issues/3144

Builds on top of https://github.com/paritytech/polkadot-sdk/pull/3229

### Summary
Some preparations for Runtime to support elastic scaling, guarded by
config node features bit `FeatureIndex::ElasticScalingMVP`. This PR
introduces a per-candidate `CoreIndex` but does it in a hacky way to
avoid changing `CandidateCommitments`, `CandidateReceipts` primitives
and networking protocols.

#### Including `CoreIndex` in `BackedCandidate`
If the `ElasticScalingMVP` feature bit is enabled then
`BackedCandidate::validator_indices` is extended by 8 bits.
The value stored in these bits represents the assumed core index for the
candidate.

It is a temporary solution which works by creating a mapping from
`BackedCandidate` to `CoreIndex` by assuming the `CoreIndex` can be
discovered by checking in which validator group the validator that
signed the statement is.
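
The 8-bit extension can be pictured with a plain `Vec<bool>` standing in for the bitfield. This is a sketch only: the helper names are hypothetical, and the least-significant-bit-first order is an assumption rather than something stated in this description.

```rust
/// Append the 8 bits of `core_index` to the validator bitfield,
/// least significant bit first (assumed bit order).
pub fn inject_core_index(validator_indices: &mut Vec<bool>, core_index: u8) {
    validator_indices.extend((0..8).map(|bit| (core_index >> bit) & 1 == 1));
}

/// Remove the trailing 8 bits and decode them as the assumed core index.
/// Returns `None` and leaves the bitfield untouched if it is too short.
pub fn extract_core_index(validator_indices: &mut Vec<bool>) -> Option<u8> {
    let split_at = validator_indices.len().checked_sub(8)?;
    let tail = validator_indices.split_off(split_at);
    Some(
        tail.iter()
            .enumerate()
            .fold(0u8, |acc, (bit, set)| acc | ((*set as u8) << bit)),
    )
}
```

After extraction the bitfield is back to one bit per validator in the group, so code that does not know about the extension sees the original layout.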

TODO:
- [x] fix tests
- [x] add new tests
- [x] Bump runtime API for Kusama, so we have that node features thing!
-> https://github.com/polkadot-fellows/runtimes/pull/194



---------

Signed-off-by: Andrei Sandu <[email protected]>
Signed-off-by: alindima <[email protected]>
Co-authored-by: alindima <[email protected]>

2431001e

Feb 22, 2024

Check that the validation code matches the parachain code (#3433) · 9bf1a5e2

Bastian Köcher authored Feb 22, 2024

This introduces a check to ensure that the parachain code matches the
validation code stored in the relay chain state. If not, it will print a
warning. This should be mainly useful for parachain builders to make
sure they have setup everything correctly.

9bf1a5e2

Elastic scaling: use an assumed `CoreIndex` in `candidate-backing` (#3229) · 60e537b9

Andrei Sandu authored Feb 22, 2024

First step in implementing
https://github.com/paritytech/polkadot-sdk/issues/3144



### Summary of changes
- switch statement `Table` candidate mapping from `ParaId` to
`CoreIndex`
- introduce experimental `InjectCoreIndex` node feature.
- determine and assume a `CoreIndex` for a candidate based on statement
validator index. If the signature is valid it means the validator controls
that validator index and we can easily map it to a validator
group/core.
- introduce a temporary provisioner fix until we fully enable elastic
scaling in the subsystem. The fix ensures we don't fetch the same
backable candidate when calling `get_backable_candidate` for each core.

TODO:
- [x] fix backing tests
- [x] fix statement table tests
- [x] add new test

---------

Signed-off-by: Andrei Sandu <[email protected]>
Signed-off-by: alindima <[email protected]>
Co-authored-by: alindima <[email protected]>

60e537b9

Feb 20, 2024

Lift dependencies to the workspace (Part 2/x) (#3366) · e89d0fca

Oliver Tale-Yazdi authored Feb 20, 2024



Lifting some more dependencies to the workspace. Just using the
most-often updated ones for now.
It can be reproduced locally.

```sh
# First you can check if there would be semver incompatible bumps (looks good in this case):
$ zepter transpose dependency lift-to-workspace --ignore-errors syn quote thiserror "regex:^serde.*"

# Then apply the changes:
$ zepter transpose dependency lift-to-workspace --version-resolver=highest syn quote thiserror "regex:^serde.*" --fix

# And format the changes:
$ taplo format --config .config/taplo.toml
```

---------

Signed-off-by: Oliver Tale-Yazdi <[email protected]>

e89d0fca

Downgrade log message to `trace` (#3405) · ef6ac94f

Bastian Köcher authored Feb 20, 2024

This spams logs in `Debug` with no useful information.

ef6ac94f

Feb 17, 2024

do not block finality for "disabled" disputes (#3358) · 612587b7

ordian authored Feb 17, 2024

- [x] test with zombienet-sdk
- [x] prdoc

Relevant Issues:
https://github.com/paritytech/polkadot-sdk/issues/3314 (connected to the
cause)
https://github.com/paritytech/polkadot-sdk/issues/3345 (solves)

---------

Co-authored-by: Kian Paimani <[email protected]>

612587b7

Feb 12, 2024

Lift dependencies to the workspace (Part 1) (#2070) · e80c2473

Oliver Tale-Yazdi authored Feb 12, 2024

Changes (partial https://github.com/paritytech/polkadot-sdk/issues/994):
- Set log to `0.4.20` everywhere
- Lift `log` to the workspace

Starting with a simpler one after seeing
https://github.com/paritytech/polkadot-sdk/pull/2065 from @jsdw.

This sets the `default-features` to `false` in the root and then
overwrites that in each crate to its original value. This is necessary
since otherwise the `default` features are additive and it's impossible
to disable them in the crate again once they are enabled in the
workspace.

I am using a tool to do this, so it's mostly a test to see that it works
as expected.

---------

Signed-off-by: Oliver Tale-Yazdi <[email protected]>

e80c2473

Feb 11, 2024

refactor pvf security module (#3047) · 4883e144

maksimryndin authored Feb 11, 2024

resolve https://github.com/paritytech/polkadot-sdk/issues/2321



- [x] refactor `security` module into a conditionally compiled module
- [x] rename `amd64` into x86-64 for consistency with conditional
compilation guards and remove reference to a particular vendor
- [x] run unit tests and zombienet

---------

Co-authored-by: s0me0ne-unkn0wn <[email protected]>

4883e144

Feb 06, 2024

prospective-parachains: respond with multiple backable candidates (#3160) · 7df1ae3b

Alin Dima authored Feb 06, 2024

Fixes https://github.com/paritytech/polkadot-sdk/issues/3129

7df1ae3b

Feb 05, 2024

Introduce approval-voting/distribution benchmark (#2621) · f9f88688

Alexandru Gheorghe authored Feb 05, 2024

## Summary
Built on top of the tooling and ideas introduced in
https://github.com/paritytech/polkadot-sdk/pull/2528, this PR introduces
a synthetic benchmark for measuring and assessing the performance
characteristics of the approval-voting and approval-distribution
subsystems.

Currently, this allows us to simulate the behaviours of these systems
based on the following dimensions:
```
TestConfiguration:
# Test 1
- objective: !ApprovalsTest
    last_considered_tranche: 89
    min_coalesce: 1
    max_coalesce: 6
    enable_assignments_v2: true
    send_till_tranche: 60
    stop_when_approved: false
    coalesce_tranche_diff: 12
    workdir_prefix: "/tmp"
    num_no_shows_per_candidate: 0
    approval_distribution_expected_tof: 6.0
    approval_distribution_cpu_ms: 3.0
    approval_voting_cpu_ms: 4.30
  n_validators: 500
  n_cores: 100
  n_included_candidates: 100
  min_pov_size: 1120
  max_pov_size: 5120
  peer_bandwidth: 524288000000
  bandwidth: 524288000000
  latency:
    min_latency:
      secs: 0
      nanos: 1000000
    max_latency:
      secs: 0
      nanos: 100000000
  error: 0
  num_blocks: 10
```

## The approach
1. We build a real overseer with the real implementations for
approval-voting and approval-distribution subsystems.
2. For a given network size, for each validator we pre-compute all
potential assignments and approvals it would send. Because this is a
computation-heavy operation, the result is cached in a file on disk and
re-used if the generation parameters don't change.
3. The messages are sent according to the configured parameters,
and those are split into 3 main benchmarking scenarios.

## Benchmarking scenarios

### Best case scenario *approvals_throughput_best_case.yaml*
It sends to approval-distribution only the minimum required tranches
to gather the needed_approvals, so that a candidate is approved.

### Behaviour in the presence of no-shows *approvals_no_shows.yaml*
It sends the tranche needed to approve a candidate when we have a
maximum of *num_no_shows_per_candidate* tranches with no-shows for each
candidate.

### Maximum throughput *approvals_throughput.yaml*
It sends all the tranches for each block and measures the CPU used and
the network bandwidth required by the approval-voting and
approval-distribution subsystems.

## How to run it
```
cargo run -p polkadot-subsystem-bench --release -- test-sequence --path polkadot/node/subsystem-bench/examples/approvals_throughput.yaml
```

## Evaluating performance
### Use the real subsystems metrics
If you follow the steps in
https://github.com/paritytech/polkadot-sdk/tree/master/polkadot/node/subsystem-bench#install-grafana
for installing locally prometheus and grafana, all real metrics for the
`approval-distribution`, `approval-voting` and overseer are available.
E.g:
<img width="2149" alt="Screenshot 2023-12-05 at 11 07 46"
src="https://github.com/paritytech/polkadot-sdk/assets/49718502/cb8ae2dd-178b-4922-bfa4-dc37e572ed38">

<img width="2551" alt="Screenshot 2023-12-05 at 11 09 42"
src="https://github.com/paritytech/polkadot-sdk/assets/49718502/8b4542ba-88b9-46f9-9b70-cc345366081b">

<img width="2154" alt="Screenshot 2023-12-05 at 11 10 15"
src="https://github.com/paritytech/polkadot-sdk/assets/49718502/b8874d8d-632e-443a-9840-14ad8e90c54f">

<img width="2535" alt="Screenshot 2023-12-05 at 11 10 52"
src="https://github.com/paritytech/polkadot-sdk/assets/49718502/779a439f-fd18-4985-bb80-85d5afad78e2">

### Profile with pyroscope
1. Set up pyroscope following the steps in
https://github.com/paritytech/polkadot-sdk/tree/master/polkadot/node/subsystem-bench#install-pyroscope,
then run any of the benchmark scenarios with the `--profile`
argument.
2. Open the pyroscope dashboard in grafana, e.g:
<img width="2544" alt="Screenshot 2024-01-09 at 17 09 58"
src="https://github.com/paritytech/polkadot-sdk/assets/49718502/58f50c99-a910-4d20-951a-8b16639303d9">



### Useful logs
1. Network bandwidth requirements:
```
Payload bytes received from peers: 503993 KiB total, 50399 KiB/block
Payload bytes sent to peers: 629971 KiB total, 62997 KiB/block
```

2. CPU usage by the approval-distribution/approval-voting subsystems.
```
approval-distribution CPU usage 84.061s
approval-distribution CPU usage per block 8.406s
approval-voting CPU usage 96.532s
approval-voting CPU usage per block 9.653s
```

3. Time passed until a given block is approved
```
 Chain selection approved  after 3500 ms hash=0x0101010101010101010101010101010101010101010101010101010101010101
Chain selection approved  after 4500 ms hash=0x0202020202020202020202020202020202020202020202020202020202020202
```

### Using benchmark to quantify improvements from
https://github.com/paritytech/polkadot-sdk/pull/1178 +
https://github.com/paritytech/polkadot-sdk/pull/1191

Using a versi-node we compare the scenario where all new optimisations
are disabled with a scenario where tranche0 assignments are sent in a
single message and a conservative simulation where the coalescing of
approvals gives us just a 50% reduction in the number of messages we send.

Overall, what we see is a speedup of around 30-40% in the time it takes
to process the necessary messages and a 30-40% reduction in the
necessary bandwidth.

#### Best case scenario comparison (minimum required tranches sent)
Unoptimised
```
    Number of blocks: 10
    Payload bytes received from peers: 53289 KiB total, 5328 KiB/block
    Payload bytes sent to peers: 52489 KiB total, 5248 KiB/block
    approval-distribution CPU usage 6.732s
    approval-distribution CPU usage per block 0.673s
    approval-voting CPU usage 9.523s
    approval-voting CPU usage per block 0.952s
```

vs Optimisation enabled
```
   Number of blocks: 10
   Payload bytes received from peers: 32141 KiB total, 3214 KiB/block
   Payload bytes sent to peers: 37314 KiB total, 3731 KiB/block
   approval-distribution CPU usage 4.658s
   approval-distribution CPU usage per block 0.466s
   approval-voting CPU usage 6.236s
   approval-voting CPU usage per block 0.624s
```

#### Worst case: all tranches sent (very unlikely; happens when sharding breaks)

Unoptimised
```
   Number of blocks: 10
   Payload bytes received from peers: 746393 KiB total, 74639 KiB/block
   Payload bytes sent to peers: 729151 KiB total, 72915 KiB/block
   approval-distribution CPU usage 118.681s
   approval-distribution CPU usage per block 11.868s
   approval-voting CPU usage 124.118s
   approval-voting CPU usage per block 12.412s
```

vs optimised
```
    Number of blocks: 10
    Payload bytes received from peers: 503993 KiB total, 50399 KiB/block
    Payload bytes sent to peers: 629971 KiB total, 62997 KiB/block
    approval-distribution CPU usage 84.061s
    approval-distribution CPU usage per block 8.406s
    approval-voting CPU usage 96.532s
    approval-voting CPU usage per block 9.653s
```


## TODOs
[x] Polish implementation.
[x] Use what we have so far to evaluate
https://github.com/paritytech/polkadot-sdk/pull/1191 before merging.
[x] List of features and additional dimensions we want to use for
benchmarking.
[x] Run benchmark on hardware similar to versi and kusama nodes.
[ ] Add benchmark to be run in CI for catching regression in
performance.
[ ] Rebase on latest changes for network emulation.

---------

Signed-off-by: Andrei Sandu <[email protected]>
Signed-off-by: Alexandru Gheorghe <[email protected]>
Co-authored-by: Andrei Sandu <[email protected]>

f9f88688

Feb 02, 2024

Enable async backing by default for Rococo/Westend (#3162) · e0674cb3

Bastian Köcher authored Feb 02, 2024



This change is mainly for people running the local variants. They can
directly start with async backing.

---------

Signed-off-by: Alexandru Gheorghe <[email protected]>
Co-authored-by: Alexandru Gheorghe <[email protected]>

e0674cb3

Jan 29, 2024

Do not run unneeded subsystems on collator and its alongside node (#3061) · 3e8139e7

s0me0ne-unkn0wn authored Jan 29, 2024

Currently, collators and their alongside nodes spin up a full-scale
overseer running a bunch of subsystems that are not needed if the node
is not a validator. That was considered to be harmless; however, we've
got problems with unused subsystems getting stalled for a reason not
currently known, resulting in the overseer exiting and bringing down the
whole node.

This PR aims to only run needed subsystems on such nodes, replacing the
rest with `DummySubsystem`.

It also enables collator-optimized availability recovery subsystem
implementation.

Partially solves #1730.

3e8139e7

Jan 26, 2024

Sync Cargo.toml and crates.io versions (#3034) · 3717ec38

Liam Aharon authored Jan 27, 2024

Related https://github.com/paritytech/polkadot-sdk/issues/3032

---

Using https://github.com/liamaharon/cargo-workspace-version-tools/

`cargo run -- sync --path ../polkadot-sdk`

---------

Signed-off-by: Oliver Tale-Yazdi <[email protected]>
Co-authored-by: Oliver Tale-Yazdi <[email protected]>

3717ec38