  11. Sep 09, 2024
    • [backport] Add benchmark for the number of minimum cpu cores (#5127) (#5613) · 823ecee0
      Alexandru Gheorghe authored
      This backports https://github.com/paritytech/polkadot-sdk/pull/5127, to
      the stable branch.
      
      Unfortunately https://polkadot.subsquare.io/referenda/1051 passed after
      the cut-off deadline and I missed the window for getting this PR merged.
      
      The change itself is super low-risk: it just prints a new message telling
      validators that, starting with January 2025, the required minimum number
      of hardware cores will be 8. I see value in getting this in front of the
      validators as soon as possible.
      
      Since we have not released yet and it does not invalidate any QA we
      already did, it should be painless to include it in the current release.
      
      (cherry picked from commit a947cb83)
  12. Sep 02, 2024
    • collator-protocol: Handle unknown validator heads (#5538) · f58e2b80
      Bastian Köcher authored
      There is a race condition when a validator sends its heads to the
      collator, but the collator doesn't yet know these heads. Until it becomes
      aware of them by importing the corresponding block(s), any collation
      registered on the collator is not announced to the validators. The
      collations aren't advertised because the collator doesn't yet know that
      these heads of the validator are descendants of the collation's relay
      parent.
      
      The solution is to store these unknown heads of the validators and to
      handle them when the collator updates its own view.
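      
      A minimal sketch of the buffering idea described above, using hypothetical
      types and names rather than the actual collator-protocol code: unknown
      heads are remembered per peer and re-checked whenever our own view
      advances.
      
      ```
      use std::collections::{HashMap, HashSet};
      
      type Hash = [u8; 32];
      type PeerId = u64;
      
      #[derive(Default)]
      struct State {
          /// Heads advertised by validators that we could not match to a known block yet.
          unknown_heads: HashMap<PeerId, HashSet<Hash>>,
          /// Blocks we have imported and therefore know about.
          known_blocks: HashSet<Hash>,
      }
      
      impl State {
          /// Called when a validator sends us its view.
          fn on_peer_view_change(&mut self, peer: PeerId, heads: Vec<Hash>) {
              for head in heads {
                  if !self.known_blocks.contains(&head) {
                      // Remember the head instead of dropping it; we may learn about it later.
                      self.unknown_heads.entry(peer).or_default().insert(head);
                  }
              }
          }
      
          /// Called when our own view changes (new blocks imported). Returns the
          /// (peer, head) pairs that can now be handled, e.g. to advertise collations.
          fn on_our_view_change(&mut self) -> Vec<(PeerId, Hash)> {
              let mut ready = Vec::new();
              let known = &self.known_blocks;
              for (peer, heads) in self.unknown_heads.iter_mut() {
                  heads.retain(|head| {
                      if known.contains(head) {
                          ready.push((*peer, *head));
                          false
                      } else {
                          true
                      }
                  });
              }
              ready
          }
      }
      
      fn main() {
          let mut state = State::default();
          state.on_peer_view_change(1, vec![[7u8; 32]]);
          state.known_blocks.insert([7u8; 32]);
          println!("heads ready to handle: {}", state.on_our_view_change().len());
      }
      ```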
    • Improve `sc-service` API (#5364) · da654103
      Nazar Mokrynskyi authored
      
      This improves the `sc-service` API by not requiring the whole
      `&Configuration`, using specific configuration options instead.
      `RpcConfiguration` was also extracted from `Configuration` to group all
      RPC options together.
      
      We don't use Substrate's CLI and would rather not use `Configuration`
      either, but some key public functions require it even though they
      ignore most of its fields anyway.
      
      `RpcConfiguration` is very helpful not just for consolidating the
      fields, but also for finally making RPC optional for our use case;
      Substrate still runs an RPC server on localhost even if the listening
      address is explicitly set to `None`, which is annoying (I suspect there
      is a reason for it, so I didn't want to change the default just yet).
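      
      A rough sketch of the pattern, with hypothetical names and fields (not the
      actual `sc-service` types): RPC options live in one struct, functions take
      only what they need, and an absent listen address genuinely disables the
      server.
      
      ```
      use std::net::SocketAddr;
      
      #[derive(Clone, Debug)]
      pub struct RpcConfiguration {
          /// `None` means "do not start an RPC server at all".
          pub listen_addr: Option<SocketAddr>,
          pub max_connections: u32,
          pub rate_limit_per_minute: Option<u32>,
      }
      
      /// Takes the specific options it needs rather than a whole `&Configuration`.
      pub fn maybe_start_rpc_server(rpc: &RpcConfiguration) -> Result<(), String> {
          let Some(addr) = rpc.listen_addr else {
              // RPC is genuinely optional here; nothing gets bound.
              return Ok(());
          };
          println!(
              "starting RPC server on {addr} (max connections: {})",
              rpc.max_connections
          );
          // ... bind the socket and spawn the server here ...
          Ok(())
      }
      
      fn main() {
          let rpc = RpcConfiguration {
              listen_addr: None,
              max_connections: 100,
              rate_limit_per_minute: None,
          };
          maybe_start_rpc_server(&rpc).unwrap();
      }
      ```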
      
      While this is a breaking change, most developers will not notice it if
      they use higher-level APIs.
      
      Fixes https://github.com/paritytech/polkadot-sdk/issues/2897
      
      ---------
      
      Co-authored-by: Niklas Adolfsson <niklasadolfsson1@gmail.com>
    • [3 / 5] Move crypto checks in the approval-distribution (#4928) · 6b854acc
      Alexandru Gheorghe authored
      
      # Prerequisite 
      This is part of the work to further optimize the approval subsystems. If
      you want to understand the full context, start by reading
      https://github.com/paritytech/polkadot-sdk/pull/4849#issue-2364261568.
      
      # Description
      This PR contains changes so that the crypto checks are performed by the
      approval-distribution subsystem instead of the approval-voting one. The
      benefit is twofold:
      1. Approval-distribution won't have to wait every single time for
      approval-voting to finish its job, so the work gets pipelined between
      approval-distribution and approval-voting (see the sketch below).
      
      2. By running multiple instances of approval-distribution in parallel,
      as described in
      https://github.com/paritytech/polkadot-sdk/pull/4849#issue-2364261568,
      this significant body of work gets to run in parallel.
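      
      A generic, self-contained sketch of the pipelining idea from point 1, using
      hypothetical stand-in code rather than the actual subsystem plumbing: one
      stage performs the (expensive) checks while the next stage consumes
      already-checked messages concurrently.
      
      ```
      use std::sync::mpsc;
      use std::thread;
      
      fn expensive_crypto_check(msg: u64) -> bool {
          // Stand-in for signature/VRF verification.
          msg % 2 == 0
      }
      
      fn main() {
          let (tx, rx) = mpsc::channel::<u64>();
      
          // Stage 1: check incoming messages and forward only the valid ones.
          let checker = thread::spawn(move || {
              for msg in 0..1_000u64 {
                  if expensive_crypto_check(msg) {
                      tx.send(msg).expect("receiver alive");
                  }
              }
          });
      
          // Stage 2: runs in parallel with the checks above instead of waiting
          // for all of them to finish.
          let processor = thread::spawn(move || {
              let mut processed = 0u64;
              while rx.recv().is_ok() {
                  processed += 1;
              }
              processed
          });
      
          checker.join().unwrap();
          println!("processed {} messages", processor.join().unwrap());
      }
      ```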
      
      ## Changes:
      1. When approval-voting sends `ApprovalDistributionMessage::NewBlocks`,
      it needs to pass the `core_index` and `candidate_hash` of the candidates.
      2. Approval-distribution needs to use `RuntimeInfo` to be able to fetch
      the `SessionInfo` from the runtime.
      3. Move the `approval-voting` logic that checks the VRF assignment into
      `approval-distribution`.
      4. Move the `approval-voting` logic that checks a vote is correctly
      signed into `approval-distribution`.
      5. Plumb the `approval-distribution` and `approval-voting` tests to
      support the new logic.
      
      ## Benefits
      Even without parallelisation the gains are significant. For example, on
      my machine, if we run the approval subsystem benchmark for 500 validators
      and 100 cores and trigger all 89 tranches of assignments and approvals,
      the system no longer falls behind because of late processing of messages.
      ```
      Before change
      Chain selection approved  after 11500 ms hash=0x0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a
      
      After change
      
      Chain selection approved  after 5500 ms hash=0x0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a
      ```
      
      ## TODO:
      - [x] Run on versi.
      - [x] Update parachain host documentation.
      
      ---------
      
      Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
  14. Aug 29, 2024
    • inclusion: bench `enact_candidate` weight (#5270) · ddd58c15
      ordian authored
      On top of #5082.
      
      ## Background
      
      Previously, before #3479, we would
      [include](https://github.com/paritytech/polkadot-sdk/blame/75074952/polkadot/runtime/parachains/src/builder.rs#L508C12-L508C44)
      the cost of enacting the candidate in the cost of processing a single
      bitfield.
      [Now](https://github.com/paritytech/polkadot-sdk/blame/dd48544a/polkadot/runtime/parachains/src/builder.rs#L529)
      it is different, although the benchmarks seem to be out of date.
      Including the cost of enacting a candidate in the cost of processing a
      single bitfield was incorrect, since we multiply that by the number of
      bitfields we have. Instead, we should separately calculate the cost of
      processing a single bitfield without enactment, and multiply the cost of
      enactment by the actual number of processed candidates (which is limited
      by the number of cores, not validators).
      
      ## Bench
      
      Previously, the weight of `enact_candidate` was calculated manually
      (without a benchmark) and then neglected:
      https://github.com/paritytech/polkadot-sdk/blob/dd48544a/polkadot/runtime/parachains/src/inclusion/mod.rs#L584
      
      In this PR, we have a benchmark for it, and it's based on the number of
      ump and sent hrmp messages as well as on whether the candidate has a
      runtime upgrade (`new_validation_code`).
      The differences from the previous attempt
      https://github.com/paritytech/polkadot/pull/6929 are that
      * we don't include the cost of enactment in the cost of processing a
      backed candidate.
      The reason is that enactment does not happen in the same block as
      backing (typically the next one), since we process bitfields before
      backing votes.
      * we don't take into account the size of the runtime upgrade; the
      benchmark weight doesn't seem to depend much on it, but rather on
      whether there was one or not.
      
      Similarly to the previous attempt, we don't account for dmp messages
      (fixed cost). We also don't properly account for received hrmp messages
      (`hrmp_watermark`), because their cost depends on the runtime state and
      can't be statically deduced in the benchmark (unless we pass the
      information about channels as benchmark u32 arguments).
      
      The total weight cost of processing a parainherent now includes the cost
      of enactment of each candidate, but we don't do filtering based on that
      (because we enact after processing bitfields and making other changes to
      the storage).
      
      ## Numbers
      
      ```
      Reads = 7 + (0 * u) + (3 * h) + (8 * c)
      Writes = 10 + (1 * u) + (3 * h) + (7 * c)
      ```
      In addition, there is a fixed cost of a few ms (!) per candidate.
      
      This might result in a full block slightly overflowing its weight with
      200 enacted candidates, which in turn could prevent non-mandatory
      transactions from being included in a block.
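      
      A quick illustrative helper reproducing the quoted formulas; reading `u`,
      `h` and `c` as runtime upgrades, hrmp messages and enacted candidates is
      an assumption here.
      
      ```
      // Database operations implied by the formulas above.
      fn enactment_db_ops(u: u64, h: u64, c: u64) -> (u64, u64) {
          let reads = 7 + 0 * u + 3 * h + 8 * c;
          let writes = 10 + 1 * u + 3 * h + 7 * c;
          (reads, writes)
      }
      
      fn main() {
          // E.g. one candidate with a runtime upgrade and 10 hrmp messages.
          let (reads, writes) = enactment_db_ops(1, 10, 1);
          println!("reads = {reads}, writes = {writes}"); // reads = 45, writes = 48
      }
      ```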
      
      Given our modest limits on max ump and hrmp messages:
      ```
        maxUpwardMessageNumPerCandidate: 16
        hrmpMaxMessageNumPerCandidate: 10
      ```
      and the fact that runtime upgrades can't happen very frequently
      (`validation_upgrade_cooldown`), we might only go over the limits in the
      case of many disputes.
      
      TODOs:
      - [x] Fix the overweight test
      - [x] Generate the weights for Westend and Rococo
      - [x] PRDoc
      
      ---------
      
      Co-authored-by: command-bot <>
      Co-authored-by: Alin Dima <alin@parity.io>
  15. Aug 28, 2024
    • rpc server: listen to `ipv6 socket` if available and... · 09254eb9
      Niklas Adolfsson authored
      rpc server: listen to `ipv6 socket` if available and `--experimental-rpc-endpoint` CLI option (#4792)
      
      Closes https://github.com/paritytech/polkadot-sdk/issues/3488 and
      https://github.com/paritytech/polkadot-sdk/issues/4331.
      
      This changes/adds the following:
      
      1. The default setting is that Substrate starts an RPC server that
      listens on localhost on both IPv4 and IPv6 on the same port. IPv6 is
      allowed to fail because some platforms may not support it.
      2. A new RPC CLI option `--experimental-rpc-endpoint` which allows
      configuring arbitrary listen addresses, including the port; if this is
      used, no other interfaces are enabled.
      3. If the local address is not found for any of the sockets, the server
      is not started and an error is thrown.
      4. Remove `deny_unsafe` from the RPC implementations; instead, this is
      an extension to allow different policies for different
      interfaces/sockets, such that one may enable unsafe methods on the local
      interface and only safe methods on the external interface.
      
      So, for instance, with this PR it's now possible to start up three RPC
      endpoints as follows:
      ```
      $ polkadot --experimental-rpc-endpoint "listen-addr=127.0.0.1:9944,rpc-methods=unsafe" --experimental-rpc-endpoint "listen-addr=0.0.0.0:9945,rpc-methods=safe,rate-limit=100" --experimental-rpc-endpoint "listen-addr=[::1]:9944,optional=true"
      ```
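      
      A rough, hypothetical sketch of parsing such comma-separated `key=value`
      endpoint specs (the real CLI code in the PR may differ):
      
      ```
      use std::collections::HashMap;
      use std::net::SocketAddr;
      
      #[derive(Debug)]
      struct RpcEndpoint {
          listen_addr: SocketAddr,
          rpc_methods: String,
          rate_limit: Option<u32>,
          optional: bool,
      }
      
      fn parse_endpoint(spec: &str) -> Result<RpcEndpoint, String> {
          let mut kv = HashMap::new();
          for pair in spec.split(',') {
              let (k, v) = pair
                  .split_once('=')
                  .ok_or_else(|| format!("invalid key=value pair: {pair}"))?;
              kv.insert(k.trim(), v.trim());
          }
          Ok(RpcEndpoint {
              listen_addr: kv
                  .get("listen-addr")
                  .ok_or("missing listen-addr")?
                  .parse()
                  .map_err(|e| format!("bad listen-addr: {e}"))?,
              rpc_methods: kv.get("rpc-methods").unwrap_or(&"safe").to_string(),
              rate_limit: kv
                  .get("rate-limit")
                  .map(|v| v.parse().map_err(|e| format!("bad rate-limit: {e}")))
                  .transpose()?,
              optional: kv.get("optional").map(|v| *v == "true").unwrap_or(false),
          })
      }
      
      fn main() {
          let ep = parse_endpoint("listen-addr=127.0.0.1:9944,rpc-methods=unsafe").unwrap();
          println!("{ep:?}");
      }
      ```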
      
      #### Needs to be addressed
      
      ~1. Support binding to a random port if it fails with the default
      settings, for backward-compatibility reasons~
      ~2. How to sync that the rpc CLI params and that the rpc-listen-addr
      align, hard to maintain...~
      ~3. Add similar warning prints for exposing unsafe methods on external
      interfaces..~
      ~4. Inline todos + the hacky String conversion from rpc params.~
      
      #### Cons with this PR
      
      Manual string parsing is more error-prone than relying on clap...
      
      //cc @jsdw @BulatSaif @PierreBesson @bkchr
      
      
      
      ---------
      
      Co-authored-by: Sebastian Kunert <skunert49@gmail.com>
    • Update approval-voting-regression-bench (#5504) · f0fd083e
      Alexandru Gheorghe authored
      
      The accepted divergence rate of 1/1000 is too strict and leads to false
      positives, especially after
      https://github.com/paritytech/polkadot-sdk/pull/4772 and
      https://github.com/paritytech/polkadot-sdk/pull/5042, so let's relax it
      to 1/100, since we do have some randomness in the system and there is no
      point in being that strict.
      
      Fixes: https://github.com/paritytech/polkadot-sdk/issues/5463
      
      ---------
      
      Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
  16. Aug 27, 2024
    • Adding stkd bootnodes (#5470) · 7a2c5375
      Frazz authored
      Opening this PR to add our bootnodes for the IBP. These nodes are
      located in Santiago, Chile; we own and manage the underlying hardware.
      If you need any more information, please let me know.
      
      
      Commands to test:
      
      ```
      ./polkadot --tmp --name "testing-bootnode" --chain kusama --reserved-only --reserved-nodes "/dns/kusama.bootnode.stkd.io/tcp/30633/wss/p2p/12D3KooWJHhnF64TXSmyxNkhPkXAHtYNRy86LuvGQu1LTi5vrJCL" --no-hardware-benchmarks
      
      ./polkadot --tmp --name "testing-bootnode" --chain paseo --reserved-only --reserved-nodes "/dns/paseo.bootnode.stkd.io/tcp/30633/wss/p2p/12D3KooWMdND5nwfCs5M2rfp5kyRo41BGDgD8V67rVRaB3acgZ53" --no-hardware-benchmarks
      
      ./polkadot --tmp --name "testing-bootnode" --chain polkadot --reserved-only --reserved-nodes "/dns/polkadot.bootnode.stkd.io/tcp/30633/wss/p2p/12D3KooWEymrFRHz6c17YP3FAyd8kXS5gMRLgkW4U77ZJD2ZNCLZ" --no-hardware-benchmarks
      
      ./polkadot --tmp --name "testing-bootnode" --chain westend --reserved-only --reserved-nodes "/dns/westend.bootnode.stkd.io/tcp/30633/wss/p2p/12D3KooWHaQKkJiTPqeNgqDcW7dfYgJxYwT8YqJMtTkueSu6378V" --no-hardware-benchmarks
      ```
  17. Aug 23, 2024
    • Remove the need to wait for target block header in warp sync implementation (#5431) · 6d819a61
      Nazar Mokrynskyi authored
      I'm not sure if this is exactly what
      https://github.com/paritytech/polkadot-sdk/issues/3537 meant, but I
      think it should be fine to wait for the relay chain before fully
      initializing the parachain node, which removes the need for a background
      task and extra hacks throughout the stack just to know where warp sync
      should start.
      
      Previously there were both `WarpSyncParams` and `WarpSyncConfig`, but
      there was no longer any point in having two data structures, so I
      simplified it to just `WarpSyncConfig`.
      
      Fixes https://github.com/paritytech/polkadot-sdk/issues/3537
  19. Aug 20, 2024
    • approval-distribution: Fix preallocation of ApprovalEntries (#5411) · f239abac
      Alexandru Gheorghe authored
      
      We preallocated the approvals field in the `ApprovalEntry` by up to a
      factor of two in the worst conditions, since we can't have more than 6
      approvals but `candidates.len()` will return 20 if you have just the
      20th bit set.
      This adds up to a lot of wasted memory, because we have an
      `ApprovalEntry` for each assignment we receive.
      
      This was discovered while running rust jemalloc-profiling with the steps
      from here: https://www.magiroux.com/rust-jemalloc-profiling/
      
      With just this optimisation, the approvals subsystem-benchmark memory
      usage in the worst-case scenario is reduced from 6.1 GiB to 2.4 GiB, and
      even the CPU usage of approval-distribution decreases by 4-5%.
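      
      A generic illustration of the pattern behind the fix, using stand-in types
      rather than the actual polkadot code: preallocate based on the number of
      set bits in the candidate bitfield, not on the position of the highest set
      bit.
      
      ```
      fn main() {
          // Pretend this is the candidate bitfield: only the 20th bit is set.
          let candidates: u32 = 1 << 19;
      
          // Sizing by the highest set bit reserves 20 slots for a single candidate.
          let highest_bit = 32 - candidates.leading_zeros() as usize;
          let oversized: Vec<Option<[u8; 64]>> = Vec::with_capacity(highest_bit);
      
          // Sizing by the number of set bits wastes nothing.
          let right_sized: Vec<Option<[u8; 64]>> =
              Vec::with_capacity(candidates.count_ones() as usize);
      
          println!("capacity by highest bit: {}", oversized.capacity());
          println!("capacity by set bits:    {}", right_sized.capacity());
      }
      ```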
      
      ---------
      
      Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
  22. Aug 14, 2024
    • Fix OurViewChange small race (#5356) · 05a8ba66
      Alexandru Gheorghe authored
      
      Always queue the OurViewChange event before we send view changes to our
      peers, because otherwise we risk a peer sending us a message that gets
      processed by our subsystems before OurViewChange.
      
      Normally, this is not really a problem, because the latency of the
      ViewChange we send to our peers is way higher than that of our
      subsystems processing OurViewChange. However, on testnets like versi,
      where CPU is sometimes overcommitted, this race gets triggered
      occasionally, so let's fix it by sending the messages in the right
      order.
      
      ---------
      
      Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
  23. Aug 12, 2024
    • `polkadot-node-core-pvf-common`: Fix test compilation error (#5310) · 8e8dc618
      Alexander Theißen authored
      This crate only uses `tempfile` on Linux but includes it unconditionally
      in its `Cargo.toml`. It also sets `#![deny(unused_crate_dependencies)]`.
      This leads to a hard error on anything that is not Linux.
      
      This PR fixes this error. I am wondering why CI didn't catch that.
      Shouldn't the test at least be compiled (but not run) on macOS?
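      
      A minimal sketch of the failure mode (hypothetical crate layout, not the
      actual code): a dependency referenced only behind `cfg(target_os =
      "linux")` trips the crate-level lint on every other platform, unless the
      dependency itself is declared per-target in `Cargo.toml` (e.g. under
      `[target.'cfg(target_os = "linux")'.dependencies]`).
      
      ```
      #![deny(unused_crate_dependencies)]
      
      #[cfg(target_os = "linux")]
      fn scratch_dir() -> std::path::PathBuf {
          // Only Linux builds actually reference `tempfile`...
          tempfile::tempdir().expect("can create temp dir").into_path()
      }
      
      #[cfg(not(target_os = "linux"))]
      fn scratch_dir() -> std::path::PathBuf {
          // ...so on macOS/Windows the lint flags `tempfile` as an unused
          // dependency if it is listed unconditionally in Cargo.toml.
          std::env::temp_dir()
      }
      
      fn main() {
          println!("scratch dir: {}", scratch_dir().display());
      }
      ```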
    • fix av-distribution Jaeger spans mem leak (#5321) · fc906d5d
      Alin Dima authored
      Fixes https://github.com/paritytech/polkadot-sdk/issues/5258
    • prospective-parachains rework: take II (#4937) · 0b52a2c1
      Alin Dima authored
      Resolves https://github.com/paritytech/polkadot-sdk/issues/4800
      
      # Problem
      In https://github.com/paritytech/polkadot-sdk/pull/4035, we removed
      support for parachain forks and cycles and added support for backing
      unconnected candidates (candidates for which we don't yet know the full
      path to the latest included block), which is useful for elastic scaling
      (parachains using multiple cores).
      
      Removing support for backing forks turned out to be a bad idea, as there
      are legitimate cases for a parachain to fork (if it uses another
      consensus mechanism, for example BABE or PoW). This leads to validators
      getting lower backing rewards (depending on whether they back the
      winning fork or not) and higher pressure on only half of the backing
      group (during availability-distribution, for example). Since we don't
      yet have approval-voting rewards, backing rewards are a pretty big deal
      (which may change in the future).
      
      # Description
      
      A backing group is now allowed to back forks. Once a candidate becomes
      backed (has the minimum backing votes), we don't accept new forks unless
      they adhere to the new fork selection rule (have a lower candidate
      hash).
      This helps keep the implementation simpler, since forks will only be
      taken into account for candidates which are not backed yet (only
      seconded).
      Having this fork selection rule also helps reduce the work backing
      validators need to do, since they have a shared way of picking the
      winning fork. Once they see a candidate backed, they can all decide to
      back a fork and not accept new ones.
      But they still accept new ones during the seconding phase (until the
      backing quorum is reached).
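      
      A minimal sketch of a "lower candidate hash wins" fork selection rule,
      with hypothetical types (the real `FragmentChain` logic is more
      involved):
      
      ```
      type CandidateHash = [u8; 32];
      
      /// Returns true if `new` should be preferred over `current` as the fork to keep.
      fn prefer_new_fork(current: &CandidateHash, new: &CandidateHash) -> bool {
          // Byte arrays compare lexicographically, so this picks the lower hash.
          new < current
      }
      
      fn main() {
          let current = [0xaa; 32];
          let contender = [0x11; 32];
          assert!(prefer_new_fork(&current, &contender));
          println!("the fork with the lower candidate hash wins");
      }
      ```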
      
      Therefore, a block author which is not part of the backing group will
      likely not even see the forks (only the winning one).
      
      Just as before, a parachain producing forks will still not be able to
      leverage elastic scaling but will still work with a single core. Also,
      cycles are still not accepted.
      
      ## Some implementation details
      
      `CandidateStorage` is no longer a subsystem-wide construct. It was
      previously holding candidates from all relay chain forks and complicated
      the code. Each fragment chain now holds its own candidate chain and its
      potential candidates. This should not increase the storage consumption,
      since the heavy candidate data is already wrapped in an `Arc` and shared.
      It does, however, allow for great simplifications and increased
      readability.
      
      `FragmentChain`s are now only creating a chain with backed candidates
      and the fork selection rule. As said before, `FragmentChain`s are now
      also responsible for maintaining their own potential candidate storage.
      
      Since we no longer have the subsystem-wide `CandidateStorage`, when
      getting a new leaf update, we use the storage of our latest ancestor,
      which may contain candidates seconded/backed that are still in scope.
      
      When a candidate is backed, the fragment chains which hold it are
      recreated (due to the fork selection rule, it could trigger a "reorg" of
      the fragment chain).
      
      I generally tried to simplify the subsystem and not introduce
      unnecessary optimisations that would otherwise complicate the code and
      not gain us much (fragment chains wouldn't realistically ever hold many
      candidates).
      
      TODO:
      - [x] update metrics
      - [x] update docs and comments
      - [x] fix and add unit tests
      - [x] tested with fork-producing parachain
      - [x] tested with cycle-producing parachain
      - [x] versi test
      - [x] prdoc
  24. Aug 09, 2024
    • Move PVF code and PoV decompression to PVF host workers (#5142) · 47c1b4cd
      s0me0ne-unkn0wn authored
      Closes #5071 
      
      This PR aims to
      * Move all the blocking decompression from the candidate validation
      subsystem to the PVF host workers;
      * Run the candidate validation subsystem on the non-blocking pool again.
      
      Upsides: no blocking operations in the subsystem's main loop. PVF
      throughput is not limited by the ability of the subsystem to decompress
      a lot of stuff. Correctness and homogeneity improve, as artifacts used
      to be identified by the hash of the decompressed code, and now they are
      identified by the hash of the compressed code, which coincides with the
      on-chain `ValidationCodeHash`.
      
      Downsides: the PVF code decompression is now accounted for in the PVF
      preparation timeout (be it pre-checking or actual preparation). Taking
      into account that the decompression duration is on the order of
      milliseconds, and the preparation timeout is on the order of seconds, I
      believe it is negligible.