  1. Feb 12, 2025
  2. Feb 10, 2025
  3. Jan 30, 2025
    • malus-collator: implement malicious collator submitting same collation to all... · 48f69cca
      Stephane Gurgenidze authored
      malus-collator: implement malicious collator submitting same collation to all backing groups (#6924)
      
      ## Issues
      - [[#5049] Elastic scaling: zombienet
      tests](https://github.com/paritytech/polkadot-sdk/issues/5049)
      - [[#4526] Add zombienet tests for malicious
      collators](https://github.com/paritytech/polkadot-sdk/issues/4526)
      
      ## Description
      Modified the undying collator to include a malus mode, in which it
      submits the same collation to all assigned backing groups.
      
      ## TODO
      * [X] Implement malicious collator that submits the same collation to
      all backing groups;
      * [X] Avoid the core index check in the collation generation subsystem:
      https://github.com/paritytech/polkadot-sdk/blob/master/polkadot/node/collation-generation/src/lib.rs#L552-L553;
      * [X] Resolve the mismatch between the descriptor and the commitments
      core index: https://github.com/paritytech/polkadot-sdk/pull/7104
      * [X] Implement `duplicate_collations` test with zombienet-sdk;
      * [X] Add PRdoc.
  4. Jan 28, 2025
    • cumulus: bump PARENT_SEARCH_DEPTH and add test for 12-core elastic scaling (#6983) · e6aad5b0
      Alin Dima authored
      On top of https://github.com/paritytech/polkadot-sdk/pull/6757
      
      Fixes https://github.com/paritytech/polkadot-sdk/issues/6858 by bumping
      the `PARENT_SEARCH_DEPTH` constant to a larger value (30) and adds a
      zombienet-sdk test that exercises the 12-core scenario.
      
      This is a node-side limit that restricts the number of allowed pending
      availability candidates when choosing the parent parablock during
      authoring.
      This limit is rather redundant, as the parachain runtime already
      restricts the unincluded segment length to the configured value in the
      [FixedVelocityConsensusHook](https://github.com/paritytech/polkadot-sdk/blob/88d900af/cumulus/pallets/aura-ext/src/consensus_hook.rs#L35)
      (which ideally should be equal to this `PARENT_SEARCH_DEPTH`).
      
      For 12 cores, a value of 24 should be enough, but I bumped it to 30 to
      have some extra buffer.
      
      There are two other potential ways of fixing this:
      - remove t...
  5. Jan 21, 2025
  6. Jan 07, 2025
  7. Dec 18, 2024
  8. Dec 13, 2024
    • Collation fetching fairness (#4880) · 5153e2b5
      Tsvetomir Dimitrov authored
      Related to https://github.com/paritytech/polkadot-sdk/issues/1797
      
      # The problem
      When fetching collations on the validator side of the collator protocol,
      we need to ensure that each parachain gets a fair share of core time
      according to its assignments in the claim queue. This means that the
      number of collations fetched per parachain should ideally equal (and
      must never exceed) the number of claims that parachain holds in the
      claim queue.
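
      The fairness rule above can be illustrated with a minimal sketch. It
      is not the code from this PR: `ParaId` is a hypothetical stand-in
      type, the claim queue is modelled as a plain slice, and the names
      `claims_per_para` and `can_fetch` are invented for this example.

      ```rust
      use std::collections::HashMap;

      /// Hypothetical stand-in for a parachain id (assumption, not the real type).
      type ParaId = u32;

      /// Count how many claims each parachain holds in the claim queue.
      fn claims_per_para(claim_queue: &[ParaId]) -> HashMap<ParaId, usize> {
          let mut counts: HashMap<ParaId, usize> = HashMap::new();
          for &para in claim_queue {
              *counts.entry(para).or_insert(0) += 1;
          }
          counts
      }

      /// Returns true if one more collation may be fetched for `para`
      /// without exceeding the number of claims it has in the claim queue.
      fn can_fetch(
          claim_queue: &[ParaId],
          fetched: &HashMap<ParaId, usize>,
          para: ParaId,
      ) -> bool {
          let claims = claims_per_para(claim_queue).get(&para).copied().unwrap_or(0);
          let already_fetched = fetched.get(&para).copied().unwrap_or(0);
          already_fetched < claims
      }
      ```

      With a claim queue of `[1000, 2000, 1000]` and two collations already
      fetched for para 1000, `can_fetch` rejects a third fetch for para 1000
      but still allows one for para 2000, so an aggressive parachain cannot
      take the other's share.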
      
      # Why the current implementation is not good enough
      The current implementation doesn't guarantee such fairness. For each
      relay parent there is a `waiting_queue` (PerRelayParent -> Collations ->
      waiting_queue) which holds any unfetched collations advertised to the
      validator. The collations are fetched on a first-in, first-out basis,
      which means that if two parachains share a core and one of them is
      more aggressive, it might starve the second parachain. How?
      At each relay parent up to `max_candidate_depth` candidates ...
  9. Dec 05, 2024
  10. Dec 04, 2024
  11. Nov 07, 2024
  12. Nov 04, 2024
  13. Oct 24, 2024
  14. Oct 22, 2024
    • Fix TrustedQueryApi Error (#6170) · 356386b5
      Serban Iorga authored
      Related to https://github.com/paritytech/polkadot-sdk/issues/6161
      
      This seems to fix the `JavaScript heap out of memory` error encountered
      in the bridge zombienet tests lately.
      
      This is just a partial fix, since we also need to address
      https://github.com/paritytech/polkadot-sdk/issues/6133 in order to fully
      fix the bridge zombienet tests
  15. Oct 21, 2024
    • runtime: remove ttl (#5461) · ee803b74
      Alin Dima authored
      
      Resolves https://github.com/paritytech/polkadot-sdk/issues/4776
      
      This will enable proper core-sharing between paras, even if one of them
      is not producing blocks.
      
      TODO:
      - [x] duplicate first entry in the claim queue if the queue used to be
      empty
      - [x] don't back anything if at the end of the block there'll be a
      session change
      - [x] write migration for removing the availability core storage
      - [x] update and write unit tests
      - [x] prdoc
      - [x] add zombienet test for synchronous backing
      - [x] add zombienet test for core-sharing paras where one of them is not
      producing any blocks
      
      _Important note:_
      The `ttl` and `max_availability_timeouts` fields of the
      HostConfiguration are not removed in this PR, due to #64.
      Adding the workaround with the storage version check for every use of
      the active HostConfiguration in all runtime APIs would be insane, as
      it's used in almost all runtime APIs.
      
      So even though the ttl and max_availability_timeouts fields will now be
      unused, they will remain part of the host configuration.
      
      These will be removed in a separate PR once #64 is fixed. Tracked by
      https://github.com/paritytech/polkadot-sdk/issues/6067
      
      ---------
      
      Signed-off-by: Andrei Sandu <andrei-mihail@parity.io>
      Co-authored-by: Andrei Sandu <andrei-mihail@parity.io>
      Co-authored-by: Andrei Sandu <54316454+sandreim@users.noreply.github.com>
      Co-authored-by: command-bot <>
    • fix js oom `js-scripts` (#6139) · dbaa428c
      Javier Viola authored
      
      Fix `oom` failures (`FATAL ERROR: Ineffective mark-compacts near heap
      limit Allocation failed - JavaScript heap out of memory`), like:
      
      https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7602589
      https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7602594
      
      ---------
      
      Co-authored-by: Bastian Köcher <git@kchr.de>
    • Fix and re-enable `zombienet-substrate-0002-validators-warp-sync` (#6154) · 9f515e02
      Serban Iorga authored
      Closes https://github.com/paritytech/polkadot-sdk/issues/5974
      
      Fixed as per
      https://github.com/paritytech/polkadot-sdk/issues/5974#issuecomment-2426463359
    • bump zombienet version `v1.3.116` (#6155) · 73a51fd9
      Javier Viola authored
      Bump zombienet version; includes fixes for `ci` (mostly timeouts for
      k8s).
  16. Oct 16, 2024
  17. Oct 15, 2024
  18. Oct 11, 2024
  19. Oct 10, 2024
  20. Oct 09, 2024
  21. Oct 08, 2024
  22. Oct 05, 2024
  23. Oct 04, 2024
  24. Oct 03, 2024
  25. Oct 02, 2024
  26. Sep 27, 2024
  27. Sep 26, 2024
    • bump zombienet version `v1.3.110` (#5834) · 17243e03
      Javier Viola authored
      Bump `zombienet` version to prevent failures being reported at the teardown phase.
    • [ci] Disable cargo-hfuzz, disable cargo-doc (#5843) · 7626a9d6
      Alexander Samusev authored
      Changes in PR:
      - disabled cargo-hfuzz until [the
      issue](https://github.com/paritytech/polkadot-sdk/issues/5812) is fixed.
      - enabled condition to skip jobs when no rust files are changed
    • [5 / 5] Introduce approval-voting-parallel (#4849) · b16237ad
      Alexandru Gheorghe authored
      This is the implementation of the approach described here:
      https://github.com/paritytech/polkadot-sdk/issues/1617#issuecomment-2150321612
      &
      https://github.com/paritytech/polkadot-sdk/issues/1617#issuecomment-2154357547
      &
      https://github.com/paritytech/polkadot-sdk/issues/1617#issuecomment-2154721395.
      
      ## Description of changes
      
      The end goal is an architecture with a single subsystem
      (`approval-voting-parallel`) and multiple worker types that fulfill
      the work currently done by the `approval-distribution` and
      `approval-voting` subsystems. The main loop of the new subsystem only
      distributes work to the workers.
      
      The new subsystem will have:
      - N approval-distribution workers: these do the work currently done
      by the approval-distribution subsystem and, in addition, perform the
      crypto checks that an assignment is valid and that a vote is correctly
      signed. Work is assigned via the formula
      `worker_index = msg.validator % WORKER_COUNT`, which guarantees that
      all assignments and approvals from the same validator reach the same
      worker.
      - 1 approval-voting worker: this receives an already validated message
      and does everything approval-voting currently does, except the crypto
      checks, which have already moved to the approval-distribution
      workers.
      
      On the hot path of processing messages **no** synchronisation and
      waiting is needed between approval-distribution and approval-voting
      workers.
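
      The routing formula can be sketched as follows. This is an
      illustration only, not the subsystem's code: `ValidatorIndex` is a
      hypothetical stand-in type, the `WORKER_COUNT` value of 4 is assumed,
      and messages are modelled as `(validator, payload)` pairs.

      ```rust
      /// Hypothetical stand-in for a validator index (assumption, not the real type).
      type ValidatorIndex = u32;

      /// Number of approval-distribution workers (assumed value for illustration).
      const WORKER_COUNT: u32 = 4;

      /// Pick the worker for a message based on the validator that produced it.
      fn worker_index(validator: ValidatorIndex) -> usize {
          (validator % WORKER_COUNT) as usize
      }

      /// Split a batch of (validator, payload) messages into per-worker queues,
      /// preserving the arrival order within each queue.
      fn dispatch<'a>(
          messages: &[(ValidatorIndex, &'a str)],
      ) -> Vec<Vec<(ValidatorIndex, &'a str)>> {
          let mut queues = vec![Vec::new(); WORKER_COUNT as usize];
          for &(validator, payload) in messages {
              queues[worker_index(validator)].push((validator, payload));
          }
          queues
      }
      ```

      Because the worker is a pure function of the validator index, every
      message from a given validator lands in the same queue and keeps its
      relative order there, which is why the workers need no coordination
      with each other on the hot path.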
      
      <img width="1431" alt="Screenshot 2024-06-07 at 11 28 08"
      src="https://github.com/paritytech/polkadot-sdk/assets/49718502/a196199b-b705-4140-87d4-c6900ba8595e">
      
      
      
      ## Guidelines for reading
      
      The full implementation is split into 5 PRs. All of them are
      self-contained and improve things incrementally, even without the
      parallelisation being implemented or enabled. This approach was taken
      instead of a big-bang PR to make things easier to review and to
      reduce the risk of breaking these critical subsystems.
      
      After reading the full description of this PR, the changes should be
      read in the following order:
      1. https://github.com/paritytech/polkadot-sdk/pull/4848, some other
      micro-optimizations for networks with a high number of validators. This
      change gives us a speed up by itself without any other changes.
      2. https://github.com/paritytech/polkadot-sdk/pull/4845 , this contains
      only interface changes to decouple the subsystem from the `Context` and
      be able to run multiple instances of the subsystem on different threads.
      **No functional changes**
      3. https://github.com/paritytech/polkadot-sdk/pull/4928, moving the
      crypto checks from approval-voting into approval-distribution, so that
      approval-distribution no longer has any reason to wait for
      approval-voting. This change gives us a speed up by itself without any
      other changes.
      4. https://github.com/paritytech/polkadot-sdk/pull/4846, interface
      changes to make approval-voting runnable on a separate thread. **No
      functional changes**
      5. This PR, where we instantiate an `approval-voting-parallel` subsystem
      that runs the logic currently in `approval-distribution` and
      `approval-voting` on different workers.
      6. The next step, after these changes are merged and deployed, would be
      to bring all the files from approval-distribution, approval-voting and
      approval-voting-parallel into a single Rust crate, to make the
      structure easier to maintain and understand.
      
      ## Results
      Running subsystem-benchmarks with 1000 validators and 100 fully occupied
      cores, triggering all assignments and approvals for all tranches.
      
      #### Approval does not lag behind
      Master
      ```
      Chain selection approved  after 72500 ms hash=0x0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a
      ```
      With this PoC
      ```
      Chain selection approved  after 3500 ms hash=0x0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a
      ```
      
      #### Gathering enough assignments
       
      Enough assignments are gathered in less than 500 ms, which gives us a
      guarantee that unnecessary work does not get triggered. On master, in
      the same benchmark, that number goes above 32 seconds because the
      subsystems fall behind on work.
       
      <img width="2240" alt="Screenshot 2024-06-20 at 15 48 22"
      src="https://github.com/paritytech/polkadot-sdk/assets/49718502/d2f2b29c-5ff6-44b4-a245-5b37ab8e58bc">
      
      
      #### CPU usage
      Master
      ```
      CPU usage, seconds                     total   per block
      approval-distribution                96.9436      9.6944
      approval-voting                     117.4676     11.7468
      test-environment                     44.0092      4.4009
      ```
      With this PoC
      ```
      CPU usage, seconds                     total   per block
      approval-distribution                 0.0014      0.0001 --- unused
      approval-voting                       0.0437      0.0044 --- unused
      approval-voting-parallel              5.9560      0.5956
      approval-voting-parallel-0           22.9073      2.2907
      approval-voting-parallel-1           23.0417      2.3042
      approval-voting-parallel-2           22.0445      2.2045
      approval-voting-parallel-3           22.7234      2.2723
      approval-voting-parallel-4           21.9788      2.1979
      approval-voting-parallel-5           23.0601      2.3060
      approval-voting-parallel-6           22.4805      2.2481
      approval-voting-parallel-7           21.8330      2.1833
      approval-voting-parallel-db          37.1954      3.7195 --- the approval-voting thread.
      ```
      
      # Enablement strategy
      
      Because only some trivial plumbing is needed in approval-distribution
      and approval-voting to run things in parallel, and because these
      subsystems play a critical part in the system, this PR proposes that
      we keep both ways of running the approval work: as separate
      subsystems, or as a single subsystem (`approval-voting-parallel`) that
      has multiple workers for the distribution work and one worker for the
      approval-voting work. A command-line flag switches between them.
      
      The benefit of this is twofold:
      1. With the same polkadot binary we can easily switch just a few
      validators to the parallel approach and gradually make it the
      default way of running, if no issues arise.
      2. In the worst-case scenario, where it becomes the default way of
      running things but we discover critical issues with it, we have a
      path to quickly disable it by asking validators to adjust their
      command-line flags.
      
      
      # Next steps
      - [x] Make sure through various tests that we are not missing anything
      - [x] Polish the implementations to make them production ready
      - [x] Add unit tests for approval-voting-parallel.
      - [x] Define and implement the strategy for rolling out this change, so
      that the blast radius is minimal (a single validator) in case there are
      problems with the implementation.
      - [x] Versi long-running tests.
      - [x] Add relevant metrics.
      
      @ordian @eskimor @sandreim @AndreiEres, let me know what you think.
      
      ---------
      
      Signed-off-by: Alexandru Gheorghe <alexandru.gheorghe@parity.io>
  28. Sep 25, 2024
    • MBM `try-runtime` support (#4251) · cc6a5130
      Liam Aharon authored
      
      # MBM try-runtime support
      
      This MR adds try-runtime support for multi-block migrations (MBM), so
      that the try-runtime-CLI will be able to support MBM testing
      [here](https://github.com/paritytech/try-runtime-cli/pull/90). It mainly
      adds two feature-gated hooks to the `SteppedMigration` trait to
      facilitate testing. These hooks are named `pre_upgrade` and
      `post_upgrade` and have the same signature and implications as for
      single-block migrations.
      
      ## Integration
      
      To make use of this in your Multi-Block-Migration, just implement the
      two new hooks and test pre- and post-conditions in them:
      
      ```rust
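      #[cfg(feature = "try-runtime")]
      fn pre_upgrade() -> Result<Vec<u8>, frame_support::sp_runtime::TryRuntimeError> {
          // Test pre-conditions here; the returned bytes are passed to `post_upgrade`.
          // ...
      }
      
      #[cfg(feature = "try-runtime")]
      fn post_upgrade(prev: Vec<u8>) -> Result<(), frame_support::sp_runtime::TryRuntimeError> {
          // Test post-conditions here, using `prev` from `pre_upgrade`.
          // ...
      }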
      ```
      
      You may return an error or panic in these functions to indicate failure.
      This will then show up in the try-runtime-CLI and can be used in CI for
      testing.
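
      To show how the two hooks fit together, here is a self-contained
      sketch that compiles without `frame_support`. Everything in it is
      hypothetical: `TryRuntimeError` is replaced by a string alias,
      `DoubleValues` and `migrate` are invented, and the hooks take `&self`
      only because this example has no runtime storage to read from.

      ```rust
      /// Hypothetical stand-in for `frame_support::sp_runtime::TryRuntimeError`.
      type TryRuntimeError = &'static str;

      /// Hypothetical migration that doubles every stored value.
      struct DoubleValues {
          values: Vec<u32>,
      }

      impl DoubleValues {
          /// Snapshot the item count before the migration, encoded as bytes.
          fn pre_upgrade(&self) -> Result<Vec<u8>, TryRuntimeError> {
              Ok((self.values.len() as u32).to_le_bytes().to_vec())
          }

          /// The migration itself (a single step here, for brevity).
          fn migrate(&mut self) {
              for v in &mut self.values {
                  *v *= 2;
              }
          }

          /// Check that the migration kept the same number of items.
          fn post_upgrade(&self, prev: Vec<u8>) -> Result<(), TryRuntimeError> {
              if prev.len() != 4 {
                  return Err("malformed pre-upgrade state");
              }
              let expected = u32::from_le_bytes([prev[0], prev[1], prev[2], prev[3]]) as usize;
              if self.values.len() != expected {
                  return Err("item count changed during migration");
              }
              Ok(())
          }
      }
      ```

      The bytes returned by `pre_upgrade` are the only channel between the
      two hooks, so anything the post-condition needs to compare against
      has to be encoded into that `Vec<u8>`.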
      
      Changes:
      - Adds `try-runtime` gated methods `pre_upgrade` and `post_upgrade` on
      `SteppedMigration`
      - Adds `try-runtime` gated methods `nth_pre_upgrade` and
      `nth_post_upgrade` on `SteppedMigrations`
      - Modifies `pallet_migrations` implementation to run pre_upgrade and
      post_upgrade steps at the appropriate times, and panic in the event of
      migration failure.
      
      ---------
      
      Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
      Signed-off-by: georgepisaltu <george.pisaltu@parity.io>
      Co-authored-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
      Co-authored-by: claravanstaden <claravanstaden64@gmail.com>
      Co-authored-by: ggwpez <ggwpez@users.noreply.github.com>
      Co-authored-by: georgepisaltu <george.pisaltu@parity.io>
  29. Sep 22, 2024
    • Moved presets to the testnet runtimes (#5327) · 8735c663
      Branislav Kontur authored
      
      It is a first step for switching to the `frame-omni-bencher` for CI.
      
      This PR includes several changes related to generating chain specs plus:
      
      - [x] pallet `assigned_slots` fix missing `#[serde(skip)]` for phantom
      - [x] pallet `paras_inherent` benchmark fix - cherry-picked from
      https://github.com/paritytech/polkadot-sdk/pull/5688
      - [x] migrates `get_preset` to the relevant runtimes
      - [x] fixes Rococo genesis presets - does not work
      https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7317249
      - [x] fixes Rococo benchmarks for CI 
      - [x] migrate westend genesis
      - [x] remove wococo stuff
      
      Closes: https://github.com/paritytech/polkadot-sdk/issues/5680
      
      ## Follow-ups
      - Fix for frame-omni-bencher
      https://github.com/paritytech/polkadot-sdk/pull/5655
      - Enable new short-benchmarking CI -
      https://github.com/paritytech/polkadot-sdk/pull/5706
      - Remove gitlab pipelines for short benchmarking
      - refactor all Cumulus runtimes to use `get_preset` -
      https://github.com/paritytech/polkadot-sdk/issues/5704
      - https://github.com/paritytech/polkadot-sdk/issues/5705
      - https://github.com/paritytech/polkadot-sdk/issues/5700
      - [ ] Backport to the stable
      
      ---------
      
      Co-authored-by: command-bot <>
      Co-authored-by: ordian <noreply@reusable.software>
  30. Sep 18, 2024