- Jan 11, 2023
-
-
Marcin S. authored
* pvf: Add checks for result sender when retrying preparation in tests * pvf: Fix missing execution request when retrying preparation * Update comment
-
- Jan 10, 2023
-
-
Marcin S. authored
* Minor fixes * Fix compile errors
-
Marcin S. authored
* Replace async-std with tokio in PVF subsystem * Rework workers to use `select!` instead of a mutex The improvement in code readability is more important than the thread overhead. * Remove unnecessary `fuse` * Add explanation for `expect()` * Update node/core/pvf/src/worker_common.rs Co-authored-by: Bastian Köcher <[email protected]> * Update node/core/pvf/src/worker_common.rs Co-authored-by: Bastian Köcher <[email protected]> * Address some review comments * Shutdown tokio runtime * Run cargo fmt * Add a small note about retries * Fix up merge * Rework `cpu_time_monitor_loop` to return when other thread finishes * Add error string to PrepareError::IoErr variant * Log when artifacts fail to prepare * Fix `cpu_time_monitor_loop`; fix test * Fix text * Fix a couple of potential minor data races. First data race was due to logging in the CPU monitor thread even if the job (other thread) finished. It can technically finish before or after the log. Maybe best would be to move this log to the `select!`s, where we are guaranteed to have chosen the timed-out branch, although there would be a bit of duplication. Also, it was possible for this thread to complete before we executed `finished_tx.send` in the other thread, which would trigger an error as the receiver has already been dropped. And right now, such a spurious error from `send` would be returned even if the job otherwise succeeded. * Update Cargo.lock Co-authored-by: Bastian Köcher <[email protected]>
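The second data race described above (a `send` on `finished_tx` failing because the receiver was already dropped) can be sketched with the standard library alone. This is a minimal illustration under assumptions, not the PVF worker code: `run_job` is a hypothetical helper, and `std::sync::mpsc` stands in for whatever channel the worker actually uses. Only the name `finished_tx` comes from the commit message.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: runs `job` on its own thread and signals completion
/// over `finished_tx`. A failed `send` only means the monitoring side has
/// already gone away, so it is ignored instead of overriding the job result.
fn run_job<T, F>(job: F, finished_tx: mpsc::Sender<()>) -> T
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let handle = thread::spawn(move || {
        let result = job();
        // The receiver may already be dropped; that is not a job failure.
        let _ = finished_tx.send(());
        result
    });
    handle.join().expect("job thread panicked")
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // Simulate the monitor thread finishing first and dropping its receiver.
    drop(rx);
    println!("job result: {}", run_job(|| 42, tx));
}
```

With the send error discarded, a job that otherwise succeeded still reports success even when the other side finished first.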
-
- Jan 04, 2023
-
-
Marcin S. authored
-
- Dec 20, 2022
-
-
Marcin S. authored
* PVF preparation: do not conflate errors + Adds some more granularity to the prepare errors. + Better distinguish whether errors occur on the host side or the worker. + Do not kill the worker if the error happened on the host side. + Do not retry preparation if the error was `Panic`. + Removes unnecessary indirection with `Selected` type. * Add missing docs, resolve TODOs * Address review comments and remove TODOs * Fix error in CI * Undo unnecessary change * Update couple of comments * Don't return error for stream shutdown * Update node/core/pvf/src/worker_common.rs
-
- Dec 06, 2022
-
-
Marcin S. authored
* Let the PVF host kill the worker on timeout * Fix comment * Fix inaccurate comments; add missing return statement * Fix a comment * Fix comment
-
- Nov 30, 2022
-
-
Marcin S. authored
* Put in skeleton logic for CPU-time-preparation Still needed: - Flesh out logic - Refactor some spots - Tests * Continue filling in logic for prepare worker CPU time changes * Fix compiler errors * Update lenience factor * Fix some clippy lints for PVF module * Fix compilation errors * Address some review comments * Add logging * Add another log * Address some review comments; change Mutex to AtomicBool * Refactor handling response bytes * Add CPU clock timeout logic for execute jobs * Properly handle AtomicBool flag * Use `Ordering::Relaxed` * Refactor thread coordination logic * Fix bug * Add some timing information to execute tests * Add section about the mitigation to the IG * minor: Change more `Ordering`s to `Relaxed` * candidate-validation: Fix build errors
-
alexgparity authored
* Add clippy config and remove .cargo from gitignore * first fixes * Clippyfied * Add clippy CI job * comment out rusty-cachier * minor * fix ci * remove DAG from check-dependent-project * add DAG to clippy Co-authored-by: alvicsam <[email protected]>
-
- Nov 28, 2022
-
-
Marcin S. authored
-
- Nov 23, 2022
-
-
Marcin S. authored
* Add PVF module documentation TODO (once the PRs land): - [ ] Document executor parametrization. - [ ] Document CPU time measurement of timeouts. * Update node/core/pvf/src/lib.rs Co-authored-by: Andrei Sandu <[email protected]> * Clarify meaning of PVF acronym * Move PVF doc to implementer's guide * Clean up implementer's guide a bit * Add page for PVF types * pvf: Better separation between crate docs and implementer's guide * ci: Add "prevalidating" to the dictionary * ig: Remove types/chain.md The types contained therein did not exist and the file was not referenced anywhere. Co-authored-by: Andrei Sandu <[email protected]>
-
- Nov 08, 2022
-
-
Marcin S. authored
* Fix a couple of typos * Retry failed PVF execution PVF execution that fails due to AmbiguousWorkerDeath should be retried once. This should reduce the occurrence of failures due to transient conditions. Closes #6195 * Address a couple of nits * Write tests; refactor (add `validate_candidate_with_retry`) * Update node/core/candidate-validation/src/lib.rs Co-authored-by: Andronik <[email protected]> Co-authored-by: eskimor <[email protected]> Co-authored-by: Andronik <[email protected]>
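The retry-once rule can be sketched as follows. This is a simplified illustration under assumptions: `ValidationError`, the `u32` result, and `validate_with_retry` are hypothetical stand-ins, not the signature of `validate_candidate_with_retry` in the candidate-validation subsystem. Only the variant name `AmbiguousWorkerDeath` is taken from the commit message.

```rust
/// Hypothetical error type for illustration only.
#[derive(Debug, PartialEq)]
enum ValidationError {
    /// The worker died and the cause could not be determined.
    AmbiguousWorkerDeath,
    /// Any other failure; never retried in this sketch.
    Invalid(String),
}

/// Calls `validate` and retries exactly once if the first attempt failed
/// with `AmbiguousWorkerDeath`. Every other outcome is returned as-is.
fn validate_with_retry<F>(mut validate: F) -> Result<u32, ValidationError>
where
    F: FnMut() -> Result<u32, ValidationError>,
{
    match validate() {
        Err(ValidationError::AmbiguousWorkerDeath) => validate(),
        other => other,
    }
}

fn main() {
    let mut calls = 0;
    let result = validate_with_retry(|| {
        calls += 1;
        if calls == 1 {
            Err(ValidationError::AmbiguousWorkerDeath)
        } else {
            Ok(7)
        }
    });
    println!("result = {:?} after {} calls", result, calls);
}
```

A second `AmbiguousWorkerDeath` is returned to the caller rather than retried again, which keeps the retry bounded.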
-
Marcin S. authored
-
- Nov 01, 2022
-
-
Marcin S. authored
* Rename timeout consts and timeout parameter; bump leniency * Update implementor's guide with info about PVFs * Make glossary a bit easier to read * Add a note to LENIENT_PREPARATION_TIMEOUT * Remove PVF-specific section from glossary * Fix some typos
-
- Oct 26, 2022
-
-
Marcin S. authored
* Log exit status code for workers * Make log for execute job conclusion match prepare job conclusion Trace log for conclusion of prepare job:
```rs
gum::debug!(
    target: LOG_TARGET,
    validation_code_hash = ?artifact_id.code_hash,
    ?worker,
    ?rip,
    "prepare worker concluded",
);
```
Co-authored-by: parity-processbot <>
-
- Oct 13, 2022
-
-
Marcin S. authored
* Add some documentation * Add `compilation_timeout` parameter for PVF preparation job * Update buckets in prometheus metrics * Update prepare/queue tests * Update pvf-prechecking overview in implementer docs * Fix some CI checks
-
- Sep 15, 2022
-
-
ordian authored
-
- Jun 24, 2022
-
-
Sergei Shulepov authored
* pvf: ensure enough stack space * fix typos Co-authored-by: Andronik <[email protected]> * Use rayon to cache the thread Co-authored-by: Andronik <[email protected]>
-
- May 30, 2022
-
-
Gavin Wood authored
* Fix warnings * Bump
-
- May 19, 2022
-
-
Koute authored
Switch to pooling copy-on-write instantiation strategy for WASM (companion for Substrate#11232) (#5337) * Switch to pooling copy-on-write instantiation strategy for WASM * Fix compilation of `polkadot-test-service` * Update comments * Move `max_memory_size` to `Semantics` * Rename `WasmInstantiationStrategy` to `WasmtimeInstantiationStrategy` * Update a safety comment * update lockfile for {"substrate"} Co-authored-by: parity-processbot <>
-
- Apr 09, 2022
-
-
Sergei Shulepov authored
The PVF host is designed to avoid spawning tasks in order to minimize knowledge of outer code. Using `async_std::task::spawn` (or Tokio's counterpart) was deemed unacceptable, and `SpawnNamed` undesirable. Instead, only one task is returned, and it is spawned by the candidate-validation subsystem. The tasks from the sub-components are polled by that root task. However, the way the tasks were bundled was incorrect. There was a giant select that was polling those tasks. In particular, that implies that as soon as one of the arms of that select awaits, those sub-tasks stop getting polled. This is a recipe for a deadlock, which indeed happened here. Specifically, the deadlock happened while sending messages to the execute queue by calling [`send_execute`](https://github.com/paritytech/polkadot/blob/a68d9be35656dcd96e378fd9dd3d613af754d48a/node/core/pvf/src/host.rs#L601). When the channel to the queue reaches capacity, the control flow is suspended until the queue handles those messages. Since this code is essentially reached from [one of the select arms](https://github.com/paritytech/polkadot/blob/a68d9be35656dcd96e378fd9dd3d613af754d48a/node/core/pvf/src/host.rs#L371), the queue is never given control and thus no further progress can be made. This problem is solved by bundling the tasks one level higher instead, by `selecting` over those long-running tasks. We also stop treating returns from those long-running tasks as error conditions, since that can happen during a legitimate shutdown.
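The shape of the problem (a producer blocked on a full bounded channel whose consumer never gets to run) can be illustrated with a thread-based analogy. This is only an analogy under stated assumptions: the actual fix concerns how async tasks are polled, whereas the sketch below uses OS threads and `std::sync::mpsc::sync_channel`, and `send_all` is a hypothetical function. The point it shows is that the consumer must make progress independently of the producer.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Sends the numbers 1..=n through a bounded channel and returns their sum
/// as computed by the consumer. The consumer runs on its own thread, so it
/// keeps draining the channel while the producer is blocked on a full buffer.
/// If the consumer only ran after the producer loop finished, the producer
/// would block forever once `capacity` messages were queued.
fn send_all(n: u32, capacity: usize) -> u32 {
    let (tx, rx) = sync_channel::<u32>(capacity);
    // Analogous to driving the queue at the top level instead of inside
    // the same select arm that sends to it.
    let consumer = thread::spawn(move || rx.iter().sum::<u32>());
    for i in 1..=n {
        tx.send(i).expect("consumer is alive");
    }
    // Dropping the sender ends the consumer's iterator.
    drop(tx);
    consumer.join().expect("consumer panicked")
}

fn main() {
    println!("sum = {}", send_all(10, 1));
}
```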
-
- Mar 24, 2022
-
-
Koute authored
* Rename to BagError * Additional parameter for 'revert' command * Set aux revert param to None * Align to changes in how the WASM executor is configured in `substrate` * update lockfile for {"substrate"} * update lockfile for {"substrate"} * Update substrate * Update substrate Co-authored-by: Keith Yeung <[email protected]> Co-authored-by: Davide Galassi <[email protected]> Co-authored-by: Shawn Tabrizi <[email protected]> Co-authored-by: parity-processbot <>
-
- Mar 15, 2022
-
-
Bernhard Schuster authored
* add some gum * bump expander * gum * fix all remaining issues * last fixup * Update node/gum/proc-macro/src/lib.rs Co-authored-by: Bastian Köcher <[email protected]> * change * network * fixins * chore * allow optional fmt str + args, prep for expr as kv field * tracing -> gum rename fallout * restrict further * allow multiple levels of field accesses * another round of docs and a slip of the pen * update ADR * fixup lock file * use target: instead of target= * minors * fix * chore * Update node/gum/README.md Co-authored-by: Andrei Sandu <[email protected]> Co-authored-by: Bastian Köcher <[email protected]> Co-authored-by: Andrei Sandu <[email protected]>
-
- Jan 31, 2022
-
-
sandreim authored
Signed-off-by: Andrei Sandu <[email protected]>
-
- Dec 24, 2021
-
-
cheme authored
* merge master (do not compile) * fix * lock * update lock * Update to refactoring. * runtime version * fmt * remove trie patch * remove patch * No layout alias for bridge proof. * update depupdate depss * No switch until migration. * master lock * test * test * Revert "test" This reverts commit 57325ef73332bf4b054aa4a667bb716fcf8a0d89. * Revert "test" This reverts commit ce74d0e2062806f72c0e9e9ca07b14165f43521e. * rename feature * state version as parameter, use the feature only on runtimes. * update * update to state version in runtime * state version from storage * update lockfile for substrate Co-authored-by: parity-processbot <>
-
- Dec 14, 2021
-
-
Sergey Pepyakin authored
We wanted to change niceness to accommodate the fact that some of the preparation tasks are low priority. For example, when a node sees that a new para was onboarded, it may start preparing right away, even though all other activities are more important, such as network I/O, validation of the backed candidates, and preparation of the immediately needed PVFs. However, it turned out that this approach does not work: generally, non-root processes can only lower their priority (raise the nice value) and cannot restore it to the previous value, as was assumed by the code. Apart from that, https://github.com/paritytech/polkadot/pull/4123 assumes all PVFs are prepared in the same way. Specifically, if a PVF preparation failed before, then PVF pre-checking will also report that it failed, even though the preparation may have failed only because it ran at low priority. In order to avoid such cases, we decided to simplify the whole preparation model. Preparation under low priority does not work well with that. Closes https://github.com/paritytech/polkadot/issues/4520
-
Koute authored
* Align PVF executor to changes in Substrate * Update to the newest `substrate`
-
- Dec 10, 2021
-
-
Pierre Krieger authored
* Companion PR for removing Prometheus metrics prefix * Was missing some metrics * Fix missing renames * Fix test * Fixes * Update test * Update Substrate * Second time * remove prefix from integration test for zombienet * update zombienet image * Update Substrate Co-authored-by: Bastian Köcher <[email protected]> Co-authored-by: Javier Viola <[email protected]>
-
- Dec 09, 2021
-
-
Chris Sosnin authored
-
- Dec 06, 2021
-
-
Bernhard Schuster authored
-
- Dec 01, 2021
-
-
antonio-dropulic authored
407bf44a8a add missing license header (#1204) 9babb19810 Custom relay strategy (#1198) c287872a11 fix clippy things (#1200) 3a40e62789 Expose some const value and type (#1186) 32b61476d1 increase sleep before connectingMillau (#1195) aabe7041fa revert messages transactions mortality (#1194) 3651f4f909 Message transactions mortality (#1191) 364d6e155d Bump dependencies (#1180) f0389acc08 cargo +nightly fmt --all (#1192) b270b6a016 Unify error enums in substrate and ethereum clients with `thiserror` (#1094) 58c4946f74 Limit max call size of Rialto/Millau runtimes (#1187) fd56a8cd56 Add UI to the deployment (#1047) 16f01dc736 Westend -> Millau alerts are pending before notifications are sent (#1184) 5628c11ece replace collective flip with babe randomness in Rialto (#1188) 1094a63b00 ignore another (pretty bad) RUSTSEC (#1185) 379fe323ea fix/ignore cargo deny issues (#1183) 92af5e6e64 additional log in finality relay + rephrase "failed" (#1182) b996a3b681 Rialto parachain in test deployments (#1178) 28d9332b44 Resubmit transactions strategy for Polkadot/Kusama (#1175) d0172c6847 Playing with CI (#1179) fb6f42456d fix checks order when registering parachain (#1177) ee828c005a Register-parachain subcommand of substrate-relay (#1170) 8cd2b1a112 Token swap pallet benchmarks (#1174) bb811accb1 fix collision with westend bridge (#1172) 8d2fba70ed add token swaps to test deployments (#1169) b6d1bdfe2c publish rialto parachain collator image (#1171) 834ae4a10a Fix OutboundLaneData types (#1159) 5ee0ea1626 copypasted -> copied (#1168) c3bb835f18 fix spelling (#1167) f90d041dc9 Upgrade `jsonrpsee` to v0.3 (#1051) 598c9b6d0d add some basic tests for swap tokens (#1164) 05e88c61f5 publish images when tag of specific format(e.g. 
v2021-09-27 + v2021-09-27-1) is published (#1166) 7f3f94a6e0 Fix CI again (#1165) ff37de332f Move calculation relayer reward into `MessageDeliveryAndDispatchPayment` (#1153) 36fbba839b fix clippy warning (#1163) 16da44d018 explicit wasm build (#1158) c9c8226449 Match substrate's fmt (#1148) 2fdd7f3e5e Fix/ignore clippy warnings (#1157) 43dfcc2686 Adding LookupAddress (#1156) 951eaa5582 Add rialto-parachain runtime and node (#1142) 803d266d61 Rename MessageId -> BridgeMessageId (#1152) 5f234484fc Box large arguments of GRANDPA pallet (#1154) cf9abc1011 Fix spelling (#1150) ab83ba2e58 Relay subcommand that performs token RLT <> MLAU token swap (#1141) 832536caf0 Polkadot <> Kusama relayers (#1122) 6d0daa8975 Add `OnMessageAccepted` callback (#1134) 5d03a20b3e Integrate token swap pallet into Millau runtime (#1099) ea4cfa833e Adding MultiAddress type and ValidationCodeHash (#1139) c20325a784 Add tests for `Raw` and `BridgeSendMessage` enum `Call` variants (#1125) 6d802416e2 increase pause before pining Rialto nodes (#1137) b54fa56b62 calculate fee using full message payload (#1132) ca5d8178f5 Add parachain pallets to rialto runtime (#1053) 9eaae4142e fix transaction resubmitter limits for Millau -> Rialto transactions (#1135) 9d4e17783c add --mandatory-headers-only cli option to complex relay (#1129) 1c5e0ec1cb Add local CI info to README (#1131) a8e0929e14 chore: spellchecker fixes (#1130) 3b8e2118e3 set fee for importing mandatory headers to zero (#1127) 49bba9aa52 another bunch of words for spellchecker (#1128) 8a72eafef6 Increase pause before messages generation start (#1126) 1f0ba9a191 Move some associated types from relay_substrate_client::Chain to bp_runtime::Chain (#1087) 74bc1a5b54 Transactions resubmitter (#1083) 21ba001f26 log max balance drop when sending message (#1117) 638a7ddffa Code Cleaning (#1124) be6555c51b Fix buildah logout (#1120) 87539c4a98 Format code work (#1116) 526fe7fdd7 fix spelling (#1119) bd4ce7f241 Fix spelling (#1118) 3c1147858e added 
missing constants to Kusama/Polkadot primitives (#1114) 52093b22ab Fix delivery transaction estimation used by rational relayer (#1109) 77a2f2fbed Remove fund account checks from upgrade. (#1111) 824334802b Rename param and update comment (#1108) d7784bfe06 Fix spellcheck (#1110) 0b18f5906a Refactor substrate messages source and substrate messages target (#1105) b27240bbff fix compilation (#1107) 9697da4fe8 Emit mortal transactions from relay (#1073) b29396c077 Change vault vars type to env vars (#1084) 35e0bbdc0c Make clippy mandatory. (#1103) a517e8541f Remove unused deps (#1102) 873dae608a Remove unnessary deps (#1101) 13450b74ee Stored conversion rate updater (#1005) 74389829f3 [BREAKING] Migrate messages pallet to frame v2 (#1088) 424da938dd README fix (#1100) 865744c909 upgrade currency exchange pallet to frame v2 (#1097) b5038148b3 Add missing docs (#1095) 0791e911c1 Common crate for substrate-relay (#1082) 3834c9d880 Update high-level-overview.md (#1093) c93553face Increase the time window for messaging alerts. (#1092) 8b9cc3cecd migrate pallet-shift-session-manager to frame v2 (#1090) dc91813c22 migrate eth PoA pallet to frame v2 (#1091) f16bb098cc Migrate dispatch pallet to frame v2 (#1089) 19f4325348 Bridge/This Chain Ids should be exposed as constants on pallet level. (#1085) 6381122df7 Change ChainSpec::from_genesis for Rialto and Millau chains to reflect the chain names. 
(#1079) 0f1d33e973 Make CI happy again (#1086) 238e65d96f fix typo (#1080) fc008457b6 Token-swap-over-bridge pallet (#944) 3fb97fa5ef Fix full spellcheck (#1076) eae4ed7170 fixed wrong trace (#1075) 219a0fad04 merge two weight-related loops in messages pallet (#1071) fc85632fdb increase_message_fee depends on stored mesage size (#1066) 530f37a23b companion for https://github.com/paritytech/polkadot/pull/3507 (#1067) 53b8cba683 sc_basic_authorship=trace for millau nodes (#1074) 9874e05e98 Improve traces of message generator scripts (#1069) 7b5ee84fbb extract message_details impl into runtime common (#1070) 5a4aed5a8b refund weight for mot pruning messages (#1062) 90e3d1e111 Fix Westend -> Millau sync (#1064) 427d30ddfc When restarting client, also "restart" tokio runtime (#1065) d47c05eeef Change get pipeline sensitive variables from Vault instead of GitLab settings (#1063) d775a85415 use tokio reactor to execute jsonrpsee futures (#1061) 15c8cd61cb Use BABE to author blocks on Rialto (previously: Aura) (#1050) 5186293500 Allow reading suri && password override from file (#1059) b506298262 Update jsonrpsee reference (#1049) 1734d00517 enable weight fee adjustent in Rialto/Millau (#1044) 607265afae Pay dispatch fee at target chain cli option (#1043) ce79ef91be bump dependencies before start referencing polkadot repo (#1048) 924fa24f6d Cli option for greedy relayer + run no-losses relayer by default (#1042) e21eba7b59 Yrong README Fixup + M1 Fixes (#1045) 20d08204a2 Confirm delivery detects when more than expected messages are confirmed (#1039) 994b846b52 pre and post dispatch weights of OnDeliveryConfirmed callback (#1040) 1dd5297e84 give real value to Rialto and Millau tokens (#1038) 035bee8715 Use real conversion rate in greedy relayer strategy (#1035) 9cfaecd0f7 fixed metrics prefix (#1037) 1d8d224937 Use kebab-case for bridge arguments (#1036) f30a4c79a6 Shared reference to conversion rate metric value (#1034) c34d7a5cbb estimate transaction fee (#1015) 
93404b18bb change alert period from 2m to 10m for Westend -> Millau (GRANDPA or public node itself is lagging sometimes) (#1032) git-subtree-dir: bridges git-subtree-split: 407bf44a8a5f4e60aceef2dc755cd9ff09929ac3
-
- Nov 29, 2021
-
-
Sergey Pepyakin authored
Closes https://github.com/paritytech/polkadot/issues/4293 This PR changes how we treat a certain subset of PVF preparation errors. Specifically, now only the deterministic errors are treated as invalid candidates. That is, the errors that are easily attributable to either the PVF contents or the wasmtime code, but not e.g. I/O errors that could be triggered by the OS (insufficient memory, disk failure, too much load, etc.). The latter are treated as internal errors and thus do not trigger disputes.
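The distinction between deterministic and non-deterministic preparation errors can be sketched as a simple classification. All names here (`PrepareFailure`, `Verdict`, `classify`) are hypothetical and chosen for illustration; they are not the crate's actual `PrepareError` variants, and the real mapping may differ.

```rust
/// Hypothetical failure kinds; not the real `PrepareError` enum.
#[derive(Debug, Clone, PartialEq)]
enum PrepareFailure {
    /// The PVF contents were rejected (deterministic).
    InvalidCode(String),
    /// The compiler failed on this code (deterministic).
    CompilerFailure(String),
    /// The OS got in the way: out of memory, disk failure, overload.
    Io(String),
}

/// How the failure is reported further up in this sketch.
#[derive(Debug, PartialEq)]
enum Verdict {
    /// Attributable to the candidate; may lead to a dispute.
    InvalidCandidate,
    /// Local trouble on this node; must not trigger a dispute.
    InternalError,
}

/// Only deterministic failures count against the candidate.
fn classify(err: &PrepareFailure) -> Verdict {
    match err {
        PrepareFailure::InvalidCode(_) | PrepareFailure::CompilerFailure(_) => {
            Verdict::InvalidCandidate
        }
        PrepareFailure::Io(_) => Verdict::InternalError,
    }
}

fn main() {
    let err = PrepareFailure::Io("disk full".to_string());
    println!("{:?} -> {:?}", err, classify(&err));
}
```

Matching exhaustively on the enum means a newly added failure kind forces an explicit decision about which side it falls on.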
-
- Nov 27, 2021
-
-
Sergey Pepyakin authored
* Do not log PVF pruning every hour This lowers the level of the PVF pruning. Closes https://github.com/paritytech/polkadot/issues/4361 * Fix typo: ambigious -> ambiguous The correct spelling is ambiguous ([dictionary](https://dictionary.cambridge.org/dictionary/english/ambiguous))
-
- Nov 26, 2021
-
-
Sergey Pepyakin authored
This lowers the level of the PVF pruning. Closes https://github.com/paritytech/polkadot/issues/4361
-
- Nov 24, 2021
-
-
Sergey Pepyakin authored
-
- Nov 18, 2021
-
-
Sergey Pepyakin authored
* prepare worker: Catch unexpected unwinds * Use more specific wording for unknown panic payload
-
- Nov 15, 2021
-
-
Sergey Pepyakin authored
* Increase preparation-timeout to 60 seconds * Adapt `pvf_preparation_time` metric to the new value
-
- Nov 13, 2021
-
-
Chris Sosnin authored
* pvf host: store only compiled artifacts on disk * Correctly handle failed artifacts * Serialize result of PVF preparation uniquely * Set the artifact state depending on the result * Return the result of PVF preparation directly * Move PrepareError to the error module * Update doc comments * Update misleading comment * pvf host: turn off parallel compilation * pvf host: implement precheck requests * Fix warnings * Unnecessary clone * Add a note about timed out outcome * Revert the pool outcome handling behavior * Move the prepare result type into error mod * Test prepare done * fmt * Add an explanation to wasmtime config * Split pvf host test * Add precheck to dictionary Co-authored-by: Sergei Shulepov <[email protected]>
-
Sergey Pepyakin authored
* Limit the number of PVF workers In particular, limit the number of preparation workers to 1 (soft & hard) and limit the number of execution workers to 2. The reason why we are doing this is that it seems many workers launched at the same time can cause problems. I.e. if there are more than 2 preparation workers, the time for preparation rises significantly to the point of reaching the timeout. This was mostly observed with parallel_compilation=true, so each worker used `numcpu` threads and now we are looking to flip that parameter to `false`. That said, we want to err on the safe side here and gradually enable it later if our measurements show that we can do that safely. * Adjust the test to accommodate the changed config value
-
- Nov 12, 2021
-
-
Sergey Pepyakin authored
-
Arkadiy Paronyan authored
* Remove light client companion * Update substrate * cargo fmt * Fixed benches * fmt
-