From 29474f98937dfaecdcb1f71ff91491839c4e9e05 Mon Sep 17 00:00:00 2001 From: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> Date: Wed, 25 May 2022 05:47:21 +0200 Subject: [PATCH] Document benchmarking CLI (#11246) * Decrese default repeats Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> * Add benchmarking READMEs Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> * Update docs Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> * Update docs Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> * Update README Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io> * Review fixes Co-authored-by: Shawn Tabrizi <shawntabrizi@gmail.com> Co-authored-by: parity-processbot <> Co-authored-by: Shawn Tabrizi <shawntabrizi@gmail.com> --- substrate/frame/benchmarking/README.md | 14 +- .../utils/frame/benchmarking-cli/README.md | 47 +++++- .../benchmarking-cli/src/block/README.md | 118 +++++++++++++++ .../benchmarking-cli/src/machine/README.md | 71 +++++++++ .../benchmarking-cli/src/overhead/README.md | 136 ++++++++++++++++++ .../benchmarking-cli/src/overhead/bench.rs | 4 +- .../benchmarking-cli/src/pallet/README.md | 3 + .../benchmarking-cli/src/shared/README.md | 15 ++ .../benchmarking-cli/src/storage/README.md | 105 ++++++++++++++ 9 files changed, 505 insertions(+), 8 deletions(-) create mode 100644 substrate/utils/frame/benchmarking-cli/src/block/README.md create mode 100644 substrate/utils/frame/benchmarking-cli/src/machine/README.md create mode 100644 substrate/utils/frame/benchmarking-cli/src/overhead/README.md create mode 100644 substrate/utils/frame/benchmarking-cli/src/pallet/README.md create mode 100644 substrate/utils/frame/benchmarking-cli/src/shared/README.md create mode 100644 substrate/utils/frame/benchmarking-cli/src/storage/README.md diff --git a/substrate/frame/benchmarking/README.md b/substrate/frame/benchmarking/README.md index 38c683cb8db..f0fe05cc140 100644 --- 
a/substrate/frame/benchmarking/README.md +++ b/substrate/frame/benchmarking/README.md @@ -43,7 +43,7 @@ The benchmarking framework comes with the following tools: * [A set of macros](./src/lib.rs) (`benchmarks!`, `add_benchmark!`, etc...) to make it easy to write, test, and add runtime benchmarks. * [A set of linear regression analysis functions](./src/analysis.rs) for processing benchmark data. -* [A CLI extension](../../utils/frame/benchmarking-cli/) to make it easy to execute benchmarks on your +* [A CLI extension](../../utils/frame/benchmarking-cli/README.md) to make it easy to execute benchmarks on your node. The end-to-end benchmarking pipeline is disabled by default when compiling a node. If you want to @@ -150,9 +150,13 @@ feature flag: ```bash cd bin/node/cli -cargo build --release --features runtime-benchmarks +cargo build --profile=production --features runtime-benchmarks ``` +The production profile applies various compiler optimizations. +These optimizations slow down the compilation process *a lot*. +If you are just testing things out and don't need final numbers, don't include `--profile=production`. + ## Running Benchmarks Finally, once you have a node binary with benchmarks enabled, you need to execute your various @@ -161,13 +165,13 @@ benchmarks. You can get a list of the available benchmarks by running: ```bash -./target/release/substrate benchmark --chain dev --pallet "*" --extrinsic "*" --repeat 0 +./target/production/substrate benchmark pallet --chain dev --pallet "*" --extrinsic "*" --repeat 0 ``` Then you can run a benchmark like so: ```bash -./target/release/substrate benchmark \ +./target/production/substrate benchmark pallet \ --chain dev \ # Configurable Chain Spec --execution=wasm \ # Always test with Wasm --wasm-execution=compiled \ # Always used `wasm-time` @@ -200,7 +204,7 @@ used for joining all the arguments passed to the CLI. 
To get a full list of available options when running benchmarks, run: ```bash -./target/release/substrate benchmark --help +./target/production/substrate benchmark --help ``` License: Apache-2.0 diff --git a/substrate/utils/frame/benchmarking-cli/README.md b/substrate/utils/frame/benchmarking-cli/README.md index 9718db58b37..e6a48b61fd2 100644 --- a/substrate/utils/frame/benchmarking-cli/README.md +++ b/substrate/utils/frame/benchmarking-cli/README.md @@ -1 +1,46 @@ -License: Apache-2.0 \ No newline at end of file +# The Benchmarking CLI + +This crate contains commands to benchmark various aspects of Substrate and the hardware. +All commands are exposed by the Substrate node but can be exposed by any Substrate client. +The goal is to have a comprehensive suite of benchmarks that covers all aspects of Substrate and the hardware that it is running on. + +Invoking the root benchmark command prints a help menu: +```sh +$ cargo run --profile=production -- benchmark + +Sub-commands concerned with benchmarking. + +USAGE: + substrate benchmark <SUBCOMMAND> + +OPTIONS: + -h, --help Print help information + -V, --version Print version information + +SUBCOMMANDS: + block Benchmark the execution time of historic blocks + machine Command to benchmark the hardware. + overhead Benchmark the execution overhead per-block and per-extrinsic + pallet Benchmark the extrinsic weight of FRAME Pallets + storage Benchmark the storage speed of a chain snapshot +``` + +All examples use the `production` profile for correctness, which makes the compilation *very* slow; for testing you can use `--release`. +For the final results the `production` profile and reference hardware should be used, otherwise the results are not comparable. 
+ +The sub-commands are explained in depth here: +- [block] Compares the weight of a historic block to its actual resource usage +- [machine] Gauges the speed of the hardware +- [overhead] Creates weight files for the *Block*- and *Extrinsic*-base weights +- [pallet] Creates weight files for a Pallet +- [storage] Creates weight files for *Read* and *Write* storage operations + +License: Apache-2.0 + +<!-- LINKS --> + +[pallet]: ../../../frame/benchmarking/README.md +[machine]: src/machine/README.md +[storage]: src/storage/README.md +[overhead]: src/overhead/README.md +[block]: src/block/README.md diff --git a/substrate/utils/frame/benchmarking-cli/src/block/README.md b/substrate/utils/frame/benchmarking-cli/src/block/README.md new file mode 100644 index 00000000000..7e99f0df9d4 --- /dev/null +++ b/substrate/utils/frame/benchmarking-cli/src/block/README.md @@ -0,0 +1,118 @@ +# The `benchmark block` command + +The whole benchmarking process in Substrate aims to predict the resource usage of an unexecuted block. +This command measures how accurate this prediction was by executing a block and comparing the predicted weight to its actual resource usage. +It can be used to measure the accuracy of the pallet benchmarking. + +The following sections explain the command once for Polkadot and once for Substrate. + +## Polkadot # 1 +<sup>(Also works for Kusama, Westend and Rococo)</sup> + + +Suppose you either have a synced Polkadot node or downloaded a snapshot from [Polkachu]. +This example uses a pruned ParityDB snapshot from 2022-04-19 with the last block being 9939462. +For pruned snapshots you need to know the number of the last block (to be improved [here]). +Pruned snapshots normally store the last 256 blocks; archive nodes can use any block range. 
+ +In this example we will benchmark just the last 10 blocks: +```sh +cargo run --profile=production -- benchmark block --from 9939453 --to 9939462 --db paritydb +``` + +Output: +```pre +Block 9939453 with 2 tx used 4.57% of its weight ( 26,458,801 of 579,047,053 ns) +Block 9939454 with 3 tx used 4.80% of its weight ( 28,335,826 of 590,414,831 ns) +Block 9939455 with 2 tx used 4.76% of its weight ( 27,889,567 of 586,484,595 ns) +Block 9939456 with 2 tx used 4.65% of its weight ( 27,101,306 of 582,789,723 ns) +Block 9939457 with 2 tx used 4.62% of its weight ( 26,908,882 of 582,789,723 ns) +Block 9939458 with 2 tx used 4.78% of its weight ( 28,211,440 of 590,179,467 ns) +Block 9939459 with 4 tx used 4.78% of its weight ( 27,866,077 of 583,260,451 ns) +Block 9939460 with 3 tx used 4.72% of its weight ( 27,845,836 of 590,462,629 ns) +Block 9939461 with 2 tx used 4.58% of its weight ( 26,685,119 of 582,789,723 ns) +Block 9939462 with 2 tx used 4.60% of its weight ( 26,840,938 of 583,697,101 ns) +``` + +### Output Interpretation + +<sup>(Only results from reference hardware are relevant)</sup> + +Each block is executed multiple times and the results are averaged. +The percent number is the interesting part and indicates how much weight was used as compared to how much was predicted. +The closer to 100% this is without exceeding 100%, the better. +If it exceeds 100%, the block is marked with "**OVER WEIGHT!**" to make it easier to spot. This is bad, since it means that the benchmarking under-estimated the weight. +This would mean that an honest validator would possibly not be able to keep up with importing blocks since users did not pay for enough weight. +If that happens the validator could lag behind the chain and get slashed for missing deadlines. +It is therefore important to investigate any overweight blocks. + +In this example you can see an unexpected result: less than 5% of the weight was used! +The measured blocks can be executed much faster than predicted. 
+This means that the benchmarking process massively over-estimated the execution time. +Since the predictions are off by so much, this is tracked as an issue in [polkadot#5192]. + +The ideal range for these results would be 85-100%. + +## Polkadot # 2 + +Let's take a more interesting example where the blocks use more of their predicted weight. +Every day when validators pay out rewards, the blocks are nearly full. +Using an archive node is the easiest option here. + +The Polkadot blocks TODO-TODO for example contain large batch transactions for staking payout. + +```sh +cargo run --profile=production -- benchmark block --from TODO --to TODO --db paritydb +``` + +```pre +TODO +``` + +## Substrate + +It is also possible to try the procedure in Substrate, although it's a bit boring. + +First you need to create some blocks with either a local or dev chain. +This example will use the standard development spec. +Pick a non-existent directory where the chain data will be stored, e.g. `/tmp/dev`. +```sh +cargo run --profile=production -- --dev -d /tmp/dev +``` +You should see after some seconds that it started to produce blocks: +```pre +… +✨ Imported #1 (0x801d…9189) +… +``` +You can now kill the node with `Ctrl+C`. Then measure how long it takes to execute these blocks: +```sh +cargo run --profile=production -- benchmark block --from 1 --to 1 --dev -d /tmp/dev --pruning archive +``` +This will benchmark the first block. If you killed the node at a later point, you can measure multiple blocks. +```pre +Block 1 with 1 tx used 72.04% of its weight ( 4,945,664 of 6,864,702 ns) +``` + +In this example the block used ~72% of its weight. +The benchmarking therefore over-estimated the effort to execute the block. +Since this block is empty, it's not very interesting. + +## Arguments + +- `--from` Number of the first block to measure (inclusive). +- `--to` Number of the last block to measure (inclusive). +- `--repeat` How often each block should be measured. 
+- [`--db`] +- [`--pruning`] + +License: Apache-2.0 + +<!-- LINKS --> + +[Polkachu]: https://polkachu.com/snapshots +[here]: https://github.com/paritytech/substrate/issues/11141 +[polkadot#5192]: https://github.com/paritytech/polkadot/issues/5192 + +[`--db`]: ../shared/README.md#arguments +[`--pruning`]: ../shared/README.md#arguments diff --git a/substrate/utils/frame/benchmarking-cli/src/machine/README.md b/substrate/utils/frame/benchmarking-cli/src/machine/README.md new file mode 100644 index 00000000000..f22a8ea54b8 --- /dev/null +++ b/substrate/utils/frame/benchmarking-cli/src/machine/README.md @@ -0,0 +1,71 @@ +# The `benchmark machine` command + +Different Substrate chains can have different hardware requirements. +It is therefore important to be able to quickly gauge whether a piece of hardware fits a chain's requirements. +The `benchmark machine` command achieves this by measuring key metrics and making them comparable. + +Invoking the command looks like this: +```sh +cargo run --profile=production -- benchmark machine --dev +``` + +## Output + +The output on reference hardware: + +```pre ++----------+----------------+---------------+--------------+-------------------+ +| Category | Function | Score | Minimum | Result | ++----------+----------------+---------------+--------------+-------------------+ +| CPU | BLAKE2-256 | 1023.00 MiB/s | 1.00 GiB/s | ✅ Pass ( 99.4 %) | ++----------+----------------+---------------+--------------+-------------------+ +| CPU | SR25519-Verify | 665.13 KiB/s | 666.00 KiB/s | ✅ Pass ( 99.9 %) | ++----------+----------------+---------------+--------------+-------------------+ +| Memory | Copy | 14.39 GiB/s | 14.32 GiB/s | ✅ Pass (100.4 %) | ++----------+----------------+---------------+--------------+-------------------+ +| Disk | Seq Write | 457.00 MiB/s | 450.00 MiB/s | ✅ Pass (101.6 %) | ++----------+----------------+---------------+--------------+-------------------+ +| Disk | Rnd Write | 190.00 MiB/s | 200.00 MiB/s | ✅ Pass ( 
95.0 %) | ++----------+----------------+---------------+--------------+-------------------+ +``` + +The *score* is the average result of each benchmark. It always adheres to "higher is better". + +The *category* indicates which part of the hardware was benchmarked: +- **CPU** Processor intensive task +- **Memory** RAM intensive task +- **Disk** Hard drive intensive task + +The *function* is the concrete benchmark that was run: +- **BLAKE2-256** The throughput of the [Blake2-256] cryptographic hashing function with 32 KiB input. The [blake2_256 function] is used in many places in Substrate. The throughput of a hash function strongly depends on the input size, therefore we settled on a fixed input size for comparable results. +- **SR25519-Verify** Sr25519 is an optimized version of the [Curve25519] signature scheme. Signature verification is used by Substrate when verifying extrinsics and blocks. +- **Copy** The throughput of copying memory from one place in the RAM to another. +- **Seq Write** The throughput of writing data to the storage location sequentially. It is important to use the same disk that will later be used to store the chain data. +- **Rnd Write** The throughput of writing data to the storage location in a random order. This is normally much slower than the sequential write. + +The *score* needs to reach the *minimum* in order to pass the benchmark. The *minimum* can be reduced with the `--tolerance` flag. + +The *result* indicates whether a specific benchmark was passed by the machine or not. The percent number is the score relative to the *minimum* that is needed. The `--tolerance` flag is taken into account for this decision. For example a benchmark that passes even with 95% since the *tolerance* was set to 10% would look like this: `✅ Pass ( 95.0 %)`. + +## Interpretation + +Ideally all results show a `Pass` and the program exits with code 0. 
Currently some of the benchmarks can fail even on reference hardware; they are still being improved to make them more deterministic. +Make sure to run nothing else on the machine when benchmarking it. +You can re-run the benchmarks multiple times to get more reliable results. + +## Arguments + +- `--tolerance` A percent number to reduce the *minimum* requirement. This should be used to ignore outliers of the benchmarks. The default value is 10%. +- `--verify-duration` How long the verification benchmark should run. +- `--disk-duration` How long the *read* and *write* benchmarks should run each. +- `--allow-fail` Always exit the program with code 0. +- `--chain` / `--dev` Specify the chain config to use. This will be used to compare the results with the requirements of the chain (WIP). +- [`--base-path`] + +License: Apache-2.0 + +<!-- LINKS --> +[Blake2-256]: https://www.blake2.net/ +[blake2_256 function]: https://crates.parity.io/sp_core/hashing/fn.blake2_256.html +[Curve25519]: https://en.wikipedia.org/wiki/Curve25519 +[`--base-path`]: ../shared/README.md#arguments diff --git a/substrate/utils/frame/benchmarking-cli/src/overhead/README.md b/substrate/utils/frame/benchmarking-cli/src/overhead/README.md new file mode 100644 index 00000000000..6f41e881d05 --- /dev/null +++ b/substrate/utils/frame/benchmarking-cli/src/overhead/README.md @@ -0,0 +1,136 @@ +# The `benchmark overhead` command + +Each time an extrinsic or a block is executed, a fixed weight is charged as "execution overhead". +This is necessary since the weight that is calculated by the pallet benchmarks does not include this overhead. +The exact overhead can vary per Substrate chain and needs to be calculated per chain. +This command calculates the exact values of these overhead weights for any Substrate chain that supports it. + +## How does it work? + +The benchmark consists of two parts: the [`BlockExecutionWeight`] and the [`ExtrinsicBaseWeight`]. +Both are executed sequentially when invoking the command. 
+ +## BlockExecutionWeight + +The block execution weight is defined as the weight that it takes to execute an *empty block*. +It is measured by constructing an empty block and measuring its execution time. +The results are written to a `block_weights.rs` file which is created from a template. +The file will contain the concrete weight value and various statistics about the measurements. For example: +```rust +/// Time to execute an empty block. +/// Calculated by multiplying the *Average* with `1` and adding `0`. +/// +/// Stats [NS]: +/// Min, Max: 3_508_416, 3_680_498 +/// Average: 3_532_484 +/// Median: 3_522_111 +/// Std-Dev: 27070.23 +/// +/// Percentiles [NS]: +/// 99th: 3_631_863 +/// 95th: 3_595_674 +/// 75th: 3_526_435 +pub const BlockExecutionWeight: Weight = 3_532_484 * WEIGHT_PER_NANOS; +``` + +In this example it takes 3.5 ms to execute an empty block. That means that it always takes at least 3.5 ms to execute *any* block. +This constant weight is therefore added to each block to ensure that Substrate budgets enough time to execute it. + +## ExtrinsicBaseWeight + +The extrinsic base weight is defined as the weight that it takes to execute an *empty* extrinsic. +An *empty* extrinsic is also called a *NO-OP*. It does nothing and is the equivalent of the empty block from above. +The benchmark now constructs a block which is filled with only NO-OP extrinsics. +This block is then executed many times and the weights are measured. +The result is divided by the number of extrinsics in that block and the results are written to `extrinsic_weights.rs`. + +The relevant section in the output file looks like this: +```rust + /// Time to execute a NO-OP extrinsic, for example `System::remark`. +/// Calculated by multiplying the *Average* with `1` and adding `0`. 
+/// +/// Stats [NS]: +/// Min, Max: 67_561, 69_855 +/// Average: 67_745 +/// Median: 67_701 +/// Std-Dev: 264.68 +/// +/// Percentiles [NS]: +/// 99th: 68_758 +/// 95th: 67_843 +/// 75th: 67_749 +pub const ExtrinsicBaseWeight: Weight = 67_745 * WEIGHT_PER_NANOS; +``` + +In this example it takes 67.7 µs to execute a NO-OP extrinsic. That means that it always takes at least 67.7 µs to execute *any* extrinsic. +This constant weight is therefore added to each extrinsic to ensure that Substrate budgets enough time to execute it. + +## Invocation + +The base command looks like this (for debugging you can use `--release`): +```sh +cargo run --profile=production -- benchmark overhead --dev +``` + +Output: +```pre +# BlockExecutionWeight +Running 10 warmups... +Executing block 100 times +Per-block execution overhead [ns]: +Total: 353248430 +Min: 3508416, Max: 3680498 +Average: 3532484, Median: 3522111, Stddev: 27070.23 +Percentiles 99th, 95th, 75th: 3631863, 3595674, 3526435 +Writing weights to "block_weights.rs" + +# Setup +Building block, this takes some time... +Extrinsics per block: 12000 + +# ExtrinsicBaseWeight +Running 10 warmups... 
+Executing block 100 times +Per-extrinsic execution overhead [ns]: +Total: 6774590 +Min: 67561, Max: 69855 +Average: 67745, Median: 67701, Stddev: 264.68 +Percentiles 99th, 95th, 75th: 68758, 67843, 67749 +Writing weights to "extrinsic_weights.rs" +``` + +The complete command for Polkadot looks like this: +```sh +cargo run --profile=production -- benchmark overhead --chain=polkadot-dev --execution=wasm --wasm-execution=compiled --weight-path=runtime/polkadot/constants/src/weights/ +``` + +This will overwrite the [block_weights.rs](https://github.com/paritytech/polkadot/blob/c254e5975711a6497af256f6831e9a6c752d28f5/runtime/polkadot/constants/src/weights/block_weights.rs) and [extrinsic_weights.rs](https://github.com/paritytech/polkadot/blob/c254e5975711a6497af256f6831e9a6c752d28f5/runtime/polkadot/constants/src/weights/extrinsic_weights.rs) files in the Polkadot runtime directory. +You can try the same for *Rococo* and see that the results slightly differ. +👉 It is paramount to use `--profile=production`, `--execution=wasm` and `--wasm-execution=compiled` as the results are otherwise useless. + +## Output Interpretation + +Lower is better. The less weight the execution overhead needs, the better. +Since the weight of the overhead is charged per extrinsic and per block, a larger weight results in fewer extrinsics per block. +Minimizing the overhead is important to have a large transaction throughput. + +## Arguments + +- `--chain` / `--dev` Set the chain specification. +- `--weight-path` Set the output directory or file to write the weights to. +- `--repeat` Set the repetitions of both benchmarks. +- `--warmup` Set the rounds of warmup before measuring. +- `--execution` Should be set to `wasm` for correct results. +- `--wasm-execution` Should be set to `compiled` for correct results. 
+- [`--mul`](../shared/README.md#arguments) +- [`--add`](../shared/README.md#arguments) +- [`--metric`](../shared/README.md#arguments) +- [`--weight-path`](../shared/README.md#arguments) + +License: Apache-2.0 + +<!-- LINKS --> +[`ExtrinsicBaseWeight`]: https://github.com/paritytech/substrate/blob/580ebae17fa30082604f1c9720f6f4a1cfe95b50/frame/support/src/weights/extrinsic_weights.rs#L26 +[`BlockExecutionWeight`]: https://github.com/paritytech/substrate/blob/580ebae17fa30082604f1c9720f6f4a1cfe95b50/frame/support/src/weights/block_weights.rs#L26 + +[System::Remark]: https://github.com/paritytech/substrate/blob/580ebae17fa30082604f1c9720f6f4a1cfe95b50/frame/system/src/lib.rs#L382 diff --git a/substrate/utils/frame/benchmarking-cli/src/overhead/bench.rs b/substrate/utils/frame/benchmarking-cli/src/overhead/bench.rs index 68f3f6597b4..be7dac24021 100644 --- a/substrate/utils/frame/benchmarking-cli/src/overhead/bench.rs +++ b/substrate/utils/frame/benchmarking-cli/src/overhead/bench.rs @@ -43,11 +43,11 @@ use crate::shared::Stats; #[derive(Debug, Default, Serialize, Clone, PartialEq, Args)] pub struct BenchmarkParams { /// Rounds of warmups before measuring. - #[clap(long, default_value = "100")] + #[clap(long, default_value = "10")] pub warmup: u32, /// How many times the benchmark should be repeated. - #[clap(long, default_value = "1000")] + #[clap(long, default_value = "100")] pub repeat: u32, /// Maximal number of extrinsics that should be put into a block. diff --git a/substrate/utils/frame/benchmarking-cli/src/pallet/README.md b/substrate/utils/frame/benchmarking-cli/src/pallet/README.md new file mode 100644 index 00000000000..72845652de6 --- /dev/null +++ b/substrate/utils/frame/benchmarking-cli/src/pallet/README.md @@ -0,0 +1,3 @@ +The pallet command is explained in [frame/benchmarking](../../../../../frame/benchmarking/README.md). 
+ +License: Apache-2.0 diff --git a/substrate/utils/frame/benchmarking-cli/src/shared/README.md b/substrate/utils/frame/benchmarking-cli/src/shared/README.md new file mode 100644 index 00000000000..2a3719b8549 --- /dev/null +++ b/substrate/utils/frame/benchmarking-cli/src/shared/README.md @@ -0,0 +1,15 @@ +# Shared code + +Contains code that is shared among multiple sub-commands. + +## Arguments + +- `--mul` Multiply the result with a factor. Can be used to manually adjust for future chain growth. +- `--add` Add a value to the result. Can be used to manually offset the results. +- `--metric` Set the metric to use for calculating the final weight from the raw data. Defaults to `average`. +- `--weight-path` Set the file or directory to write the weight files to. +- `--db` The database backend to use. This depends on your snapshot. +- `--pruning` Set the pruning mode of the node. Some benchmarks require you to set this to `archive`. +- `--base-path` The location on the disk that should be used for the benchmarks. You can try this on different disks or even on a mounted RAM-disk. It is important to use the same location that will later-on be used to store the chain data to get the correct results. + +License: Apache-2.0 diff --git a/substrate/utils/frame/benchmarking-cli/src/storage/README.md b/substrate/utils/frame/benchmarking-cli/src/storage/README.md new file mode 100644 index 00000000000..820785f7ea2 --- /dev/null +++ b/substrate/utils/frame/benchmarking-cli/src/storage/README.md @@ -0,0 +1,105 @@ +# The `benchmark storage` command + +The cost of storage operations in a Substrate chain depends on the current chain state. +It is therefore important to regularly update these weights as the chain grows. +This sub-command measures the cost of storage operations for a concrete snapshot. 
+ +For the Substrate node it looks like this (for debugging you can use `--release`): +```sh +cargo run --profile=production -- benchmark storage --dev --state-version=1 +``` + +Running the command on Substrate itself is not very meaningful, since the genesis state of the `--dev` chain spec is used. + +The output for the Polkadot client with a recent chain snapshot will give you a better impression. A recent snapshot can be downloaded from [Polkachu]. +Then run (remove the `--db=paritydb` if you have a RocksDB snapshot): +```sh +cargo run --profile=production -- benchmark storage --dev --state-version=0 --db=paritydb --weight-path runtime/polkadot/constants/src/weights +``` + +This takes a while since it reads and writes all keys from the snapshot: +```pre +# The 'read' benchmark +Preparing keys from block BlockId::Number(9939462) +Reading 1379083 keys +Time summary [ns]: +Total: 19668919930 +Min: 6450, Max: 1217259 +Average: 14262, Median: 14190, Stddev: 3035.79 +Percentiles 99th, 95th, 75th: 18270, 16190, 14819 +Value size summary: +Total: 265702275 +Min: 1, Max: 1381859 +Average: 192, Median: 80, Stddev: 3427.53 +Percentiles 99th, 95th, 75th: 3368, 383, 80 + +# The 'write' benchmark +Preparing keys from block BlockId::Number(9939462) +Writing 1379083 keys +Time summary [ns]: +Total: 98393809781 +Min: 12969, Max: 13282577 +Average: 71347, Median: 69499, Stddev: 25145.27 +Percentiles 99th, 95th, 75th: 135839, 106129, 79239 +Value size summary: +Total: 265702275 +Min: 1, Max: 1381859 +Average: 192, Median: 80, Stddev: 3427.53 +Percentiles 99th, 95th, 75th: 3368, 383, 80 + +Writing weights to "paritydb_weights.rs" +``` +You will see that the [paritydb_weights.rs] file was modified and now contains new weights. +The exact command for Polkadot can be seen at the top of the file. +This uses the most recent block from your snapshot which is printed at the top. +The value size summary tells us that the pruned Polkadot chain state is ~253 MiB in size. 
+Reading a value on average takes (in this example) 14.3 µs and writing 71.3 µs. +The interesting part in the generated weight file tells us the weight constants and some statistics about the measurements: +```rust +/// Time to read one storage item. +/// Calculated by multiplying the *Average* of all values with `1.1` and adding `0`. +/// +/// Stats [NS]: +/// Min, Max: 4_611, 1_217_259 +/// Average: 14_262 +/// Median: 14_190 +/// Std-Dev: 3035.79 +/// +/// Percentiles [NS]: +/// 99th: 18_270 +/// 95th: 16_190 +/// 75th: 14_819 +read: 14_262 * constants::WEIGHT_PER_NANOS, + +/// Time to write one storage item. +/// Calculated by multiplying the *Average* of all values with `1.1` and adding `0`. +/// +/// Stats [NS]: +/// Min, Max: 12_969, 13_282_577 +/// Average: 71_347 +/// Median: 69_499 +/// Std-Dev: 25145.27 +/// +/// Percentiles [NS]: +/// 99th: 135_839 +/// 95th: 106_129 +/// 75th: 79_239 +write: 71_347 * constants::WEIGHT_PER_NANOS, +``` + +## Arguments + +- `--db` Specify which database backend to use. This greatly influences the results. +- `--state-version` Set the version of the state encoding that this snapshot uses. Should be set to `1` for Substrate `--dev` and `0` for Polkadot et al. Using the wrong version can corrupt the snapshot. +- [`--mul`](../shared/README.md#arguments) +- [`--add`](../shared/README.md#arguments) +- [`--metric`](../shared/README.md#arguments) +- [`--weight-path`](../shared/README.md#arguments) +- `--json-read-path` Write the raw 'read' results to this file or directory. +- `--json-write-path` Write the raw 'write' results to this file or directory. + +License: Apache-2.0 + +<!-- LINKS --> +[Polkachu]: https://polkachu.com/snapshots +[paritydb_weights.rs]: https://github.com/paritytech/polkadot/blob/c254e5975711a6497af256f6831e9a6c752d28f5/runtime/polkadot/constants/src/weights/paritydb_weights.rs#L60 -- GitLab