**Update:** Pushed additional changes based on the review comments.

**This pull request fixes various spelling mistakes in this
repository.**

Most of the changes are contained in the first **3** commits:

- `Fix spelling mistakes in comments and docs`

- `Fix spelling mistakes in test names`

- `Fix spelling mistakes in error messages, panic messages, logs and
tracing`

Other source code spelling mistakes are separated into individual
commits for easier reviewing:

- `Fix the spelling of 'authority'`

- `Fix the spelling of 'REASONABLE_HEADERS_IN_JUSTIFICATION_ANCESTRY'`

- `Fix the spelling of 'prev_enqueud_messages'`

- `Fix the spelling of 'endpoint'`

- `Fix the spelling of 'children'`

- `Fix the spelling of 'PenpalSiblingSovereignAccount'`

- `Fix the spelling of 'PenpalSudoAccount'`

- `Fix the spelling of 'insufficient'`

- `Fix the spelling of 'PalletXcmExtrinsicsBenchmark'`

- `Fix the spelling of 'subtracted'`

- `Fix the spelling of 'CandidatePendingAvailability'`

- `Fix the spelling of 'exclusive'`

- `Fix the spelling of 'until'`

- `Fix the spelling of 'discriminator'`

- `Fix the spelling of 'nonexistent'`

- `Fix the spelling of 'subsystem'`

- `Fix the spelling of 'indices'`

- `Fix the spelling of 'committed'`

- `Fix the spelling of 'topology'`

- `Fix the spelling of 'response'`

- `Fix the spelling of 'beneficiary'`

- `Fix the spelling of 'formatted'`

- `Fix the spelling of 'UNKNOWN_PROOF_REQUEST'`

- `Fix the spelling of 'succeeded'`

- `Fix the spelling of 'reopened'`

- `Fix the spelling of 'proposer'`

- `Fix the spelling of 'InstantiationNonce'`

- `Fix the spelling of 'depositor'`

- `Fix the spelling of 'expiration'`

- `Fix the spelling of 'phantom'`

- `Fix the spelling of 'AggregatedKeyValue'`

- `Fix the spelling of 'randomness'`

- `Fix the spelling of 'defendant'`

- `Fix the spelling of 'AquaticMammal'`

- `Fix the spelling of 'transactions'`

- `Fix the spelling of 'PassingTracingSubscriber'`

- `Fix the spelling of 'TxSignaturePayload'`

- `Fix the spelling of 'versioning'`

- `Fix the spelling of 'descendant'`

- `Fix the spelling of 'overridden'`

- `Fix the spelling of 'network'`

Let me know if this structure is adequate.

**Note:** The words `Merkle`, `Merkelize`, `Merklization`, `Merkelization`, and `Merkleization` are used somewhat inconsistently, but I left them as they are.

~~**Note:** In some places the term `Receival` is used to refer to
message reception, IMO `Reception` is the correct word here, but I left
it as it is.~~

~~**Note:** In some places the term `Overlayed` is used instead of the
more acceptable version `Overlaid` but I also left it as it is.~~

~~**Note:** In some places the term `Applyable` is used instead of the
correct version `Applicable` but I also left it as it is.~~

**Note:** Both British and American English spellings are present in different places, e.g. `judgement` vs `judgment`, `initialise` vs `initialize`, `optimise` vs `optimize`, but I suppose that's understandable given the number of contributors.

~~**Note:** There is a spelling mistake in `.github/CODEOWNERS` but it
triggers errors in CI when I make changes to it, so I left it as it
is.~~

Do I need this?

Polkadot nodes collect and produce Prometheus metrics and logs. These include health, performance, and debug information such as the last finalized block, the height of the chain, and many deeper implementation details of the Polkadot/Substrate node subsystems. This information is crucial for monitoring the liveness and performance of a network and its validators.
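As a reference point, a minimal Prometheus scrape configuration for a single node could look like the sketch below. It assumes the node's built-in Prometheus endpoint on its default port 9615; adjust the target to match your deployment.

```yaml
# Minimal Prometheus scrape config for one Polkadot node (illustrative sketch).
# Assumes the node's metrics endpoint on the default port 9615.
scrape_configs:
  - job_name: "polkadot"
    scrape_interval: 15s
    static_configs:
      # Replace with the host:port of your node's metrics endpoint.
      - targets: ["localhost:9615"]
```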

How does it work?

Just import the dashboard JSON files from this folder into your Grafana installation. Dashboards are grouped into folders by category (for example, parachains). The files were created with Grafana's export functionality and follow the data model specified here.

We aim to keep the dashboards here in sync with the implementation, except dashboards for development and testing.

Contributing

Your contributions are most welcome!

Please make sure to follow these design guidelines:

  • Add a new entry in this file describing the use case and key metrics
  • Ensure proper names and descriptions for dashboard panels, and add relevant documentation when needed. This is very important, as not all users have a similar depth of understanding of the implementation
  • Label all axes
  • Give all values proper units of measurement
  • Use a crisp and clear color scheme

Prerequisites

Before you continue, make sure you have Grafana set up; otherwise, follow this guide.

You might also need to set up Loki.

Alerting

Alerts are currently out of scope for the dashboards, but they can be set up manually or automated (see installing and configuring Alert Manager).
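If you do automate alerting, a rule on relay chain finality lag is a natural starting point. The sketch below is illustrative only: it relies on the standard Substrate `substrate_block_height` metric, and the threshold and duration are placeholders to tune for your network.

```yaml
# Illustrative Prometheus alerting rule, not shipped with these dashboards.
groups:
  - name: polkadot-finality
    rules:
      - alert: FinalityLagHigh
        # Best block height minus finalized height, matched per instance
        # while ignoring the differing `status` label; fires when the lag
        # stays above 10 blocks for 5 minutes.
        expr: >
          substrate_block_height{status="best"}
            - ignoring(status) substrate_block_height{status="finalized"} > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Relay chain finality is lagging on {{ $labels.instance }}"
```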

Dashboards

This section lists the dashboards, their use cases, and the key metrics they cover.

Node Versions

Useful for monitoring versions and logs of validator nodes. Includes time series panels that track node warning and error log rates. These can be further investigated in Grafana Loki.

Requires Loki for log aggregation and querying.
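Because the log-rate panels are backed by Loki, they can also be reproduced or refined directly with LogQL. The query below is only a sketch: the `{job="polkadot"}` stream selector and the "ERROR" line filter are assumptions about your Loki labels and log format, so adapt both to your setup.

```logql
# Per-instance error log rate over 5-minute windows (illustrative labels).
sum by (instance) (rate({job="polkadot"} |= "ERROR" [5m]))
```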

Dashboard JSON

Parachain Status

This dashboard lets you see at a glance how fast candidates are approved, disputed, and finalized. It was originally designed for observing liveness after parachain deployments on Kusama/Polkadot, but it can generally be useful in production or testing.

It includes panels covering key subsystems of the parachain node side implementation:

  • Backing
  • PVF execution
  • Approval voting
  • Disputes coordinator
  • Chain selection

It is important to note that this dashboard applies only to validator nodes. The Prometheus queries assume that the `instance` label value contains the string `validator` only for validator nodes.
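For example, a validator-only panel can enforce that convention with a regex matcher on the `instance` label. The query below is a sketch using the standard Substrate block-height metric; the dashboards' actual queries live in the exported JSON.

```promql
# Select only validator nodes, per the naming convention described above.
substrate_block_height{status="finalized", instance=~".*validator.*"}
```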

Dashboard JSON

Key liveness indicators

  • Relay chain finality lag. How far finality lags behind the current best block. By design, GRANDPA always leaves the last 2 blocks unfinalized, so this value is always >= 2 blocks (see the query sketch after this list).
  • Approval checking finality lag. The distance (in blocks) between the chain head and the last block on which approval voting is happening. This block is generally the highest approved ancestor of the head block, and the metric is computed during relay chain selection.
  • Disputes finality lag. How far behind the chain head the last approved and undisputed block is. This value is always higher than the approval checking lag, as it further restricts finality to undisputed chains only.
  • PVF preparation and execution time. Each parachain has its own PVF (parachain validation function): a Wasm blob that is executed by validators during backing, approval checking, and disputing. The PVF preparation time refers to the time it takes for the PVF Wasm to be compiled. This step is done once and the result is cached; PVF execution then uses the cached artifact to execute the PVF for a given candidate. PVFs are expected to have a limited execution time to ensure there is enough time left for the parachain block to be included in the relay block.
  • Time to recover and check candidate. This is part of approval voting and covers the time it takes to recover the candidate block's available data from other validators, check it (this includes PVF execution time), and issue a statement or initiate a dispute.
  • Assignment delay tranches. Approval voting is designed such that validators assigned to check a specific candidate are split up into equal delay tranches (0.5 seconds each). All validator checks are ordered by delay tranche index; early tranches get the opportunity to check the candidate first, and later tranches act as backups in case of no-shows.
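To make the first indicator concrete: the relay chain finality lag is the same best-minus-finalized expression used in the alerting sketch above, and the subsystem-specific lags are exported as dedicated gauges. The gauge name below is an assumption based on the node's `polkadot_parachain_*` metric prefix, so verify it against your node's `/metrics` output.

```promql
# Relay chain finality lag: best block height minus finalized block height.
substrate_block_height{status="best"}
  - ignoring(status) substrate_block_height{status="finalized"}

# Approval checking finality lag (assumed gauge name; verify via /metrics).
polkadot_parachain_approval_checking_finality_lag
```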