Commit c8d5e5a3 authored by Iulian Barbu, committed by GitHub

cumulus/minimal-node: added prometheus metrics for the RPC client (#5572)



# Description

When we start a node with connections to external RPC servers (as a
minimal node), we lack metrics for how many individual calls we make
to the remote RPC servers and how long they take. This PR adds metrics
that measure the duration of each RPC call made by the minimal nodes, and
implicitly how many calls there are.

Closes #5409 
Closes #5689

## Integration

Node operators should be able to track minimal node metrics and decide
on appropriate actions based on how they interpret them.
The added metrics can be observed by curl'ing the Prometheus metrics
endpoint of the parachain (changed from the relay chain based on the
review). The metrics live under the `relay_chain_rpc_interface`
namespace (changed from `polkadot_parachain_relay_chain_rpc_interface`,
since lining up `parachain_relay_chain` in the same metric name might be
confusing).
Excerpt from the curl output:

```
relay_chain_rpc_interface_bucket{method="chain_getBlockHash",chain="rococo_local_testnet",le="0.001"} 15
relay_chain_rpc_interface_bucket{method="chain_getBlockHash",chain="rococo_local_testnet",le="0.004"} 23
relay_chain_rpc_interface_bucket{method="chain_getBlockHash",chain="rococo_local_testnet",le="0.016"} 23
relay_chain_rpc_interface_bucket{method="chain_getBlockHash",chain="rococo_local_testnet",le="0.064"} 23
relay_chain_rpc_interface_bucket{method="chain_getBlockHash",chain="rococo_local_testnet",le="0.256"} 24
relay_chain_rpc_interface_bucket{method="chain_getBlockHash",chain="rococo_local_testnet",le="1.024"} 24
relay_chain_rpc_interface_bucket{method="chain_getBlockHash",chain="rococo_local_testnet",le="4.096"} 24
relay_chain_rpc_interface_bucket{method="chain_getBlockHash",chain="rococo_local_testnet",le="16.384"} 24
relay_chain_rpc_interface_bucket{method="chain_getBlockHash",chain="rococo_local_testnet",le="65.536"} 24
relay_chain_rpc_interface_bucket{method="chain_getBlockHash",chain="rococo_local_testnet",le="+Inf"} 24
relay_chain_rpc_interface_sum{method="chain_getBlockHash",chain="rococo_local_testnet"} 0.11719075
relay_chain_rpc_interface_count{method="chain_getBlockHash",chain="rococo_local_testnet"} 24
```
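Prometheus histogram buckets are cumulative, so each `le` line counts every observation at or below that bound. As a rough reading aid, the sketch below converts the cumulative counts from the excerpt into per-bucket counts and derives the mean duration from `_sum` and `_count`. The helper names `per_bucket_counts` and `mean_duration` are hypothetical and not part of this PR.

```rust
/// Converts cumulative bucket counts (as exposed by Prometheus) into
/// per-bucket counts. Hypothetical helper, for illustration only.
fn per_bucket_counts(cumulative: &[u64]) -> Vec<u64> {
    let mut previous = 0u64;
    cumulative
        .iter()
        .map(|&current| {
            // Cumulative counts never decrease; saturate to stay safe on bad input.
            let delta = current.saturating_sub(previous);
            previous = current;
            delta
        })
        .collect()
}

/// Mean observed duration (`_sum / _count`), or `None` if nothing was observed.
fn mean_duration(sum: f64, count: u64) -> Option<f64> {
    if count == 0 {
        None
    } else {
        Some(sum / count as f64)
    }
}

fn main() {
    // Values copied from the `chain_getBlockHash` excerpt, in `le` order
    // (0.001, 0.004, ..., 65.536, +Inf).
    let cumulative = [15, 23, 23, 23, 24, 24, 24, 24, 24, 24];
    println!("per-bucket counts: {:?}", per_bucket_counts(&cumulative));

    if let Some(mean) = mean_duration(0.11719075, 24) {
        println!("mean duration: {mean:.6}");
    }
}
```

For the excerpt, this shows 15 of the 24 calls completing within the `0.001` bound, 8 more within `0.004`, and a single slower call landing in the `0.256` bucket.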

## Review Notes

The way we measure durations/hits is based on the `HistogramVec` struct,
which allows us to collect timings for each RPC client method called
from the minimal node. It can be extended to measure the RPCs against
other dimensions too (status codes, response sizes, etc.). The timing
is measured at the level of the `relay-chain-rpc-interface`, in
the `RelayChainRpcClient` struct's `request_tracing` method, which is a
single entry point for all RPC requests made through the
`relay-chain-rpc-interface`. The request durations fall into
exponential buckets defined by start `0.001`, factor `4` and count
`9`.
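To make the bucket layout and the per-method timing concrete, here is a minimal std-only sketch. It is not the code from this PR: `MethodHistograms`, `observe` and `time` are hypothetical stand-ins for a `HistogramVec` labelled by method, and the durations are assumed to be recorded in seconds (the usual Prometheus convention). The local `exponential_buckets` function simply multiplies the start value by the factor `count - 1` times.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Upper bounds of exponential buckets: start, start * factor, ...
fn exponential_buckets(start: f64, factor: f64, count: usize) -> Vec<f64> {
    let mut bounds = Vec::with_capacity(count);
    let mut bound = start;
    for _ in 0..count {
        bounds.push(bound);
        bound *= factor;
    }
    bounds
}

/// Hypothetical stand-in for a histogram family labelled by RPC method.
struct MethodHistograms {
    bounds: Vec<f64>,
    // method -> (cumulative bucket counts incl. +Inf, sum of durations, count)
    series: HashMap<String, (Vec<u64>, f64, u64)>,
}

impl MethodHistograms {
    fn new(bounds: Vec<f64>) -> Self {
        Self { bounds, series: HashMap::new() }
    }

    /// Records one observation for `method` (duration assumed in seconds).
    fn observe(&mut self, method: &str, elapsed: Duration) {
        let secs = elapsed.as_secs_f64();
        let slots = self.bounds.len() + 1; // one extra slot for +Inf
        let entry = self
            .series
            .entry(method.to_string())
            .or_insert_with(|| (vec![0; slots], 0.0, 0));
        // Buckets are cumulative: bump every bucket whose bound covers the value.
        for (i, bound) in self.bounds.iter().enumerate() {
            if secs <= *bound {
                entry.0[i] += 1;
            }
        }
        entry.0[slots - 1] += 1; // +Inf always matches
        entry.1 += secs;
        entry.2 += 1;
    }

    /// Single entry point: runs `request`, then records how long it took.
    fn time<T>(&mut self, method: &str, request: impl FnOnce() -> T) -> T {
        let started = Instant::now();
        let output = request();
        self.observe(method, started.elapsed());
        output
    }
}

fn main() {
    // Start 0.001, factor 4, count 9, as described above.
    let bounds = exponential_buckets(0.001, 4.0, 9);
    println!("bucket bounds: {:?}", bounds);

    let mut metrics = MethodHistograms::new(bounds);
    // The closure stands in for a real RPC call.
    let answer = metrics.time("chain_getBlockHash", || 42);
    let (_, _, count) = &metrics.series["chain_getBlockHash"];
    println!("result {answer}, recorded {count} call(s)");
}
```

Routing every request through one timing wrapper is what lets a single histogram family cover all RPC methods, with the call count falling out of the histogram's own `_count` series.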

---------

Signed-off-by: Iulian Barbu <[email protected]>
parent 0c9d8fed