- Jan 12, 2021
Pierre Krieger authored
* Add Prometheus alerts if unbounded channels are too large
* Tweaks
-
- Nov 04, 2020
Pierre Krieger authored
-
- Oct 05, 2020
Max Inden authored
* .maintain/monitoring: Add alert when continuous task ends
  Through the `polkadot_tasks_ended_total` Prometheus metric one can tell when a task ended. Use this metric to alert when specific known-to-be-continuous tasks end on a node.
* .maintain/monitoring: Don't hard-code task names
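The idea behind this alert can be illustrated with a minimal Python sketch. It assumes the `polkadot_tasks_ended_total` counter is available per task, and that two consecutive scrapes have already been parsed into dictionaries. The function name `ended_tasks`, the dictionary layout, and the task names are hypothetical and chosen only for illustration; the actual rule is a Prometheus expression, not Python.

```python
from typing import Dict, Iterable, List


def ended_tasks(
    previous: Dict[str, int],
    current: Dict[str, int],
    watched: Iterable[str],
) -> List[str]:
    """Return the watched tasks whose ended-counter grew between two scrapes.

    `previous` and `current` map a task name to its counter value
    (hypothetical layout). A missing task is treated as a counter of 0.
    """
    return sorted(
        task
        for task in watched
        if current.get(task, 0) > previous.get(task, 0)
    )


# Hypothetical task names, used only to exercise the function.
previous_scrape = {"task-a": 0, "task-b": 3}
current_scrape = {"task-a": 1, "task-b": 3, "task-c": 5}
result = ended_tasks(previous_scrape, current_scrape, ["task-a", "task-b"])
```

Here only `task-a` is reported: its counter increased, while `task-b` stayed flat and `task-c` is not in the watched list.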
-
- Sep 30, 2020
Max Inden authored
* .maintain/monitoring: Normalize alerting rules
  - Start alert names with their component and end with the describing adjective.
  - Describe alert duration in `message` with `for more than` across all alerts.
* .maintain/monitoring: Fix alert tests
-
- Aug 24, 2020
Max Inden authored
Alert on high file descriptor allocation.
-
- Jul 17, 2020
Max Inden authored
The `HighCPUUsage` alert is based on the `cpu_usage_percentage` metric. Instead of exposing the overall CPU usage in percent, the metric exposes the per-core usage summed over all cores. This commit removes the alert for two reasons:
1. Substrate itself does not expose the core count, so one cannot alert based on the `cpu_usage_percentage` metric.
2. Alerting based on CPU usage is generic and not specific to Substrate or blockchains, so any CPU usage alert suffices.
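To illustrate the first reason, here is a minimal Python sketch of why a summed per-core value cannot be compared against a fixed threshold without the core count. The function name `overall_cpu_percent` and the sample reading are hypothetical; only the described behaviour of the metric comes from the commit message.

```python
def overall_cpu_percent(summed_per_core_percent: float, core_count: int) -> float:
    """Convert per-core usage summed over all cores into overall usage in percent."""
    if core_count <= 0:
        raise ValueError("core_count must be positive")
    return summed_per_core_percent / core_count


# The same hypothetical reading means very different loads
# depending on how many cores the machine has.
reading = 240.0
on_4_cores = overall_cpu_percent(reading, 4)
on_16_cores = overall_cpu_percent(reading, 16)
```

With 4 cores the reading corresponds to 60% overall usage, with 16 cores to 15%, so a single threshold on the raw value is not meaningful.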
-
- Jul 01, 2020
Max Inden authored
The transaction queue size alert has been firing with a constant 10 transactions in the queue. While that may be problematic, those 10 transactions need not be the same ones across scrape intervals. Instead of alerting on a size above 10, alert based on two things:
1. Monotonically increasing queue size
2. Upper limit queue size reached
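The two conditions can be sketched in Python as follows. This is an illustration of the logic only, not the actual Prometheus rule: the function names, the `upper_limit` value of 100, and the choice to treat "monotonically increasing" as strictly increasing between consecutive scrapes are all assumptions.

```python
from typing import Sequence


def queue_growing(samples: Sequence[int]) -> bool:
    """True if each scraped queue size is strictly larger than the previous one."""
    return len(samples) >= 2 and all(
        later > earlier for earlier, later in zip(samples, samples[1:])
    )


def queue_full(samples: Sequence[int], upper_limit: int) -> bool:
    """True if the most recent scraped queue size has reached the upper limit."""
    return len(samples) > 0 and samples[-1] >= upper_limit


def should_alert(samples: Sequence[int], upper_limit: int) -> bool:
    """Alert when the queue keeps growing or has hit its upper limit."""
    return queue_growing(samples) or queue_full(samples, upper_limit)


# Hypothetical upper limit, for illustration only.
LIMIT = 100
constant_queue = should_alert([10, 10, 10, 10], LIMIT)
growing_queue = should_alert([10, 20, 35, 50], LIMIT)
full_queue = should_alert([100, 100], LIMIT)
```

Under these assumptions a queue that sits at a constant 10 no longer triggers the alert, while a steadily growing queue or one at the limit does.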
-
- Jun 19, 2020
Max Inden authored
* .maintain/monitoring: Add alerting rule tests
* .maintain/monitoring/alerting-rules/alerting-rules.yaml: Break lines
* .gitlab-ci.yml: Add promtool rule testing step
-
- May 21, 2020
Max Inden authored
Create a place to collaborate on Prometheus alerting rules for Substrate starting with a basic set of rules covering:
- Resource usage
- Block production
- Block finalization
- Transaction queue
- Networking
- ... Others
-