  1. Jan 12, 2021
  2. Nov 04, 2020
  3. Oct 05, 2020
    • .maintain/monitoring: Add alert when continuous task ends (#7250) · 0ff724c9
      Max Inden authored
      * .maintain/monitoring: Add alert when continuous task ends
      
      Through the `polkadot_tasks_ended_total` Prometheus metric one can tell
      when a task ended. Use this metric to alert when specific
      known-to-be-continuous tasks end on a node.
      
      * .maintain/monitoring: Don't hard-code task names
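The idea behind this alert can be illustrated outside Prometheus. The following is a minimal Python sketch, not the actual alerting rule from the commit: the input layout (a mapping from task name to the latest `polkadot_tasks_ended_total` value on one node) and the `SHORT_LIVED_TASKS` set are assumptions made purely for illustration.

```python
from typing import Dict, List, Set

# Hypothetical names of tasks that are expected to end during normal operation.
SHORT_LIVED_TASKS: Set[str] = {"example-short-lived-task"}


def ended_continuous_tasks(
    ended_total: Dict[str, float],
    short_lived: Set[str] = SHORT_LIVED_TASKS,
) -> List[str]:
    """Return the tasks that should trigger an alert.

    `ended_total` maps a task name to the latest value of the
    `polkadot_tasks_ended_total` counter for that task on one node
    (assumed layout). A task is reported when its counter is above
    zero and it is not in the set of tasks known to be short-lived.
    """
    return sorted(
        name
        for name, count in ended_total.items()
        if count > 0 and name not in short_lived
    )
```

Excluding a small set of short-lived tasks, rather than listing every continuous one, is one possible way to avoid hard-coding the names of the tasks being watched.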
  4. Sep 30, 2020
    • .maintain/monitoring: Normalize alerting rules (#7232) · 51c0d27a
      Max Inden authored
      * .maintain/monitoring: Normalize alerting rules
      
      - Start alert names with their component and end them with a
      descriptive adjective.
      
      - Describe alert duration in `message` with `for more than` across all
      alerts.
      
      * .maintain/monitoring: Fix alert tests
  5. Aug 24, 2020
  6. Jul 17, 2020
    • .maintain/monitoring/alerting-rules: Remove HighCPUUsage alert (#6648) · fe9c01fc
      Max Inden authored
      The `HighCPUUsage` alert is based on the `cpu_usage_percentage` metric.
      Instead of exposing the overall CPU usage in percent, the metric exposes
      the per core usage summed over all cores.
      
      This commit removes the alert for two reasons:
      
      1. Substrate itself does not expose the core count and thus one cannot
      alert based on the `cpu_usage_percentage` metric.
      
      2. Alerting based on CPU usage is generic and not specific to Substrate
      or blockchains. Thus any CPU usage alert suffices.
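The first reason can be made concrete with a small calculation. This is a sketch under the assumption that each core contributes a value between 0 and 100 to the summed metric; the function name and the sample reading are hypothetical.

```python
def cpu_usage_percent(summed_core_usage: float, core_count: int) -> float:
    """Convert per-core usage summed over all cores into overall usage.

    `summed_core_usage` is the kind of value the commit message describes:
    each core is assumed to contribute 0..100, and the contributions are
    added up. Without `core_count` the overall percentage is undefined.
    """
    if core_count <= 0:
        raise ValueError("core_count must be positive")
    return summed_core_usage / core_count


# The same metric value means very different load on different machines.
reading = 240.0
print(cpu_usage_percent(reading, 4))   # 60.0
print(cpu_usage_percent(reading, 16))  # 15.0
```

Because the core count is not exposed alongside the metric, a fixed threshold on the summed value cannot distinguish these two cases.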
  7. Jul 01, 2020
    • .maintain/monitoring/alerting-rules: Adjust transaction queue size alert (#6426) · 585ea531
      Max Inden authored
      The transaction queue size alert has been firing with a constant 10
      transactions in the queue. While possibly problematic, those 10
      transactions need not be the same across scrape intervals.
      
      Instead of alerting with a size above 10, alert based on two things:
      
      1. Monotonically increasing queue size
      
      2. Upper limit queue size reached
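The two conditions above can be sketched as a check over a series of scraped queue sizes. This is an illustrative Python version, not the Prometheus rule from the commit; the function name, the `upper_limit` and `min_samples` parameters, and the choice of a strictly increasing window are all assumptions.

```python
from typing import Sequence


def queue_alert(
    samples: Sequence[int], upper_limit: int, min_samples: int = 3
) -> bool:
    """Decide whether a series of queue-size samples should alert.

    Alerts when either condition holds:
    1. the last `min_samples` samples are strictly increasing, or
    2. the most recent sample has reached `upper_limit`.
    """
    if min_samples < 2:
        raise ValueError("min_samples must be at least 2")
    if not samples:
        return False
    if samples[-1] >= upper_limit:
        return True
    window = samples[-min_samples:]
    if len(window) < min_samples:
        return False
    # Compare each sample with its successor inside the window.
    return all(a < b for a, b in zip(window, window[1:]))
```

With this check, a queue that sits at a constant 10 transactions no longer alerts, while a steadily growing or full queue does.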
  8. Jun 19, 2020
  9. May 21, 2020