12.7. Prometheus alerts

Prometheus periodically evaluates the queries defined in the alert rule files. When one of these conditions is met, the alert is handed to the Alertmanager, which sends a notification (e.g. an email) either directly to the specified contact points or to a specific group of contact points (these groups are called notification policies).

The basic concepts of the Prometheus alerting system are explained below. Check the official Prometheus documentation or the Grafana documentation if you need additional information.

  • Alert rules: One or more queries (expressions) that measure something (e.g. disk space, memory, or CPU usage).

    • Each alert rule contains a condition with a specific threshold.

    • Each alert rule can contain a precise contact point to send the notifications to.

    • Within the same alert rule, you can specify multiple alert instances.

  • Contact points: The notification message itself, together with the address to send it to.

  • Notification policies: This feature lets you group different contact points under the same label name.

12.7.1. Install Prometheus alert manager

# apt install prometheus-alertmanager
# systemctl start prometheus-alertmanager
# systemctl status prometheus-alertmanager

12.7.2. Edit the Prometheus configuration file

To make Prometheus talk to the alerting system, you need to specify it in the main Prometheus configuration file.

# Path: /etc/prometheus/prometheus.yml
# Add this at the end of the yml file
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]

12.7.3. Alert rules configuration file

  • Create your very first first_rule.yml file.

    Note

    The code shown below is just an example for CPU, disk and memory usage.

# Path: /etc/prometheus/first_rule.yml
groups:
  - name: node_exporter_alerts
    rules:
      # Share of CPU time spent in system mode, averaged over all cores
      - alert: HighCPUUsage
        expr: sum(rate(node_cpu_seconds_total{mode="system"}[1m])) / count(node_cpu_seconds_total{mode="system"}) * 100 > 80
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
          description: "System CPU usage is above 80% for more than 1 minute."

      # Percentage of free space per filesystem
      - alert: LowDiskSpace
        expr: (node_filesystem_free_bytes / node_filesystem_size_bytes) * 100 < 10
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Low Disk Space detected"
          description: "Free disk space is below 10% for more than 1 minute."

      # Percentage of memory that is not available
      - alert: HighMemoryUsage
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 80
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "High Memory Usage detected"
          description: "Memory usage is above 80% for more than 1 minute."
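
To illustrate how a threshold and the for: clause interact, here is a minimal Python sketch. It is not Prometheus code: the AlertState class and its field names are hypothetical, and it only models the usual inactive, pending, firing progression for a single alert instance, reusing the arithmetic of the HighMemoryUsage expression.

```python
from dataclasses import dataclass
from typing import Optional


def memory_usage_percent(mem_available: float, mem_total: float) -> float:
    """Same arithmetic as the HighMemoryUsage expression (before the > 80 check)."""
    return (1 - mem_available / mem_total) * 100


@dataclass
class AlertState:
    """Hypothetical model of one alert instance; not part of Prometheus."""
    threshold: float               # e.g. 80 for "> 80"
    for_seconds: float             # e.g. 60 for "for: 1m"
    active_since: Optional[float] = None

    def evaluate(self, now: float, value: float) -> str:
        """Return the state after one rule evaluation at time `now` (seconds)."""
        if value <= self.threshold:
            # Condition no longer holds: the hold timer is reset.
            self.active_since = None
            return "inactive"
        if self.active_since is None:
            # First evaluation where the condition holds.
            self.active_since = now
        if now - self.active_since >= self.for_seconds:
            return "firing"
        return "pending"
```

With threshold=80 and for_seconds=60, a value that stays above 80 is reported as pending until 60 seconds have passed since it first crossed the threshold, and only then as firing. A single evaluation below the threshold resets the timer.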

12.7.4. Configure SMTP

# Path: /etc/prometheus/alertmanager.yml

global:
  smtp_smarthost: 'smtp.example.com:587'
  smtp_from: 'alertmanager@example.com'
  smtp_auth_username: 'yourusername'
  smtp_auth_password: 'yourpassword'

route:
  receiver: 'email'

receivers:
  - name: 'email'
    email_configs:
      - to: 'recipient@example.com'
        send_resolved: true
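
One way to check that the email receiver works, without waiting for a real alert, is to post a test alert to the Alertmanager HTTP API (POST /api/v2/alerts). The following Python sketch is only an illustration: the function names, the alert name ManualTestAlert and the base URL http://localhost:9093 are assumptions, so adjust them to your setup.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_alert(name="ManualTestAlert", severity="warning"):
    """Build the JSON body (a list of alerts) for a hypothetical test alert."""
    return [{
        "labels": {"alertname": name, "severity": severity},
        "annotations": {"summary": "Manual test alert"},
        # RFC 3339 timestamp in UTC
        "startsAt": datetime.now(timezone.utc).isoformat(),
    }]


def send_alert(alerts, base_url="http://localhost:9093"):
    """POST the alerts to Alertmanager and return the HTTP status code."""
    request = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling send_alert(build_test_alert()) on the host running the Alertmanager should, if the SMTP settings are correct, result in an email to the configured recipient once the route's grouping delay has passed.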

12.7.5. Add your alert rules to Prometheus

# Path: /etc/prometheus/prometheus.yml
# Add your alert rule files here
rule_files:
  - "first_rule.yml"
  # - "second_rule.yml"

12.7.6. Edit the alertmanager systemd service file

# Path: /usr/lib/systemd/system/prometheus-alertmanager.service

[Unit]
Description=Alertmanager for prometheus
Documentation=https://prometheus.io/docs/alerting/alertmanager/

[Service]
Restart=on-failure
User=prometheus
EnvironmentFile=/etc/default/prometheus-alertmanager
# Add the --cluster.advertise-address flag, as otherwise it won't work
ExecStart=/usr/bin/prometheus-alertmanager \
 --cluster.advertise-address="ip:9093"
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no

[Install]
WantedBy=multi-user.target

# systemctl daemon-reload
# systemctl restart prometheus-alertmanager
# systemctl restart prometheus

12.7.7. Check

You can check both your rules (http://ip:9090/rules) and your alerts (http://ip:9090/alerts) from your web browser.
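
The same information is also available from the Prometheus HTTP API (/api/v1/rules), which can be handy on a host without a browser. Below is a minimal Python sketch; the function names and the default base URL http://localhost:9090 are assumptions, and the parsing relies on the response layout having a "data" object with "groups", each holding a list of "rules".

```python
import json
import urllib.request


def fetch_rules(base_url="http://localhost:9090"):
    """Download the rule listing from Prometheus and return the parsed JSON."""
    with urllib.request.urlopen(f"{base_url}/api/v1/rules", timeout=10) as response:
        return json.load(response)


def summarize_alert_rules(payload):
    """Map each alerting rule name to its current state (inactive, pending, firing)."""
    summary = {}
    for group in payload["data"]["groups"]:
        for rule in group["rules"]:
            # Recording rules have type "recording" and no alert state.
            if rule.get("type") == "alerting":
                summary[rule["name"]] = rule.get("state", "unknown")
    return summary
```

For example, summarize_alert_rules(fetch_rules()) returns a dictionary with one entry per alert rule loaded from your rule files, so a missing rule name points to a file that was not picked up by rule_files.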