..
  This file is part of GNU TALER.
  Copyright (C) 2014-2023 Taler Systems SA

  TALER is free software; you can redistribute it and/or modify it under the
  terms of the GNU Affero General Public License as published by the Free Software
  Foundation; either version 2.1, or (at your option) any later version.

  TALER is distributed in the hope that it will be useful, but WITHOUT ANY
  WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
  A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

  You should have received a copy of the GNU Affero General Public License along with
  TALER; see the file COPYING. If not, see

  @author Javier Sepulveda

Prometheus alerts
#################

.. contents:: Table of Contents
   :depth: 1
   :local:

Prometheus periodically evaluates the queries defined in the alert rule files.
When one of these conditions is met, the alert is passed to the Alertmanager, which sends a notification (e.g. by email)
either directly to the specified contact point or to a group of contact points (these groups are called notification policies).

The basic concepts of the Prometheus alerting system are explained below. Please check the official Prometheus
documentation or the Grafana documentation if you need additional information.

* Alert rules: One or more queries (expressions) that measure something (e.g. disk space, memory, or CPU usage).

  - Each alert rule contains a condition with a specific threshold.
  - Each alert rule can name a specific contact point to send the notifications to.
  - Within the same alert rule, you can have multiple alert instances.

* Contact points: The notification message itself, together with the specific address to send the notification to.
* Notification policies: This feature allows you to group several contact points under the same label name.

Install Prometheus alert manager
================================

.. code-block:: console

   # apt install prometheus-alertmanager
   # systemctl start prometheus-alertmanager
   # systemctl status prometheus-alertmanager

Edit the Prometheus configuration file
======================================
To make Prometheus talk to the alerting system, you need to
specify the Alertmanager address in the main Prometheus configuration file.

.. code-block:: yaml

   # Path: /etc/prometheus/prometheus.yml
   # Add this at the end of the YAML file

   # Alertmanager configuration
   alerting:
     alertmanagers:
       - static_configs:
           - targets: ["localhost:9093"]

Alert rules configuration file
==============================

- Create your first rule file, ``first_rule.yml``.

.. note::

   The code shown below is just an example for CPU, disk and memory usage.

.. code-block:: yaml

   # Path: /etc/prometheus/first_rule.yml
   groups:
     - name: node_exporter_alerts
       rules:
         - alert: HighCPULatency
           expr: sum(rate(node_cpu_seconds_total{mode="system"}[1m])) / count(node_cpu_seconds_total{mode="system"}) * 100 > 80
           for: 1m
           labels:
             severity: warning
           annotations:
             summary: "High CPU Latency detected"
             description: "CPU latency is above 80% for more than 1 minute."
         - alert: LowDiskSpace
           expr: (node_filesystem_free_bytes / node_filesystem_size_bytes) * 100 < 10
           for: 1m
           labels:
             severity: critical
           annotations:
             summary: "Low Disk Space detected"
             description: "Disk space is below 10% for more than 1 minute."
         - alert: HighMemoryUsage
           expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 80
           for: 1m
           labels:
             severity: warning
           annotations:
             summary: "High Memory Usage detected"
             description: "Memory usage is above 80% for more than 1 minute."

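
A rule such as ``LowDiskSpace`` can produce one alert instance per time series that matches its expression,
for example one per mounted filesystem. The following variant is only a sketch of how the annotations could
identify the affected instance by using the ``$labels`` and ``$value`` template variables of Prometheus.
The group name, the ``fstype`` filter and the ``5m`` duration are assumptions chosen for illustration, not part of the setup above.

.. code-block:: yaml

   # Sketch only: the group name, fstype filter and "for" duration are assumed values.
   groups:
     - name: node_exporter_alerts_templated
       rules:
         - alert: LowDiskSpace
           # One alert instance is created for every filesystem below the threshold.
           expr: (node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) * 100 < 10
           for: 5m
           labels:
             severity: critical
           annotations:
             # $labels holds the labels of the series that triggered this instance.
             summary: 'Low disk space on {{ $labels.instance }}'
             # $value is the percentage computed by the expression.
             description: '{{ $labels.mountpoint }} has {{ printf "%.1f" $value }}% free space left.'
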
Configure SMTP
==============

.. code-block:: yaml

   # Path: /etc/prometheus/alertmanager.yml
   global:
     smtp_smarthost: 'smtp.example.com:587'
     smtp_from: 'alertmanager@example.com'
     smtp_auth_username: 'yourusername'
     smtp_auth_password: 'yourpassword'

   route:
     receiver: 'email'

   receivers:
     - name: 'email'
       email_configs:
         - to: 'recipient@example.com'
           send_resolved: true

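
The example rules label their alerts with ``severity: warning`` or ``severity: critical``. As a sketch of how
such labels could be used to send alerts to different contact points, the ``route`` section can contain child routes
that match on a label. The receiver name ``email-oncall`` and the address ``oncall@example.com`` are hypothetical,
and the ``matchers`` syntax assumes a reasonably recent Alertmanager release (0.22 or later).

.. code-block:: yaml

   # Sketch only: "email-oncall" and its address are hypothetical.
   route:
     receiver: 'email'              # default receiver for all alerts
     routes:
       - matchers:
           - severity="critical"    # only alerts carrying this label value
         receiver: 'email-oncall'

   receivers:
     - name: 'email'
       email_configs:
         - to: 'recipient@example.com'
           send_resolved: true
     - name: 'email-oncall'
       email_configs:
         - to: 'oncall@example.com'
           send_resolved: true

Alerts that match no child route fall back to the top-level receiver.
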
Add your alert rules to Prometheus
==================================

.. code-block:: yaml

   # Path: /etc/prometheus/prometheus.yml
   # List your alert rule files here
   rule_files:
     - "first_rule.yml"
     # - "second_rule.yml"

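
Read together, the two fragments added to ``prometheus.yml`` might end up looking like the sketch below.
The ``global`` and ``scrape_configs`` sections are assumptions added for context: a packaged configuration
usually ships its own versions of them, and the target ``localhost:9100`` assumes a node exporter
listening on its default port on the same host.

.. code-block:: yaml

   # Sketch of /etc/prometheus/prometheus.yml; lines marked "assumed" are illustrative.
   global:
     scrape_interval: 15s        # assumed
     evaluation_interval: 15s    # assumed; how often the alert rules are evaluated

   alerting:
     alertmanagers:
       - static_configs:
           - targets: ["localhost:9093"]

   rule_files:
     - "first_rule.yml"          # relative paths are resolved from the directory of this file

   scrape_configs:
     - job_name: node            # assumed job name
       static_configs:
         - targets: ["localhost:9100"]   # assumed node exporter address
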
Edit the alertmanager systemd service file
============================================

.. code-block:: systemd

   # Path: /usr/lib/systemd/system/prometheus-alertmanager.service
   [Unit]
   Description=Alertmanager for prometheus
   Documentation=https://prometheus.io/docs/alerting/alertmanager/

   [Service]
   Restart=on-failure
   User=prometheus
   EnvironmentFile=/etc/default/prometheus-alertmanager
   # Add the --cluster.advertise-address option, as otherwise it won't work.
   # Replace "ip" with the address of this host. The comment must stay on its
   # own line, because systemd does not support trailing comments.
   ExecStart=/usr/bin/prometheus-alertmanager \
     --cluster.advertise-address="ip:9093"
   ExecReload=/bin/kill -HUP $MAINPID
   TimeoutStopSec=20s
   SendSIGKILL=no

   [Install]
   WantedBy=multi-user.target

.. code-block:: console

   # systemctl daemon-reload
   # systemctl restart prometheus-alertmanager
   # systemctl restart prometheus

Check
=====
You can check both your rules (http://ip:9090/rules) and your alerts (http://ip:9090/alerts) from your web browser,
where ``ip`` is the address of the Prometheus server.
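
To check the whole chain, including the email notification, one option is to temporarily load a rule that always fires.
The following is a minimal sketch: the file name ``test_rule.yml``, the group name and the alert name are hypothetical,
and the file would have to be listed under ``rule_files`` and Prometheus restarted before the alert shows up.

.. code-block:: yaml

   # Hypothetical path: /etc/prometheus/test_rule.yml
   # Remove this file again once the notification has been received.
   groups:
     - name: test_alerts
       rules:
         - alert: AlwaysFiring
           # vector(1) always returns a sample, so the condition is always met.
           expr: vector(1)
           labels:
             severity: warning
           annotations:
             summary: "Test alert"
             description: "This alert always fires and only tests the notification path."

Because the rule has no ``for`` clause, the alert should appear as firing on the alerts page after the next rule evaluation.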