..
  This file is part of GNU TALER.
  Copyright (C) 2014-2023 Taler Systems SA

  TALER is free software; you can redistribute it and/or modify it under the
  terms of the GNU Affero General Public License as published by the Free Software
  Foundation; either version 2.1, or (at your option) any later version.

  TALER is distributed in the hope that it will be useful, but WITHOUT ANY
  WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
  A PARTICULAR PURPOSE.  See the GNU Affero General Public License for more details.

  You should have received a copy of the GNU Affero General Public License along with
  TALER; see the file COPYING.  If not, see

  @author Javier Sepulveda

Prometheus alerts
#################

.. contents:: Table of Contents
   :depth: 1
   :local:

Prometheus periodically evaluates the queries defined in the alert rule files.
When any of these conditions is met, the Alertmanager sends a notification
(e.g. email) directly to the specified contact points, or to a specific group
of contact points (these are named notification policies).

The basic concepts of the Prometheus alerting system are explained below.
Please check the official Prometheus documentation or the Grafana
documentation if you need additional information.

* Alert rules: One or more queries (expressions) to measure (e.g. disk space, memory, or CPU usage).

  - Each alert rule contains a condition with a specific threshold.
  - Each alert rule can contain a precise contact point to send the notifications to.
  - Within the same alert rule, you can specify multiple alert instances.

* Contact points: This is the message notification itself, in conjunction with the specific address to send the notification to.
* Notification policies: This feature allows you to gather a group of different contact points under the same label name.

Install Prometheus alert manager
================================

.. code-block:: console

   # apt install prometheus-alertmanager
   # systemctl start prometheus-alertmanager
   # systemctl status prometheus-alertmanager

Edit the Prometheus configuration file
======================================

To make Prometheus talk to the alerting system, you need to specify the
Alertmanager address in the main Prometheus configuration file.

.. code-block:: yaml

   # Path: /etc/prometheus/prometheus.yml
   # Add this at the end of the yml file
   # Alertmanager configuration
   alerting:
     alertmanagers:
       - static_configs:
           - targets: ["localhost:9093"]

Alert rules configuration file
==============================

- Create your first rule file, ``first_rule.yml``.

.. note::

   The code shown below is just an example for CPU, disk and memory usage.

.. code-block:: yaml

   # Path: /etc/prometheus/first_rule.yml
   groups:
     - name: node_exporter_alerts
       rules:
         # Average share of CPU time spent in system mode, in percent
         - alert: HighCPULatency
           expr: sum(rate(node_cpu_seconds_total{mode="system"}[1m])) / count(node_cpu_seconds_total{mode="system"}) * 100 > 80
           for: 1m
           labels:
             severity: warning
           annotations:
             summary: "High CPU Latency detected"
             description: "CPU latency is above 80% for more than 1 minute."
         # Free space per filesystem, in percent
         - alert: LowDiskSpace
           expr: (node_filesystem_free_bytes / node_filesystem_size_bytes) * 100 < 10
           for: 1m
           labels:
             severity: critical
           annotations:
             summary: "Low Disk Space detected"
             description: "Disk space is below 10% for more than 1 minute."
         # Memory in use (total minus available), in percent
         - alert: HighMemoryUsage
           expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 80
           for: 1m
           labels:
             severity: warning
           annotations:
             summary: "High Memory Usage detected"
             description: "Memory usage is above 80% for more than 1 minute."

Configure SMTP
==============

.. code-block:: yaml

   # Path: /etc/prometheus/alertmanager.yml
   global:
     smtp_smarthost: 'smtp.example.com:587'
     smtp_from: 'alertmanager@example.com'
     smtp_auth_username: 'yourusername'
     smtp_auth_password: 'yourpassword'

   route:
     receiver: 'email'

   receivers:
     - name: 'email'
       email_configs:
         - to: 'recipient@example.com'
           send_resolved: true

Add your alert rules to Prometheus
==================================

.. code-block:: yaml

   # Path: /etc/prometheus/prometheus.yml
   # Add your alert rule files here
   rule_files:
     - "first_rule.yml"
     # - "second_rule.yml"

Edit the alertmanager systemd service file
==========================================

.. code-block:: systemd

   # Path: /usr/lib/systemd/system/prometheus-alertmanager.service
   [Unit]
   Description=Alertmanager for prometheus
   Documentation=https://prometheus.io/docs/alerting/alertmanager/

   [Service]
   Restart=on-failure
   User=prometheus
   EnvironmentFile=/etc/default/prometheus-alertmanager
   # Add the --cluster.advertise-address option, as otherwise it won't work.
   # Replace "ip" with the address of your server. Keep this comment on its
   # own line: systemd does not support trailing comments.
   ExecStart=/usr/bin/prometheus-alertmanager \
     --cluster.advertise-address="ip:9093"
   ExecReload=/bin/kill -HUP $MAINPID
   TimeoutStopSec=20s
   SendSIGKILL=no

   [Install]
   WantedBy=multi-user.target

.. code-block:: console

   # systemctl daemon-reload
   # systemctl restart prometheus-alertmanager
   # systemctl restart prometheus

Check
=====

You can check both your rules (http://ip:9090/rules) and alerts
(http://ip:9090/alerts) from your web browser.
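
Once the rules and alerts show up, the routing can be refined. The single
``route`` in the SMTP section sends every alert to the same receiver. The
fragment below is a minimal sketch of how a notification policy could be
expressed in Alertmanager: it routes alerts by the ``severity`` label already
used in the example rules. The receiver name ``email-oncall`` and the address
``oncall@example.com`` are hypothetical placeholders, and the ``matchers``
syntax assumes a reasonably recent Alertmanager (0.22 or later); older
versions use ``match`` instead.

.. code-block:: yaml

   # Path: /etc/prometheus/alertmanager.yml
   # Sketch only: receiver names and addresses below are placeholders.
   route:
     # Default receiver for alerts that match no child route
     receiver: 'email'
     # Bundle alerts with the same alert name into one notification
     group_by: ['alertname']
     routes:
       # Alerts labelled severity: critical go to a separate receiver
       - matchers:
           - severity="critical"
         receiver: 'email-oncall'

   receivers:
     - name: 'email'
       email_configs:
         - to: 'recipient@example.com'
           send_resolved: true
     - name: 'email-oncall'        # hypothetical receiver
       email_configs:
         - to: 'oncall@example.com' # hypothetical address
           send_resolved: true

With this layout, the ``LowDiskSpace`` example alert (``severity: critical``)
would be delivered to ``email-oncall``, while the two ``warning`` alerts would
still go to the default ``email`` receiver.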