mirror of
https://gitea.com/mcereda/oam.git
synced 2026-02-09 05:44:23 +00:00
feat(kb/prometheus): filter out metrics
This commit is contained in:
3
.vscode/settings.json
vendored
3
.vscode/settings.json
vendored
@@ -65,6 +65,7 @@
|
|||||||
"adduser",
|
"adduser",
|
||||||
"airgap",
|
"airgap",
|
||||||
"airgapped",
|
"airgapped",
|
||||||
|
"alertmanager",
|
||||||
"apiserver",
|
"apiserver",
|
||||||
"asciicast",
|
"asciicast",
|
||||||
"asciinema",
|
"asciinema",
|
||||||
@@ -214,6 +215,7 @@
|
|||||||
"openmediavault",
|
"openmediavault",
|
||||||
"openpgp",
|
"openpgp",
|
||||||
"opentofu",
|
"opentofu",
|
||||||
|
"openwrt",
|
||||||
"opkg",
|
"opkg",
|
||||||
"ossec",
|
"ossec",
|
||||||
"pacman",
|
"pacman",
|
||||||
@@ -264,6 +266,7 @@
|
|||||||
"setfattr",
|
"setfattr",
|
||||||
"siem",
|
"siem",
|
||||||
"slurm",
|
"slurm",
|
||||||
|
"snmp",
|
||||||
"spiffe",
|
"spiffe",
|
||||||
"sshfs",
|
"sshfs",
|
||||||
"sshpass",
|
"sshpass",
|
||||||
|
|||||||
@@ -1,7 +1,9 @@
|
|||||||
# Prometheus
|
# Prometheus
|
||||||
|
|
||||||
Monitoring and alerting system that collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts when specified conditions are observed.<br/>
|
Monitoring and alerting system that collects metrics from configured targets at given intervals, evaluates rule
|
||||||
Metrics can also be pushed using plugins, in the event hosts are behind a firewall or prohibited from opening ports by security policy.
|
expressions, displays the results, and can trigger alerts when specified conditions are observed.<br/>
|
||||||
|
Metrics can also be pushed using plugins, in the event hosts are behind a firewall or prohibited from opening ports by
|
||||||
|
security policy.
|
||||||
|
|
||||||
## Table of contents <!-- omit in toc -->
|
## Table of contents <!-- omit in toc -->
|
||||||
|
|
||||||
@@ -9,6 +11,7 @@ Metrics can also be pushed using plugins, in the event hosts are behind a firewa
|
|||||||
1. [Extras](#extras)
|
1. [Extras](#extras)
|
||||||
1. [Configuration](#configuration)
|
1. [Configuration](#configuration)
|
||||||
1. [Queries](#queries)
|
1. [Queries](#queries)
|
||||||
|
1. [Filter metrics](#filter-metrics)
|
||||||
1. [Further readings](#further-readings)
|
1. [Further readings](#further-readings)
|
||||||
1. [Sources](#sources)
|
1. [Sources](#sources)
|
||||||
|
|
||||||
@@ -17,14 +20,18 @@ Metrics can also be pushed using plugins, in the event hosts are behind a firewa
|
|||||||
Prometheus is composed by its **server**, the **Alertmanager** and its **exporters**.
|
Prometheus is composed by its **server**, the **Alertmanager** and its **exporters**.
|
||||||
|
|
||||||
Alerting rules can be created within Prometheus, and configured to send custom alerts to _Alertmanager_.<br/>
|
Alerting rules can be created within Prometheus, and configured to send custom alerts to _Alertmanager_.<br/>
|
||||||
Alertmanager then processes and handles the alerts, including sending notifications through different mechanisms or third-party services.
|
Alertmanager then processes and handles the alerts, including sending notifications through different mechanisms or
|
||||||
|
third-party services.
|
||||||
|
|
||||||
The _exporters_ can be libraries, processes, devices, or anything else exposing metrics so that they can be scraped by Prometheus.<br/>
|
The _exporters_ can be libraries, processes, devices, or anything else exposing metrics so that they can be scraped by
|
||||||
Such metrics are usually made available at the `/metrics` endpoint, which allows them to be scraped directly from Prometheus without the need of an agent.
|
Prometheus.<br/>
|
||||||
|
Such metrics are usually made available at the `/metrics` endpoint, which allows them to be scraped directly from
|
||||||
|
Prometheus without the need of an agent.
|
||||||
|
|
||||||
### Extras
|
### Extras
|
||||||
|
|
||||||
As welcomed addition, [Grafana] can be configured to use Prometheus as a backend of its in order to provide data visualization and dashboarding functions on the data it provides.
|
As welcomed addition, [Grafana] can be configured to use Prometheus as a backend of its in order to provide data
|
||||||
|
visualization and dashboarding functions on the data it provides.
|
||||||
|
|
||||||
## Configuration
|
## Configuration
|
||||||
|
|
||||||
@@ -45,14 +52,20 @@ scrape_configs:
|
|||||||
- job_name: router
|
- job_name: router
|
||||||
static_configs:
|
static_configs:
|
||||||
- targets: [ 'openwrt.local:9100' ]
|
- targets: [ 'openwrt.local:9100' ]
|
||||||
|
metric_relabel_configs:
|
||||||
|
- source_labels: [__name__]
|
||||||
|
action: keep
|
||||||
|
regex: '(node_cpu)'
|
||||||
```
|
```
|
||||||
|
|
||||||
## Queries
|
## Queries
|
||||||
|
|
||||||
Prometheus' query syntax is [PromQL].
|
Prometheus' query syntax is [PromQL].
|
||||||
|
|
||||||
All data is stored as time series, each one identified by a metric name, e.g. `node_filesystem_avail_bytes` for available filesystem space.<br/>
|
All data is stored as time series, each one identified by a metric name, e.g. `node_filesystem_avail_bytes` for
|
||||||
Metrics' names can be used in the expressions to select all of the time series with this name and produce an **instant vector**.
|
available filesystem space.<br/>
|
||||||
|
Metrics' names can be used in the expressions to select all of the time series with this name and produce an
|
||||||
|
**instant vector**.
|
||||||
|
|
||||||
Time series can be filtered using selectors and labels (sets of key-value pairs):
|
Time series can be filtered using selectors and labels (sets of key-value pairs):
|
||||||
|
|
||||||
@@ -77,8 +90,40 @@ When using time ranges, the vector returned will be a **range vector**.
|
|||||||
|
|
||||||

|

|
||||||
|
|
||||||
Labels are used to filter the job and the mode. `node_cpu_seconds_total` returns a **counter**, and the irate() function calculates the **per-second rate of change** based on the last two data points of the range interval.<br/>
|
Labels are used to filter the job and the mode. `node_cpu_seconds_total` returns a **counter**, and the irate() function
|
||||||
To calculate the overall CPU usage, the idle mode of the metric is used. Since idle percent of a processor is the opposite of a busy processor, the irate value is subtracted from 1. To make it a percentage, it is multiplied by 100.
|
calculates the **per-second rate of change** based on the last two data points of the range interval.<br/>
|
||||||
|
To calculate the overall CPU usage, the idle mode of the metric is used. Since idle percent of a processor is the
|
||||||
|
opposite of a busy processor, the irate value is subtracted from 1. To make it a percentage, it is multiplied by 100.
|
||||||
|
|
||||||
|
## Filter metrics
|
||||||
|
|
||||||
|
Refer [How relabeling in Prometheus works], [Scrape selective metrics in Prometheus] and
|
||||||
|
[Dropping metrics at scrape time with Prometheus].
|
||||||
|
|
||||||
|
Use [metric relabeling configurations][metric_relabel_configs] to select which series to ingest **after** scraping:
|
||||||
|
|
||||||
|
```diff
|
||||||
|
scrape_configs:
|
||||||
|
- job_name: router
|
||||||
|
…
|
||||||
|
+ metric_relabel_configs:
|
||||||
|
+ - # do *not* record metrics which name matches the regex
|
||||||
|
+ # in this case, those which name starts with 'node_disk_'
|
||||||
|
+ source_labels: [ __name__ ]
|
||||||
|
+ action: drop
|
||||||
|
+ regex: node_disk_.*
|
||||||
|
- job_name: hosts
|
||||||
|
…
|
||||||
|
+ metric_relabel_configs:
|
||||||
|
+ - # *only* record metrics which name matches the regex
|
||||||
|
+ # in this case, those which name starts with 'node_cpu_' with cpu=1 and mode=user
|
||||||
|
+ source_labels:
|
||||||
|
+ - __name__
|
||||||
|
+ - cpu
|
||||||
|
+ - mode
|
||||||
|
+ regex: node_cpu_.*1.*user.*
|
||||||
|
+ action: keep
|
||||||
|
```
|
||||||
|
|
||||||
## Further readings
|
## Further readings
|
||||||
|
|
||||||
@@ -86,7 +131,7 @@ To calculate the overall CPU usage, the idle mode of the metric is used. Since i
|
|||||||
- [Github]
|
- [Github]
|
||||||
- [`docker/monitoring`][docker/monitoring]
|
- [`docker/monitoring`][docker/monitoring]
|
||||||
- [Node exporter]
|
- [Node exporter]
|
||||||
- [SMNP exporter]
|
- [SNMP exporter]
|
||||||
- [`ordaa/boinc_exporter`][ordaa/boinc_exporter]
|
- [`ordaa/boinc_exporter`][ordaa/boinc_exporter]
|
||||||
- [Grafana]
|
- [Grafana]
|
||||||
|
|
||||||
@@ -100,20 +145,15 @@ All the references in the [further readings] section, plus the following:
|
|||||||
- [`prometheus/node_exporter`][prometheus/node_exporter]
|
- [`prometheus/node_exporter`][prometheus/node_exporter]
|
||||||
- [`prometheus/snmp_exporter`][prometheus/snmp_exporter]
|
- [`prometheus/snmp_exporter`][prometheus/snmp_exporter]
|
||||||
- [How I monitor my OpenWrt router with Grafana Cloud and Prometheus]
|
- [How I monitor my OpenWrt router with Grafana Cloud and Prometheus]
|
||||||
|
- [Scrape selective metrics in Prometheus]
|
||||||
|
- [Dropping metrics at scrape time with Prometheus]
|
||||||
|
- [How relabeling in Prometheus works]
|
||||||
|
|
||||||
<!--
|
<!--
|
||||||
References
|
Reference
|
||||||
|
═╬═Time══
|
||||||
-->
|
-->
|
||||||
|
|
||||||
<!-- Upstream -->
|
|
||||||
[functions]: https://prometheus.io/docs/prometheus/latest/querying/functions/
|
|
||||||
[github]: https://github.com/prometheus/prometheus
|
|
||||||
[node exporter guide]: https://prometheus.io/docs/guides/node-exporter/
|
|
||||||
[prometheus/node_exporter]: https://github.com/prometheus/node_exporter
|
|
||||||
[prometheus/snmp_exporter]: https://github.com/prometheus/snmp_exporter
|
|
||||||
[promql]: https://prometheus.io/docs/prometheus/latest/querying/basics/
|
|
||||||
[website]: https://prometheus.io/
|
|
||||||
|
|
||||||
<!-- In-article sections -->
|
<!-- In-article sections -->
|
||||||
[further readings]: #further-readings
|
[further readings]: #further-readings
|
||||||
|
|
||||||
@@ -123,10 +163,23 @@ All the references in the [further readings] section, plus the following:
|
|||||||
[snmp exporter]: snmp%20exporter.md
|
[snmp exporter]: snmp%20exporter.md
|
||||||
|
|
||||||
<!-- Files -->
|
<!-- Files -->
|
||||||
[docker/monitoring]: ../docker/monitoring/README.md
|
[docker/monitoring]: ../containers/monitoring/README.md
|
||||||
|
|
||||||
|
<!-- Upstream -->
|
||||||
|
[functions]: https://prometheus.io/docs/prometheus/latest/querying/functions/
|
||||||
|
[github]: https://github.com/prometheus/prometheus
|
||||||
|
[node exporter guide]: https://prometheus.io/docs/guides/node-exporter/
|
||||||
|
[prometheus/node_exporter]: https://github.com/prometheus/node_exporter
|
||||||
|
[prometheus/snmp_exporter]: https://github.com/prometheus/snmp_exporter
|
||||||
|
[promql]: https://prometheus.io/docs/prometheus/latest/querying/basics/
|
||||||
|
[website]: https://prometheus.io/
|
||||||
|
[metric_relabel_configs]: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs
|
||||||
|
|
||||||
<!-- Others -->
|
<!-- Others -->
|
||||||
|
[dropping metrics at scrape time with prometheus]: https://www.robustperception.io/dropping-metrics-at-scrape-time-with-prometheus/
|
||||||
[getting started with prometheus]: https://opensource.com/article/18/12/introduction-prometheus
|
[getting started with prometheus]: https://opensource.com/article/18/12/introduction-prometheus
|
||||||
[how i monitor my openwrt router with grafana cloud and prometheus]: https://grafana.com/blog/2021/02/09/how-i-monitor-my-openwrt-router-with-grafana-cloud-and-prometheus/
|
[how i monitor my openwrt router with grafana cloud and prometheus]: https://grafana.com/blog/2021/02/09/how-i-monitor-my-openwrt-router-with-grafana-cloud-and-prometheus/
|
||||||
[ordaa/boinc_exporter]: https://gitlab.com/ordaa/boinc_exporter
|
[ordaa/boinc_exporter]: https://gitlab.com/ordaa/boinc_exporter
|
||||||
|
[scrape selective metrics in prometheus]: https://docs.last9.io/docs/how-to-scrape-only-selective-metrics-in-prometheus
|
||||||
[snmp monitoring and easing it with prometheus]: https://medium.com/@openmohan/snmp-monitoring-and-easing-it-with-prometheus-b157c0a42c0c
|
[snmp monitoring and easing it with prometheus]: https://medium.com/@openmohan/snmp-monitoring-and-easing-it-with-prometheus-b157c0a42c0c
|
||||||
|
[how relabeling in prometheus works]: https://grafana.com/blog/2022/03/21/how-relabeling-in-prometheus-works/
|
||||||
|
|||||||
Reference in New Issue
Block a user