diff --git a/knowledge base/cloud computing/aws/cli.md b/knowledge base/cloud computing/aws/cli.md index 4cfd0aa..4eac3d0 100644 --- a/knowledge base/cloud computing/aws/cli.md +++ b/knowledge base/cloud computing/aws/cli.md @@ -316,6 +316,7 @@ yubikeytotp = awscli_plugin_yubikeytotp ## Further readings - [Amazon Web Services] +- [Codebase] - CLI [quickstart] - [Configure profiles] in the CLI - [How do I assume an IAM role using the AWS CLI?] @@ -349,6 +350,7 @@ yubikeytotp = awscli_plugin_yubikeytotp [cli config files]: ../../../examples/dotfiles/.aws +[codebase]: https://github.com/aws/aws-cli/tree/v2 [configure profiles]: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-profiles.html [how do i assume an iam role using the aws cli?]: https://repost.aws/knowledge-center/iam-assume-role-cli [improved cli auto-prompt mode]: https://github.com/aws/aws-cli/issues/5664 diff --git a/knowledge base/cortex.md b/knowledge base/cortex.md new file mode 100644 index 0000000..90f4c6d --- /dev/null +++ b/knowledge base/cortex.md @@ -0,0 +1,78 @@ +# Cortex + +> TODO + +Intro + + + +1. [TL;DR](#tldr) +1. [Further readings](#further-readings) + 1. [Sources](#sources) + +## TL;DR + + + + + + + +## Further readings + +- [Website] +- [Codebase] +- [Prometheus] + +Alternatives: + +- Grafana's [Mimir] +- [Thanos] + +### Sources + +- [Documentation] + + + + + +[mimir]: mimir.md +[prometheus]: prometheus.md +[thanos]: thanos.md + + + +[codebase]: https://github.com/cortexproject/cortex +[documentation]: https://cortexmetrics.io/docs/ +[website]: https://cortexmetrics.io/ + + diff --git a/knowledge base/dig.md b/knowledge base/dig.md index 4f2c862..cbb34f4 100644 --- a/knowledge base/dig.md +++ b/knowledge base/dig.md @@ -52,6 +52,9 @@ dig '@8.8.8.8' 'google.com' # Return all results. dig 'google.com' 'ANY' + +# Only return the first answer. 
+dig +short 'google.com' ``` @@ -61,6 +64,7 @@ dig 'google.com' 'ANY' ```sh dig +trace '@1.1.1.1' 'google.com' +dig 'A' +short '@172.31.0.2' 'fs-0123456789abcdef0.efs.eu-west-1.amazonaws.com' ``` diff --git a/knowledge base/mimir.md b/knowledge base/mimir.md new file mode 100644 index 0000000..3414a02 --- /dev/null +++ b/knowledge base/mimir.md @@ -0,0 +1,335 @@ +# Grafana's Mimir + +Metrics aggregator. + +Allows ingesting [Prometheus] or OpenTelemetry metrics, run queries, create new data through the use of recording rules, +and set up alerting rules across multiple tenants to leverage tenant federation. + + + +1. [TL;DR](#tldr) +1. [Setup](#setup) + 1. [Monolithic mode](#monolithic-mode) + 1. [Microservices mode](#microservices-mode) +1. [Storage](#storage) + 1. [Object storage](#object-storage) +1. [APIs](#apis) +1. [Deduplication of data from multiple Prometheus scrapers](#deduplication-of-data-from-multiple-prometheus-scrapers) +1. [Migrate to Mimir](#migrate-to-mimir) +1. [Further readings](#further-readings) + 1. [Sources](#sources) + +## TL;DR + +Scrapers (like Prometheus or Grafana's Alloy) need to send metrics data to Mimir.
+Mimir will **not** scrape metrics itself. + +Mimir listens by default on port `8080` for HTTP and on port `9095` for gRPC. + +Mimir stores time series in TSDB blocks that are uploaded to an object storage bucket.
+These are the same blocks that Prometheus and Thanos use, though each application stores blocks in different places and +uses slightly different metadata files for them. + +Mimir supports multiple tenants, and stores blocks on a **per-tenant** level.
+When multi-tenancy is **disabled**, it will only manage a single tenant going by the name `anonymous`. + +Blocks can be uploaded using the `mimirtool` utility, so that Mimir can access them.
+Mimir **will** perform some sanitization and validation of each block's metadata. + +```sh +mimirtool backfill --address='http://mimir.example.org' --id='anonymous' 'block_1' … 'block_N' +``` + +As a result of validation, Mimir will probably reject Thanos' blocks due to unsupported labels.
+As a workaround, upload Thanos' blocks directly to Mimir's blocks bucket, using the `//` prefix. + +
+ Setup + +```sh +docker pull 'grafana/mimir' + +mimir +docker run --rm --name 'mimir' --publish '8080:8080' --publish '9095:9095' 'grafana/mimir' + +mimir --config.file='./demo.yaml' +docker run --rm --name 'mimir' --publish '8080:8080' --publish '9095:9095' \ + --volume "$PWD/config.yaml:/etc/mimir/config.yaml" \ + 'grafana/mimir' --config.file='/etc/mimir/config.yaml' +``` + +
+ +
+ Usage + +```sh +# Get help. +mimir -help +mimir -help-all + +# Validate configuration files +mimir -modules -config.file 'path/to/config.yaml' + +# See the current configuration of components +GET /config +GET /runtime_config + +# See changes in the runtime configuration from the default one +GET /runtime_config?mode=diff + +# Check the service is ready +# A.K.A. readiness probe +GET /ready + +# Get metrics +GET /metrics +``` + +
+ + + +## Setup + +Mimir's configuration file is YAML-based.
+There is no default configuration file, but one _can_ be specified on launch. + +```sh +mimir --config.file='./demo.yaml' + +docker run --rm --name 'mimir' --publish '8080:8080' --publish '9095:9095' \ + --volume "$PWD/config.yaml:/etc/mimir/config.yaml" \ + 'grafana/mimir' --config.file='/etc/mimir/config.yaml' +``` + +Refer to [Grafana Mimir configuration parameters] for the available parameters. + +If enabled, environment variable references can be used in the configuration file to set values that need to be +configurable during deployment.
+This feature is enabled on the command line via the `-config.expand-env=true` option. + +Each variable reference is replaced at startup by the value of the environment variable.
+The replacement is case-**sensitive**, and occurs **before** the YAML file is parsed.
+References to undefined variables are replaced by empty strings unless a default value or custom error text is +specified. + +Use the `${VAR}` placeholder, optionally specifying a default value with `${VAR:default_value}`, where `VAR` is the name +of the environment variable and `default_value` is the value to use if the environment variable is undefined. + +Configuration files can be stored gz-compressed. In this case, add a `.gz` extension to those files that should be +decompressed before parsing. + +Mimir loads a given configuration file at startup. This configuration **cannot** be modified at runtime. + +Mimir supports _secondary_ configuration files that define the _runtime_'s configuration.
+This configuration is reloaded **dynamically**. This allows changing the runtime configuration without having to restart +Mimir's components or instance. + +Runtime configuration must be **explicitly** enabled, either on launch or in the configuration file under +`runtime_config`.
+If multiple runtime configuration files are specified, they will be **merged** left to right.
+Mimir reloads the contents of these files every 10 seconds. + +```sh +mimir … -runtime-config.file='path/to/file/1,path/to/file/N' +``` + +The runtime configuration only encompasses a **subset** of the whole configuration that was set at startup, but its +values take precedence over command-line options. + +Some settings are repeated for multiple components.
+To avoid repetition in the configuration file, set them up in the `common` configuration file section or give them to +Mimir using the `-common.*` CLI options.
+Common settings are applied to all components first, then the components' specific configurations override them. + +Settings are applied as follows, with each one applied later overriding the previous ones: + +1. YAML common values +1. YAML specific values +1. CLI common flags +1. CLI specific flags + +Specific configuration for one component that is passed to other components is simply ignored by those.
+This makes it safe to reuse files. + +Mimir can be deployed in one of two modes: + +- _Monolithic_, which runs all required components in a single process. +- _Microservices_, where components are run as distinct processes. + +The deployment mode is determined by the `-target` option given to Mimir's process. + +Whatever Mimir's deployment mode, it will need to receive data from other applications.
+It will **not** scrape metrics itself. + +
+Prometheus configuration + +```yaml +remote_write: + - url: http://mimir.example.org:9009/api/v1/push +``` + +
+ +[Grafana] considers Mimir a data source of type _Prometheus_, and it must be [provisioned](grafana.md#datasources) +accordingly.
+From there, metrics can be queried in Grafana's _Explore_ tab, or can populate dashboards that use Mimir as their data +source. + +### Monolithic mode + +Runs **all** required components in a **single** process. + +Can be horizontally scaled out by deploying multiple instances of Mimir's binary, all of them started with the +`-target=all` option. + +```mermaid +graph LR + r(Reads) + w(Writes) + lb(Load Balancer) + m1(Mimir
instance 1) + mN(Mimir
instance N) + os(Object Storage) + + r --> lb + w --> lb + lb --> m1 + lb --> mN + m1 --> os + mN --> os +``` + +### Microservices mode + +Mimir's components are deployed as distinct processes.
+Each process is invoked with its own `-target` option set to a specific component (e.g., `-target='ingester'` or +`-target='distributor'`). + +```mermaid +graph LR + r(Reads) + qf(Query Frontend) + q(Querier) + sg(Store Gateway) + w(Writes) + d(Distributor) + i(Ingester) + os(Object Storage) + c(Compactor) + + r --> qf --> q --> sg --> os + w --> d --> i --> os + os <--> c +``` + +**Every** required component **must** be deployed in order to have a working Mimir instance. + +This mode is the preferred method for production deployments, but it is also the most complex.
+Using Kubernetes and the [`mimir-distributed` Helm chart][helm chart] is recommended. + +Each component scales up independently.
+This allows for greater flexibility and more granular failure domains. + +## Storage + +Mimir supports the `s3`, `gcs`, `azure`, `swift`, and `filesystem` backends.
+`filesystem` is the default one. + +### Object storage + +Refer to [Configure Grafana Mimir object storage backend]. + +Blocks storage must be located under a **different** prefix or bucket than both the ruler's and AlertManager's stores. +Mimir **will** fail to start if that is **not** the case. + +To avoid that, it is suggested to override the `bucket_name` setting in the specific configurations: + +```yaml +common: + storage: + backend: s3 + s3: + endpoint: s3.us-east-2.amazonaws.com + region: us-east-2 + +blocks_storage: + s3: + bucket_name: mimir-blocks + +alertmanager_storage: + s3: + bucket_name: mimir-alertmanager + +ruler_storage: + s3: + bucket_name: mimir-ruler +``` + +## APIs + +Refer to [Grafana Mimir HTTP API]. + +## Deduplication of data from multiple Prometheus scrapers + +Refer to [Configure Grafana Mimir high-availability deduplication]. + +## Migrate to Mimir + +Refer to [Migrate from Thanos or Prometheus to Grafana Mimir]. + +## Further readings + +- [Website] +- [Codebase] +- [Prometheus] +- [Grafana] + +Alternatives: + +- [Cortex] +- [Thanos] + +### Sources + +- [Documentation] +- [Migrate from Thanos or Prometheus to Grafana Mimir] +- [Configure Grafana Mimir object storage backend] +- [Grafana Mimir configuration parameters] + + + + + +[cortex]: cortex.md +[grafana]: grafana.md +[prometheus]: prometheus.md +[thanos]: thanos.md + + + +[codebase]: https://github.com/grafana/mimir +[configure grafana mimir high-availability deduplication]: https://grafana.com/docs/mimir/latest/configure/configure-high-availability-deduplication/ +[configure grafana mimir object storage backend]: https://grafana.com/docs/mimir/latest/configure/configure-object-storage-backend/ +[documentation]: https://grafana.com/docs/mimir/latest/ +[grafana mimir configuration parameters]: https://grafana.com/docs/mimir/latest/configure/configuration-parameters/ +[grafana mimir http api]: https://grafana.com/docs/mimir/latest/references/http-api/ +[helm chart]: 
https://github.com/grafana/mimir/tree/main/operations/helm/charts/mimir-distributed +[migrate from thanos or prometheus to grafana mimir]: https://grafana.com/docs/mimir/latest/set-up/migrate/migrate-from-thanos-or-prometheus/ +[website]: https://grafana.com/oss/mimir/ + + diff --git a/knowledge base/prometheus.md b/knowledge base/prometheus.md index 4b34627..4d81a4c 100644 --- a/knowledge base/prometheus.md +++ b/knowledge base/prometheus.md @@ -21,8 +21,9 @@ prohibited from opening ports by security policies. 1. [Write to remote Prometheus servers](#write-to-remote-prometheus-servers) 1. [Management API](#management-api) 1. [Take snapshots of the current data](#take-snapshots-of-the-current-data) +1. [High availability](#high-availability) 1. [Further readings](#further-readings) - 1. [Sources](#sources) + 1. [Sources](#sources) ## TL;DR @@ -395,6 +396,17 @@ $ curl -X 'POST' 'http://localhost:9090/api/v1/admin/tsdb/snapshot' The snapshot now exists at `/snapshots/20171210T211224Z-2be650b6d019eb54` +## High availability + +Typically achieved by: + +1. Running multiple Prometheus replicas.
+ Replicas could each focus on a subset of the whole data, or just duplicate it. +1. Running a separate AlertManager instance.
+ This would handle alerts from all the Prometheus instances, automatically managing possibly duplicated data. +1. Using tools like [Thanos], [Cortex], or Grafana's [Mimir] to aggregate and deduplicate data. +1. Directing visualizers like Grafana to the aggregator instead of the Prometheus replicas. + ## Further readings - [Website] @@ -412,6 +424,9 @@ The snapshot now exists at `/snapshots/20171210T211224Z-2be650b6d019eb - [Prometheus Definitive Guide Part I - Metrics and Use Cases] - [Prometheus Definitive Guide Part II - Prometheus Query Language] - [Prometheus Definitive Guide Part III - Prometheus Operator] +- [Cortex] +- [Thanos] +- Grafana's [Mimir] ### Sources @@ -432,6 +447,7 @@ The snapshot now exists at `/snapshots/20171210T211224Z-2be650b6d019eb - [Install Prometheus and Grafana by Helm] - [Prometheus and Grafana setup in Minikube] - [I need to know about the below kube_state_metrics description. Exactly looking is what the particular metrics doing] +- [High Availability in Prometheus: Best Practices and Tips] +[cortex]: cortex.md [grafana]: grafana.md +[mimir]: mimir.md [node exporter]: node%20exporter.md [snmp exporter]: snmp%20exporter.md +[thanos]: thanos.md [docker/monitoring]: ../docker%20compositions/monitoring/README.md @@ -465,6 +484,7 @@ The snapshot now exists at `/snapshots/20171210T211224Z-2be650b6d019eb [dropping metrics at scrape time with prometheus]: https://www.robustperception.io/dropping-metrics-at-scrape-time-with-prometheus/ [getting started with prometheus]: https://opensource.com/article/18/12/introduction-prometheus [high availability for prometheus and alertmanager: an overview]: https://promlabs.com/blog/2023/08/31/high-availability-for-prometheus-and-alertmanager-an-overview/ +[high availability in prometheus: best practices and tips]: https://last9.io/blog/high-availability-in-prometheus/ [how i monitor my openwrt router with grafana cloud and prometheus]: 
https://grafana.com/blog/2021/02/09/how-i-monitor-my-openwrt-router-with-grafana-cloud-and-prometheus/ [how relabeling in prometheus works]: https://grafana.com/blog/2022/03/21/how-relabeling-in-prometheus-works/ [how to integrate prometheus and grafana on kubernetes using helm]: https://semaphoreci.com/blog/prometheus-grafana-kubernetes-helm diff --git a/knowledge base/thanos.md b/knowledge base/thanos.md new file mode 100644 index 0000000..bce1570 --- /dev/null +++ b/knowledge base/thanos.md @@ -0,0 +1,78 @@ +# Thanos + +> TODO + +Intro + + + +1. [TL;DR](#tldr) +1. [Further readings](#further-readings) + 1. [Sources](#sources) + +## TL;DR + + + + + + + +## Further readings + +- [Website] +- [Codebase] +- [Prometheus] + +Alternatives: + +- [Cortex] +- Grafana's [Mimir] + +### Sources + +- [Documentation] + + + + + +[cortex]: cortex.md +[mimir]: mimir.md +[prometheus]: prometheus.md + + + +[codebase]: https://github.com/thanos-io/thanos +[documentation]: https://thanos.io/tip/thanos/ +[website]: https://thanos.io/ + +