From 05f08628d7bc82048537368f0e9b7b61b911d6f0 Mon Sep 17 00:00:00 2001 From: Michele Cereda Date: Mon, 13 Jan 2025 01:09:43 +0100 Subject: [PATCH] chore(loki): expand kb --- examples/loki/default.package.yml | 47 ++++++++ knowledge base/loki.md | 188 +++++++++++++++++++++++++++++- snippets/useradd.sh | 5 + 3 files changed, 235 insertions(+), 5 deletions(-) create mode 100644 examples/loki/default.package.yml create mode 100644 snippets/useradd.sh diff --git a/examples/loki/default.package.yml b/examples/loki/default.package.yml new file mode 100644 index 0000000..f76bcd2 --- /dev/null +++ b/examples/loki/default.package.yml @@ -0,0 +1,47 @@ +--- + +# /etc/loki/loki.yaml + +auth_enabled: false + +server: + http_listen_port: 3100 + grpc_listen_port: 9096 + +common: + path_prefix: /var/lib/loki + storage: + filesystem: + chunks_directory: /var/lib/loki/chunks + rules_directory: /var/lib/loki/rules + replication_factor: 1 + ring: + instance_addr: 127.0.0.1 + kvstore: + store: inmemory + +schema_config: + configs: + - from: 2020-10-24 + store: boltdb-shipper + object_store: filesystem + schema: v11 + index: + prefix: index_ + period: 24h + +ruler: + alertmanager_url: http://localhost:9093 + +# By default, Loki will send anonymous, but uniquely-identifiable usage and configuration +# analytics to Grafana Labs. These statistics are sent to https://stats.grafana.org/ +# +# Statistics help us better understand how Loki is used, and they show us performance +# levels for most users. This helps us prioritize features and documentation. +# For more information on what's sent, look at +# https://github.com/grafana/loki/blob/main/pkg/usagestats/stats.go +# Refer to the buildReport method to see what goes into a report. +# +# If you would like to disable reporting, uncomment the following lines: +#analytics: +# reporting_enabled: false diff --git a/knowledge base/loki.md b/knowledge base/loki.md index 6722557..19f437b 100644 --- a/knowledge base/loki.md +++ b/knowledge base/loki.md @@ -4,16 +4,33 @@ Horizontally scalable, highly available, multi-tenant log aggregation system ins very cost-effective and easy to operate. 1. [TL;DR](#tldr) +1. [Components](#components) + 1. [Distributor](#distributor) + 1. [Ingester](#ingester) + 1. [Query frontend](#query-frontend) + 1. [Query scheduler](#query-scheduler) + 1. [Querier](#querier) + 1. [Index gateway](#index-gateway) + 1. [Compactor](#compactor) + 1. [Ruler](#ruler) +1. [Clients](#clients) 1. [Further readings](#further-readings) 1. [Sources](#sources) ## TL;DR -It indexes **a set of labels** for each log stream instead of the full logs' contents. +It indexes **a set of labels** for each log stream instead of the full logs' contents.
+The log data itself is then compressed and stored in chunks in object storage solutions, or locally on the host's
+filesystem.
 
-Needs agents or other clients to push logs to the server.
+Loki can be executed in _single binary_ mode, with all its components running simultaneously in one process, or in
+_simple scalable deployment_ mode, which groups components into read, write, and backend parts.
 
-Supports object storage solutions.
+Loki's files are either _indexes_ or _chunks_.
+Indexes are tables of contents, in TSDB format, recording where to find logs for specific sets of labels.
+Chunks are containers for log entries for specific sets of labels. + +Needs agents or other clients to collect and push logs to the Loki server.
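+Queries, in turn, select streams by label before filtering the lines within. A minimal smoke test of a running
+instance over the HTTP API; host, label, and filter are illustrative:
+
+```sh
+# Return up to 10 log lines from the last hour of streams labeled {job="test"}
+# that contain the string "error".
+curl --get 'http://loki.fqdn:3100/loki/api/v1/query_range' \
+  --data-urlencode 'query={job="test"} |= "error"' \
+  --data-urlencode 'limit=10'
+```
+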
Setup @@ -22,6 +39,7 @@ Supports object storage solutions. # Install via package repository. apt install 'loki' dnf install 'loki' +zypper install 'loki' # Run via Docker. docker run --name loki -d \ @@ -29,7 +47,17 @@ docker run --name loki -d \ 'grafana/loki:3.3.2' -config.file='/etc/loki/config.yml' ``` -Default configuration file for package-based installations is `/etc/loki/config.yml`. +Default configuration file for package-based installations is `/etc/loki/config.yml` or `/etc/loki/loki.yaml`. + +
+ Disable reporting + + ```yaml + analytics: + reporting_enabled: false + ``` + +
@@ -37,19 +65,168 @@ Default configuration file for package-based installations is `/etc/loki/config. Usage ```sh -# Check the server is working +# Check the server is working. curl 'http://loki.fqdn:3100/ready' curl 'http://loki.fqdn:3100/metrics' + +# Check components in Loki clusters are up and running. +# Such components must run by themselves for this. +# The read component returns ready when browsing to . +curl 'http://loki.fqdn:3101/ready' +# The write component returns ready when browsing to . +curl 'http://loki.fqdn:3102/ready' ``` +## Components + +### Distributor + +Handles incoming push requests from clients. + +Once it receives a set of streams in an HTTP request, validates each stream for correctness and to ensure the stream is +within the configured tenant (or global) limits.
+Each **valid** stream is sent to `n` ingesters in parallel, with `n` being the replication factor for the data.
+
+The distributor determines which ingesters to send a stream to by using consistent hashing.
+
+A load balancer **must** sit in front of the distributors to properly balance incoming traffic to them.
+In Kubernetes, this is provided by the internal service load balancer.
+
+The distributor is stateless and can be scaled horizontally.
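+
+For reference, this is the shape of the push requests distributors handle; a minimal sketch against Loki's push
+endpoint, with an illustrative label set (assumes GNU `date` for the nanosecond timestamp):
+
+```sh
+# Push one log line with label set {job="test"}.
+# Timestamps are nanoseconds since the Unix epoch, passed as strings.
+curl --request POST 'http://loki.fqdn:3100/loki/api/v1/push' \
+  --header 'Content-Type: application/json' \
+  --data "{\"streams\":[{\"stream\":{\"job\":\"test\"},\"values\":[[\"$(date '+%s%N')\",\"hello from curl\"]]}]}"
+```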
+
+### Ingester
+
+On the _write_ path, persists data and ships it to long-term storage.
+On the _read_ path, returns recently ingested, in-memory log data for queries.
+
+Ingesters contain a lifecycle subcomponent managing their own lifecycle in the hash ring.
+Each ingester has a state, which can be one of `PENDING`, `JOINING`, `ACTIVE`, `LEAVING`, or `UNHEALTHY`: + +- `PENDING`: the ingester is waiting for a handoff from another ingester in the `LEAVING` state.
+  This only applies to legacy deployment modes.
+- `JOINING`: the ingester is currently inserting its tokens into the ring and initializing itself.
+ It **may** receive write requests for tokens it owns. +- `ACTIVE`: the ingester is fully initialized.
+ It may receive both write and read requests for tokens it owns. +- `LEAVING`: the ingester is shutting down.
+ It may receive read requests for data it still has in memory. +- `UNHEALTHY`: the ingester has failed to heartbeat.
+ Set by the distributor when periodically checking the ring. + +Chunks are compressed and marked as read-only when: + +- The current chunk has reached the configured capacity. +- Too much time has passed without the current chunk being updated. +- A flush occurs. + +Whenever a chunk is compressed and marked as read-only, a writable chunk takes its place. + +If an ingester crashes or exits abruptly, all the data that has not yet been flushed **will be lost**.
+Replicas of each log mitigate this risk.
+
+When flushes occur to a persistent storage provider, the chunk in question is hashed based on its tenant, labels, and
+contents. Multiple ingesters with the same copy of the data will **not** write the same data to the backing store
+twice, but if a write to one of the replicas fails, multiple differing chunk objects **will** be created in the
+backing store.
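+
+The cut-over thresholds above (capacity and idle time) are tunable in the `ingester` block of the configuration. A
+minimal sketch; key names follow the Loki configuration reference, verify them against the version in use:
+
+```yaml
+ingester:
+  chunk_target_size: 1572864  # bytes; cut the current chunk once it reaches this size
+  chunk_idle_period: 30m      # cut chunks that received no new entries for this long
+  max_chunk_age: 2h           # force-cut chunks older than this regardless of traffic
+```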
+
+### Query frontend
+
+**Optional** service providing the querier's API endpoints.
+Can be used to accelerate the _read_ path.
+
+When in place, incoming query requests should be directed to the query frontend instead of the queriers.
+The querier service will **still** be required within the cluster in order to actually execute the queries.
+
+Performs query adjustments, like splitting queries by time range and caching results, and holds them in an internal
+queue.
+Queriers will act as workers to pull jobs from the queue, execute them, and return the results to the query frontend
+for aggregation. They **will** need to be configured with the query frontend's address to be able to connect to it.
+
+Query frontends are stateless and can be scaled horizontally.
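+
+That address is set in the queriers' `frontend_worker` block. A minimal sketch, with an illustrative address; key name
+per the Loki configuration reference:
+
+```yaml
+frontend_worker:
+  frontend_address: query-frontend.example.internal:9095
+```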
+
+### Query scheduler
+
+**Optional** service providing more advanced queuing functionality than the query frontend.
+When active, the query frontend pushes split-up queries to the query scheduler, which in turn enqueues them in an
+internal in-memory queue.
+
+Each tenant will get its own queue to guarantee fairness across all tenants.
+
+Queriers connecting to the query scheduler will act as workers to pull jobs from the queue, execute them, and return
+the results to the query frontend for aggregation. They **will** need to be configured with the query scheduler's
+address to be able to connect to it.
+
+Query schedulers are stateless and can be scaled horizontally.
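+
+With a scheduler in place, both the query frontend and the queriers point at it instead. A sketch with illustrative
+addresses, assuming key names from the Loki configuration reference:
+
+```yaml
+frontend:
+  scheduler_address: query-scheduler.example.internal:9095
+frontend_worker:
+  scheduler_address: query-scheduler.example.internal:9095
+```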
+
+### Querier
+
+Executes Log Query Language (LogQL) queries.
+
+Handles HTTP requests from the client directly, or pulls subqueries from the query frontend or query scheduler if
+configured to do so.
+
+Fetches log data from both the ingesters and from long-term storage.
+They query **all** ingesters for in-memory data before falling back to running the same query against the backend
+store.
+
+Because of the replication factor, it **is** possible for the querier to receive duplicate data.
+To take care of this, it internally deduplicates data with the same nanosecond timestamp, label set, and log message. + +### Index gateway + +Used only by _shipper stores_. + +Handles and serves metadata queries.
+Those are queries that look up data from the index.
+
+Query frontends will query the index gateway to know the log volume of queries, so as to decide how to shard them.
+The queriers will query the index gateway to know the chunk references for a given query, so as to know which chunks
+to fetch.
+
+The index gateway can run in _simple_ or _ring_ mode:
+
+- In _simple_ mode, each index gateway instance serves all indexes from all tenants.
+- In _ring_ mode, index gateways use a consistent hash ring to distribute and shard the indexes per tenant amongst
+  available instances.
+
+### Compactor
+
+Used by _shipper stores_ to compact multiple index files, produced by the ingesters and shipped to object storage, into
+single index files per day and tenant.
+
+It:
+
+- Downloads files from object storage at regular intervals.
+- Merges downloaded files into a single one.
+- Uploads the newly created index.
+- Cleans up the old files.
+
+Also manages log retention and log deletion.
+
+### Ruler
+
+Manages and evaluates rules and alert expressions provided in rule configurations.
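+
+Rule files follow the Prometheus rule format, with LogQL in the expressions. A minimal alerting rule sketch (names,
+labels, and threshold are illustrative):
+
+```yaml
+groups:
+  - name: example
+    rules:
+      - alert: HighLogErrorRate
+        expr: sum by (app) (rate({app="my-app"} |= "error" [5m])) > 10
+        for: 10m
+        labels:
+          severity: warning
+```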
+
+Rule configurations are stored in object storage or on the local file system.
+They can be managed through the ruler's API, or directly by uploading them to object storage.
+
+Rulers _can_ delegate rule evaluation to the query frontends to gain the advantages of query splitting, query
+sharding, and caching that query frontends offer.
+
+Multiple rulers will use a consistent hash ring to distribute rule groups amongst available ruler instances.
+
+## Clients
+
+Refer to [Send log data to Loki].
+
 ## Further readings
 
 - [Website]
 - [Codebase]
 - [Grafana]
 - [Promtail]
+- [Send log data to Loki]
 
 ### Sources
 
@@ -71,6 +248,7 @@ curl 'http://loki.fqdn:3100/metrics'
 
 [codebase]: https://github.com/grafana/loki
 [documentation]: https://grafana.com/docs/loki/latest/
 [http api reference]: https://grafana.com/docs/loki/latest/reference/loki-http-api/
+[send log data to loki]: https://grafana.com/docs/loki/latest/send-data/
 [website]: https://grafana.com/oss/loki/
 
diff --git a/snippets/useradd.sh b/snippets/useradd.sh
new file mode 100644
index 0000000..d4fb1cc
--- /dev/null
+++ b/snippets/useradd.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env sh
+
+# Add the 'loki' system user. The two forms below are equivalent.
+useradd --system --comment '-' --home-dir '/' --user-group --shell '/sbin/nologin' 'loki'
+useradd -r -c '-' -d '/' -U -s '/sbin/nologin' 'loki'