diff --git a/knowledge base/loki.md b/knowledge base/loki.md index 2adacc6..8b02eed 100644 --- a/knowledge base/loki.md +++ b/knowledge base/loki.md @@ -30,17 +30,21 @@ very cost-effective and easy to operate. ## TL;DR It indexes **a set of labels** for each log stream instead of the full logs' contents.
-The log data itself is then compressed and stored in chunks in object storage solutions, or locally on the host's -filesystem. +The log data itself is compressed after indexing, and stored in _chunks_ on the local filesystem or in configured +object storage solutions. -Can be executed in _single binary_ mode, with all its components running simultaneously in one process, or in -_simple scalable deployment_ mode, which groups components into read, write, and backend parts. +Can be executed in one of the following modes: -Files can be _index_es or _chunk_s.
-Indexes are tables of contents in TSDB format of where to find logs for specific sets of labels.
-Chunks are containers for log entries for specific sets of labels. +- _single binary_ mode, where all its components run simultaneously in a single process. +- _simple scalable deployment_ mode, which groups components into _read_, _write_, and _backend_ parts. +- _microservices_ mode, which runs each component as its own, separate process. -Needs agents or other clients to collect and push logs to the Loki server. +Files in Loki can be either _index_es or _chunk_s.
+Indexes are tables of contents in TSDB format, recording where to find the logs for specific sets of labels.
+Chunks are containers for log entries that are assigned specific sets of labels. + +Loki does **not** collect logs itself.
+It needs _agents_, or other log producers, to collect and push logs to its _distributors_.
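As an illustration of that flow, a single log line can also be pushed by hand through Loki's push API. This is only a hand-rolled sketch: the `loki.example.org` host, the `job=demo` label, and the message are placeholders.

```sh
# Labels go in "stream"; entries go in "values" as [<unix timestamp in ns>, <log line>] pairs.
TIMESTAMP="$(date +%s)000000000"   # the push API expects nanosecond precision
PAYLOAD='{"streams":[{"stream":{"job":"demo"},"values":[["'"${TIMESTAMP}"'","hello from the shell"]]}]}'
echo "${PAYLOAD}"

# Uncomment to actually push (assumes a Loki instance is reachable at loki.example.org):
#curl -H 'Content-Type: application/json' -X POST \
#  --data-raw "${PAYLOAD}" 'http://loki.example.org:3100/loki/api/v1/push'
```

Agents like Promtail, Alloy, or Fluent Bit build and batch such payloads automatically; the manual form is mostly useful for smoke tests.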
Setup @@ -61,13 +65,13 @@ helm --namespace 'loki' upgrade --create-namespace --install --cleanup-on-fail ' --repo 'https://grafana.github.io/helm-charts' 'loki-distributed' ``` -On startup, Loki tries to load a configuration file named `config.yaml` from the current working directory, or from the -`config/` subdirectory if the first does not exist.
-Should none of those files exist, it **will** give up and fail. +On startup, unless explicitly configured otherwise, Loki tries to load a configuration file named `config.yaml` from the +current working directory, or from the `./config/` subdirectory if the first does not exist.
+Should **none** of those files exist, it **will** give up and fail. The default configuration file **for package-based installations** is located at `/etc/loki/config.yml` or `/etc/loki/loki.yaml`.
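For reference, a minimal single-binary configuration might look like the sketch below. It closely follows the `local-config.yaml` shipped in the docker image; treat the paths, ports, and the `from` date as placeholders to adjust.

```yaml
auth_enabled: false

server:
  http_listen_port: 3100

common:
  instance_addr: 127.0.0.1
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: '2024-01-01'
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h
```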
-The docker image tries using `/etc/loki/local-config.yml` by default (via the `COMMAND` setting). +The docker image tries using `/etc/loki/local-config.yml` by default (as per the image's `CMD` instruction). Some settings are currently **not** reachable via direct CLI flags (e.g. `schema_configs`, `storage_config.aws.*`).
Use a configuration file for those. @@ -117,16 +121,16 @@ loki -log.level='info' -log-config-reverse-order … loki -print-config-stderr … # Check the server is working -curl 'http://loki.fqdn:3100/ready' -curl 'http://loki.fqdn:3100/metrics' -curl 'http://loki.fqdn:3100/services' +curl 'http://loki.example.org:3100/ready' +curl 'http://loki.example.org:3100/metrics' +curl 'http://loki.example.org:3100/services' # Check components in Loki clusters are up and running. # Such components must run by themselves for this. -# The read component returns ready when browsing to . -curl 'http://loki.fqdn:3101/ready' -# The write component returns ready when browsing to . -curl 'http://loki.fqdn:3102/ready' +# The 'read' component returns ready when browsing to . +curl 'http://loki.example.org:3101/ready' +# The 'write' component returns ready when browsing to . +curl 'http://loki.example.org:3102/ready' ``` ```plaintext @@ -164,16 +168,17 @@ docker run --rm --name 'validate-cli-config' 'grafana/loki:3.3.2' \ Handles incoming push requests from clients. -Once it receives a set of streams in an HTTP request, validates each stream for correctness and to ensure the stream is -within the configured tenant (or global) limits.
-Each **valid** stream is sent to `n` ingesters in parallel, with `n` being the replication factor for the data. +Once it receives a set of streams in an HTTP request, it validates each stream for correctness and to ensure the stream +is within the configured tenant (or global) limits. -The distributor determines the ingesters to which send a stream to by using consistent hashing. +Each **valid** stream is forwarded to `n` ingesters in parallel to ensure its data is recorded.
+`n` is the replication factor for the data.
+The distributor determines which ingesters to send a stream to by using consistent hashing. A load balancer **must** sit in front of the distributors to properly balance incoming traffic across them.
In Kubernetes, this is provided by the internal service load balancer. -The distributor is stateless and can be properly scaled. +The distributor is state**less** and can be scaled horizontally as needed. ### Ingester @@ -302,7 +307,7 @@ Refer [Send log data to Loki]. ### Logstash Loki provides the `logstash-output-loki` Logstash output plugin to enable shipping logs to a Loki or Grafana Cloud -instance.
+instance, though Grafana's folks suggest **not** using it.
Refer [Logstash plugin]. ```sh @@ -317,13 +322,43 @@ output { } ``` +Should one end up sending too many high cardinality labels to Loki, one can leverage the `include_fields` option to +limit the fields that would be mapped to labels and sent to the destination.
+When this list is configured, **only** these fields will be sent and all the other fields will be ignored. Use the +`metadata_fields` option to map those to structured metadata and send them to Loki too. + +
+ +```rb +output { + loki { + url => "http://loki.example.org:3100/loki/api/v1/push" + include_fields => [ + "agentId", + "cancelledAt", + "completedAt", + "event", + "statusCode", + "subtaskId", + "totalCount", + "taskId", + "uri" + ] + metadata_fields => [ "durationInMilliseconds" ] + } +} +``` + +
+ ### OpenTelemetry See also [OpenTelemetry / OTLP]. ## Labels -Refer [Understand labels], [Cardinality] and [What is structured metadata]. +Refer [Understand labels], [Cardinality] and [What is structured metadata].
+See also [The concise guide to Grafana Loki: Everything you need to know about labels]. The content of _each_ log line is **not** indexed. Instead, log entries are grouped into _streams_.
The streams are then indexed using _labels_. @@ -623,3 +658,4 @@ analytics: [loki-operator]: https://loki-operator.dev/ [opentelemetry / otlp]: https://loki-operator.dev/docs/open-telemetry.md/ [the quest for ha and dr in loki]: https://www.infracloud.io/blogs/high-availability-disaster-recovery-in-loki/ +[The concise guide to Grafana Loki: Everything you need to know about labels]: https://grafana.com/blog/2023/12/20/the-concise-guide-to-grafana-loki-everything-you-need-to-know-about-labels/
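The label-based indexing described above is also what queries select on: logs are fetched back by matching a stream's label set with a LogQL selector, not by searching the log content itself. A minimal sketch follows; the host, port, and `job=demo` label are placeholders.

```sh
# A LogQL selector matching every stream labelled job=demo.
SELECTOR='{job="demo"}'

# The range-query endpoint takes the selector in the 'query' parameter.
URL='http://loki.example.org:3100/loki/api/v1/query_range'
echo "GET ${URL}?query=${SELECTOR}"

# Uncomment to run against a reachable Loki instance:
#curl --get "${URL}" --data-urlencode "query=${SELECTOR}"
```

Keeping label sets small and low-cardinality keeps the number of streams (and thus the index) small; anything selected by labels is then filtered or parsed at query time.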