Cristian Marius Tiutiu, Bikram Gupta, and Easha Abid
In Kubernetes orchestration, the Loki stack has proven itself to be a powerful, lightweight logging solution. It seamlessly manages the flow of logs in a Kubernetes cluster while providing scalability and high availability.
In this tutorial, you will learn about Loki, a log aggregation system inspired by Prometheus.
To complete this tutorial, you will need:
- A DigitalOcean Kubernetes (DOKS) cluster, with the Prometheus monitoring stack installed (see the Set up Prometheus Stack step of the Starter Kit).
- Helm and kubectl installed and configured on your local machine.
- A DigitalOcean Spaces bucket for Loki storage, together with a set of access keys. See the DigitalOcean Spaces tutorial to learn how to manage access keys, and keep the access and secret keys somewhere safe for later use.

In this step, you will learn how to deploy Loki to your DOKS cluster, using Helm.
First, clone the Starter Kit repository, and then change the directory to your local copy:
git clone https://github.com/digitalocean/Kubernetes-Starter-Kit-Developers.git
cd Kubernetes-Starter-Kit-Developers
Next, add the Grafana Helm repository and list the available charts:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update grafana
helm search repo grafana
The output looks similar to the following:
NAME CHART VERSION APP VERSION DESCRIPTION
grafana/grafana 6.20.5 8.3.4 The leading tool for querying and visualizing t...
grafana/enterprise-metrics 1.7.2 v1.6.1 Grafana Enterprise Metrics
grafana/fluent-bit 2.3.0 v2.1.0 Uses fluent-bit Loki go plugin for gathering lo...
grafana/loki-stack 2.6.4 v2.4.2 Loki: like Prometheus, but for logs.
...
You are interested in the grafana/loki-stack chart, which will install standalone Loki on the cluster. Please visit the loki-stack page for more details about this chart. In this tutorial, chart version 2.6.4 is picked for loki-stack, which maps to application version 2.4.2.
For your convenience, there's a ready-to-use sample values file provided in the Starter Kit Git repository (loki-stack-values-v2.6.4.yaml). Please use your favorite text editor (preferably with YAML lint support) for inspection:
code 04-setup-observability/assets/manifests/loki-stack-values-v2.6.4.yaml
The above values file enables Loki and Promtail for you so no other input is required. Prometheus and Grafana installation is disabled because the Set up Prometheus Stack step took care of it already. Fluent Bit is not needed so it is disabled by default as well.
Next, install the stack using helm. The following command installs version 2.6.4 of grafana/loki-stack in your cluster using the Starter Kit repository values file. This also creates the loki-stack namespace if it doesn't already exist.
HELM_CHART_VERSION="2.6.4"
helm install loki grafana/loki-stack --version "${HELM_CHART_VERSION}" \
--namespace=loki-stack \
--create-namespace \
-f "04-setup-observability/assets/manifests/loki-stack-values-v${HELM_CHART_VERSION}.yaml"
Finally, check Helm release status:
helm ls -n loki-stack
The output looks similar to the following (the STATUS column should display deployed):
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
loki loki-stack 1 2022-06-08 10:00:06.838665 +0300 EEST deployed loki-stack-2.6.4 v2.4.2
Next, inspect all the Kubernetes resources created for Loki:
kubectl get all -n loki-stack
You should have resources deployed for Loki itself (loki-0) and Promtail (loki-promtail). The output looks similar to:
NAME READY STATUS RESTARTS AGE
pod/loki-0 1/1 Running 0 44m
pod/loki-promtail-8dskn 1/1 Running 0 44m
pod/loki-promtail-mgb25 1/1 Running 0 44m
pod/loki-promtail-s7cp6 1/1 Running 0 44m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/loki ClusterIP 10.245.195.248 <none> 3100/TCP 44m
service/loki-headless ClusterIP None <none> 3100/TCP 44m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/loki-promtail 3 3 3 3 3 <none> 44m
NAME READY AGE
statefulset.apps/loki 1/1 44m
In the next step, you will configure Grafana to use the Loki data source and view application logs.
In this step, you will add the Loki data source to Grafana. First, you need to expose the Grafana web interface on your local machine (default credentials: admin/prom-operator):
kubectl --namespace monitoring port-forward svc/kube-prom-stack-grafana 3000:80
You should NOT expose Grafana to a public network (for example, by creating an ingress mapping or a Load Balancer service) while using the default login/password.
Next, open a web browser on localhost:3000 and add the Loki data source: from the left panel, navigate to Configuration -> Data Sources, click Add data source, and select Loki. Set the URL field to http://loki.loki-stack:3100, then click Save & test. If everything goes well, a green label message will appear, saying Data source connected and labels found.
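Alternatively, if you prefer to provision the data source declaratively instead of through the UI, the kube-prometheus-stack chart used in the Prometheus tutorial accepts extra Grafana data sources via its values file. The snippet below is a minimal sketch (the additionalDataSources key is taken from that chart's Grafana values; adjust it to your setup) pointing at the in-cluster Loki service:

grafana:
  additionalDataSources:
    # Provision Loki as a Grafana data source, reachable via the in-cluster service DNS name
    - name: Loki
      type: loki
      access: proxy
      url: http://loki.loki-stack:3100

Re-apply the kube-prometheus-stack values with helm upgrade for the change to take effect.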
Now, you can access logs from the Explore tab of Grafana. Make sure to select Loki as the data source. Use the Help button for the log search cheat sheet.
In the next step, you’ll be introduced to LogQL, which is similar to PromQL
but for logs. Some basic features of LogQL will be presented as well.
In this step, you will learn how to use LogQL for querying application logs, and make use of the available features to ease your work.
Loki comes with its very own language for querying logs called LogQL. LogQL can be considered a distributed grep with labels for filtering.
A basic LogQL query consists of two parts: the log stream selector and a filter expression. Due to Loki’s design, all LogQL queries are required to contain a log stream selector.
The log stream selector reduces the number of log streams to a manageable volume. The number of labels you use to filter down the log streams affects the relative performance of the query's execution. The filter expression is then used to do a distributed grep over the retrieved log streams.
First, you need to expose the Grafana web console on your local machine (default credentials: admin/prom-operator
):
kubectl --namespace monitoring port-forward svc/kube-prom-stack-grafana 3000:80
Next, point your web browser to localhost:3000, and navigate to the Explore tab from the left panel. Select Loki from the data source menu, and run this query:
{container="vote-bot", namespace="emojivoto"}
Grafana displays the matching log lines from the vote-bot container in the query results panel.
Perform another query, but this time filter the results to include only the “Error” message:
{container="web-svc",namespace="emojivoto"} |= "Error"
Notice how the word "Error" is highlighted in the query results panel.
As you can see in the above examples, each query is composed of:
- A log stream selector, such as {container="web-svc", namespace="emojivoto"}, which targets the web-svc container from the emojivoto namespace.
- A filter expression, such as |= "Error", which shows only the lines containing the word Error.

More complex queries can be created using aggregation operators, as shown in the example below. For more details on the topic and other advanced features, please visit the official LogQL page.
Another feature of Loki worth mentioning is Labels. Labels allow you to organize streams. In other words, labels add metadata to a log stream, so that the system can distinguish it later. Essentially, they are key-value pairs that can be anything you want, as long as they have a meaning for the data being tagged.
Loki indexes data based on labels, allowing more efficient storage. You can see the labels attached to each log stream in the Log labels panel of the Explore view.
In the next step, you will discover Promtail, which is the agent responsible for fetching and transforming the data (labeling, adding new fields, dropping, etc).
In this step, you will learn what Promtail is and how it works. Promtail is deployed as a DaemonSet and is an important part of your Loki stack installation, responsible for fetching the logs of all Pods running in your Kubernetes cluster.
What Promtail essentially does is:
- Discover targets that emit logs.
- Attach labels to the log streams.
- Push the log streams to the Loki instance.

Before Promtail can ship any data from log files to Loki, it needs to find out information about its environment. This means discovering applications emitting log lines to files that need to be monitored.
Promtail borrows the same service discovery mechanism from Prometheus, although it currently only supports Static and Kubernetes service discovery. This limitation is because Promtail is deployed as a daemon to every local machine and does not discover labels from other machines. Kubernetes service discovery fetches required labels from the Kubernetes API server while Static usually covers all other use cases.
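For illustration only (the Starter Kit setup relies on Kubernetes service discovery, so you don't need this), a static scrape configuration sketch for Promtail could look like the following, where the special __path__ label tells Promtail which local files to tail:

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          # The __path__ label tells Promtail which files to read on the node
          __path__: /var/log/*.log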
As with every monitoring agent, you need to have a way for it to be up all the time. The Loki stack Helm deployment already makes this possible via a DaemonSet
, as seen below:
kubectl get ds -n loki-stack
The output looks similar to the following (notice the loki-promtail line):
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-prom-stack-prometheus-node-exporter 2 2 2 2 2 <none> 7d4h
loki-promtail 2 2 2 2 2 <none> 5h6m
The scrape_configs
section from the Promtail main configuration will show you the details of how Promtail discovers Kubernetes pods and assigns labels to them. You can use kubectl
for inspection (notice that the application configuration is stored using a Kubernetes ConfigMap):
kubectl get cm loki-stack -n loki-stack -o yaml > loki-promtail-config.yaml
Next, open the loki-promtail-config.yaml
file using a text editor of your choice (preferably with YAML support).
code loki-promtail-config.yaml
Then, look for the scrape_configs
section. The output should be similar to:
...
scrape_configs:
  - job_name: kubernetes-pods-name
    pipeline_stages:
      - docker: {}
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - action: replace
        source_labels:
          - __meta_kubernetes_namespace
        target_label: namespace
...
Promtail knows how to scrape logs by using scrape_configs
. Each scrape configuration tells Promtail how to discover logs and extract labels. Next, each scrape configuration contains one or more entries (called jobs) which are executed for each discovered target. Then, each job may contain a pipeline comprised of multiple stages. The main purpose of each stage is to transform your application logs, but it can also drop (filter) unwanted log data if needed. Jobs can contain relabel_configs
stanzas as well that are used to transform labels.
Explanations for the above configuration:
- job_name: Defines a new job and its associated name.
- pipeline_stages: Can be used to add or update labels, correct the timestamp, or re-write log lines entirely. It uses stages to accomplish these tasks. The configuration snippet presented above uses a docker stage, which can extract data based on the standard Docker log format. You can refer to the official documentation to read more about Stages and Pipelines.
- kubernetes_sd_configs: Tells Promtail how to discover logs coming from Pods via Kubernetes service discovery.
- relabel_configs: Defines a list of operations to transform the labels from discovery into another form. The configuration snippet presented above renames the __meta_kubernetes_namespace source label provided by the Kubernetes service discovery mechanism to a more human-friendly form: namespace.

In most cases, you do not want to fetch logs from all namespaces (and, implicitly, all Pods). Dropping logs from unwanted namespaces avoids high traffic inside your Kubernetes cluster caused by Promtail, and reduces the amount of data ingested by Loki. It also reduces the volume of data Loki needs to index, meaning less storage used and fewer objects, if you're using a DO Spaces bucket for example.
Promtail allows you to filter logs on a namespace basis, via the drop stage. You can use Helm to configure Promtail for namespace filtering.
First, open the 04-setup-observability/assets/manifests/loki-stack-values-v2.6.4.yaml file provided in the Starter Kit repository, using a text editor of your choice (preferably with YAML lint support). Make sure to change to the directory where the Starter Kit repository was cloned first.
code 04-setup-observability/assets/manifests/loki-stack-values-v2.6.4.yaml
Next, please remove the comments surrounding the pipelineStages
section. In the following example, you will configure Promtail to drop all logs coming from all namespaces prefixed with kube-
like kube-node-lease
, kube-public
, kube-system
. The output looks similar to:
promtail:
  enabled: true
  #
  # Enable Promtail service monitoring
  # serviceMonitor:
  #   enabled: true
  #
  # User defined pipeline stages
  pipelineStages:
    - docker: {}
    - drop:
        source: namespace
        expression: "kube-.*"
Explanations for the above configuration:
- pipelineStages: Tells Helm to insert user-defined pipeline stages in each job that it creates. By default, the Loki stack Helm chart configures Promtail to fetch all the logs coming from every namespace and pod (you can inspect the ConfigMap template for details).
- docker: Tells Promtail to use a Docker stage, which helps with Docker log formatting.
- drop: Tells Promtail to use a Drop stage. The source field selects what to drop logs by (a namespace, in this case), and expression is the regex applied to the source field.

Finally, save the values file and apply the changes using helm upgrade:
HELM_CHART_VERSION="2.6.4"
helm upgrade loki grafana/loki-stack --version "${HELM_CHART_VERSION}" \
--namespace=loki-stack \
-f "04-setup-observability/assets/manifests/loki-stack-values-v${HELM_CHART_VERSION}.yaml"
If the upgrade succeeded and no errors were reported, you can use LogQL to check whether logs are still being pushed to Loki from the kube- prefixed namespaces. Wait a minute or so, and then run the following queries. Make sure to adjust the time window in Grafana as well to match the last-minute interval (you only need to fetch the most recent data).
Next, create a port forward for the Grafana web console on your local machine. Default credentials are admin/prom-operator.
kubectl --namespace monitoring port-forward svc/kube-prom-stack-grafana 3000:80
Point your web browser to localhost:3000, and navigate to the Explore tab from the left panel. Select Loki from the data source menu and run the following queries:
{namespace="kube-system"}
{namespace="kube-public"}
{namespace="kube-node-leases"}
The output window should not return any data for any of the above queries.
Information: You can also create ServiceMonitors and enable Promtail metrics collection, as learned in Configure Prometheus and Grafana from the Prometheus tutorial.
For more features and in-depth explanations, please visit the Promtail official documentation.
In the next step, you will learn how to set up persistent storage for Loki using DO Spaces.
In this step, you will learn how to enable persistent storage for Loki. You will use the DO Spaces bucket created in the Prerequisites section of the tutorial.
By default, Helm deploys Loki with ephemeral storage using an emptyDir volume. This means all your indexed log data will be lost if the Loki Pod restarts or if the DOKS cluster is recreated. To preserve indexed log data across Pod restarts, Loki can be set up to use DO Spaces instead.
DO Spaces scales very well and is cheaper than PVs, which rely on Block Storage. You also don't have to worry about running out of disk space, or about sizing PVs and doing the extra math.
First, change to the directory where the Starter Kit repository was cloned:
cd Kubernetes-Starter-Kit-Developers
Next, open the loki-stack-values-v2.6.4.yaml
file provided in the Starter Kit repository using a text editor of your choice (preferably with YAML lint support).
code 04-setup-observability/assets/manifests/loki-stack-values-v2.6.4.yaml
Remove the comments surrounding the schema_config
and storage_config
keys. The final Loki storage setup configuration looks similar to the following. (Replace the <>
placeholders accordingly).
loki:
  enabled: true
  config:
    schema_config:
      configs:
        - from: '2020-10-24'
          store: boltdb-shipper
          object_store: aws
          schema: v11
          index:
            prefix: index_
            period: 24h
    storage_config:
      boltdb_shipper:
        active_index_directory: /data/loki/boltdb-shipper-active
        cache_location: /data/loki/boltdb-shipper-cache
        cache_ttl: 24h
        shared_store: aws
      aws:
        bucketnames: <YOUR_DO_SPACES_BUCKET_NAME_HERE>
        endpoint: <YOUR_DO_SPACES_BUCKET_ENDPOINT_HERE>
        region: <YOUR_DO_SPACES_BUCKET_REGION_HERE>
        access_key_id: <YOUR_DO_SPACES_ACCESS_KEY_HERE>
        secret_access_key: <YOUR_DO_SPACES_SECRET_KEY_HERE>
        s3forcepathstyle: true
Explanation for the above configuration:
- schema_config: Defines a storage type and a schema version to facilitate migrations. Schemas can differ between Loki installations, so make sure the schema stays consistent throughout the configuration. In this case, boltdb-shipper is specified as the storage implementation, together with a v11 schema version. The 24h period for the index is the default and preferred value, so please don't change it. Visit Schema Configs for more details.
- storage_config: Tells Loki about storage configuration details, such as the BoltDB Shipper parameters. It also informs Loki about the aws-compatible S3 storage parameters (bucket name, credentials, region, etc.).
Apply the settings using helm:
HELM_CHART_VERSION="2.6.4"
helm upgrade loki grafana/loki-stack --version "${HELM_CHART_VERSION}" \
--namespace=loki-stack \
-f "04-setup-observability/assets/manifests/loki-stack-values-v${HELM_CHART_VERSION}.yaml"
Now, check if the main Loki application pod is up and running. It may take up to 1 minute or so to start.
kubectl get pods -n loki-stack -l app=loki
The output looks similar to:
NAME READY STATUS RESTARTS AGE
loki-0 1/1 Running 0 13m
The main application Pod
is called loki-0
. You can check the configuration file, using the following command (please note that it contains sensitive information):
kubectl exec -it loki-0 -n loki-stack -- /bin/cat /etc/loki/loki.yaml
You can also check the Loki logs while waiting. In general, it's good practice to check the application logs to see if anything went wrong:
kubectl logs -n loki-stack -l app=loki
If everything goes well, you should see the DO Spaces bucket containing the index and chunks (fake) folders.
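You can also verify this from the command line, for example with s3cmd, which is covered in more detail in the next step (replace the placeholder with your bucket name):

s3cmd ls s3://<LOKI_STORAGE_BUCKET_NAME>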
For more advanced options and fine-tuning the storage for Loki, please visit the Loki Storage official documentation.
Next, you will learn how to set storage retention policies for Loki.
In this step, you will learn how to set DO Spaces retention policies. Because you configured DO Spaces as the default storage backend for Loki, the same rules apply for every S3-compatible storage type.
S3 is very scalable, so you don’t have to worry about having disk space issues. But it’s still a good practice to have a retention policy in place. This way, really old data can be deleted if not needed.
S3-compatible storage has its own set of policies and rules for retention. In the S3 terminology, it is called object lifecycle. You can learn more about the DO Spaces bucket lifecycle options from the official documentation page.
s3cmd is a really useful utility for inspecting how many objects are present in the DO Spaces bucket used by Loki, as well as its total size. It also helps you verify whether the retention policies set so far are working. Please follow the DigitalOcean guide for installing and setting up s3cmd.
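After running s3cmd --configure, your ~/.s3cfg file should contain entries similar to the following sketch (the nyc3 region is a hypothetical example; use the region of your Spaces bucket and your own keys):

[default]
access_key = <YOUR_DO_SPACES_ACCESS_KEY_HERE>
secret_key = <YOUR_DO_SPACES_SECRET_KEY_HERE>
# Hypothetical region - replace nyc3 with the region of your Spaces bucket
host_base = nyc3.digitaloceanspaces.com
host_bucket = %(bucket)s.nyc3.digitaloceanspaces.com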
Setting the lifecycle for the Loki storage bucket is achieved via the s3cmd
utility. You are going to use the loki_do_spaces_lifecycle.xml
configuration file provided in the Starter Kit Git repository to configure retention for the Loki bucket. The policy file contents look similar to:
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Rule>
    <ID>Expire old fake data</ID>
    <Prefix>fake/</Prefix>
    <Status>Enabled</Status>
    <Expiration>
      <Days>10</Days>
    </Expiration>
  </Rule>
  <Rule>
    <ID>Expire old index data</ID>
    <Prefix>index/</Prefix>
    <Status>Enabled</Status>
    <Expiration>
      <Days>10</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>
The above lifecycle configuration will automatically remove all objects from the fake/
and index/
paths in the Loki storage after 10 days. A 10-day lifespan is chosen in this example because it’s usually enough for development purposes. For production or other critical systems, a period of >= 30 days is recommended.
First, change to the directory where the Starter Kit repository was cloned:
cd Kubernetes-Starter-Kit-Developers
Next, open and inspect the 04-setup-observability/assets/manifests/loki_do_spaces_lifecycle.xml
file from the Starter Kit repository, using a text editor of your choice (preferably with XML lint support), and adjust according to your needs.
Set the lifecycle policy by replacing the <>
placeholders accordingly.
s3cmd setlifecycle 04-setup-observability/assets/manifests/loki_do_spaces_lifecycle.xml s3://<LOKI_STORAGE_BUCKET_NAME>
Finally, check that the policy
was set:
s3cmd getlifecycle s3://<LOKI_STORAGE_BUCKET_NAME>
After finishing the above steps, you can inspect the bucket size and number of objects via the du
subcommand of s3cmd
(the name is borrowed from the Linux Disk Usage utility). Please replace the <>
placeholders accordingly:
s3cmd du -H s3://<LOKI_DO_SPACES_BUCKET_NAME>
The output looks similar to the following (notice that it prints the bucket size - 19M
, and number of objects present - 2799
):
19M 2799 objects s3://loki-storage-test/
From now on, the DO Spaces backend will clean up expired objects for you automatically, based on the expiration date. You can always go back and edit the policy later if needed by uploading a new one.
In this tutorial, you learned how to install Loki for log monitoring in your DOKS cluster. Then, you configured Grafana to use Loki as a data source. You also learned about LogQL for querying logs, and how to set up persistent storage and retention for Loki.
The next step is to export Kubernetes Events.