Today I am going to talk about the EFK stack, what it is, and how you can configure it for maximum visibility in your logs.
The three components of the EFK stack are Elasticsearch, Fluentd (or Fluentbit), and Kibana.
In this blog post I will focus on Fluentbit, as it is the lightweight version of Fluentd and, in my opinion, is better suited to Kubernetes. I will also cover how I reached the final solution, the hurdles I had to overcome along the way, and how to handle application logs without installing any third-party clients such as https://github.com/fluent/fluent-logger-java.
So, first off, Fluentbit is the agent which tails your logs, parses them, and sends them to Elasticsearch. In turn, Elasticsearch stores them in different (or the same) indexes, and provides an easy and powerful way to query your data. To make visualization easier, we have Kibana, which is a UI over Elasticsearch and helps you query your data either by using the Lucene query syntax, or by clicking on certain values for certain labels (e.g. you want all logs that have the log_level set to ERROR).
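To make the Lucene part a bit more concrete, here is a minimal sketch of the search body you could send to Elasticsearch for the log_level example above. It only builds the JSON and does not contact any cluster; the helper name build_level_query is my own, and the log_level field will only exist if your parser actually extracts it.

```python
import json


def build_level_query(level, field="log_level"):
    """Build an Elasticsearch search body that filters on one field using Lucene syntax."""
    # Quote the value so Lucene treats it as a single term or phrase.
    lucene = f'{field}:"{level}"'
    # query_string is the Elasticsearch query type that accepts Lucene syntax.
    return {"query": {"query_string": {"query": lucene}}}


body = build_level_query("ERROR")
print(json.dumps(body, indent=2))
```

The same expression, log_level:"ERROR", is what you would type into the Kibana search bar when it is set to Lucene syntax.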
The simplest way to start off with creating this stack is to deploy all components using Helm charts. If you do not know what Helm or Helm charts are, I will try to briefly explain the two. Helm is a package manager for Kubernetes, sort of how APT is a package manager for Debian, and Helm charts are packages which describe different software components (databases, caches, etc.). A Helm chart is composed of two main parts: the templates, which describe the Kubernetes resources to create, and the values file, which holds the settings that are injected into those templates.
This abstraction helps us get rid of duplicate code, and store only what changes from environment to environment in the values file. When you download a chart, you only need to change the values file to suit your needs (e.g. maybe you want to change the size of the PVC that will be created, or you don't want a PVC, or you want to change the resource limits on your containers; the sky is the limit).
For the deployment of Elasticsearch and Kibana, I am not going to use Helm, as I want to deploy a single-node Elasticsearch, and the Kubernetes cluster I will be using is the one provided by Docker Desktop. Fluentbit will be installed using its Helm chart.
As you can see, I have deployed my components:
And to generate some logs, I have used the following command:
kubectl run --image=cloudhero/fakelogs fakelogs
cloudhero/fakelogs is an image with a Go process in it that just outputs the same Java log every 5 seconds.
The first thing everybody does is deploy the Fluentbit daemonset and send all the logs to the same index. The results are shown below:
As you can see, our application log went into the same index as all the other logs, and is parsed with the default docker parser. This presents the following problems:
- All logs go into the same index. This makes our search more complicated and slower, as Elasticsearch has to search through all the logs. You also have no control over log cleanup: maybe you want your application logs to be kept for 30 days, but all other logs to be cleaned up after 7 days.
- All logs are parsed with the same parser, and as we all know, not all applications have the same log format, so you will end up with some logs being parsed correctly, and some logs not being parsed at all.
- You could specify multiple INPUT and OUTPUT plugins in your Fluentbit configuration file, but that would lead to duplicate logs, which can end up being very costly in terms of disk space.
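To illustrate the parser problem from the second point, here is a small sketch of what the docker parser alone gives you versus what an application-specific parser adds. The log line and the regular expression are hypothetical (I am not reproducing the exact output of cloudhero/fakelogs), and the regex is written in Python syntax; Fluentbit regex parsers spell named groups as (?<name>...) instead of (?P<name>...).

```python
import json
import re

# Hypothetical line, shaped the way Docker's json-file log driver stores it on the node.
RAW = json.dumps({
    "log": "2020-03-01 10:15:30,123 ERROR com.example.App - Payment failed\n",
    "stream": "stdout",
    "time": "2020-03-01T10:15:30.123456789Z",
})

# Hypothetical application parser for a typical Java log layout.
APP_LOG = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<log_level>[A-Z]+) "
    r"(?P<logger>\S+) - "
    r"(?P<message>.*)$"
)


def parse_docker(raw):
    """What a JSON 'docker' parser yields: the whole application line stays one string."""
    return json.loads(raw)


def parse_app(record):
    """Apply the application regex to the 'log' field and merge the extracted fields."""
    match = APP_LOG.match(record["log"].rstrip("\n"))
    if match is None:
        # Lines in another format are left untouched, i.e. effectively unparsed.
        return record
    return {**record, **match.groupdict()}


docker_only = parse_docker(RAW)
parsed = parse_app(docker_only)
print(sorted(docker_only))
print(parsed["log_level"], parsed["message"])
```

With only the docker parser there is no log_level field to filter on in Kibana; it appears only after the second, application-specific step.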
The next thing you can do is deploy your applications with Fluentbit and logrotate sidecars, and direct the stdout of your application to a file on a shared emptyDir volume. Below is an example of how you can do this with the cloudhero/fakelogs image:
With this setup, we have fixed the problems with the prior setup. Results are shown below:
As you can see, our logs are parsed and have their own index, so we don't interfere with the kubernetes_cluster-* index, where the rest of the logs are.
As good as this method seems to be, we are still facing some problems:
- kubectl logs will not output anything anymore.
- You need a logrotate sidecar to take care of your log file, since the output is redirected from stdout to a file. And there are some cases where logrotate will not even work. Take Nginx, for example: when rotating the log file, you need to send a signal to the Nginx PID, which is not possible when running logrotate in another container.
- You use up more resources by having a minimum of 3 containers per pod.
For the first problem, I came up with the solution of an Nginx sidecar which serves the application logs on a web page, but this adds another container to our pod, taking the number to 4.
For the second problem, you could create an image which runs both Nginx and logrotate, but that would force you to add more things to your container, like supervisord to handle process failure. I have also tried writing to a named pipe instead of a file, but the Fluentbit tail plugin does not work on pipes (the head plugin seems to work, but I have concluded that it is not very reliable). Moreover, I tried piping the logs to netcat and sending them over the network, but the forward plugin does not work this way, and the TCP plugin expects JSON input and does not support parsing.
The third problem is solvable only if you create an image with all 3 processes, which is not advisable.
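Recently, Fluentbit has added support for Kubernetes annotations. Currently, there are two supported annotations (https://docs.fluentbit.io/manual/filter/kubernetes#kubernetes-annotations): fluentbit.io/parser, which suggests a pre-defined parser for the pod's logs, and fluentbit.io/exclude, which asks Fluentbit to skip the pod's logs.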
The first one is cool, but it still does not enable you to send logs to different indexes in Elasticsearch. The second one is the interesting one. This lets you exclude your application logs from the main tailing process (which tails /var/log/containers/*), and then create separate INPUT and OUTPUT stages in your Fluentbit configuration file, for each application.
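As a rough sketch of where the exclude annotation goes, the snippet below builds a Deployment manifest for the fakelogs image as JSON (which kubectl accepts just like YAML). The deployment name, labels and helper function are my own choices for illustration. Note that the annotation sits on the pod template, not on the Deployment itself, and that the kubernetes filter only honours it when K8S-Logging.Exclude is set to On.

```python
import json


def fakelogs_deployment(exclude=True):
    """Return a Deployment manifest whose pods ask Fluentbit to skip their logs."""
    # The annotation must be on the pod template metadata to end up on the pods.
    pod_annotations = {"fluentbit.io/exclude": "true"} if exclude else {}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "fakelogs"},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": "fakelogs"}},
            "template": {
                "metadata": {
                    "labels": {"app": "fakelogs"},
                    "annotations": pod_annotations,
                },
                "spec": {
                    "containers": [
                        {"name": "fakelogs", "image": "cloudhero/fakelogs"}
                    ]
                },
            },
        },
    }


manifest = fakelogs_deployment()
print(json.dumps(manifest, indent=2))
```

If you save this as a script, you can pipe its output straight into kubectl apply -f - to create the deployment.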
First off, you can create a new filter which takes care of your application logs:
You then write another INPUT/OUTPUT pair in the config:
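The exact sections depend on your setup, so take the following as a sketch under my own assumptions rather than the definitive configuration. It is a small Python helper that prints one possible INPUT, FILTER and OUTPUT set for a single application: a tail input restricted to the application's log files, a parser filter applied to the log field, and an es output with its own index prefix. The tag scheme, the parser name fakelogs_java, the DB path and the Elasticsearch host are all hypothetical, and the parser itself would still have to be defined in your parsers file.

```python
def render_app_pipeline(app, namespace, parser, index_prefix,
                        es_host="elasticsearch", es_port=9200):
    """Render Fluentbit config sections for one application as plain text."""
    tag = f"app.{app}"  # hypothetical tag scheme, one tag per application
    path = f"/var/log/containers/{app}*_{namespace}_*.log"
    return "\n".join([
        "[INPUT]",
        "    Name    tail",
        f"    Tag     {tag}",
        f"    Path    {path}",
        "    Parser  docker",
        f"    DB      /var/log/flb_{app}.db",
        "",
        "[FILTER]",
        "    Name          parser",
        f"    Match         {tag}",
        "    Key_Name      log",
        f"    Parser        {parser}",
        "    Reserve_Data  On",
        "",
        "[OUTPUT]",
        "    Name             es",
        f"    Match            {tag}",
        f"    Host             {es_host}",
        f"    Port             {es_port}",
        "    Logstash_Format  On",
        f"    Logstash_Prefix  {index_prefix}",
        "",
    ])


config = render_app_pipeline("fakelogs", "default", "fakelogs_java", "fakelogs")
print(config)
```

Generating the sections from a function like this is just one way to keep a section set per application without copy-pasting; writing them by hand in the config file works equally well.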
This works because the log names in the /var/log/containers folder are in the <deployment_name>*_<namespace>_<container>-*.log format.
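You can convince yourself of this with a quick check of the wildcard pattern against a few file names. The names below are made up (real ones carry a ReplicaSet hash, a pod suffix and a full 64-character container ID), and Python's fnmatch is only an approximation of the glob matching done by the tail plugin, but the idea is the same.

```python
from fnmatch import fnmatchcase

# Hypothetical file names in /var/log/containers (hashes and container IDs are made up).
FILES = [
    "fakelogs-6d4cf56db6-x7k2p_default_fakelogs-0a1b2c3d4e5f.log",
    "kube-proxy-abcde_kube-system_kube-proxy-112233445566.log",
    "fakelogs-6d4cf56db6-x7k2p_staging_fakelogs-aabbccddeeff.log",
]


def app_log_files(files, deployment, namespace):
    """Select the log files of one deployment in one namespace using a wildcard pattern."""
    pattern = f"{deployment}*_{namespace}_*.log"
    return [name for name in files if fnmatchcase(name, pattern)]


selected = app_log_files(FILES, "fakelogs", "default")
print(selected)
```

Only the first file is selected: the second belongs to another workload, and the third lives in a different namespace.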
This method gives you a centralised configuration file for all your applications, removes the need for additional Fluentbit processes, gives you maximum flexibility with your parsers and Elasticsearch indexes, and still lets you have your logs when running the kubectl logs command!
I hope you found this blog post useful and that it will help you find problems in your applications much faster!