Introduction to Threat Detection on Kubernetes with Falco
Getting started with Kubernetes is simple, especially for first-day operations and when it is consumed as a managed service such as Azure Kubernetes Service (AKS). In the long term, however, you want to gain visibility into the cluster and monitor certain events. This applies to non-managed clusters, too.
For example, spawning a shell in a container is most likely not required and could indicate an attack. Falco is a threat detection engine for Kubernetes that can be used to gain visibility into such events. It is a project hosted by the Cloud Native Computing Foundation (CNCF) and was donated by Sysdig. In essence, Falco listens to syscalls in the Linux kernel, evaluates them against a rule set, and generates an alert whenever a rule matches. These alerts can be pushed to various output channels.
This blog post is the first in a two-part series, and it covers the basic concepts behind Falco. The second blog post sets out the first steps with Falco on AKS and describes how to configure the basic setup and use the standard rule set. It also provides sample Log Analytics queries that allow you to learn about the present environment and adapt Falco rules to it. You can find the second blog post here.
Why you need Falco
There are many events in a Kubernetes cluster that are not common or should not even happen. Depending on the environment, there may already be mechanisms in place that allow operators to control or monitor these events, for example through a proxy. Falco makes it possible to monitor such events directly inside the cluster. The events may include the following (complete list available here):
- Outgoing connections to specific IPs or domains
- Use or mutation of sensitive files such as /etc/passwd
- Execution of system binaries such as su
- Privilege escalation or changes to the namespace
- Modifications in certain folders such as /sbin
A standard Kubernetes cluster does not provide any mechanisms for monitoring such events; a tool like Falco is therefore required. Gaining insights into such events inside the cluster makes it possible to detect attacks and potential malicious behavior and to alert operations staff at an early stage.
As with any tool, there are limitations. For example, a supply chain attack based on a malicious image does not trigger any of these events. A demonstration of such an attack can be found in this CNCF Community talk.
From a high-level view, Falco consists of the following components:
- Event sources (drivers, Kubernetes audit events)
- A rule engine and a rule set
- An output system integration
Falco internals (cf. Falco Docs, distributed under CC BY 4.0).
Falco sits at the heart of a Linux system: the kernel. It uses so-called drivers to monitor syscalls made by applications, so it can monitor everything that results in a syscall. As containers share a kernel, it is possible to monitor the syscalls made by all the containers on a host. This is not possible with more isolated container runtimes that do not share the host kernel, for example Kata Containers. Falco supports three types of drivers:
- Kernel module (the default): A kernel module that must be compiled for the kernel that Falco will run on.
- eBPF probe: No need to load a kernel module, but requires a newer kernel that supports eBPF. Not supported on many managed services.
- Userspace instrumentation: Runs completely in userspace, but there is currently no supported implementation of this driver type.
For more details, you can read this blog post on the Falco home page, which goes into greater depth. In addition to the drivers, it is also possible to connect the Kubernetes audit events to Falco. This is implemented via audit policies and a webhook.
In summary: All drivers intercept syscalls (by different techniques) and provide them to the Falco rule engine. The rule engine is configured with a rule set. The rule set contains conditions to match previously defined events, for example spawning a shell. Each time a rule matches, Falco will generate an alert and push it to the configured output system. The Falco project provides a basic rule set with many useful rules. It is available on GitHub.
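The flow summarized above (an event arrives, rule conditions are evaluated, a match produces an alert) can be illustrated with a toy sketch in Python. This is not how Falco is implemented, and it does not use Falco's rule language. The `Rule` class, the event fields (`syscall`, `proc_name`, `container_id`), and the sample rule are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# An event is modelled as a flat dictionary of string fields (an assumption
# made for this sketch; real Falco events carry many more fields).
Event = Dict[str, str]


@dataclass
class Rule:
    """A named condition with a priority, loosely mirroring the idea of a rule."""
    name: str
    priority: str
    condition: Callable[[Event], bool]


SHELLS = {"bash", "sh", "zsh"}

# Hypothetical rule set with a single rule: a shell is executed in a container.
RULES = [
    Rule(
        name="Shell spawned in container",
        priority="WARNING",
        condition=lambda e: e.get("syscall") == "execve"
        and e.get("proc_name") in SHELLS
        and e.get("container_id", "host") != "host",
    ),
]


def evaluate(event: Event, rules: List[Rule]) -> List[str]:
    """Return one alert string for every rule whose condition matches the event."""
    return [
        f"{rule.priority}: {rule.name} (proc={event.get('proc_name')})"
        for rule in rules
        if rule.condition(event)
    ]


# Three made-up events: only the first one is a shell inside a container.
events = [
    {"syscall": "execve", "proc_name": "bash", "container_id": "3f2a9c"},
    {"syscall": "openat", "proc_name": "nginx", "container_id": "3f2a9c"},
    {"syscall": "execve", "proc_name": "bash", "container_id": "host"},
]

alerts = [alert for event in events for alert in evaluate(event, RULES)]
for alert in alerts:
    print(alert)
```

In this sketch the "output system" is simply `print`; in a real setup, the alert would be handed to whichever output channel is configured.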
Although Falco is a cloud-native threat detection engine, it is directly integrated with the Linux kernel and can therefore be used for non-container hosts as well.
Processing Falco logs with a logging system
Falco supports a variety of output channels for generated alerts, including stdout, gRPC, syslog, a file, and more. A complete list of supported interfaces/systems is available in the documentation. This flexibility allows for many different integration patterns, for example into existing log systems. A few possible patterns are:
- A sidecar container parses Falco’s file-based logs inside a shared directory and forwards them to the log system.
- The log agent on the cluster node forwards Falco’s stdout logs directly to the log system (spoiler: the second blog post uses this pattern).
- Each host is already equipped with a log system agent that reads and forwards the syslog, so Falco can write its alerts directly to syslog.
Of course, every type of integration is different and depends on existing setups. It is also possible to configure Falco to emit alerts in JSON format. This is quite useful and enables the log system to simply query the logs without the need to parse each entry.
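To illustrate why JSON output is convenient, here is a minimal Python sketch that reads alert lines, skips anything that is not JSON, and keeps only alerts at or above a given priority. It assumes that each alert is a JSON object on its own line with `rule`, `priority`, and `output` keys; check the key names against the output of your Falco version. The sample lines are made up for this example and are not real Falco output.

```python
import json
from typing import Any, Dict, Iterable, List

# Priority names ordered from most to least severe (lower index = more severe).
PRIORITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]
RANK = {name: index for index, name in enumerate(PRIORITIES)}


def parse_alerts(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse JSON alert lines and silently skip lines that are not JSON objects."""
    alerts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. plain-text startup messages mixed into the stream
        if isinstance(record, dict) and "rule" in record:
            alerts.append(record)
    return alerts


def at_least(alerts: List[Dict[str, Any]], min_priority: str) -> List[Dict[str, Any]]:
    """Keep alerts whose priority is at least as severe as min_priority."""
    threshold = RANK[min_priority.lower()]
    return [
        alert for alert in alerts
        # Unknown priorities rank below every known one and are filtered out.
        if RANK.get(str(alert.get("priority", "")).lower(), len(RANK)) <= threshold
    ]


# Made-up sample input: two JSON alerts and one plain-text line.
SAMPLE_LINES = [
    '{"time": "2021-01-01T10:00:00Z", "rule": "Terminal shell in container", '
    '"priority": "Notice", "output": "A shell was spawned in a container"}',
    "Falco initialized",
    '{"time": "2021-01-01T10:00:05Z", "rule": "Write below binary dir", '
    '"priority": "Error", "output": "File below a binary directory opened for writing"}',
]

parsed = parse_alerts(SAMPLE_LINES)
severe = at_least(parsed, "Warning")
for alert in severe:
    print(alert["priority"], "-", alert["rule"])
```

A log system that understands JSON performs this kind of field-based filtering natively; the sketch only shows that no custom text parsing is needed once the alerts are structured.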
The Falco project supports different ways of installing Falco. The recommended way on Kubernetes is to install Falco directly on the host, for example via an apt repository, and to run it as a systemd service. This provides a clean separation between the privileged part of Falco and a read-only agent that runs as a container. This is not always possible, for instance where custom host images are not supported, as is the case on some managed services. AKS is one such example. For such cases, there are third-party installation procedures, for example via Helm. This is described in the documentation.
Be aware that different runtime privileges may be required by Falco depending on the selected driver and installation type. The Helm chart, for example, runs the Falco container with privileged: true, as it is needed to install the kernel module on the host. Depending on the security requirements, this may not be tolerable – and it is not recommended either.
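If you want to check which containers in a pod specification request privileged mode, a small Python sketch like the following could help. It assumes the pod spec has already been loaded into a dictionary (for example from YAML) and looks at the standard `securityContext.privileged` field of each container. The example spec and its container names are hypothetical and are not taken from the Falco Helm chart.

```python
from typing import Any, Dict, List


def privileged_containers(pod_spec: Dict[str, Any]) -> List[str]:
    """Return the names of all containers that set securityContext.privileged to true."""
    names = []
    for key in ("initContainers", "containers"):
        for container in pod_spec.get(key) or []:
            security_context = container.get("securityContext") or {}
            if security_context.get("privileged") is True:
                names.append(container.get("name", "<unnamed>"))
    return names


# Hypothetical pod spec with one privileged and one unprivileged container.
example_pod_spec = {
    "containers": [
        {"name": "falco", "securityContext": {"privileged": True}},
        {"name": "log-forwarder"},
    ]
}

flagged = privileged_containers(example_pod_spec)
print(flagged)
```

For a DaemonSet or Deployment, the pod spec is nested under `spec.template.spec`, so that sub-dictionary would be passed to the function.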
Falco is a great tool for monitoring a Kubernetes cluster and provides deep insights into the cluster. It can be a valuable enhancement for your environment. As with any tool, some initial effort is required to set up and configure Falco properly. In particular, adapting it to the target environment can be quite time-consuming, and so too is getting to grips with the rule specification language. Feel free to move on to the second blog post to get started with Falco on AKS and learn how to set it up and figure out how your environment is behaving.