Rule-Driven Automation on Kubernetes with Autopilot Monitoring

by Carlo Gutierrez, February 28, 2019
Custom policies in the Autopilot project can automatically generate actions to prevent unwanted data access, such as from rogue containers.

Common Kubernetes security issues

When running applications on Kubernetes, some common security problems are easy to miss. One is running pods on a host volume: data is left behind on the host machine even after the pod is terminated, where it remains exposed. Another is detecting whether a rogue container on the host is accessing volumes attached to that host. These are just a few of the problems that can leave data vulnerable in Kubernetes.
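As a rough illustration of the first problem, the sketch below scans a pod manifest for hostPath volumes, which are the volumes that leave data on the node after the pod is gone. It relies only on the standard `spec.volumes[].hostPath.path` field of a pod spec; the function name and the sample manifest are hypothetical and were not part of the talk.

```python
from typing import Any, Dict, List, Tuple


def find_host_path_volumes(pod: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (volume name, host path) pairs for every hostPath volume in a pod manifest."""
    volumes = pod.get("spec", {}).get("volumes") or []
    found = []
    for volume in volumes:
        host_path = volume.get("hostPath")
        if host_path:
            found.append((volume.get("name", ""), host_path.get("path", "")))
    return found


# Hypothetical manifest, already parsed from YAML or JSON into a dict.
example_pod = {
    "metadata": {"name": "postgres-0"},
    "spec": {
        "volumes": [
            {"name": "data", "hostPath": {"path": "/var/lib/postgres"}},
            {"name": "config", "configMap": {"name": "postgres-config"}},
        ]
    },
}

print(find_host_path_volumes(example_pod))  # → [('data', '/var/lib/postgres')]
```

A check like this only flags manifests before they run; it does not clean up data that an earlier pod already left on the host.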

While there are some security precautions available in Kubernetes, such as enabling role-based access control (RBAC) and encrypting persistent volumes, there are still limitations. In RBAC’s case, it can’t manage access to components that are not under the control of Kubernetes. Additionally, there are a few more loopholes that are difficult to secure:

  • A pod may fail to terminate cleanly while still holding a reference to the volume, causing software failures.
  • Leftover host mounts remain exposed and can be accessed by other pods or malicious containers.
  • Rogue containers bound to the host can access all the attached and mounted persistent volumes.

At a Kubernetes meetup in Santa Clara, Gou Rao and Aditya Dani of Portworx discussed how some of these security loopholes can be automatically patched using an open-source solution called Autopilot.

Gou Rao, CTO and co-founder of Portworx

“When I’m accessing my e-mail at Gmail, I really don’t know what server it’s running on. I’m just going to assume that things are safe. The same thing applies here. When I’m running my database container in a 100-node Kubernetes cluster, I don’t know which machine it’s running on. I shouldn’t have to care whether that data is going to be accessed by somebody else or if it’s safe.” —Gou Rao, Portworx

 

What is Autopilot?

Autopilot is a rule-based analytics engine built on a monitor-and-react model. It takes its input from the metrics, logs, and tracers of stateful applications such as Postgres, Cassandra, and Redis. Based on this input, Autopilot performs actions when a defined set of conditions is met. Both the input rules and the resulting actions are expressed as well-defined Kubernetes CustomResourceDefinitions (CRDs).
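The monitor-and-react model can be pictured as a loop that checks each rule's condition against the latest metrics and fires the matching action. The following is a minimal sketch of that idea only, not Autopilot's implementation: the `Rule` class, the metric names, and the thresholds are all assumptions made up for illustration, and actions simply return a description instead of touching a cluster.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass
class Rule:
    name: str
    condition: Callable[[Metrics], bool]  # when should the rule fire?
    action: Callable[[Metrics], str]      # what to do (here: just describe it)


def evaluate(rules: List[Rule], metrics: Metrics) -> List[str]:
    """Run one monitor-and-react pass: fire the action of every rule whose condition holds."""
    return [rule.action(metrics) for rule in rules if rule.condition(metrics)]


# Hypothetical rules; metric names and thresholds are illustrative assumptions.
rules = [
    Rule(
        name="node-disk-nearly-full",
        condition=lambda m: m["node_disk_used_ratio"] > 0.8,
        action=lambda m: "relocate persistent volume",
    ),
    Rule(
        name="slow-io",
        condition=lambda m: m["io_latency_ms"] > 50,
        action=lambda m: "increase volume IOPS",
    ),
]

sample = {"node_disk_used_ratio": 0.92, "io_latency_ms": 12.0}
print(evaluate(rules, sample))  # → ['relocate persistent volume']
```

Keeping conditions and actions as separate, swappable pieces mirrors the idea that both sides of a rule are declared independently.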

Autopilot workflow (Image credit)

“It’s basically a policy-based engine that’s looking out for the administrator’s back. If an application’s running rogue, if there’s a security violation, or if there’s a performance issue, its job is to step in and either raise a red flag or take action on it.” —Gou Rao, Portworx

Some of the actions Autopilot can perform include:

  • Automatic persistent volume updates and relocation
  • Automatic scaling of a volume by increasing or decreasing input/output operations per second

According to the Portworx team:

  • Performance of an application and its containers at the required levels is ensured via monitoring.
  • High availability is achieved through redundancy.
  • Pod scaling and application-level rebalancing are supported out-of-the-box.

Aditya Dani, software developer at Portworx

“You define an application-level policy, which is given as input to the Autopilot inference engine. The other input are metrics, logs, and tracers. It also talks to Kubernetes, then it does correlations based on the input and the timelines. Based on the conditions that have been defined in the policy, it’s going to perform an action. The action can be specific to an application or it can be a generic action.” —Aditya Dani, Portworx

 

Detecting breadcrumbs with Autopilot

To illustrate how Autopilot works, the Portworx team walked through a common security problem in Kubernetes: detecting breadcrumbs (the data an application leaves on a node) and stopping rogue containers that try to access them.

Detecting breadcrumbs with Autopilot (Image credit)

In their example, the speakers monitored cAdvisor, which exposes resource usage and performance metrics for running containers.

An example of a metric from cAdvisor (Image credit)

A Postgres volume security policy is then defined on top of the container_fs_reads_bytes_total metric from cAdvisor. Under this policy, any container that is not part of the /kubepods Kubernetes cgroup is stopped if it tries to access the breadcrumbs.
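The detection half of such a policy might look roughly like the sketch below. It assumes the metrics arrive in the Prometheus text format, where cAdvisor names this counter `container_fs_reads_bytes_total` and puts the cgroup path in the `id` label and the block device in the `device` label. The function name, the device path, and the sample lines are hypothetical, and the sketch only reports offenders; stopping them is left out.

```python
import re
from typing import List

METRIC = "container_fs_reads_bytes_total"
# metric{labels} value [timestamp]
LINE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)(?:\s+\d+)?$')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def rogue_readers(exposition: str, device: str, pod_cgroup: str = "/kubepods") -> List[str]:
    """Return cgroup paths outside pod_cgroup that have read bytes from the given device."""
    rogue = []
    for line in exposition.splitlines():
        match = LINE_RE.match(line.strip())
        if not match or match.group(1) != METRIC:
            continue  # comments, blank lines, and other metrics
        labels = dict(LABEL_RE.findall(match.group(2)))
        bytes_read = float(match.group(3))
        cgroup = labels.get("id", "")
        if (
            labels.get("device") == device
            and bytes_read > 0
            and not cgroup.startswith(pod_cgroup)
        ):
            rogue.append(cgroup)
    return rogue


# Hypothetical scrape output; the device path and cgroup ids are made up.
sample = """\
# TYPE container_fs_reads_bytes_total counter
container_fs_reads_bytes_total{device="/dev/pxd/pxd1",id="/kubepods/burstable/pod1/c1"} 4096
container_fs_reads_bytes_total{device="/dev/pxd/pxd1",id="/docker/rogue1"} 8192
container_fs_reads_bytes_total{device="/dev/sda1",id="/docker/other"} 100
"""

print(rogue_readers(sample, "/dev/pxd/pxd1"))  # → ['/docker/rogue1']
```

A real scrape is likely to include aggregate cgroups such as the root `/`, which would need to be filtered out, and a cumulative counter would normally be compared between two scrapes rather than checked for a non-zero value.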

An example of a storage policy CRD (Image credit)

“This is extensive. You can define your own actions and your own conditions. You can also define your own application-level specific policies.” —Aditya Dani, Portworx

Through the use of Kubernetes CRDs, the input and output policies of Autopilot can be heavily customized to meet different security and storage requirements. While the example presented here is about volume security, Autopilot can also be used to monitor and automate applications, as well as volume health. More examples and Autopilot’s development can be tracked in the project’s GitHub repo.

 

Want details? Watch the video!

Table of contents

  1. What is Portworx? (0’29”)
  2. Common security problems in Kubernetes (6’55”)
  3. What is the Autopilot engine? (13’05”)
  4. How does Autopilot work? (16’40”)
  5. Detecting breadcrumbs with Autopilot (18’18”)
  6. How is a storage policy defined? (21’10”)
  7. Demo: Autopilot in action (22’45”)
  8. Questions and answers (32’45”)

 

 


 

About the experts

Gou Rao is CTO and Co-founder of Portworx, leading the company’s technology, market, and solution execution strategy. Previously, he served as CTO of Data Protection at Dell, in charge of the technical direction, strategy, and architecture. Gou joined Dell through the acquisition of Ocarina Networks, where he was Co-founder, CTO, and Chief Architect. Gou was also CTO and Co-founder of Net 6 (acquired by Citrix), where he invented Hybrid VPN.

 

Aditya Dani is a member of technical staff at Portworx with 5+ years of experience in building distributed control-plane solutions. He has written the Kubernetes in-tree storage plugin for Portworx. Recently, Aditya has been working on Portworx’s distributed control plane, including the integration efforts with Kubernetes and different schedulers. Prior to that, he was a software development engineer at Amazon Music.

This post was written by Carlo Gutierrez and edited by Sophia Turol and Alex Khizhniak.