Saturday, December 19, 2020

Docker Security - Learn by Exploiting the Weaknesses

The moment we start thinking about protecting a Docker environment with the right security practices, the first buzzword that comes to mind is "isolation".

Yes, in Docker security the real buzzword is "isolation". The more you isolate the container runtime from the Docker host, and the more you isolate one container from another, the closer you are to a secure setup. To provide this isolation, Docker as a framework supports several isolation mechanisms by default, such as:
  •   Docker Namespace 
  •   Cgroups 
  •   Kernel capabilities. 
Docker namespaces provide much of the isolation by giving each container its own namespace for processes, mounts, the network stack, and so on. With the PID namespace, for example, processes in the container are isolated from processes on the host: a process has one process ID as seen from the host and a different process ID as seen from inside the container. Processes running on the host, or in another container, are not visible from inside the container. This way one container cannot disturb another container or the host.
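To make the namespace idea concrete, here is a minimal Python sketch. It assumes a Linux system where each process exposes its namespaces as links under /proc/<pid>/ns; the helper names read_namespaces and isolated_namespaces are made up for this illustration. Two processes share a namespace when the link targets are identical.

```python
import os

# Namespace types to compare (a subset chosen for this illustration)
NAMESPACES = ("pid", "mnt", "net", "uts", "ipc")


def read_namespaces(pid="self"):
    """Return {namespace: identifier} for a process, read from /proc (Linux only)."""
    ids = {}
    for name in NAMESPACES:
        try:
            ids[name] = os.readlink(f"/proc/{pid}/ns/{name}")
        except OSError:
            # Not Linux, no such process, or not permitted to look
            ids[name] = None
    return ids


def isolated_namespaces(a, b):
    """List the namespaces in which two processes do NOT share an instance."""
    return sorted(
        name
        for name in a
        if a[name] is not None and b.get(name) is not None and a[name] != b[name]
    )
```

On the host you could call read_namespaces() for your shell and read_namespaces(pid) for a container process (reading another user's process usually needs root) and pass both results to isolated_namespaces.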

Cgroups (control groups) are another key component that supports isolation in Docker. They implement resource accounting and limiting. They provide many useful metrics, but they also help ensure that each container gets its fair share of memory, CPU and disk I/O and, more importantly, that a single container cannot bring the system down by exhausting one of those resources.
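As a small illustration of cgroup limits, the sketch below interprets the content of the memory limit file. It assumes cgroup v2, where the limit of the current cgroup is typically readable at /sys/fs/cgroup/memory.max inside the container (cgroup v1 uses memory.limit_in_bytes instead); the function names are hypothetical.

```python
from typing import Optional


def parse_memory_max(text: str) -> Optional[int]:
    """Parse the content of a cgroup v2 memory.max file.

    The file holds either the literal string "max" (no limit) or a byte count.
    """
    value = text.strip()
    if value == "max":
        return None
    return int(value)


def describe_limit(text: str) -> str:
    """Turn the raw file content into a short human-readable description."""
    limit = parse_memory_max(text)
    if limit is None:
        return "no memory limit"
    return f"limited to {limit // (1024 * 1024)} MiB"
```

A container that reports "no memory limit" here is one that could exhaust host memory, which is exactly the situation cgroup limits are meant to prevent.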

As for kernel capabilities, Docker by default restricts the set of kernel capabilities available within the container. For example, the root user inside a container does NOT have all the capabilities that the root user has on the Docker host.

Along with the isolation mechanisms above, we will look at some of the security practices that Docker and the Linux kernel support.

The sections below cover the Docker security practices we will discuss in this article. In my view, if we want to protect something, we should also know how to break it. So let's learn by exploiting the weakness behind each security practice in a Docker environment.


Let's start with the Docker architecture to understand why "isolation" matters so much in Docker security. Compare where the kernel sits in the Docker architecture with where it sits in a traditional VM architecture. In a VM architecture each VM has its own dedicated kernel, but that is not the case in the Docker architecture: the processes of every container on a host share that host's kernel.

This is one of the reasons why "isolation" is important in Docker security. For example, if one container is compromised by an attacker's arbitrary code, there is a possibility that the attack breaks out of the container to the host kernel. Because the kernel is shared across containers and the Docker engine sits on top of the host kernel, the attack surface then extends to the other containers on that host as well. This is the risk the Docker architecture poses by sharing the host kernel across container processes.



Rootless containers


Running your containers as "rootless containers" means running the entire container runtime, as well as the containers themselves, without root privileges.

In a normal scenario, when the Docker engine spins up a new container process, the container
runs with "root" privilege by default. Docker's default isolation mechanisms limit the root
user's capabilities within the container, but the container still runs as the root user.
If the container runtime is compromised, the attacker can do maximum damage inside the container,
and if the attack breaks out of the container, it gains access to the Docker engine and the host machine's kernel.

Also, consider whether the container really needs to run in root mode at all. In my experience, about 90% of the time there is NO need to run a container in root mode.

Below are the potential threats of running a container in root mode.




Within the container

A compromised container runtime: with root context, an attacker can perform any action inside the container, including installing new software, editing files, mounting file systems and modifying permissions.


Outside the container

From a compromised container, an attack could:
  •      Break out of the container and escalate privileges on the host.
  •      Break out of the container to damage other containers.
  •      Break out to the Docker engine and make requests to the Docker API server.
      

How to exploit root containers


Here I will show how a container running in root mode can be exploited in simple ways.

I've used Katacoda as a testing environment.

As a first step, verify which user the container is running as.
I verified that the container is running in root mode by running the "whoami" command
inside the container.




Privilege escalation to host machine

The steps below show how privilege escalation happens from the Docker container to the Docker host.
To simulate it, I mounted the host machine's filesystem as a volume into the container, then ran
the command "cat /host/etc/shadow". The output lists the user account entries of the host machine, including their password hashes.



Small DoS attack within the container

In this step, I'll show a simple DoS attack within the Docker container.
The container is running as the root user, so it has the privilege to install any software within the container. Taking advantage of that, I installed the Debian package "stress", then used it to put a heavy load on the container's memory and thereby brought the container down in the "OOMKilled" state. The DoS exploit succeeded.


How to run as "Rootless container"


Here are some basic steps to consider for running your container as a "rootless container":

1. If you are using Kubernetes, update the securityContext section of your YAML file to set
   "runAsNonRoot" : true
   "runAsUser" : 1000

2. Add a new non-root user in your Dockerfile

RUN groupadd --gid 1000 NONROOTUser && useradd --uid 1000 --gid 1000 --home-dir /usr/share/NONROOTUser --no-create-home NONROOTUser
USER NONROOTUser

3. In case your container listens on a privileged port (anything below 1024, for example port 80), change it
to listen on an unprivileged port (1024 or above), for example port 5000.
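The three steps above can be turned into a quick self-check. The following Python sketch is only an illustration: it assumes you have already loaded the container's securityContext from your Kubernetes YAML into a plain dictionary, and the function name root_findings is hypothetical.

```python
def root_findings(security_context, container_port):
    """Return a list of reasons why this container may still depend on root."""
    findings = []
    # Step 1: Kubernetes should refuse to start the container as UID 0
    if security_context.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot is not set to true")
    # Steps 1 and 2: an explicit non-zero UID, matching the user created in the Dockerfile
    if security_context.get("runAsUser", 0) == 0:
        findings.append("runAsUser is missing or 0 (root)")
    # Step 3: ports below 1024 are privileged
    if container_port < 1024:
        findings.append(f"port {container_port} is privileged (below 1024)")
    return findings
```

For example, root_findings({"runAsNonRoot": True, "runAsUser": 1000}, 5000) returns an empty list, while an empty security context combined with port 80 produces three findings.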


Rootless Docker Engine


This means running the Docker engine (daemon) in a non-root user context.

In the section above we looked at "rootless containers". The other secure practice is to run the Docker engine itself in rootless mode.

Docker introduced the "rootless Docker engine" in Docker version 19.03 and recommends
running in rootless mode. However, at the time of writing this feature is still in preview
and is not yet widely used.

The security options reported by "docker info" show whether your Docker engine is running in root mode or rootless mode.
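One way to script this check is sketched below in Python. It assumes the input is the JSON printed by docker info --format '{{json .SecurityOptions}}', and that a rootless engine reports an entry named "rootless" there; the function name is_rootless is made up for this example, so treat it as a sketch rather than an official interface.

```python
import json


def is_rootless(security_options_json: str) -> bool:
    """Return True if the engine's security options contain a 'rootless' entry.

    Each option is a string such as "name=seccomp,profile=default";
    only the leading name=... part is examined.
    """
    options = json.loads(security_options_json)
    return any(opt.split(",")[0] == "name=rootless" for opt in options)
```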



Docker Seccomp Profile


Secure computing mode (seccomp) is a Linux kernel feature.

  • Seccomp acts like a firewall for system calls (syscalls) from the container to the host kernel.
  • Some well-known syscalls: mkdir, reboot, mount, kill, write.
  • Docker's default seccomp profile disables around 44 dangerous system calls out of the 300+ available on 64-bit Linux systems.
  • According to the list of Docker-related CVEs, most Docker incidents are due to privileged syscalls.
  • Most of the syscalls allowed by Docker's default seccomp profile are NOT needed by a typical product. It is recommended to create a product-specific custom seccomp profile that allows only the syscalls your container actually uses.



How to check Container Seccomp Profile

You can verify whether your container is protected by a seccomp profile. Open a terminal inside your container and run the command grep Seccomp /proc/$$/status.

Seccomp value 2 means it is ENABLED (filter mode, which is what Docker profiles use)
Seccomp value 1 means strict mode
Seccomp value 0 means it is NOT enabled
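The same check can be scripted. This Python sketch parses text in the format of /proc/<pid>/status and maps the Seccomp field to the mode names used by the Linux kernel (0 disabled, 1 strict, 2 filter); the function name seccomp_mode is an assumption for this example.

```python
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text: str) -> str:
    """Extract the seccomp mode from the content of a /proc/<pid>/status file."""
    for line in status_text.splitlines():
        # Match "Seccomp:" exactly so that "Seccomp_filters:" is not picked up
        if line.startswith("Seccomp:"):
            value = int(line.split(":", 1)[1].strip())
            return SECCOMP_MODES.get(value, "unknown")
    return "unknown"
```

Inside a container you could call seccomp_mode(open("/proc/self/status").read()); a result of "filter" corresponds to the value 2 described above.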



Docker Limited Kernel capabilities


By default, Docker starts containers with a restricted set of capabilities. This provides
greater security within the container environment.

It means that even though your container process runs as root, the kernel capabilities
within the container are limited. Docker allows only a limited set of capabilities that a
process in the container can use. However, this default protection can be overridden
if you run your container in "privileged" mode.

To understand this better: if you log in to your Linux host machine as the root user,
the following Linux kernel capabilities are allowed.

CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH, CAP_FOWNER, CAP_FSETID, CAP_KILL, CAP_SETGID, CAP_SETUID, CAP_SETPCAP, CAP_LINUX_IMMUTABLE, CAP_NET_BIND_SERVICE, CAP_NET_BROADCAST, 
CAP_NET_ADMIN, CAP_NET_RAW, CAP_IPC_LOCK, CAP_IPC_OWNER, CAP_SYS_MODULE, CAP_SYS_RAWIO, CAP_SYS_CHROOT, CAP_SYS_PTRACE, CAP_SYS_PACCT, CAP_SYS_ADMIN, CAP_SYS_BOOT, CAP_SYS_NICE, CAP_SYS_RESOURCE, CAP_SYS_TIME, CAP_SYS_TTY_CONFIG, CAP_MKNOD,
 CAP_LEASE, CAP_AUDIT_WRITE, CAP_AUDIT_CONTROL, CAP_SETFCAP, CAP_MAC_OVERRIDE,  CAP_MAC_ADMIN, CAP_SYSLOG
 
 
But when the same root user enters a Docker container, most of the kernel capabilities above
are dropped and only the restricted list below is allowed.

CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_FOWNER, CAP_FSETID, CAP_KILL, CAP_SETGID,
CAP_SETUID, CAP_SETPCAP, CAP_NET_BIND_SERVICE, CAP_NET_RAW, CAP_SYS_CHROOT, CAP_MKNOD, CAP_AUDIT_WRITE, CAP_SETFCAP
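You can see which capabilities a process really holds from the CapEff line of /proc/<pid>/status, which is a hexadecimal bitmask. The Python sketch below decodes such a mask using the capability numbering from the Linux kernel headers (the same order as the host list above). The function name decode_caps is hypothetical, and the sample mask 00000000a80425fb is the value commonly seen for Docker's default set, which may differ on your Docker version.

```python
# Index in this list = capability number in the kernel headers
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID", "CAP_SETPCAP",
    "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
    "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN", "CAP_SYSLOG",
]


def decode_caps(mask_hex):
    """Translate a CapEff-style hex bitmask into a list of capability names."""
    mask = int(mask_hex, 16)
    names = []
    for bit in range(mask.bit_length()):
        if (mask >> bit) & 1:
            # Newer kernels define capabilities beyond this table
            names.append(CAP_NAMES[bit] if bit < len(CAP_NAMES) else f"CAP_{bit}")
    return names
```

Running grep CapEff /proc/1/status inside a container and passing the hex value to decode_caps shows at a glance whether powerful capabilities such as CAP_SYS_ADMIN are present.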

DO NOT RUN CONTAINERS IN --PRIVILEGED MODE !!

A privileged container can do almost everything that the host can do.
The --privileged flag gives all capabilities to the container, and it also lifts all the limitations enforced by the device cgroup controller.

You can verify whether your container is running in PRIVILEGED mode or normal mode by checking its HostConfig.Privileged setting with "docker inspect".

If the value is TRUE, it means the container is running in PRIVILEGED mode.
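A commonly used command for this check is docker inspect --format='{{.HostConfig.Privileged}}' followed by the container name. If you prefer to process the full docker inspect output in a script, the Python sketch below reads the same field from the JSON; the function name is_privileged is an assumption for this example.

```python
import json


def is_privileged(inspect_json: str) -> bool:
    """Return True if the 'docker inspect' JSON describes a privileged container."""
    data = json.loads(inspect_json)
    # 'docker inspect' prints a JSON array with one object per inspected container
    container = data[0] if isinstance(data, list) else data
    return bool(container.get("HostConfig", {}).get("Privileged", False))
```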


Run container with limited or NO Kernel capabilities


In normal scenarios, most microservices running in a container do NOT need
all the kernel capabilities that Docker provides by default.

Hence, the best practice is to DROP all capabilities and add back only the required ones.

This can be done in the security context configuration of your Kubernetes YAML file. In your security
context, first DROP all capabilities. Example:
 
 securityContext => capabilities => drop : ["ALL"]
 
Then add back only the required capabilities. Example:

 securityContext => capabilities => add : ["NET_ADMIN", "SYS_TIME"]
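To see what a given drop/add combination leaves you with, here is a simplified Python model. It assumes Docker's documented default capability set as the starting point and uses Kubernetes-style names without the CAP_ prefix; the function name effective_caps is hypothetical, and the model ignores details such as how capabilities behave for non-root users.

```python
# Docker's documented default capability set (names without the CAP_ prefix)
DOCKER_DEFAULT_CAPS = frozenset({
    "CHOWN", "DAC_OVERRIDE", "FOWNER", "FSETID", "KILL", "SETGID", "SETUID",
    "SETPCAP", "NET_BIND_SERVICE", "NET_RAW", "SYS_CHROOT", "MKNOD",
    "AUDIT_WRITE", "SETFCAP",
})


def effective_caps(drop, add, defaults=DOCKER_DEFAULT_CAPS):
    """Apply drop first, then add, and return the resulting capabilities sorted."""
    dropped = {name.upper() for name in drop}
    remaining = set() if "ALL" in dropped else set(defaults) - dropped
    return sorted(remaining | {name.upper() for name in add})
```

With drop set to ["ALL"] and add set to ["NET_ADMIN", "SYS_TIME"], only those two capabilities remain, which is the effect the two configuration lines above aim for.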

Docker SELinux Protection


SELinux controls access between container processes and system objects by type and level. Docker offers two forms of SELinux protection: type enforcement and multi-category security (MCS) separation.

  • SELinux is a LABELING system
  • Every process has a LABEL. Every file, directory and system object has a LABEL
  • SELinux policy rules control access between labelled processes and labelled objects.
!! To enable SELinux in a container, your Linux host machine must have SELinux enabled and running !!




Docker UNIX Socket (/var/run/docker.sock) usage


To implement container management functionality, such as collecting logs from all containers
or creating and stopping containers, developers sometimes mount the Docker UNIX socket
inside a container and drive the Docker engine through it.

BE CAUTIOUS WHEN YOU MOUNT DOCKER UNIX SOCKET INSIDE YOUR CONTAINER !

The combination of root context, privileged mode and a mounted Docker UNIX socket is especially dangerous.

A typical scenario is a log management container that mounts the Docker UNIX socket to collect logs from all the containers run by the Docker engine.
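To find containers that have the socket mounted, you can look at the Mounts section of the docker inspect output. The Python sketch below is one hedged way to do that; it assumes the usual socket locations /var/run/docker.sock and /run/docker.sock, and the function name socket_mounts is made up for this example.

```python
import json

# Host paths under which the Docker socket is usually found (assumption)
DOCKER_SOCKETS = {"/var/run/docker.sock", "/run/docker.sock"}


def socket_mounts(inspect_json: str):
    """Return (destination, writable) for every Docker socket mount of a container."""
    data = json.loads(inspect_json)
    # 'docker inspect' prints a JSON array with one object per inspected container
    container = data[0] if isinstance(data, list) else data
    return [
        (mount.get("Destination"), bool(mount.get("RW")))
        for mount in container.get("Mounts", [])
        if mount.get("Source") in DOCKER_SOCKETS
    ]
```

An empty result means the container has no direct path to the Docker API through the socket; any entry deserves a review, especially when the container also runs as root or in privileged mode.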



Docker Network security 


Be cautious about how you expose services inside the container to the outside of the cluster.

  • Do NOT expose the container with an External IP (if there is NO explicit need for one).
  • When there is a need to expose with an External IP, ensure that inbound connections are encrypted and the service listens on port 443.
  • Always try to expose your services only in Cluster IP mode.
  • If there is a need to expose with a Node Port, ensure that inbound connections are encrypted and the service listens on port 443.

Ingress and Egress rules:

Control traffic to your services with Ingress and Egress network policies. 
  • With the strict ingress rules supported by Kubernetes, you can restrict the inbound connections to your containers.
  • With the strict egress rules supported by Kubernetes, you can restrict the outbound connections from your containers to other networks.

Other Docker Security Practices


  • Mount volumes as read-only.
  • Ensure SSHD does not run within the containers.
  • Ensure the Linux host network interface is not shared with containers.
  • Set a limit on container memory usage. Without one, a single container can easily make the whole system unstable in case of a DoS attack.
  • Don't mount system-relevant volumes of the underlying host (e.g. /etc, /dev, ...) into the container, so that an attacker cannot compromise the entire system rather than just the container instance.
  • If the Docker daemon must be available remotely over a TCP port, ensure TLS authentication is enabled.
  • Consider a read-only filesystem for the containers.
  • Use a secrets store/wallet instead of environment variables to hold sensitive data inside a Docker container.
