Wherein we learn how to run commands that require SSH keys or other secrets from within a Dockerfile, without leaving said secrets in the resulting Docker image.

Preparing an application to run under Docker involves generating one or more Docker images, usually with the docker client’s build command. The Dockerfile syntax provides directives to copy the application’s codebase and other files into the Docker image. This works well as long as all the app’s dependencies are present in the codebase itself, but the build process for many modern web platforms, such as Rails or Node.js, is often considerably more complex.

The Problem:

Typically, a file within the application’s codebase (such as a Ruby Gemfile) lists all the dependencies, and a platform-specific tool (like bundler) is invoked to download them into place. Some of these “secondary” build steps can introduce even more dependencies – steps such as compiling certain Ruby gems with a C-language interface against the underlying OS libraries. Again, none of this is a problem, as long as all these dependencies either exist already in the source image, or can be acquired from publicly-available locations. But if that’s not the case, the nature of the Docker build process introduces a nasty security problem…

When docker build is run, the Docker daemon processes the Dockerfile one directive at a time, and runs each directive in a new container. Naturally, for the first directive after FROM, the container is based on the Docker image that FROM specifies, but after that build step succeeds, docker creates a new, “intermediate” docker image from the resulting container, then runs the next directive in a container based on that image, and so forth. You can see this clearly in the docker build output, which shows the IDs of each intermediate container and image:

Sending build context to Docker daemon 18.44 MB
Step 0 : FROM mdsol/passenger-ruby21
 ---> e3c5b63d6b87
Step 1 : MAINTAINER Medidata Devtools <devtools@mdsol.com>
 ---> Running in 36e870e7fd8f
 ---> bd2e6f162d7f
Removing intermediate container 36e870e7fd8f
Step 2 : ADD . /home/app/web
 ---> 0e28dac7e77e
Removing intermediate container 5d3daef87c57
Step 3 : WORKDIR /home/app/web
 ---> Running in 6f66534c30c2
 ---> 273a50b919e3
Removing intermediate container 6f66534c30c2
Step 4 : RUN npm install --unsafe-perm
 ---> Running in c4631bcf2a3f
...

The fully-built image is a union filesystem view of all the intermediate layers, each represented as changes from the previous layer, and merged with one another into a final view. This means that the intermediate layers are still accessible to anyone who knows how to deconstruct the image, so if we want to use some secret (such as a GitHub SSH key) during the build process, we need to make sure that the secret is not written to disk. More precisely, if it is written to the filesystem, it must be removed before the command requiring it terminates, because that’s when the build process takes a snapshot of that particular command’s container.
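
To make the layering concrete, here is a toy model in plain shell, with no Docker involved. It assumes each layer is a tar archive of one build step's changes and that a deletion is recorded as a marker file with a .wh. prefix, which is a simplification of the real image format. The paths and the key contents are made up for illustration.

```shell
# Toy model only: two tar archives stand in for two image layers.
work=$(mktemp -d)
mkdir -p "$work/step1/root/.ssh" "$work/step2/root/.ssh"

# Step 1 (think: ADD id_rsa /root/.ssh/id_rsa) writes the secret.
printf 'FAKE-PRIVATE-KEY\n' > "$work/step1/root/.ssh/id_rsa"
tar -cf "$work/layer1.tar" -C "$work/step1" .

# Step 2 (think: RUN rm /root/.ssh/id_rsa) records only a deletion marker.
: > "$work/step2/root/.ssh/.wh.id_rsa"
tar -cf "$work/layer2.tar" -C "$work/step2" .

# The merged view would hide the key, but layer 1 still holds its contents.
recovered=$(tar -xOf "$work/layer1.tar" ./root/.ssh/id_rsa)
echo "recovered from layer 1: $recovered"
```

On a real image, docker save writes the image out as a tar archive containing its layers, and those layers can be unpacked and searched in much the same way.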

So we have two hurdles: how to inject the secret into the build context with minimal formality in the Dockerfile; and how to make sure it gets injected precisely when needed, then removed before the snapshot happens.

The Solution:

At Medidata, we’ve written a tiny command-line tool that allows us to overcome both halves of the problem. The tool is called docker-ssh-exec, and it runs in its own container to serve the key over the private Docker-created network. During the build, it’s invoked from the Dockerfile, using the RUN directive, where you just pass it the command that requires the secret. docker-ssh-exec fetches the key over the network from the server container, writes it to disk, executes the desired command, and then removes the key. Since all this occurs within the scope of a single RUN directive, the key data is never written into the resulting filesystem layer.
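
The write, run, remove sequence can be sketched in a few lines of shell. This is only an illustration of the pattern, not the actual docker-ssh-exec implementation: the with_secret function is hypothetical, and it takes the key data as an argument rather than fetching it over the network.

```shell
# Hypothetical helper illustrating the pattern; not docker-ssh-exec itself.
# Usage: with_secret KEY_DATA KEY_PATH COMMAND [ARGS...]
with_secret() (
  key_data=$1
  key_path=$2
  shift 2
  # Delete the key when this subshell exits, whether the command
  # succeeded or failed.
  trap 'rm -f "$key_path"' EXIT
  # Make the key file readable by the current user only.
  umask 077
  mkdir -p "$(dirname "$key_path")"
  printf '%s\n' "$key_data" > "$key_path"
  # Run the wrapped command; its exit status becomes the function's status.
  "$@"
)

# Example call (KEY is an assumed variable holding the key data):
#   with_secret "$KEY" "$HOME/.ssh/id_rsa" git clone git@github.com:my_user/my_private_repo.git
```

Because the key is written and deleted inside one invocation, a snapshot taken after that invocation returns never sees the file.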

Here’s how to do it, in slightly more detail. To run the server component, pass it the private half of your SSH key, either as a shared volume:

docker run -v ~/.ssh/id_rsa:/root/.ssh/id_rsa -d mdsol/docker-ssh-exec -server

or as an ENV var:

docker run -e DOCKER-SSH-KEY="$(cat ~/.ssh/id_rsa)" -d mdsol/docker-ssh-exec -server

As long as the source image is set up to trust GitHub’s server key (see the product’s README for tips), you can clone private repositories from within the Dockerfile like this:

RUN docker-ssh-exec git clone git@github.com:my_user/my_private_repo.git

The client first transfers the key from the server, writing it to $HOME/.ssh/id_rsa (by default), then executes whatever command you supply as arguments. Before exiting, it deletes the key from the filesystem.
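
Putting the server and the build together, one option is a small wrapper script on the build host. The sketch below is an assumption about how you might organize this, not part of docker-ssh-exec: build_with_key_server is a hypothetical function that starts the key server using the shared-volume form shown above, runs docker build with whatever arguments you pass, and then removes the server container whether or not the build succeeded.

```shell
# Hypothetical wrapper; only the docker run line comes from the article.
# Usage: build_with_key_server KEY_FILE [docker build args...]
# KEY_FILE should be an absolute path, as bind mounts require one.
build_with_key_server() {
  key_file=$1
  shift
  # With -d, docker run prints the new container's ID on stdout.
  server_id=$(docker run -v "$key_file:/root/.ssh/id_rsa" -d mdsol/docker-ssh-exec -server) || return 1
  # Run the build and remember its exit status without aborting early.
  status=0
  docker build "$@" || status=$?
  # Stop and remove the server so the key is not served longer than needed.
  docker rm -f "$server_id" >/dev/null
  return "$status"
}

# Example call (my_app is a made-up image tag):
#   build_with_key_server ~/.ssh/id_rsa -t my_app .
```

Keeping the server's lifetime tied to a single build narrows the window in which other containers on the host could request the key.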

Notes / Further Reading:

The secret can be any data; the software does not examine its content.

In order to keep any additions to the Dockerfile as minimal as possible, the client discovers the server using broadcast UDP, and no authentication is performed between them. Thus, any docker container on the host (and thus any user who can invoke Docker) has access to the secret. Since the two common use cases for this tool are running on a developer’s workstation and running on a private build server, this is not usually an issue.

Docker version 1.9 will introduce the ARG directive, which will allow information to be passed to the build in a more ephemeral fashion. Depending on how complex you’re willing to make your Dockerfile and build command, you may prefer this approach once it’s released.

A similar approach to solving this problem, and a few other, related, Docker building issues, can be found here. It involves no custom tools, but requires more preparation both on the server side and on the client side.
