Docker is a set of software and services that enable virtualization at the level of the operating system; this is also known as containerization.
Docker allows developers to package applications, together with their dependencies and configuration settings, into (virtual) containers that can run on any Linux, Windows, or macOS machine, be it on a desktop, in the cloud, or on the node of an IoT network.
When a Docker container is run on Linux, Docker leverages the Linux kernel and an overlay filesystem to ensure the process is isolated and its computing resources are limited. Since no hardware virtualization is involved, this has a very low overhead. On macOS and Windows, a lightweight virtual machine is provisioned, within which Docker runs and containers are executed.
The ability to run containers transparently on any OS gives developers a uniform experience regardless of where they develop. Once an application is developed and packaged into a Docker image, the developer can be sure the application will run anywhere Docker runs.
The best way to see how one can use Docker to streamline and unify development and deployment is to use it.
To follow along with examples, you'll need to have Docker installed. You can install all required tools by installing Docker Desktop.
Packaging an application into a Docker image
To test drive Docker, let's implement a simple web application and package it as a Docker image, i.e., containerize it.
The web application
First, let's implement the app and run it natively.
We'll create a simple HTTP end-point that takes a query parameter named
path and returns the list of files under location specified by the
parameter. (Security wise, having an app that serves the filesystem
contents without authorization is generally a bad idea; we're using it
here merely to demonstrate the isolation that is offered by Docker.)
We'll implement this in Python using the Falcon web framework and the Gunicorn application server. Here's the code.
```python
import json
import os

import falcon


class FileBrowser:
    def on_get(self, req, resp):
        if "path" not in req.params:
            resp.status = falcon.HTTP_400
            resp.text = "Missing path parameter!"
            return
        path = req.params["path"]
        if not os.path.isdir(path):
            resp.status = falcon.HTTP_404
            resp.text = "Path '%s' does not exist" % path
            return
        try:
            files = json.dumps(os.listdir(path))
            resp.status = falcon.HTTP_200
            resp.text = files
        except Exception as e:
            resp.status = falcon.HTTP_500
            resp.text = "Unexpected error: '%s'" % e


app = falcon.App()
app.add_route('/', FileBrowser())
```
This is the entire code, which we save into `fileapi.py`. The web application accepts GET requests to the root endpoint `/`, where the following logic takes place.
- If the `path` query parameter is missing, a `400 Bad Request` response is returned;
- if the query parameter is present but points to a non-existing directory, a `404 Not Found` response is returned;
- if the query parameter points to a valid directory path, the list of files is obtained and returned as a JSON array; the default `content-type` in Falcon is `application/json`;
- however, if an error occurs during the file listing, a `500 Internal Server Error` response is returned.
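The branching above can be mirrored by a plain function, independent of any web framework (a simplified sketch for illustration; the name `browse` and the `(status, body)` return shape are ours, not part of the app):

```python
import json
import os


def browse(path):
    """Mirror the handler's logic: return (status, body) for a given path parameter."""
    if path is None:                      # query parameter missing
        return 400, "Missing path parameter!"
    if not os.path.isdir(path):           # parameter present, but not a directory
        return 404, "Path '%s' does not exist" % path
    try:
        return 200, json.dumps(os.listdir(path))
    except Exception as e:                # e.g. permission denied
        return 500, "Unexpected error: '%s'" % e


status, body = browse(None)   # → (400, "Missing path parameter!")
```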
To run this application, we need a system with Python, the Falcon framework, and the Gunicorn application server installed. On a typical Debian-based system, one would install them with `sudo apt install python3 python3-pip` and then use pip to install the Python dependencies, for instance `pip3 install falcon gunicorn`.
Alternatively, we could use the `venv` module to create a Python virtual environment. Needless to say, this process is different if one is using macOS or Windows.
Finally, we can run the application by issuing `gunicorn fileapi:app --bind 127.0.0.1:8000`. This will use the Gunicorn application server to start the application in file `fileapi.py` and listen on the loopback interface; make sure the command is run from the same directory as the `fileapi.py` file. Next, let's test the server with a few requests.
```
$ curl -i "localhost:8000/"
HTTP/1.1 400 Bad Request
Server: gunicorn
Date: Wed, 21 Sep 2022 13:27:42 GMT
Connection: close
content-length: 23
content-type: application/json

Missing path query parameter!

$ curl -i "localhost:8000/?path=/not-a-dir"
HTTP/1.1 404 Not Found
Server: gunicorn
Date: Wed, 21 Sep 2022 13:28:40 GMT
Connection: close
content-length: 32
content-type: application/json

Path '/not-a-dir' does not exist

$ curl -i "localhost:8000/?path=/usr"
HTTP/1.1 200 OK
Server: gunicorn
Date: Wed, 21 Sep 2022 13:29:35 GMT
Connection: close
content-length: 78
content-type: application/json

["lib", "include", "libexec", "local", "sbin", "src", "share", "games", "bin"]

$ curl -i "localhost:8000/?path=/root"
HTTP/1.1 500 Internal Server Error
Server: gunicorn
Date: Wed, 21 Sep 2022 13:30:53 GMT
Connection: close
content-length: 57
content-type: application/json

Unexpected error: '[Errno 13] Permission denied: '/root''
```
Looks like the server is working. Now let's package all of this into a Docker image.
To create a Docker image, we have to provide a set of instructions that will build it. These instructions are written in a file called a `Dockerfile`. Create a new file named `Dockerfile` in the same directory as `fileapi.py` and populate it with the following.
```dockerfile
FROM python:3.10.7-alpine3.16
WORKDIR /app
RUN pip install gunicorn==20.1.0 falcon==3.1.0
COPY . .
EXPOSE 8000
CMD ["gunicorn", "fileapi:app", "--bind", "0.0.0.0:8000"]
```
These six lines define the entire image. Let's unpack them line-by-line.
`FROM` sets the base image to use. While we could start with an empty image, we are going to leverage one of the many pre-configured Python images from Docker Hub. In our case, we are picking Python version 3.10.7 and the supporting libraries that are part of the Alpine Linux distribution. (This does not mean that we'll be running Alpine Linux in a virtual machine, only that the libraries packaged in the image come from said Linux distribution.) We are selecting Alpine Linux because of its small disk footprint.
The `WORKDIR` command sets the working directory inside the image. If the directory does not exist, it will be created; this will be the location of our application.
We install the required Python dependencies with the `RUN` command. Here we are pinning the libraries to specific versions. This is good practice, since we know that our application works fine if Python is 3.10.7, Gunicorn is 20.1.0, and Falcon is 3.1.0. (If we had many such dependencies, it would be better to use a `requirements.txt` file, but let's keep things simple for now.)
Next we use `COPY . .` to copy all resources from the current directory on the host computer to the working directory (`/app`) in the image. As it currently stands, the command will copy all files from the host, which is often undesirable; we show how to list exclusions a bit later.
The `EXPOSE 8000` command documents that the service running in the container listens on port `8000` and makes that port reachable from other containers. While in our case no other container will access this service (there will only be a single process running in the container), processes from outside of the container will access the service; for that, we will have to provide additional options when running the container.
And finally, the `CMD` command specifies the command that runs when the container is started. In our case, the command is `gunicorn fileapi:app --bind 0.0.0.0:8000`; we changed the IP from the loopback device to all interfaces. The reason is that the container has to listen on all interfaces if we want to access it from the host computer. If we used the container's loopback device, we would be unable to reach it from the host, since the loopback device inside the container is different from the loopback device on the host computer.
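The difference between binding to the loopback device and binding to all interfaces can be illustrated with a bare stdlib socket (a sketch for illustration; Gunicorn does essentially this under the hood):

```python
import socket

# Bind to the loopback device only: reachable solely from the same machine
# (or, inside a container, solely from within that container).
lo = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
lo.bind(("127.0.0.1", 0))      # port 0 asks the OS for any free port
lo_addr = lo.getsockname()

# Bind to all interfaces: reachable via any address the machine owns.
# This is what gunicorn --bind 0.0.0.0:8000 does inside the container.
any_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
any_sock.bind(("0.0.0.0", 0))
any_addr = any_sock.getsockname()

lo.close()
any_sock.close()
```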
To exclude certain files from being copied from the host into the image (`COPY . .` in step 4 above), create a file called `.dockerignore` and populate it with the following.
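A minimal `.dockerignore` for this project could look as follows (the `__pycache__` entry is our assumption; any generated files you do not want in the image belong here):

```
.*
__pycache__
```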
These two lines instruct the Docker `COPY` command to ignore all hidden files (files starting with a dot `.`) and the `__pycache__` directory.
Building the Docker image
Now we are ready to build the image. Inside the directory that contains the `Dockerfile`, issue the following command.
```
$ docker build -t file-api .
...
Successfully tagged file-api:latest
```
The command builds the image and tags it `file-api`. During the build, all required dependencies are also installed. We can get the list of images available on our system as follows.
```
$ docker images
REPOSITORY   TAG                 IMAGE ID       CREATED         SIZE
file-api     latest              fbbf93095fb6   3 minutes ago   63.2MB
python       3.10.7-alpine3.16   4da4c1dc8c72   13 days ago     48.7MB
```
Running the container
Now that the image has been built, we can run it and create a container.
```
$ docker run -p 127.0.0.1:5000:8000 file-api
[2022-09-21 15:07:50 +0000] [INFO] Starting gunicorn 20.1.0
[2022-09-21 15:07:50 +0000] [INFO] Listening at: http://0.0.0.0:8000 (1)
[2022-09-21 15:07:50 +0000] [INFO] Using worker: sync
[2022-09-21 15:07:50 +0000] [INFO] Booting worker with pid: 6
```
We have now run the image `file-api` and started a container. Docker is mapping address `127.0.0.1:5000` on the host to port `8000` in the container; this was achieved with the `-p 127.0.0.1:5000:8000` switch.
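The structure of the publish switch can be unpacked with a tiny helper (a sketch for illustration only, not part of Docker; it handles just the three-part `IP:HOST_PORT:CONTAINER_PORT` form used here):

```python
def parse_publish(spec):
    """Split a docker -p spec of the form IP:HOST_PORT:CONTAINER_PORT."""
    host_ip, host_port, container_port = spec.split(":")
    return host_ip, int(host_port), int(container_port)


# The switch from the example: host 127.0.0.1:5000 forwards to container port 8000.
print(parse_publish("127.0.0.1:5000:8000"))  # → ('127.0.0.1', 5000, 8000)
```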
If we open a new terminal (the container is running in the current one) and issue a few GET requests, we should get familiar responses.
```
$ curl -i "localhost:5000/?path=/usr"
HTTP/1.1 200 OK
Server: gunicorn
Date: Wed, 21 Sep 2022 15:12:34 GMT
Connection: close
content-length: 47
content-type: application/json

["lib", "local", "sbin", "share", "bin", "src"]

$ curl -i "localhost:5000/?path=/root"
HTTP/1.1 200 OK
Server: gunicorn
Date: Wed, 21 Sep 2022 15:12:54 GMT
Connection: close
content-length: 29
content-type: application/json

[".cache", ".python_history"]
```
However, notice how the contents of `/root` are now different. This is because the app is now running inside the container, which has its own filesystem and directory structure and is isolated from the host computer.
Moreover, applications inside the container run as `root` by default; this is why accessing `/root` is now allowed. However, in certain situations, there might be good security reasons to avoid running containers as `root`. But that is material for another topic.
We can now query Docker to see which containers are running.
```
$ docker ps
CONTAINER ID   IMAGE      COMMAND                  CREATED          STATUS          PORTS                      NAMES
ee841890bee4   file-api   "gunicorn fileapi:ap…"   13 minutes ago   Up 13 minutes   127.0.0.1:5000->8000/tcp   cool_johnson
```
Since we did not name the container explicitly, Docker came up with a random name: `cool_johnson`. If we press `CTRL+C` in the terminal that is running the container, the container should stop. Now we have to run `docker ps -a` to see the list of all containers, stopped and running.
To delete the container, run `docker rm cool_johnson`. You might have to change the name, since it is unlikely that yours is also called `cool_johnson`.
Managing more complex setups with Docker Compose
Web applications often consist of multiple services: an application server, a database, a cache layer, a background task system and so on. To make our example more realistic, let's add another service to the web application.
Suppose we benchmarked our system and found out that the operation that lists the contents of a directory is rather slow. Since our filesystem rarely changes, if ever, we decide to implement a simple cache using the Redis database.
Redis is a fast in-memory key-value store. As keys, we'll store the `path` values, and as the corresponding values, we'll store the lists of files under the given paths.
Next, we'll change the application so that when a request with a valid `path` is received, it will first ask Redis whether it contains the key `path`, and if so, it will serve the cached contents. If the key does not exist, the application will list the files from the filesystem, serve them to the client, and save the result to the cache. Consequently, all subsequent requests for the same `path` should be served from the cache and not from the slow filesystem.
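The strategy just described is the classic cache-aside pattern; here is a stdlib sketch with a plain dict standing in for Redis (names like `list_files_cached` are ours, for illustration only):

```python
import json
import os

cache = {}  # stand-in for Redis: maps path -> JSON-encoded file list


def list_files_cached(path):
    """Serve from cache when possible; otherwise read the filesystem and cache the result."""
    cached = cache.get(path)
    if cached is not None:
        return cached, True               # cache hit: skip the filesystem entirely
    files = json.dumps(sorted(os.listdir(path)))
    cache[path] = files                   # populate the cache for subsequent requests
    return files, False                   # cache miss: this request paid the full cost


first, hit1 = list_files_cached(".")      # miss: reads the filesystem
second, hit2 = list_files_cached(".")     # hit: served from the dict
```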
This modification introduces a new service to our application setup and adds complexity: we have to modify the application to use the Redis database, we have to create another container that will run it, and we have to connect both containers.
The modifications to the web application are rather straightforward; below we list the entire Falcon application that uses Redis for caching. Update `fileapi.py` to contain the following code.
```python
import json
import os

import falcon
import redis


class FileBrowser:
    def __init__(self, cache):
        self.cache = cache

    def on_get(self, req, resp):
        if "path" not in req.params:
            resp.status = falcon.HTTP_400
            resp.text = "Missing path query parameter!"
            return
        path = req.params["path"]
        cached = self.cache.get(path)
        if cached:
            resp.status = falcon.HTTP_200
            resp.text = cached
            return
        if not os.path.isdir(path):
            resp.status = falcon.HTTP_404
            resp.text = "Path '%s' does not exist" % path
            return
        try:
            files = json.dumps(os.listdir(path))
            resp.status = falcon.HTTP_200
            resp.text = files
            self.cache.set(path, files)
        except Exception as e:
            resp.status = falcon.HTTP_500
            resp.text = "Unexpected error: '%s'" % e


app = falcon.App()
redis_cache = redis.Redis(host='redis-cache', port=6379, db=0,
                          decode_responses=True)
app.add_route('/', FileBrowser(redis_cache))
```
Notice how we set the address of the Redis database to `redis-cache`; this is an actual hostname that will be assigned to the container running the Redis database.
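Hard-coding the hostname works here because Compose controls it, but a common refinement (our suggestion, not part of the original code) is to read the address from environment variables, so the same image can also run outside of Compose:

```python
import os

# Fall back to the Compose service name when the variables are not set.
# REDIS_HOST and REDIS_PORT are hypothetical variable names chosen for this sketch.
redis_host = os.environ.get("REDIS_HOST", "redis-cache")
redis_port = int(os.environ.get("REDIS_PORT", "6379"))
```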
Because our modifications add a new Python dependency, namely the Python library that connects to Redis, we have to update the `Dockerfile` as well.
```dockerfile
FROM python:3.10.7-alpine3.16
WORKDIR /app
RUN pip install gunicorn==20.1.0 falcon==3.1.0 redis==4.3.4
COPY . .
EXPOSE 8000
CMD ["gunicorn", "fileapi:app", "--bind", "0.0.0.0:8000"]
```
The only change is in the `RUN` command, which now additionally installs the Python Redis bindings.
Setting up additional Docker containers
Next, we have to spin up another container that will run the Redis database and link it with the web application container.
While we could do all these things manually with multiple separate commands, we can instead describe everything in a `docker-compose.yml` file, which specifies all required services, their dependencies, configuration, and start-up sequence. We can then start our application with a single command. Here is the `docker-compose.yml` that we'll need.
```yaml
version: "3"
services:
  redis-cache:
    image: redis:7.0.4-alpine3.16
    restart: always
    expose:
      - 6379
  falcon-webapp:
    restart: always
    build: .
    image: file-api
    ports:
      - 127.0.0.1:5000:8000
    depends_on:
      - redis-cache
```
Let's parse the contents line-by-line.
First we have to specify the schema version; it needs to be provided as a string. Next, we specify the list of services, or containers, that will run in this setup; this is defined under the `services` key.
We name the first service `redis-cache`. The container will be assigned an internal IP, and `redis-cache` will be its hostname; recall the Python code.
As with the Python image, we browse Docker Hub for Redis images and pin the image to a specific version (7.0.4) and environment (Alpine Linux 3.16). We expose port `6379`, which Redis uses by default. If the container unexpectedly stops, Docker will attempt to restart it (`restart: always`).
Finally, we define the web application service and name it `falcon-webapp`. This service container gets created from the image defined in the `Dockerfile` above, which needs to be in the same directory as the `docker-compose.yml`. Next, we set the name of the image to be built to `file-api`, we set the port forwarding to allow the host computer to access the container on `localhost:5000`, and we require the `redis-cache` service to be online before `falcon-webapp` starts.
Running and inspecting services
Once the `docker-compose.yml` is ready, we build the required images; in our case, only the `file-api` image, since the image for Redis will be downloaded when we start the application.
```
$ docker compose build
[+] Building 3.2s (9/9) FINISHED
 => [internal] load build definition from Dockerfile                          0.8s
 => => transferring dockerfile: 32B                                           0.0s
 => [internal] load .dockerignore                                             1.2s
 => => transferring context: 34B                                              0.0s
 => [internal] load metadata for docker.io/library/python:3.10.7-alpine3.16   0.0s
 => [1/4] FROM docker.io/library/python:3.10.7-alpine3.16                     0.0s
 => [internal] load build context                                             0.6s
 => => transferring context: 100B                                             0.0s
 => CACHED [2/4] WORKDIR /app                                                 0.0s
 => CACHED [3/4] RUN pip install gunicorn==20.1.0 falcon==3.1.0 redis==4.3.4  0.0s
 => CACHED [4/4] COPY . .                                                     0.0s
 => exporting to image                                                        0.6s
 => => exporting layers                                                       0.0s
 => => writing image sha256:eb8a14d65dcfe9c29ce1ae5020a3f15ea01ac307941aaba5c45101c11cf47bc7  0.0s
 => => naming to docker.io/library/file-api                                   0.0s
```
To run the application, we issue `docker compose up -d`. The `-d` flag starts the containers in detached mode, that is, in the background. Once the application is running, we can issue requests as before.
```
$ curl -i "localhost:5000/?path=/root"
HTTP/1.1 200 OK
Server: gunicorn
Date: Thu, 22 Sep 2022 11:51:26 GMT
Connection: close
content-length: 29
content-type: application/json

[".cache", ".python_history"]

$ curl -i "localhost:5000/?path=/home"
HTTP/1.1 200 OK
Server: gunicorn
Date: Thu, 22 Sep 2022 11:51:36 GMT
Connection: close
content-length: 2
content-type: application/json

[]
```
The following commands are often useful:
- `docker compose down` – stops and removes all containers,
- `docker compose logs` – shows logs from all containers,
- `docker compose exec <service> <command>` – executes the `command` in the container that is running the given `service`.
To demonstrate how we can attach to a container and run commands in it, let's examine the contents of the Redis database. First, we attach to the redis container as follows.
```
$ docker compose exec redis-cache sh
/data # ps ax
PID   USER     TIME  COMMAND
    1 redis     0:00 redis-server *:6379
   22 root      0:00 sh
   28 root      0:00 ps ax
/data #
```
The command `docker compose exec redis-cache sh` will execute the `sh` (shell) binary within the container running the `redis-cache` service, which effectively gives us shell access. If we execute `ps ax`, we see that besides the processes that we started ourselves, `sh` and `ps ax`, the only other process is `redis-server`, listening on port `6379` on all interfaces. Let's exit by pressing `CTRL+D`.
To inspect the contents of Redis, issue the following command on the host computer.
```
$ docker compose exec redis-cache redis-cli
127.0.0.1:6379> keys *
1) "/home"
2) "/root"
127.0.0.1:6379> get /root
"[\".cache\", \".python_history\"]"
127.0.0.1:6379> get /home
""
127.0.0.1:6379>
```
Now we directly run `redis-cli` to jump to the Redis command-line prompt inside the container. Then we list all keys using `keys *` and inspect the contents under keys `/root` and `/home`. We close the prompt with `CTRL+D`.
While this was anything but short, it only scratched the surface of what Docker is. For further reference, consider visiting the Docker documentation.