Introduction
Containerization is the packaging of software code together with just the operating system (OS) libraries and dependencies required to run the code. Instead of virtualizing hardware, as traditional virtual machines do, we virtualize computing resources, such as CPU, memory, I/O, and network, and execute applications on the underlying operating system but within isolated environments called containers. Containers can run on end-user computers, edge IoT devices, or in the cloud, regardless of the vendor.
What are containers?
Containers are fully operational and portable executables that contain the application, its configuration files, libraries, and other dependencies, and thus keep all of that isolated from other processes on the host.
Images and containers
When discussing containers, it is worth making a distinction between an image and a container.
An image is (usually) a lightweight executable that contains everything an application needs to run: the application itself, the libraries, dependencies, configuration, etc. It can be specified as source code or as a binary; in the former case, it needs to be built before it can be run.
An image becomes a container when it is run; a single image can be executed multiple times, resulting in many containers, where each container is an independent running application.
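As a concrete illustration (a minimal sketch, assuming Docker is installed; the image name my-app and the file app.py are hypothetical), an image can be described in a small Dockerfile, built once, and then run any number of times, each run producing an independent container:

    # Dockerfile: a source-code description of the image
    FROM python:3.12-slim           # base image providing the language runtime
    COPY app.py /app/app.py         # the application code
    CMD ["python", "/app/app.py"]   # command executed when a container starts

    # Build the image once, then start two independent containers from it.
    docker build -t my-app .
    docker run -d --name instance-1 my-app
    docker run -d --name instance-2 my-app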
Why are containers useful?
One of the main uses of containers is to unify and streamline the development and the runtime environment.
Imagine a team of developers implementing a non-trivial web application, e.g. one that uses an application server, a database server, and a cache system. Each developer might have a slightly different development environment (a different version of the operating system, a slightly different network configuration, different versions of shared libraries, etc.), and likely none of them has a configuration identical to the production server. These differences are likely to cause issues when moving into production; ideally, every developer would have the same environment, and it would match the production environment.
This is where containers shine. A container bundles the entire application, the required dependencies, the network configuration, and other configuration resources into a single unit that can be easily moved to, and run on, any runtime that supports containers.
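As a rough sketch of how this might look in practice (assuming Docker; the image names, the application image my-web-app, and the ports are purely illustrative), each service runs in its own container on a shared network:

    # A private network lets the containers reach each other by name.
    docker network create webapp-net

    # One container per service: database, cache, and application server.
    docker run -d --name db    --network webapp-net -e POSTGRES_PASSWORD=example postgres:16
    docker run -d --name cache --network webapp-net redis:7
    docker run -d --name app   --network webapp-net -p 8080:8080 my-web-app

Every developer, and the production server, can start exactly the same set of containers, which removes an entire class of "works on my machine" problems.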
And while this is a common example, containers can be used in any context where portability and isolation are required.
What about virtual machines?
Containers seem similar to virtual machines, and to a degree they are; however, there is an important difference.
In classic virtualization, hypervisors virtualize or emulate hardware and allow users to run multiple isolated operating systems (e.g. Windows and Linux) concurrently. Containers, in contrast, share the underlying operating system and isolate application processes. This sharing tightly couples containers and the host: to run a Windows x86 container, you need a Windows x86 system; to run an ARM Linux container, you need an ARM Linux system; and so on.
This difference is emphasized when we compare start-up times and image sizes. Since virtual machines have to include the application and the entire operating system, they tend to take up much more space, sometimes on the order of gigabytes. Containers, on the other hand, package only the application, which is usually on the order of megabytes.
Similarly, since virtual machines need to boot the operating system before the application can run, the start-up time is usually on the order of seconds. In contrast, containers run on the native operating system and have nearly instant start-up time; often only a few milliseconds more than natively run applications.
Moreover, since a container is a bundle that contains the application code, dependencies and configuration, it can be straightforwardly versioned. And since the format is standardized, it can be manipulated (started, stopped…) with a standard interface, and run in almost any computing environment.
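For example (a hedged sketch using Docker's command-line interface; the image name and tag are arbitrary), versioning and lifecycle management reduce to a handful of uniform commands:

    # Images are tagged and versioned like any other build artifact.
    docker build -t my-app:1.2.0 .
    docker tag my-app:1.2.0 my-app:latest

    # Containers are controlled through the same standard interface,
    # regardless of what runs inside them.
    docker run -d --name my-app-prod my-app:1.2.0
    docker stop my-app-prod
    docker start my-app-prod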
Example containerization software
Here we list a few systems that provide virtualization at the level of the operating system.
FreeBSD jail and Linux chroot are probably the oldest mechanisms that allow (some level of) containerization. They work by running a program with a modified directory structure, so a program run with chroot will appear to have a different root directory than a program that is run directly.
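For illustration, assuming a minimal root filesystem has already been prepared under /srv/newroot (a hypothetical path containing its own /bin/sh and supporting libraries), a shell can be started with that directory as its root:

    # The new root must provide the binaries and libraries the program needs.
    sudo chroot /srv/newroot /bin/sh
    # Inside this shell, "/" refers to /srv/newroot on the host.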
Linux Containers, or LXC, is a full OS-level virtualization package that isolates applications on Linux. LXC makes use of two kernel features called control groups (or cgroups) and namespaces. Control groups are used to limit and prioritize computing resources, such as CPU, I/O, and memory, while namespaces isolate an application's view of the environment, such as process trees, user IDs, and mounts. LXD is an alternative Linux container manager that is built on top of LXC and aims to provide a better user experience.
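With the LXC userspace tools installed, creating and entering a container looks roughly like this (a sketch only; the container name is arbitrary, the distribution and release are illustrative, and the commands typically require root privileges or an unprivileged-container setup):

    # Create a container from a downloaded image, start it, and attach a shell.
    lxc-create --name demo --template download -- --dist ubuntu --release jammy --arch amd64
    lxc-start --name demo
    lxc-attach --name demo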
Docker is a set of tools that, similar to LXC, provides full OS-level virtualization. While early versions of Docker used LXC to execute containers, current versions rely on Docker's own libcontainer, an abstraction that supports a broader range of isolation technologies.
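The kernel features mentioned above are exposed directly through Docker's command-line interface; for instance, per-container resource limits (enforced via cgroups) can be set when a container is started. The values and image name below are illustrative:

    # Limit the container to one CPU and 256 MB of memory.
    docker run -d --name limited --cpus 1 --memory 256m my-app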