Containers are a form of operating system virtualization. Unlike virtual machines, where a hypervisor abstracts the underlying hardware and presents virtual hardware to each guest operating system, containers share a single operating system and kernel. Containers offer several advantages over virtual machines: near-instantaneous startup times (the same as starting a process); reduced overhead, because eliminating duplicate guest operating systems allows significantly more containers than VMs to run on a given host; and reduced management, patching, and updating, because there is only one operating system to maintain.
Image Source: https://hub.docker.com/r/tplcom/docker-presentation/
Containers are far from new. The technology has been around in various forms in the Unix/Linux world since the early 1980s. The containers in use today are possible because of two relatively new features built into the Linux kernel:
Cgroups (control groups) – Limit, prioritize, and account for the resources (CPU, memory, disk I/O, and network) that a process can consume.
Namespaces – Take previously global system resources and isolate them to a particular set of processes, giving each process the appearance of having its own dedicated resource. The analogy in the networking world would be a VRF or multitenancy. Linux has 7 namespaces: mnt (mount points), pid (process IDs), net (network stack), ipc (interprocess communication), uts (hostname and domain name), user (user and group IDs), and cgroup (cgroup root directory).
These two features allow a program to run on a Linux host in such a way that it is isolated from all the other running programs. Linux exposes the kernel container features through an interface called LXC.
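Both mechanisms are visible from an ordinary shell on a Linux host (assuming procfs is mounted, as it is on virtually every distribution):

```shell
# Each symlink under /proc/self/ns is one namespace the current shell
# belongs to (mnt, uts, ipc, pid, net, user, cgroup on a modern kernel).
ls /proc/self/ns

# The cgroup hierarchies that account for and limit this process's resources.
cat /proc/self/cgroup
```

Every process on the host has the same set of entries; a containerized process simply points at a different set of namespaces than the rest of the system.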
Generally, containers are associated with the Linux operating system; however, Windows Server 2016 now offers similar functionality as well as Docker support.
Docker is an open-source project that automates the deployment of applications inside containers. The idea behind Docker is to "build, ship, and run anywhere," built around separation of concerns: developers can build an application anywhere, ship it via Docker Hub, and run it on any infrastructure that has the Docker Engine. The concept is illustrated in the graphic below:
Docker has been wildly successful and as a result enjoys wide support on both Linux and Windows platforms, in container orchestration solutions like Kubernetes, and in most public clouds, including AWS, Azure, and Google Cloud.
The Docker architecture is composed of three main components: images, containers, and registries.
Docker images use the Linux union file system (UnionFS), which allows individual layers to be combined into a single image as depicted below. The individual layers are combined to form a single file system. Changes can be made to one layer without affecting the others. The layers themselves are read-only, with a writable layer added on top when a container is created.
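The layering model can be sketched in a few lines of Python, using `collections.ChainMap` as a stand-in for the union mount: lookups fall through the stack of read-only layers, while writes land only in the topmost, writable layer. This is an analogy for illustration, not the actual UnionFS implementation:

```python
from collections import ChainMap

# Read-only image layers; ChainMap searches left to right, so the
# leftmost layer is the "top" of the stack.
base = {"/bin/sh": "shell", "/etc/hostname": "base"}
updates = {"/etc/hostname": "patched"}  # an upper layer shadows the base file

# Starting a container adds an empty writable layer on top.
writable = {}
container_fs = ChainMap(writable, updates, base)

print(container_fs["/bin/sh"])        # found in the base layer: shell
print(container_fs["/etc/hostname"])  # the upper layer wins: patched

# Writes only ever touch the writable layer; the image layers stay unchanged.
container_fs["/tmp/scratch"] = "new file"
print("/tmp/scratch" in base)         # False
```

This is also why many containers started from the same image are cheap: they share the read-only layers and differ only in their thin writable layers.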
A text file called a Dockerfile describes the layers and acts as a template to build a Docker image. The example below is a Dockerfile used to build an ACI toolkit image. This Dockerfile would create an image with 5 layers:
1) Ubuntu base image
2) Any updates found by running the apt-get update command
3) The apt-get install would create a layer where Git, Python, and python-pip are installed
4) The clone of the actual ACI toolkit
5) Any changes made by running the setup.py install command.
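A sketch of what such a Dockerfile might look like, with one instruction per layer; the base image tag, package names, and repository URL here are assumptions for illustration:

```dockerfile
# Layer 1: Ubuntu base image
FROM ubuntu:16.04
# Layer 2: refresh the package index
RUN apt-get update
# Layer 3: install Git, Python, and pip
RUN apt-get install -y git python python-pip
# Layer 4: clone the ACI toolkit source (URL assumed)
RUN git clone https://github.com/datacenter/acitoolkit.git
# Layer 5: install the toolkit
RUN cd acitoolkit && python setup.py install
```

An image would typically be built from this file with `docker build -t acitoolkit .` and started with `docker run -it acitoolkit`.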
A Docker container is simply a running instance of a Docker image. Multiple unique containers can be started from the same Docker image.
Swarm is Docker's integrated clustering solution. Before Swarm, each individual Docker host (a physical or virtual machine running the Docker Engine) operated as an autonomous system and was managed individually. Swarm mode provides the ability to cluster many hosts together and manage them through a single CLI on the Swarm manager.
The Swarm architecture is depicted in the diagram below. There are two types of nodes, Manager and Worker.
As depicted in the following diagram, the worker nodes simply perform tasks given to them by the dispatcher. In contrast, the manager nodes provide the API used to access the Swarm and handle orchestration, IP address allocation, task dispatching, and scheduling. Today, Swarm offers 3 scheduling strategies:
Bin pack – new containers are placed on the most loaded host
Spread – new containers are placed on the least loaded host
Random – new containers are placed on a random host
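The three strategies reduce to a simple choice over per-host load. A minimal sketch, not Docker's actual scheduler code (host names and the load metric here are invented for illustration):

```python
import random

def schedule(hosts, strategy):
    """Pick a host for a new container.

    `hosts` maps host name -> number of containers already running there.
    """
    if strategy == "spread":   # least loaded host, evening out the cluster
        return min(hosts, key=hosts.get)
    if strategy == "binpack":  # most loaded host, filling hosts one at a time
        return max(hosts, key=hosts.get)
    if strategy == "random":
        return random.choice(list(hosts))
    raise ValueError("unknown strategy: " + strategy)

hosts = {"node1": 5, "node2": 1, "node3": 3}
print(schedule(hosts, "spread"))   # node2
print(schedule(hosts, "binpack"))  # node1
```

Spread maximizes headroom on every host, while bin pack leaves whole hosts empty so they can be powered down or reserved for large workloads.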
The final component in the architecture is the quorum layer. The managers maintain an internal state store using the Raft consensus protocol. One manager is elected leader and handles the functions listed in the diagram. All of the non-leader managers are available as hot standbys, creating a fault-tolerant architecture.
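Raft requires a majority of managers to agree before the state store is updated, which determines how many manager failures a Swarm can survive. The arithmetic can be sketched in a couple of lines:

```python
def quorum(managers):
    """Smallest number of managers that constitutes a Raft majority."""
    return managers // 2 + 1

def tolerated_failures(managers):
    """Managers that can fail while the cluster can still reach quorum."""
    return (managers - 1) // 2

for n in (1, 3, 5, 7):
    print(n, "managers:", "quorum =", quorum(n),
          "tolerates", tolerated_failures(n), "failure(s)")
```

This is why odd manager counts are the norm: moving from 3 to 4 managers raises the quorum from 2 to 3 without tolerating any additional failures.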