The use of Docker containerisation has become quite common, especially in production. Docker Swarm is a clustering and scheduling tool for Docker containers. With the help of Swarm, sysadmins can deploy and manage a cluster of Docker nodes as a single virtual system.
Docker is a tool intended to make creating, deploying and running applications easier by using container-based virtualisation. It is an open source container technology that allows far more applications to run on the same servers than traditional VMs do.
The Docker engine
The Docker engine is the core component responsible for building Docker images and running them as containers.
The core components of the Docker engine are:
- The Docker daemon
This is a continuously running program (daemon) that manages Docker objects such as images, containers, networks and volumes.
- The REST API
This specifies the interfaces that programs can use to talk to the daemon and instruct it what to do.
- The Docker client
This is the command line interface (the docker command) used to interact with the daemon.
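As a minimal sketch of how these three components relate, the functions below ask the daemon for its version in two ways: through the Docker client, and by calling the REST API directly. The function names are illustrative only, and the socket path /var/run/docker.sock is the usual Linux default, assumed here; it may differ on your system.

```shell
# Assumed default location of the daemon's Unix socket on Linux.
DOCKER_SOCK="${DOCKER_SOCK:-/var/run/docker.sock}"

# Ask the daemon for its version through the Docker client.
version_via_client() {
  docker version --format '{{.Server.Version}}'
}

# Ask the same daemon by calling its REST API directly with curl.
# The response is a JSON document that includes the version.
version_via_api() {
  curl --silent --unix-socket "$DOCKER_SOCK" http://localhost/version
}

# Usage (on a machine with a running daemon):
#   version_via_client
#   version_via_api
```

Both calls end up at the same daemon; the client is simply a convenient front end to the REST API.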
Docker networking comes into the picture in large-scale, real-world scenarios. It lets containers communicate and share data with one another.
The host and containers in Docker are linked on the basis of a 1:N relationship, which means one host can command multiple containers.
Before we get into the details of Docker Swarm, let us understand some of the basic concepts of a cluster.
Cluster management is the practice of managing one or more cluster nodes. This is done using a cluster manager and an agent. A cluster manager is simply software with a GUI or CLI.
The clustering tool
A clustering tool is software that manages a set of resources through a single point of command. In our case, these resources are containers. Distributing the workload across a distributed system or cluster by hand, for example, is very tedious in large enterprise systems. The clustering tool eases matters by automating this task. The administrator just specifies details such as the cluster’s size, its settings and some advanced features. Everything else is taken care of by the clustering tool. Docker Swarm is one such clustering tool for containers.
The need for a container orchestration system
Container management becomes very difficult in a large-scale distributed system that involves hundreds of containers.
A few of the key activities are:
- Scaling the number of containers based on the peak load
- Performing rolling updates for containers
- Performing health checks on containers
Docker Swarm is a cluster management and orchestration tool that is built into the Docker engine. A Swarm is a cluster of Docker engines, or nodes, on which you deploy services.
Swarm mode is built using SwarmKit (a toolkit for orchestrating distributed systems). It can be enabled either by initialising a new Swarm or by joining an existing one.
How Docker Swarm evolved
In early 2014, Docker developed a cluster management system with a communication protocol known as Beam. Later, with the Docker API, a daemon was introduced to communicate with a distributed system. This project was named Libswarm, and its daemon was called Swarmd.
In November 2014, the Docker team retained the concept of cluster communication with additional remote APIs and named this Swarm. This first generation was called Swarm v1.
In February 2016, Swarm v1 was completely redesigned to overcome certain limitations and was named Swarm v2. In June 2016, SwarmKit, an orchestration toolkit for distributed services, was integrated into the Docker engine. This version is also known as ‘Swarm mode’.
Features of Docker Swarm
The features of Docker Swarm are:
- Decentralised design: You can manage the nodes in the Swarm cluster through Swarm commands. This gives a single point of access to build the entire Swarm.
- Scaling: You can specify the number of tasks for every service. When you scale a service up or down, the Swarm manager automatically adds or removes tasks to match.
- Desired state reconciliation: The Swarm manager continuously monitors the cluster state and reconciles any difference between the actual state and the desired state.
- Multi-host networking: When you specify an overlay network to connect your services, the Swarm manager assigns addresses to the containers on the overlay network once you create/update the containers.
- Service discovery: Swarm manager nodes allocate a unique DNS name for the services. You will be able to find containers in the swarm through the DNS server embedded in the swarm.
- Load balancing: SwarmKit has an internal load balancer which distributes the service containers within nodes. You can include an external load balancer as well.
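The multi-host networking and scaling features above can be sketched with a few standard docker commands. The wrapper function names, and the names app-net, web and nginx in the usage comment, are illustrative assumptions rather than anything prescribed by Docker.

```shell
# Create an overlay network that spans the nodes of the Swarm.
# $1 = network name
create_overlay_network() {
  docker network create --driver overlay "$1"
}

# Create a replicated service attached to that network.
# $1 = service name, $2 = network name, $3 = image
create_service_on_network() {
  docker service create --name "$1" --network "$2" --replicas 2 "$3"
}

# Change the desired number of tasks; the manager adds or
# removes tasks until the actual state matches.
# $1 = service name, $2 = number of replicas
scale_service() {
  docker service scale "$1=$2"
}

# Usage (on a manager node):
#   create_overlay_network app-net
#   create_service_on_network web app-net nginx
#   scale_service web 5
```

Other services attached to the same overlay network should then be able to reach this one by its service name, through the DNS server embedded in the Swarm.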
Before we dive into the architecture of Swarm, let’s look at some of the key concepts in Docker Swarm.
- Docker node
- Docker service
- Docker tasks
A Docker node is an instance of the Docker engine that participates in the Swarm. In real-world deployments, these nodes are typically distributed across multiple clouds as well as physical machines.
There are two kinds of Docker nodes:
- Manager node
- Worker node
Manager node: The manager node is responsible for all the orchestration and container management activities required to keep up the desired system state.
Worker node: The worker node executes the tasks assigned by the manager node.
A service is the definition of the tasks that have to be executed. You create a service by specifying the image name and other additional parameters. In most cases, a service is the image for a microservice of some larger application. Examples include an HTTP server, a database, or any other kind of executable program that you would run in a distributed system. A task, in turn, carries a container and the command to run inside it; it is the unit of work that the manager assigns to a node.
The node is the key member of a Docker Swarm. A Swarm can have more than one manager node. In this case, the manager nodes elect a single leader using the Raft algorithm, and the leader conducts the orchestration tasks.
By default, manager nodes also act as worker nodes. However, you can configure them as manager-only nodes, which prevents them from running service tasks.
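One way to make a manager a manager-only node is to set its availability to drain, so that the scheduler stops placing service tasks on it. The sketch below wraps the standard docker node update command; the function names are illustrative, and the node name in the usage comment is a placeholder.

```shell
# Stop the scheduler from placing service tasks on the given node.
# Tasks already running there are rescheduled onto other nodes.
# $1 = node name or ID
make_manager_only() {
  docker node update --availability drain "$1"
}

# Allow the node to receive service tasks again.
# $1 = node name or ID
allow_tasks() {
  docker node update --availability active "$1"
}

# Usage (on a manager node):
#   make_manager_only manager1
#   allow_tasks manager1
```

A drained manager still takes part in Raft and in scheduling decisions; it just does not run service containers itself.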
The Docker Swarm API is another key component of the Docker Swarm architecture.
Swarm exposes a set of remote APIs similar to those of Docker. This enables all kinds of Docker clients to connect to Swarm. In addition, the Swarm API provides information about the cluster.
Docker API information is restricted to one single Docker engine, whereas the Docker Swarm API provides the information on the cluster of engines, the number of nodes available in a cluster, and the node details.
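In Swarm mode, this kind of cluster information can be read through the ordinary docker client. The following is a small sketch, with illustrative function names, that assumes it is run on a manager node (node counts and node listings are only available there).

```shell
# Summarise the Swarm as seen by the local engine:
# its own state, plus the number of nodes and managers.
swarm_summary() {
  docker info --format 'state={{.Swarm.LocalNodeState}} nodes={{.Swarm.Nodes}} managers={{.Swarm.Managers}}'
}

# Print one line of details per node in the cluster.
node_details() {
  docker node ls --format '{{.Hostname}}: {{.Status}} {{.Availability}} {{.ManagerStatus}}'
}

# Usage (on a manager node):
#   swarm_summary
#   node_details
```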
Manager nodes use the Raft consensus algorithm to manage the cluster state internally.
This ensures that all the manager nodes that schedule and control tasks in the cluster store a consistent state.
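Raft can only agree on the cluster state while a majority (quorum) of managers is reachable, which is why the number of manager nodes matters. The arithmetic can be sketched as follows; the function names are illustrative, and the formulas are the standard majority rule used by Raft.

```shell
# Quorum needed among N managers: floor(N/2) + 1
raft_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of managers that can fail without losing quorum: floor((N-1)/2)
raft_fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 1 3 5 7; do
  echo "$n managers: quorum $(raft_quorum "$n"), tolerates $(raft_fault_tolerance "$n") failure(s)"
done
# Prints:
#   1 managers: quorum 1, tolerates 0 failure(s)
#   3 managers: quorum 2, tolerates 1 failure(s)
#   5 managers: quorum 3, tolerates 2 failure(s)
#   7 managers: quorum 4, tolerates 3 failure(s)
```

Note that an even number of managers adds no extra tolerance: four managers need a quorum of three and still tolerate only one failure, which is why odd numbers are generally preferred.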
Docker Swarm: Commands
Let us now get acquainted with Docker Swarm commands.
To initialise Swarm, use the following command:
docker swarm init
Note: You can configure the manager node to advertise a specific IP address to the other nodes in the Swarm, as shown below:
docker swarm init --advertise-addr <ip-address>
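The other way to enable Swarm mode, joining an existing Swarm, can be sketched as below. The function names are illustrative; the token and manager address are placeholders that you would take from your own cluster, and 2377 is the default Swarm management port, assumed here.

```shell
# On a manager: print the complete "docker swarm join" command,
# including the token, that a new worker should run.
show_worker_join_command() {
  docker swarm join-token worker
}

# On the machine that should join the Swarm as a worker.
# $1 = join token, $2 = IP address or hostname of a manager
join_swarm() {
  docker swarm join --token "$1" "$2:2377"
}

# Usage:
#   show_worker_join_command                      # on the manager
#   join_swarm <token> <manager-ip-address>       # on the new node
```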
For help with Docker Swarm, use the following command:
docker swarm --help
The above command lists the basic Swarm commands with their usage details.
To list the nodes currently in the Swarm, use the following command:
docker node ls
Deploying a service with constraints
You can add your own constraints while deploying a service. This ensures that the tasks of the service are scheduled only on nodes that meet the constraint. Such a constraint is known as a placement constraint.
In the following example, you add a constraint to deploy a service only on nodes that carry the label disk=ssd, that is, nodes marked as having SSD storage.
docker service create --name testService --replicas 2 --constraint node.labels.disk==ssd tomcat
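For this constraint to be satisfiable, at least one node has to carry the disk=ssd label. A minimal sketch of adding the label and then checking where the tasks were placed is given below; the function names are illustrative, and the node name in the usage comment is a placeholder.

```shell
# Attach the label disk=ssd to a node so that it matches the
# constraint node.labels.disk==ssd.
# $1 = node name or ID
label_node_as_ssd() {
  docker node update --label-add disk=ssd "$1"
}

# Show on which node each task of a service is running.
# $1 = service name
show_task_placement() {
  docker service ps --format '{{.Name}} -> {{.Node}}' "$1"
}

# Usage (on a manager node):
#   label_node_as_ssd worker1
#   show_task_placement testService
```

If no node matches the constraint, the tasks of the service stay in the pending state until a matching node becomes available.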