Docker and Friends: Why You Need Container Technology on Your Projects
Docker has remained immensely popular ever since it entered the application deployment arena in 2013. Yet not all development teams use containers to package their applications, which is, charitably speaking, a bit surprising. Containerization offers several significant advantages over traditional application deployment, and we’ll walk you through them in this article.
Traditional approaches to application deployment
There are several approaches to traditional application deployment. Typically it looks like this: you have a server with an operating system installed, and you use a CI/CD pipeline to deploy your application or product onto it.
But depending on your infrastructure, your application may work on several servers or you might need to launch different application replicas at once. This is where the traditional deployment approach displays its peculiarities.
The main features of traditional deployment
Even though traditional deployment works well, there may be cases when you need more than it can offer. It also brings several risks that may seriously threaten the application’s security and performance in the future.
Let’s look at the following example. We have one server and for some reason, we need to start two replicas of the same backend application on this server, with the application’s source code being stored on this server as well. Or we might decide that we want to have two separate environments on one server, which is a valid use case too.
Whichever option we choose, we’ll face a problem. With traditional deployment, it’s difficult to start the same version of a backend application twice on one server with the same parameters and environment variables, simply because the first replica already occupies the port the application listens on. To start a second replica, you’ll have to configure it to use a different port. This is the lack of isolation issue: identical application replicas cannot run side by side on the same server without extra configuration.
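The port conflict is easy to reproduce without any real backend. Here is a minimal Python sketch in which two plain TCP sockets stand in for the two replicas; the function name is ours, not part of any tool mentioned in this article.

```python
import socket


def second_bind_fails(host: str = "127.0.0.1") -> bool:
    """Return True if a second socket cannot bind to a port that is already taken."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for any free port; the first "replica" takes it.
        first.bind((host, 0))
        first.listen()
        port = first.getsockname()[1]
        try:
            # The second "replica" has the same configuration, hence the same port.
            second.bind((host, port))
        except OSError:
            # Typically reported as "Address already in use".
            return True
        return False
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print("second replica blocked:", second_bind_fails())
```

On a typical Linux host the second bind is rejected, which is exactly what happens to a second application replica started with unchanged settings.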
If you run the next application version on a different server, chances are high that the environment of that server will differ from the initial one. The differences may show up in installed distribution packages, dependency versions, or runtime engine versions. This lack of consistency may lead to issues down the line, because in most cases it is very difficult to set up two identical environments by hand.
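To make this kind of environment drift visible, you can compare the package versions reported by two servers. The sketch below assumes you have already collected each server’s packages into a simple name-to-version mapping; the function name and the sample versions are purely illustrative.

```python
from __future__ import annotations


def diff_environments(
    a: dict[str, str], b: dict[str, str]
) -> dict[str, tuple[str | None, str | None]]:
    """Return packages whose version differs between two servers.

    A value of None means the package is not installed on that server.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    # Hypothetical package lists, for illustration only.
    server_a = {"openssl": "3.0.2", "python3": "3.10.12"}
    server_b = {"openssl": "3.0.13", "python3": "3.10.12", "libpq5": "14.9"}
    for name, (old, new) in diff_environments(server_a, server_b).items():
        print(f"{name}: {old} -> {new}")
```

Even a short diff like this is a hint that the same application build may behave differently on the two machines.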
Let’s assume that the server stops working at some point: your critically important application stops working too. If you don’t have backups, recovering from the server crash might take a significant amount of time, during which you may run into many errors and conflicts.
Another scenario: your application has not been updated for a year, and during this year many things have changed (e.g. dependencies and frameworks that you used before are no longer supported or updated). Now imagine that you need to roll back to the initial version of the app and to the same parameters that it had a year ago. In this case, the recovery process becomes especially painful, since some tools may have become outdated and you won’t be able to work with them anymore.
The application consumes a certain amount of resources during and after launch. The amount it can consume is limited only by the server on which it runs. And here is where the bottleneck hides: excessive resource consumption, caused for example by a memory leak, may leave the application unable to cope with the load efficiently.
If we take this example further, suppose that we have several applications working on one server and a memory leak occurs in one of them. This leak may bring down all applications (and even all system components) on the server.
Most programming languages and their frameworks have mechanisms that can stop unwanted behavior (e.g. excessive resource consumption) to a certain extent, but not everyone takes full care of that. So it goes without saying that you have to set up monitoring and alerts for unwanted behavior, but you’ll need a dedicated and experienced specialist to do this.
An application is always launched by a certain user. This user has a certain set of rights on the server and, depending on these rights, the processes they start can ask the operating system’s kernel to perform different operations.
But in most cases, the user does not need so many rights. To enhance the security of the application and reduce vulnerabilities and risks, you should configure user rights so that the user has only those rights needed to start the application and ensure it works as intended.
But what if you have hundreds of servers and need to configure user rights for each? In this case, automation is your best friend but you’ll need an experienced specialist and a certain amount of time for that.
Containers: a new approach to application deployment
Now that we are clear that a traditional application deployment process has certain flaws, a question arises: what is a suitable alternative? The answer is containers.
A container is an executable unit of software that packages the application’s code together with its dependencies and provides an isolated environment for the application to run in. Containers allow apps to run anywhere, from laptop to cloud, and that’s one of their biggest advantages. Examples of containerization solutions include Docker, Podman, containerd, LXC, LXD, Kata Containers, and rkt (which is now discontinued).
An important thing to remember here is the difference between containers and virtual machines. A container runs on top of your operating system’s kernel and contains only an application (or applications) and certain APIs and services that run in user mode. A virtual machine, in turn, runs a complete OS with its own kernel and in this way provides stronger isolation than a container, which may be needed in certain use cases.
How containers fix the application deployment issues
Due to their nature, containers are able to resolve the issues with application deployment that we mentioned above. Here is how they do it.
As we already stated, containerization allows you to launch an application (or several applications) in a container that is isolated from the server’s operating system. This isolation relies on the kernel’s namespaces feature, so that server ports, for example, won’t conflict even if you decide to launch two application versions at once. All you have to do is launch two identical containers, which are by default isolated from each other and from the main OS.
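As a rough illustration, here is how the two-replica scenario from earlier might look with Docker. The sketch only builds the `docker run` command lines; the image name `example/backend:1.0`, the container names, and the port numbers are hypothetical, and we assume the application listens on port 8080 inside the container.

```python
def docker_run_command(
    name: str, image: str, host_port: int, container_port: int
) -> list[str]:
    """Build a `docker run` command that publishes one container port on the host."""
    return [
        "docker", "run",
        "--detach",                                    # run in the background
        "--name", name,                                # unique container name
        "--publish", f"{host_port}:{container_port}",  # host port -> container port
        image,
    ]


if __name__ == "__main__":
    # Both replicas keep the same internal port; only the host mapping differs.
    for index, host_port in enumerate((8081, 8082), start=1):
        print(" ".join(
            docker_run_command(f"backend-{index}", "example/backend:1.0", host_port, 8080)
        ))
```

The application itself is not reconfigured: each replica still listens on 8080 in its own network namespace. To actually start the containers, the returned list could be passed to `subprocess.run(cmd, check=True)` on a machine with Docker installed.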
Every container is based on an image of a certain distribution (a base image). A developer usually installs the needed packages and dependencies on top of this base. Together, these operations capture the state of your application at a specific point in time, and this state is called an image. Building images this way resolves the reproducibility issue: no matter how much time has passed or what has happened to the application, we can always roll back to a previous state of the application and, correspondingly, of its source code.
Since the image reflects the state of the application at a certain point in time, it also resolves the portability issue. Regardless of which server we use to run the application in a container, it will work the same, no matter which packages and dependencies are installed on that server.
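The image-building steps described above (pick a base image, install pinned dependencies, add the application) are usually written down in a Dockerfile. The following sketch renders such a file from Python so the pinned inputs are explicit. The base image tag, the requirement, and the start command are illustrative assumptions, not a recommendation for your project.

```python
import json


def render_dockerfile(
    base_image: str, requirements: list[str], command: list[str]
) -> str:
    """Render a minimal Dockerfile with a pinned base image and pinned dependencies."""
    lines = [
        f"FROM {base_image}",  # the base image of a distribution
        "WORKDIR /app",
        "COPY . /app",         # the application's source code
        "RUN pip install --no-cache-dir " + " ".join(requirements),
        "CMD " + json.dumps(command),  # exec form is written as a JSON array
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile(
        "python:3.12-slim",    # assumed base image tag
        ["flask==3.0.0"],      # assumed pinned dependency
        ["python", "app.py"],  # assumed start command
    ))
```

Because every input is pinned, building from the same description a year later should yield an equivalent environment, as long as the referenced base image and packages are still available.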
Unique resource control
Another advantage of containerization is that it lets you control the resources for each application via a single interface, using a mechanism of the operating system’s kernel called cgroups. This mechanism tracks the amount of resources the application consumes and lets us configure consumption limits that the application cannot exceed.
Thanks to this resource control, the cgroups feature solves the abovementioned problem with memory leaks and their consequences. But you’ll need to configure the limits carefully: set them too low and you’ll end up with reduced application performance. One more important thing to note is that containerization adds very little overhead to resource usage on the server, since its mechanisms are features of the operating system’s kernel itself.
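To see what such limits look like from inside a container, here is a small sketch that parses the contents of the cgroup v2 files `memory.max` and `cpu.max`. We assume a host that uses cgroup v2, where these files are typically found under `/sys/fs/cgroup/` inside the container; the function names are ours.

```python
from __future__ import annotations


def parse_memory_max(text: str) -> int | None:
    """Parse cgroup v2 `memory.max`: a byte count, or 'max' for no limit."""
    value = text.strip()
    return None if value == "max" else int(value)


def parse_cpu_max(text: str) -> float | None:
    """Parse cgroup v2 `cpu.max` ('<quota> <period>') into a number of CPUs."""
    quota, period = text.split()
    return None if quota == "max" else int(quota) / int(period)


if __name__ == "__main__":
    # Sample file contents; on a real system you would read the files instead.
    print(parse_memory_max("536870912\n"))  # a 512 MiB limit, in bytes
    print(parse_cpu_max("50000 100000\n"))  # half of one CPU
```

With Docker, limits like these are usually set when starting the container through the `--memory` and `--cpus` options of `docker run`.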
Isolation also limits the set of capabilities, that is, the privileged operations a process may ask the operating system’s kernel to perform. By default, containers run with a reduced set of capabilities for security reasons (fewer rights = fewer chances for something to go wrong). And if your application does not need some of the remaining capabilities, there is always an option to drop them from the container. Therefore a superuser in the container does not equal a superuser on the server.
Since containerization offers a single interface for managing all containers on a local or remote server, this interface must itself be protected as well as possible.
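On Linux, the capabilities of a process are visible as a hexadecimal bitmask in the `CapEff` line of `/proc/<pid>/status`. The sketch below decodes such a mask for a handful of well-known capabilities. The two sample masks are assumptions for illustration: one is a value commonly seen for root on a recent kernel, the other a value commonly reported for a default Docker container, and yours may differ.

```python
from __future__ import annotations

# Bit positions of a few Linux capabilities (a subset, not the full list).
CAPABILITY_BITS = {
    0: "CAP_CHOWN",
    5: "CAP_KILL",
    7: "CAP_SETUID",
    10: "CAP_NET_BIND_SERVICE",
    12: "CAP_NET_ADMIN",
    13: "CAP_NET_RAW",
    21: "CAP_SYS_ADMIN",
}


def decode_capabilities(mask_hex: str) -> set[str]:
    """Return the known capability names whose bits are set in the mask."""
    mask = int(mask_hex, 16)
    return {name for bit, name in CAPABILITY_BITS.items() if mask & (1 << bit)}


if __name__ == "__main__":
    host_root = decode_capabilities("000001ffffffffff")       # assumed host root mask
    container_root = decode_capabilities("00000000a80425fb")  # assumed container mask
    print("only on the host:", sorted(host_root - container_root))
```

In this example, root inside the container lacks powerful capabilities such as `CAP_SYS_ADMIN`. The set can be narrowed further with the `--cap-drop` option of `docker run`.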
As you can see, containerization not only facilitates the process of application deployment but brings an additional security layer to your project. The main challenge is how to manage and orchestrate containers properly - and we’ll talk about that in our next article (and yes, there will be lots of talk about Kubernetes!)