In Automation We Trust: What Is GitOps, After All?
In one of our previous articles, we talked about container orchestration and Kubernetes. Before that, we explained what containers actually are and what they are used for. Now it’s finally time to talk about the main approach to infrastructure automation - yes, we are ready to talk about GitOps.
Pull vs Push: know your model
Before discussing what GitOps is, it makes sense to first have a look at the two main models for secure and quick delivery of an application into the target infrastructure: Pull and Push.
With the Pull model, the solution that implements the approach takes changes from the target repository and deploys them to the infrastructure where your application runs. In simpler words, as one commenter on Stack Overflow put it, “"Pull" is the target repository grabbing your changes to be present there”.
This approach has the following advantages and aspects:
- Security in terms of access control. Because the Pull model implies that changes are fetched and then applied from within, only the solution that implements this model needs access to the infrastructure (or the part of it where the application runs). No external access to your infrastructure is required.
- Lightweightness: a Pull model-based solution normally consumes a small amount of resources.
- A single repository to store sensitive information. Developers usually use Git as the repository of choice for encrypted data, and you can use tools such as SOPS, Helm Secrets, Sealed Secrets or others, depending on your infrastructure.
- A need for DevOps engineers to be attentive when working with secrets, both in terms of secrets management (i.e. creating and updating them) and in terms of encryption.
- Depending on the secrets decoding solution, there might be additional costs involved.
- Vendor lock-in. You will be tied to the solution, or the ecosystem of solutions, that implements the Pull model.
- A single, most often declarative, language for describing the application deployment process.
- A weak connection between the CI (Continuous Integration) and CD (Continuous Delivery/Deployment) processes.
With the Push model, the solution that implements the approach applies changes to the target infrastructure where your application runs. Or, if we get back to the explanation from Stack Overflow again, “A "push request" would be the target repository requesting you to push your changes”.
This approach has the following advantages and aspects:
- Security in terms of access distribution. Because the Push model implies that changes are applied to the target from the outside, the solution that implements this model needs temporary or permanent access to the infrastructure (or the part of it) where the new app versions are delivered.
- Lightweightness: no extra resource consumption for secrets management.
- No vendor lock-in. That means you can use any solution that implements the Push model.
- There is a strong connection between the CI and CD processes.
- The credentials for accessing the infrastructure are stored in the CI solution that implements the Push model.
- Depending on the CI solution, you may need to learn how it works and what its syntax looks like.
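The difference between the two models can be sketched in a few lines of Python. This is only an illustrative toy, not how any real tool is built: `Repo`, `Cluster`, `PullAgent` and `PushPipeline` are hypothetical names, and the "token" stands in for whatever credentials your infrastructure uses. The point is where the credentials live: inside the infrastructure for Pull, inside the CI solution for Push.

```python
from dataclasses import dataclass


@dataclass
class Repo:
    """Hypothetical stand-in for a Git repository with the desired version."""
    desired_version: str


@dataclass
class Cluster:
    """Hypothetical stand-in for the target infrastructure."""
    access_token: str
    running_version: str = "none"

    def deploy(self, version: str, token: str) -> None:
        # Only callers that hold valid credentials may change the cluster.
        if token != self.access_token:
            raise PermissionError("invalid cluster credentials")
        self.running_version = version


class PullAgent:
    """Pull model: runs inside the infrastructure and fetches changes itself."""

    def __init__(self, cluster: Cluster, repo: Repo, token: str) -> None:
        self.cluster = cluster
        self.repo = repo
        self._token = token  # the credentials never leave the infrastructure

    def sync(self) -> None:
        self.cluster.deploy(self.repo.desired_version, self._token)


class PushPipeline:
    """Push model: runs outside (e.g. in CI) and stores cluster credentials."""

    def __init__(self, stored_token: str) -> None:
        self.stored_token = stored_token  # kept in the CI solution

    def deliver(self, cluster: Cluster, version: str) -> None:
        cluster.deploy(version, self.stored_token)


if __name__ == "__main__":
    repo = Repo(desired_version="1.1.0")
    cluster = Cluster(access_token="s3cret")

    PullAgent(cluster, repo, token="s3cret").sync()
    print(cluster.running_version)

    PushPipeline(stored_token="s3cret").deliver(cluster, "1.2.0")
    print(cluster.running_version)
```

In both cases the cluster ends up running the requested version. What changes is which component has to be trusted with access to it.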
You might now be wondering which model you should use. Both the Pull and the Push model have their strengths and weaknesses, so the choice depends solely on your goals and project requirements.
Now, what do these models have to do with GitOps? The thing is, GitOps follows the Pull model, so understanding the peculiarities of both models helps to understand GitOps itself.
GitOps defined and explained
GitOps is often called a combination of evolved Infrastructure as Code (IaC) and DevOps. The reason is that IaC implies managing the app’s infrastructure through code rather than through manual processes. GitOps takes infrastructure management further: you describe the desired state of the app declaratively, and the GitOps operator takes care of reaching it.
Another critical thing to know about GitOps is that this approach treats Git as the single source of truth. That means that any change made in Git will be applied to your application, and the app’s configuration will always be in sync with the state declared in Git. Synchronization is started by a trigger, which can be either a time interval or a webhook.
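The two kinds of trigger can be expressed as a single check. The function below is a minimal sketch with hypothetical names (`should_sync`, `webhook_received`). Real operators expose this as configuration rather than code.

```python
def should_sync(now: float, last_sync: float, interval: float,
                webhook_received: bool) -> bool:
    """Return True when either trigger fires.

    - webhook trigger: Git notified us that something changed;
    - time trigger: at least `interval` seconds passed since the last sync.
    """
    return webhook_received or (now - last_sync) >= interval


if __name__ == "__main__":
    # 30 seconds after the last sync with a 60-second interval: wait.
    print(should_sync(now=30, last_sync=0, interval=60, webhook_received=False))
    # Same moment, but a webhook arrived: sync immediately.
    print(should_sync(now=30, last_sync=0, interval=60, webhook_received=True))
```

The time-based trigger is what lets the system catch changes even when a webhook is lost. The webhook only shortens the delay.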
What does GitOps have to do with Kubernetes?
Because Kubernetes can be managed declaratively and supports continuous deployment, GitOps is a perfect operating model for it. And since Kubernetes is often named the most popular container orchestration tool, it is no wonder that GitOps is strongly associated with it.
However, GitOps is not limited to K8s - the approach is applicable to any infrastructure that can be managed declaratively, as described above. But to make the pros and cons of GitOps and its operating principles easier to follow, we’ll use Kubernetes as our example infrastructure.
Wait, but how does it work?
By now, you should have a general understanding of what GitOps is and why it’s often mentioned in conjunction with Kubernetes. Now let's have a closer look at the way the whole system operates.
Four main components take part in a GitOps workflow on a Kubernetes infrastructure:
- Git: a repository where you store, for example, your Deployment, Service and Ingress manifests. Remember that Git is a single source of truth and if you want to change the state of an app, you apply these changes in Git only.
- Kubernetes: a platform that manages your application and its infrastructure.
- GitOps operator: a specialized tool that is notified by Git when a change has been made and passes the desired infrastructure state on to Kubernetes. The GitOps operator is also responsible for checking whether the state declared in Git and the state in Kubernetes are in sync.
- Docker Registry: a storage and distribution system for named Docker images.
Say you have described the desired state of your application in Git. When a new Docker image appears, the GitOps operator pushes a new commit to the Git repository, reads the updated state from Git, and checks whether it corresponds to the current state of the application in Kubernetes. Whenever a change in Git happens, the GitOps operator passes it to Kubernetes and requests that it bring the app’s state to the desired one.
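The comparison step described above can be sketched as a small diff between two dictionaries. This is a simplified illustration, not the algorithm of any particular operator: `plan_changes` and the manifest names are hypothetical, and real tools compare full Kubernetes objects rather than plain dicts.

```python
from typing import Dict, List, Tuple

# Maps a resource name (e.g. "deployment/web") to its declared content.
Manifests = Dict[str, dict]


def plan_changes(desired: Manifests, actual: Manifests) -> List[Tuple[str, str]]:
    """Compare the state declared in Git with the state found in the cluster.

    Returns the actions needed to make `actual` match `desired`.
    """
    actions: List[Tuple[str, str]] = []
    for name in sorted(desired):
        if name not in actual:
            actions.append(("create", name))
        elif desired[name] != actual[name]:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


if __name__ == "__main__":
    git_state = {
        "deployment/web": {"image": "web:1.1", "replicas": 3},
        "service/web": {"port": 80},
    }
    cluster_state = {
        "deployment/web": {"image": "web:1.0", "replicas": 3},
        "ingress/old": {"host": "old.example.com"},
    }
    for action, name in plan_changes(git_state, cluster_state):
        print(action, name)
```

An empty list of actions means Git and Kubernetes are in sync. Anything else is the work the operator still has to hand over to Kubernetes.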
Most popular GitOps tools
The next question you may ask: what on Earth is a GitOps operator and how do I get started with GitOps?
The GitOps approach can be implemented with one of the following GitOps operators:
- ArgoCD: this tool places a special focus on Continuous Delivery and was designed to facilitate and automate the delivery process. It is also very lightweight and has a well-developed UI.
- FluxCD v2: this tool was built with the GitOps Toolkit and can sync an arbitrary number of Git repositories. It also supports multi-tenancy and overall provides easy and automated delivery.
- werf: this CLI tool is used to create pipelines that can be embedded into any CI/CD system of choice. werf is a rather versatile tool that also allows efficient delivery of your app to Kubernetes.
To get started with GitOps, you’ll need to install one of these operators in the Kubernetes cluster and configure it to work with the Git repository where your manifests are stored.
The benefits of GitOps
Okay, so GitOps brings automation and frees you from the need to manually declare changes in Kubernetes. How else does it help? Here are its biggest advantages:
- Ease of integration and use.
- Automated reconciliation of the current state of an application with the desired state under different conditions (creation, update, deletion).
- Security in terms of manual intervention into the Kubernetes cluster, since the GitOps operator will not let any such intervention persist.
- Repeatability: every subsequent reconciliation of the current state with the desired one will always give the same result. Mind that repeatability is provided not by the GitOps approach but by Kubernetes itself - more about it below.
- An option to receive information on whether the current state in Kubernetes matches the state in Git.
Mythbusting the biggest pros and cons of GitOps
GitOps evangelists claim that the GitOps approach is beyond awesome and will bring nothing but joy to your life. While GitOps is indeed highly valuable and has many great benefits, it is not perfect. Let’s have a look at its biggest pros and cons and the possible considerations.
With GitOps, automation means you won’t need to make any manual changes to Kubernetes itself, since these changes are already made in Git. You also won’t need to worry about manually synchronizing the states of an app in Git and Kubernetes, since the GitOps operator takes full care of it. Hence, automation is a big and very significant advantage of using GitOps.
Convergence means the system always strives to achieve the desired state, and even if desynchronization occurs, the system gets back to the state of sync between Git and Kubernetes. It is important to note that desynchronization may happen either because of manual interventions in Kubernetes (i.e. unauthorized ones) or because the changes made in Git have not yet been delivered to Kubernetes. Whatever the case, the GitOps operator is responsible for bringing the system back to the state of synchronization (just like with automation, the operator does all the work!).
Idempotence means we can repeat synchronization several times and the result will be the same every time. In this case, however, the credit goes to Kubernetes and its API rather than to the GitOps operator, since Kubernetes is the one responsible for idempotence in the system.
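A toy comparison shows why a declarative "set the state to X" operation is idempotent while an imperative "change the state by X" operation is not. The functions below are hypothetical and are not part of the Kubernetes API. They only mimic the two styles on a plain dictionary.

```python
def apply_declarative(state: dict, desired_replicas: int) -> dict:
    """Set the replica count to the desired value (idempotent)."""
    return {**state, "replicas": desired_replicas}


def apply_imperative(state: dict, extra_replicas: int) -> dict:
    """Add replicas on top of whatever is running (not idempotent)."""
    return {**state, "replicas": state["replicas"] + extra_replicas}


if __name__ == "__main__":
    start = {"replicas": 1}

    # Declarative: repeating the same sync changes nothing after the first run.
    once = apply_declarative(start, 3)
    twice = apply_declarative(once, 3)
    print(once == twice)

    # Imperative: repeating the same command keeps changing the result.
    once = apply_imperative(start, 2)
    twice = apply_imperative(once, 2)
    print(once == twice)
```

Because the operator only hands a desired state to Kubernetes, repeating a sync is safe. That safety comes from the declarative style of the API rather than from the operator itself.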
In terms of GitOps, observability means the ability of a DevOps engineer to learn at any time whether the system is synchronized and to get notified in case desynchronization happens. However, because the Docker Registry is also part of the system, we cannot say GitOps provides 100% observability. The GitOps operator can only report whether the Kubernetes state matches the Git state, and the Docker Registry is left out of this equation. In other words, we observe only one part of the overall state: the Git-Kubernetes relationship.
Determinism means the Kubernetes state is fully defined by the state declared in Git. However, this is not entirely true, since the Kubernetes state also depends on the Docker Registry. So if anyone changes an image in the registry, the Kubernetes state will be affected too.
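The dependency on the registry can be illustrated with a small simulation. Everything here is a hypothetical stand-in: `registry` is a plain dictionary mapping an image tag to a content digest, and `resolve` mimics the lookup that happens when the image is pulled. The sketch assumes the manifest in Git refers to the image by a tag that can be overwritten in the registry.

```python
import hashlib


def digest(image_bytes: bytes) -> str:
    """Content digest of an image, in the familiar sha256:<hex> form."""
    return "sha256:" + hashlib.sha256(image_bytes).hexdigest()


def resolve(image_ref: str, registry: dict) -> str:
    """Return the digest that would actually run for a Git-declared reference."""
    return registry[image_ref]


# The state declared in Git never changes in this example.
git_manifest = {"image": "web:1.0"}

# Toy registry: the tag initially points to build A.
registry = {"web:1.0": digest(b"build-A")}
before = resolve(git_manifest["image"], registry)

# Someone pushes a different image under the same tag.
registry["web:1.0"] = digest(b"build-B")
after = resolve(git_manifest["image"], registry)

if __name__ == "__main__":
    print(before == after)  # the running image changed although Git did not
```

One common way to narrow this gap is to reference images by digest instead of by tag, so that the exact image content is part of what is declared in Git. Whether that fits your workflow is a separate decision.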
Audit implies easy and centralized monitoring of all changes made to Kubernetes. While GitOps does allow monitoring changes, this covers only changes that are made in Git - once again, the Docker Registry is left out. As a result, a certain area of the system is not covered by GitOps and cannot be properly monitored. Hence, GitOps does not provide a 100% audit of the system but rather an opportunity to monitor those changes in Kubernetes that come from the Git repository.
Security is a rather controversial point. On the one hand, the GitOps operator does not allow users to apply changes to Kubernetes directly, so some call it an additional layer of security. But in reality, the user still applies changes to Git or to the Docker Registry, and this can be a weak spot in terms of security. Hence, implementing GitOps will not guarantee 100% security for your system, and you must look after the security of your CI pipeline in the first place.
To sum up the points above, introducing GitOps into your current processes will undoubtedly be beneficial and will have a positive impact on the performance of your system and on delivery speed. On the other hand, do not treat GitOps as a cure-all that will take full care of everything, including security. You will have to pay close attention to all components of your system, and it will take quite some time to set everything up before you get a truly secure and high-performing environment.