Build/DevOps @ R3 – Operation Automation
March 16, 2023
By: Maciej Swierad – DevOps Engineer at R3
I, for one, welcome our new robot overlords. For they are the future, and the future is automated.
George Devol
Have you ever wondered how software actually makes its way into the real world? How it’s published and distributed? How it’s built and secured? Have you thought about how the sysadmin role has been retired and moulded into DevOps, and what that means? Well, have I got the post for you!
Settle down, grab a cup of tea, and bear with me as I talk you through build engineering and DevOps while tossing in my grad perspective on what it all is. However, if you’re a savvy build engineer already, or simply like seeing real-world examples, just jump ahead to the heading “Real World example – GitHub repo creation automation”.
Why We Automate
Automation reduces the time and effort required to build and deploy software, which improves the overall efficiency of the software development process. That frees up time and resources and allows teams to focus on other important tasks. In an ideal world, a software release is just one click away!
Automation also helps ensure that the build and deployment process is consistent and repeatable, which improves the reliability and stability of the software. This is especially important in a cloud-native environment, where the software may be deployed across multiple cloud environments or services. It is often a security requirement too. Think about it: what happens if you’re a financial entity using a product that has been shipped to you, but the sender does not have a reliable and repeatable build system, so the product you received is not exactly the same as what was tested by QA and signed? Suddenly, you’re looking at a costly mistake. Fret not, however; good build automation will save you from such a scenario.
We employ infrastructure as code (IaC) and configuration as code (CasC) to manage and deploy the infrastructure and configuration of the software in a consistent, automated way. This ensures that the environments used to build, test, and deploy the software are replicable, meaning they can be stood up consistently and reliably in any other environment with the same configuration and settings.
What is build engineering/DevOps in Build?
One of the most common problems in software engineering is ensuring that your software is delivered reliably. You want to prevent issues like those described above, take the weight of building and deploying off your engineers, and adhere to security best practices.
That’s where build engineers step in! Build engineers are responsible for creating and maintaining the processes and tools used to build, test, and deploy software applications. This involves writing scripts and automation tools to automate the build and deployment process, setting up and configuring continuous integration and continuous delivery (CI/CD) systems, and working with other teams to ensure that the software build process is efficient and reliable. DevOps engineers within the build team at R3 are also responsible for the smooth deployment and maintenance of the services used by developers, such as Artifactory, Jenkins instances, and whatever else may be needed.
One of the most common points of attack on a company is the build pipeline: the CI/CD system. Look at the recent incident in which the US government’s no-fly list was leaked; it happened through a vulnerable Jenkins instance. Someone at a major US airline was running a Jenkins instance with anonymous admin access, and all the malicious actor had to do was scrape the credentials from it to have everything they needed to access the no-fly list.
One of the jobs we are tasked with as DevOps engineers at R3 is to make sure that can’t happen. We need to make sure our pipelines are secure: locked behind VPNs, with credentials obfuscated, amongst other security practices. We must do this while still focusing on continuous development. In my eyes, having become a DevOps engineer relatively recently, our greatest strength is versatility.
Infrastructure/Configuration as code
In build engineering and DevOps, automation is the key to improving the reliability and efficiency of the build and deployment process. There are a variety of tools and techniques used to automate various aspects of the process, such as provisioning infrastructure, configuring applications, and deploying code. Some common tools and techniques for automating build and DevOps processes include:
- Infrastructure as code (IaC) tools, such as Terraform, are used to define and manage your cloud infrastructure in a declarative and version-controlled configuration file.
- Configuration management tools, such as Ansible and Puppet, are used to define and manage the configuration of your applications and services in a declarative and version-controlled playbook.
- Container orchestration tools, such as Kubernetes, are used to define and manage your containerized applications and their dependencies in a declarative and version-controlled configuration file.
By using these tools and techniques, build and DevOps engineers can automate many of the repetitive and error-prone tasks involved in building and deploying applications, and can focus on delivering value to their users.
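To make the first of those concrete, here is a minimal, hypothetical sketch of what declarative, version-controlled infrastructure looks like in Terraform using the GitHub provider. The repository name and settings are illustrative assumptions, not taken from any real configuration.

terraform {
  required_providers {
    # The GitHub Terraform provider, used here purely for illustration.
    github = {
      source = "integrations/github"
    }
  }
}

# A repository declared as code: running `terraform plan` shows exactly
# what would change before anything is applied.
resource "github_repository" "example" {
  name        = "example-service" # illustrative name
  description = "Created and tracked by Terraform"
  visibility  = "private"
  auto_init   = true # create an initial commit so a default branch exists
}

Because this definition lives in version control, changing the infrastructure means raising a reviewed pull request rather than clicking around in a UI.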
Real World example – GitHub repo creation automation
When I joined R3, I had a picture of an ideal world where repositories are created automatically based on service desk tickets and managed automatically with Terraform.
In my mind, this would ensure that good practice, such as branch protection, required reviewers, CODEOWNERS files, and PR gates, is applied everywhere. When I brought this idea forward to my manager, it turned out it was something they had thought of but never implemented company-wide. And so, it ended up being one of the cooler projects I’ve recently worked on.
Using Terraform, Atlantis, and Jenkins, we implemented a system where, once a user raises a ticket to request a new repository, the person responding to the ticket kicks off a Jenkins pipeline that creates a JSON file with the repository details, pushes the file to GitHub, and opens a pull request. Atlantis then picks up the pull request and runs its plan, which is essentially just a terraform plan. If we are happy with the proposed outcome of the plan, the plan is applied and the repository is created.
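As a rough sketch of how that JSON hand-off can feed Terraform (the repos/ directory layout and field names below are assumptions for illustration, not our exact setup), each committed JSON file can be decoded into the map that drives the repository resource:

locals {
  # Hypothetical layout: one JSON file per requested repository, e.g. repos/my-new-repo.json
  repo_definition_files = fileset(path.module, "repos/*.json")

  # Decode each JSON file and key the result by repository name.
  github_repositories = {
    for f in local.repo_definition_files :
    jsondecode(file("${path.module}/${f}")).name => jsondecode(file("${path.module}/${f}"))
  }
}

# Illustrative repository resource; the real one carries many more settings,
# such as branch protection and required reviewers.
resource "github_repository" "github_repositories" {
  for_each = local.github_repositories

  name        = each.value.name
  description = each.value.description
  visibility  = each.value.visibility
}

Atlantis then only needs to run the plan against the pull request branch and post the result back for review before anything is applied.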
With this approach, we can also seed each new repository with standard template files using the github_repository_file resource.
resource "github_repository_file" "github_repositories" {
for_each = {
for item in flatten([
for repo in github_repository.github_repositories : [
for key, value in local.standard_github_repository_files : {
repository : repo.name
branch : repo.default_branch
file : value.file
content : value.content
commit_message : "Adding basic ${key}"
}
] if !repo.archived
]) : "${item.repository}:${item.file}" => item
}
repository = each.value.repository
branch = each.value.branch
file = each.value.file
content = file(each.value.content)
commit_message = each.value.commit_message
overwrite_on_create = false
lifecycle { ignore_changes = all }
}
To create these files, we define their locations in the local.tf file. The automation then iterates through each referenced file, reads its content, and creates the file in the new repository.
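The exact contents of our local.tf are internal, but a hypothetical local.standard_github_repository_files that would work with the resource above might look like this (the file names and template paths are assumptions for illustration):

locals {
  # Map of standard files to seed into every new repository.
  # "file" is the path inside the repository; "content" is a path to a local
  # template that is read with file() in the resource above.
  standard_github_repository_files = {
    codeowners = {
      file    = "CODEOWNERS"
      content = "templates/CODEOWNERS"
    }
    gitignore = {
      file    = ".gitignore"
      content = "templates/gitignore"
    }
    readme = {
      file    = "README.md"
      content = "templates/README.md"
    }
  }
}

With the commit message pattern above, the CODEOWNERS entry, for example, lands in each new repository with the message “Adding basic codeowners”.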
For the keen-eyed out there, you may have noticed that the lifecycle block in the resource above is set to ignore all changes. When we first implemented this automation, we not only created repositories but also tried to manage them. This proved problematic, to say the least. The unknown factor that is “human interaction” proved to be detrimental to managing repositories! People would make changes that seemed trivial, such as changing the default branch name. However, when Terraform next checked its state, suddenly the branch protection seemed to be gone, the branch appeared to have been deleted, or other problems occurred. In short, Terraform couldn’t handle changes made outside of its state. (And for anyone who has had the displeasure of fixing Terraform state by hand, you have my condolences.)
After much deliberation, we decided to ignore changes for every resource in the Terraform deployment. Unfortunately, ongoing management is not something that should be done automatically, at least not yet! Instead, we now focus on creating repositories from a template as described above.
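In practice, ignoring changes for every resource just means the same lifecycle block appears on each one. Continuing the illustrative repository sketch from earlier:

resource "github_repository" "github_repositories" {
  for_each = local.github_repositories

  name        = each.value.name
  description = each.value.description
  visibility  = each.value.visibility

  # Terraform creates the repository but never tries to revert changes
  # that people later make through the GitHub UI.
  lifecycle {
    ignore_changes = all
  }
}

Terraform still creates everything from the template, but day-to-day changes stay with the repository’s owners.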
My future work on this project will include creating a coherent set of security levels per repository and enforcing their management through a non-automated repository. Pro tip: if you manage more than a handful of repositories with a system like this, it ends up taking quite a while to check each repository’s state.
Conclusion
In conclusion, in this new world of DevOps and continuous development and integration, not only are build engineers and DevOps engineers a necessity, but so is automation! The more human error we can remove from day-to-day tasks, the better the development lifecycle becomes.
I look forward to continuing my journey down this path, and hope to see you back for my next appearance!
Learn more about the deployment of Corda in this blog post.