From Ground to Cloud: Adopting Cloud Native Patterns

Adopting Cloud Native Patterns to Improve Software Craftsmanship

Cloud Native Patterns

The term Cloud Native is applied to several things: Cloud Native Architecture, Cloud Native Applications, Cloud Native Development, Cloud Native Infrastructure, and Cloud Native Data.

Cloud Native Architecture is a modern approach to designing, building, and running applications that fully leverage the advantages of cloud computing. It focuses on creating loosely coupled, automated systems that can be quickly built, deployed, and managed in dynamic cloud environments.

Cloud Native Development uses cloud-based technologies to build applications that run on cloud infrastructure, and it relies on continuous integration and continuous delivery.

Cloud Native Applications consist of smaller interdependent services called microservices, built with the Twelve-Factor App methodology. They use containers to package these services so that they run consistently across different environments.

Is the Application Cloud Native or Cloud Enabled?

An application can run on cloud-based infrastructure without being cloud native. It may only be cloud enabled.

The application my team is modernizing is currently cloud enabled. It runs on multiple virtual machines, with separate virtual machines for the database and the application. It is a monolith that hosts several applications on Apache Tomcat. Even though it runs on virtual machines in the cloud, it is not cloud native, whereas our new modernized application uses many of the benefits of being cloud native.

Cloud Native features:

  • Designed and built to leverage cloud computing to its fullest
  • Microservices architecture
  • Flexible and resilient
  • Containerization
  • Improved scalability
  • Faster innovation
  • Improved reliability

Cloud Enabled features:

  • Designed for traditional software
  • Adapted to work on cloud infrastructure
  • Not optimized for cloud benefits
  • Monolithic architecture, tightly integrated but running on cloud servers
  • Limited scalability
  • Limited resource optimization
  • Slower to deploy due to hardware or software setup

Cloud Native Patterns

According to the Cloud Native Transformation Patterns website, there are about five categories of cloud native patterns. I will describe some of the patterns our team implemented over the last two years, along with the benefits and drawbacks of each.

Cloud Native Transformation Categories

Strategy & Risk Reduction

Strategy Patterns
Data Driven Decision Making

Our team organized our features into small iterations so that we could deploy the application one web page at a time. This gives us short feedback loops, so we get feedback on our development faster. We also prototype screens in Figma and then interview several users, asking them to critique each screen and tell us what was easy to use and understand and what needed improvement. We use that first feedback session to redesign the screen, or features on the screen, if necessary. All of this happens before we have written a single line of code.

Dynamic Strategy

When I first built the roadmap for migrating our application's features to the new target, my original strategy was to migrate our static product data before some of the other features. Looking at time to market, however, the most important features, such as work orders, would have taken too long to arrive, and the static data alone would not have provided any value to the business. So the team and I decided to switch strategy and migrate the most important features first, before the static data.

Learning Organization

Our team runs small experiments when we do not know enough to decide on an architecture design or software solution.

For example, we started implementing several of our services without a gateway manager, then quickly realized that we had too many DNS entries. The number would keep growing, making DNS entries and certificate creation complex and difficult to manage. We needed a simpler way to handle this.

We implemented the gateway manager which resolved several problems with our original architecture. This gateway manager allowed us to have internal routing using paths to specific microservices rather than using public DNS entries. The gateway manager also provided us better security for our APIs.
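As a rough illustration of the path-based routing described above, here is a minimal Python sketch. The route prefixes, the service names, and the `resolve_route` helper are all hypothetical and are not taken from our gateway configuration. A real gateway manager also handles authentication, rewrites, and load balancing, none of which is modeled here.

```python
from typing import Dict, Optional

# Hypothetical routing table: path prefix -> internal service name.
ROUTES: Dict[str, str] = {
    "/customers": "customer-service",
    "/work-orders": "work-order-service",
    "/products": "product-service",
}


def resolve_route(path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the internal service for a request path, or None if no route matches.

    The longest matching prefix wins, and a prefix only matches on a
    path-segment boundary: "/customers" matches "/customers/42" but not
    "/customers-archive".
    """
    best_prefix = None
    for prefix in routes:
        matches = path == prefix or path.startswith(prefix + "/")
        if matches and (best_prefix is None or len(prefix) > len(best_prefix)):
            best_prefix = prefix
    return routes[best_prefix] if best_prefix is not None else None
```

With a table like this, adding a service means adding one path entry behind a single public hostname instead of creating another DNS entry and certificate.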


Organization & Culture

Organization Patterns
Build-Run Teams

Before we transitioned to build-run teams, we had separate development, DevOps, and support teams. We found that we supported the application better overall once we combined them into one team. Developers who are also part of the support team have more opportunities to propose solutions to support issues and can quickly recreate incidents. We were able to resolve production bugs more quickly by providing a code fix.

Decide Closest to Action

As part of the Cloud Native Transformation and our Continuous Architecture approach at Michelin, I do not architect everything at the beginning of a project. I architect just enough to get the team started. When we need to make a decision on a feature or process, the teams discuss the proposed architecture to see whether it fits within our current scope. Deciding closest to the action, while the feature is being coded, has helped our team pivot around issues we found and roadmap changes more easily than if I had architected the entire application upgrade at the beginning.

Exploratory Experiments

Before adopting a new coding technique or software tool, your team might need to experiment with it to see whether it is worth the effort of putting it into your application. Our legacy application did not have event streaming tools or gateway managers. During our modernization we saw benefits in both, so we decided to try out event streaming with Kafka while we moved our data from the legacy application to the modern application in small iterations. This helped us keep our databases in sync and keep features in both the modern and legacy applications working as expected.

Remote Teams

We have a very geographically diverse team. Several team members are in the US but do not live near the North American headquarters, and many team members work from India. The teams have daily huddles with each other and the team leads to discuss issues and make sure nothing is blocking their work. When they do hit issues, many of the team members work in pairs to overcome these blockers.

Manage for Proficiency

Our team is split into two groups. At the beginning of our cloud native exploration, one group focused on support issues, keeping the existing legacy application running, and providing obsolescence upgrades to it, while the second group focused on delivering new functionality for our modern application.

As part of managing for proficiency, and to keep the second group focused on delivering new functionality, the product architect and I created thin slice spreadsheets that broke down the features and showed which features we could go live with. These spreadsheets kept the team informed about which deployment cycle each thin slice belonged to, so that they stayed organized and did not have to ask whether we had forgotten pieces of the functionality.

ThinSlices.jpg


Development & Design

Development Patterns
Communicate Through APIs

All of our new code has APIs for communicating with other services. We can use these APIs to provide data to other teams such as our data analytics team or even the customers themselves.

Microservice Architecture

We started with microservices because we were driven to have disparate systems. However, we realized that most of our services would be shared between applications, so creating a microservice for every feature would not be good practice. Instead, we decided to convert our microservices to modular monoliths, which separate functionality so that we can deploy to one application without affecting our other applications.

MiniServicesArchi.png

Strangle The Monolith

We slowly pull features out of the legacy application and deploy them in our new application. We turn off the old feature in the legacy application so that it runs only in the new application. This reduces the risk of a big bang rollout. I wrote about our first feature in a previous blog post called Anti Corruption Layers. It was a small feature that is used in several of our applications. We were able to strangle it out of the legacy application and recreate it in the modern application, where all the applications use it through APIs. Our second feature was the customer pages. Our customers were able to view and edit their existing customers in new modernized web pages. These pages offered a cleaner design and a few more features than the legacy application.

Strangle.png
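The cutover described above can be sketched as a small routing decision. This is only an illustration: the feature names, the `MIGRATED_FEATURES` set, and both helper functions are hypothetical, and in practice this decision would live in a gateway or proxy rather than in application code.

```python
from typing import FrozenSet

LEGACY_APP = "legacy"
MODERN_APP = "modern"

# Hypothetical set of features already strangled out of the monolith.
MIGRATED_FEATURES: FrozenSet[str] = frozenset({"customers"})


def target_application(feature: str, migrated: FrozenSet[str] = MIGRATED_FEATURES) -> str:
    """Send a request to the modern app only if its feature has been migrated."""
    return MODERN_APP if feature in migrated else LEGACY_APP


def migrate(feature: str, migrated: FrozenSet[str]) -> FrozenSet[str]:
    """Return a new set with one more feature moved to the modern app."""
    return migrated | {feature}
```

Each migration changes the routing for one feature only, which is what keeps the rollout incremental instead of a big bang.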

Leveraging Feature Flags

To keep our code base from getting too large, we implemented custom feature flags in our code so that we can continuously deploy to production without affecting our customers' applications. Once we finish developing a specific feature, we can turn it on by updating a feature flag field in the database. We implemented these flags so that we can push new changes to one specific customer, or to several customers at a time, to pilot the changes.

Once we are comfortable with the piloted changes we continue to push out customer by customer until we are confident the changes will not adversely affect our entire customer base. After a specified time period we remove the feature flag to clean up the existing code.

This has been one of the most effective strategies we have implemented that has saved us so much time and effort. Several times we have had issues that affected our customers so we would turn off the feature flag so the customer would be rolled back to the original code. With one database update, we reduced outage time from many hours to a couple of minutes.
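To make the idea concrete, here is a minimal sketch of a per-customer flag stored in a database, using Python's built-in `sqlite3` module with an in-memory database. The table layout, the column names, and the function names are assumptions made for illustration and do not reflect our actual schema or code.

```python
import sqlite3


def create_flag_store() -> sqlite3.Connection:
    """Create an in-memory database holding one flag row per feature and customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE feature_flags ("
        " feature TEXT NOT NULL,"
        " customer_id TEXT NOT NULL,"
        " enabled INTEGER NOT NULL,"
        " PRIMARY KEY (feature, customer_id))"
    )
    return conn


def set_flag(conn: sqlite3.Connection, feature: str, customer_id: str, enabled: bool) -> None:
    """Turn a feature on or off for one customer with a single database write."""
    conn.execute(
        "INSERT OR REPLACE INTO feature_flags (feature, customer_id, enabled)"
        " VALUES (?, ?, ?)",
        (feature, customer_id, int(enabled)),
    )
    conn.commit()


def is_enabled(conn: sqlite3.Connection, feature: str, customer_id: str) -> bool:
    """A missing row means the customer stays on the original code path."""
    row = conn.execute(
        "SELECT enabled FROM feature_flags WHERE feature = ? AND customer_id = ?",
        (feature, customer_id),
    ).fetchone()
    return bool(row[0]) if row else False
```

Application code would branch on `is_enabled(...)` to choose between the new and the original behavior. Piloting a change means calling `set_flag` for one customer, and rolling back means writing the same row with `enabled` set to false.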


Infrastructure & Cloud

Infrastructure Cloud Patterns
Risk Reducing Deployment Strategies

We use Kubernetes to deploy our containerized applications, and we use the RollingUpdate deployment strategy to minimize downtime. This ensures that at least one container is running at all times while the other containers are being updated.
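The guarantee depends on the replica count and on the `maxUnavailable` and `maxSurge` settings of the RollingUpdate strategy. The sketch below works through that arithmetic in Python, assuming the documented rounding rules (a percentage is rounded down for `maxUnavailable` and rounded up for `maxSurge`, with 25% as the default for both). The function names are hypothetical, the replica counts are examples rather than our real settings, and edge cases such as both values resolving to zero are not modeled.

```python
from typing import Tuple, Union


def _resolve(value: Union[int, str], replicas: int, round_up: bool) -> int:
    """Turn an absolute count or a percentage string such as "25%" into a pod count."""
    if isinstance(value, str):
        scaled = int(value.rstrip("%")) * replicas
        # Integer arithmetic avoids floating point rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return value


def rolling_update_bounds(
    replicas: int,
    max_unavailable: Union[int, str] = "25%",
    max_surge: Union[int, str] = "25%",
) -> Tuple[int, int]:
    """Return (minimum available pods, maximum total pods) during a rolling update."""
    unavailable = _resolve(max_unavailable, replicas, round_up=False)
    surge = _resolve(max_surge, replicas, round_up=True)
    return replicas - unavailable, replicas + surge
```

For example, `rolling_update_bounds(2, max_unavailable=1, max_surge=0)` gives a minimum of one available pod, which matches the "at least one container running" behavior described above.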

Observability

Observability tools are key for most modern applications. We use several tools to capture issues with our applications. They integrate well with our ticketing system, so the team is alerted quickly to issues that need a fast fix. They also help us find performance issues. Without these tools, we would not be alerted before our customers notice a problem, and that could affect our customer satisfaction scores.

Containerized Apps

All of our new applications are deployed in containers. This allows us to fully benefit from the features of containers and the Michelin Kubernetes environments. Containers ensure that an app runs the same way regardless of where it is deployed. Each container starts up quickly and can be deployed rapidly, and the containers can easily be scaled up or down based on load.

Custom Productivity Tools

There are several tools that our infrastructure team has created over the last several years to help accelerate our development lifecycle.

  • Self Service Infrastructure - We have a custom tool that allows architects or DevOps engineers to request new virtual machines or install new software on existing machines.
  • Automated Infrastructure - The self service application is backed by Ansible scripts and playbooks that automatically create the machines after architects have approved them.
  • Automated network security groups (NSGs) - Recently the cloud architects created automated task scripts that update our Azure NSGs, which keeps our security groups in place and easy to update.

Operations

Operations Patterns
Runbooks

Most developers hate to write documentation. When we did write it, however, we needed to keep it in an organized, easy-to-use repository. We currently store our documentation in Git and use Hugo to build the site. This allows me to easily share our documentation with other internal users who need information from our team.

GitOps

We store most of the configuration for our new services in a config server. We use Spring Cloud Config Server, so when a configuration changes we only have to push the change to Git and the application picks it up, without modifying and redeploying the application. The configurations for every environment are stored in our Git repository. Each service has a YAML file with parameterized values such as URLs of other services, filters, rewrite paths, or whitelisted URLs.
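One way to picture per-environment configuration is as shared defaults with environment-specific values layered on top. The Python sketch below illustrates that layering with plain dictionaries. It is not Spring Cloud Config's implementation, and the `merge_config` helper and the keys in the example are hypothetical.

```python
from typing import Any, Dict


def merge_config(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new config where override values win and nested sections merge recursively."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical shared defaults and a production override for one service.
defaults = {
    "routes": {"customers": {"url": "http://customers.internal", "rewrite": "/api"}},
    "allowed_urls": ["/health"],
}
production = {"routes": {"customers": {"url": "https://customers.example.com"}}}

effective = merge_config(defaults, production)
```

Here `effective` keeps the shared rewrite path and allowed URLs but uses the production URL, so only the values that differ per environment need to be written in the environment file.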

What Didn't Go Well

There are a lot of these transformations, and you can’t implement all of them at once. We had to pick some small, quick changes and some larger changes and implement them over a period of three years. Not everyone was on board, and some resisted change. Several changes didn’t work as expected.

For instance, some of our experiments took too long to implement. Our team was experienced in legacy technologies. We did not know the new gateway tools or event driven tools, and some of the team even had to learn new programming languages. It took us a while, and a lot of trial and error, to understand the tools and software and how to implement them using best practices.

The thin slice spreadsheets I referred to earlier in this blog are hard to create. It takes a lot of practice to break features down into small deployable chunks.

With some of the experiments we tried, we waited too long to decide whether to continue or move on. However, these experiments taught us how we wanted to go forward. When we realized we had a failure, we adjusted and continued to the next feature.

Too much innovation – It sounds counterintuitive to reduce innovation and creativity. I am not suggesting that the teams should have no autonomy or should stop innovating. We noticed that when we gave too much freedom to explore options or experiments, we got into a cycle of decision paralysis. We need the teams to be creative and innovative, but we need them to run small experiments in a controlled manner for a set period of time and then either make a decision or end the experiment.

Last Words

Not every application needs microservices or mini services. It is perfectly acceptable to stay on a monolith. Take speed to deployment into consideration. Do you need to get your application running quickly? Is it a simpler app that doesn’t need distributed services? Are your services scalable? How complex is your functionality? Can it easily be changed, debugged, or tested?

Some tools are necessary for applications, but just because a tool is recommended doesn’t mean it fits within your application's scope. It might be too complex to integrate into a simpler application. For instance, event driven tools are great for integrations with other applications, for large data processing, and possibly for a few other scenarios. However, if your app doesn’t have integrations or large data processing, then maybe you don’t need them.

At the beginning of our upgrade, the architecture design was mostly microservices. However, after we developed some of the microservices for one of our features, I decided, with valuable input from some of the team, that microservices would severely complicate the architecture, making the solution too difficult to manage and increasing the risk for our applications. I redesigned the architecture to use miniservices rather than microservices or a monolith.

In the images below, you can see that the original architecture was complex, with too many microservices to manage. In the second image, you can see that the architecture is much simpler with mini services.

Original Architecture Design
Target Architecture

Our team has had a lot of failures and lessons learned. Remember that it is okay to fail. Even when you try to implement cloud native practices, not everything will be great or go according to plan. Just accept the loss, ask for help, learn from the failures, and try again.

References

Cloud-based versus cloud native: what’s the difference?
Member post by Andrea Saez, originally published on the Cloudsmith blog.
Cloud Native vs Cloud Enabled: Key Differences in 2024
Cloud native is designed to work in the cloud but cloud-enabled is the application originally designed for on-premises environments.
Patterns Library
The Twelve-Factor App
A methodology for building modern, scalable, maintainable software-as-a-service apps.
Spring Cloud Config

Spring Cloud Config Server

Microservices Pattern: Strangler application
A pattern language for microservices