Saga Pattern in a Microservices Architecture

1. Introduction

In a typical microservice-based architecture, a single business use case often spans multiple microservices, each with its own local datastore and local transactions. As the number of services and transactions grows, we need a way to manage a transaction that spans several of them.
The Saga Pattern was introduced to handle such multi-step transactions. First described in 1987 by Hector Garcia-Molina and Kenneth Salem, a saga is defined as a sequence of transactions that can be interleaved with one another.

In this tutorial, we’ll look at the challenges of managing distributed transactions and how an orchestration-based Saga Pattern addresses them. We’ll then walk through an example implementation using Spring Boot 3 and Orkes Conductor, the enterprise-grade version of the leading open-source orchestration platform, Conductor OSS (formerly Netflix Conductor).

2. Challenges of Managing Distributed Transactions

Distributed transactions come with a lot of challenges if they are not implemented correctly. In a distributed transaction, each microservice has a separate local database. This approach is generally called the “Database per Service” model.

For example, MySQL might be suitable for one microservice due to its performance characteristics and features, while PostgreSQL might be chosen for another microservice based on its strengths and capabilities. In this model, each service executes its local transactions to complete the entire application transaction. This whole transaction is referred to as a Distributed Transaction.

A distributed transaction can be handled in several ways. The traditional approach is to preserve ACID (Atomicity, Consistency, Isolation, Durability) guarantees across services with a protocol such as 2PC (Two-Phase Commit). In a microservices setting, this comes with its own challenges, such as polyglot persistence, eventual consistency, latency, and more.

3. Understanding the Saga Pattern

The Saga Pattern is an architectural pattern for implementing a sequence of local transactions that helps maintain data consistency across different microservices.

The local transaction updates its database and triggers the next transaction by publishing a message or event. If a local transaction fails, the saga executes a series of compensating transactions to roll back the changes made by the previous transactions. This ensures that the system remains consistent even when transactions fail.

To further illustrate this, consider an order management system that consists of sequential steps spanning from placing to delivering an order:

In this example, the process begins with the user placing an order from an application. The flow then goes through several steps: inventory checks, payment processing, shipping, and notification services.

If the payment fails, the application must execute compensating transactions to roll back the changes made in the previous steps, such as reversing the payment and canceling the order. In this way, a saga can handle a failure at any stage by compensating for the transactions that have already completed.
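The rollback behavior described above can be sketched in plain Java. This is a minimal in-memory illustration, not the article’s implementation: the `Step` interface, the `run` method, and the step names are hypothetical, and the sketch assumes that only steps that completed successfully are compensated, most recent first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Minimal in-memory saga: run steps in order, compensate completed steps in reverse on failure. */
class SagaSketch {

    /** One local transaction plus the action that undoes it (hypothetical interface). */
    interface Step {
        String name();
        void execute() throws Exception;
        void compensate();
    }

    /** Runs the steps; returns true if all succeeded, false if the saga was rolled back. */
    static boolean run(List<Step> steps, List<String> log) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.execute();
                log.add("done:" + step.name());
                completed.push(step);
            } catch (Exception e) {
                log.add("failed:" + step.name());
                // Undo the steps that already completed, most recent first.
                while (!completed.isEmpty()) {
                    Step done = completed.pop();
                    done.compensate();
                    log.add("compensated:" + done.name());
                }
                return false;
            }
        }
        return true;
    }

    /** Builds a step that optionally fails; a real step would call a service. */
    static Step step(String name, boolean shouldFail) {
        return new Step() {
            public String name() { return name; }
            public void execute() throws Exception {
                if (shouldFail) throw new IllegalStateException(name + " failed");
            }
            public void compensate() { /* e.g. release stock or cancel the order */ }
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        boolean ok = run(List.of(
                step("order", false),
                step("inventory", false),
                step("payment", true),   // simulated payment failure
                step("shipping", false)), log);
        // prints: false [done:order, done:inventory, failed:payment, compensated:inventory, compensated:order]
        System.out.println(ok + " " + log);
    }
}
```

The shipping step never runs, and the two completed steps are undone in reverse order, which is the consistency guarantee the pattern aims for.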

The Saga Pattern can be implemented in two different ways.

Choreography: In this pattern, the individual microservices consume the events, perform the activity, and pass the event to the next service. There is no centralized coordinator, making communication between the services more difficult:

Orchestration: In this pattern, all the microservices are linked to the centralized coordinator that orchestrates the services in a predefined order, thus completing the application flow. This facilitates visibility, monitoring, and error handling:
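To make the difference between the two styles concrete, here is a small sketch that runs the same three steps both ways. Everything in it is an assumption made for illustration: the `EventBus` class, the event names, and the service names are hypothetical in-memory stand-ins, not part of any library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Contrasts the two saga styles using in-memory stand-ins for services. */
class SagaStyles {

    /** Tiny synchronous event bus, used only by the choreography variant. */
    static class EventBus {
        private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();

        void subscribe(String event, Consumer<String> handler) {
            handlers.computeIfAbsent(event, k -> new ArrayList<>()).add(handler);
        }

        void publish(String event, String orderId) {
            for (Consumer<String> handler : handlers.getOrDefault(event, List.of())) {
                handler.accept(orderId);
            }
        }
    }

    /** Choreography: each service reacts to an event and emits the next one. */
    static List<String> choreography(String orderId) {
        List<String> trace = new ArrayList<>();
        EventBus bus = new EventBus();
        bus.subscribe("OrderPlaced", id -> {
            trace.add("inventory:" + id);
            bus.publish("InventoryReserved", id);
        });
        bus.subscribe("InventoryReserved", id -> {
            trace.add("payment:" + id);
            bus.publish("PaymentCompleted", id);
        });
        bus.subscribe("PaymentCompleted", id -> trace.add("shipping:" + id));
        bus.publish("OrderPlaced", orderId);
        return trace;
    }

    /** Orchestration: one coordinator owns the order in which services are called. */
    static List<String> orchestration(String orderId) {
        List<String> trace = new ArrayList<>();
        for (String service : List.of("inventory", "payment", "shipping")) {
            trace.add(service + ":" + orderId); // stand-in for invoking the service
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(choreography("42"));
        System.out.println(orchestration("42"));
    }
}
```

Both variants visit the services in the same order. In the choreography version, the sequence is only visible by reading every subscription, whereas the orchestration version states it in one place.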

4. Why Orchestration-based Saga Pattern?

The decentralized approach in the choreography pattern makes it more challenging to manage and monitor service interactions. The complexity increases with a lack of centralized coordination and visibility, making the application harder to maintain.

Let’s look at the major drawbacks of Choreography and the advantages of opting for Orchestration instead.

4.1. Limitations of Choreography

Choreography-based implementation has many limitations when building distributed applications:

Tight Coupling – Services are tightly coupled as they’re directly connected. A change to one service can impact every connected service, so upgrades have to be coordinated across them.
Distributed Source of Truth – Maintaining application state across various microservices complicates the tracking of the process’s flow and may necessitate an additional system to consolidate state information. This adds to the infrastructure and introduces complexity to the overall system.
Difficult to Troubleshoot – When the application flow is spread across different services, it can take longer to find and fix problems. Troubleshooting requires a centralized logging service and a good understanding of the code. If one service fails, it could cause more significant issues, potentially creating extensive outages.
Challenging Environment for Testing – Testing becomes difficult for developers as the microservices are interconnected with each other.
Difficult to Maintain – As the services evolve, supporting new versions means reintroducing conditional logic across services, resulting, once again, in a distributed monolith. This makes it harder to understand the service flows without inspecting the entire code.

4.2. Advantages of Orchestration

Orchestration-based implementation has many advantages when building distributed applications:

Coordinated transaction within the distributed system – Different microservices handle the different aspects of the transaction in a distributed system. With the orchestration-based pattern, a central coordinator manages the execution of these microservices in a predefined manner. It actively ensures the precise execution of individual local transactions, thereby maintaining the application’s consistency.
Compensation transaction – In an application, failures can occur at any point of execution due to any errors. The Saga Pattern enables the execution of compensating transactions in the event of failures. It can roll back the previously completed transactions, ensuring the application maintains a consistent state.
Asynchronous processing – Each microservice can process its activity independently, and the centralized coordinator can manage the communication and sequencing of these asynchronous actions. This is useful in cases where specific steps can take longer to complete or where parallel processing is desirable.
Scalability – The orchestration pattern is highly scalable, meaning that we can make changes to the application by simply adding or modifying the required services without significantly affecting the overall application. This is particularly useful in cases where the application needs to adapt to changing demands, allowing for easy expansion or modification of the architecture.
Enhanced Visibility and Monitoring Capabilities – Utilizing the orchestration pattern provides centralized visibility across distributed applications, enabling swift issue identification and resolution. This improves productivity, minimizes downtime, and ultimately decreases the mean time to detect and recover from failures.
Faster Time to Market – The orchestrator simplifies the rewiring of existing services and the creation of new flows, facilitating rapid adaptation. This enables application teams to be more agile, leading to faster time to market for new ideas and concepts. Additionally, the orchestrator often manages versioning, reducing the need for extensive “if..then..else” statements in the code to create different versions.

In summary, the orchestration-based Saga Pattern provides a way to implement coordinated, consistent, and scalable distributed transactions in a microservices architecture, with the added benefit of handling failures through compensating transactions. This makes it a powerful pattern for building robust and scalable distributed applications.

5. Implementing Saga Orchestration Pattern With Orkes Conductor

Now, let’s look at a practical example of an application employing the Saga Pattern with Orkes Conductor.

Consider an order management system with the following services:

OrderService – Handles the initial order placement, including adding items to the cart, specifying quantities, and initializing the checkout process.
InventoryService – Checks and confirms the availability of items.
PaymentService – Manages the payment process securely, handling various payment methods.
ShipmentService – Prepares the items for shipment, including packaging, generating shipping labels, and initiating the shipping process.
NotificationService – Sends notifications to users about order updates.

Let’s explore replicating this flow using Orkes Conductor and Spring Boot 3.

Before beginning the app development, ensure that the system meets the following prerequisites.

Install Java 17

Orkes Conductor can be set up in more than one way, for example by running Conductor locally or by using the hosted Orkes Playground. In this example, we’ll be using the Playground.

Here’s the code snippet of the food delivery app built using the Saga Pattern:

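import com.netflix.conductor.common.metadata.tasks.TaskResult;
import com.netflix.conductor.sdk.workflow.task.WorkerTask;
import lombok.AllArgsConstructor;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.stereotype.Component;

import java.util.HashMap;
import java.util.Map;

// OrderRequest and OrderService are classes from the sample project and must be imported from there.
@AllArgsConstructor
@Component
@ComponentScan(basePackages = {"io.orkes"})
public class ConductorWorkers {

    // Registers this method as the worker for the "order_food" task
    @WorkerTask(value = "order_food", threadCount = 3, pollingInterval = 300)
    public TaskResult orderFoodTask(OrderRequest orderRequest) {
        String orderId = OrderService.createOrder(orderRequest);
        TaskResult result = new TaskResult();
        Map<String, Object> output = new HashMap<>();

        if (orderId != null) {
            // Order created: expose the ID to the next tasks in the workflow
            output.put("orderId", orderId);
            result.setOutputData(output);
            result.setStatus(TaskResult.Status.COMPLETED);
        } else {
            // No order created: mark the task as failed
            output.put("orderId", null);
            result.setStatus(TaskResult.Status.FAILED);
        }

        return result;
    }
}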

5.1. Food Delivery Application

The sample food delivery app looks like this from the Conductor UI:

Let’s see how the workflow progresses:

The application begins when a user places an order on a food delivery app. The initial process is implemented as a series of worker tasks that include adding food to the cart (order_food), checking the restaurant for food availability (check_inventory), payment process (make_payment), and the delivery process (ship_food).
The application flow then moves on to a fork-join task, which handles the notification service. It has two forks, one to notify the delivery person and the other to inform the user.
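As a plain-Java analogy for the fork-join step, the sketch below starts the two notifications in parallel and waits for both before continuing. In Conductor, the fork and join are part of the workflow itself rather than worker code, so this only illustrates the control flow; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Plain-Java analogy of a fork-join: two notifications run in parallel, then both are awaited. */
class NotificationForkJoin {

    // Stand-ins for the two notification branches
    static String notifyDeliveryPerson(String orderId) {
        return "driver notified for " + orderId;
    }

    static String notifyCustomer(String orderId) {
        return "customer notified for " + orderId;
    }

    /** Fork: start both notifications. Join: wait until both have finished. */
    static List<String> sendNotifications(String orderId) {
        CompletableFuture<String> driver =
                CompletableFuture.supplyAsync(() -> notifyDeliveryPerson(orderId));
        CompletableFuture<String> customer =
                CompletableFuture.supplyAsync(() -> notifyCustomer(orderId));

        // The join point: blocks until both branches have completed
        CompletableFuture.allOf(driver, customer).join();
        return List.of(driver.join(), customer.join());
    }

    public static void main(String[] args) {
        System.out.println(sendNotifications("42"));
    }
}
```

The result list is built in a fixed order after the join, so the output does not depend on which branch finishes first.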

Now, let’s run the application!

5.2. Run the Application

Clone the project.
Update the application.properties file with the access keys generated. To connect this worker with the application server instance (workflow explained previously), we need to create an application in Orkes Conductor and generate the access keys.
- If using Playground, refer to this video on generating access keys.
- If setting up the Conductor locally, follow the instructions here (Install and Run Locally).

conductor.server.url=https://play.orkes.io/api
conductor.security.client.key-id=<key>
conductor.security.client.secret=<secret>

Notes:

Since we are using the playground, conductor.server.url remains the same. If we have set up Conductor locally, replace this with the Conductor server URL.
Replace the key-id and secret with the generated keys.
For the worker to be connected with the Conductor server, we need to provide permissions (in the app we’ve just created) to access the workflows and tasks.
By default, conductor.worker.all.domain is set to ‘saga’. Make sure to change it to a different name to avoid conflicts with the workflows and workers spun up by others in Orkes Playground.

Let’s run the application from the root project using the command:

gradle bootRun

The application is now running. The next step is to create an order by calling the triggerFoodDeliveryFlow API from the application:

$ curl --location 'http://localhost:8081/triggerFoodDeliveryFlow' \
 --header 'Content-Type: application/json' \
 --data '{
     "customerEmail": "[email protected]",
     "customerName": "Tester QA",
     "customerContact": "+1(605)123-5674",
     "address": "350 East 62nd Street, NY 10065",
     "restaurantId": 2,
     "foodItems": [
         {
             "item": "Chicken with Broccoli",
             "quantity": 1
         },
         {
             "item": "Veggie Fried Rice",
             "quantity": 1
         },
         {
             "item": "Egg Drop Soup",
             "quantity": 2
         }
     ],
     "additionalNotes": [
         "Do not put spice.",
         "Send cutlery."
     ],
     "paymentMethod" : {
         "type": "Credit Card",
         "details": {
             "number": "1234 4567 3325 1345",
             "cvv": "123",
             "expiry": "05/2022"
         }
     },
     "paymentAmount": 45.34,
     "deliveryInstructions": "Leave at the door!"
  }'

Once the request is sent, we’ll receive a workflow ID indicating that our food delivery app is now running! 🍕
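The same request can also be sent from Java. The sketch below uses the JDK’s built-in `java.net.http` client and assumes the application is listening on http://localhost:8081, as in the curl command. The class name, the `buildRequest` helper, and the abbreviated payload are illustrative assumptions; a real call would send the full JSON body shown in the curl command.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Sends the order request with the JDK HTTP client instead of curl. */
class TriggerFoodDelivery {

    /** Builds a POST request to the triggerFoodDeliveryFlow endpoint. */
    static HttpRequest buildRequest(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/triggerFoodDeliveryFlow"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        // Abbreviated payload for illustration; use the full JSON from the curl command
        String body = "{\"customerName\": \"Tester QA\", \"restaurantId\": 2}";
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(buildRequest("http://localhost:8081", body),
                            HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        } catch (IOException e) {
            System.out.println("Could not reach the application: " + e.getMessage());
        }
    }
}
```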

Using the workflow ID, we can visualize our application from the Conductor UI. Let’s copy the workflow ID and, in our Conductor console, navigate to “Executions > Workflow” from the left menu and search for the execution using the workflow ID.

A sample execution looks like this:

Let’s see what happens to the application flow if one of the services fails.

5.3. Compensation Flow

Here’s a simplistic visualization of the compensation transaction for the food delivery app:

While defining a workflow in Orkes Conductor, we can specify a failureWorkflow that runs when the main workflow fails. In the workflow definition, we include the name of the workflow to run on failure:

"failureWorkflow": "<name of the workflow to be run on failure>",

The compensation workflow in Orkes Conductor rolls back changes in case of failure:

This workflow triggers when any service fails in our main application.

Let’s imagine that the payment fails due to insufficient funds. Then, the failure workflow triggers, initiating the compensation flow as follows:

Compensation flow in case of payment failure

The system cancels the payment, subsequently canceling the order, and sends failure notifications to the user.
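The relationship between a main workflow and its failure workflow can be modeled in a few lines of plain Java. This is a conceptual sketch rather than Conductor code: the `Task` interface and `execute` method are hypothetical, and the compensation task names (cancel_payment, cancel_order, notify_user_of_failure) are assumed for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Conceptual model of a main workflow with a separate failure workflow. */
class FailureWorkflowSketch {

    /** A task records what it did, or throws to signal failure (hypothetical stand-in for a worker). */
    interface Task {
        void run(List<String> trace);
    }

    /** Runs the main tasks in order; if one throws, runs the failure workflow. Returns true on success. */
    static boolean execute(List<Task> mainWorkflow, List<Task> failureWorkflow, List<String> trace) {
        try {
            for (Task task : mainWorkflow) {
                task.run(trace);
            }
            return true;
        } catch (RuntimeException e) {
            trace.add("FAILED: " + e.getMessage());
            for (Task task : failureWorkflow) {
                task.run(trace);
            }
            return false;
        }
    }

    public static void main(String[] args) {
        List<Task> mainFlow = List.of(
                t -> t.add("order_food"),
                t -> t.add("check_inventory"),
                t -> { throw new IllegalStateException("insufficient funds"); }, // make_payment fails
                t -> t.add("ship_food"));
        List<Task> compensation = List.of(
                t -> t.add("cancel_payment"),
                t -> t.add("cancel_order"),
                t -> t.add("notify_user_of_failure"));

        List<String> trace = new ArrayList<>();
        execute(mainFlow, compensation, trace);
        // prints: [order_food, check_inventory, FAILED: insufficient funds, cancel_payment, cancel_order, notify_user_of_failure]
        System.out.println(trace);
    }
}
```

Keeping the compensation steps in their own workflow means the main flow only describes the success path, and the rollback logic can be changed without touching it.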

Boom 🎊! That’s how we roll back the completed transactions in our food delivery application using Orkes Conductor, thus maintaining the consistency of the application.

There’s also a Slack community, which is a good place to ask any questions related to Conductor.

6. Conclusion

In this article, we implemented the Saga Pattern in an order management application, a food delivery app, using Orkes Conductor and Spring Boot 3.

Orkes Conductor is available on all major cloud platforms: AWS, Azure, and GCP.

As always, the source code for the article is available over on GitHub.
