Eventual Consistency Patterns
In the next few blog entries we will examine three popular patterns around the eventual consistency of data – Compensating Transaction pattern, Database Sharding pattern, and the Map-Reduce Pattern. Each requires an agile partitioning of data into commonly formatted segments to yield benefits such in performance, access, scalability, concurrency, and size management. The tradeoff in the eventual consistency of the data means that all segments of the data may not always be showing the same values to all consumers at any given time, which occurs in a strongly consistent data architecture. However, the strongly consistent data model does not always map well to a Cloud data architecture where data can be distributed in many geographic locations and in multiple types of storage. The transactional requirement that strongly consistent data requires the same view of all values at any given time comes with a price of overhead due to locking. When locks need to span large distances to maintain transactional boundaries the latency and blocking that may occur might not be acceptable in the Cloud for your data architecture. Thus the choice of having the data eventually consistent is often a better choice in the Cloud.
Eventual vs. Strong Data Consistency
Before we look at patterns let’s get a good understanding of eventual consistency. To those coming from a traditional ACID enterprise data world, the view that data does not always have to carry the “C” (consistency) part of ACID it may be a totally new concept that is hard to fathom. Let’s start out with strong consistency since that is the paradigm that you are likely most familiar.
Strong data consistency (SC) presents the same set of data values are seen by all instances of the application at any time. This is enforced by transactional locks to prevent multiple instances of an app from concurrently changing these values – only the lock holder can change the value.
All updates to any strongly consistent data are done in atomic transactional manner. So in a Cloud where data is often replicated across many locations the use of strong consistency might be complex and slow since no atomic updates are completed until all of the replicated updated data is done updating. Due to the networked and distributed nature of Cloud resources causing a large number of failures in a Cloud environment the model of strong consistency is not very practical.
It’s best to implement strong consistency only in apps that are not replicated or split over many data stores or many different data centers, such as a traditional enterprise database. If data is stored across many data stores use eventual consistency.
Just a note that if you are storing your data in Azure storage that does not enforce transactional strong consistency for transactions that span multiple blobs and tables.
Eventually data consistency (EC) is used to improve performance and avoid contention in data update operations. This is not a simple and straightforward model to use. In fact, if possible to architect an application to use the native transactional features for update operations – then do that! Only use eventual consistency (and the compensating operations) when necessary to satisfy needs that a strongly consistent data story cannot.
A typical business process consists of a series of autonomous operations. These steps can be performed in all sequences or partially in parallel. While they are being completed that overall data may be in an inconsistent state. Only when it all operations are complete is the system consistent again. EC operations do not lock the data when it is modified.
EC typically makes better sense in a Cloud since it typically uses data replicated across many data stores and sites in different geographies. These data stores do not have to be databases as well. For instance, data could be stored in the state of a service. Data may be replicated across different servers to help load balancing. Or it could be duplicated and co-located close to the services and its users. Locking/blocking only effectively works when all data is stored in the same data store, which is commonly not done in a Cloud environment.
You trade strong consistency for attributes such as increased availability (since no locking/blocking to prevent concurrent access) when you design your solutions around the notion of eventual consistency. With EC, your data might not be consistent all of the time. But in the end it will be – eventually. Data that does not use transactional semantics only becomes eventually consistent when all replicas have been updated through synchronization.
Compensating Transaction Pattern
So now that you understand a bit of the difference around the two consistency models you probably want to know more about how the “C” eventually compensates to make the data eventually consistent. This pattern is best used for Eventually Consistent operations where a failure in the work done by an eventually consistent operation needs to be undone by a series of compensating steps. In the ACID “all-or-nothing” strongly consistent transactional operations, a rollback with is done via all resource managers (RM) involved in the transaction. In a typical two-phase commit, each of the RMs update their part of the transaction in the first phase (locking the data in the process). When all of the RMs are complete they all vote into the main transaction coordinator. It all vote “Commit”, the RMs then commit their individual resources. If one or more votes “Abort”, all the RMs roll-back their piece of the transaction to its original state.
But with eventually consistent operations, there is no supervising transactional manager. Eventually consistent operations are a non-transactional sequence of steps with each step committing totally once it’s complete. If a failure occurs then all partial changes up to the point of failure should be rolled back. From a high level this rollback is done the same as in a transactional strongly consistent environment. But at a lower level the process is very individualized and not coordinated among the different steps automatically via resource managers.
It’s not always the case of simply overwriting a changed value with the original value to compensate. This is because the original value may have been changed a few more times before the rollback needs to occur. Replacing the current value with the original value may not be the right thing to do in that case.
One of the best solutions for a compensating transaction is to run it all in a workflow. And in some cases the only way to recover from a non-transactional eventually consistent failed operation is via manual intervention. These compensating operations often are specific and customized to the user and the environment. In fact, the sequence of the strongly consistent operations do not need to be the same necessarily when a failure occurs and the compensation process is started.
The steps in a compensating transaction should also be idempotent since they can fail and have to be rerun again over and over. Part of this is it’s not always easy to tell when an eventual consistency operation has failed. Unlike an ACID operation when individual resource managers check in with the transaction coordinator and update it when one of them fails (to begin the transactional rollback) this does not occur on an eventually consistent operation.
In subsequent blogs on Eventual Consistency patterns, we will address the Database Sharding and the Map-Reduce patterns.