Category: Azure Architecture

Well it’s official!  Even though you probably read the article “Kardashians to Kick Bruce and their Affinity Groups to the Curb” in People magazine, you were not really sure what to believe until you saw the official word from Microsoft.  Check out the official content “About Regional VNets and Affinity Groups”.  Improvements to the Azure virtual network infrastructure have made communication between different parts of the Azure DC very quick.  No longer do you need to explicitly group resources in close proximity to each other to minimize latency between items like VNETs, VMs, and storage.



Eventual Consistency Patterns
In the next few blog entries we will examine three popular patterns around the eventual consistency of data – the Compensating Transaction pattern, the Database Sharding pattern, and the Map-Reduce pattern. Each requires an agile partitioning of data into commonly formatted segments to yield benefits such as performance, access, scalability, concurrency, and size management. The tradeoff of eventual consistency is that all segments of the data may not always show the same values to all consumers at any given time, as occurs in a strongly consistent data architecture. However, the strongly consistent data model does not always map well to a Cloud data architecture where data can be distributed in many geographic locations and in multiple types of storage. The transactional requirement that strongly consistent data present the same view of all values at any given time comes with a price of overhead due to locking. When locks need to span large distances to maintain transactional boundaries, the latency and blocking that may occur might not be acceptable in the Cloud for your data architecture. Thus eventual consistency is often the better choice in the Cloud.

Eventual vs. Strong Data Consistency
Before we look at patterns let’s get a good understanding of eventual consistency. To those coming from a traditional ACID enterprise data world, the idea that data does not always have to carry the “C” (consistency) part of ACID may be a totally new concept that is hard to fathom. Let’s start out with strong consistency since that is the paradigm you are likely most familiar with.
Strong data consistency (SC) ensures that the same set of data values is seen by all instances of the application at any time. This is enforced by transactional locks that prevent multiple instances of an app from concurrently changing these values – only the lock holder can change the value.

All updates to strongly consistent data are done in an atomic, transactional manner. So in a Cloud where data is often replicated across many locations, the use of strong consistency can be complex and slow, since no atomic update completes until all of the replicated data has finished updating. Given the networked and distributed nature of Cloud resources, and the large number of failures that occur in a Cloud environment, the strong consistency model is not very practical.

It’s best to implement strong consistency only in apps that are not replicated or split over many data stores or many different data centers, such as a traditional enterprise database. If data is stored across many data stores use eventual consistency.
Just a note: if you are storing your data in Azure storage, it does not enforce transactional strong consistency for transactions that span multiple blobs and tables.

Eventual data consistency (EC) is used to improve performance and avoid contention in data update operations. It is not a simple and straightforward model to use. In fact, if it is possible to architect an application to use the native transactional features for update operations – then do that! Only use eventual consistency (and the compensating operations) when necessary to satisfy needs that a strongly consistent data story cannot.

A typical business process consists of a series of autonomous operations. These steps can be performed in sequence or partially in parallel. While they are being completed the overall data may be in an inconsistent state. Only when all operations are complete is the system consistent again. EC operations do not lock the data when it is modified.

EC typically makes better sense in a Cloud since data is typically replicated across many data stores and sites in different geographies. These data stores do not have to be databases, either. For instance, data could be stored in the state of a service. Data may be replicated across different servers to help load balancing. Or it could be duplicated and co-located close to the services and their users. Locking/blocking only works effectively when all data is stored in the same data store, which is commonly not the case in a Cloud environment.

You trade strong consistency for attributes such as increased availability (since no locking/blocking to prevent concurrent access) when you design your solutions around the notion of eventual consistency. With EC, your data might not be consistent all of the time. But in the end it will be – eventually. Data that does not use transactional semantics only becomes eventually consistent when all replicas have been updated through synchronization.
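The write/sync behavior can be illustrated with a toy Python sketch (an in-memory model, not real Azure storage – the class and method names are made up for illustration). A write commits immediately on the primary replica, while followers lag behind until a synchronization pass runs:

```python
class EventuallyConsistentStore:
    """Toy model: a primary replica plus followers that sync on demand,
    illustrating how reads can return stale values until synchronization."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # updates not yet applied to followers

    def write(self, key, value):
        # The write commits immediately on the primary (replica 0)...
        self.replicas[0][key] = value
        # ...and is queued for asynchronous replication to the others.
        self.pending.append((key, value))

    def read(self, key, replica):
        # A reader may hit any replica and see a stale (or missing) value.
        return self.replicas[replica].get(key)

    def synchronize(self):
        # Background replication: apply queued updates to every follower.
        for key, value in self.pending:
            for follower in self.replicas[1:]:
                follower[key] = value
        self.pending.clear()

store = EventuallyConsistentStore()
store.write("inventory", 42)
print(store.read("inventory", replica=0))  # 42 - primary is current
print(store.read("inventory", replica=2))  # None - follower not yet synced
store.synchronize()
print(store.read("inventory", replica=2))  # 42 - eventually consistent
```

The window between `write` and `synchronize` is exactly the “eventually” in eventual consistency.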

Compensating Transaction Pattern
So now that you understand a bit of the difference between the two consistency models, you probably want to know more about how the “C” eventually compensates to make the data eventually consistent. This pattern is best used for eventually consistent operations where a failure in the work done by an eventually consistent operation needs to be undone by a series of compensating steps. In ACID “all-or-nothing” strongly consistent transactional operations, a rollback is done via all resource managers (RMs) involved in the transaction. In a typical two-phase commit, each of the RMs updates its part of the transaction in the first phase (locking the data in the process). When all of the RMs are complete they each send their vote to the main transaction coordinator. If all vote “Commit”, the RMs then commit their individual resources. If one or more votes “Abort”, all the RMs roll back their piece of the transaction to its original state.

But with eventually consistent operations, there is no supervising transaction manager. Eventually consistent operations are a non-transactional sequence of steps, with each step committing fully once it’s complete. If a failure occurs then all partial changes up to the point of failure should be rolled back. From a high level this rollback is done the same as in a transactional strongly consistent environment. But at a lower level the process is very individualized and not coordinated among the different steps automatically via resource managers.
It’s not always as simple as overwriting a changed value with the original value to compensate. The original value may have been changed a few more times before the rollback needs to occur, and replacing the current value with the original value may not be the right thing to do in that case.

One of the best solutions for a compensating transaction is to run it all in a workflow. And in some cases the only way to recover from a failed non-transactional eventually consistent operation is via manual intervention. These compensating operations are often specific and customized to the user and the environment. In fact, the compensating steps do not necessarily need to follow the same sequence as the original operations when a failure occurs and the compensation process is started.

The steps in a compensating transaction should also be idempotent since they can fail and have to be rerun over and over. Part of the reason is that it’s not always easy to tell when an eventual consistency operation has failed. Unlike an ACID operation, where individual resource managers check in with the transaction coordinator and notify it when one of them fails (to begin the transactional rollback), this does not occur in an eventually consistent operation.
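A minimal Python sketch of the pattern, using a hypothetical travel-booking scenario (the step names and the failure are invented for illustration; real compensations would call actual services). Each step commits on its own, and on failure the completed steps are undone in reverse order by idempotent compensations:

```python
class Step:
    def __init__(self, name, action, compensate):
        self.name, self.action, self.compensate = name, action, compensate

def run_with_compensation(steps, state):
    """Run non-transactional steps in order; on failure, undo the
    completed steps by running their compensations in reverse order."""
    completed = []
    try:
        for step in steps:
            step.action(state)
            completed.append(step)
    except Exception:
        for step in reversed(completed):
            step.compensate(state)  # must be idempotent: may be retried
        return False
    return True

# Hypothetical example: book a flight, then a hotel; the hotel booking
# fails, so the flight booking is compensated (cancelled).
state = {"bookings": []}

def book_flight(s): s["bookings"].append("flight")
def cancel_flight(s):
    if "flight" in s["bookings"]:  # idempotent: safe to rerun
        s["bookings"].remove("flight")
def book_hotel(s): raise RuntimeError("no rooms available")

ok = run_with_compensation(
    [Step("flight", book_flight, cancel_flight),
     Step("hotel", book_hotel, lambda s: None)],
    state)
print(ok, state["bookings"])  # False [] - partial work was compensated
```

Note that `cancel_flight` checks before removing, so rerunning it after a partial failure does no harm – the idempotence requirement discussed above.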

In subsequent blogs on Eventual Consistency patterns, we will address the Database Sharding and the Map-Reduce patterns.

Scalability in the cloud entails the managed process of increasing capacity when the load increases. This can occur either by increasing the number of nodes, or by preserving the same number of nodes while increasing their physical capacity.

The Vertical Scaling Pattern requires downtime to reconfigure and “scale up” the hardware by increasing the capacity (CPU, memory, disk) of a node. Scaling up also has limits, since a node can only be scaled up so much before it hits its resource capability limit. Vertical scaling is the least common pattern of the scaling options.

Increasing the number of Virtual Machine (VM) nodes when load increases, and reducing the number of VMs when the stress on a system tails off below a certain level, is known as the Horizontal Scaling Compute Pattern. “Scaling out” requires minimal or no downtime to reconfigure the number of nodes. It provides capacity beyond that of a single node by adding nodes of the same size and configuration (homogeneous) as load increases. It’s used to handle capacity requirements that vary seasonally or due to unpredictable load spikes. Care must be exercised to not use sticky sessions or session state unless that state is stored in a central location not bound to a particular VM instance.

The Horizontal Scaling Pattern can be implemented manually or automatically. It’s often preferred to minimize the human intervention required by automating the horizontal scaling process via custom or standard auto-scaling rules. A standard rule can typically be written against CPU utilization, memory usage, average queue length, or average response times to continuously monitor a fluctuating resource. For example, you can set a rule such that if a customer query on a product takes more than 20 seconds to come back, the number of VM instances hosting the database cluster is increased by one.
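Such a rule can be sketched in a few lines of Python (the thresholds and instance limits here are illustrative assumptions, not Azure defaults). Note the floor of 2 instances also captures the spirit of the “N+1” idea of keeping a spare node around:

```python
def desired_instances(current, avg_response_secs,
                      scale_up_threshold=20.0, scale_down_threshold=5.0,
                      min_instances=2, max_instances=10):
    """Evaluate a simple auto-scaling rule: add a node when average
    response time crosses the upper threshold, remove one when load
    tails off, and clamp the result to the allowed range."""
    if avg_response_secs > scale_up_threshold:
        current += 1
    elif avg_response_secs < scale_down_threshold:
        current -= 1
    return max(min_instances, min(current, max_instances))

print(desired_instances(3, 25.0))  # 4 - queries too slow, scale out
print(desired_instances(3, 2.0))   # 2 - load tailed off, scale in
print(desired_instances(2, 1.0))   # 2 - floor enforced, keep a spare
```

A real auto-scaler would evaluate this continuously against monitored metrics rather than single samples.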

Scaling out applies to more than just compute nodes; other resources, such as storage, queues, and other components, can grow and shrink dynamically as well. For instance, queues can be added or removed as needed. Databases can be sharded or consolidated as the application grows or scales back.

When scaling out VMs you may want to pay attention to the “N+1” rule. This deploys N+1 nodes when scaling out even though only N nodes are needed at one time. Doing this provides additional ready-to-go computing resources if a sudden spike occurs. It also provides an additional instance in case of hardware failure.

To start off 2015 I am going to step up a level conceptually from the low-level Azure technologies and talk Cloud architecture patterns for the short term on the blog. As a Principal Cloud Solutions Architect I need to look at a customer’s business requirements, use cases, and their desire to move to the cloud. From that we try to fit their new architecture into well-known, tried-and-true architectural patterns. I will present a basic knowledge of some of my favorite patterns to give you a good conversational understanding of those that are useful for cloud applications. It will not be an in-depth exhaustive study of each pattern. Rather, we will talk about the problem the pattern solves and how it works.

Stay tuned for the first pattern very soon!

This is the last in a  three-part series on multi-tenancy within Windows Azure applications. Here is a breakdown of the topics for these posts.

Post 1 –  Tenants and Instances

Post 2 – Combinations of Instances and Tenancy

Post 3 – Tenant Strategy and Business Application

Multi-Tenant Strategies

Azure’s approach to scalability is to scale out as the load on an application dictates.  But what parts exactly do you want to scale out? You have different options based upon application specific requirements. You don’t have to use the same instance scaling approach for each layer of your application.

For instance, you can have multiple Web roles which write messages into a Windows Azure Service Bus queue that is managed by a single 1st-level Worker role. The Worker role can process the requests and send them to multiple 2nd level Worker roles. These roles then process the requests and write them to a single SQL Azure database. Or the 2nd level Worker roles can write to multiple SQL Azure databases. Or there could be multiple 1st level roles which read the queue and call a single level 2nd level Worker role which then writes to multiple SQL Azure databases.  The combinations are many but you have to carefully make sure they correctly support your application requirements. Here are just a few of the many tenancy combinations possible in your Azure application.

  • A multi-tenant UI (Web role) which calls out to a multi-tenant service (Worker role) which links to a single tenant data storage layer.
  • A single-tenant UI (Web role) calling into a multi-tenant service (Worker role) calling into a single-tenant data layer.
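The fan-out topology described above can be mocked up in Python with an in-memory queue (a stand-in for the Service Bus queue; the routing-by-tenant choice is just one illustrative option, and all names here are invented):

```python
from queue import Queue

# Toy pipeline: "web roles" enqueue requests, a 1st-level "worker"
# dispatches them to 2nd-level workers, which write to a shared store.
requests = Queue()
store = []

def web_role(tenant, message):
    requests.put((tenant, message))

def first_level_worker(second_level_workers):
    # Drain the queue and route each request to a 2nd-level worker.
    while not requests.empty():
        tenant, message = requests.get()
        worker = second_level_workers[hash(tenant) % len(second_level_workers)]
        worker(tenant, message)

def make_second_level_worker(worker_id):
    def worker(tenant, message):
        store.append((worker_id, tenant, message))  # "database" write
    return worker

web_role("Company1", "order-1")
web_role("Company2", "order-2")
first_level_worker([make_second_level_worker(0), make_second_level_worker(1)])
print(len(store))  # 2 - both requests processed
```

In a real deployment the roles would run in separate VM instances and the queue would provide the durable hand-off between them; the point here is just the shape of the topology.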

There are many reasons to group the application nodes in different combinations.  You have the flexibility within your architecture to choose which services can be shared and how much they are shared.  In an SIMT or a MIMT application you can logically group customers in your application domain into dedicated instances that support those common usage patterns.

Suppose you sell a multi-purpose application that is used by businesses in different industries.  Based upon usage patterns you could group customers by vertical markets since they tend to use similar transient state and work with similar schemas for persistent data.  For instance, you could put all the medical companies into one SIMT app and all the restaurants that use your app into another SIMT app.  Or, based upon different SLA requirements, you can group customers with different SLA levels for availability in different SIMT applications.

Based upon customer security requirements there are different ways to separate or share data.  Multiple tenants can use different databases or schemas to have isolated storage. Coming up one level of isolation tenants can share the same database but have their data stored in different rows in the same table.  Or isolation may not be needed and data is shared at the field level.  Tenants can have different schemas or custom columns or table level permissions based upon requirements.  But however data is configured the multi-tenant application must protect each customer’s data from being visible or accessed by other customers as per application requirements.
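The row-level isolation option can be sketched with an in-memory SQLite database (the table and tenant names are hypothetical; the key point is that every query is scoped by a tenant id so one tenant can never see another’s rows):

```python
import sqlite3

# Shared-table isolation: tenants co-reside in one table, and every
# query is filtered by a tenant_id column.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (tenant_id TEXT, item TEXT)")
db.executemany("INSERT INTO orders VALUES (?, ?)",
               [("medical_co", "gloves"), ("restaurant_co", "napkins")])

def orders_for(tenant_id):
    # Always filter by tenant; use a bound parameter, never string
    # interpolation, so a tenant id cannot alter the query itself.
    rows = db.execute("SELECT item FROM orders WHERE tenant_id = ?",
                      (tenant_id,)).fetchall()
    return [item for (item,) in rows]

print(orders_for("medical_co"))     # ['gloves']
print(orders_for("restaurant_co"))  # ['napkins']
```

The discipline of applying the tenant filter on every single query (ideally centralized in a data-access layer) is what makes this cheaper-but-shared option safe.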

State can be isolated or shared as the application dictates just like the database. All tenants in a multi-tenant application can share state, but most likely will have their own state. State can be stored across all customers in one instance or across multiple customers in multiple instances.

Again it all depends upon the application logic and data/business requirements.

In some situations within the services layer or the Web UI layer you may require more than one Worker or Web role respectively.  Dividing the work up by different tasks and assigning it to these roles can be done in many ways.

Case 1:


Case 2:


Case 1 – You can assign each role a specific task that the other roles do not have.   This means you will need at least one instance for each type of role.  For instance, you can have a services layer Worker role A that does number-crunching and another Worker role B that does caching of data.   This typically requires more Worker role instances to support the application’s scalability requirements than if both the number crunching and caching were done in a combined instance.  If you had only one Worker role service that did all the number crunching and all the dependent Web roles required that service regularly so it was in high contention, you might need to scale out with multiple number-crunching instances.  The good news is the code to support only one function tends to be simpler than if the role had to handle multiple tasks.

Case 2 – Alternatively you can have multiple roles that all do the same group of tasks.  For instance Worker role A does number crunching and caching, and Worker role B also does the same number crunching and caching.  This is typically more efficient since you don’t need to host as many nodes. However the code and logic are more complex to manage similar operations across multiple instances.  Your code will need to differentiate when it is permissible to carry out the same task in multiple roles at once.   There may be times when that task can only be done in one of the roles at any given time and you will need to implement a synchronization mechanism. If using Azure storage you can acquire a lease on a storage entity (i.e. blob storage) to do any work. Once that lease is acquired no other instances can access that resource.
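The lease idea can be modeled with a small in-memory Python class (a toy stand-in for an Azure blob lease, not the actual storage API). At most one instance holds the lease at a time, and the lease expires so a crashed holder cannot block the others forever:

```python
import time

class Lease:
    """Toy stand-in for a blob lease: one holder at a time, with an
    expiry so a crashed holder cannot block other instances forever."""

    def __init__(self, duration_secs=15):
        self.duration = duration_secs
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, instance_id, now=None):
        now = time.monotonic() if now is None else now
        if self.holder is None or now >= self.expires_at:
            self.holder = instance_id
            self.expires_at = now + self.duration
            return True
        return False

    def release(self, instance_id):
        if self.holder == instance_id:
            self.holder = None

lease = Lease(duration_secs=15)
print(lease.try_acquire("worker-A", now=0.0))   # True  - A holds the lease
print(lease.try_acquire("worker-B", now=5.0))   # False - B is locked out
print(lease.try_acquire("worker-B", now=20.0))  # True  - A's lease expired
```

With real blob leases the holder would also renew the lease periodically while its work is in progress; that renewal loop is omitted here for brevity.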

Business Models and Tenancy

Let’s look at perhaps the most important driving factor of all behind tenancy – how will you charge customers of your application so you make $? Will you charge customers monthly based upon the actual resources they use, a percentage of resource usage for an instance shared among other tenants, or a flat fee?  Can customers of your application run concurrently in a shared instance? Or do they require their own dedicated instance? Can they share application code but not the same database?  Or can they share the same database tables but co-reside in adjacent rows? And so on.

There are many ways your architecture can evolve out of the answers to those business questions. At a high level you can use these answers as a basis to decide which tenancy model is correct for you.  At a more granular level you can decide upon different topologies from these architectures that further support your business model. Here are a few business points to consider when making tenancy decisions.

  • Choosing to sell to a large or small customer base?  Or both?
  • Strict regulatory data storage requirements?  Or data that can be stored anywhere and viewed by anyone?
  • Different performance, availability, and scalability requirements
  • A variety of different customer subscription pricing options

Depending upon the answers to these questions here are a few typical billing options you can use for your application.

Fixed Fee

You could agree up front to charge customers a fixed-fee each month regardless of the variable costs they incur.  This is like your standard cable TV model where whether you want ESPN 30 minutes a month or 10 hours per day your bill is still the same each month.

Actual Usage

For single-tenant applications it’s simpler to measure costs per customer since all the costs incurred belong to that specific customer. You could create a separate Azure account for each customer to simplify this process.  This is like your electric bill where you pay for what you use each month.

Variable Shared Costs

Sharing of costs implies multi-tenancy.  You can do it at a granular level where you try to specifically charge each customer for what they use.  Thus in an app with 200 tenants you would have all sorts of various monthly bills for each customer, all summing up at least to the amount of the total costs.  Each fall season my neighbors and I used to pay approximately $100 each to rent individual pull-behind aerators to attach to our lawn mowers to aerate and seed our lawns.

We finally got smart and instead of us all paying $100 we all went in on one aerator and shared it across the weekend. If three of us went in together it cost us $33 apiece. If only two of us shared, the cost was $50 apiece. Each year we paid a variable fee based upon how many of us shared the cost – a shared cost but at a variable amount.

Fixed Shared Costs

Customers could share the costs using a simpler model by taking the total Azure costs of the instance serving that group of tenants and dividing it evenly.   To expand upon my shared aerator example, suppose three of us purchased a deluxe whiz-bang self-propelled aerator for $2400 that not only aerated but seeded at the same time.  We went on a two-year interest-free payment plan and agreed to share the monthly payment of $100 per month for two years – a shared cost but at a fixed amount.
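The two cost-splitting approaches above can be sketched in a few lines of Python (the tenant names and dollar amounts are made up for illustration):

```python
def variable_shared_cost(total_cost, usage_by_tenant):
    """Split a shared bill in proportion to each tenant's metered usage."""
    total_usage = sum(usage_by_tenant.values())
    return {tenant: round(total_cost * usage / total_usage, 2)
            for tenant, usage in usage_by_tenant.items()}

def fixed_shared_cost(total_cost, tenants):
    """Split a shared bill evenly, regardless of usage."""
    return {tenant: round(total_cost / len(tenants), 2)
            for tenant in tenants}

# Hypothetical monthly bill of $100 across three tenants:
print(variable_shared_cost(100.0, {"al": 2, "bo": 1, "cy": 1}))
# {'al': 50.0, 'bo': 25.0, 'cy': 25.0}
print(fixed_shared_cost(100.0, ["al", "bo"]))
# {'al': 50.0, 'bo': 50.0}
```

The variable split requires metering per-tenant usage (which is itself work to build); the fixed split trades that accuracy for simplicity, just like the aerator payment plan.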

Whatever model you choose you will probably want to build in some profit % to charge customers once the actual Azure costs have been paid.  So it is absolutely critical that you do your homework and establish solid estimates of expected usage costs and profit points before you decide upon your billing strategy. This is especially true if you are using a fixed fee approach or you will end up eating the overage costs yourself. Note that regardless of how much actual CPU time a customer uses the compute cost is a fixed cost per month.  Other charges like storage, SQL Azure, bandwidth, etc. are variable costs and are dependent upon how much the customer uses the Azure infrastructure.

Regardless of the billing model try to maximize the # of tenants in the instance without a degradation of performance. You should weigh the price of required resources against the customer’s need for isolation.  The more tenants share resources, the lower the cost for each tenant, which makes your application more attractive to a larger group of customers.  You could even have two different deployments of your application.  The more expensive deployment could be dedicated to each high-paying customer who needs isolation and higher performance.  The other deployment could be shared among the other customers who don’t need isolation or the very best performance but want lower pricing.


Provisioning customers for a single-tenant application is typically a bit more involved than for a multi-tenant application.  With a multi-tenant application, adding a customer is probably nothing more than a configuration update. But for a multi-instance application a new Azure instance will need to be configured as each new tenant is added.

A part of provisioning has to do with customizing the UI of the application. For a single-tenant application each customer will have their own instance running a customized version of the UI for that customer.  You can map a custom DNS name (using DNS CNAMEs) to each customer’s instance of the application.  So for Company1 and Company2, you might expose a URL for each based on http://myAccount/  This approach works fine for the HTTP protocol.  But for the secure HTTPS protocol a problem occurs with this strategy: when using HTTPS, only a single SSL certificate can be associated with the standard HTTPS port 443.   To remedy this you can have different Web sites within the same Web instance. This can be done by adding port numbers onto the URIs. For instance, you could have the following addresses for four of the tenants within the same Web instance.





You can also use a custom addressing scheme with the same core part of the URL but just change other parts of it per tenant.  For instance, depending on the configuration of your site and app you could have something like these two URIS for Company1 and Company2.

https://<myAccount> https://<myAccount>

There are other ways to provision and divide functionality among tenants to allow customization of their UI and processing.  You will have to decide just how much liberty to give customers to customize their applications.  It could be a simple change to a part of a page or using cascading style sheets. Or you can allow them to customize entire pages within their namespace.

You will need to ensure that technically a customer’s data is safe within both Azure storage and SQL Azure within a multi-tenant application. More importantly, you will also need to support the perception that the customer’s data is indeed safe in the shared tenant environment.

We mentioned earlier how the cost of using the compute instance is minimized with more customers.  The same applies to SQL Azure and the number of databases used.   For customers that need complete guaranteed data isolation it would make sense to allocate one database for each of them.  Other customers may be able to share rows in the same database table more economically.


In summation, tenancy in Azure applications is largely dependent upon your customers business and application requirements.  You should carefully examine your business and data storage requirements and weigh out the pros and cons of running in a shared vs. isolated environment.

A poor multi-tenant architecture can make the experience of using your application very frustrating for customers.  More importantly, it can also result in corrupted data.   Conversely, covering your eyes and not doing your research by simply relying upon a simplified single-tenant architecture can make your application cost-prohibitive to certain customers.  Take your time and ensure you get the correct balance of tenancy and instances to allow your application to take full advantage of the Windows Azure platform.

This is the first of a  three-part series on multi-tenancy within Windows Azure applications. Here is a breakdown of the topics for this post as well as the final two posts.

Post 1 –  Tenants and Instances

Post 2 – Combinations of Instances and Tenancy

Post 3 – Tenant Strategy and Business Application

Part 1 – Tenants and Instances

One of the prime economic motivations to running your application in a Cloud environment is the ability to distribute the cost of shared resources among multiple customers.  At least that’s what all the evangelists and marketing folks like to tell us, right?  But the harsh reality of life in the Cloud is that an application’s ability to run safely and efficiently with multiple customers does not just magically ‘happen’ simply by deploying it to Azure.  An application’s architecture must be explicitly and carefully designed to support running with multiple customers across multiple virtual machines (VMs).  Its implementation must prevent customer-specific processing from negatively affecting processing for other tenants.

Here I will attempt to simplify the concept of tenancy in Windows Azure by first defining tenants and instances. There are deeper levels this discussion could be taken as entire books have been written on multi-tenancy in shared computing/storage environment (which is what the Cloud is after all). So we will only touch the tip of the iceberg when it comes to the science of instance allocations per tenant and multi-tenant data access.   We will look at various configurations of instances and tenancy and how to structure your application and data.  And finally we will wrap up with some strategies for multi-tenancy and how business models relate to tenancy.


Let’s first agree upon a definition for the term tenancy and how it applies to single- and multi-tenancy.  There are a few different definitions of this concept that can cloud (no pun intended) one’s understanding. Some define it as the relationship of clients to application instances.  Wikipedia defines it as “…a single instance of the software runs on a server, serving multiple client organizations (tenants)”. While I appreciate this definition for the sake of practicality I want to expand upon it if I may.   There are architectures where you can also have “more than one instance of the software running on a server”.   How you provision tenants to those multiple instances is your choice. You could have one tenant (customer or company) per instance and all the users from that company dedicated to that instance.  Or you could have more than one customer sharing that one instance but logically partitioned from each other’s data and processing. Each of those customers (or tenants) can have one or more users accessing the application.

So for the sake of this paper let’s agree that tenancy refers to a “customer/billing” relationship.   Suppose we have a SaaS Azure application that we sell to ten different companies (our “customers”).  Each company has 5,000 employees using our Cloud application.  Using our billing-relationship definition if we sell that service to 10 different companies we don’t have 5,000 tenants. Rather we have 10 tenants because of our 10 customer/billing relationships.  Later in this paper we will look at more about this type of relationship in the Business Model and Tenancy section. 

Tenant Types

There are two types of tenant environments we will need to consider. The simplest type is a single-tenant application where one customer has 100% dedicated access to an application’s process space.  A single tenant application is much more predictable and stable by its nature since there will never be more than one dedicated customer at any point in time in that VM.  That customer has all of its users accessing that dedicated instance of the application.

Contrast that with a multi-tenant environment where more than one customer shares the application’s process space and data.   Due to requirements of security and performance isolation, it’s more difficult to build a multi-tenant application.  You may have to plan for added complexity and development/test time to synchronize access to shared data and resources.   For a Windows Azure multi-tenant application similar complexity is required to ensure data and resources do not get corrupted with multiple tenants in the same or multiple process spaces.

Realize there are variations on this theme.  You could theoretically have only one user per instance of a VM.  So if a company had 2000 users it would need 2000 VMs – not a very practical use of resources or costs.  More realistically, with our tenancy definition of a customer/billing relationship, you would have 2000 users from that same company sharing the same VM instance. So if you have 20 companies you’d have 20 VMs.  Each VM would handle one or more users for that specific company only.  Getting more complex in tenancy, we could have more than one company sharing an instance. So if you had 100 companies, you could provision 10 of them per instance and thus use 10 VMs. Each company would share a VM with 9 other companies, and you would have 10 VM instances of this. Confused? Hold that thought. We will break this down with pictures and stick figures (okay… maybe no stick figures!) as we progress in this article.

Why Multi-Tenancy?

Why would one want to go through the trouble of making a Windows Azure application multi-tenant? Wouldn’t it just be so much easier to make an application single-tenant and not have to worry about complicated synchronization and a more complex billing model? From an infrastructure and development standpoint, it’s the difference of cost vs. simplicity.  Whether making the choice of the simplicity of driving your own private car to work or reducing your commuting cost by riding with others on the public mass-transit system, you have to decide what’s right for you.  For various reasons you may decide not to share your ride with others and pay more for your own custom commuting environment.  Maybe your schedule is unique, or you are scared of what could happen health-wise or crime-wise from sharing a commuting vehicle with strangers. Or you don’t want your commute time to be affected in any way by other riders and like to leave your briefcase in the car each night when you get home.  So you choose to pay more and have a dedicated way to get to work.

But for those commuters who can’t or won’t spend the increased cost of private commuting the choice of shared public transportation is their best option.  It offers a simple pay as you go model where the cost you pay is minimized since it is distributed across many customers. So there is no right or wrong answer – it depends upon the rider requirements.

Compare this to a software standpoint where there could be various requirements – government regulations or demanding SLAs – that you need to support for your customers.  It’s a tradeoff, right?  You want the benefits of operation isolation and simplified design found with a dedicated single-tenant application.  But it will cost you more since you aren’t sharing the cost of the resources over many customers.  So the choice when deciding to use a multi- or single-tenant environment depends upon your application requirements.

Instances and Tenancy

In the case of Windows Azure a new instance means running on a new VM instance isolated from the other instances.  These instances can communicate with each other typically through queues or WCF calls.  Let’s see how these terms combine to define the different types of applications running under Windows Azure.

There are four possible combinations of instances and tenancy.  The standard convention is “INSTANCE-TENANT” format when describing these combinations. These are listed in approximate complexity of architecture with SIST being the simplest.

  1. SIST (single-instance, single-tenant)
  2. SIMT (single-instance, multi-tenant)
  3. MIST (multi-instance, single-tenant)
  4. MIMT (multi-instance, multi-tenant)

A quirky way that helps me remember these in terms of Windows Azure is to equate the following:

“Instance” = “Azure virtual machine (VM) instance”


“Tenant” = “Customers sharing of the VM”

So in my personal Azure terms, I like to define the remaining combinations as follows. These are listed in order of increasingly complex development.

1. SIST (Single-Instance, Single-Tenant)

  • “Only one Azure VM dedicated to one company”

2.  SIMT (Single-Instance, Multi-Tenant)

  • “Only one Azure VM shared among multiple companies”

3.  MIST (Multi-Instance, Single-Tenant)

  • “Many Azure VMs each dedicated to one company”

4.  MIMT (Multi-Instance, Multi-Tenant)

  • “Many Azure VMs shared among multiple companies”

 In the next post we will look in detail at all four of these combinations of tenancy.