Category: Azure Diagnostics/Monitoring


Microsoft announced its purchase of the startup MetricsHub, which specializes in software to monitor and scale Azure Web and Worker role instances.   For more information on the acquisition refer here.

I'm not sure why it took Microsoft this deep into the Azure lifecycle to obtain auto-scaling functionality. It's the single biggest question IT Ops folks tend to ask when viewing the "Instances" tab in the portal.  At the SDR last year they said auto-scaling was coming very soon, so it's unclear why they waited this long to finally acquire that capability.   Purchasing MetricsHub is an interesting but late acquisition for Microsoft. Better late than never, I guess…

In this post I will share my initial experience and observations of the MetricsHub tool. I will then compare it to one I use regularly – AzureWatch from Paraleap. Doing the comparison helps to show points of improvement for MetricsHub.

Evaluation of MetricsHub

The MetricsHub site does not give a lot of technical information. It's high-level and written for marketing folks, and even the blogs do not go into detail on how the product works. The functionality of the tool appears to be solely about adjusting scaling based upon rules.  Like AzureWatch, it appears to have the ability to monitor endpoints or queue length. The latter is a key factor in loosely-coupled environments for correct scaling.    It also requires an agent (not a favorite of customers) to get metrics on Azure VMs and Websites.  Just to be clear, there is no auto-scaling – just metrics – for VMs and Websites.

Without the agent the service is 100% free, which is very nice. Using normal Azure diagnostics there is roughly a 50-cent-per-month charge.  I'm not sure how they arrived at this number, since the service evidently uses YOUR Azure storage for the WAD tables/blobs.  By contrast, AzureWatch uses THEIR Azure storage and you pay them $0.01 per hour per monitored service.  If you turn up the monitoring levels versus light monitoring, YOUR storage costs can grow faster, and it is unclear where the 50-cent figure comes from, since it is Azure – not MetricsHub – that bills you for logging to the WAD tables in YOUR storage.  I was very unclear on that.  The quoted 50-cent/month rate is a bit misleading: if you add approximately $0.50 of storage per month and have no policy to truncate after 30 days, the second month your storage bill would be $1.00, the third month $1.50, and so on.
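To make the cumulative-cost point concrete, here is a quick back-of-envelope sketch. The numbers are illustrative only – this is my arithmetic, not actual Azure pricing:

```python
# Hypothetical illustration: if diagnostics add a fixed amount of billable
# storage each month and nothing is purged, the monthly bill grows linearly.

def monthly_storage_bill(month, added_per_month=0.50, retention_months=None):
    """Storage bill for a given month (1-based). With no retention policy
    the data -- and the charge -- accumulates; with a truncation policy
    the bill stays flat."""
    if retention_months is None:
        return round(added_per_month * month, 2)
    return round(added_per_month * min(month, retention_months), 2)

# No purge policy: $0.50, $1.00, $1.50 for months 1-3.
print([monthly_storage_bill(m) for m in (1, 2, 3)])
# With a 1-month (30-day) truncation policy the bill stays at $0.50.
print(monthly_storage_bill(3, retention_months=1))
```

This is why a purge policy matters as much as the advertised monthly rate.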

I like the fact that it has recommended default rules at setup, which can be tweaked later. That is a very nice feature, since otherwise after installing a tool you have to 'guess', based upon experience, which counters to monitor, what their values should be, and so on.  For the average Azure customer this is nice.  They also provide alerts in three forms (pager, email, SMS) and give you a detailed breakdown of your monthly bill.

When you click signup it takes you to the Azure store.  The only option I was given was the free "No-Agent" option: select the correct subscription and it's done. Very simple.  Under contact settings it uses the default login email for your Azure portal. You can change it, but you first have to check the box to be emailed promotions to enable the email field; make the change, then uncheck the promotions box if you want.  The Upgrade button just offers the original free (no-agent) option, so I am still not sure how to sign up for the Agent option.  The Connection Info button brings up a few "secret" values, but it's unclear where those are used or what to do with them. They are obviously for security purposes, but they are not related to the storage key for the account where MetricsHub will store the diagnostics data.

The Manage button is where all the magic happens.  You download the publish settings file first, then upload it to MetricsHub.   All my VMs and Cloud Services were displayed so I could choose which to monitor.  The VMs are only available using the Agent – and there is still no information on how to sign up for it – so these options were not available to me at this point.  Click "Start now" to begin monitoring.

From there you can click on specific roles to view data. You can use the default display or create a custom table or chart.  The default view contains generic CPU, Memory, Disk Read, Disk Write, NetIn, and NetOut values.  The custom table or chart option allows you to select only a subset of the Perfmon counters you have enabled for your application ahead of time.

MetricsHub has a link to display issues.  That worked nicely when I clicked it: my test app was over 80% CPU and it gave me a message about this. However, I did not receive an alert, nor did the issue show up in bold letters on the main page. Unless I clicked the Issues button I did not know about it.

AzureWatch and MetricsHub Comparison

One of the tools I use to monitor Azure applications is the AzureWatch tool. I like it because it is inexpensive, easy to use, has an incredibly powerful rules engine,  and runs from the Web so no setup is needed on your desktop.  Within AzureWatch you aggregate metrics (such as Perfmon counters). Then you write rules against those metrics and define actions to take when those Boolean rules evaluate to true.  Those actions include scaling up or down the number of compute instances, notification options, etc.
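Conceptually, the metrics → Boolean rules → actions model looks something like this. This is a simplified sketch of my own, not AzureWatch's actual engine, and the metric names are made up:

```python
# Minimal sketch of a metrics -> boolean rules -> actions pipeline.
# Rules are evaluated in order; the first one whose predicate is true
# determines the action, mirroring the behavior described in the post.

def evaluate_rules(metrics, rules):
    """Return the action of the first rule whose predicate is true."""
    for predicate, action in rules:
        if predicate(metrics):
            return action
    return None  # no rule matched; take no action

rules = [
    (lambda m: m["Avg5CPU"] > 80, "scale up by 1"),
    (lambda m: m["QueueLength"] > 100, "scale up by 1"),
    (lambda m: m["Avg5CPU"] < 20 and m["QueueLength"] == 0, "scale down by 1"),
]

print(evaluate_rules({"Avg5CPU": 9.8, "QueueLength": 0}, rules))
# -> scale down by 1
```

The actions themselves (adjusting instance counts, sending notifications) would be carried out by the monitoring service.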

So, being an AzureWatch user, I was curious how MetricsHub matched up with it. Here are some of the comparisons and observations I made.

Performance Counters

For MetricsHub I did not see an option to add counters beyond what your code or scripts have already told the Diagnostics Monitor to track. To me this is a big negative. AzureWatch allows me to add counters on the fly, which is HUGE for an IT Ops person, especially during troubleshooting.  For the average customer doing monitoring, the ability to add counters dynamically is a big feature that's missing from MetricsHub.

Custom Rules

AzureWatch also allows me to write custom rules and complex logic against those rules. I did not see the option to do that with MetricsHub.

User Interface

The MetricsHub UI is richer and more user-friendly than AzureWatch's. However, once someone is used to the AzureWatch UI this difference is minimal.

Coverage

In addition to auto-scaling Web/Worker Roles, AzureWatch also supports active monitoring of SQL Azure, SQL Federations, Service Bus, Azure Storage, and URLs (active monitoring means it queries these resources every minute rather than reading statistical tables). In contrast, MetricsHub only supports Web/Worker Roles without an agent, and Websites/VMs with an agent running on those machines.

Scaling Engine

This is a significant difference.  AzureWatch supports scaling via an unlimited number of potentially complex rules that can evaluate any number of metrics of all kinds.  It can auto-scale based upon standard or custom performance counters, combinations thereof, Service Bus queue/topic levels, storage queues, and more.  In addition, customers can define their own custom feeds of metrics that can be imported and used for auto-scaling or alerts.  In contrast, MetricsHub appears to auto-scale based only on CPU utilization and storage queues.  AzureWatch can also execute scaling or alerting actions on a schedule.

Something I was not able to find in MetricsHub is how to manage how quickly or slowly scaling occurs. There was also no way to let customers truly save money by scaling down at the end of the clock hour.  AzureWatch supports both of these options, which prevents a sudden short spike from causing new instances to be allocated for a surge in load that does not really exist.
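To illustrate what such safeguards might look like, here is a rough sketch under two assumed policies: only scale down near the end of the already-paid clock hour (classic Azure billed by the clock hour, so scaling down mid-hour saves nothing), and require the low-load condition to hold across a sustained window rather than a single sample. The thresholds and names are mine, not AzureWatch's:

```python
# Sketch of two scale-down safeguards:
# 1) only release an instance near the end of the paid clock hour;
# 2) require the metric to stay below threshold across a whole window,
#    so one short spike (or dip) cannot trigger a scaling action.

def should_scale_down(minute_of_hour, window_samples, threshold=20, last_minutes=5):
    sustained_low = all(s < threshold for s in window_samples)
    near_hour_boundary = minute_of_hour >= 60 - last_minutes
    return sustained_low and near_hour_boundary

# Sustained low CPU, 57 minutes into the hour: scale down now.
print(should_scale_down(57, [12, 9, 14, 11, 8]))      # True
# Same load at minute 20: wait -- the hour is already paid for.
print(should_scale_down(20, [12, 9, 14, 11, 8]))      # False
# One spike inside the window: not a sustained condition.
print(should_scale_down(57, [12, 95, 14, 11, 8]))     # False
```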

Performance Extensions

AzureWatch also has the ability to show performance data on mobile devices and via RSS feeds, and to export performance data to PDF/XLS/Word. I didn't find any such option in MetricsHub.

And the Winner Is…

From what I observed, the choice between AzureWatch and MetricsHub comes down to flash and looks versus power and flexibility. In general, a company whose sole business is scaling and monitoring is going to do a better job than a company (Microsoft) that is simply rounding out its array of tools and checking the box to say it has a monitoring package.  If I were a company picking one of the two products, support for monitoring the full variety of Azure middleware, and a sophisticated scaling engine capable of measuring any number of indicators with the ability to get fancy with complex Boolean logic and ordered rules – that's what I'd care about.

Overall, nothing about MetricsHub blows me away, and there is a lot of functionality it does not provide that AzureWatch gives me.  And the winner is… AzureWatch!

You can install AzureWatch for a free 14-day trial here.


Since the Windows Azure support in SCOM 2012 is not complete yet, you cannot directly view custom performance counters in SCOM 2012. Rather, you first have to use the SCOM 2007 R2 Operations Manager and Authoring Console to assist with this.  Here is information on how to do this.

DISCOVERY AND MONITORING FOR WINDOWS AZURE APPLICATIONS

Install the System Center Monitoring Pack for Windows Azure Applications from http://www.microsoft.com/en-us/download/details.aspx?id=11324.  Once you have installed the Azure Management Pack on your SCOM server, follow these steps:

  1. In SCOM 2007 R2 create the appropriate accounts in Operation Manager to connect to your Azure environment for Azure applications discovery.
  2. Configure Performance Monitoring for Windows Azure applications.
  3. Export the pack into SCOM 2012 and view the results of the performance counters collection.

CONFIGURE ACCOUNTS FOR DISCOVERY STEPS

Step 1 – Create Run AS Accounts:

You will need three "Run As" Accounts in System Center Operations Manager:

  • One for Binary Authentication. This account will use the Management Certificate to connect to Azure.
  • One for Basic Authentication. This account will be used for the Certificate and will store the password for the Certificate.
  • One that will be used for the proxy agent (Optional).

For detailed steps refer to the following documents:

http://oakleafblog.blogspot.fr/2011/09/installing-systems-center-monitoring.html

http://blogs.technet.com/b/dcaro/archive/2012/05/03/how-to-monitor-your-windows-azure-application-with-system-center-2012-part-2.aspx

Step 2 – Configure Windows Azure Management Pack Template:

1. Click the Authoring button in the left pane, select the Authoring \ Management Pack Templates \ Windows Azure Application and click the Add Monitoring Wizard task in the right pane to open the Monitoring Wizard’s Select Monitoring Type dialog with Windows Azure Application selected.

2. Click Next to open the Name and Description dialog. Type a Name and Description for the service and click the new button to open the Create a new Management Pack dialog. Type a Name e.g. MY AZURE MP, which fills in the ID value, Version number, and Description.

3. Click Next to Open the Application Details dialog, type the service’s DNS prefix, e.g. MYDEVAPP1, copy the Subscription ID from the Developer portal and paste it in the text box, accept the default Production as the Environment to Monitor. Using the accounts you previously created, select the Binary Authentication account in the Azure Certificate Run As Account and the Basic Authentication account for the Azure Certificate Password Run As Account.

4. Click Next to open the Select Proxy Agent dialog and click Browse to open another Select Proxy Agent dialog. Click Search to list computers on your network and select an agent-managed computer to act as a proxy agent for the Windows Azure application, e.g. ITPRODC.

5. Click OK to close the dialog and click next, which displays a message. Click Yes to distribute the account to the selected Proxy Agent and open the Summary dialog.

6. Click Create to create the new Management Pack for the MYDEVAPP1 with Azure Applications Discovery.

7. Click the Monitoring button in the left pane, select the Monitoring \ Distributed Applications node to open the Distributed Applications list, and select the new MYDEVAPP1 hosted service monitor (Healthy state indicates monitoring is occurring):

8. Verify that Detail Views of the Deployment State, Hosted Service State, Role Instance State and Role State correspond to the known current state of the service.

CONFIGURE PERFORMANCE MONITORING FOR WINDOWS AZURE APPLICATIONS STEPS

  1. Create Performance Collection Rule using existing Performance Counters
  2. Create A Performance Rule using Custom Azure Counters
  3. Export the SCOM 2007 R2 Management Pack into SCOM 2012

Step 1 – Create Performance Collection Rule using Standard Performance Counters

In Operations Manager Console for 2007 R2 open Rules and select the following Targets:

  • Windows Azure Deployment
  • Windows Azure Hosted Service
  • Windows Azure Role
  • Windows Azure Role Instance
  1. Open Create Rule Wizard and select Collection Rules Node.
  2. Under the Performance Collection Node select the Windows Performance Node and select the Custom Management Pack (i.e. My Azure MP in our case).
  3. Provide the rule name e.g. “ASP.NET Applications Requests/sec (Custom)”.
  4. Select Performance Collection as the rule category.
  5. Select the Target as Windows Azure Role Instance (This will collect data for all role instances across hosted services you are monitoring).
  6. To collect data for instances within a particular hosted service, select the Windows Azure Role Instance (i.e. MYDEVAPP1 in our case).
  7. On the next Page select Performance Object, Counter, and Instance as:

Object Name: ASP.NET Applications
Counter Name: Requests/Sec
Instance Name: __Total__
All Instances: True or False

  8. Leave the Optimized Performance Collection Settings at the default, which means optimization will not be used.
  9. Save the performance collection rule.

Step 2 – Import the Management Pack into Authoring Console

  1. Open Authoring Console for Operations Manager 2007 R2.
  2. Connect to Management Group (i.e. “DEVMANAGEMENTGROUP”).
  3. Click on Tools and select Import Management Pack.
  4. Select the Custom MP created earlier (i.e. “My Azure MP”).  The Management Pack should successfully load in the Authoring Console.
  5. Open the Health Model Tab and click on Rules.
  6. You should see the same performance collection rule (Check display name to display correct rule names).
  7. Open the Rule Properties and click on Modules tab.
  8. Under Data Sources delete the existing Data Source values.
  9. Click on Create and select “Microsoft.SystemCenter.Azure.RoleInstance.PerformanceCounter.CollectData.DS” type entry.
  10. Give a custom name of ModuleID (i.e.  AzureDS).
  11. Edit the Data Source just created to adjust values for Counters and InstanceName.
  12. Provide the following values in Data Source module:

<Configuration>
  <IntervalSeconds>300</IntervalSeconds>
  <TimeoutSeconds>120</TimeoutSeconds>
  <CounterName>Requests/Sec</CounterName>
  <ObjectName>ASP.NET Applications</ObjectName>
  <InstanceName>__Total__</InstanceName>
  <AllInstances>false</AllInstances>
</Configuration>

13. Save those values and save the Management Pack in SCOM 2007 R2 Authoring Console.

14. Export the Management Pack back to the SCOM 2007 R2 Operations Manager Console.

15. Make sure you can see the Performance Data in the Console by creating a new Performance View.

Step 3 – Create Performance Collection Rule using Custom Azure Counters

  1. Create another Performance Collection Rule for Azure in the same manner mentioned above for standard counters but this time use a custom counter.
  2. Ensure the values under the Data Source module are as follows:

Object Name: CustomCategory
Counter Name: TotalnumberofFileUpload
Instance Name: (empty)
All Instances: False

OR

<Configuration>
  <IntervalSeconds>300</IntervalSeconds>
  <TimeoutSeconds>120</TimeoutSeconds>
  <CounterName>TotalnumberofFileUpload</CounterName>
  <ObjectName>CustomCategory</ObjectName>
  <InstanceName></InstanceName>
  <AllInstances>false</AllInstances>
</Configuration>

(This means the InstanceName should be empty and AllInstances set to false).

NOTE – You need to follow all of these steps in an Operations Manager 2007 R2 setup or Management Group.

  1. Save the Management Pack and export it to Operations Manager 2007 R2 Management Group.
  2. Create a new Performance Collection View under Monitoring to see the Data related to Custom Counters.
  3. Once all the rules are created, and you can see the data under the Performance Collection View for custom counters, save the Management Pack as an XML file using the Authoring Console.

EXPORT THE SCOM 2007 R2 MANAGEMENT PACK INTO SCOM 2012 STEPS

  1. Copy the saved Management Pack file to Operations Manager 2012 Management Server. Import the Management Pack using Operations Manager 2012 Console.
  2. Open Authoring Console on the same Management Server and select “import from Management Group” and select the Custom Azure MP in the list.
  3. Now open the Health Model and click on Rules. Select the Custom Counter Performance Collection Rule.
  4. Check the Data Sources in the Module and confirm that the correct Modules for Azure, Operations Manager and Operations Manager DW Database are selected.
  5. Change the Data Sources if required and save the Management Pack in the Console.
  6. Now export the Management Pack to Operations Manager 2012 Management Group using Authoring Console.
  7. Be sure to change the Azure Run As Accounts (if different from Operations Manager 2007 R2) used in the Windows Azure Management Pack Template for the Azure discovery process in Operations Manager 2012.
  8. Also change the Proxy Agent in Azure MP Template to new Operations Manager 2012 Management Server.
  9. Save all the settings made in the Azure MP Template in same custom MP.
  10. Restart the Health Service on the Management Server and make sure all Azure Role Instances, Hosted Services and Deployment are successfully discovered.
  11. You should also see the same Performance Collection View in the Console as in Operations Manager 2007 R2 Console.
  12. Now you should see the same Performance Data for Custom Counters in the View.
  13. Similarly you can create more Rules for different Custom Performance Counters.

Important: In case you get the Event ID 34024, make sure to apply an Override and change the Diagnostic Connection String used in Windows Azure with your Custom Connection String.

Note: All of this is possible only if you can see the custom performance counters on the Azure VM or server where the Azure role instances are running.

Before doing this you need to configure AzureWatch to point to your Azure subscription.

Local Perfmon

The Control Panel client version of AzureWatch uses your local Perfmon (where the tool is running) to display counters.  So you can go to Perfmon and add all instances of the AzureWatch counters if you choose to view it there instead of using AzureWatch.

 

How to View an Azure Queue

Go to Raw Metrics, right click on Managed Queue Counters and add the queue to be viewed.

Once you assign the raw metric of the queue, bring up Aggregated Metrics for the role that will be writing to the queue. Right click and add new Aggregate. Use a value of 2 mins for queues (5 mins for perfmon counters) and ‘Use latest value’.

Click on Publish Changes and wait a few minutes.

Within AzureWatch click on Metrics View – Live to see the current number of items in the queue.

Using Perfmon Counters

Simple formula to remember

  1. Create a Raw Metric
  2. Create an Aggregate
  3. Create a Rule
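The three-step formula can be pictured like this. This is a simplified sketch with made-up sample data – AzureWatch does the equivalent on its own servers:

```python
# Step 1: raw metric = perfmon samples collected as (minute, value) pairs.
# Step 2: aggregate = average of the samples inside a 5-minute interval.
# Step 3: rule = a boolean test against the aggregate.

def aggregate_average(raw_samples, interval_minutes=5):
    """Average the raw samples that fall inside the interval."""
    window = [v for minute, v in raw_samples if minute < interval_minutes]
    return sum(window) / len(window)

raw = [(0, 10.0), (1, 30.0), (2, 20.0), (3, 20.0), (4, 20.0), (7, 99.0)]
avg5 = aggregate_average(raw)   # only minutes 0-4 count; the 99.0 is outside
rule_triggered = avg5 > 20      # the boolean rule
print(avg5, rule_triggered)
```

Queues skip step 1's averaging – the post's advice of a 2-minute interval with 'Use latest value' amounts to taking the newest sample instead of an average.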

For the desired role click on Raw Metrics.

In the Raw Metrics Window right click on Managed Performance Counters and select Add New.

In the Performance Counter Properties dialog choose the category and the counter. When you hit okay it should now show up in the Raw Metrics box.

To create an aggregate, choose the raw metric you just created, calculate an average, and use 5 mins interval for perfmon.  Click OK.

Click Rules under the instance you want to monitor, right click New to bring up Rule Edit box.

Add a Boolean expression to trigger an alert.  You can make this as simple or complex as you like. Here I am using 20% to make the perfmon counter trigger. It will send an email to my account (configured earlier in setup) and I can also choose to do auto-scaling. Here I have arbitrarily decided to scale down (by one instance) since it's faster than scaling up, for demo purposes.

Click OK and Publish Changes.

Start the app you are monitoring and view the counters either in Perfmon or in AzureWatch.

You should also have an email notification sent to the email address you configured at the start. The rules are evaluated in order, and the first one that matches will trigger. Here I had previously configured a rule to trigger when the ASP Requests Queued was >5.

Scaling action was successful
Rule mikesscaledownrequest triggered itops\Production\AspWebRole to perform scale action: 'Scale down by'
From instance count of 2 to 1
Rule formula: MikeRequestsQueued<5
All known parameter values:
Average20CPUTime: 9.82 (metric count: 31)
Avg5UnresponsiveInstanceCount: 0 (metric count: 5)
MikeRequestsQueued: 0 (metric count: 7)

You can also view the perfmon counters in the Dashboard view which shows each role and its corresponding items being monitored.

Window Management

Moving and displaying the windows was a bit of a challenge. For some reason no menu bar showed up on my display.  When you start up, the Explorer window is present, and that's what you need to open the other windows. If you close it and have no menu, there is no way to bring it back up except restarting the tool. Clicking items in the Explorer window brings up their corresponding windows.  If you grab a window and drag it, a cross with arrows appears, letting you move it to the desired location by dropping it on one of the arrows.

Here I am dragging the explorer window to drop it on one of the arrows.

 

Instance Scaling Limits

You can configure the monitored app to not scale past or below a certain # of instances.

From the Explorer window double click on one of the roles.

Enter a minimum and maximum number of instances; scaling will never go below or above these limits, regardless of the rules you have configured. Remember, in the rules you can scale up or down when the rule is true.

I got to sit in on a demo of AzureWatch today from Paraleap Technologies' founder and CEO, Igor Papirov. Great guy, and an awesome tool for monitoring your Azure applications. Here are some key features of the tool.

  • You DO have to enable DiagMon in your code
  • Unlike AzureOps or SCOM, you DON’T need to enable perfmon counters ahead of time via code or PowerShell scripts to have them show up in AzureWatch. It’s like W2K8 Server perfmon where you just select the counters you want to use for the app you want to monitor.
  • AzureWatch doesn’t require making any changes to your code or VM and just consumes the data that’s produced from Windows Azure Diagnostics.
  • With AzureWatch you do not need to install any agents anywhere to get perfmon data
  • AzureWatch currently runs its configuration tool as a client application, which is being moved to the Web. The monitoring service itself runs from their cloud-based servers.
  • It does not manage logging or displaying any log/trace file entries. But it does log errors AzureWatch encounters when monitoring.
  • Uses the WADPerformanceCountersTable in your storage account to both write and read data.
  • Configurable alerts and thresholds for emails and auto-scaling
  • You can store the config settings on the AzureWatch servers so you can access from any client platform and have access to them
  • Configurable auto-scaling (up or down) using many different counters and scale units you configure
  • View monitoring using RSS feed capabilities, email, mobile app/phone, or online
  • Monitors Windows Azure SQL Database instances and Windows Azure SQL Federations
  • Provides historical reports/views of past data captured and can export it to Excel
  • View custom perfmon counters (defined in your .NET code)
  • Monitors Azure storage queues
  • You have the option to define custom aggregates on your raw data (average, total, max, min, etc)
  • Once you define raw metrics and aggregations you can then define boolean rules (simple or complex) to either send an alert or to configure auto scaling (up or down)
  • You cannot leverage a single representation of a rule across multiple role instances but you can cut and paste the rules easily in the designer to span more than one instance
  • Very inexpensive, pay-for-what-you-use pricing – about 1.5 cents per hour to run it
  • http://azurewatch.net to sign up for free trial

I have been recently playing around with SCOM 2012 and the System Center Monitoring Pack for Windows Azure Applications (http://www.microsoft.com/en-us/download/details.aspx?id=11324).  My goal was to understand how to monitor and view Performance counters for my Azure service using SCOM.  I had no previous experience with SCOM so this was a new adventure for me.

Overall I found SCOM very powerful, but not as straightforward to use as I had hoped.  There are more intuitive tools, like AzureOps from OpStera, to monitor Azure services and applications.  I had to create Run As accounts for Binary Authentication (for the certificate and private key) and Basic Authentication (for the certificate's password). I then created a management pack, which serves as a container for other SCOM entities.  From there I derived a monitoring pack from the Windows Azure Application template; this is where I added the Azure-specific values to uniquely identify to SCOM the Azure service I wanted to monitor.   Finally I created rules, one per performance counter I wanted to monitor.  Rule creation has a wizard (as most SCOM tasks I tried did), but a few of the fields were not straightforward to complete, such as role instance type.

Counters used for an Azure application are a subset of those you would use for a Windows Server 2008 application.  For my Azure application I decided to use a sampling rate of one minute (1-2 is recommended) and a transfer rate of every 5 minutes. The transfer rate is how often Diagnostics Monitor will move the counter data from local storage into Azure storage.   I used the following Perfmon counters which are typical ones you would use in your Azure monitoring process.  The counters I monitored for a worker role are a subset of those I monitored for a Web role because the worker role does not include any IIS or ASP.NET functionality.
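A quick back-of-envelope shows what those rates mean in practice (illustrative arithmetic only; row sizes and storage pricing are left out):

```python
# With a 1-minute sample rate and a 5-minute transfer rate, each transfer
# moves ~5 samples per counter, and each counter writes ~1440 rows per
# instance per day into the WAD performance counter table.

sample_rate_min = 1
transfer_rate_min = 5

samples_per_transfer = transfer_rate_min // sample_rate_min
rows_per_counter_per_instance_per_day = 24 * 60 // sample_rate_min

print(samples_per_transfer, rows_per_counter_per_instance_per_day)  # 5 1440
```

Multiply that by the number of counters and instances you monitor and it is easy to see why sampling and transfer rates matter for storage costs.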

Counters for Web Roles

The following counter is used with the Network Interface monitoring object.

Bytes Sent/sec – Helps you determine the amount of Azure bandwidth you are using.

The following counters are used with the ASP.NET Applications monitoring object.

Request Error Events Raised – If this value is high you may have an application problem.  Excessive time in error processing can degrade performance.

(__Total__)\Requests/Sec – Shows how your application is behaving under stress. If this value is low while other counters show a lot of activity (CPU or memory), there is probably a bottleneck or a memory leak.

(__Total__)\Requests Not Found – If many requests are not found, you may have a virus or something wrong with the configuration of your Web site.

(__Total__)\Requests Total – Displays the throughput of your application.  If this is low and CPU or memory are being used in large amounts, you may have a bottleneck or memory leak.

(__Total__)\Requests Timed Out – A good indicator of performance. A high value means your system cannot turn over requests fast enough to handle new ones.  For an Azure application this might mean creating more instances of your Web role until the timeouts disappear.

(__Total__)\Requests Not Authorized – A high value could mean a DoS attack. You can possibly throttle these requests to allow valid ones through.

Counters for Web and Worker Roles

For both worker and Web roles here are some counters to watch for your Azure service/application.

The following counter is used with the Processor monitoring object.

(_Total)\% Processor Time – One of the key counters to watch in your application. If this value is high along with the number of Connections Established, you may want to increase the number of cores in the VM for your hosted service.  If this value is high but the number of requests is low, your application may be taking more CPU than it should.

The following counters are used with the Memory monitoring object.

Available Mbytes – If this value is low you can increase the size of your Azure instance to make more memory available.

Committed Bytes – If this is constantly increasing, it makes no sense to increase the Azure instance size, since you most likely have a memory leak.

The following counters are used with the TCPv4 monitoring object.

Connections Established – Shows how many connections have been made to your service. If this is high and the Processor Time counter is low, you may not be releasing connections properly.

Segments Sent/sec – If this value is high, you may want to increase the Azure instance size.

Summary

In summary, using Perfmon counters is a valuable way to indirectly keep an eye on your application's use of Azure resources. Performance counters are often most effective when used in conjunction with each other. For instance, if you see a lot of memory being used you might want to check CPU utilization. If CPU is high, the load is genuinely consuming the memory and you need to scale up. If CPU is low, you probably have an issue with how the memory is being allocated or released.
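That triage logic can be summarized in a tiny decision sketch (the thresholds here are illustrative placeholders, not recommendations):

```python
# The memory/CPU triage from the summary as a small decision function:
# low available memory + high CPU -> real load, scale up;
# low available memory + low CPU  -> suspect allocation/release problems.

def triage(available_mbytes, cpu_percent, low_mem=200, high_cpu=75):
    if available_mbytes >= low_mem:
        return "memory ok"
    if cpu_percent >= high_cpu:
        return "scale up: real load is consuming memory"
    return "investigate: likely a leak or allocation problem"

print(triage(available_mbytes=150, cpu_percent=85))
print(triage(available_mbytes=150, cpu_percent=10))
```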

You can use SCOM to track Perfmon if you know how to use it and your company has invested financially in a license.  Remember SCOM is a very rich and robust enterprise-scale tool with a ton of functionality. For instance, once you configure your hosted service as a monitoring pack, you can then view it in the Distributed Applications tab. This gives you consolidated and cascading summaries of the performance and availability of your Azure service.

If you don’t own or use SCOM, or if you merely want to keep it simple, then AzureOps is probably an easier option. It also has no installation/setup as well (runs as a Web service) and simple Azure auto-scaling based upon Perfmon threshold values. (http://www.opstera.com/products/Azureops/).

A special thanks to Nuno Godhino for providing me information for this post.

In my previous post, I discussed separate storage accounts and the locality of those accounts as well as transfer, sample and trace logging levels as ways to optimize using Diagnostics using Windows Azure. This post discusses six additional ways to optimize your Windows Azure Diagnostic experience.
  1. Data selection– Carefully select the minimal amount of diagnostic data that you need to monitor your application.  That data should contain only the information you need to identify the issue and troubleshoot your application.  Logging excess data increases the clutter of looking through logs data while troubleshooting and costs more to store in Windows Azure.
  2. Purge Azure diagnostic tables – Periodically purge the diagnostic tables of stale data that you will not need any more, to avoid paying storage costs for dormant bits. You can store it back on-premises if you feel you will need it later for historical or auditing purposes.  There are tools to help with this, including the System Center Monitoring Pack for Windows Azure.
  3. Set Perfmon Counters during role OnStart– Diagnostics are set per role instance. Due to scalability needs the number of role instances can increase or decrease. By putting the initialization of the Perfmon counters in the OnStart method (which is invoked when a role instance is instantiated) you can ensure your role instance will always start configured with the correct Perfmon counters.  If you don’t specifically setup the counters during the OnStart method, the configurations might be out of sync.  This is a common problem for customers who do not define the Perfmon counters in OnStart.
  4. Optimize performance counters – Sometimes diagnostics are like gift purchases before a Christmas sale: you only need a few items, but the low prices tempt you into buying more than you need. The same goes for performance counters. Make sure the counters you gather are meaningful to your application and will be used for alerts or analysis. Windows Azure provides a subset of the performance counters available for Windows Server 2008, IIS, and ASP.NET. Here are some categories of commonly used Perfmon counters for Windows Azure applications; each category can contain more than one actual counter to track:
    1. .NET CLR exceptions
    2. .NET CLR memory
    3. ASP.NET process and app restarts
    4. ASP.NET requests
    5. Memory and CPU
    6. TCPv4 connections
    7. Network Interface (Microsoft Virtual Machine Bus Network Adapter) bytes
  5. Manage max buffer size – When configuring Perfmon counters in your role’s OnStart method, you can specify a buffer size via the PerformanceCounters.BufferQuotaInMB property of the DiagnosticMonitorConfiguration object. If the buffer fills up before its contents are transferred from local to Azure storage, you will lose the oldest events. Make sure your buffer size has room to spare to prevent loss of diagnostic data.
  6. Consider the WAD config file – In some cases you may want to configure Perfmon counters and logs in a config file instead of in the role’s code. For instance, if you are using a VM that has no startup routine, or you need non-default diagnostic operations, you can use the WAD config file (diagnostics.wadcfg) to manage that. The settings in the config file are applied before the OnStart method is called.
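To make item 2 concrete, here is a minimal sketch of purging old WADLogsTable rows, written against the Storage Client 2.x table API (the connection string placeholder and the 30-day cutoff are illustrative, not prescriptive). WAD tables use a PartitionKey of "0" followed by the event time's tick count, so a lexical comparison on PartitionKey is also a chronological one:

```csharp
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

class WadPurge
{
    static void Main()
    {
        var account = CloudStorageAccount.Parse("<your diagnostics storage connection string>");
        var table = account.CreateCloudTableClient().GetTableReference("WADLogsTable");

        // WAD partitions rows by "0" + UTC ticks (zero-padded to 19 digits),
        // so this string cutoff filters by time server-side.
        string cutoff = "0" + DateTime.UtcNow.AddDays(-30).Ticks.ToString("D19");
        var query = new TableQuery<DynamicTableEntity>().Where(
            TableQuery.GenerateFilterCondition(
                "PartitionKey", QueryComparisons.LessThan, cutoff));

        // Delete everything older than the cutoff, one entity at a time.
        foreach (var entity in table.ExecuteQuery(query))
            table.Execute(TableOperation.Delete(entity));
    }
}
```

In production you would batch the deletes per partition (or archive the entities on-premises first, per the tip above), but the partition-key trick is the core of it.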
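Items 3 and 5 come together in a role's OnStart. This is a sketch against the WAD 1.x API; the specific counter, sample rate, and quota values are illustrative:

```csharp
using System;
using Microsoft.WindowsAzure.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        DiagnosticMonitorConfiguration config =
            DiagnosticMonitor.GetDefaultInitialConfiguration();

        // Item 3: define the counters here so every new instance that
        // scaling brings online starts with the same configuration.
        config.PerformanceCounters.DataSources.Add(new PerformanceCounterConfiguration
        {
            CounterSpecifier = @"\Processor(_Total)\% Processor Time",
            SampleRate = TimeSpan.FromMinutes(1)
        });
        config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(5);

        // Item 5: leave headroom in the local buffer so the oldest events
        // are not discarded before the transfer to Azure storage runs.
        config.PerformanceCounters.BufferQuotaInMB = 512;

        DiagnosticMonitor.Start(
            "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);
        return base.OnStart();
    }
}
```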
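And for item 6, a minimal diagnostics.wadcfg sketch expressing the same counter declaratively (intervals use ISO 8601 durations; the quota and interval values are illustrative):

```xml
<DiagnosticMonitorConfiguration
    xmlns="http://schemas.microsoft.com/ServiceHosting/2010/10/DiagnosticsConfiguration"
    configurationChangePollInterval="PT1M"
    overallQuotaInMB="4096">
  <PerformanceCounters bufferQuotaInMB="512" scheduledTransferPeriod="PT5M">
    <PerformanceCounterConfiguration
        counterSpecifier="\Processor(_Total)\% Processor Time"
        sampleRate="PT1M" />
  </PerformanceCounters>
</DiagnosticMonitorConfiguration>
```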

Windows Azure Diagnostics enables you to collect diagnostic data from an application running in Windows Azure. This data can be used for debugging and troubleshooting, measuring performance, monitoring resource usage, traffic analysis and capacity planning, and auditing. In this blog post, I will share some optimizations that I have learned while working with Windows Azure Diagnostics during my time at Aditi Technologies.

  1. Keep separate storage accounts – Keep your Azure diagnostics data in a different storage account than your application data; there is no additional cost to do this. If you ever need Microsoft support to muck through your diagnostics storage, you won’t have to give them access to potentially sensitive application data.
  2. Locality of storage and application – Make sure to keep your storage account in the same affinity group (data center) as the application writing to it. If for some reason you can’t, use a longer transfer interval so data is transferred less frequently but in larger batches.
  3. Transfer interval – For most applications, I have found that a transfer once every 5 minutes is a very useful rate.
  4. Sample interval – For most applications, setting the sample rate to once per 1-2 minutes provides a good yet frugal sampling of data. Remember that when your sampled data is moved to Azure storage, you pay for everything you store. So store enough information to get a true window into that performance counter, but not so much that you pay unnecessarily for data you won’t need.
  5. Trace logging level – The Verbose logging filter for tracing gives you lots of good information, but it is also very chatty and your logs will grow quickly. Since you pay for what you use in Azure, only use the Verbose trace level when you are actively working on a problem. Once it is solved, scale back to the Warning, Error, or Critical levels, which write far fewer messages to the logs.
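In code, the last three knobs above map onto the WAD 1.x configuration roughly as follows (a sketch for a role's OnStart; the counter chosen and the values shown are just the recommendations above):

```csharp
using System;
using Microsoft.WindowsAzure.Diagnostics;

// Inside the role's OnStart:
DiagnosticMonitorConfiguration config =
    DiagnosticMonitor.GetDefaultInitialConfiguration();

// Sample interval: once every 1-2 minutes is usually enough.
config.PerformanceCounters.DataSources.Add(new PerformanceCounterConfiguration
{
    CounterSpecifier = @"\Memory\Available MBytes",
    SampleRate = TimeSpan.FromMinutes(1)
});

// Transfer interval: move buffered data to storage every 5 minutes.
config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(5);
config.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(5);

// Trace logging level: Warning and above unless actively debugging.
config.Logs.ScheduledTransferLogLevelFilter = LogLevel.Warning;

DiagnosticMonitor.Start(
    "Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);
```

Dropping the filter to LogLevel.Verbose while chasing a bug, then restoring it, is a one-line change here.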

Stay tuned for my next post where I will write about the six additional ways to optimize the use of Windows Azure Diagnostics.