For the past two months I’ve had the good fortune of working on an Azure cloud project. Our project team is using many Azure features, including web roles, worker roles, storage (table, blob, and queue), and SQL Azure, and we are also calling native/unmanaged code.
It’s been quite a wild ride for the last two months.
Azure offers up many new features that can make projects and development easier, but in many ways Azure makes projects more difficult. If our team knew two months ago what we know now about Azure, we’d probably be farther along in the project - and we may have approached the project a little differently.
At the time of this writing Azure is in a “CTP” state, so there are bugs and the technology is subject to change in the near future. However, I think there are some high level takeaways from the technology that you will want to account for on your Azure project:
- Differences between the local Azure dev fabric and “live” Azure
- Minimize web.config and app.config usage
- Visibility into your application
- Availability of Azure environments for your team
- Deployment time
Azure “dev fabric” vs. the real thing
What you develop on your own computer with the Azure SDK and the Azure Development Fabric won’t necessarily work on the live Azure site. Even better, features of Azure that work on the live Azure site won’t necessarily work on your local dev fabric. For example, programmatic creation of new tables in Table Storage is not allowed in the local dev fabric. We’ve also seen problems where our native C++ code works fine in the dev fabric but won’t run in the real Azure cloud.
This local-versus-live issue also applies to SQL Azure, as it is a completely different animal than SQL 2005 or SQL 2008. The database scripts you use in your local dev environment were likely generated from a SQL Management Studio script wizard (or a third-party tool) and likely won’t pass the SQL Azure litmus test. SQL Azure offers a limited subset of the Microsoft SQL Server features you are used to. I guarantee SQL Azure will bark at you about things it doesn’t like in your auto-generated scripts. You are asking for trouble if you run your generated scripts in SQL Azure for the first time the morning before a live demo.
You will need to handle these challenges through frequent test deployments to a live Azure environment. There is no other way to make sure your app works in the cloud than to actually deploy it to the cloud. Make time in your development process to deploy often to Azure and test your features.
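A cheap pre-check does not replace a real deployment, but it can flag the most obvious script problems before you get there. The sketch below is a hypothetical helper that scans a generated script for constructs you have already seen SQL Azure reject. The three patterns and the sample script are placeholders of my own choosing, not an authoritative list; swap in whatever your own SQL Azure server complains about.

```python
import re

# ASSUMPTION: illustrative patterns only, not an official compatibility list.
# Replace them with the constructs your SQL Azure server actually rejects.
SUSPECT_PATTERNS = {
    "USE statement": re.compile(r"^\s*USE\s+\[?\w+\]?", re.IGNORECASE),
    "filegroup clause": re.compile(r"\bON\s+\[PRIMARY\]", re.IGNORECASE),
    "NOT FOR REPLICATION": re.compile(r"\bNOT\s+FOR\s+REPLICATION\b", re.IGNORECASE),
}


def find_suspect_lines(script_text):
    """Return (line_number, label, stripped_line) for every pattern match."""
    findings = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        for label, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label, line.strip()))
    return findings


# Hypothetical wizard-style output, used only to exercise the check.
SAMPLE_SCRIPT = """USE [MyDatabase]
CREATE TABLE [dbo].[Orders](
    [Id] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL
) ON [PRIMARY]
"""

if __name__ == "__main__":
    for number, label, line in find_suspect_lines(SAMPLE_SCRIPT):
        print(f"line {number}: {label}: {line}")
```

Running a check like this as part of the build keeps script surprises out of demo mornings, though only a real run against SQL Azure tells you the script is clean.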
Minimize web.config and app.config usage
In an Azure application package, web.config and app.config files are “compiled” into a single package file (.cspkg) along with binary and static content. You deploy that package to Azure and start up your web or worker roles. If you need to make a configuration change to a setting in a config file, you need to re-package your entire Azure application, suspend your Azure roles, upgrade them, and finally spin up your roles again.
This is obviously a lengthy process for an otherwise simple change - especially if you merely need to make a config change on a test site (which you will do very often during development and testing).
In short, don’t use web.config or app.config files for your configuration settings. Instead, leverage the Azure configuration settings files (.cscfg) to store your settings or obtain them from a database table or another storage location. Azure configuration settings can be updated while your Azure roles are running, and updating them does not require the roles to be suspended.
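One way to make this easy to follow is to route every settings lookup through a single accessor that reads from an updatable source on each call. The sketch below shows the idea in Python rather than .NET. The `Settings` class, the `read_setting` hook, and the `LogLevel` name are all hypothetical, and a plain dictionary stands in for the service configuration.

```python
class Settings:
    """Look up settings from a source that can change while the app runs."""

    def __init__(self, read_setting):
        # read_setting: callable(name) -> str or None. HYPOTHETICAL hook; in a
        # real role it would wrap whatever call your SDK version exposes for
        # reading .cscfg values, or a query against a settings table.
        self._read_setting = read_setting

    def get(self, name, default=None):
        # Read on every call instead of caching, so an updated value is
        # picked up without repackaging or redeploying anything.
        value = self._read_setting(name)
        if value is None:
            if default is None:
                raise KeyError(f"missing setting: {name}")
            return default
        return value


if __name__ == "__main__":
    # Stand-in for the service configuration: a dict that can be edited live.
    service_config = {"LogLevel": "Info"}
    settings = Settings(service_config.get)

    print(settings.get("LogLevel"))
    service_config["LogLevel"] = "Debug"  # simulates an in-place config update
    print(settings.get("LogLevel"))
```

Because the rest of the app only ever calls `settings.get(...)`, moving a value from the service configuration to a database table later is a one-line change where the accessor is constructed.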
This may be the simplest lesson you can take away from this article. It’s easy to account for and it will save you tons of time during development.
Visibility Into Your Application
When you run your app in the Azure cloud, it’s kind of like driving a car blindfolded: you might get to your destination but do you know if you hit any mailboxes? Since your app is “in the cloud” you can’t obtain any operating system debug output or use any operating system tools to help you see what is going on in your app. Azure provides some logging features, but at the time of this writing those logging features are disabled. The rumor is that new logging features are coming.
Granted, the new logging features may solve all of your problems with visibility into your app, but until those features arrive you will need to provide your own visibility.
I highly recommend using a logging library such as log4net, as it will allow you to write diagnostic and error statements consistently from anywhere in your app (including web roles and worker roles). Through configuration you can direct those logging statements anywhere you choose, such as database tables.
On our project we’ve favored logging to SQL tables, but you may want to consider logging to Azure Table storage instead, as I believe it is cheaper by the byte (please do your own research to verify that). The downside is that Table storage isn’t as easily queried as SQL tables. Logging to text files won’t help much, as you can’t really get at them in the cloud.
If the Azure logging features play out nicely, you may want to consider looking at this Azure log4net appender: http://neilmosafi.blogspot.com/2009/01/azure-event-log-and-log4net.html.
Through logging, you’ll be able to get a near-real-time snapshot of what is going on in your app.
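To make the "log to a table" idea concrete, here is a minimal sketch using Python's standard `logging` module, where a custom handler plays the role that an appender plays in log4net. It is not our project's code: the `SqlTableHandler` class, the `app_log` table, and the `worker_role` logger name are hypothetical, and an in-memory SQLite database stands in for a SQL Azure table.

```python
import logging
import sqlite3


class SqlTableHandler(logging.Handler):
    """Write each log record as a row in a database table."""

    def __init__(self, connection):
        super().__init__()
        self._connection = connection
        # Hypothetical table layout; adjust columns to what you want to query.
        self._connection.execute(
            "CREATE TABLE IF NOT EXISTS app_log ("
            "created REAL, level TEXT, logger TEXT, message TEXT)"
        )

    def emit(self, record):
        try:
            self._connection.execute(
                "INSERT INTO app_log VALUES (?, ?, ?, ?)",
                (record.created, record.levelname, record.name,
                 record.getMessage()),
            )
            self._connection.commit()
        except Exception:
            # Never let a logging failure take down the role itself.
            self.handleError(record)


def build_logger(connection, name="worker_role"):
    """Attach a SqlTableHandler to the named logger. Call once per name."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(SqlTableHandler(connection))
    return logger


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a real SQL database
    log = build_logger(conn)
    log.info("queue message %s processed", 42)
    log.error("native call failed")
    for row in conn.execute("SELECT level, message FROM app_log ORDER BY rowid"):
        print(row)
```

The application code only ever calls `log.info(...)` or `log.error(...)`, so pointing the output at Table storage later means writing a different handler, not touching the call sites.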
Availability of Azure environments for your team
Your team will likely have one Live ID and one Windows Azure host for your production app. Windows Azure hosts offer two “slots” for deployment: a production slot and a staging slot. Let’s assume that your production slot is the real .com site and that the staging slot is a “QA” site that you use for beta testing or to give a controlled user base a preview of the next release. For all intents and purposes, both the production and QA slots are always live.
Thus, you will need another slot to quickly and frequently test your app because you won’t want to overwrite or disrupt the live or QA slots. Since each host can only hold two “slots”, you will need another Azure hosting account on another Live ID to do this. At the time of this writing, CTP accounts are free and you can sign up for them quickly and easily. However, when Azure goes live with “v1” I don’t know if that will be the case.
I’d recommend having an Azure hosting account for each person who will be performing any type of deployment on the project. This could include both developers and testers. You don’t want to risk having only one “deployment guy”, since Azure deployments aren’t trivial. Make sure you can pass the “hit by a bus” test for this.
Deploying an Azure app to the live cloud isn’t like creating a virtual directory on an internal server - it requires a Live ID account and only offers two slots per Live ID. In other words, treat Azure accounts and deployment slots as a scarce resource. You will probably be able to allocate them, but they are owned by an individual Live ID and Azure accounts are handed out and controlled by Microsoft.
Deployment time
I laughed out loud when I saw this message for the first time on the Azure deployment screen:
“90% of the time, this operation takes less than 54 seconds”. It appears that the duration is dynamically calculated and is a real-time system metric. The metric and message may be accurate, but they are misleading. The “Deploy…” button only uploads your deployment package and configuration file. The estimate does not account for the time spent spinning up your roles.
Spin-up time, by which I mean the time it takes for your roles to reach a “Ready” state, has been highly inconsistent in my experience.
Be prepared to get extremely frustrated with how long it takes to spin up your roles and with how unreliably Azure spins them up. Again, I’m speaking from a CTP perspective here and the technology isn’t even “v1” yet, but this is something to watch out for. On a great day, the entire upload and initialization process can take less than 10 minutes. On a bad day, the services may take more than 30 minutes to spin up - or not spin up at all.
This assumes that your code and configuration are correct the first time you upload and spin up the roles. If you find a bug with your configuration or code, you’ll need to suspend the services and start the entire process over again.
For larger, more visible deployments at the end of a development cycle you’ve probably allocated more time for performing a deployment and you can probably cope with a longer deployment if there are any mistakes. However, in the middle of a development cycle when you need to test changes in a live environment and there is a lot of code churn, you will be prone to make mistakes or find bugs in the code or configuration - and you will need to deploy again and again. Be prepared for a lot of frustration.
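Since spin-up may take 10 minutes, 30 minutes, or never finish, any script or checklist that waits for a deployment benefits from an explicit timeout rather than an open-ended wait. The sketch below shows one way to structure that wait. `get_status` is a hypothetical callable you would supply yourself (I am not assuming any particular Azure API for it), and every status string other than “Ready” is made up for the demo.

```python
import time


def wait_until_ready(get_status, timeout_seconds=1800, poll_seconds=30,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it returns "Ready" or the timeout expires.

    get_status is a HYPOTHETICAL callable returning the current role status
    as a string. clock and sleep are injectable so the loop can be tested
    without real waiting. Returns (ready, elapsed_seconds).
    """
    start = clock()
    while True:
        status = get_status()
        elapsed = clock() - start
        if status == "Ready":
            return True, elapsed
        if elapsed >= timeout_seconds:
            # Give up so a stuck deployment is noticed instead of waited on.
            return False, elapsed
        sleep(poll_seconds)


if __name__ == "__main__":
    # Simulated status sequence; "Initializing" is an assumed status name.
    statuses = iter(["Initializing", "Initializing", "Ready"])
    ready, elapsed = wait_until_ready(
        lambda: next(statuses),
        timeout_seconds=1800,
        poll_seconds=0,  # no real delay in this demo
    )
    print("ready" if ready else "timed out")
```

Recording the elapsed time of each wait also gives you your own numbers for how long deployments really take, which is more useful for planning than the estimate on the deployment screen.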
Are you sure about this?
In conclusion, make sure you are ready to invest extra time in your development process so that your Azure projects are successful. Part of the extra time will be due to an Azure learning curve (which I have not addressed here). The remainder (and possibly the majority) of the time will be spent dealing with the issues I’ve described here. You will need to weigh this extra time and potential stress against the benefits you will gain from Azure as a cloud host.
A note on auto-generated SQL scripts: Microsoft may provide SQL Azure scripting tools in the future, which would help with the script compatibility problem described above.
A note on the “hit by a bus” test: it passes when you can confidently say your project won’t suffer (much) if any of your project team members gets hit by a bus.