On a recent project I recommended the use of Azure Table Storage.
In one scenario, there was a need to import large volumes of data from an external system quickly and reliably. In another, we wanted to provide a read-only API that represented the state of business domain entities changing over time. It was an API that needed to be fast under load. In each case the primary focus was the volume of the data and the speed of the storage. The structure of the data and querying flexibility were less of a priority.
Another thing at the back of my mind was the tight coupling that database schemas can impose on a solution. I have been on many projects where a strongly defined schema became an inhibitor rather than an enabler. Once you have invested effort in a comprehensive common schema, it becomes one big service interface. Subsequently, even simple changes to the database can have a large impact across applications… and this was something I wanted to avoid.
In my mind a NoSQL database looked a good fit and, given that we were targeting Azure, Table Storage looked like the way to go. At the time Azure DocumentDB had only just been announced and didn’t look mature enough for our needs.
So, fast forward to today… We are taking Azure Table Storage out of the solution. Why?
Table Storage has very limited querying support. Essentially you are limited to one index per table, which is based on the Partition and Row keys. This may be okay if your querying needs are limited, but you quickly come unstuck if you have requirements for ad-hoc, “what if?” querying. As it turned out, a limited logging implementation meant that the only reliable way to determine what was happening in the system was to query the data sources.
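To make the querying limitation concrete, here is a minimal sketch in Python. It does not call the Azure SDK; `ToyTable` and everything in it are hypothetical names for an in-memory stand-in that has exactly one index, on the partition and row keys. It only illustrates the general shape of the problem: a lookup by both keys touches one entity, whereas a “what if?” question about any other property has to read every entity.

```python
class ToyTable:
    """Hypothetical in-memory stand-in for a table whose only index
    is the (partition key, row key) pair. Not the Azure SDK."""

    def __init__(self):
        self._rows = {}          # (partition_key, row_key) -> entity dict
        self.entities_read = 0   # crude cost counter for the illustration

    def insert(self, partition_key, row_key, **props):
        entity = dict(props, PartitionKey=partition_key, RowKey=row_key)
        self._rows[(partition_key, row_key)] = entity

    def point_query(self, partition_key, row_key):
        # Served by the single index: one entity is read.
        self.entities_read += 1
        return self._rows.get((partition_key, row_key))

    def scan(self, predicate):
        # No index on other properties: every entity must be read.
        matches = []
        for entity in self._rows.values():
            self.entities_read += 1
            if predicate(entity):
                matches.append(entity)
        return matches


table = ToyTable()
for i in range(1000):
    status = "open" if i % 100 == 0 else "closed"
    table.insert("customer-%d" % (i % 10), "order-%04d" % i, status=status)

order = table.point_query("customer-3", "order-0003")
point_cost = table.entities_read

table.entities_read = 0
open_orders = table.scan(lambda e: e["status"] == "open")
scan_cost = table.entities_read

print(point_cost, scan_cost, len(open_orders))
```

In this toy model the point query reads a single entity, while the ad-hoc “which orders are open?” question reads all 1,000 to find a handful of matches. That asymmetry is what made investigative querying so painful for us.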
Backup and Restore
Backing up and restoring SQL databases is bread and butter for most organisations. Adding Azure Table Storage into the mix, however, resulted in much head scratching.
Azure Table Storage’s partitioning and data duplication (across data centres if geo-redundancy is enabled) means it is highly available and reliable. Data loss or corruption due to system failure is unlikely. So backups aren’t required, right?
Backup and restore provides solutions for other use cases. When you think about it, you need to guard against accidental data loss, either via the user or due to application bugs, and you might need to provide a means to load data in test environments. Azure Table Storage out of the box does not provide a good story here.
We looked at dealing with these situations by recreating data from source. This became error-prone and time-consuming. We were lacking querying capability too, so it was difficult to establish whether “all the expected data was available” in the destination system. Later we looked at third-party tools such as AzCopy or those provided by Cherry Safe or Cerebrata. In the end, all of these options represented additional time and effort to solve an unexpected problem, one that would have had an easy solution if only we had been using a SQL database!
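The “is all the expected data there?” check is essentially a reconciliation of keys between source and destination. The sketch below is a hypothetical illustration, not the tooling we built: `reconcile` and the sample keys are made up, and it assumes you can already enumerate the (partition key, row key) pairs on both sides, which was itself the expensive part for us.

```python
def reconcile(source_keys, destination_keys):
    """Compare two collections of (partition_key, row_key) pairs.

    Returns the keys missing from the destination and the keys present
    in the destination that the source does not know about.
    """
    source = set(source_keys)
    destination = set(destination_keys)
    return {
        "missing": sorted(source - destination),
        "unexpected": sorted(destination - source),
    }


# Hypothetical sample data for illustration only.
expected = [("customer-1", "order-0001"),
            ("customer-1", "order-0002"),
            ("customer-2", "order-0003")]
loaded = [("customer-1", "order-0001"),
          ("customer-2", "order-0003"),
          ("customer-9", "order-0999")]

report = reconcile(expected, loaded)
print(report["missing"])
print(report["unexpected"])
```

With a SQL database this kind of check is a single set-based query; here it meant reading every key out of the table first.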
Building Custom Tooling
To solve these and other problems we found ourselves building custom tooling for our needs. Although we were running an Agile project, we did have critical business milestones, and on this project story points burnt building tooling were not seen as story points delivering business value. I’ll probably elaborate on that in a future post!
These issues became an insurmountable challenge for Table Storage. I started hearing phrases such as “not fit for purpose” and “inadequate” from a number of stakeholders. The hearts and minds of the team were lost and, as a pragmatist, I saw little point in fighting against the tide.
This is not a failure, however. I have learnt something and I hope members of the project team have too. In Agile and Lean approaches you need to be able to experiment, get rapid feedback and change course often and quickly. In the past it would have been almost impossible to change technology two-thirds of the way into a project. Yet we are doing so. We had put incremental releases live with Table Storage and realised that it wasn’t right for us. All stakeholders understand the value that will come because they have felt the pain of not making this change.