We are using Windows Azure Table storage to log errors from our applications, which are hosted in either worker roles or web roles. We log enough information in the table to make it easy to identify which role and which component (class) logged an error.
The component ID (the fully qualified class name) is used as the partition key, and a random unique GUID is used as the row key.
This logging information is displayed on an ASP.NET MVC website, where an administrator can search the log using filter criteria such as component ID, date range, role identifier, severity, etc.
This works fine while the table is small. Once the Azure table contains a large number of records (200,000 or more), filtering the table takes too long and the query times out. We are using the .NET Azure storage API to query the tables.
We also want paging on the returned result set, but it looks like Azure tables do not give us an exact count of the records returned.
We tried using the Azure storage API to apply the filter and fetch data based on the current page number, but it's not working. I understand that we may have to redesign our table structure, especially the PartitionKey and RowKey, but I am not sure how to proceed.
If you need a lot of custom filters, table storage may not offer the best performance. Consider using SQL Azure if possible. To improve table storage performance, you can try the following:
Do not use custom filters. Filter the results using only the partition key and row key. This will give you the best performance.
Make each partition small. Scanning a small partition is faster than scanning a large partition.
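To make the two suggestions above more concrete, here is a rough Python sketch of one possible key scheme. It is only an illustration under assumptions, not the poster's current design and not the only option: the partition key combines the component ID with the UTC day (so each partition stays small and a date-range search becomes a list of partition keys), and the row key starts with an inverted timestamp so the newest entries sort first. The names make_keys, partition_keys_for_range, partition_filter and the constant MAX_MICROS are hypothetical. No storage library is called; the code only builds key strings and a filter string.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_MICROS = 10**17  # assumed upper bound, larger than any timestamp we expect to log


def make_keys(component_id: str, logged_at: datetime, unique_id: str) -> tuple[str, str]:
    """Build (PartitionKey, RowKey) for one log entry (timezone-aware timestamp expected)."""
    utc = logged_at.astimezone(timezone.utc)
    micros = (utc - EPOCH) // timedelta(microseconds=1)
    # One partition per component per UTC day keeps each partition small.
    partition_key = f"{component_id}_{utc:%Y%m%d}"
    # Inverted, zero-padded timestamp: later entries get smaller keys and sort first.
    row_key = f"{MAX_MICROS - micros:018d}_{unique_id}"
    return partition_key, row_key


def partition_keys_for_range(component_id: str, start: datetime, end: datetime) -> list[str]:
    """Partition keys covering every UTC day from start to end, newest day first."""
    keys = []
    day = end.astimezone(timezone.utc).date()
    first = start.astimezone(timezone.utc).date()
    while day >= first:
        keys.append(f"{component_id}_{day:%Y%m%d}")
        day -= timedelta(days=1)
    return keys


def partition_filter(partition_key: str) -> str:
    """Filter string restricted to one partition (single quotes are doubled)."""
    escaped = partition_key.replace("'", "''")
    return f"PartitionKey eq '{escaped}'"
```

With a scheme like this, a search for one component over a date range turns into one partition-only query per day instead of a scan over the whole table. Filters on severity or role would still be custom filters, but they would only scan one small partition at a time.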
As for paging, unless you have a cross-partition query, paging should work as expected. If you ask for 20 entities, 20 entities will be returned (if that many exist). But if a partition boundary is encountered, the response ends there and a new request is required. For example, if a partition boundary is reached when the 15th matching entity is found, only 15 entities are returned by that request, and you have to issue a new request to query the next partition. If you keep partitions small, you will run into this more often, so design the system to query the next partition automatically when needed.
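The "automatically query the next partition" idea can be sketched as a small loop. The Python below is a simulation only: ENTITIES and query_segment are hypothetical in-memory stand-ins for the table and for one query request, and the token is a plain list index rather than the opaque continuation token the real service returns. The stand-in mimics the behavior described above by ending each response at a partition boundary.

```python
from typing import Optional

# In-memory stand-in for the table: 3 partitions with 15 rows each, in key order.
ENTITIES = sorted((f"P{p}", f"R{r:02d}") for p in range(3) for r in range(15))


def query_segment(top: int, token: Optional[int]):
    """Fake of one query request: returns up to `top` entities, but stops early
    at a partition boundary. Returns (entities, next_token_or_None)."""
    start = token or 0
    if start >= len(ENTITIES):
        return [], None
    partition = ENTITIES[start][0]
    segment = []
    i = start
    while i < len(ENTITIES) and len(segment) < top and ENTITIES[i][0] == partition:
        segment.append(ENTITIES[i])
        i += 1
    return segment, (i if i < len(ENTITIES) else None)


def fetch_page(page_size: int, token: Optional[int] = None):
    """Keep issuing requests until the page is full or no data is left."""
    page = []
    while len(page) < page_size:
        segment, token = query_segment(page_size - len(page), token)
        page.extend(segment)
        if token is None:  # nothing left to continue from
            break
    return page, token


first_page, next_token = fetch_page(20)
print(len(first_page), next_token)
```

A single request for 20 entities comes back with only 15 here because the first partition ends, while fetch_page hides that from the caller by following the token into the next partition.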
Also keep in mind that table storage paging is not based on page numbers. It is based on continuation tokens. Refer to http://msdn.microsoft.com/en-us/library/dd135718.aspx for more information.
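Because there is no "give me page N" operation, a UI that wants Previous/Next links has to remember the token that starts each page it has already visited. The Python sketch below shows one way this might look. Everything in it is illustrative: LOG and query_page are hypothetical in-memory stand-ins for the table and the query call, the string token merely imitates an opaque continuation token, and TokenPager is a made-up helper. In a web application the remembered tokens would have to live somewhere such as the session, which is an assumption of this sketch.

```python
from typing import Optional

LOG = [f"entry-{i}" for i in range(25)]  # stand-in for the stored log entries


def query_page(page_size: int, token: Optional[str]):
    """Fake query: returns (rows, next_token). The caller treats the token as opaque."""
    start = 0 if token is None else int(token)
    rows = LOG[start:start + page_size]
    next_start = start + len(rows)
    return rows, (str(next_start) if next_start < len(LOG) else None)


class TokenPager:
    """Previous/Next navigation built on remembered tokens instead of page numbers."""

    def __init__(self, page_size: int):
        self.page_size = page_size
        self.start_tokens = [None]  # start_tokens[i] opens page i; page 0 needs no token
        self.current = -1

    def _load(self, index: int):
        rows, next_token = query_page(self.page_size, self.start_tokens[index])
        # The first time a page is loaded, remember where the following page starts.
        if next_token is not None and index + 1 == len(self.start_tokens):
            self.start_tokens.append(next_token)
        self.current = index
        return rows

    def has_next(self) -> bool:
        return self.current + 1 < len(self.start_tokens)

    def next_page(self):
        if not self.has_next():
            raise IndexError("no further pages")
        return self._load(self.current + 1)

    def previous_page(self):
        return self._load(max(self.current - 1, 0))
```

Note that this only allows moving to pages that are adjacent to ones already seen. Jumping straight to an arbitrary page, or showing a total page count, is not something tokens can provide.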