I am a total newbie with Azure! The purpose is to return rows based on the timestamp stored in the RowKey. As there is a transaction cost with each query, I want to minimize the number of transactions/queries while maintaining performance.
These are the proposed Partition and Row Keys:
- Partition Key: TextCache_(AccountID)_(ParentMessageId)
- Row Key: (DateOfMessage)_(MessageId)
Legend:
- AccountId – is an integer
- ParentMessageId – The parent messageId if there is one, blank if it is the parent
- DateOfMessage – Date the message was created – format will be DateTime.Ticks.ToString("d19")
- MessageId – the unique Id of the message
I would like to get back, from a single query, the rows and any child rows that are > or < DateOfMessage_MessageId.
Can this be done via my proposed PartitionKeys and RowKeys?
i.e. (in pseudo code)
var results = ctx.PartitionKey.StartsWith(TextCache_AccountId)
&& ctx.RowKey > (TimeStamp)_MessageId
Secondly, if I have a number of accounts and only want to return the first 10 rows, could it be done via a single query?
i.e. (in pseudo code)
var results = (
(
ctx.PartitionKey.StartsWith(TextCache_(AccountId1))
&& ctx.RowKey > (TimeStamp1)_MessageId1
)
||
(
ctx.PartitionKey.StartsWith(TextCache_(AccountId2))
&& ctx.RowKey > (TimeStamp2)_MessageId2
) ...
)
.Take(10)
The short answer to your questions is yes, but there are some things you need to watch for.
Azure table storage doesn’t have a direct equivalent of .StartsWith(). If you’re using the storage library in combination with LINQ you can use .CompareTo() (> and < don’t translate properly) to compare against the start of the key. That means that if you run a search for account 1 and ask the query to return 1000 results, but there are only 600 results for account 1, the other 400 results will be for accounts such as account 10 that sit next to it lexically. So you’ll need to be a bit smart about how you deal with your results.
If you padded out the account id with leading 0s, you could bound the PartitionKey on both sides: greater than or equal to the padded prefix for the account, and less than the first value past that prefix.
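A minimal sketch of the padded-prefix idea, written in plain Python rather than against the storage library so the ordering logic is easy to check. The 10-digit padding width, the helper names, and the sample keys are assumptions for illustration. It shows that a lower bound on its own lets other accounts leak in, while a lower bound at the account's prefix plus an upper bound just past it selects that account only, which is what a pair of .CompareTo() filters on PartitionKey would express.

```python
PAD = 10  # assumed width; must cover the largest account id you expect


def account_prefix(account_id: int) -> str:
    """Partition key prefix for one account, zero-padded to a fixed width."""
    return f"TextCache_{account_id:0{PAD}d}_"


def prefix_bounds(prefix: str) -> tuple:
    """Return (lower, upper) so that lower <= key < upper matches keys starting with prefix."""
    # Bump the last character to get the first string past the prefix range.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


def in_account(partition_key: str, account_id: int) -> bool:
    """Emulate the two-sided range filter for a single account."""
    lower, upper = prefix_bounds(account_prefix(account_id))
    return lower <= partition_key < upper


# Unpadded keys with only a lower bound: accounts 10 and 2 leak into the results.
unpadded = sorted(["TextCache_1_", "TextCache_1_55", "TextCache_10_", "TextCache_2_"])
leaked = [k for k in unpadded if k >= "TextCache_1"]

# Padded keys with both bounds: only account 1 matches.
padded = [account_prefix(1), account_prefix(1) + "55", account_prefix(10), account_prefix(2)]
matched = [k for k in padded if in_account(k, 1)]
```

Python compares strings by code point here; the sketch assumes the table orders keys the same way.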
Something else to bear in mind is that queries to Azure Tables return their results in PartitionKey then RowKey order. So in your case messages without a ParentMessageId will be returned before messages with a ParentMessageId. If you’re never going to query this table by ParentMessageId I’d move this to a property.
If TextCache_ is just a string constant, it’s not adding anything by being included in the PartitionKey unless this will actually mean something to your code when it’s returned.
While your second query will run, I don’t think it will produce what you’re after. If you want the first ten rows in DateOfMessage order, then it won’t work (see my point above about sort orders). If you ran this query as it is and account 1 had 11 messages, it would return only the first 10 messages related to account 1, regardless of whether account 2 had an earlier message.
While trying to minimise the number of transactions you use is good practice, don’t be too concerned about it. The cost of running your worker/web roles will dwarf your transaction costs. 1,000,000 transactions will cost you $1, which is less than the cost of running one small instance for 9 hours.
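Returning to the second query: one possible way to get the ten earliest messages across several accounts is to run one range query per account, take up to ten rows from each, and merge the small result sets by RowKey on the client. The sketch below is plain Python under stated assumptions, not storage library code. The in-memory `rows` list, the `query_account` helper, and the sample values are hypothetical stand-ins for real table queries. It also assumes one partition per account (ParentMessageId moved to a property) and row keys that start with the 19-digit zero-padded ticks, so that string order matches date order.

```python
def row_key(ticks: int, message_id: str) -> str:
    """Zero-pad ticks to 19 digits (like ToString("d19")) so string order is date order."""
    return f"{ticks:019d}_{message_id}"


# Hypothetical table contents as (PartitionKey, RowKey) pairs, kept in
# PartitionKey then RowKey order, the order in which results come back.
rows = sorted(
    [("TextCache_0000000001_", row_key(t, f"m{t}")) for t in range(100, 111)]  # 11 messages
    + [("TextCache_0000000002_", row_key(50, "early"))]  # one earlier message
)


def query_account(prefix, after_row_key, limit):
    """Stand-in for one per-account query: rows past a RowKey, capped at limit."""
    hits = [r for r in rows if r[0].startswith(prefix) and r[1] > after_row_key]
    return hits[:limit]


def first_n_by_date(cursors, n=10):
    """cursors maps a partition prefix to the last RowKey already seen for that account."""
    combined = []
    for prefix, after_row_key in cursors.items():
        combined.extend(query_account(prefix, after_row_key, n))
    # Order by RowKey only, ignoring which account each row came from.
    return sorted(combined, key=lambda r: r[1])[:n]


start = row_key(0, "")
cursors = {"TextCache_0000000001_": start, "TextCache_0000000002_": start}

naive = rows[:10]                  # single query plus Take(10): account 1 rows only
merged = first_n_by_date(cursors)  # begins with account 2's earlier message
```

This costs one transaction per account instead of one in total, which is usually an acceptable trade given how small transaction costs are next to instance costs.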