I would like to store historical equity price data in a table in the SQL Azure database.
I will get around 5000 equity prices every 15 minutes, and some of them may not have changed in value. So I need to store around 160000 records every day (5000 * 32, i.e. 8 hours * 4 times per hour).
Currently, the Equity table has the following structure, with around 20 columns.
Equity table
---------------
ID INT PK,
Name Varchar(20),
Value Money,
Currency Varchar(10),
.......
The new table (HistoricalPrices), where I would like to store historical prices, has the following structure.
HistoricalPrices
-------------------
ID INT PK,
EquityID INT FK,
[Date] DateTime,
Value Money
If I store these 160000 records every day, my table will hold around 5 million records after a month.
My questions are: how will this table cope with that much data, will I run into performance problems, is there a better way of maintaining this data, and do I need to make any changes to the table structure?
With proper indexing and clustering, performance shouldn’t be an issue as long as your queries are appropriately selective. Traditional operational issues such as backups, reindexing jobs and restricting the volume of data returned would normally need consideration, although these won’t be your problem with Azure.
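As a rough illustration of what "proper indexing" and a "selective query" could look like for HistoricalPrices, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a local stand-in for SQL Azure, so the column types are adapted and the index is an ordinary one rather than a clustered one. The index name and the sample data are made up for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Azure (assumption:
# the real table would use T-SQL types and probably a clustered index).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HistoricalPrices (
        ID       INTEGER PRIMARY KEY,
        EquityID INTEGER NOT NULL,
        Date     TEXT    NOT NULL,   -- ISO-8601 timestamp
        Value    REAL    NOT NULL
    );
    -- Hypothetical index name; the key order matches the typical query:
    -- one equity, then a date range.
    CREATE INDEX IX_HistoricalPrices_EquityID_Date
        ON HistoricalPrices (EquityID, Date);
""")

# Made-up sample data: 50 equities, four 15-minute snapshots each.
rows = [
    (equity_id, f"2024-01-02T09:{minute:02d}:00", 100.0 + equity_id)
    for equity_id in range(1, 51)
    for minute in (0, 15, 30, 45)
]
conn.executemany(
    "INSERT INTO HistoricalPrices (EquityID, Date, Value) VALUES (?, ?, ?)",
    rows,
)

# A selective query: a single equity over a bounded time window.
query = """
    SELECT Date, Value
    FROM HistoricalPrices
    WHERE EquityID = ? AND Date >= ? AND Date < ?
    ORDER BY Date
"""
params = (7, "2024-01-02T09:00:00", "2024-01-02T09:31:00")

# Ask the engine how it would execute the query; the last column of each
# plan row is the human-readable detail.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
plan_text = " ".join(row[-1] for row in plan)
result = conn.execute(query, params).fetchall()

print(plan_text)
print(result)
```

The point of the sketch is only that a query filtered on EquityID plus a date range can be answered from an index keyed on (EquityID, Date) instead of scanning the whole table. The actual plan on SQL Azure depends on its own optimizer and on how you cluster the table.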
Note that Azure DB size restrictions will probably force you to partition horizontally (shard) at some point (http://blogs.msdn.com/b/sqlazure/archive/2010/06/24/10029719.aspx). (Azure doesn’t support table partitioning.)
http://msdn.microsoft.com/en-us/library/ms345146(v=sql.90).aspx
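To make the sharding idea a bit more concrete, here is a minimal sketch of one possible routing scheme: pick the shard from EquityID with a simple modulo, so that all history for one equity lives in the same database. The shard database names and both helper functions are hypothetical, and other schemes (for example sharding by date range) may suit your queries better.

```python
from collections import defaultdict

# Hypothetical shard database names; in practice these would be
# separate SQL Azure databases.
SHARD_DATABASES = [
    "prices_shard_0",
    "prices_shard_1",
    "prices_shard_2",
    "prices_shard_3",
]


def shard_for(equity_id, shards=SHARD_DATABASES):
    """Return the shard that holds all history for the given equity."""
    if equity_id < 0:
        raise ValueError("equity_id must be non-negative")
    return shards[equity_id % len(shards)]


def group_by_shard(rows, shards=SHARD_DATABASES):
    """Split (equity_id, date, value) rows into one batch per shard."""
    batches = defaultdict(list)
    for row in rows:
        batches[shard_for(row[0], shards)].append(row)
    return dict(batches)


# Made-up 15-minute snapshot for three equities.
snapshot = [
    (5, "2024-01-02T09:00:00", 10.5),
    (8, "2024-01-02T09:00:00", 20.0),
    (9, "2024-01-02T09:00:00", 30.25),
]
batches = group_by_shard(snapshot)
print(batches)
```

Keying on EquityID keeps a "history for one equity" query on a single shard, at the cost of having to fan out across shards for queries that span many equities. A plain modulo also means that changing the number of shards moves most equities to a different shard, so this is only a starting point.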
Another thing to consider is overflow of your 32-bit INT PK. At the current rate you have over 50 years’ worth of IDs, but if you start tracking at a higher volume (e.g. more exchanges or more stocks) you will need to consider a 64-bit integer (BIGINT).
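The headroom estimate can be checked with a few lines of arithmetic. The sketch below takes the 160000 rows per day from the question and assumes about 252 trading days per year. That figure is my assumption, not something stated in the question; if you insert on every calendar day the INT lasts noticeably less long.

```python
INT32_MAX = 2**31 - 1          # largest value of a signed 32-bit INT
INT64_MAX = 2**63 - 1          # largest value of a signed 64-bit BIGINT

ROWS_PER_DAY = 5000 * 32       # 160000, as in the question
TRADING_DAYS_PER_YEAR = 252    # assumption: rows are only added on trading days
CALENDAR_DAYS_PER_YEAR = 365


def years_until_overflow(rows_per_day, days_per_year, max_id=INT32_MAX):
    """Years until an identity column starting at 1 reaches max_id."""
    return max_id / (rows_per_day * days_per_year)


int_trading = years_until_overflow(ROWS_PER_DAY, TRADING_DAYS_PER_YEAR)
int_calendar = years_until_overflow(ROWS_PER_DAY, CALENDAR_DAYS_PER_YEAR)
int_ten_times = years_until_overflow(ROWS_PER_DAY * 10, TRADING_DAYS_PER_YEAR)
bigint_trading = years_until_overflow(
    ROWS_PER_DAY, TRADING_DAYS_PER_YEAR, max_id=INT64_MAX
)

print(f"INT, trading days only:      {int_trading:.1f} years")
print(f"INT, every calendar day:     {int_calendar:.1f} years")
print(f"INT, 10x the current volume: {int_ten_times:.1f} years")
print(f"BIGINT, trading days only:   {bigint_trading:.2e} years")
```

Under these assumptions the INT lasts a little over 53 years on trading days and about 37 years on calendar days, while ten times the volume brings it down to roughly 5 years, which is where BIGINT becomes worth the extra 4 bytes per row.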