I’m attempting to create a Cassandra schema for stock price data.
Each records has the following fields:
- stock symbol
- date
- open price
- close price
As you can see there is no single field that can act as a unique identifier.
I’d like to create something like:
Stocks { // column family
????? { //row key - unique id
symbol: 'GOOG' // column -- secondary index
date: '2005/01/01' //column -- secondary index
open: '500' //column
close: '501' //column
}
}
As you can see I want to take advantage of the secondary index feature in cassandra 0.7.
When inserting the data into the database, should I just create a TimeUUID for each record? That seems like the most straightforward route.
Note: I do not work in the financial industry, I’m just using this dataset to wrap my mind around Cassandra.
The answer depends on how you will use the data. I assume, based on the indexes you’re indicating, that you want to search by symbol and data. I’m going to further assume that you wish to be able to get a list of the data for a given date or the dates/stats for a given symbol. Stock data is static, so we don’t need to worry about our schema being updatable.
We can work with a single column family here, which we will call ‘StockData’. Each symbol will have a row, as will each date. So for the above record, you would have keys of ‘GOOG’ and ‘2005/01/01’.
For the first type of key (symbols), your column names would be something like ‘2005/01/01-start’ and ‘2005/01/01-end’ with the column value being the start and end values.
For the second type of key (dates), your column names would be something like ‘GOOG-start’ and ‘GOOG-end’. Again, the start and end values would be stored as the column values.
To illustrate:
Now, you can select all or some stats for a symbol (ordered by date), using the get_slice function on the symbol’s row. Similarly, you can get some or all stats for a day (ordered by symbol).
With NOSQL systems there are nearly as many ways to structure your data as there are developers. This is by no means the only way. Just something to get you started.