I have intraday time series data at 30-second intervals in a CSV file with the following format:
20120105, 080000, 1
20120105, 080030, 2
20120105, 080100, 3
20120105, 080130, 4
20120105, 080200, 5
How can I read it into a pandas DataFrame with these two different indexing schemes:
1. Combine date and time into a single datetime index
2. Use date as the primary index and time as the secondary index in a MultiIndex DataFrame
What are the pros and cons of these two schemes? Is one generally preferable to the other? In my case, I would like to do time-of-day analysis, but I am not sure which scheme will be more convenient for that purpose. Thanks in advance.
My naive inclination would be to prefer a single index over the multiindex.
However, I am not very experienced with Pandas, and there could be some advantage to having the multiindex when doing time-of-day analysis.
I would try coding up some typical calculations both ways, and then see which one I liked better on the basis of ease of coding, readability, and performance.
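As a starting point for that comparison, here is a minimal sketch that builds both indexing schemes from the sample rows in the question and runs the same time-of-day average on each. The column names `date`, `time`, and `value` are my own assumptions (the file has no header), and the data is read from an in-memory string rather than a real file.

```python
import io

import pandas as pd

# Sample rows copied from the question; swap io.StringIO(raw) for a file path.
raw = """20120105, 080000, 1
20120105, 080030, 2
20120105, 080100, 3
20120105, 080130, 4
20120105, 080200, 5
"""

# Read date and time as strings so the leading zero in "080000" is preserved.
# The column names are assumed; the CSV itself has no header row.
df = pd.read_csv(
    io.StringIO(raw),
    header=None,
    names=["date", "time", "value"],
    dtype={"date": str, "time": str},
    skipinitialspace=True,
)

# Scheme 1: a single DatetimeIndex built from the concatenated date and time.
stamp = pd.to_datetime(df["date"] + df["time"], format="%Y%m%d%H%M%S")
single = df.set_index(stamp)[["value"]]
single.index.name = "timestamp"

# Scheme 2: a two-level MultiIndex with date as level 0 and time as level 1.
multi = df.assign(
    date=pd.to_datetime(df["date"], format="%Y%m%d").dt.date,
    time=pd.to_datetime(df["time"], format="%H%M%S").dt.time,
).set_index(["date", "time"])[["value"]]

# The same time-of-day calculation, written both ways.
by_time_single = single.groupby(single.index.time)["value"].mean()
by_time_multi = multi.groupby(level="time")["value"].mean()

# A time-of-day window is available directly on the single DatetimeIndex.
window = single.between_time("08:00", "08:01")
```

With only one day of data the averages are trivial, but with many days both group-bys return one row per time of day. The single index additionally gives access to `between_time` and resampling, which is one reason I would lean toward it.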
This was my setup to produce the results above.
You can of course use
instead of