I have a pandas dataframe that I filled with this:
import pandas.io.data as web
test = web.get_data_yahoo('QQQ')
The dataframe looks like this in IPython:
In [13]: test
Out[13]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 729 entries, 2010-01-04 00:00:00 to 2012-11-23 00:00:00
Data columns:
Open 729 non-null values
High 729 non-null values
Low 729 non-null values
Close 729 non-null values
Volume 729 non-null values
Adj Close 729 non-null values
dtypes: float64(5), int64(1)
When I divide one column by another, I get a float64 result with a satisfactory number of decimal places. I can even divide one column by another column offset by one, for instance test.Open[1:] / test.Close[:], and still get a satisfactory number of decimal places. When I divide a column by an offset slice of itself, however, I get just 1:
In [83]: test.Open[1:] / test.Close[:]
Out[83]:
Date
2010-01-04 NaN
2010-01-05 0.999354
2010-01-06 1.005635
2010-01-07 1.000866
2010-01-08 0.989689
2010-01-11 1.005393
...
In [84]: test.Open[1:] / test.Open[:]
Out[84]:
Date
2010-01-04 NaN
2010-01-05 1
2010-01-06 1
2010-01-07 1
2010-01-08 1
2010-01-11 1
I’m probably missing something simple. What do I need to do in order to get a useful value out of that sort of calculation? Thanks in advance for the assistance.
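If you’re looking to do operations between a column and its lagged values, you should be doing something like
test.Open / test.Open.shift()
shift realigns the data and takes an optional number of periods. The slices in your example don't offset anything, because pandas aligns the two Series on their index labels before dividing, so test.Open[1:] / test.Open[:] divides each day's value by that same day's value and gives 1 (and NaN for the first date, which is missing from the sliced side).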
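As a minimal sketch of the difference, here are both approaches on made-up prices standing in for test.Open. The values and the names open_, sliced, and ratio are assumptions for illustration only, not the Yahoo data:

```python
import pandas as pd

# Hypothetical prices standing in for test.Open (made-up values).
idx = pd.date_range("2010-01-04", periods=4, freq="D")
open_ = pd.Series([10.0, 20.0, 30.0, 15.0], index=idx)

# Slicing keeps the original date labels, and division aligns on those
# labels, so every value is divided by itself. The first date only exists
# on the right-hand side, so it comes out as NaN.
sliced = open_.iloc[1:] / open_

# shift() moves the values down one period while keeping the index, so each
# value is divided by the previous row's value.
ratio = open_ / open_.shift()

print(sliced)
print(ratio)
```

If you need a lag of more than one row, pass the number of periods, e.g. open_.shift(2).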