I read in a file and plot it as a pandas DataFrame. The index is a DatetimeIndex. I then use ginput(1) to pick one point, but the coordinate I get back is wrong.
The code is as follows:
import pandas as pd
from matplotlib import pylab  # needed for pylab.ginput below
from matplotlib.dates import num2date, date2num
ts = pd.date_range('2012-04-12,16:13:09', '2012-04-14,00:13:09', freq='H')
df = pd.DataFrame(index=ts)
df[0] = 20.6
I then plot and click on the graph using ginput:
df.plot()
t = pylab.ginput(n=1) #click somewhere near 13-APR-2012
However, the first item is a float that does not look like a matplotlib date number:
In [8]: x = t[0][0] # ~ 370631.67741935479
In [9]: num2date(x)
Out[9]: datetime.datetime(1015, 10, 3, 16, 15, 29, 32253, tzinfo=<matplotlib.dates._UTC object at 0x104196550>)
# this is way out!
The docs suggest that it should be using floats like these (from date2num):
In [10]: dt = pd.to_datetime('13-4-2012', dayfirst=True)
In [11]: date2num(dt)
Out[11]: 734606.0
What is this float, and how can I convert it to a datetime?
Note: If I remove one of the rows from the dataframe this works correctly:
df1 = df.drop(ts[1], axis=0)
...
For data indexed with a regular frequency, pandas converts the underlying index to a PeriodIndex so that the resolution of the x-tick labels is updated automatically when zooming in and out. So the x values you get from ginput are Period ordinals (for an hourly index, the number of hours since 1970-01-01), not matplotlib date numbers.
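To see why the float is in the 370,000 range, here is a small sketch that computes Period ordinals directly. It passes offset objects (pd.offsets.Hour(), pd.offsets.Day()) instead of alias strings such as 'H', on the assumption that this avoids differences in alias spelling between pandas versions. The variable names are just for illustration.

```python
import pandas as pd

# An hourly Period for midnight on 13 April 2012.
midnight = pd.Period('2012-04-13 00:00', freq=pd.offsets.Hour())

# For an hourly frequency the ordinal counts hours since 1970-01-01 00:00.
hours_since_epoch = midnight.ordinal

# For a daily frequency the ordinal counts days since 1970-01-01.
days_since_epoch = pd.Period('2012-04-13', freq=pd.offsets.Day()).ordinal

print(hours_since_epoch, days_since_epoch)
```

The hourly ordinal is close to the 370631.677... returned by ginput in the question, which is consistent with the x axis being in hourly Period ordinals.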
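In order to convert it back into a datetime, you can build a Period from the integer part of the ordinal, using the same frequency as the index (hourly here), and then convert that Period to a Timestamp.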
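A minimal sketch of that conversion, using the x value from the question. The helper name ordinal_to_timestamp is hypothetical, not a pandas function, and the hourly default is an assumption that matches the freq='H' index above.

```python
import pandas as pd

def ordinal_to_timestamp(x, freq=pd.offsets.Hour()):
    """Hypothetical helper: turn a Period ordinal from the plot's x axis into a Timestamp.

    The fractional part of x is dropped, so the result is the start of
    the period that contains the clicked point.
    """
    period = pd.Period(ordinal=int(x), freq=freq)
    return period.to_timestamp()

x = 370631.67741935479  # value returned by ginput in the question
clicked = ordinal_to_timestamp(x)
print(clicked)  # 2012-04-12 23:00:00
```

If you plot with a different regular frequency, pass the matching offset as freq (df.index.freq should hold it when the index was built with date_range).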
Note: Timestamp is a subclass of datetime that keeps nanosecond precision, so the result can be used wherever a datetime is expected.
That being said, ideally we would hide the conversion from the user (or not have to do the conversion at all, if possible!). That will have to wait until I have enough time to refactor all the plotting code…