I’m seeing strange behaviour in the pandas.to_datetime function. If I pass in a single string, I get the correct date:
In [100]: pandas.to_datetime(' 2012-10-19 16:32:35')
Out[100]: datetime.datetime(2012, 10, 19, 16, 32, 35)
However, I have a data set whose datetime column contains strings in the same format as the string passed in In [100] above:
In [101]: data_frame = pandas.read_csv('my_data.csv', header=None, names=['bid', 'datetime'])
In [102]: data_frame.ix[0]
Out[102]:
bid 428916
datetime 2012-10-19 16:32:35 # NOTE: THIS IS A STRING
Name: 0
When I try to convert the datetime column to timestamps, I get a very strange datetime object:
In [102]: data_frame['datetime'] = pandas.to_datetime(data_frame['datetime'])
In [103]: data_frame.ix[0]
Out[103]:
bid 428916
datetime 1970-01-16 80:32:35 # SEE THIS
Name: 0
So either I’m misunderstanding the way that to_datetime works (very possible) or this is unexpected behavior (less likely). Which is it?
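I suspect the problem is in how numpy datetime64[ns] objects are printed rather than in the conversion itself. If you take one of those funny date values and convert it back into a pandas Timestamp object, it looks normal. For example, passing the first value of the converted column to pandas.Timestamp should give a normal-looking result.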
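A minimal sketch of that check, assuming a small in-memory frame as a stand-in for my_data.csv (the frame contents and the names raw_value and stamp are illustrative, not from the question). It converts the string column, pulls out the raw numpy value, and wraps it in a Timestamp:

```python
import pandas as pd

# Hypothetical stand-in for the CSV in the question: one row, date stored as a string.
data_frame = pd.DataFrame({"bid": [428916], "datetime": ["2012-10-19 16:32:35"]})

# Convert the whole column; the result is stored as a numpy datetime64 column.
data_frame["datetime"] = pd.to_datetime(data_frame["datetime"])

# Take the raw numpy datetime64 value out of the column...
raw_value = data_frame["datetime"].values[0]

# ...and wrap it in a pandas Timestamp, which has its own (reliable) repr.
stamp = pd.Timestamp(raw_value)
print(stamp)
```

If the stored value is correct and only the display was off, stamp should show the original date and time. On a recent pandas the column itself will probably print fine as well, so this is mostly useful as a sanity check on older installs.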