I have a CSV file with a time column containing POSIX timestamps in milliseconds. When I read it with pandas, the column is correctly read as Int64, but I would like to convert it to a DatetimeIndex. Right now I first convert the values to datetime objects and then cast the result to a DatetimeIndex.
In [20]: df.time.head()
Out[20]:
0 1283346000062
1 1283346000062
2 1283346000062
3 1283346000062
4 1283346000300
Name: time
In [21]: map(datetime.fromtimestamp, df.time.head()/1000.)
Out[21]:
[datetime.datetime(2010, 9, 1, 9, 0, 0, 62000),
datetime.datetime(2010, 9, 1, 9, 0, 0, 62000),
datetime.datetime(2010, 9, 1, 9, 0, 0, 62000),
datetime.datetime(2010, 9, 1, 9, 0, 0, 62000),
datetime.datetime(2010, 9, 1, 9, 0, 0, 300000)]
In [22]: pandas.DatetimeIndex(map(datetime.fromtimestamp, df.time.head()/1000.))
Out[22]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2010-09-01 09:00:00.062000, ..., 2010-09-01 09:00:00.300000]
Length: 5, Freq: None, Timezone: None
Is there an idiomatic way of doing this? And, more importantly, is this the recommended way of storing non-unique timestamps in pandas?
You can use a converter in combination with read_csv.
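Here is a minimal sketch of what that could look like. The sample CSV, the `value` column, and the helper name `ms_to_timestamp` are made up for illustration; only the `time` values come from the question. Note that the resulting timestamps are naive UTC, so they will differ from the `datetime.fromtimestamp` output above, which uses the machine's local time zone.

```python
import io

import pandas as pd

# Hypothetical sample data; only the `time` values are taken from the question.
csv_text = """time,value
1283346000062,1
1283346000062,2
1283346000300,3
"""


def ms_to_timestamp(raw):
    # read_csv hands each cell to the converter as a string,
    # so parse it to int before interpreting it as milliseconds.
    return pd.Timestamp(int(raw), unit="ms")


# Option 1: convert while reading, then move the column into the index.
df = pd.read_csv(io.StringIO(csv_text), converters={"time": ms_to_timestamp})
df.index = pd.DatetimeIndex(df.pop("time"))

# Option 2 (an alternative, not what the answer describes): read the column
# as integers and convert it in one vectorized call afterwards.
df2 = pd.read_csv(io.StringIO(csv_text))
df2.index = pd.DatetimeIndex(pd.to_datetime(df2.pop("time"), unit="ms"))

print(df)
print(df.index.is_unique)  # duplicate timestamps are allowed in the index
```

The converter runs once per cell in Python, so for large files the vectorized `pd.to_datetime(..., unit="ms")` call is likely to be faster. Either way, the index may contain duplicates; pandas permits that, although some operations behave differently on a non-unique index.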