(Python 2.7, Pandas 0.9) This seems like a simple thing to do, but I

Question

Asked: June 14, 2026

(Python 2.7, Pandas 0.9)

This seems like a simple thing to do, but I can’t figure out how to calculate the difference between two date columns in a dataframe using Pandas. This dataframe already has an index, so making either column into a DateTimeIndex is not desirable.

To convert each date column from strings I used:

data.Date_Column = pd.to_datetime(data.Date_Column)

From there, to get elapsed time between 2 columns, I do:

data.Closed_Date - data.Created_Date

which returns an error:

TypeError: %d format: a number is required, not a numpy.timedelta64

Checking dtypes on both columns yields datetime64[ns], and the individual dates in the array are of type Timestamp.

What am I missing?

EDIT:

Here’s an example where I can create separate DatetimeIndex objects and accomplish what I want, but when I try to do the same thing in the context of a dataframe, it fails.

Created_Date = pd.DatetimeIndex(data['Created_Date'], copy=True)
Closed_Date = pd.DatetimeIndex(data['Closed_Date'], copy=True)

Closed_Date.day - Created_Date.day
[Out] array([ -3, -16,   5, ...,   0,   0,   0])

Now the same but in a dataframe:

data.Created_Date = pd.DatetimeIndex(data['Created_Date'], copy=True)
data.Closed_Date = pd.DatetimeIndex(data.Closed_Date, copy=True)

data.Created_Date.day - data.Created_Date.day

AttributeError: 'Series' object has no attribute 'day'

Here’s some of the data if you want to play around with it:

data['Created Date'][0:10].to_dict()
{0: '1/1/2009 0:00',
 1: '1/1/2009 0:00',
 2: '1/1/2009 0:00',
 3: '1/1/2009 0:00',
 4: '1/1/2009 0:00',
 5: '1/1/2009 0:00',
 6: '1/1/2009 0:00',
 7: '1/1/2009 0:00',
 8: '1/1/2009 0:00',
 9: '1/1/2009 0:00'}

data['Closed Date'][0:10].to_dict()
{0: '1/7/2009 0:00',
 1: nan,
 2: '1/1/2009 0:00',
 3: '1/1/2009 0:00',
 4: '1/1/2009 0:00',
 5: '1/12/2009 0:00',
 6: '1/12/2009 0:00',
 7: '1/7/2009 0:00',
 8: '1/10/2009 0:00',
 9: '1/7/2009 0:00'}

1 Answer

Editorial Team · Answer 1 · 2026-06-14T16:08:17+00:00

Update: A useful workaround is to pass the column to the DatetimeIndex constructor (which is usually much faster than an apply), for example:

pd.DatetimeIndex(df['Created_Date']).day

From 0.15 onwards this is available through the dt accessor (along with other datetime properties and methods):

df['Created_Date'].dt.day
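
If what you actually want is the elapsed time rather than the day-of-month difference, a reasonably recent pandas lets you subtract the two columns directly and read the result through the same dt accessor. The following is only a minimal sketch under that assumption: the sample values are borrowed from the question, and the column names day_of_month_diff and elapsed_days are made up for illustration.

```python
import pandas as pd

# Sample values borrowed from the question (rows without missing dates).
df = pd.DataFrame({
    "Created_Date": ["1/1/2009 0:00", "1/1/2009 0:00", "1/1/2009 0:00"],
    "Closed_Date": ["1/7/2009 0:00", "1/12/2009 0:00", "1/10/2009 0:00"],
})

# Parse the strings into datetime64[ns] columns.
for col in ["Created_Date", "Closed_Date"]:
    df[col] = pd.to_datetime(df[col], format="%m/%d/%Y %H:%M")

# Day-of-month difference, a vectorised stand-in for the row-wise apply.
# (Illustrative column name.)
df["day_of_month_diff"] = df["Closed_Date"].dt.day - df["Created_Date"].dt.day

# Elapsed whole days: subtracting the columns gives a timedelta64 Series,
# and .dt.days extracts the day component. (Illustrative column name.)
df["elapsed_days"] = (df["Closed_Date"] - df["Created_Date"]).dt.days
```

The two results only agree here because every date falls in the same month; across month boundaries the day-of-month difference is not an elapsed time.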

Your error came from the syntax. One might hope the following would work, but a Series has no day attribute:

data.Created_Date.day - data.Created_Date.day
AttributeError: 'Series' object has no attribute 'day'

With more complicated selections like this one you can use apply:

In [111]: df['sub'] = df.apply(lambda x: x['Created_Date'].day - x['Closed_Date'].day, axis=1)

In [112]: df[['Created_Date','Closed_Date','sub']]
Out[112]: 
         Created_Date         Closed_Date  sub
0 2009-01-07 00:00:00 2009-01-01 00:00:00    6
1                 NaT 2009-01-01 00:00:00    9
2 2009-01-01 00:00:00 2009-01-01 00:00:00    0
3 2009-01-01 00:00:00 2009-01-01 00:00:00    0
4 2009-01-01 00:00:00 2009-01-01 00:00:00    0
5 2009-01-12 00:00:00 2009-01-01 00:00:00   11
6 2009-01-12 00:00:00 2009-01-01 00:00:00   11
7 2009-01-07 00:00:00 2009-01-01 00:00:00    6
8 2009-01-10 00:00:00 2009-01-01 00:00:00    9
9 2009-01-07 00:00:00 2009-01-01 00:00:00    6

Be wary: you probably ought to handle these NaTs separately:

In [114]: df.ix[1][1].day # NaT.day
Out[114]: -1

Note: there is similarly strange behaviour using .days on a timedelta with NaT:

In [115]: df['sub2'] = df.apply(lambda x: (x['a'] - x['b']).days, axis=1)

In [116]: df['sub2'][1]
Out[116]: 92505
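
One way to keep NaT rows from leaking odd values into the result is to compute the difference column-wise and then deal with the missing rows explicitly. This is a hedged sketch, assuming a recent pandas release where, as far as I know, a NaT in either column comes out as NaN from .dt.days instead of a sentinel number; the names elapsed, still_open and closed_only are illustrative.

```python
import pandas as pd

# Small frame with one missing Closed_Date, mirroring row 1 of the question's data.
df = pd.DataFrame({
    "Created_Date": pd.to_datetime(["2009-01-01", "2009-01-01", "2009-01-01"]),
    "Closed_Date": pd.to_datetime(["2009-01-07", None, "2009-01-12"]),
})

# Vectorised subtraction; the row with NaT yields NaN instead of a bogus integer.
elapsed = (df["Closed_Date"] - df["Created_Date"]).dt.days

# Flag the rows that have no closing date yet.
still_open = df["Closed_Date"].isna()

# Keep only the rows that have both dates, where an integer dtype is safe.
closed_only = elapsed[~still_open].astype(int)
```

Whether to drop the open rows, fill them with a placeholder, or leave them as NaN depends on what the downstream analysis expects.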
