I have a pandas DataFrame (‘dt = myfunc()’). The screen output copied from IDLE is below:
>>> from __future__ import division
>>> dt = __get_stk_data__(['*'], frq='CQQ', from_db=False) # my function
>>> dt = dt[dt['ebt']==0][['tax','ebt']]
>>> type(dt)
<class 'pandas.core.frame.DataFrame'>
>>> dt
                 tax  ebt
STK_ID RPT_Date
000719 20100331    0    0
       20100630    0    0
       20100930    0    0
       20110331    0    0
002164 20080331    0    0
300155 20120331    0    0
600094 20090331    0    0
       20090630    0    0
       20090930    0    0
600180 20090331    0    0
600757 20110331    0    0
>>> dt['tax_rate'] = dt.tax/dt.ebt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Python\Lib\site-packages\pandas\core\series.py", line 72, in wrapper
    return Series(na_op(self.values, other.values),
  File "D:\Python\Lib\site-packages\pandas\core\series.py", line 53, in na_op
    result = op(x, y)
ZeroDivisionError: float division
>>>
I spent a lot of time trying to figure out why pandas raises ‘ZeroDivisionError: float division’ here, while pandas works fine for the sample code below:
from pandas import DataFrame, MultiIndex
tuples = [('000719', '20100331'), ('000719', '20100930'), ('002164', '20080331')]
index = MultiIndex.from_tuples(tuples, names=['STK_ID', 'RPT_Date'])
dt = DataFrame({'tax': [0, 0, 0], 'ebt': [0, 0, 0]}, index=index)
dt['tax_rate'] = dt.tax / dt.ebt
>>> dt
                 ebt  tax  tax_rate
STK_ID RPT_Date
000719 20100331    0    0       NaN
       20100930    0    0       NaN
002164 20080331    0    0       NaN
>>>
I expected pandas to give ‘NaN’ in both cases. Why does ‘ZeroDivisionError’ happen in the first case, and how can I fix it?
The code and screen output below are attached to provide further information for debugging:
from pandas import concat

def __by_Q__(df):
    ''' this function transforms the input financial report data (which
        is accumulative) to quarterly data
    '''
    # Q1 rows (RPT_Date ends with "0331") are already quarterly; keep them as-is
    df_q1 = df[df.index.map(lambda x: x[1].endswith("0331"))]
    print 'before diff:\n'
    print df.dtypes
    # row-to-row differences turn accumulated figures into per-quarter figures
    df_delta = df.diff()
    print '\nafter diff: \n'
    print df_delta.dtypes
    q1_mask = df_delta.index.map(lambda x: x[1].endswith("0331"))
    df_q234 = df_delta[~q1_mask]
    rst = concat([df_q1, df_q234])
    rst = rst.sort_index()
    return rst
screen output:
before diff:

sales        float64
discount      object
net_sales    float64
cogs         float64
ebt          float64
tax          float64

after diff:

sales        object
discount     object
net_sales    object
cogs         object
ebt          object
tax          object
@bigbug, how are you getting the data out of the SQLite backend? If you look in pandas.io.sql, the read_frame method has a coerce_float parameter that should convert numerical data to float if possible.
Your second example works because the DataFrame constructor tries to be clever about types. If you set the dtype to object, then it fails with the same ZeroDivisionError.
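To illustrate the object-dtype point, here is a minimal sketch that reuses the sample data from the question. It assumes a recent pandas imported as pd (not the version used in the session above), and the variable names are only illustrative.

```python
import pandas as pd

tuples = [('000719', '20100331'), ('000719', '20100930'), ('002164', '20080331')]
index = pd.MultiIndex.from_tuples(tuples, names=['STK_ID', 'RPT_Date'])
data = {'tax': [0, 0, 0], 'ebt': [0, 0, 0]}

# Default construction: pandas infers an integer dtype, so 0/0 becomes NaN.
inferred = pd.DataFrame(data, index=index)
inferred_rate = inferred['tax'] / inferred['ebt']

# Forcing dtype=object keeps plain Python ints in the columns, so each pair of
# elements is divided with Python's own division, which raises on 0/0.
as_object = pd.DataFrame(data, index=index, dtype=object)
try:
    as_object['tax'] / as_object['ebt']
    object_division_raised = False
except ZeroDivisionError:
    object_division_raised = True

print(inferred_rate)
print(object_division_raised)
```

If the columns in the real data report object under dt.dtypes, that would be consistent with the error seen in the first case.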
Check your data-import code again and let me know what you find.
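As a rough sketch of what that check could look like, assuming a recent pandas: list the columns that arrived as object and convert them with pd.to_numeric before dividing. The frame named raw is hypothetical and only stands in for data loaded with numbers stored as Python objects; it is not the actual output of the import code.

```python
import pandas as pd

# Hypothetical stand-in for imported data whose numbers are stored as objects.
raw = pd.DataFrame({'tax': [0, 5, 0], 'ebt': [0, 20, 10]}, dtype=object)

# Find the columns that did not come through as numeric.
object_cols = raw.select_dtypes(include=[object]).columns.tolist()

# Convert them; errors='coerce' turns values that cannot be parsed into NaN.
clean = raw.copy()
for col in object_cols:
    clean[col] = pd.to_numeric(clean[col], errors='coerce')

# With numeric dtypes, 0/0 yields NaN instead of raising.
clean['tax_rate'] = clean['tax'] / clean['ebt']
print(clean.dtypes)
print(clean)
```

Note that errors='coerce' silently replaces non-numeric entries with NaN, so it is worth looking at which values changed before relying on the result.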