I have two pandas dataframes: dfLeft and dfRight with the date as the index.
dfLeft:
cusip factorL
date
2012-01-03 XXXX 4.5
2012-01-03 YYYY 6.2
....
2012-01-04 XXXX 4.7
2012-01-04 YYYY 6.1
....
dfRight:
idc__id factorR
date
2012-01-03 XXXX 5.0
2012-01-03 YYYY 6.0
....
2012-01-04 XXXX 5.1
2012-01-04 YYYY 6.2
Both have a shape close to (121900,3)
I tried the following merge:
test = pd.merge(dfLeft, dfRight, left_index=True, right_index=True, left_on='cusip', right_on='idc__id', how = 'inner')
This gave test a shape of (60643500, 6).
Any recommendations on what is going wrong here? I want it to merge based on both date and cusip/idc_id. Note: for this example the cusips are lined up, but in reality that may not be so.
Thanks.
Expected Output
test:
cusip factorL factorR
date
2012-01-03 XXXX 4.5 5.0
2012-01-03 YYYY 6.2 6.0
....
2012-01-04 XXXX 4.7 5.1
2012-01-04 YYYY 6.1 6.2
You could append
'cuspin'and'idc_id'as a indices to your DataFrames before youjoin(here’s how it would work on the first couple of rows):