This may be a bug, but it may also be a subtlety of pandas that I’m missing. I’m combining two dataframes with combine_first and the result’s index isn’t sorted. What’s weird is that I’ve never before seen combine_first fail to keep the index sorted.
>>> a1
                            X   Y
DateTime
2012-11-06 16:00:11.477563  8  80
2012-11-06 16:00:11.477563  8  63
>>> a2
                            X   Y
DateTime
2012-11-06 15:11:09.006507  1  37
2012-11-06 15:11:09.006507  1  36
>>> a1.combine_first(a2)
                            X   Y
DateTime
2012-11-06 16:00:11.477563  8  80
2012-11-06 16:00:11.477563  8  63
2012-11-06 15:11:09.006507  1  37
2012-11-06 15:11:09.006507  1  36
>>> a2.combine_first(a1)
                            X   Y
DateTime
2012-11-06 16:00:11.477563  8  80
2012-11-06 16:00:11.477563  8  63
2012-11-06 15:11:09.006507  1  37
2012-11-06 15:11:09.006507  1  36
I can reproduce this, so I’m happy to take suggestions. Guesses as to what’s going on are most welcome.
The combine_first function uses index.union to combine and sort the indexes. The index.union docstring states that it only sorts if possible, so combine_first is not necessarily going to return sorted results by design.
For non-monotonic indexes, index.union tries to sort, but returns unsorted results if there is an exception. I don’t know if this is a bug or not, but index.union does not even attempt to sort monotonic indexes like the datetime index in your example.
I’ve opened an issue on GitHub, but I guess you should do a2.combine_first(a1).sort_index() for any datetime indexes for now.
Update: This bug is now fixed on GitHub.
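The sort_index() workaround can be sketched roughly as follows. This is only an illustration: the frames below are hypothetical stand-ins modeled on the question's a1 and a2, built with unique timestamps (an assumption made to keep the example simple, unlike the duplicated timestamps in the original data), and the names combined and combined_sorted are mine.

```python
import pandas as pd

# Hypothetical stand-ins for the question's a1 and a2.
# Assumption: timestamps are unique here, unlike the original data.
a1 = pd.DataFrame(
    {"X": [8, 8], "Y": [80, 63]},
    index=pd.to_datetime(
        ["2012-11-06 16:00:11.477563", "2012-11-06 16:00:12.477563"]
    ),
)
a2 = pd.DataFrame(
    {"X": [1, 1], "Y": [37, 36]},
    index=pd.to_datetime(
        ["2012-11-06 15:11:09.006507", "2012-11-06 15:11:10.006507"]
    ),
)
a1.index.name = "DateTime"
a2.index.name = "DateTime"

# combine_first may or may not return a sorted index depending on the
# pandas version, so do not rely on it.
combined = a2.combine_first(a1)

# Sorting explicitly makes the result order independent of that behavior.
combined_sorted = combined.sort_index()
print(combined_sorted)
```

Calling sort_index() on a result that is already sorted is harmless, so the explicit sort should be safe to keep even on pandas versions where the fix has landed.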