I have an input file that I want to sort based on timestamp which is a substring of each record. I want to store multiple attributes of the
The list is currently about 1000 records. But, I want it to be able to scale up a bit just in case.
When I did it with a linked list, searching the entire list for each insertion, it took about 20 seconds. Now, just filling up a vector and outputting to a file takes 4 seconds (does that sound too long?).
I would like to use merge sort or quick sort (merge sort appears to be a little easier to me). The trouble that I’m running into is that I don’t see many examples of implementing these sorts using objects rather than primitive data types.
I could use either a vector or Linked list. The feedback that I’ve gotten from this site has been most helpful so far. I’m hoping that someone can sprinkle on the magic pixie dust to make this easier on me 🙂
Any links or examples on the easiest way to do this with pretty decent performance would be most appreciated. I’m getting stuck on how to implement these sorts with objects because I’m a newbie at C++ 🙂
Here’s what my new code looks like (no sorting yet):
#include <iostream>
#include <string>
#include <vector>

class CFileInfo
{
public:
    std::string m_PackLine;      // the full record line
    std::string m_FileDateTime;  // timestamp substring of the record
    int m_NumDownloads;
};

int main()
{
    CFileInfo packInfo;
    std::vector<CFileInfo> unsortedFiles;
    std::vector<CFileInfo>::iterator Iter;

    packInfo.m_PackLine = "Sample Line 1";
    packInfo.m_FileDateTime = "06/22/2008 04:34";
    packInfo.m_NumDownloads = 0;
    unsortedFiles.push_back(packInfo);

    packInfo.m_PackLine = "Sample Line 2";
    packInfo.m_FileDateTime = "12/05/2007 14:54";
    packInfo.m_NumDownloads = 1;
    unsortedFiles.push_back(packInfo);

    // Print every record in insertion order (no sorting yet).
    for (Iter = unsortedFiles.begin(); Iter != unsortedFiles.end(); ++Iter)
    {
        std::cout << ' ' << (*Iter).m_PackLine;
    }
    return 0;
}
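Sorting a linked list by searching for each insertion point is inherently O(N^2), and std::sort() can’t be used on one because it needs random access. (std::list does have its own sort() member function, which is O(NlogN).)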
Vectors have random access storage. So do arrays. Sorting can be O(NlogN).
At 1000 elements you will begin to see a difference between O(N^2) and O(NlogN). At 1,000,000 elements you’ll definitely notice the difference!
It is possible under very special situations to get O(N) sorting. (For example: Sorting a deck of playing cards. We can create a function(card) that maps each card to its sorted position.)
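To make the playing-card case concrete, here is a rough sketch. The Card struct and the function names are made up for illustration, and it assumes a full 52-card deck with every card appearing exactly once:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical card type: suit in 0..3, rank in 0..12.
struct Card {
    int suit;
    int rank;
};

// Maps a card to its final position in a sorted 52-card deck.
std::size_t sortedPosition(const Card& c) {
    return static_cast<std::size_t>(c.suit * 13 + c.rank);
}

// One pass over the deck: O(N), no comparisons at all.
// Assumes a full deck with each card appearing exactly once.
std::vector<Card> sortDeck(const std::vector<Card>& deck) {
    std::vector<Card> sorted(deck.size());
    for (std::size_t i = 0; i < deck.size(); ++i) {
        sorted[sortedPosition(deck[i])] = deck[i];
    }
    return sorted;
}
```

This only works because every card’s final slot is known up front. Timestamps don’t have that property, which is why a comparison sort is the practical choice for them.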
But in general, O(NlogN) is as good as it gets. So you might as well use STL’s sort()!
Just add #include <algorithm>
All you’ll need to add is an operator<(). Or a sort functor.
But one suggestion: For god’s sake man, if you are going to sort on a date, either encode it as a long int representing seconds-since-epoch (mktime?), or at the very least use a ‘year/month/day-hour:minute:second.fraction’ format. (And MAKE SURE everything is 2 (or 4) digits with leading zeros!) Comparing ‘6/22/2008-4:34’ and ‘12/5/2007-14:54’ will require parsing! Comparing ‘2008/06/22-04:34’ with ‘2007/12/05-14:54’ is much easier. (Though still much less efficient than comparing two integers!)
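As a rough sketch of the “compare integers” idea: the helper below is hypothetical (timestampKey is not from the original code) and assumes the ‘MM/DD/YYYY HH:MM’ layout used in the question’s sample data. It does not compute seconds-since-epoch the way mktime would; it just packs the fields into one number, YYYYMMDDHHMM, which sorts in the same order.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: turns "MM/DD/YYYY HH:MM" into one integer,
// YYYYMMDDHHMM, so that later timestamps compare as larger numbers.
// Returns -1 if the text does not match the assumed format.
long long timestampKey(const std::string& text) {
    int month = 0, day = 0, year = 0, hour = 0, minute = 0;
    if (std::sscanf(text.c_str(), "%d/%d/%d %d:%d",
                    &month, &day, &year, &hour, &minute) != 5) {
        return -1;
    }
    long long key = year;
    key = key * 100 + month;
    key = key * 100 + day;
    key = key * 100 + hour;
    key = key * 100 + minute;
    return key;
}
```

You would compute this once per record when reading the file and store it next to the string, so the sort itself only ever compares integers.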
Rich wrote: the other answers seem to get into syntax more which is what I’m really lacking.
Ok. With a basic ‘int’ type we have:
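Something along these lines, as a minimal sketch (sortNumbers is just a made-up wrapper name so the call is easy to see):

```cpp
#include <algorithm>
#include <vector>

// Sorts a vector of ints in ascending order, in place.
// ints already know how to compare with '<', so nothing else is needed.
void sortNumbers(std::vector<int>& numbers) {
    std::sort(numbers.begin(), numbers.end());
}
```

The whole sort is the one std::sort() call; you hand it the begin and end iterators and it does the rest.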
With your own type we have:
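Roughly like this, as a sketch rather than the one true way. It reuses the CFileInfo class from the question and assumes m_FileDateTime has already been stored as zero-padded ‘YYYY/MM/DD HH:MM’, so a plain string comparison gives date order. ByDownloadsDesc, sortByDate and sortByDownloads are made-up names for illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

class CFileInfo {
public:
    std::string m_PackLine;
    std::string m_FileDateTime;  // assumed here to be "YYYY/MM/DD HH:MM"
    int m_NumDownloads;
};

// Lets std::sort order CFileInfo objects by timestamp.
// Plain string comparison only works because of the assumed format.
bool operator<(const CFileInfo& a, const CFileInfo& b) {
    return a.m_FileDateTime < b.m_FileDateTime;
}

// Alternative: a sort functor, for when you want a different order
// without touching operator<. Here: most downloads first.
struct ByDownloadsDesc {
    bool operator()(const CFileInfo& a, const CFileInfo& b) const {
        return a.m_NumDownloads > b.m_NumDownloads;
    }
};

void sortByDate(std::vector<CFileInfo>& files) {
    std::sort(files.begin(), files.end());  // uses operator<
}

void sortByDownloads(std::vector<CFileInfo>& files) {
    std::sort(files.begin(), files.end(), ByDownloadsDesc());
}
```

The operator<() gives the type one default ordering; the functor is handy when you want to sort the same vector in more than one way.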