I’m interested in which algorithm I should use to meet the requirements of external sorting of ints with O(N log N) reads and O(N) writes
If you’re after an algorithm for that type of sorting (where the data can’t all fit into core at once), my solution comes from the very earliest days of the “revolution” when top-end machines had less memory than most modern-day calculators.
I haven’t worked out the big-O properties, but I think it would be O(n) reads, an O(n log n) sort phase (depending on the sort method chosen) and O(n) writes.
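Let’s say your data set has one million elements and you can only fit 100,000 in memory at a time. Here’s what I’d do: read the data in as 10 groups of 100,000, sort each group in memory and write it back out as its own sorted group, then merge the 10 sorted groups into the final output.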
In other words, once your 10 groups are sorted within the group, grab the first entry from each group.
Then write the lowest of those 10 (which is the lowest of the whole million) to the output file and read the next one from that group in its place.
Then just continue selecting the lowest of the 10, writing it out and replacing it from the same group. In that way, the final output is the entire sorted list of a million entries.
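Here’s a rough Python sketch of the approach described above, not a tuned implementation. It assumes the ints are stored as text, one per line, and the names `external_sort`, `write_sorted_run`, `read_ints` and `chunk_size` are just made up for illustration. The standard library’s `heapq.merge` does the "pick the lowest of the 10 and refill from the same group" step, keeping only one pending value per group in memory.

```python
import heapq
import os
import tempfile


def write_sorted_run(values):
    """Write one already-sorted group to a temporary file, one int per line."""
    with tempfile.NamedTemporaryFile("w", delete=False, suffix=".run") as f:
        f.writelines(f"{v}\n" for v in values)
        return f.name


def read_ints(path):
    """Lazily yield the ints stored in a text file, skipping blank lines."""
    with open(path) as f:
        for line in f:
            if line.strip():
                yield int(line)


def external_sort(in_path, out_path, chunk_size):
    # Phase 1: read at most chunk_size ints at a time, sort them in memory,
    # and write each sorted group to its own temporary file.
    run_paths = []
    chunk = []
    for value in read_ints(in_path):
        chunk.append(value)
        if len(chunk) == chunk_size:
            chunk.sort()
            run_paths.append(write_sorted_run(chunk))
            chunk = []
    if chunk:  # leftover partial group
        chunk.sort()
        run_paths.append(write_sorted_run(chunk))

    # Phase 2: merge the sorted groups. heapq.merge repeatedly emits the
    # smallest pending value and pulls the next one from the same group.
    try:
        with open(out_path, "w") as out:
            for value in heapq.merge(*(read_ints(p) for p in run_paths)):
                out.write(f"{value}\n")
    finally:
        for p in run_paths:
            os.remove(p)
```

For the example in the answer you would call something like `external_sort("ints.txt", "sorted.txt", chunk_size=100_000)` (file names hypothetical). Each element is read twice (once to build its group, once during the merge) and written twice (once to its group, once to the output), so both reads and writes stay linear in n, with the comparisons split between the in-memory sorts and the merge.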