I have a text file with more than 10 million lines. The lines look like this:
37024469;196672001;255.0000000000
37024469;196665001;396.0000000000
37024469;196664001;396.0000000000
37024469;196399002;85.0000000000
37024469;160507001;264.0000000000
37024469;160506001;264.0000000000
As you can see, the delimiter is “;”. I would like to sort this text file by the second element using Python. I couldn’t use the split function because it causes a MemoryError. How can I manage it?
Don’t sort 10 million lines in memory. Split the work into batches instead:
Run 100 sorts of 100k lines each (use the file as an iterator, combined with islice() or similar, to pick a batch), and write each sorted batch out to a separate file elsewhere.
Merge the sorted files. A merge generator that is passed the 100 open files will yield lines in sorted order; write those to a new file line by line.
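The two steps could be sketched roughly as follows. This is a minimal sketch and not necessarily the merge generator the answer had in mind: it swaps in the standard library's heapq.merge() (its key argument needs Python 3.5+) for the merge step. The function names sort_key, write_sorted_batches, merge_batches and external_sort are made up for illustration, and the code assumes every line has at least two ";"-separated fields with an integer in the second one.

```python
import heapq
import os
import tempfile
from itertools import islice


def sort_key(line):
    # Second ';'-separated field, compared as an integer (assumed format).
    return int(line.split(";", 2)[1])


def write_sorted_batches(src_path, batch_size, tmp_dir):
    """Sort the input batch by batch, writing one temporary file per batch."""
    paths = []
    with open(src_path) as src:
        while True:
            # islice pulls at most batch_size lines from the file iterator.
            batch = list(islice(src, batch_size))
            if not batch:
                break
            # The file's last line may lack a newline; add one so that
            # lines stay separated after sorting and merging.
            if not batch[-1].endswith("\n"):
                batch[-1] += "\n"
            batch.sort(key=sort_key)
            path = os.path.join(tmp_dir, "batch_%05d.txt" % len(paths))
            with open(path, "w") as out:
                out.writelines(batch)
            paths.append(path)
    return paths


def merge_batches(paths, dst_path):
    """Merge the already-sorted batch files into dst_path, line by line."""
    files = [open(p) for p in paths]
    try:
        with open(dst_path, "w") as dst:
            # heapq.merge reads lazily, holding about one line per file.
            dst.writelines(heapq.merge(*files, key=sort_key))
    finally:
        for f in files:
            f.close()


def external_sort(src_path, dst_path, batch_size=100_000):
    with tempfile.TemporaryDirectory() as tmp_dir:
        paths = write_sorted_batches(src_path, batch_size, tmp_dir)
        merge_batches(paths, dst_path)
```

Calling external_sort("input.txt", "sorted.txt") (both file names are placeholders) keeps at most one batch of lines in memory at a time. With 10 million lines and the default batch size this opens about 100 files during the merge, so a much smaller batch size could run into the operating system's open-file limit.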