In a C program, I need to search for an exact string in a regular file (I'm on Linux). How should I go about it?
My first idea was to read each line of the file into RAM (via fgets()) and, after each read, check whether that line is the string I'm looking for. If it isn't, the loop calls fgets() again and keeps checking until EOF.
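For reference, the line-by-line search described above might look roughly like the sketch below. It is only an illustration: the function name file_contains_line is made up, and it uses POSIX getline() in place of fgets() so that no maximum line length has to be assumed.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if some line of the file equals
 * target exactly, 0 if none does, and -1 if the file cannot be opened. */
int file_contains_line(const char *path, const char *target)
{
    assert(path != NULL && target != NULL);

    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    size_t tlen = strlen(target);
    char *line = NULL;   /* buffer allocated and grown by getline() */
    size_t cap = 0;
    ssize_t len;
    int found = 0;

    while ((len = getline(&line, &cap, fp)) != -1) {
        /* Drop the trailing newline so it does not affect the comparison. */
        if (len > 0 && line[len - 1] == '\n')
            len--;
        if ((size_t)len == tlen && memcmp(line, target, tlen) == 0) {
            found = 1;
            break;       /* stop at the first match */
        }
    }

    free(line);
    fclose(fp);
    return found;
}
```

Calling file_contains_line("strings.txt", "some line") would scan the file once from the top, so the cost grows linearly with the file size (the file name here is just a placeholder).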
But what happens with a file of 150 million lines? This kind of sequential search seems very inefficient.
I also thought about a kind of binary search, using insertion sort to keep the lines my program adds to the file in order (it adds one line roughly every 3 seconds, immediately after checking that the line doesn't already appear in the strings file). But I gave up on that, because I would first need to read the lines into RAM, which would take the same time as a sequential search. So I chose the sequential search.
Is this reasoning right? Or is there a better way? I really hope so.
You could use mmap to map the entire file into memory, and then do a strnstr search.
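A minimal sketch of that idea follows, under a few assumptions. strnstr is a BSD function that glibc does not provide, so this version swaps in memmem, a GNU extension that searches a length-bounded buffer in a similar way. The function name mapped_file_contains_line is made up, and the extra boundary check is there only because the question asks for an exact line rather than a substring.

```c
#define _GNU_SOURCE              /* for memmem(), a GNU extension */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if some line of the file equals
 * target exactly, 0 if none does, and -1 on an I/O error.
 * An empty target is treated as "not found" in this sketch. */
int mapped_file_contains_line(const char *path, const char *target)
{
    assert(path != NULL && target != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == -1) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {       /* mmap() rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t size = (size_t)st.st_size;
    char *data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                   /* the mapping stays valid after close() */
    if (data == MAP_FAILED)
        return -1;

    size_t tlen = strlen(target);
    const char *end = data + size;
    const char *pos = data;
    int found = 0;

    while (tlen > 0 && pos < end) {
        const char *hit = memmem(pos, (size_t)(end - pos), target, tlen);
        if (hit == NULL)
            break;
        /* Accept the hit only if it spans a whole line. */
        int at_start = (hit == data) || (hit[-1] == '\n');
        int at_end = (hit + tlen == end) || (hit[tlen] == '\n');
        if (at_start && at_end) {
            found = 1;
            break;
        }
        pos = hit + 1;           /* keep looking after a partial-line hit */
    }

    munmap(data, size);
    return found;
}
```

This is still a linear scan over the whole file. What mapping saves is the per-line fgets() calls and the copying into a separate buffer, since the kernel pages the file in on demand.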