I have binary data that contains some text. The text is known. What would be a fast way to search for that text?
For example:
This is text 1—
!@##$%%#^%&!%^$! <= Assume this line is 3 MB of binary data
Now, This is text 2 —
!@##$%%#^%&!%^$! <= Assume this line is 2.5 MB of binary data
This is text 3 —
How can I search for the text "This is text 2"?
Currently I'm doing this:
size_t count = 0;
size_t s_len = strlen("This is text 2");
// Assume data_len is the length of the data to be searched and data is a pointer (char*) to its start.
// Stop s_len bytes before the end so memcmp never reads past the buffer.
for (; count + s_len <= data_len; ++count)
{
    if (!memcmp("This is text 2", data + count, s_len))
    {
        printf("%s\n", "Hurray found you...");
    }
}
- Is there any other, more efficient way to do this?
- Will replacing the ++count logic with memchr('T') logic help? (Please ignore this if the statement is not clear.)
- What is the average-case big-O complexity of memchr?
There's nothing in standard C to help you, but there is a GNU extension, memmem(), that does this.
If you need to be portable to systems that don't have it, you could take the glibc implementation of memmem() and incorporate it into your program.
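As a rough illustration of the answer above, here is a minimal sketch of calling memmem() on a binary buffer. The helper names find_text and find_text_portable are hypothetical, introduced only for this example. The second helper is a simple fallback that uses memchr to jump to candidate first bytes and memcmp to verify them, along the lines the question suggests. It is not the glibc algorithm and makes no claim to match its performance.

```c
#define _GNU_SOURCE   /* must come before the includes so <string.h> declares memmem() */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: offset of the first occurrence of the NUL-terminated
 * string `text` inside the binary buffer `data`, or (size_t)-1 if absent.
 * Relies on the GNU extension memmem(). */
size_t find_text(const char *data, size_t data_len, const char *text)
{
    const char *hit = memmem(data, data_len, text, strlen(text));
    return hit ? (size_t)(hit - data) : (size_t)-1;
}

/* Hypothetical portable fallback (assumption: simplicity matters more than
 * worst-case speed). memchr skips to the next byte equal to text[0];
 * memcmp then checks the whole pattern at that position. */
size_t find_text_portable(const char *data, size_t data_len, const char *text)
{
    size_t text_len = strlen(text);
    if (text_len == 0)
        return 0;                       /* empty pattern matches at offset 0 */
    if (text_len > data_len)
        return (size_t)-1;              /* pattern cannot fit in the buffer */

    const char *p = data;
    const char *end = data + (data_len - text_len) + 1;  /* one past the last valid start */

    while (p < end && (p = memchr(p, text[0], (size_t)(end - p))) != NULL) {
        if (memcmp(p, text, text_len) == 0)
            return (size_t)(p - data);
        ++p;                            /* false candidate, resume after it */
    }
    return (size_t)-1;
}
```

Both helpers take the buffer length explicitly, so embedded NUL bytes in the binary data are not a problem. Only the search text is assumed to be NUL-terminated.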