I have written the following function:
// O(n^2)
void MostCommonPair(char * cArr , char * ch1 , char * ch2 , int * amount)
{
    int count , max = 0;
    char cCurrent , cCurrent2;
    int i = 0 , j;
    // Take each adjacent pair in turn as the candidate.
    while(*(cArr + i + 1) != '\0')
    {
        cCurrent = *(cArr + i);
        cCurrent2 = *(cArr + i + 1);
        // Count occurrences of the candidate pair from position i onwards.
        for(j = i , count = 0 ; *(cArr + j + 1) != '\0' ; j++)
        {
            if(cCurrent == *(cArr + j) && cCurrent2 == *(cArr + j + 1))
            {
                count++;
            }
        }
        // Remember the pair with the highest count seen so far.
        if(count > max)
        {
            *ch1 = cCurrent;
            *ch2 = cCurrent2;
            max = *amount = count;
        }
        i++;
    }
}
For the following input,
“xdshahaalohalobscxbsbsbs”
it gives ch1 = b, ch2 = s, amount = 4.
But in my opinion the function is very inefficient. Is there a way to go through the string only once, or otherwise reduce the running time to O(n)?
Since char can hold up to 256 values (assuming an 8-bit char), you can set up a two-dimensional table of 256×256 counters and run through your string once, incrementing the counter that corresponds to each pair of adjacent characters in the string. Then you can go through the table of 256×256 numbers, pick the largest count, and know to what pair it belongs by looking at its position in the 2D array. Since the size of the counter table is fixed to a constant value independent of the length of the string, that scan is O(1), even though it requires two nested loops.
Here is a link to a demo on ideone.
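A minimal sketch of the table approach described above, in C. The name MostCommonPairTable is hypothetical, the signature simply mirrors the function in the question, and the code assumes an 8-bit char; it is an illustration rather than the code from the linked demo.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; same parameters as MostCommonPair in the question. */
void MostCommonPairTable(const char *cArr, char *ch1, char *ch2, int *amount)
{
    /* One counter per ordered pair of byte values. Declared static so the
       256 KB table is not placed on the stack. Assumes 8-bit char. */
    static int counts[256][256];
    size_t i;
    int a, b;

    memset(counts, 0, sizeof counts);
    *amount = 0;                 /* stays 0 if the string contains no pair */

    if (cArr[0] == '\0')
        return;

    /* Single pass over the string: O(n). The casts keep the indices
       non-negative on platforms where char is signed. */
    for (i = 0; cArr[i + 1] != '\0'; i++)
        counts[(unsigned char)cArr[i]][(unsigned char)cArr[i + 1]]++;

    /* Fixed-size scan of 65,536 counters: constant work regardless of n. */
    for (a = 0; a < 256; a++)
    {
        for (b = 0; b < 256; b++)
        {
            if (counts[a][b] > *amount)
            {
                *amount = counts[a][b];
                *ch1 = (char)a;
                *ch2 = (char)b;
            }
        }
    }
}
```

Because the table is static, this sketch is not safe to call from several threads at once. On ties it reports the pair with the smallest byte values, which may differ from the pair the original function picks.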
Keep in mind that although this is the fastest possible solution asymptotically (i.e. it’s O(N), and you cannot make it faster than O(N)), the performance is not going to be good for shorter strings. In fact, your solution will beat it hands-down on inputs shorter than approximately 256 characters, and probably on somewhat longer ones too. There are a number of optimizations that you can apply to this code, but I decided against adding them to keep the main idea of the code clearly visible in its purest and simplest form.
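The answer does not say which optimizations it has in mind, so the following is only one possible example, offered as an assumption: track the running maximum while counting, then zero only the counters that were touched. That avoids both the 65,536-entry scan and a full clear of the table on every call, which is where short strings lose time. The name MostCommonPairRunningMax is hypothetical.

```c
#include <stddef.h>

/* Hypothetical name; same parameters as MostCommonPair in the question. */
void MostCommonPairRunningMax(const char *cArr, char *ch1, char *ch2, int *amount)
{
    /* Static storage starts out zeroed; the function restores the zeros
       before returning, so no full clear is needed. Assumes 8-bit char. */
    static int counts[256][256];
    size_t i;

    *amount = 0;                 /* stays 0 if the string contains no pair */
    if (cArr[0] == '\0')
        return;

    /* Pass 1: count each adjacent pair and keep the best count seen so far. */
    for (i = 0; cArr[i + 1] != '\0'; i++)
    {
        unsigned char a = (unsigned char)cArr[i];
        unsigned char b = (unsigned char)cArr[i + 1];
        int c = ++counts[a][b];

        if (c > *amount)
        {
            *amount = c;
            *ch1 = (char)a;
            *ch2 = (char)b;
        }
    }

    /* Pass 2: reset only the counters this call touched. */
    for (i = 0; cArr[i + 1] != '\0'; i++)
        counts[(unsigned char)cArr[i]][(unsigned char)cArr[i + 1]] = 0;
}
```

This walks the string twice rather than once, but the total work is proportional to the string length alone. On ties it reports the pair that reaches the maximum count first, and like any version with a static table it is not thread-safe.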