Consider:
a c p r c
x s o p c
v o v n i
w g f m n
q a t i t
An alphabet i_index is adjacent to another alphabet j_index in the tile if i_index is next to j_index in any of the following positions:
* * *
* x *
* * *
Here all the * indicates the location which are adjacent to x.
The task is to find a given string in the tile. The condition is that all the characters of the given string should be adjacent, and no one character in the tile may be used more than once to construct the given string.
I have came up with a simply backtracking solution, for which the solutions are pretty fast, but the worst case time is really worse.
For an example: Say the tile has 4×4 filled with all a‘s , therefore 16 a‘s, and the string to find is aaaaaaaaaaaaaaab, that is, 15 a‘s and one b . One what is to eliminate strings with characters which does not appear in the tile. But still worst case can still appear with say the tile have abababababababab and the string to find is abababababababbb .
My attempt is like this:
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define MAX 5
int sp (char mat[MAX][MAX], char *pat, int c, int i, int j)
{
int r = 0;
char temp;
if (c == strlen (pat))
return 1;
if (((i<0) || (j<0)) || (i>=MAX) || (j>=MAX))
return 0;
if (mat[i][j] != pat[c])
return 0;
if (isupper (mat[i][j]))
return 0;
/* Save character and mark location to indicate
* DFS has visited this node, to stop other branches
* to enter here and cross over path
*/
temp = mat[i][j];
mat[i][j] = 0;
r |= sp (mat, pat, c+1, i-1, j-1);
r |= sp (mat, pat, c+1, i-1, j);
r |= sp (mat, pat, c+1, i-1, j+1);
r |= sp (mat, pat, c+1, i, j+1);
r |= sp (mat, pat, c+1, i+1, j+1);
r |= sp (mat, pat, c+1, i+1, j);
r |= sp (mat, pat, c+1, i+1, j-1);
r |= sp (mat, pat, c+1, i, j-1);
/* restore value */
mat[i][j] = temp;
/* mark if success */
if ((mat[i][j] == pat[c]) && (r == 1))
mat[i][j] = toupper (mat[i][j]);
return r;
}
/* Testing the `sp` module */
int main (void)
{
char mat[MAX][MAX] = {
{'a', 'c', 'p', 'r', 'c'},
{'x', 's', 'o', 'p', 'c'},
{'v', 'o', 'v', 'n', 'i'},
{'w', 'g', 'f', 'm', 'n'},
{'q', 'a', 't', 'i', 't'}
};
char pat[] = "microsoft";
int i, j;
for (i=0; i<5; i++)
{
for (j=0; j<5; j++)
printf ("%c ", mat[i][j]);
printf ("\n");
}
for (i=0; i<5; i++)
for (j=0; j<5; j++)
sp (mat, pat, 0, i, j);
printf ("\n\n\n");
for (i=0; i<5; i++)
{
for (j=0; j<5; j++)
{
if (isupper (mat[i][j]))
printf ("%c ", mat[i][j]);
else
printf (". ");
}
printf ("\n");
}
printf ("\n");
return 0;
}
which prints:
a c p r c
x s o p c
v o v n i
w g f m n
q a t i t
. . . R .
. S O . C
. O . . I
. . F M .
. . T . .
The function sp does the work, performs the back tracking.
Is there a better way ? or is it possible to lower the worst case time ?
There is no polynomial algorithm, so I don’t think you can get much better…
It is possible to encode any grid graph (a planar graph with nodes with degree <= 4) using a letter matrix. The following grid
Can be converted by turning edges into ‘a’s, vertices into ‘b’s and empty spaces into ‘z’s
Looking for a hamiltonian path in the original graph is equivalent to searching for the string BaBaBaBaBaBaBaBaB (with all 9 Bs). But the Hamiltonian path problem for grids in NP-complete* so the word searching problem is NP-hard.
Since a word path is clearly a polynomial certificate, the word searching problem on matrices is NP-complete.
*I remember seeing a proof for this a while ago and Wikipedia confirms, but without linking to a reference >:/
I’m pretty sure theres more on this problem out there. I just pulled this proof out of my ass and I’m pretty sure were not the first ones to look at the problem. At the least there is a good chance for nice heuristics on the non-degenerate cases you get in a real magazine puzzle…