Here’s the deal: I’m trying, as a learning experience, to convert a C program to C++. This program takes a text file and applies modifications to it according to user-supplied rules. Specifically, it applies sound changes to a set of words, using rules formatted like “s1/s2/env”. s1 represents the characters to be changed, s2 represents what to change them into, and env is the context in which the change should be applied.
I’m sorry that I don’t describe this in more depth, but the question would be too long, and the author’s site already explains it.
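For readers following along in C++, here is a minimal sketch of how a rule string in the “s1/s2/env” format could be split into its three parts. This is not the program’s own Divide function; the names Rule and parseRule are made up for illustration, and the sketch assumes C++17 (for std::optional) and that the parts are separated by the first two slashes.

```cpp
#include <optional>
#include <string>

// Hypothetical holder for the three parts of a rule such as "v/b/_".
struct Rule {
    std::string s1;   // characters to change
    std::string s2;   // what to change them into
    std::string env;  // context in which the change applies
};

// Split "s1/s2/env" on its first two slashes.
// Returns std::nullopt when fewer than two slashes are present.
std::optional<Rule> parseRule(const std::string& text) {
    const auto first = text.find('/');
    if (first == std::string::npos) return std::nullopt;
    const auto second = text.find('/', first + 1);
    if (second == std::string::npos) return std::nullopt;
    return Rule{text.substr(0, first),
                text.substr(first + 1, second - first - 1),
                text.substr(second + 1)};
}
```

With this sketch, parseRule("v/b/_") yields s1 = "v", s2 = "b" and env = "_", while a string with fewer than two slashes yields no rule at all.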
The function I’m having trouble with is TryRule. I understand that it’s supposed to see if a given rule applies to a given string, but I’m having trouble understanding exactly how it does it. The poor explanation of the parameters confuses me: for example, I don’t understand why the strings “s1” and “s2” have to be passed back, or what “i” represents.
This is the code:
/*
** TryRule
**
** See if a rule s1->s2/env applies at position i in the given word.
**
** If it does, we pass back the index where s1 was found in the
** word, as well as s1 and s2, and return TRUE.
**
** Otherwise, we return FALSE, and pass garbage in the output variables.
*/
int TryRule( char *word, int i, char *Rule, int *n, char **s1, char **s2, char *varRep )
{
    int j, m, cont = 0;
    int catLoc;
    char *env;
    int optional = FALSE;
    *varRep = '\0';
    if (!Divide( Rule, s1, s2, &env ) || !strchr( env, '_' ))
        return(FALSE);
    for (j = 0, cont = TRUE; cont && j < strlen(env); j++)
    {
        switch( env[j] )
        {
            case '(':
                optional = TRUE;
                break;
            case ')':
                optional = FALSE;
                break;
            case '#':
                cont = j ? (i == strlen(word)) : (i == 0);
                break;
            case '_':
                cont = !strncmp( &word[i], *s1, strlen(*s1) );
                if (cont)
                {
                    *n = i;
                    i += strlen(*s1);
                }
                else
                {
                    cont = TryCat( *s1, &word[i], &m, &catLoc );
                    if (cont && m)
                    {
                        int c;
                        *n = i;
                        i += m;
                        for (c = 0; c < nCat; c++)
                            if ((*s2)[0] == Cat[c][0] && catLoc < strlen(Cat[c]))
                                *varRep = Cat[c][catLoc];
                    }
                    else if (cont)
                        cont = FALSE;
                }
                break;
            default:
                cont = TryCat( &env[j], &word[i], &m, &catLoc );
                if (cont && !m)
                {
                    /* no category applied */
                    cont = i < strlen(word) && word[i] == env[j];
                    m = 1;
                }
                if (cont)
                    i += m;
                if (!cont && optional)
                    cont = TRUE;
        }
    }
    if (cont && printRules)
        printf( " %s->%s /%s applies to %s at %i\n",
            *s1, *s2, env, word, *n );
    return(cont);
}
This code is… tough to read. I looked at the original file, and it could really use some better variable names. I especially love this part from one of the function comments:
I can see the challenge. I agree with wmeyer about the variables. I think I understand things, so I’m going to attempt to translate the function into pseudo code.
word: The string we are looking at
i: The index in word where we start matching the rule’s environment
Rule: The text of the rule (i.e. “v/b/_”)
n: Returns the index in word where s1 was found, i.e. where the _ in the environment matched
s1: Returns the first part of the rule, decoded out of Rule
s2: Returns the second part of the rule, decoded out of Rule
varRep: Returns the replacement character when s1 matched through a category: the character at the same position (catLoc) in the category named by s2, I think
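Since the goal is a C++ port, here is a rough sketch of how the parameters listed above might look if the output parameters were bundled into a return value instead. It is a deliberately simplified illustration, not a translation of the original: it handles only literal characters, '#' and '_', and leaves out categories (TryCat, varRep) and the parenthesised optional parts. The names RuleMatch, tryRule and applyOnce are hypothetical, the rule is assumed to be split into s1, s2 and env already, and std::optional requires C++17.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical result type: bundles what the C version passes back through
// the output parameters n, s1 and s2.
struct RuleMatch {
    std::size_t n;    // index in the word where s1 was found
    std::string s1;   // text to replace
    std::string s2;   // replacement text
};

// Simplified sketch: literal characters only. Categories and the
// parenthesised optional parts of an environment are left out.
// `i` is the position in `word` where the environment starts matching.
std::optional<RuleMatch> tryRule(const std::string& word, std::size_t i,
                                 const std::string& s1, const std::string& s2,
                                 const std::string& env) {
    if (env.find('_') == std::string::npos) return std::nullopt;
    std::size_t n = 0;
    for (std::size_t j = 0; j < env.size(); ++j) {
        const char e = env[j];
        if (e == '#') {
            // Leading '#' means start of word, any other '#' means end of word.
            const bool atBoundary = (j == 0) ? (i == 0) : (i == word.size());
            if (!atBoundary) return std::nullopt;
        } else if (e == '_') {
            // The underscore stands for s1 itself.
            if (word.compare(i, s1.size(), s1) != 0) return std::nullopt;
            n = i;
            i += s1.size();
        } else {
            // Any other character must appear literally in the word.
            if (i >= word.size() || word[i] != e) return std::nullopt;
            ++i;
        }
    }
    return RuleMatch{n, s1, s2};
}

// Hypothetical caller: apply the rule at its first matching position, if any.
std::string applyOnce(std::string word, const std::string& s1,
                      const std::string& s2, const std::string& env) {
    for (std::size_t i = 0; i <= word.size(); ++i) {
        if (const auto m = tryRule(word, i, s1, s2, env)) {
            word.replace(m->n, m->s1.size(), m->s2);
            break;
        }
    }
    return word;
}
```

For example, tryRule("bav", 1, "v", "x", "a_") starts matching the environment at i = 1 but reports n = 2, which is presumably why the C version passes n back separately from i. The caller then needs s1 (to know how many characters to remove) and s2 (to know what to put in their place), as applyOnce does.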
I hope this helps. You may need to read it a few times. It took me over 45 minutes to write this, almost entirely because of trying to decipher exactly what’s going on in some of the cases around TryCat. Add in about 5 minutes for constantly trying to hit the Tab key and getting my cursor sent to the next field (stupid HTML text box).
Sorry this is so big; you’ll probably have to do a bunch of horizontal scrolling.
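To make the category case a bit more concrete, here is a small C++ sketch of the idea that varRep seems to implement: if a character of the word belongs to the category named by s1, pick the character at the same position in the category named by s2. This is my reading of the Cat[c][0] and Cat[c][catLoc] lines, not a port of TryCat. The storage layout (one string per category, name first, members after) and the names Categories, findCategory and mapThroughCategories are assumptions made for illustration; the real program may store categories differently.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// ASSUMPTION: each category is one string whose first character is the
// category's name and whose remaining characters are its members,
// e.g. "Sptk" for S = {p, t, k}. The real layout may differ.
using Categories = std::vector<std::string>;

// Hypothetical helper: find the category named `name`, or nullptr.
const std::string* findCategory(const Categories& cats, char name) {
    for (const auto& cat : cats)
        if (!cat.empty() && cat[0] == name) return &cat;
    return nullptr;
}

// If `ch` belongs to the category named `from`, return the character at the
// same position in the category named `to`; otherwise return std::nullopt.
std::optional<char> mapThroughCategories(const Categories& cats,
                                         char from, char to, char ch) {
    const std::string* source = findCategory(cats, from);
    const std::string* target = findCategory(cats, to);
    if (!source || !target) return std::nullopt;
    const std::size_t pos = source->find(ch, 1);  // skip the name at index 0
    if (pos == std::string::npos || pos >= target->size()) return std::nullopt;
    return (*target)[pos];
}
```

With categories "Sptk" and "Zbdg", a rule like “S/Z/_” applied to the character t would give d under this sketch, because t and d sit at the same position in their categories.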