I am trying to write lex code which will take an input and then find and print the first permutation of that input that it finds in a large dictionary text file. This is what I have so far:
%{
#include <stdio.h>
%}
%option noyywrap
%%
INPUT GOES HERE { //Not sure what expression to put here
    printf("First permutation is: %s\n", yytext);
    return 0;
}
.|\n { }
%%
int main(void)
{
    yylex();
    return 0;
}
I have a feeling I’ll have to use states, but I’m not too familiar with how those work. Can someone point me in the right direction?
EDIT: Here is the code for the accepted answer in case anyone wants it:
%{
#include <stdio.h>
#include <string.h>
%}
%option noyywrap
%%
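^[ablm]{4}$ {
    /* Candidate: a 4-letter line made only of the letters a, b, l, m. */
    const char input[] = "ablm";   /* target letters, already sorted */
    char sorted[5];
    char tmp;
    int i, j;

    /* Sort a copy so yytext still holds the original word. */
    strcpy(sorted, yytext);
    for (i = 0; i < 4; i++) {
        for (j = i + 1; j < 4; j++) {
            if (sorted[i] > sorted[j]) {
                tmp = sorted[i];
                sorted[i] = sorted[j];
                sorted[j] = tmp;
            }
        }
    }

    /* Equal after sorting means yytext is a permutation of the input. */
    if (strcmp(input, sorted) == 0) {
        printf("First permutation is: %s\n", yytext);
        return 0;   /* yylex() returns int, so a bare "return;" is wrong */
    }
}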
.|\n { }
%%
int main(void)
{
    yylex();
    return 0;
}
Regular expressions have no native way to match strings of the form “some permutation of the following symbols.” You can write a regular expression that matches the permutations of some string, but to do so you would (more or less) have to write out every permutation of those characters and OR them all together, and the number of alternatives grows factorially with the length of the string.
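To make the cost of that approach concrete, here is a rough sketch in C that builds the OR-ed pattern for a given set of letters. The names `swap_chars`, `append_permutations` and `build_alternation` are made up for this illustration, and the sketch assumes the letters are distinct, plain alphanumeric characters (nothing that lex treats as an operator) and that there are fewer than 16 of them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Swap two characters in place. */
static void swap_chars(char *a, char *b)
{
    char t = *a;
    *a = *b;
    *b = t;
}

/* Permute word[pos..len-1]; append every complete permutation to out,
   separated by '|'. out must already be a NUL-terminated string. */
static void append_permutations(char *word, size_t pos, size_t len, char *out)
{
    size_t i;

    if (pos == len) {
        if (out[0] != '\0')
            strcat(out, "|");
        strncat(out, word, len);
        return;
    }
    for (i = pos; i < len; i++) {
        swap_chars(&word[pos], &word[i]);
        append_permutations(word, pos + 1, len, out);
        swap_chars(&word[pos], &word[i]);   /* restore before the next choice */
    }
}

/* Hypothetical helper: writes "p1|p2|...|pN" for all N = n! permutations of
   letters into out. Assumes distinct letters; needs n! * (n + 1) bytes. */
static void build_alternation(const char *letters, char *out, size_t out_size)
{
    char work[16];
    size_t len = strlen(letters);
    size_t needed = len + 1;
    size_t k;

    assert(len < sizeof work);
    for (k = 2; k <= len; k++)
        needed *= k;                 /* needed = n! * (n + 1) */
    assert(out_size >= needed);

    strcpy(work, letters);
    out[0] = '\0';
    append_permutations(work, 0, len, out);
}
```

For the four letters in the question this produces 24 alternatives, which is still manageable, but ten letters would already need 3,628,800 of them.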
An easier way to do this is to use a regular expression that matches every string of the appropriate length made up only of symbols from the string in question. You can then attach an action to that regular expression which takes each candidate string and uses ordinary C code to check whether it is a permutation of the original characters. This should be very fast: a real dictionary will likely produce few false positives, and checking a single candidate is cheap.
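As a minimal sketch of the "ordinary C code" part, the check could be done by counting characters instead of sorting. The function name `is_permutation` is just a name picked for this example, and it assumes both arguments are NUL-terminated strings; inside a lex action you would pass `yytext` as the candidate.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true when candidate uses exactly the same
   characters, with the same multiplicities, as letters. */
static bool is_permutation(const char *candidate, const char *letters)
{
    int counts[256] = {0};
    size_t i;

    assert(candidate != NULL && letters != NULL);

    /* A permutation must have the same length as the original. */
    if (strlen(candidate) != strlen(letters))
        return false;

    /* Count how often each character occurs in the original letters. */
    for (i = 0; letters[i] != '\0'; i++)
        counts[(unsigned char)letters[i]]++;

    /* Use up one count per candidate character; running out means the
       candidate has a character the original lacks (or too many of one). */
    for (i = 0; candidate[i] != '\0'; i++)
        if (--counts[(unsigned char)candidate[i]] < 0)
            return false;

    /* Equal lengths and no count below zero: every count is back at zero. */
    return true;
}
```

Unlike comparing sorted copies, this handles repeated letters and any word length without needing a fixed-size buffer, so the action would only need something like `if (is_permutation(yytext, "ablm")) { ... }`.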
Hope this helps!