When parsing text, I frequently need to implement mini state machines of the generic form shown in the code below.
Is there a CPAN module that’s considered “best practice” and well suited to implement state machine logic like this in an easy and elegant way?
I would prefer solutions less complicated than Parse::RecDescent, but if none exist and Parse::RecDescent is a lot easier to apply to this problem than I thought, I’m very willing to consider it instead of rolling my own as I have so far.
Example generic parsing code:
    my $state = 1;
    while (my $token = get_next_token()) { # Usually next line
        if ($state == 1) {
            do_state1_processing();
            if (token_matches_transition_1_to_2($token)) {
                do_state_1_to_2_transition_processing();
                $state = 2;    # assignment, not comparison
                next;
            } elsif (token_matches_transition_1_to_4($token)) {
                do_state_1_to_4_transition_processing();
                $state = 4;
                next;
            } else {
                do_state1_continuation();
                next;
            }
        } elsif ($state == 5) {
            do_state5_processing();
            if (token_matches_transition_5_to_6($token)) {
                do_state_5_to_6_transition_processing();
                $state = 6;
                next;
            } elsif (token_matches_transition_5_to_4($token)) {
                do_state_5_to_4_transition_processing();
                $state = 4;
                next;
            } else {
                do_state5_continuation();
                next;
            }
        } else {
        }
    }
I would recommend taking a look at Marpa and Marpa::XS.
Just look at this simple calculator.
You will have to implement the tokenizer yourself.
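Since the tokenizer is left to you, here is a minimal sketch of one in plain Perl, assuming a small calculator-style input (numbers, the four arithmetic operators, and parentheses). The token names, the `@token_types` table, and the `tokenize` function are hypothetical and only for illustration; they are not part of Marpa's API, and how the tokens are then handed to the parser depends on the Marpa version you use.

```perl
use strict;
use warnings;

# Hypothetical token table for a small calculator input: [name, regex].
my @token_types = (
    [ NUMBER => qr/\d+(?:\.\d+)?/ ],
    [ OP     => qr{[-+*/]}        ],
    [ LPAREN => qr/\(/            ],
    [ RPAREN => qr/\)/            ],
);

# Split a string into [type, value] pairs, skipping whitespace.
# \G anchors each match at the current position; /gc keeps that
# position intact when a match attempt fails.
sub tokenize {
    my ($input) = @_;
    my @tokens;
    pos($input) = 0;
    TOKEN: while (pos($input) < length($input)) {
        next TOKEN if $input =~ /\G\s+/gc;
        for my $type (@token_types) {
            my ($name, $re) = @$type;
            if ($input =~ /\G($re)/gc) {
                push @tokens, [ $name, $1 ];
                next TOKEN;
            }
        }
        die sprintf "Unexpected character '%s' at offset %d\n",
            substr($input, pos($input), 1), pos($input);
    }
    return @tokens;
}

# Usage: print each token as TYPE(value).
print join(' ', map { "$_->[0]($_->[1])" } tokenize('2 * (3 + 4.5)')), "\n";
```

Keeping the token definitions in a table means adding a token type is a one-line change, and the same list of `[type, value]` pairs could also drive a hand-written state machine like the one in the question.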