I have a regular expression (REGEX 1) plus some Perl code that picks out a specific string of text, call it the START_POINT, from a large text document. This START_POINT is the beginning of a larger string of text that I want to extract from the large text document. I want to use another regular expression (REGEX 2) to extract from START_POINT to an END_POINT. I have a set of words to use in the regular expression (REGEX 2) which will easily find the END_POINT. Here is my problem. The START_POINT text string may contain metacharacters, which the regular expression will interpret as pattern syntax rather than as literal text. I don’t know ahead of time which ones these will be. I am trying to process a large set of text documents and the START_POINT will vary from document to document. How do I tell a regular expression to interpret a text string as just the text string and not as a text string with metacharacters?
Perhaps this code will help make this clearer. $START_POINT was identified in code above this piece of code and is an extracted part of the large text string $TEXT.
my $END_POINT = "(STOP|CEASE|END|QUIT)";
my @NFS = $TEXT =~ m/(($START_POINT).*?($END_POINT))/misog;
I have tried to use the quotemeta function, but haven’t had any success. It seems to destroy the integrity of the $START_POINT text string by adding in backslashes which change the text.
So to summarize, I am looking for some way to tell the regular expression to look for the exact string in $START_POINT without interpreting any of the string as a metacharacter, while still maintaining the integrity of the string. Although I may be able to get quotemeta to work, do you know of any other options available?
Thanks in advance for your help!
You need to convert the text to a regex pattern. That’s what quotemeta does. quotemeta can be accessed via \Q..\E.
Why reimplement quotemeta?
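Here is a minimal sketch of how that could look for the match in the question. The sample $TEXT and $START_POINT values are made up for illustration, and the pattern is simplified to a single capture group with $END_POINT compiled via qr//, so treat it as one possible shape rather than a drop-in replacement. Note that the backslashes only ever exist in the pattern: $START_POINT itself and the text that gets captured are left untouched.

```perl
use strict;
use warnings;

# Hypothetical sample data; in the real script $TEXT and $START_POINT
# come from the earlier code.
my $TEXT        = "Header line\nPrice (USD) is \$5.00+ [approx]\nsome body text\nplease STOP here\n";
my $START_POINT = 'Price (USD) is $5.00+ [approx]';

# The end marker is meant to be a pattern, so it is compiled as one.
my $END_POINT = qr/STOP|CEASE|END|QUIT/i;

# \Q...\E applies quotemeta to the interpolated value only,
# so ( ) $ . + [ ] in $START_POINT are matched literally.
my @NFS = $TEXT =~ m/(\Q$START_POINT\E.*?$END_POINT)/sg;

# Equivalent form with an explicit quotemeta call; the escaped copy
# lives in its own variable and $START_POINT is not modified.
my $quoted = quotemeta $START_POINT;
my @same   = $TEXT =~ m/($quoted.*?$END_POINT)/sg;

print "$_\n" for @NFS;
```

If the escaped string looked "destroyed" when you printed it, that is expected: the output of quotemeta is meant to be used as a pattern, not as the text itself.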