I need to perform a regular expression search for a string x in another

Asked: May 22, 2026

I need to perform a regular expression search for a string x in another string y, and then find the token (word) index of the first character of the hit, where the tokens come from splitting y with a second regular expression (e.g. whitespace). The first regular expression may match a substring of a token, so I cannot assume that the hit starts at the beginning of a token (word).

What would be the best algorithm to implement this? A simple approach would be the following:

Search for x in y using the first regular expression and get the character offset z
Split y into an array of elements using the second regular expression
Loop through the array of elements adding the length of each item to a variable LENGTH and adding 1 to a counter COUNTER
Stop the loop when LENGTH is greater or equal to z
The index of the token of the first character of the hit will be the value of COUNTER

(This assumes that the split function stores the splitting characters (e.g. whitespace) as array elements, which is very wasteful.)
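For reference, the steps above can be sketched as follows. This is only an illustration in Python, and the function name `token_index_by_split` is made up for the example. It assumes z is a zero-based offset, so the loop stops once the running length is strictly greater than z, and it assumes the separator pattern contains no capturing groups of its own. Because the kept separators alternate with the tokens, the position in the split array is halved to get the token index.

```python
import re

def token_index_by_split(y, search, sep=r"\s+"):
    """Zero-based index of the token holding the first character of the hit, or None."""
    m = re.search(search, y)
    if m is None:
        return None
    z = m.start()  # zero-based character offset of the hit

    # Wrapping the separator in a capturing group makes re.split keep the
    # separators, so tokens sit at even positions and separators at odd ones.
    pieces = re.split("(" + sep + ")", y)

    length = 0
    for i, piece in enumerate(pieces):
        length += len(piece)
        if length > z:      # the character at offset z lies inside this piece
            return i // 2   # discount the interleaved separator elements
    return None

print(token_index_by_split("The moon is made of cheese", "ade"))
```

If the hit starts inside a separator, this sketch attributes it to the preceding token; whether that is the desired behaviour depends on the application.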

A concrete (simple) example: Suppose I want to know the token (word) index for the search “ade” in the string “The moon is made of cheese”. The function should give me back the answer: 3 (for zero indexed arrays).

==Edit==
The algorithm also needs to work when the regex search crosses token boundaries. For example, it should again return the index “3” when searching for “de of ch” in “The moon is made of cheese”.

1 Answer

Editorial Team · Answer 1 · 2026-05-22T12:49:39+00:00

According to your updates:

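#!/usr/bin/perl
use strict;
use warnings;

my $string = "The moon is made of cheese";
my $search = qr/de of ch/;   # first regex: what to look for
my $sep    = qr/\s+/;        # second regex: token separator

if ($string =~ $search) {
    my $pos    = $-[0];                     # character offset where the hit starts
    my $prefix = substr($string, 0, $pos);  # everything before the hit
    # Every separator before the hit starts a new token, so the number of
    # separators in the prefix is the zero-based token index. Counting them
    # (rather than splitting the prefix) also works when the hit starts
    # exactly at the beginning of a token or at offset 0.
    my $index = () = $prefix =~ /$sep/g;
    print "found in word #$index\n";
} else {
    print "not found\n";
}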

output:

found in word #3
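The same prefix idea can also be written in terms of token start offsets, which avoids building the token array at all. The following is a rough sketch in Python rather than a definitive implementation; `token_starts` and `token_index` are names invented for this example, and it assumes the separator pattern never matches the empty string.

```python
import re
from bisect import bisect_right

def token_starts(y, sep=r"\s+"):
    """Character offsets at which each token begins when y is split on sep."""
    # The first token starts at 0; every later token starts where a separator ends.
    return [0] + [m.end() for m in re.finditer(sep, y)]

def token_index(y, search, sep=r"\s+"):
    """Zero-based index of the token holding the first character of the hit, or None."""
    m = re.search(search, y)
    if m is None:
        return None
    # The hit belongs to the last token that starts at or before its offset.
    return bisect_right(token_starts(y, sep), m.start()) - 1

print(token_index("The moon is made of cheese", "de of ch"))
```

When many searches are run against the same string, the list from `token_starts` could be computed once and reused, so each lookup is just a binary search.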
