I’m trying to retrieve the list of indices of each sub-string within a string. This string contains the special character \ several times in different places within the string. The \ should be recognized as a character and not as a special character. When I obtain the starting index of the sub-string it skips over the \ and returns one index less than what it should be. Any help on how to do this would be appreciated.
import re
text = "ab\fx*abcdfansab\fasdafdab\f664s"
for m in re.finditer( 'ab\f', text ):
    print( 'll found', m.start(), m.end() )
('ll found', 0, 3)
('ll found', 13, 16)
('ll found', 22, 25)
The second index should be (14, 17) and the third (24, 27). Also, I’m not sure why the first one is right.
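Python is interpreting the \ as an escape character, as many other programming languages do. In a normal string literal, \f is a single form-feed character, so each occurrence makes the string one character shorter than it looks and shifts every later index down by one. The first match is unaffected only because nothing precedes it. If you want a literal backslash, use raw strings, and also double the \ in the pattern, since backslash is a regex metacharacter.
Alternatively, double the backslashes everywhere and don’t use raw strings. Again, remember to doubly escape in the regex.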
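As a minimal sketch of both approaches, using the string from the question (the variable names `spans_raw` and `spans_escaped` are only illustrative):

```python
import re

# Approach 1: raw strings. Each \ stays a literal backslash.
text = r"ab\fx*abcdfansab\fasdafdab\f664s"

# The pattern still needs \\ because backslash is a regex metacharacter.
spans_raw = [m.span() for m in re.finditer(r"ab\\f", text)]
for start, end in spans_raw:
    print("found", start, end)

# Approach 2: no raw strings. Double every backslash in the text,
# and use four backslashes in the pattern (two for Python, two for the regex).
text_escaped = "ab\\fx*abcdfansab\\fasdafdab\\f664s"
spans_escaped = [m.span() for m in re.finditer("ab\\\\f", text_escaped)]
```

Both variants find the matches at (0, 4), (14, 18) and (24, 28). The start indices are the expected 14 and 24; the end indices are start + 4 because the match now covers four characters: a, b, \ and f.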