Quick RegExp problem (I hope).
I need to extract a substring from an arbitrary string based on a regular expression.
For example, take the following strings:
"Blogs, Joe (S0003-000292).html"
"bla bla bla S0003-000292 & so on"
"RE: S0003-000292"
I need to extract the ‘S0003-000292’ portion (or flag an exception if it is not found).
As for what I have tried: I’ve written a rough pattern to identify S0000-000000:
^\(S[0-9]{4}-[0-9]{6}\)$
And I have tried testing for it as follows:
Dim regex As New Regex("Blogs, Joe (S0003-000292) Lorem Ipsum!")
Dim match As Match = regex.Match("^S[0-9]{4}-[0-9]{6}$")
If match.Success Then
    Console.WriteLine("Found: " & match.Value)
Else
    Console.WriteLine("Not Found")
End If
However, this always results in Not Found.
So, two questions really: what is wrong with my pattern, and how can I use a revised pattern to extract the substring?
(Working with .NET 2.0)
EDIT: stema pointed me in the right direction (i.e. to drop the ^ and $). However, that alone did not solve the problem: my main mistake was passing the input string to the Regex constructor instead of the pattern. I swapped these over and it worked fine (I blame lack of caffeine):
Dim regex As New Regex("S[0-9]{4}-[0-9]{6}")
Dim match As Match = regex.Match("Joe, Blogs (S0003-000292).html")
If match.Success Then
    Console.WriteLine("Found: " & match.Value)
Else
    Console.WriteLine("Not Found")
End If
You have anchors in place that prevent your pattern from matching:
^ matches the start of the string
$ matches the end of the string
Since there is other text before and after the part you want to match, your pattern will not match. Just remove those anchors and it should be fine.
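To make the effect of the anchors concrete, here is a minimal sketch using the static Regex.IsMatch method. The module name AnchorDemo and the variable names are only illustrative, not part of the original question.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnchorDemo
    Sub Main()
        Dim input As String = "Blogs, Joe (S0003-000292).html"

        ' Anchored: the whole string would have to be exactly the reference.
        Console.WriteLine(Regex.IsMatch(input, "^S[0-9]{4}-[0-9]{6}$"))          ' False

        ' Unanchored: the reference may appear anywhere in the string.
        Console.WriteLine(Regex.IsMatch(input, "S[0-9]{4}-[0-9]{6}"))            ' True

        ' The anchored pattern only matches when nothing surrounds the reference.
        Console.WriteLine(Regex.IsMatch("S0003-000292", "^S[0-9]{4}-[0-9]{6}$")) ' True
    End Sub
End Module
```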
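Or use word boundaries instead:
\bS[0-9]{4}-[0-9]{6}\b
\b matches at a position between a word character (a letter, digit, or underscore) and a non-word character, or at the start or end of the string, so the pattern will not match in the middle of a longer word.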
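Putting this together with the "flag exception if not found" requirement from the question, one possible sketch looks like the following. ExtractReference and ReferenceExtractor are hypothetical names, and FormatException is just one reasonable choice of exception type; the code sticks to features available in .NET 2.0.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ReferenceExtractor

    ' \b on each side stops the pattern from matching inside a longer
    ' token such as "XS0003-0002921".
    Private ReadOnly ReferencePattern As New Regex("\bS[0-9]{4}-[0-9]{6}\b")

    ' Hypothetical helper: returns the first reference found in the input,
    ' or throws if there is none.
    Function ExtractReference(ByVal input As String) As String
        Dim m As Match = ReferencePattern.Match(input)
        If Not m.Success Then
            Throw New FormatException("No reference found in: " & input)
        End If
        Return m.Value
    End Function

    Sub Main()
        ' Each of the three sample strings from the question yields S0003-000292.
        Console.WriteLine(ExtractReference("Blogs, Joe (S0003-000292).html"))
        Console.WriteLine(ExtractReference("bla bla bla S0003-000292 & so on"))
        Console.WriteLine(ExtractReference("RE: S0003-000292"))
    End Sub

End Module
```

If throwing is too heavy for your use case, returning Nothing and letting the caller check for it would work just as well.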