I wrote a script that grabs different fields from an HTML file and populates variables with the results. I’m having issues with the regular expression for grabbing the email address. Here is some sample code:
$txt='<p class=FillText><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'
$re='.*?'+'([\\w-+]+(?:\\.[\\w-+]+)*@(?:[\\w-]+\\.)+[a-zA-Z]{2,7})'
if ($txt -match $re)
{
$email1=$matches[1]
write-host "$email1"
}
I get the following error:
Bad argument to operator '-match': parsing ".*?([\\w-+]+(?:\\.[\\w-+]+)*@(?:[\\w-]+\\
.)+[a-zA-Z]{2,7})([\\w-+]+(?:\\.[\\w-+]+)*@(?:[\\w-]+\\.)+[a-zA-Z]{2,7})" - [x-y] range in reverse order..
At line:7 char:16
+ if ($txt -match <<<< $re)
+ CategoryInfo : InvalidOperation: (:) [], RuntimeException
+ FullyQualifiedErrorId : BadOperatorArgument
What am I missing here? Also, is there a better regex for email?
Thanks in advance.
Actually, any regex that works in .NET or C# will also work in PowerShell, and you can find plenty of samples on Stack Overflow and elsewhere online. For example: How to Find or Validate an Email Address: The Official Standard: RFC 2822
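As a sketch of that point, here is the pattern from the question adjusted so the .NET engine accepts it. PowerShell single-quoted strings do not treat the backslash as an escape character, so \\w reaches the engine as a literal backslash followed by w, and w-+ is then read as a character range that runs backwards. That is what the "[x-y] range in reverse order" message refers to. Using single backslashes and moving the hyphen to the end of each character class avoids this. The leading .*? is dropped because -match already searches anywhere in the string. This is one possible fix under those assumptions, not a full RFC 2822 validator.

```powershell
# Sample line from the question
$txt = '<p class=FillText><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'

# Single backslashes: a single-quoted PowerShell string passes the backslash to the regex engine as-is.
# The hyphen sits last in each character class so it is a literal, not the start of a range.
$re = '([\w+-]+(?:\.[\w+-]+)*@(?:[\w-]+\.)+[a-zA-Z]{2,7})'

if ($txt -match $re) {
    # Capture group 1 holds the address
    $email1 = $matches[1]
    Write-Host $email1
}
```

With the sample line this should print First.Last@company-name.com.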
There is another part to this answer, though. Regex is by nature not well suited to parsing XML/HTML. You can find more details here: Using regular expressions to parse HTML: why not?
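A minimal sketch of a parser-based alternative, assuming the fragment is well-formed XML. The sample line in the question is not (class=FillText is unquoted), so the attribute value is quoted here purely for illustration, and the variable names are hypothetical. Real-world HTML is often not valid XML and would typically need a dedicated HTML parser instead of the [xml] cast.

```powershell
# Hypothetical well-formed version of the sample line (attribute value quoted)
$fragment = '<p class="FillText"><a name="InternetMail_P3"></a>First.Last@company-name.com</p>'

# The [xml] cast loads the string into a System.Xml.XmlDocument
$doc = [xml]$fragment

# Select the paragraph by its class attribute and read its text content
$node = $doc.SelectSingleNode('//p[@class="FillText"]')
$email = $node.InnerText
Write-Host $email
```

The cast throws on malformed input, so in a real script it would be worth wrapping it in try/catch.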
To provide a real solution, I’d recommend first