HTML source:
<form>
<input type="text" name="a" value="a1fa4" type="hidden"/>
<input type="text" name="b" value="b1fa9" type="hidden"/>
<input type="text" name="c" value="c1fd2" type="hidden"/>
<input type="text" name="d" value="d1fx1" type="hidden"/>
</form>
PHP source:
<?php
preg_match_all('/<input name="(.*?)" value="(.*?)" type="hidden"\/>/i', $form, $input);
$var = array();
for($i=0;$i<count($input[1]);$i++){
$var[$input[1][$i]] = $input[2][$i];
}
?>
C# source:
Match match = Regex.Match(html, "<input name=\"(.*?)\" value=\"(.*?)\" type=\"hidden\"/>", RegexOptions.IgnoreCase );
while (match.Success)
{
System.Console.WriteLine(" {0} {1} ", match.Value, match.Index);
}
The PHP code works, but the C# code does not. How can I fix the C# code?
Thanks!
The problem with your regular expression is that you omitted the type=\"text\" attribute, so the pattern never matches the inputs in your HTML. Adding it back at the start of the pattern, right after <input, makes it match.
However, as L.B says, use a dedicated HTML parser instead of regular expressions: HTML is not guaranteed to be valid XML, and it may use different layouts, encodings, and so on.
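For the immediate regex fix, here is a minimal sketch of what the corrected C# could look like. It assumes the markup is exactly as posted in the question; the `Program` class, the `pattern` variable, and the `values` dictionary (mirroring `$var` in the PHP version) are names made up for this illustration. Note that once the pattern matches, the loop also has to call `NextMatch()`, otherwise `match.Success` stays true and the loop never ends.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Sample markup copied from the question.
        string html =
            "<form>\n" +
            "<input type=\"text\" name=\"a\" value=\"a1fa4\" type=\"hidden\"/>\n" +
            "<input type=\"text\" name=\"b\" value=\"b1fa9\" type=\"hidden\"/>\n" +
            "<input type=\"text\" name=\"c\" value=\"c1fd2\" type=\"hidden\"/>\n" +
            "<input type=\"text\" name=\"d\" value=\"d1fx1\" type=\"hidden\"/>\n" +
            "</form>";

        // Same pattern as in the question, with the leading type="text" added.
        string pattern =
            "<input type=\"text\" name=\"(.*?)\" value=\"(.*?)\" type=\"hidden\"/>";

        // Collect name/value pairs, like $var in the PHP version.
        var values = new Dictionary<string, string>();

        Match match = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        while (match.Success)
        {
            // Groups[1] is the name attribute, Groups[2] is the value attribute.
            values[match.Groups[1].Value] = match.Groups[2].Value;
            Console.WriteLine("{0} = {1}", match.Groups[1].Value, match.Groups[2].Value);

            // Advance to the next match; without this the loop runs forever.
            match = match.NextMatch();
        }
    }
}
```

`Regex.Matches` with a `foreach` loop would work just as well and avoids the manual `NextMatch()` call.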
If you must use regular expressions, they need to be a lot more flexible. For example, there may be more or different whitespace between attributes and elements.
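As a rough illustration of "more flexible", the sketch below tolerates extra whitespace and additional attributes around name and value. It is only one possible pattern, not a robust solution: it still assumes double-quoted attribute values and that name appears before value, and it would also pick up attributes such as data-name. The `FlexibleDemo` class and the sample `html` string are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

class FlexibleDemo
{
    static void Main()
    {
        // Hypothetical input with irregular spacing and a space before "/>".
        string html =
            "<input   type=\"text\"  name = \"a\"   value=\"a1fa4\" type=\"hidden\" />\n" +
            "<INPUT type=\"text\" name=\"b\" value=\"b1fa9\" type=\"hidden\"/>";

        // [^>]*? skips any other attributes inside the tag,
        // \s* allows whitespace around '=',
        // [^"]* captures everything up to the closing quote.
        string pattern =
            @"<input\b[^>]*?\bname\s*=\s*""([^""]*)""[^>]*?\bvalue\s*=\s*""([^""]*)""[^>]*>";

        foreach (Match m in Regex.Matches(html, pattern, RegexOptions.IgnoreCase))
        {
            Console.WriteLine("{0} = {1}", m.Groups[1].Value, m.Groups[2].Value);
        }
    }
}
```

Every relaxation like this makes the pattern harder to read and still leaves cases uncovered (single quotes, unquoted values, reversed attribute order), which is the practical argument for a real HTML parser.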