Embedly has a great regex generator that you can use to verify the correctness of service URLs (http://api.embed.ly/tools/generator). It generates JavaScript regexes, but unfortunately it does not generate C# regex expressions. As far as I know, though, C# supports the same ECMAScript regex syntax, so I should be able to use them in C#.
So what I would like to achieve is to take the generated regex from the Embedly site and just paste it into my C# code.
The javascript regex would look like this:
/http:\/\/(.*youtube\.com\/watch.*|.*\.youtube\.com\/v\/.*|youtu\.be\/.*|.*\.youtube\.com\/user\/.*#.*|.*\.youtube\.com\/.*#.*\/.*|picasaweb\.google\.com.*\/.*\/.*#.*|picasaweb\.google\.com.*\/lh\/photo\/.*|picasaweb\.google\.com.*\/.*\/.*)/i
and should match urls like:
http://picasaweb.google.com/westerek/LadakhDolinaMarkha?feat=featured#5497194022344000402
http://www.youtube.com/watch?v=GVDc1uXda6Y&feature=related
What I have so far is the following:
Regex regex = new Regex(
"[/http:\\/\\/(.*youtube\\.com\\/watch.*|.*\\.youtube\\.com\\/"+
"v\\/.*|youtu\\.be\\/.*|.*\\.youtube\\.com\\/user\\/.*#.*|.*\\."+
"youtube\\.com\\/.*#.*\\/.*|picasaweb\\.google\\.com.*\\/.*\\/"+
".*#.*|picasaweb\\.google\\.com.*\\/lh\\/photo\\/.*|picasaweb"+
"\\.google\\.com.*\\/.*\\/.*)/i]",
RegexOptions.IgnoreCase
| RegexOptions.CultureInvariant
| RegexOptions.IgnorePatternWhitespace
| RegexOptions.Compiled
);
...but this only gives me partial matches.
EDIT:
Solution: Just paste the Embedly JavaScript regex into the strEmbdlyRegex string in the following snippet.
string strEmbdlyRegex = @"/http:\/\/(.*youtube\.com\/watch.*|.*\.youtube\.com\/v\/.*|youtu\.be\/.*|.*\.youtube\.com\/user\/.*#.*|.*\.youtube\.com\/.*#.*\/.*)/i";
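// Strip the leading "/" delimiter of the JavaScript literal.
string strRegx = strEmbdlyRegex.Remove(0, 1);
// Strip the trailing "/i"; RegexOptions.IgnoreCase below replaces the i flag.
// The outer ( ) group is kept so that "http://" applies to every alternative.
strRegx = strRegx.Remove(strRegx.LastIndexOf("/i"), 2);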
regex = new Regex(
strRegx,
RegexOptions.IgnoreCase
| RegexOptions.CultureInvariant
| RegexOptions.ECMAScript
| RegexOptions.Compiled
);
Being a bit more specific with your problem would help, but I appear to have it working (at least with your two test strings). You just need to clean up a few extraneous characters:
- Use the @"" syntax (no escaping backslashes)
- Remove [/ from the beginning of the string
- Remove /i] from the end of the string
- Remove ( and ) near the beginning and end of the string
Also, you probably don’t need the IgnorePatternWhitespace option, and for a simple URL you probably don’t need the CultureInvariant option either.
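Lastly, there is a RegexOptions.ECMAScript option that makes the .NET engine use ECMAScript-compliant matching behavior, so the pattern is interpreted much the way JavaScript would handle it. It does not parse the /regex/i literal syntax itself, so you still need to strip the / delimiters and pass RegexOptions.IgnoreCase in place of the i flag.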
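As a rough sketch of the cleanup described above, the following strips the JavaScript delimiters and maps the flags to RegexOptions. The class name EmbedlyRegexDemo and the helper FromJsLiteral are hypothetical names made up for this illustration (they are not part of .NET or Embedly), and the helper assumes the input is a well-formed /pattern/flags literal.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmbedlyRegexDemo
{
    // Hypothetical helper (not a library API): turns a JavaScript regex
    // literal such as "/pattern/i" into a .NET Regex.
    public static Regex FromJsLiteral(string literal)
    {
        int lastSlash = literal.LastIndexOf('/');
        if (literal.Length < 2 || literal[0] != '/' || lastSlash == 0)
            throw new ArgumentException("Expected a /pattern/flags literal.", nameof(literal));

        // Everything between the first and last "/" is the pattern;
        // everything after the last "/" is the flag list.
        string pattern = literal.Substring(1, lastSlash - 1);
        string flags = literal.Substring(lastSlash + 1);

        // Only the i and m flags are mapped in this sketch.
        RegexOptions options = RegexOptions.ECMAScript;
        if (flags.Contains("i")) options |= RegexOptions.IgnoreCase;
        if (flags.Contains("m")) options |= RegexOptions.Multiline;

        return new Regex(pattern, options);
    }

    public static void Main()
    {
        // The verbatim string (@"...") lets the generated regex be pasted unchanged.
        Regex regex = FromJsLiteral(
            @"/http:\/\/(.*youtube\.com\/watch.*|.*\.youtube\.com\/v\/.*|youtu\.be\/.*|.*\.youtube\.com\/user\/.*#.*|.*\.youtube\.com\/.*#.*\/.*|picasaweb\.google\.com.*\/.*\/.*#.*|picasaweb\.google\.com.*\/lh\/photo\/.*|picasaweb\.google\.com.*\/.*\/.*)/i");

        string[] urls =
        {
            "http://picasaweb.google.com/westerek/LadakhDolinaMarkha?feat=featured#5497194022344000402",
            "http://www.youtube.com/watch?v=GVDc1uXda6Y&feature=related",
            "http://example.com/"
        };

        foreach (string url in urls)
            Console.WriteLine(regex.IsMatch(url) + "  " + url);
    }
}
```

IgnorePatternWhitespace is deliberately left out here: under that option an unescaped # in the pattern starts a comment, which would cut off the alternatives that contain #.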