Here’s a regular expression to detect product pages on Amazon. It works for pages with plain English titles but not for URLs containing international (percent-encoded) characters, so URL2 is not detected. How do I get around this? Thanks.
var URL1 = "www.amazon.com/Big-Short-Inside-Doomsday-Machine/dp/0393338827/";
var URL2 = "www.amazon.fr/Larm%C3%A9e-furieuse-Fred-Vargas/dp/2878583760/";
var regex1 = RegExp("http://www.amazon.(com|co.uk|de|ca|it|fr|cn|co.jp)/([\\w-]+/)?(dp|gp/product)/(\\w+/)?(\\w{10})");
m = URL1.match(regex1);
% doesn’t match \w, so Larm%C3%A9e-furieuse-Fred-Vargas doesn’t match [\w-]+. Why not just use [^/]+ instead?
PS: “.” matches any character, so you should use the pattern \., which would appear as \\. in the string literal.
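Putting both suggestions together, a minimal sketch might look like the following. The helper name isAmazonProductUrl and the optional "http://" / "https://" prefix are assumptions added for illustration (the sample URLs as shown carry no scheme); the rest follows the regex from the question, with [^/]+ for the title segment and escaped dots.

```javascript
// Sample URLs copied from the question.
var URL1 = "www.amazon.com/Big-Short-Inside-Doomsday-Machine/dp/0393338827/";
var URL2 = "www.amazon.fr/Larm%C3%A9e-furieuse-Fred-Vargas/dp/2878583760/";

// Changes relative to the original pattern:
//  - "[^/]+" replaces "[\\w-]+", so "%" (and any other non-slash
//    character) is accepted in the title segment.
//  - Every literal dot is written "\\." so it no longer matches any character.
//  - Assumption: the scheme is optional, since the sample URLs omit it.
var productRegex = new RegExp(
  "^(?:https?://)?www\\.amazon\\.(com|co\\.uk|de|ca|it|fr|cn|co\\.jp)/" +
  "([^/]+/)?(dp|gp/product)/(\\w+/)?(\\w{10})"
);

// Hypothetical helper: true when the URL looks like a product page.
function isAmazonProductUrl(url) {
  return productRegex.test(url);
}

console.log(isAmazonProductUrl(URL1)); // true
console.log(isAmazonProductUrl(URL2)); // true
console.log(URL2.match(productRegex)[5]); // "2878583760"
```

Capture group 5 still holds the 10-character product identifier, as in the original pattern. Note that [^/]+ is deliberately permissive: it accepts any title segment without a slash, which is usually fine here because the /dp/ or /gp/product/ part that follows does the real filtering.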