I have a JSON object as follows:
var jsonObject = {"regex":"<span class=\"Value\">\\$(.+?)<\\/span>"};
My goal is to use this regular expression to scrape a value from an HTML document.
var match = html.match(new RegExp(jsonObject.regex, 'i'));
This, however, returns an error. The problem seems to be that the escape sequences in the regex string are lost in jsonObject.regex.
Evaluating jsonObject.regex returns:
<span class="Value">\$(.+?)<\/span>
(The escape sequences like \" and \\ are lost.)
I could restore the respective characters using JavaScript, but that seems inefficient since I already have the correct format in the JSON object.
Any clues or workarounds are appreciated. Thanks!
You are doing two things wrong here.
First and foremost, you are trying to build a program that uses arbitrary regular expressions on HTML. Don’t do that. You have a DOM at your disposal on the client side, so you should use one of the available selector engines. Examples include the browser's built-in document.querySelectorAll(), Sizzle (which is also part of jQuery), NWMatcher, or an XPath-based selector engine like XPath.js.
Then, you are apparently not using a JSON serializer to build your JSON string on the server side; otherwise, messed-up escaping like this would not happen on the client side.
Lastly, what you have in your first code sample is not JSON. It’s a JavaScript object literal. JSON is always a string:
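As a rough sketch of the difference (the sample markup and the variable names jsonText, parsed, html and captured are made up for illustration), the object literal from the question can be turned into JSON text with a serializer and read back with a parser. The escaping is handled at each step, and the pattern string that comes out is the one the regex engine needs:

```javascript
// A JavaScript object literal: this is source code, not JSON.
// The escapes in the string literal (\" and \\) are resolved by the JS parser,
// so the stored pattern contains plain quotes and single backslashes.
var jsonObject = {"regex": "<span class=\"Value\">\\$(.+?)<\\/span>"};

// JSON proper is text. A serializer takes care of the escaping for you.
var jsonText = JSON.stringify(jsonObject);

// Parsing the text gives back an object holding the same pattern string.
var parsed = JSON.parse(jsonText);

// Hypothetical sample markup, for illustration only.
var html = '<span class="Value">$12.50</span>';
var match = html.match(new RegExp(parsed.regex, 'i'));
var captured = match ? match[1] : null;
```

If the escapes go missing somewhere, the likely culprit is JSON text being pasted into another string literal or assembled by hand, which unescapes it one extra time.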
Selecting what you seem to want in jQuery would become as simple as a single selector call such as $("span.Value").text().
But as I said, you are not bound to jQuery; there are lighter-weight alternatives if HTML scraping is your main goal.
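One such alternative is the built-in querySelectorAll(). The following is only a minimal sketch: the helper name extractValues is hypothetical, and it assumes the values live in span elements with the class Value, as the regex in the question suggests.

```javascript
// Collects the text of every <span class="Value"> found under `root`,
// dropping one leading "$" to mirror the capture group in the regex.
// `root` can be anything that offers querySelectorAll, e.g. `document`.
function extractValues(root) {
  var nodes = root.querySelectorAll('span.Value');
  // NodeList is array-like, so borrow Array.prototype.map.
  return Array.prototype.map.call(nodes, function (node) {
    return node.textContent.replace(/^\$/, '');
  });
}
```

In a browser this might be called as extractValues(document). Note that class selectors are case-sensitive in standards mode, whereas the regex in the question used the 'i' flag.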