I’m apparently a little newer to Javascript than I’d care to admit. I’m trying to pull a webpage using Node.js and save the contents as a variable, so I can parse it however I feel like.
In Python, I would do this:
from bs4 import BeautifulSoup # for parsing
import urllib
text = urllib.urlopen("http://www.myawesomepage.com/").read()
parse_my_awesome_html(text)
How would I do this in Node?
I’ve gotten as far as:
var request = require("request");
request("http://www.myawesomepage.com/", function (error, response, body) {
/*
Something here that lets me access the text
outside of the closure
This doesn't work:
this.text = body;
*/
})
Edit: As Kishore noted, there are nice options for parsing available. Also see cheerio if you have python/gyp issues with jsdom on windows. Cheerio on github