OK, so I have tried several of the XML libraries that Node.js has to offer, and I can’t work out how to get Node.js to read an XML file from a website.
I can pull the file using http.request, http.get and so on, but getting Node.js to actually do anything with the data in the XML file is another story.
I’m sure I must be missing something: whenever I convert the XML to JS with xml-stream, it cannot read it from the website. My code runs when I host the file myself; however, I am using an API, and it only provides XML.
Current code:
var http = require('http');
var XmlStream = require('xml-stream');
var options = { host: 'cloud.tfl.gov.uk',
                path: '/TrackerNet/LineStatus' };
var twitter = { host: 'api.twitter.com',
                path: '/1/statuses/user_timeline.rss?screen_name=nwhite89' };
var request = http.get(options).on('response', function(response) {
  response.setEncoding('utf8');
  var xml = new XmlStream(response);
  xml.on('updateElement: item', function(item) {
    item.title = item.title.match(/^[^:]+/)[0] + ' on ' +
      item.pubDate.replace(/ +[0-9]{4}/, '');
  });
  xml.on('text: item > pubDate', function(element) {
    element.$text = element.$text;
  });
  xml.on('data', function(data) {
    process.stdout.write(data);
  });
});
What I don’t understand is that using Twitter works fine and outputs at the xml.on('data') part, whereas using options (cloud.tfl.gov.uk) outputs nothing; even if I put console.log("hi") inside the data function, it doesn’t get executed.
I know that the URL is correct: calling console.log(xml) or console.log(response) after creating the variable xml shows that it has connected. Any help with this would be greatly appreciated; I have been stuck on it for a good 2 days now.
There is a byte order mark before the <?xml tag, which trips up xml-stream and stops it from being able to read the encoding in the tag. That means you need to provide the encoding yourself when you create the XmlStream, instead of leaving xml-stream to detect it.
And really, setting the encoding on the response stream is optional; giving the encoding to xml-stream alone works just fine.
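A minimal sketch of providing the encoding yourself, assuming the xml-stream constructor accepts the input encoding as an optional second argument. The helper name openXml, its XmlStreamCtor parameter and the run function are made up for illustration; the host and path are the ones from the question.

```javascript
var http = require('http');

// Hypothetical helper. XmlStreamCtor is assumed to be the constructor
// exported by require('xml-stream'); it is passed in so that this file
// still loads when the package is not installed.
function openXml(XmlStreamCtor, response) {
  // Assumption: the second argument is the input encoding. Supplying it
  // means xml-stream does not have to find the encoding in the
  // <?xml ...?> declaration, which the byte order mark gets in the way of.
  return new XmlStreamCtor(response, 'utf8');
}

// Not called automatically: it needs network access and
// `npm install xml-stream`.
function run() {
  var XmlStream = require('xml-stream');
  var options = { host: 'cloud.tfl.gov.uk', path: '/TrackerNet/LineStatus' };
  http.get(options).on('response', function (response) {
    var xml = openXml(XmlStream, response);
    xml.on('data', function (data) {
      process.stdout.write(data);
    });
  });
}
```

Calling run() should then print the XML as it streams in. Note that response.setEncoding('utf8') is left out here, since the encoding is given to xml-stream directly.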
More info here: http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
If you look at the buffer emitted from response rather than xml, the buffer starts with the bytes ef bb bf. Those first 3 bytes are the byte order mark for UTF-8, and after them you have the start of the <?xml tag.
xml-stream expects the <?xml tag to have only whitespace between it and the start of the file, but byte order marks don’t count as whitespace.
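To check for this yourself, here is a small sketch that uses only Node's built-in Buffer to detect a leading UTF-8 byte order mark and drop it. The helper names hasUtf8Bom and stripUtf8Bom are made up for illustration, and the sample chunk is built by hand rather than fetched from the API.

```javascript
// The three bytes of the UTF-8 byte order mark.
var UTF8_BOM = Buffer.from([0xEF, 0xBB, 0xBF]);

// Hypothetical helper: true when the buffer begins with the UTF-8 BOM.
function hasUtf8Bom(buffer) {
  return buffer.length >= 3 &&
    buffer[0] === 0xEF && buffer[1] === 0xBB && buffer[2] === 0xBF;
}

// Hypothetical helper: returns a view of the buffer without a leading
// BOM, or the buffer unchanged when there is none.
function stripUtf8Bom(buffer) {
  return hasUtf8Bom(buffer) ? buffer.subarray(3) : buffer;
}

// Hand-made stand-in for the first chunk of the response.
var chunk = Buffer.concat([
  UTF8_BOM,
  Buffer.from('<?xml version="1.0" encoding="utf-8"?>', 'utf8')
]);

console.log(hasUtf8Bom(chunk));                                 // true
console.log(stripUtf8Bom(chunk).toString('utf8').slice(0, 5));  // <?xml
```

Running hasUtf8Bom on the first 'data' chunk of the raw response is a quick way to confirm whether a feed has the same problem.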