I am trying to load the source of any page into a textbox for a client-side-only HTML editor. I need the entire source of a web page, not just the body. This YQL query returns only the body:
http://query.yahooapis.com/v1/public/yql?format=xml&callback=editor.handleLoad&q=select+*+from+html+where+url%3D%22example.com%22
Is there any way to get the entire source, or is there any other free JSONP-X web service that can?
I don’t see an obvious way to do that with YQL, but here is a Yahoo Pipe that seems to work. It refuses to fetch sites that are disallowed by their robots.txt, but it returns the entire source for other sites:
http://pipes.yahoo.com/pipes/pipe.info?_id=dCsGDO123hG6BNv70EypaA
The default is set to http://www.example.com, which is denied because of the robots.txt on that site. However, the pipe accepts the URL as a parameter. Here’s an example call that fetches the source of pipes.yahoo.com and returns the result wrapped in JSON:
http://pipes.yahoo.com/pipes/pipe.run?_id=dCsGDO123hG6BNv70EypaA&_render=json&url=http%3A%2F%2Fpipes.yahoo.com%2F
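If you want to build that URL from your editor code instead of hard-coding it, something like the following might do as a rough sketch. The helper name buildPipeUrl is made up for illustration, and the `_callback` parameter is an assumption on my part about how Pipes names its JSONP callback; only `_id`, `_render` and `url` come from the link above.

```typescript
// Endpoint and pipe id are copied from the links in this answer.
const PIPE_RUN_URL = "http://pipes.yahoo.com/pipes/pipe.run";
const PIPE_ID = "dCsGDO123hG6BNv70EypaA";

// Hypothetical helper: builds the pipe.run URL for a target page.
// `callbackName` is optional. When given, it is sent as `_callback`,
// which is ASSUMED here to be the JSONP callback parameter.
function buildPipeUrl(targetUrl: string, callbackName?: string): string {
  const params: string[] = [
    "_id=" + encodeURIComponent(PIPE_ID),
    "_render=json",
    // The target URL must be percent-encoded so its own "?" and "&"
    // do not get mixed up with the pipe's query string.
    "url=" + encodeURIComponent(targetUrl),
  ];
  if (callbackName) {
    params.push("_callback=" + encodeURIComponent(callbackName));
  }
  return PIPE_RUN_URL + "?" + params.join("&");
}
```

In the browser you would then set the result of buildPipeUrl(pageUrl, "editor.handleLoad") as the src of a dynamically created script element, the same way as with your YQL callback.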
Does this help?