I am trying to extract specific content (links, text, images) from an HTML page. Is there a program out there that I can use to produce a visual representation of the page's DOM? I know I could write such a program in Java using an HTML parser, but before I do that, I thought I would see whether one already exists.
My main objective is to extract certain links, image URLs, and text, and send these to a Flex applet on the page.
Thanks,
Vance
If you just want to extract a few bits of information (rather than, say, print out the entire page structure), then you can use the Firebug extension for Firefox.
Choose the HTML tab, click the second icon from the left (it looks like a cursor pointing at a box), and then click the part of the page you’re interested in. Firebug will jump to that part of the DOM.
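If you do end up writing something yourself, here is a rough sketch of the idea in Python rather than Java, using only the standard library's `html.parser`. It prints an indented outline of the tags and collects link and image URLs as it goes. The class name `DomOutline`, the `VOID_TAGS` set, and the sample markup are all made up for illustration, and the parser does not repair badly nested HTML the way a browser does, so treat the outline as approximate on messy pages.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they must not deepen the indent.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class DomOutline(HTMLParser):
    """Hypothetical helper: builds an indented tag outline and collects URLs."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # current nesting level
        self.lines = []   # one outline line per tag or text node
        self.links = []   # href values of <a> tags
        self.images = []  # src values of <img> tags

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.lines.append("  " * self.depth + "<" + tag + ">")
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        if tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        # Void tags never increased the depth, so ignore their end tags.
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.lines.append("  " * self.depth + repr(text))


# Made-up sample page, just to exercise the parser.
sample = ('<html><body><p>Hello <a href="/about">about</a></p>'
          '<img src="logo.png"></body></html>')

outline = DomOutline()
outline.feed(sample)
outline.close()

print("\n".join(outline.lines))
print("links:", outline.links)
print("images:", outline.images)
```

The same callback style should carry over to a Java HTML parser if you would rather stay in Java. Note that this only sees the HTML source as delivered, so anything added to the page by scripts will not show up, whereas Firebug shows the live DOM.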