I’m writing an automated test framework in which, after each action (loading a new page, etc.), I want to validate the HTML that is produced. The framework is written in Java, and the tests may run in an environment that sits behind a firewall.
I currently have the HTML as a string, so I am looking for a way to verify that it is valid. I was looking at JTidy, but I can’t find a good example of how to do this. Does anyone have any ideas?
Further down the track I might also look at validating the CSS files, so it would help if suggestions could take that into consideration.
Thanks in advance,
James
Edit:
I have something working in JTidy now, and I’m hoping someone more knowledgeable can verify it. How can I print out the errors from it?
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.nio.charset.StandardCharsets;
    import org.w3c.tidy.Tidy;

    // Body of a method that returns boolean: true when JTidy reports no problems.
    Tidy tidy = new Tidy();
    ByteArrayOutputStream os = new ByteArrayOutputStream(); // receives the tidied markup
    String html = ScenarioFramework.driver.getHtml();

    // StandardCharsets.UTF_8 avoids the checked UnsupportedEncodingException
    tidy.parse(new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8)), os);

    if (tidy.getParseErrors() > 0 || tidy.getParseWarnings() > 0) {
        System.out.println("Tidy Parser errors: " + tidy.getParseErrors());
        System.out.println("Tidy Parser warnings: " + tidy.getParseWarnings());
        return false;
    }
    // No errors or warnings reported
    return true;
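On printing the errors themselves: one possible approach is to hand JTidy a PrintWriter through setErrout and read back what it wrote. The sketch below assumes the JTidy jar is on the classpath; the class name TidyReport and the method collectMessages are made up for illustration, and the exact wording of the messages depends on the JTidy version.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import org.w3c.tidy.Tidy;

// Hypothetical helper, not part of JTidy itself.
final class TidyReport {

    /** Parses the HTML with JTidy and returns the diagnostics it wrote. */
    static String collectMessages(String html) {
        Tidy tidy = new Tidy();
        tidy.setShowWarnings(true); // include warnings as well as errors

        // Redirect JTidy's diagnostics from the console into a string buffer.
        StringWriter messages = new StringWriter();
        PrintWriter errout = new PrintWriter(messages, true);
        tidy.setErrout(errout);

        // The tidied markup is not needed here, so it goes to a throwaway stream.
        tidy.parse(
                new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8)),
                new ByteArrayOutputStream());

        errout.flush();
        return messages.toString();
    }
}
```

The returned string could then be printed or attached to the test failure, for example with System.out.println(TidyReport.collectMessages(html)).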
It depends on how thorough you want the validation to be. I’ve seen someone integrate their tests with http://validator.w3.org/docs/api.html, just by doing HTTP POSTs to the service, I believe. That would give more detail than JTidy.
They also have the equivalent for CSS.
Sorry, but I don’t have the code, just a demo I saw once that I thought would be cool to emulate some day.
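For what it's worth, here is a rough sketch of what such a POST might look like. It is not the code from that demo. It assumes Java 11+ (java.net.http) and targets the W3C Nu HTML Checker endpoint at https://validator.w3.org/nu/?out=json, which is not necessarily the same interface as the API page linked above. The class name W3cHtmlCheck and its methods are invented for illustration, and the call needs outbound network access, so it will not work from behind a firewall unless the service is reachable.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; names are illustrative only.
final class W3cHtmlCheck {

    // Assumed endpoint: the W3C Nu HTML Checker, asking for JSON output.
    static final URI ENDPOINT = URI.create("https://validator.w3.org/nu/?out=json");

    /** Builds a POST request whose body is the HTML document to check. */
    static HttpRequest buildRequest(String html) {
        return HttpRequest.newBuilder(ENDPOINT)
                .header("Content-Type", "text/html; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(html, StandardCharsets.UTF_8))
                .build();
    }

    /** Sends the request and returns the raw response body (JSON text). */
    static String check(String html) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(html), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Building the request is kept separate from sending it so the request can be inspected in a unit test without touching the network.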
Be nice to the service, though: you might want to skip validation if you can tell the HTML hasn’t changed.
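One simple way to tell whether the HTML has changed is to remember a hash of every page already sent for validation. This is only a sketch using the standard library; the class name ValidatedHtmlCache and the method needsValidation are hypothetical, and the cache lives in memory, so it is forgotten between test runs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: remembers which HTML strings have already been seen.
final class ValidatedHtmlCache {

    private final Set<String> seenDigests = new HashSet<>();

    /**
     * Returns true the first time this exact HTML is passed in and false on
     * every later call with identical content. The HTML is recorded as seen
     * as a side effect of the call.
     */
    boolean needsValidation(String html) {
        return seenDigests.add(sha256Hex(html));
    }

    /** Hex-encoded SHA-256 digest of the UTF-8 bytes of the text. */
    private static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

A test step would then call the remote validator only when needsValidation(html) returns true.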