I’m applying the following example http://jsoup.org/cookbook/extracting-data/example-list-links to list links.
package org.jsoup.examples;

import org.jsoup.Jsoup;
import org.jsoup.helper.Validate;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;

/**
 * Example program to list links from a URL.
 */
public class ListLinks {
    public static void main(String[] args) throws IOException {
        Validate.isTrue(args.length == 1, "usage: supply url to fetch");
        String url = args[0];
        print("Fetching %s...", url);

        Document doc = Jsoup.connect(url).get();
        Elements links = doc.select("a[href]");
        Elements media = doc.select("[src]");
        Elements imports = doc.select("link[href]");

        print("\nMedia: (%d)", media.size());
        for (Element src : media) {
            if (src.tagName().equals("img"))
                print(" * %s: <%s> %sx%s (%s)",
                        src.tagName(), src.attr("abs:src"), src.attr("width"), src.attr("height"),
                        trim(src.attr("alt"), 20));
            else
                print(" * %s: <%s>", src.tagName(), src.attr("abs:src"));
        }

        print("\nImports: (%d)", imports.size());
        for (Element link : imports) {
            print(" * %s <%s> (%s)", link.tagName(), link.attr("abs:href"), link.attr("rel"));
        }

        print("\nLinks: (%d)", links.size());
        for (Element link : links) {
            print(" * a: <%s> (%s)", link.attr("abs:href"), trim(link.text(), 35));
        }
    }

    private static void print(String msg, Object... args) {
        System.out.println(String.format(msg, args));
    }

    private static String trim(String s, int width) {
        if (s.length() > width)
            return s.substring(0, width-1) + ".";
        else
            return s;
    }
}
I only replaced "usage: supply url to fetch" with "http://www.google.com". The jsoup documentation seems quite poor to me. I am now getting the following error and cannot figure out why:
Exception in thread "main" java.lang.IllegalArgumentException: usage: http://www.google.com
    at org.jsoup.helper.Validate.isTrue(Validate.java:45)
    at TestClass.main(TestClass.java:16)
I found the following post about the same problem: "importing java libarary". But I have already replaced "usage: …" with the website name, and that does not help.
Well, that suggests you don’t understand what the Validate.isTrue call is doing. It’s incredibly important that you don’t change code without knowing what it does.
You’re not meant to change that code. You’re meant to run this code and supply the URL as a command-line argument. That first statement validates that there is exactly one command-line argument.
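To make that concrete, here is a minimal sketch of what the check amounts to. It does not use jsoup itself: the `isTrue` method below is a hypothetical stand-in that assumes `Validate.isTrue` throws an `IllegalArgumentException` carrying the message whenever the condition is false, which matches the stack trace in the question. The class name `ValidateSketch` and the helper `urlFromArgs` are made up for illustration.

```java
public class ValidateSketch {
    // Hypothetical stand-in for org.jsoup.helper.Validate.isTrue (an assumption,
    // not the library source): throw with the given message if the condition is false.
    static void isTrue(boolean condition, String msg) {
        if (!condition) {
            throw new IllegalArgumentException(msg);
        }
    }

    // Returns the URL when exactly one argument was supplied, otherwise throws.
    static String urlFromArgs(String[] args) {
        isTrue(args.length == 1, "usage: supply url to fetch");
        return args[0];
    }

    public static void main(String[] args) {
        // One argument: the check passes and the URL is used.
        System.out.println(urlFromArgs(new String[] {"http://www.google.com"}));

        // No arguments: the check fails, whatever text the message contains.
        try {
            urlFromArgs(new String[0]);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The second argument is only the text of the error message. Editing it changes what the exception says, not whether it is thrown; that depends solely on how many command-line arguments the program receives.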
So put the code back to what it is, and run