I’m not exactly sure how to describe the problem, but basically, I’m using jsoup to parse some HTML and pull out the article text. The method I’m using is:
    public static String getArticle(String articleLink) {
        Log.i("article link", articleLink);
        Document doc = null;
        try {
            doc = Jsoup.connect(articleLink).timeout(10000).get();
        } catch (IOException ioe) {
            return null;
        }
        Elements articleBody = doc.select("div.article-body");
        Element first = articleBody.first();
        return first.text();
    }
When I pull this snippet out into a sample program in NetBeans and pass in the link to the page, it returns the article just fine. But when I run it on my Android device, I get a NullPointerException at ‘return first.text()’.
I’m not sure how this can be. The app is published and had been working, but all of a sudden it started crashing, which led me to believe that something changed in the layout of the webpage. But I just ran the standalone program with the same articleLink, and it works fine on my computer, while I still get the NullPointerException on Android. Both use the same version of jsoup. Any ideas?
Update: The value of the doc variable is:
<!DOCTYPE html>
<html>
<head>
<title>Redirecting...</title>
<meta http-equiv="refresh" content="0;url=http://m.ncataggies.com/mobile/ViewArticle.dbml?atclid=205823481&DB_MENU_ID=&SPSID=&SPID=&DB_OEM_ID=24500" />
<meta name="ROBOTS" content="NOINDEX,NOFOLLOW" />
</head>
<body>
</body>
</html>
So something did change…
The server at ncataggies.com is checking the User-Agent header of the request and serving different pages to mobile browsers. Because you don’t specify a user-agent, the server sees the default agent that Android supplies, which identifies it as a mobile browser. In jsoup you can set the user-agent by calling userAgent() on the connection before get().
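Here is a minimal sketch of that. The desktop-style user-agent string is just an example, not something the site is known to require. The ArticleFetcher class and the extractArticle helper are hypothetical names added for illustration, and the Android Log call is left out so the snippet stays plain Java. It also guards against first() returning null, which is what happens when the page has no div.article-body, as with the redirect page above:

```java
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ArticleFetcher {
    // Assumption: an example desktop-browser string; any desktop user-agent may do.
    static final String DESKTOP_USER_AGENT =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/120.0 Safari/537.36";

    // Hypothetical helper: returns the article text, or null if the
    // page has no div.article-body (e.g. the "Redirecting..." page).
    static String extractArticle(Document doc) {
        Element first = doc.select("div.article-body").first();
        return first == null ? null : first.text();
    }

    public static String getArticle(String articleLink) {
        try {
            Document doc = Jsoup.connect(articleLink)
                    .userAgent(DESKTOP_USER_AGENT) // override the platform default agent
                    .timeout(10000)                // milliseconds, as in the question
                    .get();
            return extractArticle(doc);
        } catch (IOException ioe) {
            return null;
        }
    }
}
```

With a desktop user-agent the server should send back the full article page instead of the mobile redirect, though that depends on how the site does its detection. The null check means a future layout change returns null rather than crashing the app.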
You can check your current user-agent here.