The Archive Base

Editorial Team
Asked: June 10, 2026

I have Document document = Jsoup.connect(link).get(); and some times for some urls I get

I have

Document document = Jsoup.connect(link).get();

and sometimes, for some URLs, I get an exception:

Exception in thread "main" java.nio.charset.UnsupportedCharsetException: X-MAC-ROMAN
    at java.nio.charset.Charset.forName(Unknown Source)
    at org.jsoup.helper.DataUtil.parseByteData(DataUtil.java:86)
    at org.jsoup.helper.HttpConnection$Response.parse(HttpConnection.java:469)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:147)

I have a catch block as:

catch (IOException  e1)

I understand the exception is because Java is Unicode and that webpage/site is not following Unicode. How do I handle this issue? Also, the connect is used for many websites, which include both unicode and bytecode.


1 Answer

  1. Editorial Team
    Added an answer on June 10, 2026 at 8:03 pm

    I understand the exception is because java is unicode and that webpage/site is not following unicode.

    That’s not entirely correct. You’re likely confusing the statement “Java is unicode” with the fact that Java uses Unicode to represent strings and characters in memory. Computer memory can only store bytes, not characters, so characters have to be converted to bytes and back using a specific character encoding; Java uses UTF-16 for its in-memory strings.
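To make the bytes-versus-characters point concrete, here is a minimal sketch using only the JDK. The class and method names (`EncodingDemo`, `roundTrip`) are made up for illustration. It encodes the same text with one charset and decodes it with another, which is roughly what happens when a page's bytes are read with the wrong encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Jsoup or the original question.
final class EncodingDemo {
    /** Encodes text with one charset and decodes the resulting bytes with another. */
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", written with an escape to keep the source ASCII-only

        // The same four characters need a different number of bytes per encoding.
        System.out.println(text.getBytes(StandardCharsets.UTF_8).length);      // 5
        System.out.println(text.getBytes(StandardCharsets.ISO_8859_1).length); // 4

        // Decoding with the charset used for encoding restores the text.
        System.out.println(
            roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(text)); // true

        // Decoding with a different charset does not.
        System.out.println(
            roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1).equals(text)); // false
    }
}
```

The booleans and lengths are printed instead of the strings themselves so the output does not depend on the console's own encoding.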

    This exception occurs because the Java runtime on the platform where your code runs doesn’t support a charset under this name, so Java can’t convert the bytes obtained from the web server into characters using this encoding. This charset is specific to Mac OS platforms, and you’re likely running Windows or similar.
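If you want to confirm this on your own machine, a small check along these lines may help. It is only a sketch: `CharsetCheck` and `isUsable` are hypothetical names, and the result for `X-MAC-ROMAN` depends on the JVM you run it on, so no output is shown for it.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper for diagnosing which charset names this JVM accepts.
final class CharsetCheck {
    /** Returns true only if this JVM can resolve the given charset name. */
    static boolean isUsable(String charsetName) {
        if (charsetName == null) {
            return false;
        }
        try {
            return Charset.isSupported(charsetName);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not allowed in a charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        // The result for X-MAC-ROMAN varies with the Java runtime in use.
        System.out.println("X-MAC-ROMAN usable: " + isUsable("X-MAC-ROMAN"));
        // ISO-8859-1 is one of the charsets every Java platform must support.
        System.out.println("ISO-8859-1 usable:  " + isUsable("ISO-8859-1"));
    }
}
```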


    how to handle this issue

    Contact the website admin and report it as a bug. It’s their fault that they used a platform-specific (Mac OS) encoding instead of a universal (ISO/UTF) encoding.

    As to Jsoup, your best bet is to first get the website as an InputStream via URL#openStream() and then feed it to Jsoup#parse(), explicitly specifying a character encoding that is supported on your platform, such as ISO-8859-1. E.g.:

    Document doc = Jsoup.parse(new URL(link).openStream(), "ISO-8859-1", link);
    

    Note that you still risk ending up with Mojibake when non-ASCII characters are present. Also note that you shouldn’t do this for all links, but only for those which threw UnsupportedCharsetException (that is, do it in a catch block for that exception).
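One detail worth knowing here: UnsupportedCharsetException extends IllegalArgumentException, so it is unchecked and a `catch (IOException e1)` block will not intercept it. The sketch below shows the "fall back only in the catch block" idea using JDK classes alone. It works on a raw byte array as a stand-in for the response body instead of calling Jsoup, and `CharsetFallback`, `decode`, and the bogus charset name are assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper; the byte array stands in for a page body fetched elsewhere.
final class CharsetFallback {
    /**
     * Decodes the bytes with the declared charset. Only when this JVM rejects
     * that charset name does it fall back to ISO-8859-1.
     */
    static String decode(byte[] body, String declaredCharset) {
        try {
            return new String(body, Charset.forName(declaredCharset));
        } catch (UnsupportedCharsetException | IllegalCharsetNameException e) {
            // Both are unchecked exceptions; catching IOException would miss them.
            return new String(body, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        byte[] body = "caf\u00e9".getBytes(StandardCharsets.UTF_8); // "café" as UTF-8 bytes

        // "no-such-charset" is a deliberately bogus name that forces the fallback.
        String decoded = decode(body, "no-such-charset");

        System.out.println(decoded.startsWith("caf"));   // true: the ASCII part survives
        System.out.println(decoded.equals("caf\u00e9")); // false: the accented letter became Mojibake
    }
}
```

With Jsoup the same shape would apply: attempt the normal fetch first, and run the explicit ISO-8859-1 parse only inside the catch block for UnsupportedCharsetException.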


    but I am able to access that in my chrome and why not from Jsoup

    That is because Chrome is kind enough to ignore the unknown encoding and choose a default encoding instead, which still risks the website being displayed as Mojibake; anything beyond the ASCII range might look malformed.


    connect is used for many websites which include both unicode and bytecode

    Please refresh your vocabulary on the meaning of the word “bytecode”. Bytecode is the compiled form of Java classes that the JVM executes; it has absolutely nothing to do with character encodings.

© 2021 The Archive Base. All Rights Reserved