Editorial Team
Asked: May 13, 2026

I am trying to download and parse a webpage that contains foreign (Chinese) characters. I'm not sure whether I should use "utf-8" or some other encoding, but nothing I have tried works. I used getUrlContent() from the sample Wiktionary code.

public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.main);
    mText = (TextView) findViewById(R.id.textview1);
    huaren.prepareUserAgent(this);
    String test = new String("fail");

    try {
        test = getUrlContent("http://huaren.us/");
    } catch (ApiException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    byte[] b = new byte[100000];

    try {
          b = test.getBytes("utf-8");
    } catch (UnsupportedEncodingException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

    char[] charArr = (new String(b)).toCharArray();
    CharSequence seq = java.nio.CharBuffer.wrap(charArr); 

    mText.setText(charArr, 0, 1000);//.setText(seq);
}

protected static synchronized String getUrlContent(String url) throws ApiException {
    if (sUserAgent == null) {
        throw new ApiException("User-Agent string must be prepared");
    }

    // Create client and set our specific user-agent string
    HttpClient client = new DefaultHttpClient();
    HttpGet request = new HttpGet(url);
    request.setHeader("User-Agent", sUserAgent);

    try {
        HttpResponse response = client.execute(request);

        // Check if server response is valid
        StatusLine status = response.getStatusLine();
        if (status.getStatusCode() != HTTP_STATUS_OK) {
            throw new ApiException("Invalid response from server: " +
                    status.toString());
        }

        // Pull content stream from response
        HttpEntity entity = response.getEntity();
        InputStream inputStream = entity.getContent();

        ByteArrayOutputStream content = new ByteArrayOutputStream();

        // Read response into a buffered stream
        int readBytes = 0;
        while ((readBytes = inputStream.read(sBuffer)) != -1) {
            content.write(sBuffer, 0, readBytes);
        }

        // Return result from buffered stream
        return new String(content.toByteArray(), "utf-8");
    } catch (IOException e) {
        throw new ApiException("Problem communicating with API", e);
    }
}
1 Answer

  1. Editorial Team
    Added an answer on May 13, 2026 at 11:47 am

    The charset is defined in the page itself:

    <meta http-equiv="Content-Type" content="text/html; charset=gb2312" /> 
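
To see why the utf-8 decoding in getUrlContent() cannot work for a gb2312 page, here is a minimal sketch. The class name `CharsetMismatchDemo` and its methods are hypothetical, the four bytes are the GB2312 encoding of "中文", and the sketch assumes the runtime ships the GB2312 charset (full JDKs and Android generally do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original sample code.
final class CharsetMismatchDemo {
    // "中文" encoded as GB2312, written out as raw bytes.
    static final byte[] GB2312_BYTES = {
        (byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4
    };

    // Decodes with utf-8, which is what getUrlContent() does unconditionally.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Decodes with the charset the page declares.
    // Assumption: GB2312 is available; Charset.forName throws otherwise.
    static String decodeAsDeclared(byte[] raw) {
        return new String(raw, Charset.forName("GB2312"));
    }

    public static void main(String[] args) {
        String wrong = decodeAsUtf8(GB2312_BYTES);

        // Byte sequences that are not valid utf-8 become U+FFFD,
        // so the original bytes are lost at this point.
        System.out.println("has U+FFFD: " + (wrong.indexOf('\uFFFD') >= 0));

        // Re-encoding the damaged string (as onCreate() does with
        // getBytes("utf-8")) does not bring the original bytes back.
        byte[] reencoded = wrong.getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes restored: " + Arrays.equals(reencoded, GB2312_BYTES));

        // Decoding the untouched bytes with the declared charset works.
        System.out.println(decodeAsDeclared(GB2312_BYTES));
    }
}
```

The point of the sketch is that the charset has to be chosen before the bytes are turned into a String; converting afterwards is too late.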
    

    In general, there are three ways to declare the encoding of an HTML page served over HTTP:

    The HTTP Content-Type header:

    Content-Type: text/html; charset=utf-8
    

    The encoding pseudo-attribute in the XML declaration:

    <?xml version="1.0" encoding="utf-8" ?>
    

    A meta tag inside the head element:

    <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
    

    See Character Encodings for details.

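    Your getUrlContent() always decodes the response bytes as utf-8, so the text of a gb2312 page is already damaged before onCreate() sees it, and calling getBytes("utf-8") on the result cannot repair it. Instead, check each possible declaration to find the right encoding before you build the String. One approach is to start parsing the page as utf-8 and restart with the declared charset if you encounter a Content-Type meta tag.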
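
The lookup order described above could be sketched roughly as follows. This is an illustration under assumptions, not the Wiktionary sample's code: `CharsetSniffer`, `detect` and `decode` are hypothetical names, and the caller is assumed to pass in the raw response bytes plus the value of the Content-Type header (or null if there is none). Instead of parsing as utf-8 and restarting, this variant scans the first bytes as ISO-8859-1, which maps every byte to one character and so never fails on the ASCII declarations. The regex scan is deliberately naive and could be fooled by "charset=" appearing in ordinary page text.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original sample code.
final class CharsetSniffer {
    // Matches charset=... in a Content-Type header value or a meta tag.
    private static final Pattern CHARSET = Pattern.compile(
            "charset\\s*=\\s*[\"']?([A-Za-z0-9._:-]+)", Pattern.CASE_INSENSITIVE);
    // Matches encoding="..." inside an XML declaration.
    private static final Pattern XML_ENCODING = Pattern.compile(
            "<\\?xml[^>]*encoding\\s*=\\s*[\"']([A-Za-z0-9._:-]+)[\"']",
            Pattern.CASE_INSENSITIVE);

    // Returns the declared charset, falling back to utf-8 if none is found.
    static Charset detect(String contentTypeHeader, byte[] body) {
        // 1. The HTTP Content-Type header.
        Charset cs = lookup(CHARSET, contentTypeHeader);
        if (cs != null) return cs;

        // Decode only the start of the body, byte-for-byte, to scan it.
        int limit = Math.min(body.length, 4096);
        String head = new String(body, 0, limit, StandardCharsets.ISO_8859_1);

        // 2. The XML declaration.
        cs = lookup(XML_ENCODING, head);
        if (cs != null) return cs;

        // 3. A meta tag inside head.
        cs = lookup(CHARSET, head);
        return cs != null ? cs : StandardCharsets.UTF_8;
    }

    // Decodes the whole body with whatever detect() found.
    static String decode(String contentTypeHeader, byte[] body) {
        return new String(body, detect(contentTypeHeader, body));
    }

    private static Charset lookup(Pattern pattern, String text) {
        if (text == null) return null;
        Matcher m = pattern.matcher(text);
        if (!m.find()) return null;
        try {
            return Charset.forName(m.group(1));
        } catch (IllegalArgumentException e) {
            // Illegal or unsupported charset name: treat as "not declared".
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] head = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=gb2312\" />"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] text = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};
        byte[] page = Arrays.copyOf(head, head.length + text.length);
        System.arraycopy(text, 0, page, head.length, text.length);
        System.out.println(detect(null, page));
    }
}
```

In getUrlContent(), the final `new String(content.toByteArray(), "utf-8")` would then be replaced by a call along the lines of `CharsetSniffer.decode(headerValue, content.toByteArray())`, where `headerValue` is however you obtain the Content-Type value from the response.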

© 2021 The Archive Base. All Rights Reserved