The Archive Base

Editorial Team
Asked: June 17, 2026

Is there a table or similar resource that shows how many bytes, on average, different languages need to represent a visible character (glyph) when the encoding is UTF-8?


1 Answer

  1. Editorial Team
    Added an answer on June 17, 2026 at 10:01 pm

    If you want something general, I think you should stick with this:

    • English takes very slightly more than 1 byte per character (there is the occasional non-ASCII character, often punctuation or symbols embedded in text).
    • Most other languages that use the Latin alphabet take somewhat more than 1 byte per character, but I would be surprised if the average exceeded, say, 1.5.
    • Languages written in some of the other scripts (Greek, Cyrillic, etc.) take around 2 bytes per character.
    • East Asian languages take about 3 bytes per character (spacing, control characters, and embedded ASCII bring the average down; characters outside the BMP, which need 4 bytes each, push it up).
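    As a rough illustration of the figures above, here is a minimal Python sketch. The function name utf8_bytes_per_char and the sample strings are my own choices for illustration, and the sketch counts Unicode code points rather than true visible glyphs, so treat the numbers as indicative only.

```python
def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per Unicode code point in text."""
    if not text:
        return 0.0
    return len(text.encode("utf-8")) / len(text)


# Tiny hand-picked samples (assumptions, not a representative corpus).
samples = {
    "English": "hello",
    "French": "\u00e9t\u00e9",       # "été" with precomposed é (2 bytes each)
    "Greek": "αβγ",
    "Japanese": "日本語",
    "Emoji (non-BMP)": "\U0001F600",
}

for name, sample in samples.items():
    print(f"{name}: {utf8_bytes_per_char(sample):.2f} bytes per character")
```

    Real text mixes in spaces, digits, and ASCII punctuation, which is why averages for a whole language sit below the per-letter cost shown for these short samples.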

    That’s all very incomplete, approximate, and non-quantitative.

    If you need something more quantitative, I think you will have to research each language individually. I doubt you will find precomputed results out there that already apply to a host of different languages.

    If you have a corpus of text for a language, it’s easy to calculate the average number of bytes required. Start with the Text corpus Wikipedia page. It links to at least one good freely available corpus for English and there might be some available for other languages as well (I didn’t hunt through the links to find out).
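    The corpus calculation could be sketched along these lines in Python. The names count_glyphs and average_bytes_per_glyph are hypothetical, and skipping nonspacing and enclosing marks is only a crude stand-in for "visible character"; it is not full grapheme cluster segmentation, and it still counts whitespace and control characters as glyphs.

```python
import unicodedata
from typing import Iterable


def count_glyphs(text: str) -> int:
    """Approximate glyph count: code points that are not nonspacing
    (Mn) or enclosing (Me) marks. A rough assumption, not UAX #29."""
    return sum(1 for ch in text if unicodedata.category(ch) not in ("Mn", "Me"))


def average_bytes_per_glyph(lines: Iterable[str]) -> float:
    """Average UTF-8 bytes per approximate glyph over an iterable of strings.

    Works on a list of strings or on an open text file, so a large
    corpus can be streamed line by line.
    """
    total_bytes = 0
    total_glyphs = 0
    for line in lines:
        total_bytes += len(line.encode("utf-8"))
        total_glyphs += count_glyphs(line)
    return total_bytes / total_glyphs if total_glyphs else 0.0


# Usage with a corpus file ("corpus.txt" is a placeholder path):
#
#     with open("corpus.txt", encoding="utf-8", newline="") as f:
#         print(average_bytes_per_glyph(f))
```

    Passing newline="" keeps the original line endings, so the byte count matches the file contents rather than Python's normalized newlines.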

    Incidentally, I don’t recommend using this information to truncate the length of a database field, as you indicated (in comments) that you intend to do. If you derive your expected number of bytes per character from a corpus of literature, that corpus may not be at all representative of the short text strings that end up in your database, which would throw off your estimate. Just retrieve the whole database column. Most values will be much shorter than the maximum length, and when they are not, I don’t think the optimization is worth it to save a hundred bytes or so.


© 2021 The Archive Base. All Rights Reserved