The Archive Base

Editorial Team
Asked: May 16, 2026

I read "How can I detect the encoding/codepage of a text file".
It’s not possible to detect the encoding in general. However, is it possible to detect which of two allowed encodings is in use?

For example, I allow users to use UTF-8 or ISO-8859-2 for their CSV files. Is it possible to detect whether a file is the former or the latter?

1 Answer

  1. Editorial Team
    Added an answer on May 16, 2026 at 2:11 pm

    For example I allow user to use
    Unicode UTF-8 and iso-8859-2 for their
    csv files. Is it possible to detect
    whether it is former or latter?

    It’s not possible with 100% accuracy because, for example, the bytes C3 B1 are an equally valid representation of “Ăą” in ISO-8859-2 as they are of “ñ” in UTF-8. In fact, because ISO-8859-2 assigns a character to all 256 possible bytes, every UTF-8 string is also a valid ISO-8859-2 string (representing different characters if non-ASCII).
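
    To make the ambiguity concrete, here is a small sketch in Python (chosen only for illustration; the question does not name a language). It decodes the same two bytes under both encodings, using Python's built-in codec names `utf-8` and `iso-8859-2`. The variable names are made up for this example.

```python
# The same two bytes, interpreted under each candidate encoding.
data = bytes([0xC3, 0xB1])

as_utf8 = data.decode("utf-8")         # one code point: U+00F1
as_latin2 = data.decode("iso-8859-2")  # two code points: U+0102, U+0105

print(repr(as_utf8), repr(as_latin2))

# Python's ISO-8859-2 codec maps every byte value to some character,
# so decoding arbitrary bytes (including valid UTF-8) never fails.
all_bytes_decoded = bytes(range(256)).decode("iso-8859-2")
```

    Both decodes succeed, so the bytes alone cannot tell you which one the author meant.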

    However, the converse is not true. UTF-8 has strict rules about what sequences are valid. More than 99% of possible 8-octet sequences are not valid UTF-8. And your CSV files are probably much longer than that. Because of this, you can get good accuracy if you:

    1. Perform a UTF-8 validity check. If it passes, assume the data is UTF-8.
    2. Otherwise, assume it’s ISO-8859-2.
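
    The two steps above could be sketched as follows, assuming Python and its strict `utf-8` codec as the validity check. The function name `detect_csv_encoding` is hypothetical, and the returned strings are just Python codec names.

```python
def detect_csv_encoding(data: bytes) -> str:
    """Return "utf-8" if data is valid UTF-8, otherwise fall back to "iso-8859-2"."""
    try:
        # Strict decoding raises on any malformed sequence.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return "iso-8859-2"
    return "utf-8"


# Example: a lone 0xB1 byte is a stray continuation byte in UTF-8,
# so it can only be the ISO-8859-2 interpretation.
print(detect_csv_encoding(b"\xb1"))      # falls back to iso-8859-2
print(detect_csv_encoding(b"\xc3\xb1"))  # valid UTF-8
```

    Pure-ASCII input is reported as UTF-8, which is harmless here because ASCII bytes decode to the same characters under both encodings.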

    However is it possible to detect
    whether encoding is one of two
    allowed?

    UTF-32 (either byte order), UTF-8, and CESU-8 can be reliably detected by validation.
    UTF-16 can be detected by presence of a BOM (but not by validation, since the only way for an even-length byte sequence to be invalid UTF-16 is to have unpaired surrogates).
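
    As a rough illustration of why UTF-16 needs a BOM check rather than validation, here is a Python sketch. The helper names `detect_utf16_by_bom` and `is_valid` are hypothetical; the BOM constants come from the standard `codecs` module. One assumption to note: the UTF-32 little-endian BOM (FF FE 00 00) begins with the UTF-16 little-endian BOM (FF FE), so the sketch rules that case out first.

```python
import codecs
from typing import Optional


def detect_utf16_by_bom(data: bytes) -> Optional[str]:
    """Return a UTF-16 codec name if data starts with a UTF-16 BOM, else None."""
    # FF FE 00 00 is more plausibly a UTF-32 LE BOM than UTF-16 LE + U+0000.
    if data.startswith(codecs.BOM_UTF32_LE):
        return None
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None


def is_valid(data: bytes, encoding: str) -> bool:
    """Strict validity check for the given encoding."""
    try:
        data.decode(encoding, errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Validation tells us little for UTF-16: these two bytes form a valid code unit.
print(is_valid(b"\xc3\xb1", "utf-16-le"))
# Only an unpaired surrogate (here D800 followed by "A") is rejected.
print(is_valid(b"\x00\xd8\x41\x00", "utf-16-le"))
```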

    If you have at least one “detectable” encoding, then you can check for the detectable encoding, and use the undetectable encoding as a fallback.

    If both encodings are “undetectable”, like ISO-8859-1 and ISO-8859-2, then it’s more difficult. You could try a statistical approach like the one chardet uses.
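
    The following Python sketch is not chardet's algorithm. It is only a toy version of the statistical idea: decode the bytes under each candidate and count how many characters fall in a set of letters you expect to see. The `EXPECTED` sets and the function name are assumptions made up for this example (French-like text for ISO-8859-1, Polish-like text for ISO-8859-2); a real detector would use proper language statistics.

```python
# Hypothetical "expected letters" per candidate encoding.
EXPECTED = {
    "iso-8859-1": set("àâçèéêëîïôùûü"),  # e.g. French text
    "iso-8859-2": set("ąćęłńóśźż"),      # e.g. Polish text
}


def guess_single_byte_encoding(data: bytes) -> str:
    """Pick the candidate whose decoding contains the most expected letters."""
    def score(encoding: str) -> int:
        # Both codecs accept every byte value, so decoding cannot fail here.
        text = data.decode(encoding)
        return sum(1 for ch in text if ch in EXPECTED[encoding])

    # On a tie, max() returns the first candidate in EXPECTED.
    return max(EXPECTED, key=score)


polish = "zażółć gęślą jaźń".encode("iso-8859-2")
french = "déjà vu, garçon".encode("iso-8859-1")
print(guess_single_byte_encoding(polish))
print(guess_single_byte_encoding(french))
```

    A heuristic like this can only be as good as its assumptions about the text's language, and short or mostly-ASCII inputs will often tie.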

© 2021 The Archive Base. All Rights Reserved