The Archive Base Latest Questions

Editorial Team
Asked: May 20, 2026 (2026-05-20T10:29:36+00:00)

Internally our PHP application uses UTF-8, and we do processing on .csv files and fixed-width (text) files. We have written some nice libraries to work with these files (classes essentially).

We recently added the ability for administrators to upload files of these types so they could be processed, and quickly ran into issues across multiple operating systems. We soon realised that the files being read in used different encodings from our application (e.g. Windows-1252 or ISO-8859).

Since it is impossible to control the encoding of the files submitted to us, my question is: what is the best way to handle uploaded text files of different encodings? I can think of two solutions currently:

  • When a file is received, detect its encoding and convert it to UTF-8, then re-save it. The rest of the system then only needs to be UTF-8 aware and can ignore ‘encoding’ issues.
  • Change the CSV / fixed-width libraries so they become encoding-aware themselves.
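For illustration, the first option (normalise to UTF-8 at the point of upload) might look roughly like the sketch below. The function name `toUtf8()` and the Windows-1252 fallback are assumptions made for this example, not part of the existing libraries, and the fallback is a guess rather than a real detection. It assumes the `mbstring` and `iconv` extensions are available.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: return $bytes as UTF-8.
 * Assumption: anything that is not already valid UTF-8 is treated as
 * $fallback (Windows-1252 by default). This is a guess, not a detection.
 */
function toUtf8(string $bytes, string $fallback = 'Windows-1252'): string
{
    // Strip a UTF-8 byte order mark if one is present.
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        $bytes = substr($bytes, 3);
    }

    // Input that is already valid UTF-8 is passed through untouched.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }

    // Otherwise convert from the assumed fallback encoding.
    $converted = iconv($fallback, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $fallback to UTF-8");
    }
    return $converted;
}
```

With something like this at the upload boundary, the CSV and fixed-width classes would only ever see UTF-8.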

I also thought about the pros and cons of these:

  • Converting the input keeps the rest of the libraries smaller and reduces duplication; however, it seems wasteful in terms of processing.
  • Making the libraries encoding-aware internally seems to involve more code but might be faster.

Thoughts please?

Edit: I am really interested to know where, architecturally, character encoding conversion should happen: at the point of input, or during the use of the files?

1 Answer

  1. Editorial Team
    Added an answer on May 20, 2026 at 10:29 am

    This is tricky, and there is no perfect solution.

    phpMyAdmin, for example, lets the user specify the encoding of the uploaded file. Since no automatic detection method is 100% reliable, this is, if at all possible, the best way to go IMO.
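To illustrate why detection cannot be fully reliable, here is a small sketch using a single byte that is legal in both Windows-1252 and ISO-8859-1 (picked here as one member of the ISO-8859 family). Both conversions succeed, so nothing in the data says which reading the uploader intended.

```php
<?php
// One byte, two plausible readings. Neither conversion fails.
$byte = "\x80";

$asCp1252 = iconv('Windows-1252', 'UTF-8', $byte); // euro sign, U+20AC
$asLatin1 = iconv('ISO-8859-1', 'UTF-8', $byte);   // C1 control character, U+0080

echo bin2hex($asCp1252), "\n"; // e282ac
echo bin2hex($asLatin1), "\n"; // c280
```

A human looking at a preview can spot the wrong choice immediately; a heuristic can only guess.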

    An import dialog that allows the user to select the right encoding while seeing a preview of what their data looks like in that encoding might be optimal.

    A way to do this could be:

    • Receive the uploaded file and store it in a temporary file

    • Display a dialog with a drop-down selection of the most important encodings

    • Have an iframe that, when the selected value in the drop-down changes, converts the contents of the uploaded file using iconv() (source = the selected encoding; target = UTF-8) and shows a preview.

    • When the user selects an encoding, do a final iconv() and store the file as UTF-8.
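The server side of the steps above could be sketched as follows. All names here (`ALLOWED_ENCODINGS`, `convertSelected()`, `previewFile()`, `storeAsUtf8()`) are hypothetical, and the whitelist is only an example of what the drop-down might offer. Reading the preview line by line assumes ASCII-compatible encodings, which holds for the ones listed.

```php
<?php
declare(strict_types=1);

// Hypothetical whitelist backing the drop-down; adjust to the encodings you expect.
const ALLOWED_ENCODINGS = ['UTF-8', 'Windows-1252', 'ISO-8859-1', 'ISO-8859-15'];

/** Convert $bytes from the user-selected encoding to UTF-8. */
function convertSelected(string $bytes, string $encoding): string
{
    // Never pass an arbitrary user-supplied name to iconv().
    if (!in_array($encoding, ALLOWED_ENCODINGS, true)) {
        throw new InvalidArgumentException("Unsupported encoding: $encoding");
    }
    $out = iconv($encoding, 'UTF-8', $bytes);
    if ($out === false) {
        throw new RuntimeException("Input is not valid $encoding");
    }
    return $out;
}

/** Preview: convert only the first $lines lines of the temporary file. */
function previewFile(string $tmpPath, string $encoding, int $lines = 10): string
{
    $fh = fopen($tmpPath, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $tmpPath");
    }
    $head = '';
    for ($i = 0; $i < $lines && ($line = fgets($fh)) !== false; $i++) {
        $head .= $line;
    }
    fclose($fh);
    return convertSelected($head, $encoding);
}

/** Final step: convert the whole file and store it as UTF-8. */
function storeAsUtf8(string $tmpPath, string $encoding, string $destPath): void
{
    $bytes = file_get_contents($tmpPath);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $tmpPath");
    }
    file_put_contents($destPath, convertSelected($bytes, $encoding));
}
```

The iframe endpoint would call `previewFile()` with the currently selected value and escape the result (for example with `htmlspecialchars()`) before printing it; the confirm action would call `storeAsUtf8()` once.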
