The Archive Base


The Archive Base Latest Questions

Editorial Team
Asked: May 16, 2026


I’ve set up a script that processes incoming emails and creates blog entries on Blogger. I’m using PEAR’s Mail_Mime libs (for now) to read the incoming message. The messages often have characters in them that cannot be read by browsers–this happens most often when people use Outlook or cut/paste from MS Word.

So the output at the other end is something like this:

Here is a test post with “quotes” and ‘apostrophes�for what it�s worth, it also has dashes�and other strange formatting cut and paste from MS Word.

You can also see the output in the wild.

It’s not hard to fix any specific instance, but each client (Hotmail, Gmail, Outlook, etc.) seems to handle things just a bit differently. Mail_Mime only seems to munge the output and, if I turn off Mail_Mime’s parsing and try to translate the encoded characters myself using mb_convert_encoding or some manual simulation of this, it’s even worse.

Please note that this is not going to be solved by selecting the right encoding type and using decode/encode/convert functions. The incoming formats vary from Windows-1252 to UTF-8 to just about anything else mail clients can think of.

Has anyone scripted this before that could save me some time by offering up a sample or advice on the best approach? I’ve tried all the simple answers and done plenty of experimenting, so please don’t bother responding unless you’ve dealt with a similar issue successfully or have a deep understanding of encoding issues.


1 Answer

  1. Editorial Team
    Added an answer on May 16, 2026 at 1:20 am

    To solve this problem, and get my message into valid UTF-8 that is readable from a browser, I found this PHP lib, ConvertCharset by Mikolaj Jedrzejak, which worked on almost everything. It still had issues with a specific symbol (=A0) when converting from Windows-1252 or iso-8859-1. So I converted this character manually before setting the code loose.
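To illustrate why the =A0 byte needs special care, here is a small sketch that is not part of the original solution. It assumes the mbstring extension is available, uses the built-in mb_convert_encoding in place of ConvertCharset so that it runs on its own, and the sample strings are made up. In Windows-1252 and iso-8859-1, byte 0xA0 is a non-breaking space, but in UTF-8 the same byte only appears inside multi-byte characters such as "à" (0xC3 0xA0), so it should only be stripped from text that is not already UTF-8.

```php
<?php
// Sample input as a Windows-1252 client might send it (made-up example).
// 0x93 and 0x94 are curly double quotes; 0xA0 is a non-breaking space.
$cp1252 = "\x93quotes\x94\xA0here";

// These raw bytes are not valid UTF-8, which is why a browser that expects
// UTF-8 shows replacement characters instead of the quotes.
$isValidUtf8 = mb_check_encoding($cp1252, 'UTF-8');

// Strip the non-breaking space first, as the answer does, then convert.
$stripped = str_replace("\xA0", '', $cp1252);
$utf8 = mb_convert_encoding($stripped, 'UTF-8', 'Windows-1252');

// In UTF-8 text, 0xA0 is the second byte of "à" (0xC3 0xA0). Stripping it
// there leaves a lone 0xC3, which is no longer valid UTF-8.
$aGrave = "\xC3\xA0";
$damaged = str_replace("\xA0", '', $aGrave);
$damagedIsValid = mb_check_encoding($damaged, 'UTF-8');
```

The practical consequence is that the byte replacement belongs inside the "not already UTF-8" branch rather than being applied to every message.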

    Here’s what it looks like overall:

    // decode using Mail_Mime
    require 'Mail.php';
    require 'Mail/mime.php';
    require 'Mail/mimeDecode.php';
    $params = array();
    $params['include_bodies'] = true;
    $params['decode_bodies']  = true; // this decodes it!
    $params['decode_headers'] = true;
    $decoder = new Mail_mimeDecode($input); // $input is the raw message source
    $mime = $decoder->decode($params);
    
    // too much work to put in this example
    $charset = ...; //do some magic with $mime->parts to get the character set
    $text = ...; //do some magic with $mime->parts to get the text
    
    // only text that is not already UTF-8 needs fixing; in UTF-8, byte 0xA0 is
    // part of multi-byte characters such as "à" and must be left alone
    if( strtolower($charset) != 'utf-8' ) {
      // fix the =A0 character (a non-breaking space in Windows-1252 and
      // iso-8859-1); it's already been decoded by Mail_Mime, so we need the
      // actual byte now. this has to be done before trying to convert to UTF-8
      $char = chr(0xA0);
      $text = str_replace($char, '', $text);
    
      // convert to UTF-8 using ConvertCharset
      require 'ConvertCharset.class.php';
      $converter = new ConvertCharset($charset, 'utf-8', false);
      $text = $converter->Convert($text);
    }
    

    Then everything is spiffy. It even does the infamous Iñtërnâtiônàlizætiøn conversion, as well as accepting French, Spanish, and pastes directly from MS Word 🙂
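The "magic" that pulls the character set and the text out of the decoded message is left out of the answer's code. As a rough sketch only, and not the author's code, a recursive walk over the decoded structure might look like the following. It assumes the property names that Mail_mimeDecode documents for its return value (ctype_primary, ctype_secondary, ctype_parameters, body, parts). The function name findTextPart, the us-ascii fallback, and the hand-built sample message are hypothetical and stand in for a real decoded email so the example runs without PEAR.

```php
<?php
/**
 * Return [charset, body] for the first text/plain part, or null if none.
 * Hypothetical helper; $part mimics the object Mail_mimeDecode::decode() returns.
 */
function findTextPart(object $part, string $fallbackCharset = 'us-ascii'): ?array
{
    $primary   = strtolower($part->ctype_primary ?? '');
    $secondary = strtolower($part->ctype_secondary ?? '');

    if ($primary === 'text' && $secondary === 'plain' && isset($part->body)) {
        // Fall back to a default when the part declares no charset.
        $charset = $part->ctype_parameters['charset'] ?? $fallbackCharset;
        return [strtolower($charset), $part->body];
    }

    // Multipart messages nest their children under ->parts.
    foreach ($part->parts ?? [] as $child) {
        $found = findTextPart($child, $fallbackCharset);
        if ($found !== null) {
            return $found;
        }
    }
    return null;
}

// Hand-built stand-in for a decoded multipart/alternative message.
$mime = (object) [
    'ctype_primary'   => 'multipart',
    'ctype_secondary' => 'alternative',
    'parts' => [
        (object) [
            'ctype_primary'    => 'text',
            'ctype_secondary'  => 'plain',
            'ctype_parameters' => ['charset' => 'Windows-1252'],
            'body'             => "it\x92s worth",
        ],
        (object) [
            'ctype_primary'    => 'text',
            'ctype_secondary'  => 'html',
            'ctype_parameters' => ['charset' => 'Windows-1252'],
            'body'             => "<p>it\x92s worth</p>",
        ],
    ],
];

[$charset, $text] = findTextPart($mime);
```

Lower-casing the charset here keeps the later comparison against 'utf-8' simple. A real message may carry only an HTML part or declare a wrong charset, so the result would still need checking before conversion.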

© 2021 The Archive Base. All Rights Reserved