The Archive Base

Editorial Team
Asked: May 10, 2026


(Updated a little)

I’m not very experienced with internationalization using PHP, it must be said, and a deal of searching didn’t really provide the answers I was looking for.

I need to work out a reliable way to convert only ‘relevant’ text to Unicode to send in an SMS message, using PHP (just temporarily, whilst the service is rewritten in C#). Messages are currently sent as plain text.

I could conceivably convert everything to the Unicode charset (as opposed to using the standard GSM charset), but that would mean that all messages would be limited to 70 characters (instead of 160).

So, I guess my real question is: what is the most reliable way to detect the requirement for a message to be Unicode-encoded, so I only have to do it when it’s absolutely necessary (e.g. for non-Latin-language characters)?

Added Info:

Okay, so I’ve spent the morning working on this, and I’m still no further on than when I started (certainly due to my complete lack of competency when it comes to charset conversion). So here’s the revised scenario:

I have text SMS messages coming from an external source, which provides the responses to me as plain text plus Unicode slash-escaped characters. E.g. the ‘displayed’ text:

Let’s test öäü éàè אין תמיכה בעברית

Returns:

Let’s test \u00f6\u00e4\u00fc \u00e9\u00e0\u00e8 \u05d0\u05d9\u05df \u05ea\u05de\u05d9\u05db\u05d4 \u05d1\u05e2\u05d1\u05e8\u05d9\u05ea

Now, I can send this on to my SMS provider as plain text, GSM 03.38 or Unicode. Obviously, sending the above as plain text results in a lot of missing characters (my provider replaces them with spaces), so I need to adapt the encoding to the content. What I want to do is the following:

  1. If all text is within the GSM 03.38 codepage, send it as-is. (All but the Hebrew characters above fit into this category, but need to be converted.)

  2. Otherwise, convert it to Unicode, and send it over multiple messages (as the Unicode limit is 70 chars not 160 for an SMS).

As I said above, I’m stumped on doing this in PHP (C# wasn’t much of an issue due to some simple conversion functions built-in), but it’s quite probable I’m just missing the obvious, here. I couldn’t find any pre-made conversion classes for 7-bit encoding in PHP, either – and my attempts to convert the string myself and send it on seemed futile.

Any help would be greatly appreciated.

1 Answer

Added an answer on May 10, 2026 at 2:53 pm

    To deal with this conceptually before getting into mechanisms (apologies if any of this is obvious): a string can be defined as a sequence of Unicode characters, where Unicode is a database that assigns an ID number, known as a code point, to every character you might need to work with. GSM 03.38 covers a subset of the Unicode characters, so what you’re doing is extracting the set of code points from your string and checking whether that set is contained in GSM 03.38.

    // Second column of http://unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT
    $gsm338_codepoints = array(0x0040, 0x0000, ..., 0x00fc, 0x00e0);

    $can_use_gsm338 = true;
    foreach (codepoints($mystring) as $codepoint) {
        // Any code point outside the GSM 03.38 set forces Unicode encoding.
        if (!in_array($codepoint, $gsm338_codepoints)) {
            $can_use_gsm338 = false;
            break;
        }
    }

    That leaves the definition of the function codepoints($string), which isn’t built into PHP. PHP understands a string to be a sequence of bytes rather than a sequence of Unicode characters. The best way of bridging the gap is to get your strings into UTF-8 as quickly as you can and keep them in UTF-8 as long as you can. You’ll have to use other encodings when dealing with external systems, but isolate each conversion at the interface to that system and deal only with UTF-8 internally.

    The functions you need to convert between PHP strings in UTF-8 and sequences of code points can be found at http://hsivonen.iki.fi/php-utf8/ , so that’s your codepoints() function.
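For illustration only, here is a minimal sketch of what such a codepoints() function could look like in plain PHP, without the library linked above. It assumes the input is already valid UTF-8 and relies only on the PCRE `u` modifier and byte arithmetic. The body is an assumption about one possible implementation, not the code from hsivonen.iki.fi.

```php
<?php
/**
 * Hypothetical helper: decode a UTF-8 string into a list of Unicode code points.
 * A sketch, not the hsivonen.iki.fi implementation.
 */
function codepoints(string $utf8): array
{
    // Split into characters; the /u modifier makes PCRE treat the input as UTF-8.
    $chars = preg_split('//u', $utf8, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    $points = [];
    foreach ($chars as $char) {
        $len  = strlen($char);   // number of bytes in this character (1 to 4)
        $lead = ord($char[0]);

        // Strip the length-marker bits from the lead byte.
        if ($len === 1) {
            $cp = $lead;
        } elseif ($len === 2) {
            $cp = $lead & 0x1F;
        } elseif ($len === 3) {
            $cp = $lead & 0x0F;
        } else {
            $cp = $lead & 0x07;
        }

        // Each continuation byte contributes its low 6 bits.
        for ($i = 1; $i < $len; $i++) {
            $cp = ($cp << 6) | (ord($char[$i]) & 0x3F);
        }
        $points[] = $cp;
    }

    return $points;
}
```

The returned integers can be compared directly against the values in the second column of the GSM0338.TXT mapping table.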

    If you’re taking data from an external source that gives you Unicode slash-escaped characters (‘Let’s test \u00f6\u00e4\u00fc…’), that escape format should be converted to UTF-8. I don’t know offhand of a function to do this. If one can’t be found, it’s a matter of string/regex processing plus the hsivonen.iki.fi functions: for example, when you hit \u00f6, replace it with the UTF-8 representation of the code point 0xf6.
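The regex approach described above might be sketched as follows. Both function names are hypothetical, and the sketch assumes the escapes are always a backslash, a `u` and exactly four hex digits. Escaped surrogate pairs (characters outside the Basic Multilingual Plane) are not recombined here and would need extra handling.

```php
<?php
/**
 * Hypothetical helper: encode a single Unicode code point as UTF-8 bytes.
 */
function utf8_from_codepoint(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

/**
 * Hypothetical helper: replace every \uXXXX escape in $text with the
 * UTF-8 encoding of that code point. Assumes four hex digits per escape.
 */
function unescape_unicode(string $text): string
{
    $result = preg_replace_callback(
        '/\\\\u([0-9a-fA-F]{4})/',   // a literal backslash, "u", four hex digits
        function (array $m): string {
            return utf8_from_codepoint((int) hexdec($m[1]));
        },
        $text
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $text;
}
```

Running the incoming text through unescape_unicode() first means the rest of the code only ever sees UTF-8, which is the input the GSM 03.38 code point check expects.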
