The Archive Base

Editorial Team
Asked: June 18, 2026 (2026-06-18T08:39:32+00:00)

I need to change a plain text UTF8 document from a R to L

I need to convert a plain-text UTF-8 document from a right-to-left language (written in Arabic script) to Latin script. Unfortunately, it isn’t as simple as a character-for-character transliteration.
For example, the “a” in the right-to-left script (ا) can become either “a” or “ә”, depending on the other letters in the word.

In words that contain a g, k, e, or hamza (گ،ك،ە، ء),
I need to change all the a, o, i, u (ا،و،ى،ۇ) to the Latin ә, ѳ, i, ü (the “soft” vowels).
E.g. سالەم becomes sәlêm, ءۇي becomes üy, سوزمەن becomes sѳzmên.

In words without a g, k, e, or hamza (گ،ك،ە، ء),
the a, o, i, u change to the Latin characters a, o, i, u (the “hard” vowels).
E.g. الما becomes alma, ۇل becomes ul, ورتا becomes orta.

In essence,
the g, k, e, or hamza acts as a pronunciation guide in the Arabic script.
In Latin, I then need two different sets of vowels, chosen according to the original word in the Arabic script.
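
The rule above amounts to choosing one of two vowel tables per word. A minimal Python sketch, assuming the marker letters are exactly the four listed, with hamza accepted as either U+0621 or U+0674 because the code-point example in this question spells it U+0674. The names `SOFT_MARKERS`, `SOFT_VOWELS`, `HARD_VOWELS`, `is_soft` and `convert_vowels` are illustrative and not part of the question:

```python
# Letters whose presence marks a word as "soft": g, k, e, hamza.
# Hamza is accepted as U+0621 or U+0674 (assumption: both may occur in input).
SOFT_MARKERS = set("\u06AF\u0643\u06D5\u0621\u0674")

# Arabic-script vowels a, o, i, u mapped to the two Latin vowel sets.
SOFT_VOWELS = str.maketrans({"\u0627": "ә", "\u0648": "ѳ", "\u0649": "i", "\u06C7": "ü"})
HARD_VOWELS = str.maketrans({"\u0627": "a", "\u0648": "o", "\u0649": "i", "\u06C7": "u"})


def is_soft(word: str) -> bool:
    """Return True if the word contains any soft-marker letter."""
    return any(ch in SOFT_MARKERS for ch in word)


def convert_vowels(word: str) -> str:
    """Replace only the four vowels; all other letters are left untouched."""
    table = SOFT_VOWELS if is_soft(word) else HARD_VOWELS
    return word.translate(table)


print(is_soft("\u0633\u0627\u0644\u06D5\u0645"))  # سالەم contains ە: True
print(is_soft("\u0627\u0644\u0645\u0627"))        # الما has no marker: False
```

The softness test has to run on the original word, before any letter is replaced, because the marker letters themselves get transliterated later.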

I was thinking I might need to handle the “soft” vowel words as step one, then do a separate find and replace on the rest of the document. But how do I carry out a find and replace like this with Perl or Python?
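
On the mechanics of the find and replace: two passes should not be needed if the replacement is computed per word. In Python, `re.sub` accepts a function as the replacement and calls it once per match, so each word can be inspected before deciding how to rewrite it. The sketch below only labels each word as soft or hard to show the dispatch; `label_word` is a hypothetical stand-in for a real transliteration function.

```python
import re

# g, k, e, hamza (U+0621 or U+0674): any of these makes the word "soft".
SOFT_MARKER = re.compile("[\u06AF\u0643\u06D5\u0621\u0674]")


def label_word(match):
    """Called by re.sub once per word; the return value replaces the word."""
    word = match.group(0)
    return "soft" if SOFT_MARKER.search(word) else "hard"


# The first three words of the example sentence, spelled as escapes.
text = "\u0633\u0627\u0644\u06D5\u0645 \u0648\u0631\u062A\u0627 \u0674\u06C7\u064A"

# In str patterns, \w+ matches runs of letters, including Arabic-script ones.
labelled = re.sub(r"\w+", label_word, text)
print(labelled)  # soft hard soft
```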

Here is a Unicode example: \U+0633\U+0627\U+0644\U+06D5\U+0645 \U+0648\U+0631\U+062A\U+0627 \U+0674\U+06C7\U+064A \U+0633\U+0648\U+0632\U+0645\U+06D5\U+0646 \U+0627\U+0644\U+0645\U+0627 \U+06C7\U+0644 \U+0645\U+06D5\U+0646\U+0649\U+06AD \U+0627\U+062A\U+0649\U+0645 \U+0634\U+0627\U+0644\U+0642\U+0627\U+0631.
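
To turn a code-point listing like the one above back into text for testing, the `\U+XXXX` tokens can be decoded directly. A small Python sketch; the helper name `decode_codepoints` is made up for this example, and it assumes every token has the form `\U+` followed by four to six hexadecimal digits:

```python
import re


def decode_codepoints(listing: str) -> str:
    r"""Replace each \U+XXXX token with the character it names."""
    return re.sub(
        r"\\U\+([0-9A-Fa-f]{4,6})",
        lambda m: chr(int(m.group(1), 16)),
        listing,
    )


# Two words from the listing; spaces and other text pass through unchanged.
sample = r"\U+0627\U+0644\U+0645\U+0627 \U+06C7\U+0644"
print(decode_codepoints(sample))  # the Arabic-script words for "alma ul"
```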

It should come out looking like: “sәlêm orta üy sѳzmên alma ul mêning atim xalқar”. (Note: the letter ڭ, which is U+06AD, ends up as two letters, n + g, to make an “-ng” sound.) It shouldn’t look like “salêm orta uy sozmên alma ul mêning atim xalқar”, nor like “sәlêm ѳrtә üy sѳzmên әlmә ül mêning әtim xәlқәr”.
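
Putting the rules and the expected output above together, here is a Python sketch of a one-pass converter. It covers only the letters that occur in the sample sentence plus g and k. The tables, the mapping of ي to y, and the dropping of hamza are inferred from the examples in this question, so a real document would need a complete alphabet table. The sample sentence is spelled with escapes so that the code points are unambiguous.

```python
import re

SOFT_MARKERS = "\u06AF\u0643\u06D5\u0621\u0674"  # g, k, e, hamza (two encodings)

SOFT_VOWELS = str.maketrans({"\u0627": "ә", "\u0648": "ѳ", "\u0649": "i", "\u06C7": "ü"})
HARD_VOWELS = str.maketrans({"\u0627": "a", "\u0648": "o", "\u0649": "i", "\u06C7": "u"})

# Remaining letters. U+06AD maps to two letters ("ng") and hamza is deleted,
# which a dict-based translation table handles without a separate pass.
OTHER_LETTERS = str.maketrans({
    "\u0633": "s", "\u0644": "l", "\u06D5": "ê", "\u0645": "m",
    "\u0631": "r", "\u062A": "t", "\u0632": "z", "\u0646": "n",
    "\u0634": "x", "\u0642": "қ", "\u0643": "k", "\u06AF": "g",
    "\u064A": "y", "\u06AD": "ng", "\u0621": "", "\u0674": "",
})


def convert_word(match):
    """Transliterate one word; softness is decided before any replacement."""
    word = match.group(0)
    soft = any(ch in SOFT_MARKERS for ch in word)
    word = word.translate(SOFT_VOWELS if soft else HARD_VOWELS)
    return word.translate(OTHER_LETTERS)


def arabic_to_latin(text):
    """Apply convert_word to every run of letters; punctuation is kept as is."""
    return re.sub(r"\w+", convert_word, text)


sample = (
    "\u0633\u0627\u0644\u06D5\u0645 \u0648\u0631\u062A\u0627 \u0674\u06C7\u064A "
    "\u0633\u0648\u0632\u0645\u06D5\u0646 \u0627\u0644\u0645\u0627 \u06C7\u0644 "
    "\u0645\u06D5\u0646\u0649\u06AD \u0627\u062A\u0649\u0645 "
    "\u0634\u0627\u0644\u0642\u0627\u0631"
)
print(arabic_to_latin(sample))
```

Letters missing from the tables pass through unchanged, which makes gaps in the alphabet easy to spot in the output.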

Many thanks for any help.

1 Answer

  1. Editorial Team
    Added an answer on June 18, 2026 at 8:39 am

    Command:

    $ echo سالەم ورتا ءۇي سوزمەن الما ۇل مەنىڭ اتىم شالقار | ./arabic-to-latin
    

    Output:

    sәlêm orta üy sѳzmên alma ul mêning atim xalқar
    

    To use files instead of stdin/stdout:

    $ ./arabic-to-latin input_file_with_arabic_text_in_utf8 >output_latin_in_utf8
    

    where the arabic-to-latin file contains:

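    #!/usr/bin/perl
    use strict;
    use warnings;
    use utf8;                 # this source file contains literal UTF-8
    use open qw(:std :utf8);  # decode input and encode output as UTF-8
    #XXX normalization
    
    sub replace_word {
        my ($word) = @_;
        local $_ = $word;     # localize $_ so the caller's value is not clobbered
        if (/[ءەكگ\x{0621}\x{0674}]/) { # g, k, e, or hamza (U+0621 or U+0674) in the word
            tr/اوىۇ/әѳiü/; # soft
        } else {
            tr/اوىۇ/aoiu/; # hard
        }
        tr/سلەمرتزنشقكگي/slêmrtznxқkgy/; # remaining letters, one to one
        s/[ء\x{0621}\x{0674}]//g;        # hamza only marks softness; drop it
        s/ڭ/ng/g;                        # one letter becomes two
        return $_;
    }
    
    # Rewrite every word of every input line (files given as arguments, or stdin).
    while (my $line = <>) {
        $line =~ s/(\w+)/replace_word($1)/ge;
        print $line;
    }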
    

    To make the arabic-to-latin file executable:

    $ chmod +x ./arabic-to-latin
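
The `#XXX normalization` comment in the script is left open. One possible reading, sketched here in Python rather than Perl and resting on assumptions the answer does not confirm, is that the input should be folded to a single spelling of each letter before the tables are applied. Compatibility normalization (NFKC) turns Arabic presentation forms into the base letters, and an extra table can fold variant spellings such as the two hamza code points. The `FOLD` table and the name `normalize_input` are hypothetical; which variants actually occur depends on the source documents.

```python
import unicodedata

# Assumed fold: plain hamza (U+0621) is rewritten as high hamza (U+0674),
# the spelling used in the question's code-point example.
FOLD = str.maketrans({"\u0621": "\u0674"})


def normalize_input(text: str) -> str:
    """Fold presentation forms and assumed variants to one spelling per letter."""
    return unicodedata.normalize("NFKC", text).translate(FOLD)


# U+FEB3 is the presentation form "seen, initial form"; NFKC maps it to U+0633.
print(normalize_input("\uFEB3") == "\u0633")  # True
```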
    
© 2021 The Archive Base. All Rights Reserved