The Archive Base


Editorial Team
Asked: June 1, 2026

I am trying to output unicode text to an RTF file from a python


I am trying to output Unicode text to an RTF file from a Python script. For background, Wikipedia says:

For a Unicode escape the control word \u is used, followed by a 16-bit signed decimal integer giving the Unicode UTF-16 code unit number. For the benefit of programs without Unicode support, this must be followed by the nearest representation of this character in the specified code page. For example, \u1576? would give the Arabic letter bāʼ ب, specifying that older programs which do not have Unicode support should render it as a question mark instead.
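As an illustration of the quoted rule, here is a minimal Python 3 sketch (the code in this question is Python 2) that lists the UTF-16 code units of a string as signed 16-bit integers. The function name `utf16_signed_units` is made up for this example.

```python
import struct

def utf16_signed_units(text):
    """Return the UTF-16 code units of text as signed 16-bit integers."""
    # Little-endian UTF-16 without a byte order mark: two bytes per code unit.
    data = text.encode('utf-16-le')
    count = len(data) // 2
    # 'h' is a signed 16-bit integer, so units above 32767 come out negative.
    return list(struct.unpack('<%dh' % count, data))
```

For the Arabic letter in the quote this gives `[1576]`, matching `\u1576?`. A character outside the Basic Multilingual Plane produces two values, one per surrogate.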

There is also this question on outputting RTF from Java and this one on doing so in C#.

However, what I can’t figure out is how to output the unicode code point as a “16-bit signed decimal integer with the Unicode UTF-16 code unit number” from Python. I’ve tried this:

for char in unicode_string:
    print '\\' + 'u' + ord(char) + '?',

but the output renders as gibberish when opened in a word processor. The problem appears to be that the numbers are not UTF-16 code units, and I am not sure how to get those: one can encode a string as UTF-16, but how does one get the code unit numbers?

Incidentally, PyRTF does not support Unicode (it’s listed as a “todo”), and while pyrtf-NG is supposed to do so, that project does not appear to be maintained and has little documentation, so I am wary of using it in a quasi-production system.

Edit: My mistake. There are two bugs in the above code. As pointed out by Wobble below, the string has to be a unicode string, not an already encoded one, and the above code produces a result with spaces between characters. The correct code is this:

# Python 2: decode the UTF-8 byte string first so the loop sees characters, not bytes.
convertstring = ""
for char in unicode(<my_encoded_string>, 'utf-8'):
    convertstring = convertstring + '\\' + 'u' + str(ord(char)) + '?'

This works fine, at least with OpenOffice. I am leaving this here as a reference for others (one mistake further corrected after discussion below).
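To make the first bug concrete, here is a small Python 3 sketch (not the original Python 2 code) comparing what happens when the loop runs over encoded bytes and when it runs over decoded characters. The variable names are illustrative only.

```python
# Python 3 sketch: compare iterating encoded bytes with iterating characters.
text = '©2012'
encoded = text.encode('utf-8')

# Iterating bytes yields one number per byte; '©' becomes two (194, 169).
from_bytes = ''.join('\\u%d?' % b for b in encoded)

# Iterating the decoded string yields one number per character.
from_chars = ''.join('\\u%d?' % ord(c) for c in encoded.decode('utf-8'))
```

The byte-wise version emits `\u194?\u169?` for the copyright sign, which a word processor shows as two unrelated characters. That is one plausible source of the gibberish described above.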

1 Answer

  1. Editorial Team
    Added an answer on June 1, 2026 at 3:15 am

    Based on the information in your latest edit, I think this function will work properly, but see the improved version below.

    def rtf_encode(unistr):
        return ''.join([c if ord(c) < 128 else u'\\u' + unicode(ord(c)) + u'?' for c in unistr])
    
    >>> test_unicode = u'\xa92012'
    >>> print test_unicode
    ©2012
    >>> test_utf8 = test_unicode.encode('utf-8')
    >>> print test_utf8
    ©2012
    >>> print rtf_encode(test_utf8.decode('utf-8'))
    \u169?2012
    

    Here’s another version that’s broken down a little to be easier to understand. I also made it consistent in returning an ASCII string rather than keeping Unicode and flubbing it at the join. It also incorporates a fix based on the comments: code points above 32767 are written as negative numbers, because RTF expects a signed 16-bit value.

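    def rtf_encode_char(unichar):
        code = ord(unichar)
        if code < 128:
            # Plain ASCII passes through unchanged.
            return str(unichar)
        # RTF wants a signed 16-bit decimal, so values above 32767 wrap negative.
        return '\\u' + str(code if code <= 32767 else code-65536) + '?'
    
    def rtf_encode(unistr):
        return ''.join(rtf_encode_char(c) for c in unistr)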
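For readers on Python 3, the same idea can be sketched as below. This is a hedged adaptation, not the code from the answer. It goes through UTF-16 code units so that characters above U+FFFF are written as two escapes (a surrogate pair). It also escapes backslashes and braces, which RTF treats as syntax; that step is an addition not discussed above.

```python
import struct

def rtf_encode(text):
    """Escape text for RTF using \\uN? escapes (Python 3 sketch)."""
    parts = []
    for ch in text:
        if ch in '\\{}':
            # Backslash and braces are RTF syntax and need a leading backslash.
            parts.append('\\' + ch)
        elif ord(ch) < 128:
            # Other ASCII characters pass through unchanged.
            parts.append(ch)
        else:
            # One or two UTF-16 code units, read as signed 16-bit integers.
            data = ch.encode('utf-16-le')
            for (unit,) in struct.iter_unpack('<h', data):
                parts.append('\\u%d?' % unit)
    return ''.join(parts)
```

With this sketch, `rtf_encode('©2012')` returns `\u169?2012`, the same result as the interactive session earlier in this answer.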
    
© 2021 The Archive Base. All Rights Reserved