Question

Asked: May 10, 2026

I’m not exactly sure how to ask this question, and I’m nowhere close to finding an answer, so I hope someone can help me.

I’m writing a Python app that connects to a remote host and receives back byte data, which I unpack using Python’s built-in struct module. My problem is with the strings, as they include multiple character encodings. Here is an example of such a string:

‘^LThis is an example ^Gstring with multiple ^Jcharacter encodings’

The point where each encoding starts and ends is marked with a special escape character:

^L – Latin1
^E – Central Europe
^T – Turkish
^B – Baltic
^J – Japanese
^C – Cyrillic
^G – Greek

And so on… I need a way to convert this sort of string into Unicode, but I’m really not sure how to do it. I’ve read up on Python’s codecs and string.encode/decode, but I’m none the wiser. I should also mention that I have no control over how the host outputs the strings.

I hope someone can help me with how to get started on this.

1 Answer

score 0 · Answer 1 · 2026-05-10T17:29:06+00:00

There’s no built-in functionality for decoding a string like this, since it is really its own custom codec. You simply need to split up the string on those control characters and decode it accordingly.

Here’s a (very slow) example of such a function that handles latin1 and shift-JIS:

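    # Python 2 code: s is a byte string (str) and the result is unicode.
    latin1 = 'latin-1'
    japanese = 'Shift-JIS'

    control_l = '\x0c'  # ^L
    control_j = '\n'    # ^J

    # Maps each control character to the codec that follows it.
    encodingMap = {
        control_l: latin1,
        control_j: japanese}

    def funkyDecode(s, initialCodec=latin1):
        output = u''
        accum = ''
        currentCodec = initialCodec
        for ch in s:
            if ch in encodingMap:
                # Marker found: decode what was collected so far, then switch codec.
                output += accum.decode(currentCodec)
                currentCodec = encodingMap[ch]
                accum = ''
            else:
                accum += ch
        # Decode the trailing run after the last marker.
        output += accum.decode(currentCodec)
        return output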

A faster version might use str.split, or regular expressions.
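As a rough sketch of the regular-expression idea, here is one way it could look in Python 3, where the received data is a bytes object. The names ENCODING_MAP, MARKER_RE and funky_decode are made up for this example, and the Greek codec is an assumption: the host might just as well use cp1253 or another code page, so check what it really sends before relying on the map.

```python
import re

# Control byte -> codec name. Latin-1 and Shift-JIS follow the example above;
# the Greek entry is an ASSUMPTION (the host may use cp1253 or something else).
ENCODING_MAP = {
    b"\x0c": "latin-1",    # ^L
    b"\x0a": "shift_jis",  # ^J
    b"\x07": "iso8859-7",  # ^G (assumed codec)
}

# A single character class that matches any one of the marker bytes.
MARKER_RE = re.compile(
    b"([" + b"".join(re.escape(k) for k in ENCODING_MAP) + b"])"
)


def funky_decode(data, initial_codec="latin-1"):
    """Decode bytes whose encoding switches at single-byte markers."""
    # With one capturing group, split() returns
    # [text, marker, text, marker, text, ...].
    parts = MARKER_RE.split(data)
    # Anything before the first marker uses the initial codec.
    pieces = [parts[0].decode(initial_codec)]
    for marker, chunk in zip(parts[1::2], parts[2::2]):
        pieces.append(chunk.decode(ENCODING_MAP[marker]))
    return "".join(pieces)
```

Splitting the raw bytes before decoding is safe for Shift-JIS in particular, because the second byte of a double-byte Shift-JIS character is never below 0x40, so it cannot collide with these control bytes. Other multi-byte encodings would need the same check.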

(Also, as you can see in this example, ‘^J’ is the control character for ‘newline’, so your input data is going to have some interesting restrictions.)
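To see which byte each marker from the question corresponds to, a small helper can translate caret notation into the control byte it denotes. This assumes the host sends real control bytes (as the answer above does) rather than a literal caret followed by a letter; caret_to_byte is a name invented for this sketch.

```python
def caret_to_byte(notation):
    """Convert caret notation such as '^J' to the control byte it denotes."""
    if len(notation) != 2 or notation[0] != "^":
        raise ValueError("expected caret notation like '^J'")
    # Caret notation flips bit 6 of the ASCII code: 'J' (0x4A) -> 0x0A.
    return bytes([ord(notation[1].upper()) ^ 0x40])


# The markers listed in the question.
for name in ["^L", "^E", "^T", "^B", "^J", "^C", "^G"]:
    print(name, caret_to_byte(name))
```

Besides ^J being a newline, ^G is the bell character (0x07), so any ordinary text from the host that contains those bytes would be misread as an encoding switch.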
