I am connecting to a MS SQL server through SQL Alchemy, using pyodbc module.

Asked: May 16, 2026

I am connecting to an MS SQL Server through SQLAlchemy, using the pyodbc module. Everything appeared to be working fine until I began having problems with encodings. Some non-ASCII characters are being replaced with ‘?’.

The DB has the collation ‘Latin1_General_CI_AS’ (I’ve also checked the specific fields and they have the same collation). I started by selecting the encoding ‘latin1’ in the call to create_engine, and that appears to work for Western European characters (French or Spanish characters like é) but not for Eastern European characters. Specifically, I have a problem with the character ć.

I have tried other encodings listed in the Python documentation, specifically the Microsoft ones such as cp1250 and cp1252, but I keep facing the same problem.

Does anyone know how to solve those differences? Does the collation ‘Latin1_General_CI_AS’ have an equivalent among the Python encodings?

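The code for my current connection is the following:

import pyodbc
from sqlalchemy import create_engine

def connect():
    # Open the raw ODBC connection through the configured DSN
    return pyodbc.connect('DSN=database;UID=uid;PWD=password')

# Let SQLAlchemy obtain its connections from connect()
engine = create_engine('mssql://', creator=connect, encoding='latin1')
connection = engine.connect()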

Clarifications and comments:

This problem happens when retrieving information from the DB. I don’t need to store anything.
At the beginning I didn’t specify the encoding, and the result was that whenever a non-ASCII character was encountered in the DB, pyodbc raised a UnicodeDecodeError. I corrected that by using ‘latin1’ as the encoding, but that doesn’t solve the problem for all characters.
I admit that the server is not on latin1; that comment was incorrect. I have checked both the database collation and the collations of the specific fields, and they all appear to be ‘Latin1_General_CI_AS’. How, then, can ć be stored? Maybe I’m not understanding collations correctly.
I have corrected the question a little: I have tried more encodings than latin1, namely cp1250 and cp1252 (the latter is apparently the one used by ‘Latin1_General_CI_AS’, according to MSDN).
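
To make the symptom above concrete, here is a minimal sketch of how a lossy conversion to a single-byte code page can turn ć into ‘?’. It runs no database code; the sample string and the helper name encode_lossy are assumptions made for illustration, and whether the server or the ODBC driver performs such a conversion in this setup is not confirmed by the question.

```python
# Hypothetical sample: 'é' and 'ć' are the characters named in the question.
SAMPLE = "é ć"


def encode_lossy(text, codec):
    """Encode text, substituting b'?' for characters the codec cannot represent."""
    return text.encode(codec, errors="replace")


if __name__ == "__main__":
    # latin1 and cp1252 have no slot for 'ć'; cp1250 does.
    for codec in ("latin1", "cp1252", "cp1250"):
        print(codec, encode_lossy(SAMPLE, codec))
```

With latin1 and cp1252 the é survives and the ć becomes a question mark, which matches the behaviour described above. cp1250 can represent both characters, but only if the bytes were produced with cp1250 in the first place.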

UPDATE:

OK, following these steps, I find that the encoding used by the DB appears to be cp1252: http://bytes.com/topic/sql-server/answers/142972-characters-encoding
Anyway, that appears to be a bad assumption, as reflected in the answers.

UPDATE2:
Anyway, after properly configuring the ODBC driver, I don’t need to specify the encoding in the Python code.

1 Answer

Editorial Team · Answer 1 · 2026-05-16T22:48:42+00:00

You should stop using code pages and switch to Unicode. This is the only way to get rid of this kind of problem.
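
As a rough illustration of that advice, the sketch below checks whether a string survives an encode/decode round trip in a given encoding. The sample string and the helper name survives_round_trip are hypothetical; the point is only that no single code page covers both Western and Eastern European characters, while a Unicode encoding such as UTF-8 or UTF-16 does.

```python
# Hypothetical sample mixing Western (é, ñ) and Eastern (ć) European characters.
SAMPLE = "é ñ ć"


def survives_round_trip(text, codec):
    """Return True if text can be encoded with codec and decoded back unchanged."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeError:
        # Raised when the codec has no representation for some character.
        return False


if __name__ == "__main__":
    for codec in ("cp1252", "cp1250", "utf-8", "utf-16-le"):
        print(codec, survives_round_trip(SAMPLE, codec))
```

cp1252 fails on ć and cp1250 fails on ñ, whereas both Unicode encodings keep the whole string intact.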
