I have a Python project containing hundreds of modules. In Python 2.6, the encoding

Question

0

Asked: May 24, 20262026-05-24T15:50:38+00:00 2026-05-24T15:50:38+00:00

I have a Python project containing hundreds of modules. In Python 2.6, the encoding

0

I have a Python project containing hundreds of modules. In Python 2.6, the encoding of source files (modules) must be ASCII unless
an explicit encoding declaration is present. Is there a simple way to find out which python modules contain non-ASCII characters? So that I can correct them.

Regards,

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-24T15:50:38+00:00

Have a look at the chardet python package. You can use the same os.walk approach as agf and call the chardet.detect method and flag files that are not ASCII (or with a lower confidence value).

This does leave some room for error though, so if you wanted to be more sure, you could also scan each file for characters that are unlikely to appear in a python file (non-alphanum, non-punctuation, etc). However, this will not detect things like UTF-16 chars that have the same value as two 7-bit, zero padded ascii chars i.e. U+16705 <–> AA.

That said, if the characters that you want to exclude are from a limited number of character sets, you should be able to locate them with high confidence.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have a Python project containing hundreds of modules. In Python 2.6, the encoding

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply