I will have to perform a spelling check-like operation in Python as follows: I

Question

Asked: June 1, 2026 (2026-06-01T14:53:44+00:00)

I will have to perform a spelling check-like operation in Python as follows:

I have a huge list of words (let’s call it the lexicon). I am now given some text (let’s call it the sample). I have to search for each sample word in the lexicon. If I cannot find it, that sample word is an error.

In short – a brute-force spelling checker. However, searching through the lexicon linearly for each sample word is bound to be slow. What’s a better method to do this?

The complicating factor is that neither the sample nor the lexicon is in English. Both are in a language that, instead of 26 characters, can have over 300, all stored in Unicode.

A suggestion of any algorithm / data structure / parallelization method will be helpful. Algorithms which have high speed at the cost of less than 100% accuracy would be perfect, since I don’t need 100% accuracy. I know about Norvig’s algorithm for this, but it seems English-specific.

1 Answer

Editorial Team · Answer 1 · 2026-06-01T14:53:45+00:00

You can use a set of Unicode strings:

s = set([u"rabbit", u"lamb", u"calf"])

and use the in operator to check whether a word occurs:

>>> u"rabbit" in s
True
>>> u"wolf" in s
False

This look-up takes O(1) time on average, because a set is a hash table, so the size of the lexicon has almost no effect on the time per word.
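
Since the lexicon is not in English, one thing worth checking is that the same word is always stored as the same sequence of code points: set membership compares strings exactly, so a precomposed character and its decomposed equivalent count as different words. Here is a minimal Python 3 sketch, assuming NFC normalization suits your script; the helper name `normalize_word` and the sample words are made up for illustration.

```python
import unicodedata

def normalize_word(word):
    # Hypothetical helper: map canonically equivalent sequences to one form (NFC).
    return unicodedata.normalize("NFC", word)

# The same word written two ways: a single precomposed code point,
# and a plain "e" followed by a combining acute accent.
composed = "caf\u00e9"
decomposed = "cafe\u0301"

# Normalize every lexicon entry once, when the set is built.
lexicon = {normalize_word(w) for w in [composed, "lamb", "calf"]}

print(decomposed in lexicon)                  # raw look-up misses
print(normalize_word(decomposed) in lexicon)  # normalized look-up hits
```

If you normalize the lexicon this way, each sample word needs the same normalization before the look-up; otherwise correctly spelled words may be reported as errors.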

Edit: Here’s the complete code for a (case-sensitive) spell checker (Python 2.6 or 2.7):

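from io import open
import re

# Load the lexicon into a set, one word per line.
with open("dictionary", encoding="utf-8") as f:
    words = set(line.strip() for line in f)

# re.UNICODE makes \w match non-ASCII letters and digits in Python 2.
with open("document", encoding="utf-8") as f:
    for w in re.findall(r"\w+", f.read(), re.UNICODE):
        if w not in words:
            print "Misspelled:", w.encode("utf-8")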

(The print assumes your terminal uses UTF-8.)
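
On Python 3 the same idea can be wrapped in a function that returns the unknown words instead of printing them, which makes it easier to try out on in-memory strings. This is only a sketch: the name `find_misspelled` and the sample data are hypothetical, and it reports each unknown word once.

```python
import re

def find_misspelled(text, lexicon):
    # Hypothetical helper: return the words of `text` that are missing from
    # `lexicon`, in order of first appearance, each reported once.
    seen = set()
    errors = []
    for word in re.findall(r"\w+", text):
        if word not in lexicon and word not in seen:
            seen.add(word)
            errors.append(word)
    return errors

lexicon = {"rabbit", "lamb", "calf"}
print(find_misspelled("rabbit wolf lamb wolf fox", lexicon))  # prints ['wolf', 'fox']
```

Depending on the script, `\w+` may split words at combining marks; if that happens with your language, the tokenizing pattern is the part to adapt, while the set look-up stays the same.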
