I am running a classification/feature extraction task on a windows server with 64GB of

Question

Asked: May 23, 2026 (2026-05-23T05:29:10+00:00)

I am running a classification/feature extraction task on a Windows server with 64 GB of RAM, and somehow Python thinks I am running out of memory:

misiti@fff /cygdrive/c/NaiveBayes
$ python run_classify_comments.py > tenfoldcrossvalidation.txt
Traceback (most recent call last):
  File "run_classify_comments.py", line 70, in <module>
    run_classify_comments()
  File "run_classify_comments.py", line 51, in run_classify_comments
    NWORDS = get_all_words("./data/HUGETEXTFILE.txt")
  File "run_classify_comments.py", line 16, in get_all_words
    def get_all_words(path): return words(file(path).read())
  File "run_classify_comments.py", line 15, in words
    def words(text): return re.findall('[a-z]+', text.lower())
  File "C:\Program Files (x86)\Python26\lib\re.py", line 175, in findall
    return _compile(pattern, flags).findall(string)
MemoryError

So the re module is crashing with 64 GB of RAM? I do not think so.
Why is this happening, and how can I configure Python to use all available RAM on my machine?

1 Answer

Editorial Team · Answer 1 · 2026-05-23T05:29:11+00:00

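Rewrite your program to read the huge text file one line at a time instead of loading it whole with read(). The interpreter path in your traceback (C:\Program Files (x86)\Python26) suggests a 32-bit Python build; a 32-bit process can address only a few GB no matter how much RAM the server has, so installing a 64-bit Python is likely what you need to use all 64 GB. Either way, reading line by line is easily done by changing get_all_words(path) to: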

def get_all_words(path):
    # Tokenize one line at a time; sum() needs [] as its start value to concatenate lists.
    return sum((words(line) for line in open(path)), [])

Note the generator expression in parentheses: it is lazy, so sum pulls the words of one line at a time on demand instead of reading the whole file into memory first.
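Summing lists still ends up with one large list holding every word, and it copies that list on each addition, so it may remain slow or memory-hungry on a very large file. If the words are only needed for frequency counts (an assumption; the question does not show how NWORDS is used), a streaming count avoids building that list at all. The following is a minimal sketch, not the original poster's code: it assumes Python 3, and the names iter_words and count_words as well as the UTF-8 encoding are hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Same pattern as the question's words() helper, compiled once.
WORD_RE = re.compile(r"[a-z]+")


def iter_words(path):
    """Yield lowercase words from the file at `path`, one line at a time."""
    # Assumption: the file is UTF-8 text; undecodable bytes are replaced.
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            for word in WORD_RE.findall(line.lower()):
                yield word


def count_words(path):
    """Return word frequencies without holding the whole file in memory."""
    # Counter consumes the generator lazily, so only one line's words
    # exist as a list at any moment; memory grows with the vocabulary.
    return Counter(iter_words(path))
```

With this, something like NWORDS = count_words("./data/HUGETEXTFILE.txt") would keep memory use proportional to the number of distinct words rather than to the file size.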
