I have over a million text files compressed into 40 zip files. I also have a list of about 500 model names of phones. I want to find out the number of times a particular model was mentioned in the text files.
Is there a Python module that can run a regex match on the files without unzipping them? Is there a simple way to solve this problem without extracting the archives?
There’s nothing that will automatically do what you want.
However, Python's standard-library zipfile module makes this easy to do. Here’s how to iterate over the lines of each file inside an archive.
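A minimal sketch of that approach, under a few assumptions that are not in the question: the files are UTF-8 text, matching should be case-insensitive, and the model names are plain strings rather than regex patterns. The function name `count_mentions` and its parameters are made up for illustration. Each archive member is opened as a stream with `ZipFile.open`, so nothing is extracted to disk.

```python
import io
import re
import zipfile
from collections import Counter


def count_mentions(zip_paths, model_names, encoding="utf-8"):
    """Count mentions of each model name across every file in the archives."""
    # Try longer names first so "Galaxy S10 Plus" is not counted as "Galaxy S10".
    ordered = sorted(model_names, key=len, reverse=True)
    # One combined pattern means each line is scanned once, not 500 times.
    pattern = re.compile("|".join(re.escape(n) for n in ordered), re.IGNORECASE)
    # Map lowercased matches back to the spelling used in the input list.
    canonical = {n.lower(): n for n in model_names}

    counts = Counter()
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                # archive.open() yields a binary stream; wrap it to read text lines.
                with archive.open(info) as raw:
                    text = io.TextIOWrapper(raw, encoding=encoding, errors="replace")
                    for line in text:
                        for match in pattern.finditer(line):
                            key = match.group(0).lower()
                            counts[canonical.get(key, key)] += 1
    return counts
```

You would call it with something like `count_mentions(glob.glob("*.zip"), models)`, where `models` is your list of 500 names. Note that this counts substring matches, so "Galaxy S10" would also match inside "Galaxy S100"; if that matters, wrap the pattern in word boundaries (`\b`).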