I run across this problem frequently suppose I have a text file that I

Question

Editorial Team

Asked: May 20, 2026 (2026-05-20T04:36:06+00:00)

I run across this problem frequently. Suppose I have a text file that I have read in as a list using file.readlines().

Suppose the file looks something like this:

stuff stuff stuff stuff stuff
stuff stuff stuff stuff stuff
stuff stuff stuff stuff stuff
stuff stuff stuff stuff stuff # indeterminate number of lines
The text I want is set off by something distinctive
I want this
I want this
I want this
I want this # indeterminate number of lines
The end is also identifiable by something distinctive
stuff stuff stuff stuff stuff
stuff stuff stuff stuff stuff
stuff stuff stuff stuff stuff

The way I have been handling this is to do something like this:

themasterlist=[]
for file in filelist:
    count=0
    templist=[]
    for line in file:
        if line=='The text I want is set off by something distinctive':
            count=1
        if line=='The end is also identifiable by something distinctive':
            count=0
        if count==1:
            templist.append(line)
    themasterlist.append(templist)

I have thought about reading the file as a single string (file.read()), splitting it on the end points, and then converting the result to a list, but I actually want to use this construction for a number of other types. For example, suppose I am iterating through the elements of an lxml.fromstring(somefile) and I want to process a subset of the elements based on whether or not element.text contains some phrase.

Note, I could be running through 200K to 300K files at a time.

My solution works, but it feels clunky, and I suspect I am missing something important about Python.

There are three really good answers, and I learned something useful from each. I need to select one as the answer, but I do appreciate each poster's response; it was very helpful.

1 Answer

Editorial Team · Answer 1 · 2026-05-20T04:36:07+00:00

I like stuff like this:

def findblock( lines, start, stop ):
    it = iter(lines)
    for line in it:
        if start in line:
            # now we are in the block, so yield till we find the end
            for line in it:
                if stop in line:
                    # lets just look for one block
                    return # leave this generator
                    # break # would keep looking for the next block
                yield line                

for line in findblock(lines, start="something distinctive", 
                             stop="something distinctive"):
    print(line)
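
If the file can contain more than one block, the commented-out break mentioned above is the variant to reach for. Here is a minimal sketch of that version. The name findblocks, the BEGIN/END markers and the sample list are made up for illustration and are not from the question:

```python
def findblocks(lines, start, stop):
    """Yield every line that sits between a start marker and a stop marker."""
    it = iter(lines)
    for line in it:
        if start in line:
            # Inside a block: yield lines until the stop marker shows up.
            for line in it:
                if stop in line:
                    break  # leave this block, keep scanning for the next one
                yield line


# Hypothetical sample data with two blocks, newlines kept as readlines() would.
sample = [
    "stuff\n",
    "BEGIN\n",
    "I want this\n",
    "END\n",
    "stuff\n",
    "BEGIN\n",
    "and this\n",
    "END\n",
]
wanted = list(findblocks(sample, start="BEGIN", stop="END"))
```

Both loops pull from the same iterator, so the outer loop picks up exactly where the inner one stopped. The marker lines themselves are never yielded.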

The stuff you were missing is yield and list comprehensions – here is your code revised:

def findblock( lines, start='The text I want is set off by something distinctive', 
                      stop='The end is also identifiable by something distinctive'):
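    inblock = False  # initialise once, outside the loop, so the state carries across lines
    for line in lines:
        if line == start:
            inblock = True
        if line == stop:
            inblock = False  # or return here to stop after the first block
        if inblock:
            yield line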

themasterlist = [list(findblock( file )) for file in files]
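
Since you mentioned wanting the same construction for other types, here is a rough sketch of how the idea might generalize: pass the start and stop tests in as functions, so the generator no longer cares whether it is walking lines or elements. The name between and the two predicate parameters are my own invention, and I am using the standard library's xml.etree.ElementTree as a stand-in for lxml so the example runs without extra installs. Note two differences from your loop: the marker items are not yielded, and the line predicates strip the trailing newline that readlines() leaves on each line, which a plain == comparison would trip over.

```python
import xml.etree.ElementTree as ET


def between(items, is_start, is_stop):
    """Yield the items found after a start match and before the next stop match."""
    inblock = False
    for item in items:
        if inblock:
            if is_stop(item):
                inblock = False  # block closed; keep scanning for another one
            else:
                yield item
        elif is_start(item):
            inblock = True  # the marker itself is not yielded


START = "The text I want is set off by something distinctive"
STOP = "The end is also identifiable by something distinctive"

# Lines as readlines() would return them, trailing newlines included.
lines = [
    "stuff stuff stuff\n",
    START + "\n",
    "I want this\n",
    "I want this\n",
    STOP + "\n",
    "stuff stuff stuff\n",
]
wanted = list(between(
    lines,
    lambda s: s.rstrip("\n") == START,
    lambda s: s.rstrip("\n") == STOP,
))

# The same generator over XML elements (stdlib ElementTree standing in for lxml).
root = ET.fromstring(
    "<r><a>skip</a><a>start here</a><a>keep me</a><a>stop here</a><a>skip</a></r>"
)
kept = [
    el.text
    for el in between(
        root,
        lambda el: "start" in (el.text or ""),
        lambda el: "stop" in (el.text or ""),
    )
]
```

Because it is a generator, nothing is held in memory beyond the current item, which should matter if you really are running through a few hundred thousand files.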
