I would like to search through a text file and print out a line and its subsequent 3 lines if a keyword is found in the line AND a different keyword is found within the subsequent 3 lines.
My code right now prints too much information. Is there a way to move forward to the next section of text once a portion is already printed?
text = """
here is some text 1
I want to print out this line and the following 3 lines only once keyword 2
print this line since it has a keyword2 3
print this line keyword 4
print this line 5
I don't want to print this line but I want to start looking for more text starting at this line 6
Don't print this line 7
Not this line either 8
I want to print out this line again and the following 3 lines only once keyword 9
please print this line keyword 10
please print this line it has the keyword2 11
please print this line 12
Don't print this line 13
Start again searching here 14
etc.
"""
text2 = open("tmp.txt","w")
text2.write(text)
text2.close()
searchlines = open("tmp.txt").readlines()
data = []
for m, line in enumerate(searchlines):
    line = line.lower()
    if "keyword" in line and any("keyword2" in l.lower() for l in searchlines[m:m+4]):
        for line2 in searchlines[m:m+4]:
            data.append(line2)
print ''.join(data)
The output right now is:
I want to print out this line and the following 3 lines only once keyword 2
print this line since it has a keyword2 3
print this line keyword 4
print this line 5
print this line since it has a keyword2 3
print this line keyword 4
print this line 5
I don't want to print this line but I want to start looking for more text starting at this line 6
I want to print out this line again and the following 3 lines only once keyword 9
please print this line keyword 10
please print this line it has the keyword2 11
please print this line 12
please print this line keyword 10
please print this line it has the keyword2 11
please print this line 12
Don't print this line 13
please print this line it has the keyword2 11
please print this line 12
Don't print this line 13
Start again searching here 14
I would like it to print out only:
I want to print out this line and the following 3 lines only once keyword 2
print this line since it has a keyword2 3
print this line keyword 4
print this line 5
I want to print out this line again and the following 3 lines only once keyword 9
please print this line keyword 10
please print this line it has the keyword2 11
please print this line 12
As someone else has pointed out, your first keyword, "keyword", is a substring of your second keyword, "keyword2", so a plain substring test matches both. I've implemented this using regexp objects so that you can use the word boundary anchor \b.
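A minimal sketch of that approach, assuming the second keyword has to appear in one of the 3 lines after the first match (not on the same line), and that the search should resume on the line right after a printed block. The names find_blocks, FIRST, SECOND and following are made up for illustration. A while loop with an explicit index replaces enumerate so the index can jump past a block once it has been collected:

```python
import re

text = """
here is some text 1
I want to print out this line and the following 3 lines only once keyword 2
print this line since it has a keyword2 3
print this line keyword 4
print this line 5
I don't want to print this line but I want to start looking for more text starting at this line 6
Don't print this line 7
Not this line either 8
I want to print out this line again and the following 3 lines only once keyword 9
please print this line keyword 10
please print this line it has the keyword2 11
please print this line 12
Don't print this line 13
Start again searching here 14
etc.
"""

# Word-boundary patterns: \bkeyword\b does not match inside "keyword2".
FIRST = re.compile(r"\bkeyword\b", re.IGNORECASE)
SECOND = re.compile(r"\bkeyword2\b", re.IGNORECASE)


def find_blocks(lines, first=FIRST, second=SECOND, following=3):
    """Return non-overlapping blocks: a line matching `first` plus the next
    `following` lines, kept only if one of those following lines matches `second`."""
    blocks = []
    i = 0
    while i < len(lines):
        block = lines[i:i + following + 1]
        if first.search(lines[i]) and any(second.search(l) for l in block[1:]):
            blocks.append(block)
            i += len(block)  # jump past the block so it is not reported twice
        else:
            i += 1
    return blocks


lines = text.strip().splitlines()
for block in find_blocks(lines):
    print("\n".join(block))
```

On the sample text from the question this prints lines 2 to 5 and lines 9 to 12, each block once. If the second keyword should also count when it sits on the same line as the first, check the whole block instead of block[1:].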