I have the following text file:
This is my text file
NUM,123
FRUIT
DRINK
FOOD,BACON
CAR
NUM,456
FRUIT
DRINK
FOOD,BURGER
CAR
NUM,789
FRUIT
DRINK
FOOD,SAUSAGE
CAR
NUM,012
FRUIT
DRINK
FOOD,MEATBALL
CAR
And I have the following list called ‘wanted’:
['123', '789']
What I’m trying to do is this: if the number after NUM is not in the list called ‘wanted’, then that line, along with the 4 lines below it, gets deleted. So the output file will look like:
This is my text file
NUM,123
FRUIT
DRINK
FOOD,BACON
CAR
NUM,789
FRUIT
DRINK
FOOD,SAUSAGE
CAR
My code so far is:
infile = open("inputfile.txt",'r')
data = infile.readlines()
for beginning_line, ube_line in enumerate(data):
    UNIT = data[beginning_line].split(',')[1]
    if UNIT not in wanted:
        del data_list[beginning_line:beginning_line+4]
You shouldn’t modify a list while you are looping over it: deleting items shifts the remaining ones to lower indexes, so the loop ends up skipping lines.
What you could try is to just advance the iterator on the file object when needed:
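Here is a minimal sketch of that idea. It assumes every NUM line is followed by exactly four lines that belong to it, as in your sample. The names filter_blocks, filter_file and outputfile.txt are made up for illustration.

```python
def filter_blocks(lines, wanted, lines_below=4):
    """Yield lines, dropping each NUM block whose number is not wanted."""
    wanted = set(wanted)          # set gives fast membership checks
    it = iter(lines)              # one shared iterator we can advance by hand
    for line in it:
        if line.startswith("NUM,"):
            number = line.rstrip("\n").split(",", 1)[1]
            if number not in wanted:
                # Advance past the lines that belong to this block.
                for _ in range(lines_below):
                    next(it, None)   # None default: no error at end of input
                continue             # drop the NUM line itself too
        yield line


def filter_file(in_path, out_path, wanted):
    """Stream in_path to out_path, keeping only the wanted blocks."""
    with open(in_path) as infile, open(out_path, "w") as outfile:
        outfile.writelines(filter_blocks(infile, wanted))
```

You would call it as filter_file("inputfile.txt", "outputfile.txt", ['123', '789']). Lines without a comma (FRUIT, DRINK, CAR) are never split, so they cannot raise an IndexError the way split(',')[1] does in your loop.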
Also, store ‘wanted’ in a set rather than a list. Membership tests on a set are faster, which matters when you check once per NUM line.
This approach doesn’t require reading the entire file into a list before processing it. It goes line by line: reading from the input file, advancing past unwanted blocks, and writing the kept lines to the new file. If you want, you can replace the outfile with a list that you append to.
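If you go for the list instead of an output file, it could look roughly like this. Again, this is only a sketch under the same assumption of four lines below each NUM line, and keep_wanted_lines is a hypothetical name.

```python
def keep_wanted_lines(lines, wanted, lines_below=4):
    """Return a list of the lines whose NUM block is wanted."""
    wanted = set(wanted)
    it = iter(lines)      # works for an open file or any iterable of lines
    kept = []             # collects output in place of the outfile
    for line in it:
        if line.startswith("NUM,") and line.strip().split(",", 1)[1] not in wanted:
            for _ in range(lines_below):
                next(it, None)   # skip the lines under the unwanted NUM
            continue
        kept.append(line)
    return kept
```

Passing an open file works the same way, for example keep_wanted_lines(open("inputfile.txt"), ['123', '789']). The input is still read line by line, but the kept lines now all sit in memory.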