Ok here is my existing code:
////////////// = []
for line in datafile:
    splitline = line.split()
    for item in splitline:
        if not item.endswith("JAX"):
            if item.startswith("STF") or item.startswith("BRACKER"):
                //////////.append( item )
for line in //////////
    print /////////////
    /////////// +=1
    for t in//////
        if t in line[:line.find(',')]:
            line = line.strip().split(',')
            ///////////////write(','.join(line[:3]) + '\n')
            break
/////////////.close()
/////////////close()
///////////.close()
I want to make a further optimization. The file is really large. I would like to delete lines from the big file once they have been matched and written to the small file, so that later searches through it take less time. Any suggestions on how I should go about this?
You cannot delete lines from a text file in place: doing so would require moving all the data after the deleted line up to fill the gap, which would be massively inefficient.
One way to do it is to write all the lines you want to keep from bigfile.txt to a temp file, and when you have finished processing, delete bigfile.txt and rename the temp file to replace it.
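A minimal sketch of the temp-file approach, written for Python 3 and assuming the same first-field substring match as the code in the question. The function name `extract_and_prune` and its parameters are hypothetical, not taken from the original code:

```python
import os
import tempfile

def extract_and_prune(big_path, small_path, tokens):
    """Append matched lines to small_path and rewrite big_path without them.

    Hypothetical helper: names and parameters are illustrative only.
    """
    directory = os.path.dirname(os.path.abspath(big_path))
    # Create the temp file beside the big file so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as keep, \
                open(big_path) as big, \
                open(small_path, "a") as small:
            for line in big:
                key = line[:line.find(",")]  # same slice as in the question
                if any(t in key for t in tokens):
                    # Matched: write the first three fields to the small file.
                    small.write(",".join(line.strip().split(",")[:3]) + "\n")
                else:
                    # Not matched: keep the line for the next pass.
                    keep.write(line)
        # Replace the big file with the pruned copy.
        os.replace(tmp_path, big_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Each call makes a single pass over the big file, so the cost of rewriting it is paid once per pass rather than once per deleted line.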
Alternatively, if bigfile.txt is small enough to fit in memory, you could read the entire file into a list, delete the matched lines from the list, then write the list back to disk.
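The in-memory variant could look roughly like this in Python 3. The function name `prune_in_memory` is hypothetical, and the sketch builds a new list of kept lines instead of deleting from the list while iterating over it, which avoids skipped elements:

```python
def prune_in_memory(big_path, small_path, tokens):
    """Hypothetical helper: only suitable when the file fits in memory."""
    with open(big_path) as big:
        lines = big.readlines()

    kept = []
    with open(small_path, "a") as small:
        for line in lines:
            key = line[:line.find(",")]  # same slice as in the question
            if any(t in key for t in tokens):
                # Matched: write the first three fields to the small file.
                small.write(",".join(line.strip().split(",")[:3]) + "\n")
            else:
                kept.append(line)

    # Overwrite the big file with only the unmatched lines.
    with open(big_path, "w") as big:
        big.writelines(kept)
    return len(lines) - len(kept)  # number of lines removed
```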
I would also guess from your code that bigfile.txt is some sort of CSV file. If so, it may be better to load it into a database and use SQL to query it. Python ships with the sqlite3 module in its standard library, and there are third-party libraries for most other databases.
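As a rough illustration of the database route, here is a Python 3 sketch using the standard `csv` and `sqlite3` modules. The table name `records`, the column names, and the helper functions are all assumptions made for the example. It also assumes an exact match on the first field; the substring test in the question would need `LIKE` or `instr()` instead, which cannot use the index as effectively:

```python
import csv
import sqlite3

def load_csv(conn, csv_path):
    """Load the first three columns of a CSV file into a hypothetical table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (key TEXT, col2 TEXT, col3 TEXT)"
    )
    with open(csv_path, newline="") as f:
        # Skip rows with fewer than three fields.
        rows = (row[:3] for row in csv.reader(f) if len(row) >= 3)
        conn.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
    # An index on the first column lets lookups avoid a full scan.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_records_key ON records (key)")
    conn.commit()

def lookup(conn, token):
    """Return all rows whose first field equals token exactly."""
    cur = conn.execute(
        "SELECT key, col2, col3 FROM records WHERE key = ?", (token,)
    )
    return cur.fetchall()
```

With the data loaded once, each lookup is an indexed query, so there is no need to shrink the source file between searches.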