I’ve just started programming, so I’m kind of a noob.
I’m trying to use Python to remove a column from a .txt table. All columns are separated by tabs.
This is an example line:
100226.SCO0401 1 440 COG0001 glutamate-1-semialdehyde 2,1-aminomutase
I want to remove all text in the line after the fourth tab (the “glutamate-1-semialdehyde 2,1-aminomutase” part).
I’ve seen some people importing csv to work around this issue, but I was thinking of something simple like:
    def remove(infilename, outfilename):
        # Open original file and output file
        infile = open(infilename, 'rt')
        outfile = open(outfilename, 'wt')
        # Read lines and remove annotation
        for line in infile:
            outfile.write(line['**everything-until-the-fourth-tab**'])
        # Close files
        infile.close()
        outfile.close()
The bold part is my issue right now. Any suggestions?
Thanks in advance.
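Use .split('\t') to split the entries in the row into a list. You can then slice the list with [:4], keeping only the first 4 elements. Finally, join it back up again with '\t'.join. Strip the trailing newline before splitting and add it back when writing, since it would otherwise be sliced away along with the last field. All of this also fits in a single expression.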
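The code that originally accompanied this answer is not preserved here, so what follows is only a minimal sketch of the split / slice / join approach. It assumes that every line ends with a newline and that the first four tab-separated fields are the ones to keep. The helper name keep_first_four is hypothetical and does not come from the question; remove keeps the question's signature.

```python
def keep_first_four(line):
    """Return the first four tab-separated fields of `line`, newline-terminated.

    Hypothetical helper, not part of the original question or answer.
    """
    # Drop the newline first so it is not lost (or kept twice) after slicing.
    fields = line.rstrip('\n').split('\t')
    return '\t'.join(fields[:4]) + '\n'


def remove(infilename, outfilename):
    # `with` closes both files even if an error occurs part-way through.
    with open(infilename, 'rt') as infile, open(outfilename, 'wt') as outfile:
        for line in infile:
            outfile.write(keep_first_four(line))
            # The same step as a single expression:
            # outfile.write('\t'.join(line.rstrip('\n').split('\t')[:4]) + '\n')
```

Lines with fewer than four fields pass through unchanged, because slicing a short list simply returns all of it.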