I have a text file (bowtie alignment file) that looks like this:
read_1 + 345995|PACid:16033981 599 AGTAGTAATCAGTCACCCGCAAGGTAGACAAGG qqqqqqqqqqqqqqqqqqqqq!!qqqqqqqqqq 0
read_2 + 949205|PACid:16054220 338 TACCAGCACTAATGCACCGGATCCCATCAGATC qqqqqqqqqqqqqqqqqqqqqqqqqqqqqq!!q 0 31:A>T
read_3 + 932004|PACid:16034380 1226 GGCACCTTATGAGAAATCAAAGTTTTTGGGTTC qqqqqqqqqqqqqqq!!qqqqqqqqqqqqq!!q 3
I want to subtract one from Column #4 (the position), and print each line with the updated value.
I can read the file, split each line on tabs, and identify Column #4 as data[3], but I am stuck on subtracting one from each value in Column #4 and printing all the fields of each line with the updated value.
How can I do this using Python?
I tried something like this:
in_file = open(sys.argv[1],'r')
out_file = open(sys.argv[2], 'w')
for line in in_file:
    data = line.rstrip().split('\t')
    position = int(float(data[3]) -1)
but I am not sure about how to proceed with printing the lines with updated position.
Use the csv module, informing it that your field delimiter is a tab. Then just convert the position to a number, subtract one, convert it back, and print the line.
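The approach described above could be sketched roughly as follows. This is only an illustration, not necessarily the code the answer originally showed: the function name shift_positions and its column and delta parameters are made up for the example, and it assumes the position in Column #4 is always a whole number. Quoting is switched off (an assumption on my part) so that quote characters appearing in the quality string are passed through untouched.

```python
import csv
import sys


def shift_positions(in_path, out_path, column=3, delta=-1):
    """Copy a tab-separated file, adding delta to the integer in one column.

    Hypothetical helper: column=3 is the fourth field (the position).
    """
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        # QUOTE_NONE: treat quote characters in the quality string as plain data.
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        writer = csv.writer(dst, delimiter="\t", quoting=csv.QUOTE_NONE,
                            quotechar=None, lineterminator="\n")
        for row in reader:
            if len(row) > column:  # leave blank or short lines unchanged
                row[column] = str(int(row[column]) + delta)
            writer.writerow(row)


# Run as: python script.py input.txt output.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    shift_positions(sys.argv[1], sys.argv[2])
```

Because each row comes back as a list of strings, only the one field is converted to int and back; every other field is written out exactly as it was read.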