I just started learning Python scripting yesterday and I’ve already gotten stuck. 🙁
So I have a data file with a lot of different information in various fields.
Formatted basically like…
Name (tab) Start# (tab) End# (tab) A bunch of fields I need but do not do anything with
Repeat
I need to write a script that takes the start and end numbers and adds or subtracts a number, depending on whether another field says + or -.
I know that I can replace words with something like this:
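    x = open("infile")
    y = open("outfile", "a")
    while 1:
        line = x.readline()  # read from x, the input file opened above
        if not line: break
        line = line.replace("blah", "blahblahblah")
        y.write(line)  # line already ends with "\n", so no extra newline is added
    x.close()
    y.close()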
But I’ve looked at all sorts of different places and I can’t figure out how to extract specific fields from each line, read one field, and change other fields. I read that you can read the lines into arrays, but can’t seem to find out how to do it.
Any help would be great!
EDIT:
Example of a line from the data here: (Each | represents a tab character)
           |          |
           V          V
chr21 | 33025905 | 33031813 | ENST00000449339.1 | 0 | **-** | 33031813 | 33031813 | 0 | 3 | 1835,294,104, | 0,4341,5804,
chr21 | 33036618 | 33036795 | ENST00000458922.1 | 0 | **+** | 33036795 | 33036795 | 0 | 1 | 177, | 0,
The second and third columns (indicated by arrows) would be the ones that I’d need to read/change.
You can use csv to do the splitting, although for these sorts of problems, I usually just use str.split. csv is nice if you want to split lines that contain quoted fields, but it doesn’t seem like you need that here.
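For what it's worth, here is a minimal sketch of the str.split approach applied to the sample lines in the question. It assumes the fields are tab-separated, that the start and end are the second and third fields, and that the strand (+ or -) is the sixth field. The names shift_coordinates and shift_file and the offset parameter are made up for illustration, and the rule that + adds the offset while - subtracts it is an assumption, since the question doesn't say which way round it should go.

```python
def shift_coordinates(line, offset):
    """Return `line` with its start and end fields shifted by `offset`.

    Assumption: "+" in the strand field adds the offset, "-" subtracts it.
    """
    # Drop the trailing newline, then split the line into a list of fields.
    fields = line.rstrip("\n").split("\t")

    strand = fields[5]
    if strand == "+":
        delta = offset
    elif strand == "-":
        delta = -offset
    else:
        raise ValueError("unexpected strand field: %r" % strand)

    # Fields are strings, so convert to int for the arithmetic and back again.
    fields[1] = str(int(fields[1]) + delta)
    fields[2] = str(int(fields[2]) + delta)

    # Re-join the fields with tabs and restore the newline.
    return "\t".join(fields) + "\n"


def shift_file(in_path, out_path, offset):
    """Copy in_path to out_path, shifting the coordinates on every line."""
    with open(in_path) as infile, open(out_path, "w") as outfile:
        for line in infile:
            if line.strip():  # skip blank lines
                outfile.write(shift_coordinates(line, offset))
```

You would call it with something like shift_file("infile", "outfile", 100). Looping over the file object with for reads one line at a time, so it replaces the while 1 / readline pattern, and the with block closes both files for you.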