I am trying to make some .bed files for genetic analysis. I am a Python beginner. The files I want to make should have 3 tab-separated columns: the first column is always the same (the chromosome number), and the 2nd and 3rd columns are windows of size 200, starting at zero and ending at the end of the chromosome. E.g.:
chr20 0 200
chr20 200 400
chr20 400 600
chr20 600 800
...
I have the size of the chromosome, so at the moment I am trying to say ‘while column 2 < (size of chrom), print line’. I have a skeleton of a script, but it is not quite working due to my lack of experience. Here is what I have so far:
output = open('/homw/genotyping/wholegenome/Chr20.bed', 'rw')
column2 = 0
column1 = 0
while column2 < 55268282:
    for line in output:
        column1 = column1 + 0
        column2 = column2 + 100
        print output >> "chr20" + '\t' + str(column1) + '\t' + str(column2)
If anyone can fix this simple script so that it does what I described, or write a better solution, that would be really appreciated. I considered making a script that could output all the files for the 20 chromosomes and chrX, but as I need to specify the size of each chromosome I think I’ll have to do each file separately.
Thanks in advance!
How about this:
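A minimal sketch of one way to do it, written for Python 3. The function name `write_bed_windows` and its parameters are made up for illustration, and clipping the last window to the chromosome size is my assumption about what "ending at end of chromosome" should mean:

```python
def write_bed_windows(path, chrom, chrom_size, window=200):
    """Write fixed-size windows covering 0..chrom_size as a 3-column BED file."""
    # 'with' closes the file when the block ends, even if an error occurs.
    with open(path, "w") as out:
        # range() yields the window starts: 0, 200, 400, ...
        for start in range(0, chrom_size, window):
            # Assumption: the final window is shortened so that it
            # stops exactly at the end of the chromosome.
            end = min(start + window, chrom_size)
            out.write("{0}\t{1}\t{2}\n".format(chrom, start, end))
```

For chromosome 20 you would call something like `write_bed_windows('Chr20.bed', 'chr20', 55268282)`, swapping in whatever output path you actually want.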
This gives tab-delimited output, as requested.
Note: using `with` will automatically close the file for you when you are done, or if an exception is encountered. The Python documentation on str.format() gives more information about the .format() function, in case you are curious.
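On the point about doing all the chromosomes in one go: needing a different size for each one does not force you to write each file separately, because the sizes can live in a dict. A rough sketch, again for Python 3. The name `write_all_beds`, the `out_dir` argument and the `<chrom>.bed` file naming are my own choices, and the only real size I know is the chr20 one from the question, so you would have to fill in the rest yourself:

```python
import os


def write_all_beds(chrom_sizes, out_dir, window=200):
    """Write one <chrom>.bed file per entry of a {name: size} dict."""
    for chrom, size in chrom_sizes.items():
        # Hypothetical naming scheme: chr20 -> <out_dir>/chr20.bed
        path = os.path.join(out_dir, chrom + ".bed")
        with open(path, "w") as out:
            for start in range(0, size, window):
                # Assumption: the last window is clipped to the chromosome end.
                end = min(start + window, size)
                out.write("{0}\t{1}\t{2}\n".format(chrom, start, end))
```

You could then call `write_all_beds({'chr20': 55268282}, '.')` and add the other chromosomes and chrX to the dict once you have their sizes.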