I have a problem using the split command.
The input string is as follows:
080821_HWI-EAS301_0002_30ALBAAXX:1:8:1649:2027 83 chr10 42038185 255 36M = 42037995 -225 GCCAGGTTTAATAAATTATTTATAGAATACTGCATC @?DDEAEFDAD@FBG@CDA?DBCDEECD@D?CBA>A NM:i:0 MD:Z:36
I want to grab ‘2027’ from this string.
My command is: line.split(':',4)[1].split()[0]
However, it doesn’t work. The output is ‘1’.
Then I switched to line.split(':',4)
and the output is still ‘1’, so I see that the first split step is already the problem.
However, when I try line.split(':',1), I get the expected result:
1:8:1649:2027 83 chr10 42038185 255 36M = 42037995 -225 GCCAGGTTTAATAAATTATTTATAGAATACTGCATC @?DDEAEFDAD@FBG@CDA?DBCDEECD@D?CBA>A NM:i:0 MD:Z:36
I’m confused by this split command! (I asked a similar question before, and the split command worked that time.)
Thanks
It appears that what you want is line.split(':',4)[4].split()[0]
The numeric parameter to split indicates the maximum number of splits that will occur. So you have:
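As a minimal sketch with the input from the question (the variable names `line` and `parts` are assumptions for illustration), a limit of 4 yields at most five elements, and everything after the fourth colon stays in the last one:

```python
# Input string from the question, split across literals for readability.
line = ("080821_HWI-EAS301_0002_30ALBAAXX:1:8:1649:2027 83 chr10 42038185 255 36M = "
        "42037995 -225 GCCAGGTTTAATAAATTATTTATAGAATACTGCATC "
        "@?DDEAEFDAD@FBG@CDA?DBCDEECD@D?CBA>A NM:i:0 MD:Z:36")

# At most 4 splits, so at most 5 elements come back.
parts = line.split(':', 4)
for i, part in enumerate(parts):
    print(i, repr(part))
```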
If you pull element [1] out of this return value, you get ‘1’. I don’t see why you are surprised by this.
Since you are allowing up to 4 splits, and the item you want will be the last one, the subscript you want is [4]:
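For instance (a minimal sketch; `line` and `tail` are assumed variable names, with `line` holding the input string from the question):

```python
line = ("080821_HWI-EAS301_0002_30ALBAAXX:1:8:1649:2027 83 chr10 42038185 255 36M = "
        "42037995 -225 GCCAGGTTTAATAAATTATTTATAGAATACTGCATC "
        "@?DDEAEFDAD@FBG@CDA?DBCDEECD@D?CBA>A NM:i:0 MD:Z:36")

# Element [4] is the remainder of the line after the fourth colon.
tail = line.split(':', 4)[4]
print(tail)
```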
Then you can split that on space and get element [0] from it to produce your result.
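Put together, the whole chain might look like this (a sketch under the same assumption that `line` holds the input string; `value` is a name chosen here for illustration):

```python
line = ("080821_HWI-EAS301_0002_30ALBAAXX:1:8:1649:2027 83 chr10 42038185 255 36M = "
        "42037995 -225 GCCAGGTTTAATAAATTATTTATAGAATACTGCATC "
        "@?DDEAEFDAD@FBG@CDA?DBCDEECD@D?CBA>A NM:i:0 MD:Z:36")

# Take the text after the fourth colon, then split on whitespace
# and keep the first field.
value = line.split(':', 4)[4].split()[0]
print(value)  # 2027
```

Note that the result is the string '2027'; wrap it in int() if a number is needed.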
You get the same final result if you don’t pass a split limit value at all, because element [4] still begins with ‘2027’:
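A sketch of the unlimited version (again assuming `line` holds the input string; `parts` and `value` are illustrative names). The list is longer here, because the trailing NM:i:0 and MD:Z:36 fields also contain colons, but the first whitespace-separated field of element [4] is unchanged:

```python
line = ("080821_HWI-EAS301_0002_30ALBAAXX:1:8:1649:2027 83 chr10 42038185 255 36M = "
        "42037995 -225 GCCAGGTTTAATAAATTATTTATAGAATACTGCATC "
        "@?DDEAEFDAD@FBG@CDA?DBCDEECD@D?CBA>A NM:i:0 MD:Z:36")

# No limit: every colon is a split point, including those in the tags.
parts = line.split(':')
print(len(parts))  # 9

value = parts[4].split()[0]
print(value)  # 2027
```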