So I have file that with multiple line that look like this (space delimiter file):
A1BG P04217 VAR_018369 p.His52Arg Polymorphism rs893184 -
A1BG P04217 VAR_018370 p.His395Arg Polymorphism rs2241788 -
AAAS Q9NRG9 VAR_012804 p.Gln15Lys Disease - Achalasia
How do I make dictionary to look for id in second column and store the number (between words) on fourth column.
I tried this but it give me index of out range
lookup = defaultdict(list)
with open ('humsavar.txt', 'r') as humsavarTxt:
for line in csv.reader(humsavarTxt):
code = re.match('[a-z](\d+)[a-z]', line[1], re.I)
if code:
lookup[line[-2]].append(code.group(1))
print lookup['P04217']
Here’s a variant of the original code:
which produces