I am quite new to Python. I want to match strings in some lines of a file. Let's say
I have these strings:
british 7
German 8
France 90
And I have some lines in a file like this:
<s id="69-7">...Meanwhile is the studio 7 album by British pop band 10cc.</s>
<s id="15-8">...And Then There Were Three... is the ninth studio album by the german band Genesis 8 and was released in 1978.</s>
<s id="1990-2">Magnum Nitro Express is a France centerfire fire rifle cartridge 90.</s>
I want to get output like this:
<s id="69-7">...Meanwhile is the studio <w2>7</w2> album by <w1>British</w1> pop band 10cc.</s>
<s id="15-8">...And Then There Were Three... is the ninth studio album by the <w1>german</w1> band Genesis <w2>8</w2> and was released in 1978.</s>
<s id="1990-2">Magnum Nitro Express is a <w1>France</w1> centerfire fire rifle cartridge <w2>90</w2>.</s>
I tried the following code:
for i in file:
    if left in i and right in i:
        line = i.replace(left, '<w1>' + left + '</w1>')
        lineR = line.replace(right, '<w2>' + right + '</w2>')
        text = text + lineR + "\n"
        continue
return text
But it also matches the string inside the id, e.g.:
<s id="69-<w2>7</w2>">...Meanwhile is the studio <w2>7</w2> album by <w1>British</w1> pop band 10cc.</s>
So, is there any way to search for the strings as whole words rather than as characters, so that I can avoid <s id="69-<w2>7</w2>">?
Thanks in advance for any kind of help.
You should use regular expressions to specifically replace only individual words, not word parts.
Something like
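A minimal sketch of this idea, assuming the search terms should match case-insensitively and only when they are not attached to a word character or a hyphen (so the 7 in id="69-7" is skipped). The helper name tag_word is illustrative:

```python
import re

def tag_word(line, word, tag):
    # Match `word` only when it is not glued to a word character or a hyphen,
    # so the "7" in id="69-7" and the "8" in "1978" are left alone.
    pattern = r'(?<![\w-])' + re.escape(word) + r'(?![\w-])'
    # m.group(0) keeps the casing found in the line ("British", not "british").
    return re.sub(pattern,
                  lambda m: '<{0}>{1}</{0}>'.format(tag, m.group(0)),
                  line, flags=re.IGNORECASE)

line = '<s id="69-7">...Meanwhile is the studio 7 album by British pop band 10cc.</s>'
result = tag_word(tag_word(line, 'british', 'w1'), '7', 'w2')
print(repr(result))
```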
which gives us
'<s id="69-7">...Meanwhile is the studio <w2>7</w2> album by <w1>British</w1> pop band 10cc.</s>'
If this approach leads to errors, you can try more refined code.
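One possible refinement, sketched under the assumption that everything between < and > is markup that must stay untouched: split the line on tags and apply a whole-word, case-insensitive replacement only to the text between them. The names TAG_RE and tag_outside_markup are illustrative.

```python
import re

# The capturing group makes re.split keep the tags in the result list.
TAG_RE = re.compile(r'(<[^>]*>)')

def tag_outside_markup(line, left, right):
    # One pass over the text: the named group that matched tells us which tag to use.
    words = re.compile(
        r'\b(?:(?P<w1>%s)|(?P<w2>%s))\b' % (re.escape(left), re.escape(right)),
        re.IGNORECASE)

    def wrap(m):
        tag = m.lastgroup  # 'w1' or 'w2'
        return '<%s>%s</%s>' % (tag, m.group(0), tag)

    pieces = TAG_RE.split(line)
    # Pieces starting with '<' are tags such as <s id="69-7"> and are copied as-is.
    return ''.join(p if p.startswith('<') else words.sub(wrap, p) for p in pieces)

line = '<s id="1990-2">Magnum Nitro Express is a France centerfire fire rifle cartridge 90.</s>'
print(tag_outside_markup(line, 'France', '90'))
```

Because attributes are never searched, this version does not depend on which characters surround the number inside the id.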