Hi, I have been trying to get this working for some time now, with no result yet.
I have a dict = {'Å':'a', 'Ä':'a', 'Ö':'0', 'å':'a', 'ä':'a', 'ö':'o'}
input = lxml.etree.parse(inputxml)
for block in input.xpath('//PAGE/BLOCK/TEXT'):
    J = block.xpath('TOKEN/text()')
    current = 0
    line = ""
    while current < len(J):
        A = J[current]
        current += 1
I need to scan A against the dict, find the non-English letters, and replace each one with the corresponding English letter:
        for i in A:
            if dict.has_key(i):
                ReplaceWord = A.replace(i, dict[i])
But this is not working.
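One likely cause is visible in the loop itself: `A.replace(...)` returns a new string and leaves `A` unchanged, so every pass starts again from the original `A` and `ReplaceWord` ends up holding only the last single replacement (and is never assigned at all when nothing matches). If this is Python 2, the dict keys may also be byte strings while lxml hands back unicode text for non-ASCII tokens, in which case the lookup would not match; that part is a guess from the code shown. A minimal sketch of one way around both issues, assuming Python 3 where all strings are Unicode, is to build a translation table once and apply it to each token. The names `replacements`, `to_ascii`, and the sample tokens are made up for illustration.

```python
# Mapping copied from the question (note that 'Ö' maps to the digit '0' there).
replacements = {'Å': 'a', 'Ä': 'a', 'Ö': '0', 'å': 'a', 'ä': 'a', 'ö': 'o'}

# str.maketrans accepts a dict whose keys are single characters
# and returns a table usable with str.translate.
table = str.maketrans(replacements)


def to_ascii(token):
    """Return a copy of token with every mapped character replaced."""
    return token.translate(table)


# Hypothetical tokens standing in for the TOKEN/text() results.
tokens = ['smörgås', 'Ärlig', 'plain']
cleaned = [to_ascii(t) for t in tokens]
print(cleaned)
```

In the original loop the same call would replace the inner `for i in A` block, for example `A = to_ascii(J[current])`. Staying with `replace` also works if the result is assigned back to `A` on every match instead of to a separate variable.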
Not what you asked about, but it looks like you might be interested in it: Unidecode is a module specifically intended to reduce any series of characters to the most similar ASCII characters.
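Unidecode is a third-party package, so as a rough illustration of the same idea using only the standard library, the sketch below decomposes each character with `unicodedata.normalize` and drops the combining marks. This is not Unidecode and not its algorithm; the function name `strip_accents` is made up for the example, and Python 3 is assumed.

```python
import unicodedata


def strip_accents(text):
    """Remove diacritics by decomposing characters and dropping combining marks."""
    # NFKD splits 'Å' into 'A' followed by a combining ring, and so on.
    decomposed = unicodedata.normalize('NFKD', text)
    # unicodedata.combining() is non-zero for combining marks, which are skipped.
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents('Åke Öberg åt smörgås'))
```

Unlike the dict in the question, this keeps case ('Å' becomes 'A', not 'a'), and it leaves letters that have no decomposition, such as 'ø', untouched. A transliteration library is the better fit when those cases matter.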