I am trying to find an XML element within an SVG (font) file based on the content of an attribute, like so:
font = et.ElementTree(file='fontfile.svg')
glyph = font.find('//n:glyph[@unicode="%s"]' % symbol, namespaces={'n': SVGNS})
Glyph examples — what I’m trying to match to — are:
<glyph unicode="©" horiz-adv-x="1792" d="M834 ... -40t-121 -18z " />
<glyph unicode="C" horiz-adv-x="1509" d="M1766 338q-49 ... 83.5v-215z" />
The problem is that when, for example,
symbol = "C"
it works fine (there is a match), but when
symbol = "©"
it doesn’t. I suspect that the entity is being interpreted as unicode on one side of the match but not the other. What is the correct way to resolve this?
Building on unutbu’s answer: when you call ET.fromstring, the parser translates character entities into unicode objects in the attribute values. So, at the end of the day, the entity for © no longer exists as such in font; to search for it, the search string needs to be converted into unicode first. Some ways to do that are explained here.
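As a rough illustration of that conversion, here is a minimal sketch. It assumes Python 3 (where html.unescape is available) and that the symbol arrives as entity text such as "&#xa9;". The inline SVG sample and the find_glyph helper are made up for this example and are not from the original code. It compares attribute values directly rather than building an XPath predicate, which sidesteps quoting problems if the symbol itself is a quote character.

```python
import html
import xml.etree.ElementTree as ET

SVGNS = "http://www.w3.org/2000/svg"

# Made-up stand-in for fontfile.svg, small enough to run inline.
SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg"><defs><font>'
    '<glyph unicode="&#xa9;" horiz-adv-x="1792" d="M0 0z" />'
    '<glyph unicode="C" horiz-adv-x="1509" d="M0 0z" />'
    '</font></defs></svg>'
)


def find_glyph(root, symbol):
    """Hypothetical helper: return the first glyph whose unicode attribute
    matches symbol, where symbol may be entity text or a literal character."""
    # Decode entity text the same way the parser decoded the attribute.
    char = html.unescape(symbol)
    for glyph in root.iter("{%s}glyph" % SVGNS):
        if glyph.get("unicode") == char:
            return glyph
    return None


root = ET.fromstring(SVG)
copyright_glyph = find_glyph(root, "&#xa9;")  # entity text, decoded before comparing
letter_glyph = find_glyph(root, "C")          # plain character, unchanged by unescape
```

After parsing, the attribute holds the decoded character, so both the entity text and the literal "©" find the same element once the search string has gone through html.unescape.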