Some background: In Devanagari fonts, the same character code can be displayed as a different character depending on the font. In one font, code ‘10’ may be displayed as ‘A’, and in another font code ‘10’ may be displayed as ‘B’. So if I select text and change the font, the characters change as well. Interestingly, the same happens for keys on the keyboard: pressing ‘A’ in one font will display ‘A’, and pressing ‘A’ in another will display ‘B’.
What I am trying to do is identify the font used in some text pasted into my software, so that when the font is changed I can programmatically change the character codes so the text means the same thing in the new font.
Any pointers on how to go about this?
For non-Unicode fonts, the only way to accomplish this is to know each font’s mapping. That is problematic: even though there is a non-Unicode encoding standard, many Devanagari/Hindi fonts ignore it or make modifications and additions, resulting in the case you describe above (e.g. key ‘A’ in one font may correspond to the shape ‘म’ while ‘A’ in another font has the shape of ‘क’; these are just theoretical examples).
There exists at least one conversion tool that may help you, but ultimately it comes down to translating the input font-specific coding into the output font-specific coding. If you had a font-specific map to Unicode for each of the font-specific mappings, you could use Unicode as an intermediary and convert to or from any of your fonts’ schemes: MyFont code → Unicode → MyOtherFont code.
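A minimal sketch of the Unicode-as-intermediary idea, in Python. The font names, the two mapping tables, and the helper names (`to_unicode`, `from_unicode`, `convert`) are all hypothetical and only reuse the theoretical ‘A’/‘म’/‘क’ example from above; real tables would have to be built per font. It also assumes a one-to-one mapping between codes and Unicode characters, which is a simplification, since legacy Devanagari fonts often need multi-character sequences and reordering.

```python
# Hypothetical per-font tables: font-specific character code -> Unicode.
# These entries are illustrative only, not taken from any real font.
FONT_MAPS = {
    "MyFont": {"A": "म", "B": "क"},
    "MyOtherFont": {"A": "क", "B": "म"},
}


def to_unicode(text, font):
    """Translate font-specific codes into Unicode; unmapped characters pass through."""
    table = FONT_MAPS[font]
    return "".join(table.get(ch, ch) for ch in text)


def from_unicode(text, font):
    """Translate Unicode back into a font's codes; unmapped characters pass through."""
    # Invert the table; this assumes each Unicode character appears only once.
    reverse = {uni: code for code, uni in FONT_MAPS[font].items()}
    return "".join(reverse.get(ch, ch) for ch in text)


def convert(text, src_font, dst_font):
    """Convert text between two font encodings, using Unicode as the intermediary."""
    return from_unicode(to_unicode(text, src_font), dst_font)
```

For example, `convert("AB", "MyFont", "MyOtherFont")` first turns the codes into "मक" and then looks up which codes produce those shapes in the target font. Going through Unicode means each new font needs only one table, rather than one table per pair of fonts.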
With a full database of these mappings you could translate text set in “MyFont” to text for “MyOtherFont” pretty easily.
Of course, the best alternative would be to convert both the fonts and the text to Unicode, though that may not be possible.