Is anyone aware of a way to add diacritics from different unicode blocks to say, latin letters (or latin diacritics to say, Devanagari letters)? For instance:
Oै
I tried the zero-width-joiner in between, but it had no effect. Any ideas?
I know, for instance, that the Arabic combining diacritics will work on latin letters, but Hebrew will not. Is this random?
Accoding to the Unicode Standard, Chapter 2, Section 2.11, “All combining characters can be applied to any base character and can, in principle, be used with any script.” So the Latin letter O followed by the Devanagari vowel sign ai U+0948 is permitted. But the standard adds: “This does not create an obligation on implementations to support all possible combinations equally well. Thus, while application of an Arabic annotation mark to a Han character or a Devanagari consonant is permitted, it is unlikely to be supported well in rendering or to make much sense.”
So it is up to implementations. But there are some “cross-script” diacritics. For example, the acute accent has been unified with the Greek tonos mark, so the Latin letter é and the Greek letter έ, when decomposed, contain the same diacritic U+0301. Moreover, this combining mark can be placed after a Cyrillic letter, and this can be regarded as normal (though relatively rare) usage, so we can expect good implementations to render it properly.