I have a piece of Java code that checks whether a character lies between two Unicode characters:
LA(2) >= '\u0003' && LA(2) <= '\u00ff'
I understand that \u0003 represents END OF TEXT and \u00ff is LATIN SMALL LETTER Y WITH DIAERESIS, but what lies between these points? In other words, what kind of character is LA(2) being tested for?
For example, is it all Latin characters, digits, accented characters, all ASCII characters, or something else?
It's Latin-1 (the first 256 Unicode code points, U+0000 to U+00FF) minus the code points U+0000, U+0001 and U+0002. This includes all the printable ASCII characters found on a US keyboard, plenty of control characters (below U+0020 and from U+007F to U+009F), and the symbols and accented Latin letters from U+00A0 to U+00FF, which are enough to write the majority of Western European languages.
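To see the breakdown concretely, here is a minimal sketch that applies the same comparison to a plain int code point and labels the sub-ranges. The class name `Latin1RangeDemo` and the methods `inRange` and `describe` are made up for illustration; they stand in for the parser's `LA(2)` lookahead, which is not reproduced here.

```java
public class Latin1RangeDemo {

    // Same test as in the question, applied to an int code point
    // instead of the parser's LA(2) lookahead (hypothetical helper).
    public static boolean inRange(int c) {
        return c >= '\u0003' && c <= '\u00ff';
    }

    // Rough label for where a code point in U+0000..U+00FF falls.
    public static String describe(int c) {
        if (c < 0 || c > 0xFF) return "outside Latin-1";
        if (c <= 0x1F) return "C0 control";          // U+0000..U+001F
        if (c <= 0x7E) return "printable ASCII";     // U+0020..U+007E
        if (c == 0x7F) return "DEL control";         // U+007F
        if (c <= 0x9F) return "C1 control";          // U+0080..U+009F
        return "Latin-1 Supplement";                 // U+00A0..U+00FF
    }

    public static void main(String[] args) {
        int accepted = 0;
        for (int c = 0; c <= 0xFF; c++) {
            if (inRange(c)) accepted++;
        }
        // Only U+0000, U+0001 and U+0002 are excluded from the 256 code points.
        System.out.println(accepted); // prints 253

        System.out.println(describe('A'));    // printable ASCII
        System.out.println(describe(0x03));   // C0 control (END OF TEXT)
        System.out.println(describe(0xE9));   // Latin-1 Supplement (e with acute)
    }
}
```

So the check accepts essentially any single-byte Latin-1 character, control characters included, and rejects only the first three code points and everything above U+00FF.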