Is there any way to detect the language of the text in a PDF document?
Example:
Let’s say I have a PDF document in a language unknown to me. Is there any tool that can automatically detect the document’s language and store (or echo) the language name in a file?
Regards,
Volodymyr
OK, I’ve found a few useful links, which is better than nothing:
C# example: http://www.eggheadcafe.com/community/csharp/2/10351962/how-to-recogonise-that-data-written-in-pdf-or-doc–is-english-or-not.aspx
Java: http://www.slideshare.net/shuyo/language-detection-library-for-java
Online (web): http://whatlanguageisthis.com/
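For anyone who wants to script this, here is a rough Python sketch of the general idea: extract the text, count common stopwords per language, pick the best match, and write the name to a file. It is not taken from any of the links above. The stopword lists are tiny illustrative samples, the function names are made up for this example, and the extraction step assumes the `pdftotext` command-line tool (from Poppler) is installed and that the PDF actually contains a text layer. A dedicated language detection library would be far more reliable on short or mixed-language documents.

```python
import re
import subprocess
from collections import Counter

# Tiny illustrative stopword samples (an assumption, nowhere near complete).
STOPWORDS = {
    "English": {"the", "and", "of", "to", "in", "is", "that", "it", "for", "with"},
    "German": {"der", "die", "und", "das", "ist", "nicht", "ein", "zu", "mit", "den"},
    "French": {"le", "la", "les", "et", "des", "est", "une", "que", "dans", "pour"},
    "Spanish": {"el", "la", "los", "y", "de", "que", "es", "en", "una", "por"},
}


def extract_text(pdf_path):
    """Return the PDF's text, assuming the external pdftotext tool is on PATH."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],  # "-" sends text to stdout
        capture_output=True, text=True, encoding="utf-8", check=True,
    )
    return result.stdout


def detect_language(text):
    """Guess the language by counting how often each stopword sample occurs."""
    words = re.findall(r"[^\W\d_]+", text.lower())  # runs of letters only
    counts = Counter(words)
    scores = {
        lang: sum(counts[word] for word in sample)
        for lang, sample in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def write_language(pdf_path, out_path):
    """Detect the language of pdf_path and store its name in out_path."""
    language = detect_language(extract_text(pdf_path))
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(language + "\n")
    return language


# Hypothetical usage (file names are placeholders):
# print(write_language("document.pdf", "language.txt"))
```

Stopword counting only works for the languages you list, so a document in any other language comes back as the closest listed match or "unknown". Scanned PDFs without a text layer would need OCR first.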
Thanks!