Is there a library that can determine whether a given string of characters contains a “real sentence” in English, meaning that it is made up of English words? (The sentence need not make sense, but it should contain real English words.)
For example, the following is not a sentence (at least not in English):
hsgdhjf asdf dsusdf udfhpiew
This is an unsolved problem, as computers have no idea of what “makes sense”. Even if a program tries to parse a sentence by detecting nouns, verbs, etc., there are still phrases like “colorless green ideas sleep furiously” or “Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo” that would get through. I doubt many people would say those are sentences.
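There are also multiple ways of parsing sentences. For example, the first half of “Time flies like an arrow; fruit flies like a banana” can be parsed with “flies” as the verb and “like” as a preposition (time passes the way an arrow does), or with “time flies” as a noun phrase and “like” as the verb (insects called time flies are fond of an arrow), to take just two ways.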
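To make the ambiguity concrete, here is a minimal sketch that counts the parse trees a toy grammar assigns to “time flies like an arrow”. The grammar, the lexicon, and the name `count_parses` are all made up for illustration; this is a plain CYK-style chart over a hand-written grammar, not how any particular NLP library works.

```python
from collections import defaultdict

# Hypothetical toy lexicon: each word maps to the categories it may have.
LEXICON = {
    "time": {"NP", "N"},
    "flies": {"N", "V"},
    "like": {"V", "P"},
    "an": {"Det"},
    "arrow": {"N"},
}

# Hypothetical binary rules, written as (parent, left child, right child).
RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "N", "N"),
    ("VP", "V", "NP"),
    ("VP", "V", "PP"),
    ("PP", "P", "NP"),
]


def count_parses(words, start="S"):
    """Count the distinct parse trees for `words` under the toy grammar."""
    n = len(words)
    # chart[i, j, symbol] = number of trees with root `symbol` over words[i:j]
    chart = defaultdict(int)
    for i, word in enumerate(words):
        for symbol in LEXICON.get(word, ()):
            chart[i, i + 1, symbol] += 1
    # Build longer spans from every split point of every shorter span.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in RULES:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, start]
```

Calling `count_parses("time flies like an arrow".split())` gives 2 under this grammar: one tree with “flies” as the verb and one with “like” as the verb. A realistic grammar would license many more readings, which is part of why parsing is hard.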
The bottom line: parsing natural language is hard, and making sense of it is even harder.
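That said, if the requirement is only the looser one from the question (the string should consist of real English words, whether or not it makes sense), a plain dictionary lookup may be enough. Below is a rough sketch under that assumption. `SAMPLE_VOCAB` is a tiny stand-in vocabulary invented for the example, and the function names and the 0.8 threshold are arbitrary choices, not anything from an existing library.

```python
import re

# Tiny stand-in vocabulary (an assumption for illustration only). A real check
# would load a full English word list instead.
SAMPLE_VOCAB = {
    "colorless", "green", "ideas", "sleep", "furiously",
    "time", "flies", "like", "an", "arrow", "fruit", "a", "banana",
}


def english_word_ratio(text, vocab=SAMPLE_VOCAB):
    """Return the fraction of word tokens in `text` that appear in `vocab`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in vocab)
    return known / len(tokens)


def looks_like_english(text, vocab=SAMPLE_VOCAB, threshold=0.8):
    """Heuristic: treat text as English if enough of its tokens are known words."""
    return english_word_ratio(text, vocab) >= threshold
```

With this sketch, “hsgdhjf asdf dsusdf udfhpiew” scores 0.0 and is rejected, while “colorless green ideas sleep furiously” passes. That is exactly the limitation described above: the check says nothing about whether the words form a sensible sentence.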