I’m looking to assign some different readability scores to text in R, such as the Flesch-Kincaid.
Does anyone know of a way to segment words into syllables using R? I don’t necessarily need the syllable segments themselves, just a count.
So, for instance:
x <- c('dog', 'cat', 'pony', 'cracker', 'shoe', 'Popsicle')
would yield:
1, 1, 2, 2, 1, 3
Each number corresponds to the number of syllables in the word.
Some tools for NLP are available here:
http://cran.r-project.org/web/views/NaturalLanguageProcessing.html
The task is non-trivial though. More hints (including an algorithm you could implement) here:
Detecting syllables in a word
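As a rough starting point, here is a minimal sketch of one common heuristic: count runs of consecutive vowels, after dropping a silent trailing "e". This is not the algorithm from the linked question, and `count_syllables` is just a name made up for illustration. It assumes English words in ASCII letters and will miscount plenty of irregular words, so treat it as an approximation rather than a real syllabifier.

```r
# Heuristic syllable counter (an approximation, not a dictionary lookup).
# Assumes English words; anything that is not a-z is discarded.
count_syllables <- function(words) {
  # Normalise: lower-case and keep ASCII letters only
  stripped <- gsub("[^a-z]", "", tolower(words))

  # Drop a silent trailing "e", but keep it in consonant + "le" endings
  # (e.g. "popsicle"), where the final "le" forms its own syllable
  core <- sub("(?<![^aeiouy]l)e$", "", stripped, perl = TRUE)

  # Each run of vowels (treating "y" as a vowel) counts as one syllable
  n <- lengths(regmatches(core, gregexpr("[aeiouy]+", core)))

  # Any non-empty word has at least one syllable; empty input gets 0
  ifelse(nchar(stripped) == 0, 0L, pmax(n, 1L))
}

x <- c('dog', 'cat', 'pony', 'cracker', 'shoe', 'Popsicle')
count_syllables(x)
# [1] 1 1 2 2 1 3
```

For the example vector this happens to reproduce the counts in the question, but for anything like a real readability score it would be worth checking the results against a pronunciation dictionary.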