Given a text, how could I get the count / density of word lengths, so that I get an output like this:
- 1 letter words : 52 / 1%
- 2 letter words : 34 / 0.5%
- 3 letter words : 67 / 2%
I found this, but it's for Python.
You could start by splitting your text into words, using either `explode()` (as a very/too simple solution) or `preg_split()` (allows for stuff that's a bit more powerful).

Then, iterate over the words, getting, for each one of those, its length, using `strlen()`, and putting those lengths into an array, keyed by length.
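
As a rough sketch of those two steps (the sample text, the regular expression, and the variable names are assumptions made for this example, not part of the question):

```php
<?php
// Sample text, made up for this example.
$text = "The quick brown fox jumps over the lazy dog";

// Split on runs of non-letter characters.
// PREG_SPLIT_NO_EMPTY drops the empty pieces that leading or trailing
// punctuation would otherwise produce.
$words = preg_split('/[^a-z]+/i', $text, -1, PREG_SPLIT_NO_EMPTY);

// $results maps a word length to the number of words having that length.
$results = [];
foreach ($words as $word) {
    $length = strlen($word);
    if (!isset($results[$length])) {
        $results[$length] = 0;
    }
    $results[$length]++;
}

// Sort by word length so the shortest words come first.
ksort($results);
```

What counts as a "word" depends on the pattern: this one treats digits, apostrophes, and hyphens as separators, which may or may not be what you want.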
If you're working with UTF-8, see `mb_strlen()`.

At the end of that loop, `$results` maps each word length to the number of words that have that length.
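
For UTF-8 text, a variant could look something like this. It assumes the mbstring extension is available, PHP 7 or later (for `??`), and a source file saved as UTF-8; the sample sentence is made up:

```php
<?php
// Sample text with accented letters, made up for this example.
$text = "Le café était très agréable";

// \p{L} matches any Unicode letter; the "u" modifier makes the
// pattern treat the subject as UTF-8.
$words = preg_split('/[^\p{L}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

$results = [];
foreach ($words as $word) {
    // mb_strlen() counts characters, whereas strlen() counts bytes,
    // so "café" is 4 here but would be 5 with strlen().
    $length = mb_strlen($word, 'UTF-8');
    $results[$length] = ($results[$length] ?? 0) + 1;
}
ksort($results);

print_r($results);
```

For that sentence, `$results` ends up with one 2-letter word, two 4-letter words, one 5-letter word, and one 8-letter word:

```
Array
(
    [2] => 1
    [4] => 2
    [5] => 1
    [8] => 1
)
```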
The total number of words, which you'll need to calculate the percentage, can be found either:
- by incrementing a counter inside the `foreach` loop,
- or by calling `array_sum()` on `$results` after the loop is done.

And for the percentages' calculation, it's a bit of maths; I won't be that helpful about that ^^
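
For what it's worth, the percentage part could be sketched like this. The `$results` values below are hypothetical, and the one-decimal rounding and the output format (copied from the question) are just one possible choice:

```php
<?php
// Hypothetical counts (word length => number of words),
// in the shape produced by the counting loop.
$results = [3 => 4, 4 => 2, 5 => 3];

// Total number of words: the sum of all the per-length counts.
$total = array_sum($results);

$lines = [];
foreach ($results as $length => $count) {
    // Share of words with this length, as a percentage with one decimal.
    $percentage = number_format($count / $total * 100, 1);
    $lines[] = sprintf('- %d letter words : %d / %s%%', $length, $count, $percentage);
}

echo implode("\n", $lines), "\n";
```

With those counts, this prints:

```
- 3 letter words : 4 / 44.4%
- 4 letter words : 2 / 22.2%
- 5 letter words : 3 / 33.3%
```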