I created a word counter in Ruby as a little exercise in learning Ruby.
I’ve used the word counters on JavaScriptKit.com and WordCountTool.com as well as the one in OpenOffice Writer.
The same text produced the following results:
OpenOffice: 458 words
JavaScriptKit: 453 words
WordCountTool: 455 words
Mine: 461 words
My question is this: Why do the counts differ for the same exact excerpt across all counters?
What are problems in a script that might cause an inaccurate, but still close count?
What are some ways I could improve upon my script so that it’s more accurate?
You’re really asking for a definition of a “word”, which for counting purposes could mean very different things. Let’s take your original post as an example.
The most simplistic counting tool would just split the text on whitespace and count the pieces.
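As a rough sketch in Ruby, assuming a word is just any run of non-whitespace characters (the method name `naive_word_count` is made up for illustration and is not taken from your script):

```ruby
# Simplest possible definition: a word is any run of non-whitespace characters.
# String#split with no arguments splits on whitespace and drops leading,
# trailing, and repeated whitespace.
def naive_word_count(text)
  text.split.length
end

sample = "Why do the counts differ for the same exact excerpt across all counters?"
puts naive_word_count(sample) # prints 13
```

Note that punctuation stays attached to its neighbour here, so “counters?” is one word and so is anything joined by a slash or a full stop.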
Yet what if you had put "Why do the counts differ/change for the same[...]"? Well, clearly “differ/change” is two words, so we should probably count forward slashes as word delimiters. In fact, just because I forgot to put a space between a full stop and the next word doesn’t make them the same word, so let’s include full stops as delimiters too. Yet I can’t be bothered to check whether something is a URL, so those websites you mention will have to count as two words each.
OK, that’s cool, but numbers are not technically words, and if they were spoken, 458 would be “four hundred and fifty eight”, which is actually 5 words. So let’s discount them too.
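Those two refinements might look something like this in Ruby. It is only a sketch under the assumptions described above: whitespace, forward slashes, and full stops are delimiters, and a “number” is any token made up entirely of digits. The names `split_words` and `word_count` and the `skip_numbers` option are hypothetical.

```ruby
# Split on runs of whitespace, forward slashes, and full stops.
# Assumption: no attempt is made to recognise URLs or abbreviations.
def split_words(text)
  text.split(%r{[\s/.]+}).reject(&:empty?)
end

# Count the tokens, optionally ignoring ones that consist only of digits.
def word_count(text, skip_numbers: false)
  words = split_words(text)
  words = words.reject { |w| w.match?(/\A\d+\z/) } if skip_numbers
  words.length
end

puts word_count("Why do the counts differ/change for the same") # prints 9
puts word_count("JavaScriptKit.com")                            # prints 2
puts word_count("OpenOffice: 458 words", skip_numbers: true)    # prints 2
```

Each rule shifts the total by a handful of words on a text of a few hundred, which is about the size of the spread you saw between the four counters.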
You get the idea. The results you got only differed by 8 words – so clearly their definitions of a word are not all that different. But word counts are only ever a rough guide, so don’t worry about the discrepancies.