I am using the fpc package in R to perform cluster validation.
I can use the function cluster.stats() to compare my clustering with an external partitioning and compute several metrics such as the Rand index, entropy, etc.
However, I am looking for a metric called ‘purity’ (or ‘cluster accuracy’), which is defined at http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
I am wondering if there is an implementation of this measure in R.
Thanks,
Chet
I don’t know of an off-the-shelf function, but here is one way you could do it yourself using the equation in your link:
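A minimal sketch of such a function, assuming `clusters` and `classes` are equal-length vectors of labels. The name `cluster_purity` is made up for illustration and is not part of fpc. It follows the definition on the linked page: for each cluster, count the members of its most frequent class, sum those counts, and divide by the total number of points.

```r
# Hypothetical helper (not from fpc): purity of a clustering against known classes.
cluster_purity <- function(clusters, classes) {
  stopifnot(length(clusters) == length(classes))
  # Contingency table: rows are classes, columns are clusters
  counts <- table(classes, clusters)
  # Size of the majority class in each cluster, summed, divided by N
  sum(apply(counts, 2, max)) / length(clusters)
}

# Worked example from the linked page: three clusters of sizes 6, 6 and 5,
# with majority-class counts 5, 4 and 3 (labels "x", "o", "d" are placeholders)
clusters <- rep(1:3, times = c(6, 6, 5))
classes  <- c(rep("x", 5), "o",
              "x", rep("o", 4), "d",
              rep("x", 2), rep("d", 3))
purity_example <- cluster_purity(clusters, classes)
print(purity_example)  # (5 + 4 + 3) / 17, about 0.71
```

Using `table()` keeps the function independent of how the labels are encoded (integers, characters or factors all work).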
Here we can test it on some random assignments, where I believe we expect the purity to be roughly 1/number-of-classes (assuming the classes are about equally frequent and the sample is large):
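One way such a check could look, as a sketch only: the helper is repeated so the snippet runs on its own, and the seed, sample size, and numbers of classes and clusters are arbitrary choices, not values from the original post.

```r
# Hypothetical helper (not from fpc), repeated so this snippet is self-contained
cluster_purity <- function(clusters, classes) {
  sum(apply(table(classes, clusters), 2, max)) / length(clusters)
}

set.seed(42)   # arbitrary seed, only for reproducibility
n <- 1e5       # arbitrary sample size
classes  <- sample(3, n, replace = TRUE)  # 3 equally likely classes
clusters <- sample(5, n, replace = TRUE)  # 5 clusters, independent of the classes

random_purity <- cluster_purity(clusters, classes)
print(random_purity)  # expected to be close to 1/3
```

The result should sit slightly above 1/3 rather than exactly on it, because the maximum over classes within each cluster picks up a little sampling noise; the gap shrinks as `n` grows.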