Does anyone know an algorithm I can use to calculate the percentage of false positives in a two-column list?
Take my situation, for instance. I have a clustering vector showing which cluster each item belongs to, and I have the correct label beside it in another column. I know some classifications are wrong because they do not map to the cluster that occurs most often for their label. How can I find the percentage of false positives for all labels? I am implementing this in R.
Cluster_vector | Labels
1 5
3 5
1 5
1 5
6 5
Are you just looking for the proportion of mismatches, like
mean(x[,1] != x[,2])? You can get the confusion matrix with
table(x[,1], x[,2]), and the proportions of matches and mismatches with
table(x[,1] != x[,2])/nrow(x).
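Note that a direct comparison only works if the cluster ids use the same numbering as the labels. In the sample above they do not (cluster 1 seems to correspond to label 5), so every row would count as a mismatch. Here is a rough sketch of one way around that, assuming the most frequent cluster for each label is the "correct" one, as the question suggests. The column names cluster and label are made up for illustration, and the result is a per-label mismatch percentage, which may or may not be what you mean by false positive.

```r
# Sample data from the question (column names are assumptions)
x <- data.frame(cluster = c(1, 3, 1, 1, 6),
                label   = c(5, 5, 5, 5, 5))

# Cross-tabulate labels (rows) against clusters (columns)
tab <- table(label = x$label, cluster = x$cluster)

# For each label, count the rows that fall in its most frequent cluster
majority <- apply(tab, 1, max)

# Percentage of rows per label that fall outside that majority cluster
pct_wrong <- 100 * (1 - majority / rowSums(tab))
pct_wrong
```

For the five rows above, three of the five items with label 5 are in cluster 1, so the other two (clusters 3 and 6) are counted as wrong.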