I have a question about group assignment in hierarchical cluster analysis. As an example, this is the dendrogram for complete linkage on the iris data set.

When I run
> table(cutree(hc, 3), iris$Species)
this is the output:
    setosa versicolor virginica
  1     50          0         0
  2      0         23        49
  3      0         27         1
I have read on a statistics website that object 1 in the data always belongs to group/cluster 1. From the output above, we know that setosa is in group 1. But how do I know about the other two species? How do they end up in group 2 or group 3? Is there a calculation I need to know about?
I’m guessing that you created the image (which doesn’t appear to be there at the moment) by computing a distance matrix with dist and passing it to hclust.
The dist object is computed from the measurements of plants from the three species. Viewed as a matrix, it has identical row and column names, one per plant.
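As a minimal sketch of that step, assuming the four numeric columns of the built-in iris data and the default Euclidean distance (the names iris.dist and iris.mat are only illustrative):

```r
# Assumption: distances are computed on the four measurement columns only.
iris.dist <- dist(iris[, 1:4])   # "dist" object, Euclidean by default

# Expand to a full square matrix to inspect the row and column names.
iris.mat <- as.matrix(iris.dist)
dim(iris.mat)                                    # one row and one column per plant
identical(rownames(iris.mat), colnames(iris.mat))
```

The names are simply the row numbers of the data, which is what lets you trace every leaf of the tree back to a row of iris.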
That object is passed on to hclust, which constructs the tree; cutree then cuts it into three groups. The object iris.order holds the order in which the dendrogram is drawn. The original row order of the data is preserved; only the drawing of the tree follows this ordering.

Here’s proof. I’ve put together the original Species designations, the species designations ordered as they appear in the dendrogram, the order number, and the group from the cutree function.

Let’s look at the output. In the first line, under order.num, there’s the number 108. This means that the first item on the left side of the dendrogram comes from row 108. Skim down to line 108, and you can see that the original Species is indeed virginica. cutree returns its group labels in the original row order, so this item’s group is whatever cutree reports for row 108; per the table at the beginning that is 2 or 3, because group 1 contains only setosa. Now look at line 3. Under order.num you can see that this item comes from row 103. Again, if you go down and check the original species in row 103, it’s (still) virginica. I’ll leave it as an exercise for you to check other (random) rows and convince yourself that the order used for constructing the table at the beginning is preserved. Ergo, the table is correct.
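The check described above can be sketched roughly like this. It is a reconstruction under assumptions, not necessarily the exact code used: the object names follow the ones mentioned in the text where possible, method = "complete" is taken from the question, and the group.ordered column is an extra added here for convenience.

```r
# Assumption: four numeric columns, Euclidean distance, complete linkage.
iris.dist   <- dist(iris[, 1:4])
iris.hclust <- hclust(iris.dist, method = "complete")
iris.cutree <- cutree(iris.hclust, k = 3)   # group labels, in original row order
iris.order  <- iris.hclust$order            # row numbers, left to right in the dendrogram

check <- data.frame(
  original      = iris$Species,                # species of row i
  ordered       = iris$Species[iris.order],    # species of the i-th leaf
  order.num     = iris.order,                  # which row the i-th leaf comes from
  group         = iris.cutree,                 # group of row i
  group.ordered = iris.cutree[iris.order]      # hypothetical extra: group of the i-th leaf
)
head(check)

# Both vectors are in original row order, so this cross-tabulation is valid.
table(iris.cutree, iris$Species)
```

Because cutree labels follow the data rather than the drawing, group 1 is the group containing row 1. As far as I know the remaining groups are numbered in the order in which they first appear going down the rows, so no extra calculation is involved; the numbers are just labels.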