I need to partition approximately 50,000 points into distinct clusters. There is one requirement: no cluster may contain more than K points. Is there a clustering algorithm that can do this?
Please note that the upper bound K is the same for every cluster, say 100.
One way is to use hierarchical k-means (a top-down approach): keep splitting any cluster that has more than K points until every cluster has at most K. Here K is your size limit, not the number of clusters k passed to k-means.
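As a rough illustration of the splitting idea, here is a minimal pure-Python sketch. It assumes each oversized cluster is split in two with a few Lloyd iterations (2-means); the function names (`two_means`, `split_until_small`), the `max_size` parameter (the K from the question), and the random test data are all made up for this example and not taken from any library.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def two_means(points, rng, n_iter=20):
    """Split `points` into two non-empty groups with a few Lloyd iterations."""
    centers = rng.sample(points, 2)
    for _ in range(n_iter):
        groups = ([], [])
        for p in points:
            nearest = 0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
            groups[nearest].append(p)
        if not groups[0] or not groups[1]:
            # Degenerate split (e.g. duplicate points): cut the list in half.
            half = len(points) // 2
            return points[:half], points[half:]
        centers = [mean(g) for g in groups]
    return groups

def split_until_small(points, max_size, seed=0):
    """Keep splitting clusters larger than `max_size` (must be >= 1)."""
    rng = random.Random(seed)
    done, todo = [], [list(points)]
    while todo:
        cluster = todo.pop()
        if len(cluster) <= max_size:
            done.append(cluster)
        else:
            todo.extend(two_means(cluster, rng))
    return done

# Hypothetical demo data: 1000 random 2-D points, size limit 100.
demo_rng = random.Random(42)
points = [(demo_rng.random(), demo_rng.random()) for _ in range(1000)]
clusters = split_until_small(points, max_size=100)
print(len(clusters), max(len(c) for c in clusters))
```

Note that this only enforces the upper bound. Clusters can end up much smaller than the limit, so you will usually get more clusters than the minimum possible.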
Another (in some sense opposite) approach would be to use hierarchical agglomerative clustering, i.e. a bottom-up approach, and again make sure you don't merge two clusters if they would form a new one of size > K.
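The bottom-up variant could look something like the sketch below. It is a naive illustration, not a tuned implementation: it assumes centroid distance as the linkage and keeps merging until no pair fits under the limit. Both choices, as well as the names `capped_agglomerative` and `max_size`, are assumptions made for this example.

```python
import random
from itertools import combinations

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(cluster):
    """Coordinate-wise mean of a non-empty cluster."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

def capped_agglomerative(points, max_size):
    """Greedily merge the closest pair of clusters whose combined
    size stays within `max_size`; stop when no such pair is left."""
    clusters = [[tuple(p)] for p in points]  # start with singletons
    while True:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            if len(clusters[i]) + len(clusters[j]) > max_size:
                continue  # this merge would break the size limit
            d = dist2(centroid(clusters[i]), centroid(clusters[j]))
            if best is None or d < best[0]:
                best = (d, i, j)
        if best is None:
            return clusters
        _, i, j = best
        # j > i, so popping j does not shift index i.
        clusters[i].extend(clusters.pop(j))

# Hypothetical demo data: 60 random 2-D points, size limit 8.
demo_rng = random.Random(7)
points = [(demo_rng.random(), demo_rng.random()) for _ in range(60)]
clusters = capped_agglomerative(points, max_size=8)
print(len(clusters), max(len(c) for c in clusters))
```

This naive loop rescans all pairs after every merge, so it is only suitable for small inputs. For 50,000 points you would need a proper linkage implementation or a nearest-neighbour index, and probably a distance threshold as an additional stopping rule so that far-apart clusters are not merged just because they fit.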