We all know that the k-means algorithm:
which has a complexity of O( n * K * I * d ) Where:
- n = number of points
- K = number of clusters
- I = number of iterations
- d = number of attributes
but my question is when applying K-means in Dynamic Programming I can’t figure out the complexity of it.
the idea of K-means using DP in a nutshell is as follows:
- Compute the proximity matrix
- Let each data point be a cluster
- Repeat
- Merge the two closest clusters
- Update the proximity matrix
- Until only a single cluster remains
I have tried to find a pseudo-code for it so I can try to find out the complexity, but I couldn’t.
So, how can I find it’s complexity? and what it could be?
Thank you guys in advance for any answer.
The algorithm you’re describing is not k-means with dynamic programming, but rather a type of hierarchical clustering called agglomerative clustering. Typically, agglomerative clustering implementations take time (IIRC) O(n3d), where n is the number of data points and d is the number of features. Wikipedia goes into a bit more depth about how this works.
Note that the clusters found this way are not the same as the clusters you’d get with k-means. Agglomerative clustering tends to produce very different clusters with a different set of properties.
Hope this helps!