Is there a way to perform sequential k-means clustering using scikit-learn? I can’t seem to find a proper way to add new data without re-fitting on all the data.
Thank you
scikit-learn’s KMeans class has a predict method that, given some (new) points, determines which of the clusters these points would belong to. Calling this method does not change the cluster centroids.
If you do want the centroids to be changed by the addition of new data, i.e. you want to do clustering in an online setting, use the MiniBatchKMeans estimator and its partial_fit method.
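
To make the difference concrete, here is a minimal sketch of both approaches. The synthetic data, the cluster count, and the n_init and random_state values are my own assumptions for illustration, not anything from the question; adapt them to your data.

```python
import numpy as np
from sklearn.cluster import KMeans, MiniBatchKMeans

# Synthetic data (an assumption for this sketch): two blobs to start with,
# then a later batch of points that arrives near the second blob.
rng = np.random.default_rng(0)
initial = np.vstack([
    rng.normal(0.0, 0.5, size=(50, 2)),
    rng.normal(5.0, 0.5, size=(50, 2)),
])
new_batch = rng.normal(6.0, 0.5, size=(20, 2))

# Option 1: KMeans.predict only assigns new points to the existing clusters.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(initial)
centers_before = km.cluster_centers_.copy()
labels_new = km.predict(new_batch)
# The fitted centroids are left untouched by predict.
centers_unchanged = np.array_equal(centers_before, km.cluster_centers_)

# Option 2: MiniBatchKMeans.partial_fit updates the centroids incrementally.
mbk = MiniBatchKMeans(n_clusters=2, n_init=3, random_state=0)
mbk.partial_fit(initial)            # first call initializes the centroids
online_before = mbk.cluster_centers_.copy()
mbk.partial_fit(new_batch)          # later calls update them with new data
centers_moved = not np.allclose(online_before, mbk.cluster_centers_)

print("KMeans centroids unchanged after predict:", centers_unchanged)
print("MiniBatchKMeans centroids moved after partial_fit:", centers_moved)
```

Note that the first partial_fit call needs at least as many samples as clusters, and that the result of incremental updates will generally not be identical to re-fitting KMeans on all the data at once.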