This is the SAS code that I want to replicate in R:
proc fastclus data = in.stores_standard
maxclusters = 20
outseed= in.out_seed
maxiter = 1000
converge = 0
strict=5.0;
var storesize sales_per_sqft sales_per_visits tothhsinta;
id store_nbr;
run;
My attempt:
library(amap)
set.seed(1)
kmeans_object=Kmeans(stores_standard, 20, iter.max = 1000, nstart = 1, method = c("euclidean"))
p=do.call(rbind, kmeans_object)
What I am unable to achieve:
1) run kmeans on these variables only: storesize, sales_per_sqft, sales_per_visits, tothhsinta
2) id on store_nbr
3) outseed function in R
Thanks!
1) is quite easy: subset the data frame to those four columns before passing it to the clustering function.
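A minimal sketch of that subsetting, assuming stores_standard is a data frame that holds the four variables from the SAS var statement plus store_nbr. The toy data is made up purely so the example runs; only the column names come from the question. It uses kmeans() from base R, and the same subset can be passed as the first argument of Kmeans() from amap.

```r
# Toy stand-in for stores_standard (values are made up; column names are from the question)
set.seed(1)
stores_standard <- data.frame(
  store_nbr        = 1:200,
  storesize        = rnorm(200),
  sales_per_sqft   = rnorm(200),
  sales_per_visits = rnorm(200),
  tothhsinta       = rnorm(200)
)

# Variables listed in the SAS var statement
vars <- c("storesize", "sales_per_sqft", "sales_per_visits", "tothhsinta")

# Cluster on those columns only; store_nbr stays out of the distance computation
cl <- kmeans(stores_standard[, vars], centers = 20, iter.max = 1000, nstart = 1)
```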
For 2), store the fitted object (called cl here) and look at it. The cluster component of the list contains the assigned cluster ID. These are in the same order as the samples in the input data. If you want the cluster component as a column in the input data, assign it to a new column of that data frame; the same works for your data.
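As an illustration of 2), here is a small sketch on made-up data. The data frame dat and the choice of 3 clusters are assumptions for the example only; the column names are borrowed from the question.

```r
# Made-up data; only the column names come from the question
set.seed(1)
dat <- data.frame(
  store_nbr      = 1:30,
  storesize      = rnorm(30),
  sales_per_sqft = rnorm(30)
)

# Fit on the numeric variables only, excluding the ID column
cl <- kmeans(dat[, c("storesize", "sales_per_sqft")], centers = 3)

# cl$cluster follows the row order of the input, so it lines up with store_nbr
dat$cluster <- cl$cluster

# Each store ID next to its assigned cluster
head(dat[, c("store_nbr", "cluster")])
```

With the objects from the question, the equivalent line would be stores_standard$cluster <- kmeans_object$cluster, provided the rows of stores_standard were not reordered between fitting and assignment.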
As for 3), that doesn't appear possible with either kmeans() in standard R or Kmeans() in package amap.