Ok this is going to sound really confusing but I will try my best to make it clear enough. I have a full dataset called fulldata this dataset is 494021x6.
I use svds (singular value decomposition) on it like so:
%% dimensionality reduction
columns = 6
[U,S,V]=svds(fulldata,columns);
I then randomly select 1000 rows from the fulldata:
%% randomly select dataset
rows = 1000;
columns = 6;
%# pick random rows
indX = randperm( size(fulldata,1) );
indX = indX(1:rows)';
%# pick columns in a set order (2,4,5,3,6,1)
indY = indY(1:columns);
%# filter data
data = U(indX,indY);
I then apply normalization to this randomly selected 1000 rows:
% apply normalization method to every cell
maxData = max(max(data));
minData = min(min(data));
data = ((data-minData)./(maxData));
I then output a datasample from the original fulldata set which matches the 1000 selected rows:
% output matching data
dataSample = fulldata(indX, :)
Also note that when I picked “random rows” I also output the indX rows which match the rows in the fulldata.
So datasample looks like this:

Which is the 1000 random rows which match the original fulldata.
And indX looks like this:

Which is the corresponding row number from fulldata.
The problem im arriving at is when I use K-Means to cluster the 1000 random rows and I output the data of each cluster like so:
%% generate sample data
K = 6;
numObservarations = size(data, 1);
dimensions = 3;
%% cluster
opts = statset('MaxIter', 100, 'Display', 'iter');
[clustIDX, clusters, interClustSum, Dist] = kmeans(data, K, 'options',opts, ...
'distance','sqEuclidean', 'EmptyAction','singleton', 'replicates',3);
%% plot data+clusters
figure, hold on
scatter3(data(:,1),data(:,2),data(:,3), 5, clustIDX, 'filled')
scatter3(clusters(:,1),clusters(:,2),clusters(:,3), 100, (1:K)', 'filled')
hold off, xlabel('x'), ylabel('y'), zlabel('z')
grid on
view([90 0]);
%% plot clusters quality
figure
[silh,h] = silhouette(data, clustIDX);
avrgScore = mean(silh);
% output the contents of each cluster
K1 = data(clustIDX==1,:)
K2 = data(clustIDX==2,:)
K3 = data(clustIDX==3,:)
K4 = data(clustIDX==4,:)
K5 = data(clustIDX==5,:)
K6 = data(clustIDX==6,:)
How can I match K1, k2… K6 to the corresponding indX row number? For instance K1’s output looks like this:

I was hoping to have extra files like K1-indX which is just a list of corresponding row numbers from indX which match the cluster data from K1, K2… etc. Or possibly append the indX row number into the K1, K2 output in column 7 (preferable)
For instance:
K1 cluster data | Belongs to fulldata row number
0.4 0.5 0.6 0.4 | 456456 etc
An example to illustrate: