The Archive Base

Editorial Team
Asked: June 9, 2026 (2026-06-09T09:51:33+00:00)

I have run the k-Means clustering algorithm on the synthetic control data from the Mahout tutorial, and was wondering if someone could explain how to interpret the output. I ran clusterdump and received output that looks something like this (truncated to save space):

CL-592{n=57 c=[30.726, 29.813...] r=[3.528, 3.597...]}
Weight : [props - optional]: Point:
1.0 : [distance=27.453962995925863]: [24.672, 35.261, 30.486...]
1.0 : [distance=27.675053294846002]: [25.592, 29.951, 34.188...]
1.0 : [distance=28.97727289419493]: [30.696, 32.667, 34.223...]
1.0 : [distance=21.999685652862784]: [32.702, 35.219, 30.143...]
...
CL-598{n=50 c=[29.611, 29.769...] r=[3.166, 3.561...]}
Weight : [props - optional]:  Point:
1.0 : [distance=27.266203490250472]: [27.679, 33.506, 23.594...]
1.0 : [distance=28.749781351838173]: [34.727, 28.325, 30.331...]
1.0 : [distance=32.635136046420186]: [27.758, 33.859, 29.879...]
1.0 : [distance=29.328974057024624]: [29.356, 26.793, 25.575...]

Could someone explain how to read this? From what I understand, CL-__ is a cluster ID, followed by n (the number of points in the cluster), c (the centroid as a vector), r (the radius as a vector), and then each point in the cluster. Is this correct? Furthermore, how do I know which clustered point corresponds to which input point? That is, are the points stored as key-value pairs, where the key is some kind of ID for the point and the value is the vector? If not, is there some way I can set it up so that they are?
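
To make that reading of the header line concrete, here is a minimal sketch in plain Java that pulls the cluster ID, point count, centroid text, and radius text out of one header line. It assumes the header has exactly the shape shown in the dump above; the class name ClusterHeaderParser and the regular expression are illustrative and are not part of Mahout.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClusterHeaderParser {
    // Assumed header shape, taken from the dump above:
    //   CL-598{n=50 c=[29.611, 29.769...] r=[3.166, 3.561...]}
    private static final Pattern HEADER = Pattern.compile(
        "^(\\w+-\\d+)\\{n=(\\d+)\\s+c=\\[(.*?)\\]\\s+r=\\[(.*?)\\]\\}$");

    /**
     * Returns {clusterId, pointCount, centroidText, radiusText},
     * or null if the line does not look like a cluster header.
     */
    public static String[] parse(String line) {
        Matcher m = HEADER.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] fields =
            parse("CL-598{n=50 c=[29.611, 29.769...] r=[3.166, 3.561...]}");
        System.out.println("cluster ID:      " + fields[0]);
        System.out.println("points (n):      " + fields[1]);
        System.out.println("centroid (c):    " + fields[2]);
        System.out.println("radius (r):      " + fields[3]);
    }
}
```

Lines that are not headers (the "Weight : ..." line and the per-point lines) make parse return null, so a caller can use that to tell where one cluster's listing ends and the next begins.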

1 Answer

  1. Editorial Team
    Added an answer on June 9, 2026 at 9:51 am

    I believe your interpretation of the data is correct (I’ve only been working with Mahout for ~3 weeks, so someone more seasoned should probably weigh in on this).

    As for linking points back to the input that created them, I’ve used NamedVector, where the name serves as the key for the vector. When you read one of the generated points files (clusteredPoints), you can cast each row's point vector back to a NamedVector and retrieve the name using .getName().

    Update in response to comment

    When you initially read your data into Mahout, you convert it into a collection of vectors, which you then write to a file (points) for use by the clustering algorithms later. Mahout provides several Vector types, and it also provides a Vector wrapper class called NamedVector, which lets you attach a name to each vector so that you can identify it.

    For example, you could create each NamedVector as follows:

    NamedVector nVec = new NamedVector(
        new SequentialAccessSparseVector(vectorDimensions), 
        vectorName
        );
    

    Then you write your collection of NamedVectors to file with something like:

    SequenceFile.Writer writer = new SequenceFile.Writer(...);
    VectorWritable writable = new VectorWritable();
    
    // the next two lines will be in a loop, but I'm omitting it for clarity
    writable.set(nVec);
    // append the VectorWritable wrapper (not the raw vector) as the value
    writer.append(new Text(nVec.getName()), writable);
    

    You can now use this file as input to one of the clustering algorithms.

    After you run one of the clustering algorithms on your points file, it generates yet another points file, this time in a directory named clusteredPoints.

    You can then read in this points file and extract the name you associated with each vector. It’ll look something like this:

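    IntWritable clusterId = new IntWritable();
    WeightedPropertyVectorWritable vector = new WeightedPropertyVectorWritable();
    
    // reader is a SequenceFile.Reader opened on a file in clusteredPoints
    while (reader.next(clusterId, vector))
    {
        // this cast only works if the input vectors were written as NamedVectors
        NamedVector nVec = (NamedVector)vector.getVector();
        // you now have access to the original name using nVec.getName()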
    }
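
Inside the loop above, clusterId.get() and nVec.getName() give you one (cluster ID, point name) pair per point. As a rough sketch of what you might do with those pairs, the plain-Java example below groups names by cluster ID so you can see which input points landed in which cluster. It does not use Mahout at all: the class ClusterMembership, the group method, and the names such as "series-0" are hypothetical stand-ins for whatever the loop produces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ClusterMembership {
    /**
     * Groups point names by cluster ID. clusterIds[i] and names[i] are
     * assumed to describe the same point, e.g. collected from
     * clusterId.get() and nVec.getName() in the read loop.
     */
    public static Map<Integer, List<String>> group(int[] clusterIds, String[] names) {
        // TreeMap keeps the clusters sorted by ID for readable output
        Map<Integer, List<String>> byCluster = new TreeMap<>();
        for (int i = 0; i < clusterIds.length; i++) {
            byCluster.computeIfAbsent(clusterIds[i], k -> new ArrayList<>())
                     .add(names[i]);
        }
        return byCluster;
    }

    public static void main(String[] args) {
        // Hypothetical pairs; real values would come from the clusteredPoints file
        int[] clusterIds = { 592, 598, 592 };
        String[] names = { "series-0", "series-1", "series-2" };

        // prints {592=[series-0, series-2], 598=[series-1]}
        System.out.println(group(clusterIds, names));
    }
}
```

Keeping the names in insertion order within each cluster means the output follows the order of the clusteredPoints file, which can make it easier to compare against the clusterdump listing.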
    