The Archive Base


Editorial Team
Asked: May 11, 2026 (2026-05-11T01:03:47+00:00)


I half-answered a question about finding clusters of mass in a bitmap. I say half-answered because I stopped once I had all the points in the bitmap sorted by mass, and left it to the reader to filter the list by removing points that belong to the same cluster.

Then when thinking about that step I found that the solution didn’t jump out at me like I thought it would. So now I’m asking you guys for help. We have a list of points with masses like so (a Python list of tuples, but you can represent it as you see fit in any language):

    [ (6, 2, 6.1580555555555554),
      (2, 1, 5.4861111111111107),
      (1, 1, 4.6736111111111107),
      (1, 4, 4.5938888888888885),
      (2, 0, 4.54),
      (1, 5, 4.4480555555555554),
      (4, 7, 4.4480555555555554),
      (5, 7, 4.4059637188208614),
      (4, 8, 4.3659637188208613),
      (1, 0, 4.3611111111111107),
      (5, 8, 4.3342191043083904),
      (5, 2, 4.119574829931973),
      ...
      (8, 8, 0.27611111111111108),
      (0, 8, 0.24138888888888888) ]

Each tuple is of the form:

(x, y, mass) 

Note that the list is sorted here. If your solution prefers to not have them sorted it’s perfectly OK.

The challenge, if you recall, is to find the main clusters of mass. The number of clusters is not known, but the dimensions of the bitmap are. Sometimes several points within a cluster have more mass than the center of the next-largest cluster. So what I want to do is start from the highest-mass points and remove points in the same cluster (points nearby).

When I tried this I ended up having to walk through parts of the list over and over again. I have a feeling I’m just stupid about it. How would you do it? Pseudocode or real code is fine. Of course, if you can pick up where I left off in that answer with Python code, it’s easier for me to experiment with it.

Next step is to figure out how many clusters there really are in the bitmap. I’m still struggling with defining that problem so I might return with a question about it.

EDIT: I should clarify that I know that there’s no ‘correct’ answer to this question. And the name of the question is key. Phase one of my clustering is done. I’m in search of a fast, accurate-‘enough’ method of filtering away nearby points.

Let me know if you see how I can make the question clearer.
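
For concreteness, the filtering step described in the question could be sketched as a single greedy pass over the mass-sorted list: keep a point only if it is not too close to a point that was already kept. This is only one possible reading of "points nearby". The `filter_nearby` name and the `min_dist` radius are assumptions for illustration and do not come from the question, and the sample data is just the first eight tuples of the list above.

```python
def filter_nearby(points, min_dist):
    """Keep the heaviest point of each neighbourhood.

    points   -- iterable of (x, y, mass) tuples, in any order
    min_dist -- hypothetical radius: a point closer than this to an
                already-kept point is treated as part of that cluster
    """
    kept = []
    limit_sq = min_dist * min_dist
    # Visit points from heaviest to lightest, so the first point seen in a
    # neighbourhood is the heaviest one there.
    for x, y, mass in sorted(points, key=lambda p: p[2], reverse=True):
        if all((x - kx) ** 2 + (y - ky) ** 2 >= limit_sq for kx, ky, _ in kept):
            kept.append((x, y, mass))
    return kept


# First eight tuples from the list in the question.
points = [
    (6, 2, 6.1580555555555554),
    (2, 1, 5.4861111111111107),
    (1, 1, 4.6736111111111107),
    (1, 4, 4.5938888888888885),
    (2, 0, 4.54),
    (1, 5, 4.4480555555555554),
    (4, 7, 4.4480555555555554),
    (5, 7, 4.4059637188208614),
]
centers = filter_nearby(points, min_dist=3)
print([(x, y) for x, y, _ in centers])
```

Each point is compared only against the points kept so far, so the list is walked once and the cost is roughly the number of points times the number of clusters found. The result depends heavily on the chosen radius.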

1 Answer

Added an answer on May 11, 2026 at 1:03 am

    Just so you know, you are asking for a solution to an ill-posed problem: no definitive solution exists. That’s fine…it just makes it more fun. Your problem is ill-posed mostly because you don’t know how many clusters you want. Clustering is one of the key areas of machine learning and there are quite a few approaches that have been developed over the years.

    As Arachnid pointed out, the k-means algorithm tends to be a good one and it’s pretty easy to implement. The results depend critically on the initial guess and on the number of clusters requested. To overcome the initial guess problem, it’s common to run the algorithm many times with random initializations and pick the best result. You’ll need to define what ‘best’ means. One measure would be the mean squared distance of each point to its cluster center. If you want to automatically guess how many clusters there are, you should run the algorithm with a whole range of numbers of clusters. For any good ‘best’ measure, more clusters will always look better than fewer, so you’ll need a way to penalize having too many clusters. The MDL discussion on Wikipedia is a good starting point.
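
A minimal sketch of the procedure above, assuming the `(x, y, mass)` tuples from the question and using the mass as a weight: k-means is restarted several times, the run with the lowest mean squared distance wins, and the number of clusters is chosen by adding a cost per cluster. The function names and the linear `penalty * k` term are assumptions made for illustration; the penalty is a crude stand-in, not an actual MDL criterion.

```python
import random


def kmeans(points, k, rng, iterations=50):
    """One run of mass-weighted k-means; returns (centers, cost).

    cost is the mass-weighted mean squared distance to the nearest center.
    """
    centers = [(x, y) for x, y, _ in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in centers]
        for x, y, m in points:
            nearest = min(range(k), key=lambda i: (x - centers[i][0]) ** 2
                          + (y - centers[i][1]) ** 2)
            groups[nearest].append((x, y, m))
        # Update step: move each center to the weighted mean of its group.
        new_centers = []
        for center, group in zip(centers, groups):
            total = sum(m for _, _, m in group)
            if total > 0:
                center = (sum(x * m for x, _, m in group) / total,
                          sum(y * m for _, y, m in group) / total)
            new_centers.append(center)  # an empty group keeps its old center
        if new_centers == centers:
            break
        centers = new_centers
    total_mass = sum(m for _, _, m in points)
    cost = sum(m * min((x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centers)
               for x, y, m in points) / total_mass
    return centers, cost


def best_kmeans(points, k, restarts=20, seed=0):
    """Run k-means from several random starts and keep the cheapest result."""
    rng = random.Random(seed)
    return min((kmeans(points, k, rng) for _ in range(restarts)),
               key=lambda result: result[1])


def choose_k(points, max_k, penalty):
    """Pick k minimising cost + penalty * k (hypothetical penalty term)."""
    scored = []
    for k in range(1, max_k + 1):
        centers, cost = best_kmeans(points, k)
        scored.append((cost + penalty * k, k, centers))
    _, k, centers = min(scored, key=lambda s: s[0])
    return k, centers


# Two well-separated toy blobs with unit mass (made-up data).
blobs = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1),
         (10, 10, 1), (10, 11, 1), (11, 10, 1), (11, 11, 1)]
k, centers = choose_k(blobs, max_k=4, penalty=1.0)
print(k, sorted(centers))
```

The penalty has to be tuned to the scale of the data: too small and every extra cluster still pays off, too large and everything collapses into one cluster.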

    K-means clustering is basically the simplest mixture model. Sometimes it’s helpful to upgrade to a mixture of Gaussians learned by expectation maximization (described in the link just given). This can be more robust than k-means. It takes a little more effort to understand it, but when you do, it’s not much harder than k-means to implement.
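
To give a feel for how close EM is to k-means, here is a rough sketch under simplifying assumptions: each component is an isotropic 2-D Gaussian (a single variance rather than a full covariance matrix), the mass of each point is used as a weight, and the starting means are passed in by the caller (for example from a k-means run). The name `em_gaussian_mixture` and its parameters are hypothetical.

```python
import math


def em_gaussian_mixture(points, init_means, iterations=50, min_var=1e-3):
    """Fit isotropic 2-D Gaussians to (x, y, mass) tuples with EM.

    Returns one (weight, (mean_x, mean_y), variance) tuple per component.
    """
    means = [(float(mx), float(my)) for mx, my in init_means]
    k = len(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    total_mass = sum(p[2] for p in points)

    for _step in range(iterations):
        # E-step: soft assignment (responsibility) of each point to each
        # component, instead of the hard assignment used by k-means.
        resp = []
        for x, y, _mass in points:
            dens = [w * math.exp(-((x - mx) ** 2 + (y - my) ** 2) / (2 * v))
                    / (2 * math.pi * v)
                    for w, (mx, my), v in zip(weights, means, variances)]
            norm = sum(dens)
            # If every density underflows to zero, fall back to uniform.
            resp.append([d / norm for d in dens] if norm > 0 else [1.0 / k] * k)

        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            r = [resp[i][j] * points[i][2] for i in range(len(points))]
            rj = sum(r)
            if rj == 0:
                continue  # component received no mass; leave it unchanged
            mx = sum(ri * p[0] for ri, p in zip(r, points)) / rj
            my = sum(ri * p[1] for ri, p in zip(r, points)) / rj
            sq = sum(ri * ((p[0] - mx) ** 2 + (p[1] - my) ** 2)
                     for ri, p in zip(r, points))
            means[j] = (mx, my)
            # Divide by 2 because the variance is shared by both dimensions;
            # min_var stops a component from shrinking onto a single point.
            variances[j] = max(sq / (2 * rj), min_var)
            weights[j] = rj / total_mass
    return list(zip(weights, means, variances))


# Two well-separated toy blobs with unit mass (made-up data).
blobs = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1),
         (10, 10, 1), (10, 11, 1), (11, 10, 1), (11, 11, 1)]
fit = em_gaussian_mixture(blobs, init_means=[(0, 0), (11, 11)])
for weight, mean, variance in fit:
    print(weight, mean, variance)
```

Unlike k-means, each component also reports how spread out it is and how much of the total mass it explains, which can help when clusters differ in size.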

    There are plenty of other clustering techniques such as agglomerative clustering and spectral clustering. Agglomerative clustering is pretty easy to implement, but choosing when to stop building the clusters can be tricky. If you do agglomerative clustering, you’ll probably want to look at kd trees for faster nearest neighbor searches. smacl’s answer describes one slightly different way of doing agglomerative clustering using a Voronoi diagram.
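
As an illustration of the agglomerative idea and its stopping problem, the sketch below merges the two closest clusters repeatedly and stops once the closest pair is farther apart than a threshold. It assumes centroid linkage with mass-weighted centroids and strictly positive masses, and it uses a brute-force closest-pair search where a k-d tree would normally go. The `agglomerate` name and the `stop_dist` threshold are hypothetical.

```python
def agglomerate(points, stop_dist):
    """Naive centroid-linkage agglomerative clustering of (x, y, mass) tuples.

    Repeatedly merges the two clusters whose centroids are closest and stops
    once the closest pair is at least stop_dist apart.
    Returns a list of (centroid_x, centroid_y, total_mass) tuples.
    Assumes every mass is strictly positive.
    """
    clusters = [(float(x), float(y), float(m)) for x, y, m in points]
    while len(clusters) > 1:
        # Brute-force closest pair; this is the part a k-d tree would speed up.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d2 = ((clusters[i][0] - clusters[j][0]) ** 2
                      + (clusters[i][1] - clusters[j][1]) ** 2)
                if best is None or d2 < best[0]:
                    best = (d2, i, j)
        d2, i, j = best
        if d2 >= stop_dist ** 2:
            break  # stopping rule: remaining clusters are far enough apart
        (x1, y1, m1), (x2, y2, m2) = clusters[i], clusters[j]
        merged = ((x1 * m1 + x2 * m2) / (m1 + m2),
                  (y1 * m1 + y2 * m2) / (m1 + m2),
                  m1 + m2)
        # Delete the higher index first so the lower index stays valid.
        del clusters[j]
        del clusters[i]
        clusters.append(merged)
    return clusters


# Two well-separated toy blobs with unit mass (made-up data).
blobs = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1),
         (10, 10, 1), (10, 11, 1), (11, 10, 1), (11, 11, 1)]
print(sorted(agglomerate(blobs, stop_dist=3)))
```

The distance threshold plays the same role as the penalty on the number of clusters in k-means: it is the knob that decides how many clusters survive.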

    There are models that can automatically choose the number of clusters for you, such as ones based on Latent Dirichlet Allocation, but they are a lot harder to understand and implement correctly.

    You might also want to look at the mean-shift algorithm to see if it’s closer to what you really want.
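
For comparison, a bare-bones mean-shift sketch is given below. It assumes a flat (uniform) kernel, uses the mass as a weight, and merges hill climbs that end within half a bandwidth of each other. The `mean_shift` name, the `bandwidth` parameter, and the merge rule are assumptions chosen for illustration. The number of clusters is not supplied; it falls out of the bandwidth.

```python
def mean_shift(points, bandwidth, iterations=50, tol=1e-6):
    """Flat-kernel, mass-weighted mean shift on (x, y, mass) tuples.

    Each point starts a hill climb towards the local centre of mass.
    Climbs that end within bandwidth / 2 of a known mode are merged into it.
    Returns a list of (x, y) mode estimates.
    """
    radius_sq = bandwidth * bandwidth
    modes = []
    for start_x, start_y, _mass in points:
        cx, cy = float(start_x), float(start_y)
        for _step in range(iterations):
            # Points inside the window centred on the current estimate.
            window = [(x, y, m) for x, y, m in points
                      if (x - cx) ** 2 + (y - cy) ** 2 <= radius_sq]
            total = sum(m for _, _, m in window)
            if total == 0:
                break  # nothing with mass in range; stay where we are
            nx = sum(x * m for x, _, m in window) / total
            ny = sum(y * m for _, y, m in window) / total
            moved_sq = (nx - cx) ** 2 + (ny - cy) ** 2
            cx, cy = nx, ny
            if moved_sq < tol * tol:
                break  # converged on a mode
        # Record the mode unless it coincides with one found earlier.
        if all((cx - mx) ** 2 + (cy - my) ** 2 > radius_sq / 4
               for mx, my in modes):
            modes.append((cx, cy))
    return modes


# Two well-separated toy blobs with unit mass (made-up data).
blobs = [(0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1),
         (10, 10, 1), (10, 11, 1), (11, 10, 1), (11, 11, 1)]
print(mean_shift(blobs, bandwidth=2))
```

This brute-force version rescans every point on each step, so it is only practical for small bitmaps; the bandwidth would need tuning against real data.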
