The Archive Base

Editorial Team
Asked: June 13, 2026

I am now reading the book Data Mining: Practical machine learning tools and techniques

I am now reading the book Data Mining: Practical Machine Learning Tools and Techniques, third edition. Section 4.8 (Clustering) discusses how to use k-d trees or ball trees to speed up the k-means algorithm.

After building the ball tree over all the data points, it searches the leaf nodes to find which of the pre-chosen cluster centers each point is closest to. The book says that sometimes the region represented by a higher interior node falls entirely within the domain of a single cluster center. Then we need not traverse its child nodes, and all of its data points can be processed in one go.

The question is: when implementing the data structure and the algorithm, how can we decide whether the region represented by an interior node falls within the domain of a single cluster center?

In a two-dimensional or three-dimensional space, this is not difficult. For every pair of cluster centers, we can check whether the perpendicular bisector between them intersects the region represented by the interior node.

But how can this be checked in higher-dimensional spaces? Is there a general method?

1 Answer

  1. Editorial Team
    Added an answer on June 13, 2026 at 10:46 pm

    You need to consider maximum and minimum distances.

    If the maximum distance from a container (say, a sphere of radius r) to one mean is smaller than the minimum distance from the container to every other mean, then all objects inside the container belong to that mean. Because if

    maxdist(mean_i, container) < min of all j != i mindist(mean_j, container)
    

    then in particular for any object in the container

    dist(mean_i, obj_in_container) < min of all j != i dist(mean_j, obj_in_container)
    

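    This holds because dist(mean_i, obj) <= maxdist(mean_i, container) and mindist(mean_j, container) <= dist(mean_j, obj) for every object in the container. I.e. the object will belong to mean i.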
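As a minimal sketch of the test above for ball-shaped containers, assuming Euclidean distance: the function names (`ball_mindist`, `ball_maxdist`, `owning_mean`) are hypothetical and not taken from the book or any library. Only the mean closest to the ball's center can satisfy the condition, so it is the only candidate checked.

```python
import math

def ball_mindist(center, radius, point):
    """Smallest possible distance from `point` to any location inside the ball."""
    return max(0.0, math.dist(center, point) - radius)

def ball_maxdist(center, radius, point):
    """Largest possible distance from `point` to any location inside the ball."""
    return math.dist(center, point) + radius

def owning_mean(center, radius, means):
    """Return the index of the mean that owns the whole ball, or None.

    A mean i owns the ball if maxdist(mean_i, ball) is strictly smaller
    than mindist(mean_j, ball) for every other mean j.
    """
    # The only possible owner is the mean closest to the ball's center.
    i = min(range(len(means)), key=lambda k: math.dist(center, means[k]))
    upper = ball_maxdist(center, radius, means[i])
    for j, mean in enumerate(means):
        if j != i and ball_mindist(center, radius, mean) <= upper:
            return None  # mean j might be closer for some point in the ball
    return i
```

When `owning_mean` returns an index, the whole subtree can be assigned to that mean without visiting its children. When it returns None, the traversal has to descend further. Nothing here depends on the number of dimensions.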

    Minimum and maximum distances for spheres and rectangles are easy to compute in arbitrary dimensions. However, in higher dimensions, mindist and maxdist become quite similar, and the condition will rarely hold. It also makes a huge difference whether your tree is well structured (i.e. small containers) or badly structured (overlapping containers).
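For axis-aligned rectangles, the two distances can be sketched as below, assuming Euclidean distance and a box given by its lower and upper corner. The names `box_mindist` and `box_maxdist` are hypothetical. Both work one dimension at a time, which is why they extend to any number of dimensions.

```python
import math

def box_mindist(lo, hi, point):
    """Distance from `point` to the nearest location in the box [lo, hi]."""
    # Per dimension, the gap is 0 when the coordinate lies inside [l, h].
    return math.sqrt(sum(max(l - p, 0.0, p - h) ** 2
                         for l, h, p in zip(lo, hi, point)))

def box_maxdist(lo, hi, point):
    """Distance from `point` to the farthest corner of the box [lo, hi]."""
    # Per dimension, take whichever face of the box is farther away.
    return math.sqrt(sum(max(abs(p - l), abs(p - h)) ** 2
                         for l, h, p in zip(lo, hi, point)))
```

These can be plugged into the same maxdist < mindist comparison as for spheres, for example with the bounding boxes of a k-d tree or R*-tree node.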

    k-d-trees are nice for in-memory, read-only operations, but they perform quite badly for insertions. R*-trees are a lot better here. In addition, the improved split strategy of R*-trees pays off, because it generates more rectangular boxes than the other strategies.

© 2021 The Archive Base. All Rights Reserved