I’m working on a program that takes in several (<50) high dimension points in

Question

Asked: May 24, 2026

I’m working on a program that takes in several (<50) high-dimensional points in feature space (1000+ dimensions) and performs hierarchical clustering on them by recursively applying standard k-clustering.

My problem is that in any one k-clustering pass, different parts of the high-dimensional representation are redundant. I know this problem falls under the umbrella of feature extraction, selection, or weighting.

In general, what does one take into account when selecting a particular feature extraction/selection/weighting algorithm? And specifically, what algorithm would be the best way to prepare my data for clustering in my situation?


1 Answer

Editorial Team · Answer 1 · Added on May 24, 2026 at 5:55 am

Check out this paper:

Witten DM and R Tibshirani (2010) A framework for feature selection in clustering. Journal of the American Statistical Association 105(490): 713-726.

Also see the related COSA paper (Clustering Objects on Subsets of Attributes) by Friedman and Meulman. Both discuss these issues in depth.
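To give a rough feel for the feature-weighting idea, here is a minimal sketch of the weight-update step as I understand it from the Witten and Tibshirani paper: given a cluster assignment, score each feature by its between-cluster sum of squares, soft-threshold the scores, and rescale so the weights have unit L2 norm and an L1 norm of at most `s`. The function names, the bisection search, and the synthetic data are my own illustrative choices, not code from the paper, and the full method alternates this step with k-means on the reweighted features, which is omitted here.

```python
import numpy as np


def per_feature_bcss(X, labels):
    """Between-cluster sum of squares, computed separately for each column of X."""
    overall_mean = X.mean(axis=0)
    bcss = np.zeros(X.shape[1])
    for k in np.unique(labels):
        members = X[labels == k]
        bcss += len(members) * (members.mean(axis=0) - overall_mean) ** 2
    return bcss


def soft_threshold(a, delta):
    """Shrink every entry of a toward zero by delta, clipping at zero."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)


def feature_weights(bcss, s, n_iter=50):
    """Non-negative weights with unit L2 norm and L1 norm <= s.

    Assumes s >= 1 (hypothetical tuning parameter; smaller s keeps fewer
    features). The threshold delta is located by simple bisection.
    """
    def normalized(delta):
        w = soft_threshold(bcss, delta)
        norm = np.linalg.norm(w)
        return w / norm if norm > 0 else w

    w = normalized(0.0)
    if w.sum() <= s:
        return w  # the L1 constraint is already satisfied without shrinkage
    lo, hi = 0.0, float(bcss.max())
    for _ in range(n_iter):
        mid = (lo + hi) / 2
        if normalized(mid).sum() > s:
            lo = mid  # still too dense, shrink harder
        else:
            hi = mid  # feasible, try shrinking less
    return normalized(hi)


def demo():
    # Synthetic stand-in for the setup in the question: 40 points, 1000 features,
    # with only the first 20 features separating the two groups.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 1000))
    labels = np.repeat([0, 1], 20)
    X[labels == 1, :20] += 5.0

    w = feature_weights(per_feature_bcss(X, labels), s=5.0)
    print("features with non-zero weight:", int(np.count_nonzero(w)))

    # Scaling each column by sqrt(w) makes ordinary squared Euclidean distance
    # equal to the weighted distance, so any standard k-means can be run on it.
    X_weighted = X * np.sqrt(w)
    return X_weighted


if __name__ == "__main__":
    demo()
```

In practice you would start from equal weights, cluster, update the weights as above, and repeat until the weights stop changing. The value of `s` is something you would have to tune for your own data.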
