The Archive Base

Editorial Team
Asked: June 14, 2026

I am doing unsupervised classification. For this I have 8 features per image (Variance of Green, Std. dev. of Green, Mean of Red, Variance of Red, Std. dev. of Red, Mean of Hue, Variance of Hue, Std. dev. of Hue), and I want to select the 3 most significant features using PCA. I have written the following code for feature selection (the feature matrix is 179x8):

for c=1:size(feature,1)
   feature(c,:)=feature(c,:)-mean(feature)
end

DataCov=cov(feature); % covariance matrix
[PC,variance,explained] = pcacov(DataCov)

This gives me :

PC =

0.0038   -0.0114    0.0517    0.0593    0.0039    0.3998    0.9085   -0.0922
0.0755   -0.1275    0.6339    0.6824   -0.3241   -0.0377   -0.0641    0.0052
0.7008    0.7113   -0.0040    0.0496   -0.0207    0.0042    0.0012    0.0002
0.0007   -0.0012    0.0051    0.0101    0.0272    0.0288    0.0873    0.9953
0.0320   -0.0236    0.1521    0.2947    0.9416   -0.0142   -0.0289   -0.0266
0.7065   -0.6907   -0.1282   -0.0851    0.0060    0.0003    0.0010   -0.0001
0.0026   -0.0037    0.0632   -0.0446    0.0053    0.9125   -0.4015    0.0088
0.0543   -0.0006    0.7429   -0.6574    0.0838   -0.0705    0.0311   -0.0001

variance =

0.0179
0.0008
0.0001
0.0000
0.0000
0.0000
0.0000
0.0000

explained =

94.9471
4.1346
0.6616
0.2358
0.0204
0.0003
0.0002
0.0000

This means the first principal component explains 94.9% of the variance, and so on, but these components are ordered from most to least significant.
How can I tell which of the original features (1 to 8) to select, based on the above information?

1 Answer

  1. Editorial Team
    Added an answer on June 14, 2026 at 6:26 am

    Your problem is the same as the COLUMNSELECT problem discussed by Mahoney and Drineas in “CUR matrix decompositions for improved data analysis”.

    They first compute the leverage scores for each dimension and then select 3 of them randomly, using the leverage scores as weights. Alternatively, you can select the dimensions with the largest scores. Here’s the script for your problem:

    I first took a real nature image from the web and resized it to the dimensions you asked for. The image is as follows:

    [image]

    %# Example data from real image of size 179x8
    %# You can skip it for your own data
    features = im2double(rgb2gray(imread('img.png')));
    
    %# m samples, n dimensions
    [m,n] = size(features);
    

    Then, center the data:

    %# Remove the mean of each dimension (column)
    features = features - repmat(mean(features,1), m, 1);
    

    I use SVD to compute PCA since it gives you both the principal components and the coefficients. If the samples are in columns, then U holds the principal components; here the samples are in rows, so V holds them, which is why V is used below. Check the second page of this paper for the relationship.

    %# Compute the SVD
    [U,S,V] = svd(features);
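
    As a quick sanity check of that relationship, here is a minimal sketch on synthetic data (the random matrix X is an assumption for illustration, not your features). With samples in rows and column-centered data, the squared singular values divided by m-1 should match the eigenvalues of the covariance matrix, which are the principal component variances that pcacov reports:

    ```matlab
    %# Synthetic stand-in data (an assumption, not the asker's features)
    rng(0);
    X = randn(179, 8) * diag([5 3 2 1 1 0.5 0.2 0.1]);

    %# Center each column (dimension)
    Xc = X - repmat(mean(X, 1), size(X, 1), 1);

    %# Economy-size SVD: Xc = U*S*V'
    [U, S, V] = svd(Xc, 'econ');

    %# Variances along the principal directions, from the singular values
    svdVar = diag(S).^2 / (size(Xc, 1) - 1);

    %# The same quantities from the eigenvalues of the covariance matrix
    C = cov(Xc);
    eigVar = sort(eig((C + C') / 2), 'descend');

    %# Largest disagreement between the two routes (should be round-off only)
    maxDiff = max(abs(svdVar - eigVar));
    ```

    If maxDiff is at round-off level, the columns of V play the role of the PC matrix in the question.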
    

    The key idea here is that we want to keep the dimensions that carry most of the variation, under the assumption that the data contain some noise. We therefore keep only the dominant eigenvectors, e.g. those representing 95% of the variation.

    %# Compute the number of eigenvectors representing
    %#  95% of the variation (measured here on the singular values;
    %#  use diag(S).^2 instead to measure explained variance)
    coverage = cumsum(diag(S));
    coverage = coverage ./ max(coverage);
    [~, nEig] = max(coverage > 0.95);
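
    For reference, the same thresholding can be applied directly to the explained vector posted in the question, which already holds variance percentages. This is only a sketch: the 0.95 threshold is the one used above, and find is used in place of max on the logical vector:

    ```matlab
    %# Percent variance explained, copied from the question
    explained = [94.9471; 4.1346; 0.6616; 0.2358; 0.0204; 0.0003; 0.0002; 0.0000];

    %# Cumulative fraction of the variance covered by the first k components
    coverage = cumsum(explained) / sum(explained);

    %# Index of the first component count that exceeds the threshold
    nEig = find(coverage > 0.95, 1);
    ```

    With those numbers the first component alone stays just under 95%, so two components are kept.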
    

    Then the leverage scores are computed using the first nEig principal components. That is, for each dimension we take the squared norm of its nEig coefficients.

    %# Compute the norms of each vector in the new space
    norms = zeros(n,1);
    for i = 1:n
        norms(i) = norm(V(i,1:nEig))^2;
    end
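
    The loop can also be written in vectorized form. A minimal sketch, assuming a hypothetical orthonormal V built from a QR factorization of random numbers (a stand-in for the V returned by svd) and an assumed nEig of 2:

    ```matlab
    %# Hypothetical stand-in for V: an orthonormal basis from a QR factorization
    [V, ~] = qr(randn(8));
    nEig = 2;              %# assumed number of retained components
    n = size(V, 1);

    %# Loop form, as above
    norms = zeros(n, 1);
    for i = 1:n
        norms(i) = norm(V(i, 1:nEig))^2;
    end

    %# Vectorized form: sum of squared coefficients per row
    scores = sum(V(:, 1:nEig).^2, 2);

    %# Because the columns of V are orthonormal, the scores add up to nEig
    total = sum(scores);
    ```

    Both forms give the same scores; the fact that they sum to nEig is a handy check that the right block of V was used.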
    

    Then, we can sort the leverage scores in descending order:

    %# Get the largest 3
    [~, idx] = sort(norms, 'descend');
    idx(1:3)'
    

    and get the indices of the dimensions with the largest leverage scores (for the example image; the exact indices depend on the data):

    ans =
       6     8     5
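
    To connect this back to the question, here is a sketch that applies the same steps to the PC matrix posted there, where rows correspond to features 1 to 8 and columns to components. It assumes nEig = 2, which is what the 95% rule gives for the posted explained values, and it takes the features unscaled, as in the question:

    ```matlab
    %# PC matrix copied from the question (rows: features, columns: components)
    PC = [0.0038 -0.0114  0.0517  0.0593  0.0039  0.3998  0.9085 -0.0922
          0.0755 -0.1275  0.6339  0.6824 -0.3241 -0.0377 -0.0641  0.0052
          0.7008  0.7113 -0.0040  0.0496 -0.0207  0.0042  0.0012  0.0002
          0.0007 -0.0012  0.0051  0.0101  0.0272  0.0288  0.0873  0.9953
          0.0320 -0.0236  0.1521  0.2947  0.9416 -0.0142 -0.0289 -0.0266
          0.7065 -0.6907 -0.1282 -0.0851  0.0060  0.0003  0.0010 -0.0001
          0.0026 -0.0037  0.0632 -0.0446  0.0053  0.9125 -0.4015  0.0088
          0.0543 -0.0006  0.7429 -0.6574  0.0838 -0.0705  0.0311 -0.0001];

    nEig = 2;   %# assumption: first two components cover more than 95%

    %# Leverage score of each feature: squared norm of its first nEig coefficients
    scores = sum(PC(:, 1:nEig).^2, 2);

    %# Features ranked from largest to smallest score
    [~, idx] = sort(scores, 'descend');
    top3 = idx(1:3)';
    ```

    With the posted numbers this gives features 3, 6 and 2, which are Mean of Red, Mean of Hue and Std. dev. of Green if the columns follow the order listed in the question. Since the features are not standardized, those with the largest raw variance tend to dominate, so treat the ranking with some caution.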
    

    You can check the paper for more details.

    Keep in mind, though, that a PCA-based technique pays off when you have a very large number of dimensions. In your case, the search space is very small. My advice is to search the space exhaustively and pick the best selection, as @amit recommends.
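
    With 8 features there are only nchoosek(8,3) = 56 triples, so the exhaustive search is cheap. A minimal sketch, assuming synthetic data and a placeholder criterion (total variance of the selected columns, chosen only for illustration; substitute whatever clustering quality measure you actually care about):

    ```matlab
    %# Synthetic data (an assumption, not the asker's features)
    rng(0);
    X = randn(179, 8) * diag([5 3 2 1 1 0.5 0.2 0.1]);

    %# Every triple of column indices, one per row
    subsets = nchoosek(1:size(X, 2), 3);

    bestScore = -Inf;
    best = [];
    for k = 1:size(subsets, 1)
        cols = subsets(k, :);
        score = sum(var(X(:, cols)));   %# placeholder criterion
        if score > bestScore
            bestScore = score;
            best = cols;
        end
    end
    ```

    After the loop, best holds the triple with the highest score under the chosen criterion.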
