The Archive Base

Editorial Team
Asked: May 31, 2026

I am a beginner in scikits and svm and I would like to check

I am a beginner with scikit-learn and SVMs, and I would like to check a couple of things. I have 700 samples with 35 features each, belonging to 3 classes. My array X holds the samples and features, scaled using "preprocessing.scale(X)".
The first step is to find suitable SVM parameters, and I am using grid search with nested cross-validation (see http://scikit-learn.org/stable/auto_examples/grid_search_digits.html#).
I pass all my samples (X) to the grid search. During the grid search, the data is split into training and testing folds (using StratifiedKFold).
Once I have the SVM parameters, I perform the classification, dividing my data into training and testing sets.
Is it OK to use the same data in the grid search that I will be using during the real classification?

1 Answer

  1. Editorial Team
    Added an answer on May 31, 2026 at 4:09 am

    Is it ok to use the same data in the grid search that I will be using during the real classification?

    It is OK to use this data for training (fitting) a classifier. Cross-validation, as done by StratifiedKFold, is intended for situations where you don't have enough data to hold out a separate validation set while optimizing the hyperparameters (the algorithm settings). You can also use it if you're too lazy to write a validation set splitter and want to rely on scikit-learn's built-in cross-validation 🙂
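To make the role of StratifiedKFold concrete, here is a minimal sketch of how it splits labelled data into folds while keeping the class proportions roughly constant in each validation fold. The class sizes (350, 210, 140) and the all-zero feature matrix are assumptions chosen for illustration only; they are not taken from the question.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical labels: 700 samples in 3 classes (sizes chosen for illustration).
y = np.repeat([0, 1, 2], [350, 210, 140])
X = np.zeros((y.size, 35))  # placeholder features; only the shape matters here

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_counts = []
for train_idx, val_idx in skf.split(X, y):
    # Count how many samples of each class land in this validation fold.
    counts = np.bincount(y[val_idx], minlength=3)
    fold_counts.append(counts)
    print(len(train_idx), len(val_idx), counts)
```

Every sample appears in exactly one validation fold, so all of the data contributes to both fitting and validation across the five rounds.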

    The refit option of GridSearchCV retrains the estimator on the full training set, using the optimal settings found with cross-validation.
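A short sketch of what refit does in practice, assuming synthetic data from make_classification in place of the real samples and a small, purely illustrative parameter grid. With refit=True (the default), the fitted search object exposes the retrained model as best_estimator_ and can be used for prediction directly.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Synthetic stand-in for the data: 700 samples, 35 features, 3 classes.
X, y = make_classification(n_samples=700, n_features=35, n_informative=10,
                           n_classes=3, random_state=0)

# Illustrative grid; the values are assumptions, not recommendations.
param_grid = {"C": [1, 10], "gamma": [0.001, 0.01]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid,
                      cv=StratifiedKFold(n_splits=5), refit=True)
search.fit(X, y)

print(search.best_params_)

# After the search, the best setting has been refit on everything passed to fit().
best_svm = search.best_estimator_
predictions = search.predict(X[:5])  # delegates to best_estimator_
```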

    It is, however, pointless to apply a trained classifier to the data you grid searched or trained on, since you already have the labels. If you want to do a formal evaluation of a classifier, you should hold out a test set from the very beginning and not touch it again until you've done all your grid searching, validation and fitting.
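The hold-out workflow described above could look roughly like this. It is a sketch under stated assumptions: the data is synthetic, the 25% test size and the grid values are arbitrary, and wrapping the scaler in a pipeline (so that it is fit on training folds only) is a design choice of this example rather than something the question or answer prescribes.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the data: 700 samples, 35 features, 3 classes.
X, y = make_classification(n_samples=700, n_features=35, n_informative=10,
                           n_classes=3, random_state=0)

# 1. Hold out a test set at the very beginning (the 25% size is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# 2. Grid search with cross-validation on the training portion only.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
param_grid = {"svc__C": [1, 10, 100], "svc__gamma": [0.001, 0.01]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# 3. Touch the test set once, after all tuning and fitting is finished.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```

The cross-validated score from the search (search.best_score_) is used to choose the parameters, while test_accuracy is the figure to report as the formal evaluation.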

© 2021 The Archive Base. All Rights Reserved