JCR (and Jackrabbit) has a node locking mechanism that may…

Question


Asked: May 11, 2026


I’ve got a classification problem on my hands that I’d like to address with a machine learning algorithm (probably Bayesian or Markovian; the question is independent of the classifier used). Given a number of training instances, I’m looking for a way to measure the performance of an implemented classifier while taking the overfitting problem into account.

That is: given N[1..100] training samples, if I run the training algorithm on every one of the samples and then use those very same samples to measure fitness, I may run into overfitting: the classifier will know the exact answers for the training instances without having much predictive power, which renders the fitness results useless.

An obvious solution would be separating the hand-tagged samples into training and test samples, and I’d like to learn about methods for selecting statistically significant samples for training.

White papers, book pointers, and PDFs much appreciated!


1 Answer

score 0 · Answer 1 · 2026-05-11T00:13:45+00:00


You could use 10-fold cross-validation for this. I believe it’s a pretty standard approach for evaluating the performance of classification algorithms.

The basic idea is to divide your labeled samples into 10 subsets. Use one subset as test data and the other nine as training data. Repeat this so that each subset serves as the test set exactly once, then average the performance over the 10 runs.
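The procedure described above could be sketched roughly as follows, in plain Python. This is only an illustration under assumptions: `k_fold_indices`, `cross_validate`, `train_fn` and `predict_fn` are hypothetical names made up here, accuracy is assumed as the performance measure, and the majority-class "classifier" is just a stand-in for whatever Bayesian or Markovian model you actually train.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k near-equal folds."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validate(samples, labels, train_fn, predict_fn, k=10, seed=0):
    """Return the mean accuracy over k folds.

    Assumed (hypothetical) interfaces:
      train_fn(train_samples, train_labels) -> model
      predict_fn(model, sample) -> predicted label
    """
    accuracies = []
    for test_idx in k_fold_indices(len(samples), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        # Train only on the samples outside the current fold.
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        # Score only on the held-out fold, which the model has never seen.
        correct = sum(predict_fn(model, samples[i]) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Stand-in classifier (an assumption for illustration): always predicts
# the most frequent label seen during training.
def train_majority(train_samples, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, sample):
    return model
```

You would call it as `cross_validate(samples, labels, train_majority, predict_majority)`, swapping in your own training and prediction functions. Because every sample is used for testing exactly once and never by a model that was trained on it, the averaged score is a less optimistic estimate than one measured on the training data itself.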
