Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 7739723
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 1, 20262026-06-01T08:35:36+00:00 2026-06-01T08:35:36+00:00

I am doing research on data mining and more precisely, decision trees. I would

  • 0

I am doing research on data mining and more precisely, decision trees.

I would like to know if there are multiple algorithms to build a decision trees (or just one?), and which is better, based on criteria such as

  • Performance
  • Complexity
  • Errors in decision making
  • and more.
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-01T08:35:37+00:00Added an answer on June 1, 2026 at 8:35 am

    Decision Tree implementations differ primarily along these axes:

    • the splitting criterion (i.e., how “variance” is calculated)

    • whether it builds models for regression (continuous variables, e.g., a
      score) as well as classification (discrete variables, e.g., a class
      label)

    • technique to eliminate/reduce over-fitting

    • whether it can handle incomplete data

    The major Decision Tree implementations are:

    • ID3, or Iterative Dichotomizer, was the first of three Decision Tree
      implementations developed by Ross Quinlan (Quinlan, J. R. 1986. Induction of Decision Trees. Mach. Learn. 1, 1 (Mar. 1986), 81-106.)

    • CART, or Classification And Regression Trees is often used as a generic
      acronym for the term Decision Tree, though it apparently has a more specific meaning. In sum, the CART implementation is very similar to C4.5; the one notable difference is that CART constructs the tree based on a numerical splitting criterion recursively applied to the data, whereas C4.5 includes the intermediate step of constructing rule sets.

    • C4.5, Quinlan’s next iteration. The new features (versus ID3) are:
      (i) accepts both continuous and discrete features; (ii) handles
      incomplete data points; (iii) solves over-fitting problem by (very
      clever) bottom-up technique usually known as “pruning”; and (iv)
      different weights can be applied the features that comprise the
      training data. Of these, the first three are very important–and I would suggest that any DT implementation you choose has all three. The fourth (differential weighting) is much less important

    • C5.0, the most recent Quinlan iteration. This implementation is
      covered by a patent and probably, as a result, is rarely implemented
      (outside of commercial software packages). I have never coded a C5.0
      implementation myself (I have never even seen the source code) so I can’t offer an informed comparison of C5.0 versus C4.5. I have always
      been skeptical about the improvements claimed by its inventor (Ross
      Quinlan)–for instance, he claims it is “several orders of magnitude”
      faster than C4.5. Other claims are similarly broad (“significantly more memory efficient”) and so forth. I’ll just point you to studies
      which report the result of comparison of the two techniques and you can decide for yourself.

    • CHAID (chi-square automatic interaction detector) actually predates
      the original ID3 implementation by about six years (published in a
      Ph.D. thesis by Gordon Kass in 1980). I know every little about this technique.The R Platform has a Package called CHAID which
      includes excellent documentation

    • MARS (multi-adaptive regression splines) is actually a term trademarked by the original inventor of MARS, Salford Systems. As a
      result, MARS clones in libraries not sold by Salford are named something other than MARS–e.g., in R, the relevant function is polymars in the poly-spline library. Matlab and Statistica also have
      implementations with MARS-functionality

    I would recommend CART or C4.5 (though again, I have no direct experience with C5.0 or with CHAID, though I am familiar with their feature sets).

    C4.5 is the Decision Tree flavor implemented in Orange; CART is the flavor in sklearn–both excellent implementations in excellent ML libraries.

    C4.5 is a major step beyond ID3–both in terms of range (C4.5 has a far broader use case spectrum because it can handle continuous variables in the training data) and in terms of model quality.

    Perhaps the most significant claimed improvement of C5.0 versus C4.5 is support for boosted trees. Ensemble support for DTs–boosted trees and Random Forests–has been included in the DT implementation in Orange; here, ensemble support was added to a C4.5 algorithm. sklearn also features a range of random forest and boosting methods.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

i am doing some research about entity data model in asp.net and there is
I am doing some research about how to access data in a UniData database
I am doing some research into methods of comparing time series data. One of
Many data mining algorithms/strategies use vector representation of data records in order to simulate
I am beginner at SQL. I am doing research about managing hierarchical data by
I am doing research about dedicated I/O software that would run on consumer hardware.
I am doing some research for a future project and it requires some data
I am doing a research based on text processing and mining. The principle is
I've just been doing some research into sql data services, and found out that
Doing research to try and pick a direction for notification types. I'd like to

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.