The Archive Base


Editorial Team
Asked: June 3, 2026

Problem Statement is somewhat like this: Given a website, we have to classify it


Problem Statement is somewhat like this:

Given a website, we have to classify it into one of two predefined classes (say, whether it is an e-commerce website or not).

We have already tried Naive Bayes for this, with several preprocessing techniques (stop-word removal, stemming, etc.) and carefully chosen features.

We want to raise the accuracy to 90% or close to it, which this approach is not giving us.

The issue is that when we evaluate accuracy manually, we look for a few identifiers on the web page (e.g. a Checkout button, Shop/Shopping, PayPal, and many more), which our algorithms sometimes miss.

We were thinking: if we are so sure of these identifiers, why not create a rule-based classifier that classifies a page according to a set of rules (applied in some priority order)?

e.g. if a page contains shop/shopping and has a checkout button, then it's an e-commerce page.
And many similar rules, in some priority order.
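
A minimal sketch of what such a priority-ordered rule set could look like in Python. The rule names, keywords, and the `has_all` helper are hypothetical and only illustrate the idea; matching here is plain substring search on the page text, which is an assumption and not what a real crawler would necessarily use.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Rule:
    name: str
    priority: int                 # lower number = checked first
    test: Callable[[str], bool]   # Boolean test on the page text
    label: str                    # class assigned when the test fires


def has_all(*terms: str) -> Callable[[str], bool]:
    """Build a test that is True when every term occurs in the page text."""
    def test(text: str) -> bool:
        lowered = text.lower()
        return all(term in lowered for term in terms)
    return test


# Hypothetical rules, for illustration only.
RULES: List[Rule] = [
    Rule("shop+checkout", 1, has_all("shop", "checkout"), "ecommerce"),
    Rule("paypal+cart", 2, has_all("paypal", "add to cart"), "ecommerce"),
]


def classify(text: str, rules: List[Rule] = RULES) -> Tuple[Optional[str], Optional[str]]:
    """Return (label, rule name) for the first rule that fires, else (None, None)."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.test(text):
            return rule.label, rule.name
    return None, None  # undecided: could fall back to the Naive Bayes model


print(classify("Shop now and proceed to Checkout"))
print(classify("Read our blog"))
```

Returning `None` when no rule fires leaves room to hand undecided pages to the existing statistical classifier. Note that substring matching also fires on words such as "workshop", which is one of the brittleness issues a hand-written rule set tends to run into.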

Depending on a few of these rules, we will also visit other pages of the website (currently we visit only the home page, which is another reason we are not getting very high accuracy).

What potential issues will we face with a rule-based approach? Or would it be the better choice for our use case?

Would it be a good idea to create those rules with more sophisticated algorithms (e.g. FOIL, AQ, etc.)?


1 Answer

  1. Editorial Team
    Added an answer on June 3, 2026 at 1:22 pm

    A decision tree algorithm can take your labeled data and return a rule set for predicting unlabeled instances.

    In fact, a decision tree is really just a recursive partitioner composed of a set of rules: each rule sits at a node in the tree, and applying that rule to an unlabeled data instance sends the instance down either the left fork or the right fork.

    Many decision tree implementations explicitly generate a rule set, but this isn't necessary, because the rules (both what each rule is and where it sits in the decision flow) are easy to read off the tree that represents the trained classifier.
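
As a rough illustration of reading rules off a trained tree, here is a small pure-Python sketch. It is not C4.5 or any library's implementation: it assumes Boolean features only, uses Gini impurity as the split criterion, and the feature names and toy data are made up for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_feature(rows, labels, features):
    """Pick the Boolean feature whose split lowers the weighted impurity most."""
    best, best_score = None, gini(labels)
    for f in features:
        false_side = [y for r, y in zip(rows, labels) if not r[f]]
        true_side = [y for r, y in zip(rows, labels) if r[f]]
        if not false_side or not true_side:
            continue  # this feature does not separate anything
        score = (len(false_side) * gini(false_side)
                 + len(true_side) * gini(true_side)) / len(labels)
        if score < best_score:
            best, best_score = f, score
    return best


def build(rows, labels, features):
    """Recursively partition the data; leaves keep the label distribution."""
    f = best_feature(rows, labels, features)
    if f is None:
        return {"leaf": Counter(labels)}
    rest = [g for g in features if g != f]
    split = {False: ([], []), True: ([], [])}
    for r, y in zip(rows, labels):
        split[bool(r[f])][0].append(r)
        split[bool(r[f])][1].append(y)
    return {"feature": f,
            "left": build(*split[False], rest),    # test answered False
            "right": build(*split[True], rest)}    # test answered True


def rules(node, path=()):
    """Turn every root-to-leaf path into one IF ... THEN ... rule."""
    if "leaf" in node:
        label = node["leaf"].most_common(1)[0][0]
        return [f"IF {' AND '.join(path) or 'TRUE'} THEN {label}"]
    f = node["feature"]
    return (rules(node["left"], path + (f"NOT {f}",))
            + rules(node["right"], path + (f,)))


# Toy, made-up training data.
rows = [
    {"has_checkout": True, "mentions_shop": True},
    {"has_checkout": True, "mentions_shop": False},
    {"has_checkout": False, "mentions_shop": True},
    {"has_checkout": False, "mentions_shop": False},
]
labels = ["ecommerce", "ecommerce", "other", "other"]

tree = build(rows, labels, ["has_checkout", "mentions_shop"])
for line in rules(tree):
    print(line)
```

On this toy data the learner splits only on `has_checkout`, so the extracted rule set has two rules. A real implementation would also handle numeric and multi-valued features, pruning, and ties.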

    In particular, each rule is just a Boolean test for a particular value in a particular feature (data column or field).

    For instance, suppose one of the features in each data row describes the type of Application Cache; further suppose that this feature has three possible values: memcache, redis, and custom. Then a rule might be Application Cache | redis, that is: does this data instance have an Application Cache based on redis?

    The rules extracted from a decision tree are Boolean: either true or false. By convention, False is represented by the left edge (the link to the child node below and to the left of the parent node), and True is represented by the right edge.

    Hence, a new (unlabeled) data row begins at the root node, then is sent down either the right or the left side depending on whether the rule at the root node is answered True or False. The next rule is then applied (at the next level in the tree hierarchy), and so on until the data instance reaches the lowest level (a node with no rule, i.e. a leaf node).

    Once the data point has been filtered down to a leaf node, it is in essence classified, because each leaf node has a distribution of training data instances associated with it (e.g., 25% Good | 75% Bad, if Good and Bad are the class labels). This empirical distribution (which in the ideal case consists of data instances having just one class label) determines the unknown data instance's estimated class label.
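
The traversal described above can be sketched in a few lines of Python. The tree below is hand-built and hypothetical (the `has_cdn` feature and all counts are invented); it only shows the False-goes-left, True-goes-right routing and how a leaf's training distribution yields a label.

```python
from collections import Counter

# Hypothetical hand-built tree. Each internal node holds one Boolean test
# (feature, value); each leaf holds the class counts of the training rows
# that ended up there.
TREE = {
    "rule": ("app_cache", "redis"),                 # is app_cache == "redis"?
    "left": {"dist": Counter(Good=1, Bad=3)},       # False branch: 25% Good | 75% Bad
    "right": {                                      # True branch
        "rule": ("has_cdn", True),
        "left": {"dist": Counter(Good=1, Bad=2)},
        "right": {"dist": Counter(Good=4, Bad=0)},
    },
}


def classify(row, node=TREE):
    """Route a row from the root to a leaf, then read the leaf's distribution."""
    while "rule" in node:
        feature, value = node["rule"]
        # True -> right child, False -> left child.
        node = node["right"] if row.get(feature) == value else node["left"]
    dist = node["dist"]
    label, count = dist.most_common(1)[0]   # majority class at this leaf
    return label, count / sum(dist.values())


print(classify({"app_cache": "memcache"}))
print(classify({"app_cache": "redis", "has_cdn": True}))
```

The second value returned is the share of training instances at the leaf that carry the predicted label, which can serve as a rough confidence estimate.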

    The free and open-source library Orange has a decision tree module (implementations of specific ML techniques are referred to as "widgets" in Orange), which seems to be a solid implementation of C4.5, probably the most widely used and perhaps the best decision tree algorithm.

    An O’Reilly Site has a tutorial on decision tree construction and use, including source code for a working decision tree module in python.

