Editorial Team
Asked: May 26, 2026


I am writing a classifier that categorizes whether a special deal is for a restaurant, a hotel, etc. It is part of a web crawler that analyzes external sites.
To start, I wrote a meal?() method, which accepts a piece of text and returns true if it thinks the text is about a meal deal. It can't be 100% accurate, since it uses only simple keyword matching.

def meal?(text)
  # "..." stands for the rest of the keyword list, which is elided here
  text.match?(/restaurant|meal|wine|.../i)
end

Now I am writing a test for it, and I have two questions. First, it seems redundant to re-list all of these keywords in the unit test. What do you think?

The second question:
I have an .html file in source control that is used to test the crawler's parsing functionality. In theory, every item in it should pass, so I am thinking of reusing that file in this categorizing test: parse the HTML and feed the description of each deal into this method.

One drawback is that the .html file is taken from an external site. When that site changes its layout, I will update the .html file, and then I will have to change this categorizing test too. But I think that is okay.

Is this recommended? I thought of this approach because I feel uneasy about extracting information from that .html file and placing it in the test script itself (it is not DRY, and it makes the test script quite big). Would feeding in the parsed descriptions violate any fundamental testing principles, such as 'this hides necessary details from developers' or 'this is bad for generating reports'?
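For concreteness, the fixture-driven test described above might look roughly like the following Minitest sketch. Everything beyond meal?() is an assumption made for illustration: deal_descriptions is a hypothetical stand-in for the crawler's real parser, the <p class="description"> markup is an invented layout, the keyword list is cut down to the three keywords shown above, and the fixture is inlined as a string instead of being read from the .html file.

```ruby
require "minitest/autorun"

# Stand-in for the classifier from the question, limited to the three
# keywords that are visible there.
def meal?(text)
  text.match?(/restaurant|meal|wine/i)
end

# Hypothetical stand-in for the crawler's real parser. It assumes each deal
# description sits in <p class="description">...</p>, which is an invented
# layout for this sketch.
def deal_descriptions(html)
  html.scan(%r{<p class="description">(.*?)</p>}m).flatten.map(&:strip)
end

# In the real suite this would be read from the .html fixture in source
# control; it is inlined here so the sketch runs on its own.
FIXTURE_HTML = <<~HTML
  <div class="deal"><p class="description">Three-course meal for two</p></div>
  <div class="deal"><p class="description">Wine tasting evening</p></div>
  <div class="deal"><p class="description">Restaurant voucher, 20% off</p></div>
HTML

class MealFixtureTest < Minitest::Test
  def test_every_fixture_description_is_a_meal_deal
    descriptions = deal_descriptions(FIXTURE_HTML)
    refute_empty descriptions, "fixture produced no descriptions"

    descriptions.each do |description|
      # The failure message names the offending description, so a report
      # still shows which item broke even though the data lives elsewhere.
      assert meal?(description), "expected a meal deal: #{description.inspect}"
    end
  end
end
```

Putting the description in the assertion message is one way to soften the "hidden details" concern: a failing run prints the exact text that was misclassified.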

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 12:19 pm

    OK, I obviously misunderstood the question, so I have revised this answer completely.

    I personally think it is simpler and preferable to copy the actual text from the HTML file and paste it into the test, rather than add the indirection of loading an HTML file. I can find two reasons:

    • When I write or read unit tests, I prefer to have all the information right in front of me instead of in an external source, such as a resource file that I have to dig for. This is personal preference, though.
    • Loading the file is a bit confusing, because the method can be used for other things as well, not just for classifying text read from an HTML file. To keep the test generic, I would use raw text in the test itself.

    However, I cannot find a reason why what you are trying to do would be really bad. I think it boils down to personal preference.
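As a rough illustration of the raw-text approach, a Minitest sketch could look like this. The sample strings are invented for the example rather than taken from the real .html file, and the keyword list is limited to the three keywords shown in the question, so treat it as a sketch under those assumptions.

```ruby
require "minitest/autorun"

# Stand-in for the classifier from the question, limited to the three
# keywords that are visible there.
def meal?(text)
  text.match?(/restaurant|meal|wine/i)
end

class MealInlineTest < Minitest::Test
  # Descriptions pasted straight into the test. These strings are invented
  # for the sketch, not copied from the real fixture.
  MEAL_DEALS = [
    "Three-course meal for two",
    "WINE tasting evening"
  ].freeze

  OTHER_DEALS = [
    "Two nights for the price of one",
    "Free airport transfer"
  ].freeze

  def test_meal_deals_are_recognised
    MEAL_DEALS.each do |text|
      assert meal?(text), "expected a meal deal: #{text.inspect}"
    end
  end

  def test_other_deals_are_rejected
    OTHER_DEALS.each do |text|
      refute meal?(text), "did not expect a meal deal: #{text.inspect}"
    end
  end
end
```

With the inputs visible in the test, the negative cases are as easy to read as the positive ones, which a fixture whose items should all pass does not exercise.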

© 2021 The Archive Base. All Rights Reserved