The Archive Base


Editorial Team
Asked: May 10, 2026


When entering a question, Stack Overflow presents you with a list of questions that it thinks are likely to cover the same topic. I have seen similar features on other sites and in other programs, too (help file systems, for example), but I’ve never programmed something like this myself. Now I’m curious to know what sort of algorithm one would use for that.

The first approach that comes to my mind is splitting the phrase into words and looking for phrases containing these words. Before you do that, you probably want to throw away insignificant words (like ‘the’, ‘a’, ‘does’ etc.), and then you will want to rank the results.

Hey, wait – let’s do that for web pages, and then we can have a … watchamacallit … – a ‘search engine’, and then we can sell ads, and then …

No, seriously, what are the common ways to solve this problem?


1 Answer

  1. Added an answer on May 10, 2026 at 3:26 pm

    One approach is the so-called bag-of-words model.

    As you guessed, first you count how many times each word appears in the text (usually called a document in NLP lingo). Then you throw out the so-called stop words, such as ‘the’, ‘a’, ‘or’ and so on.
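The counting step could look roughly like the sketch below. The stop-word list, the tokenizing regex, and the name `bag_of_words` are assumptions made for illustration; a real system would use a much longer stop-word list and probably a proper tokenizer.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "or", "and", "does", "is", "of", "to", "in"}

def bag_of_words(text):
    """Return a Counter mapping each non-stop word to how often it occurs."""
    # Lowercase the text and pull out runs of letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)

bag = bag_of_words("The aardvark and the apple: does an aardvark eat apples?")
print(bag)
```

Note that 'apple' and 'apples' are counted as different words here; merging such forms (stemming) is one of the refinements this sketch leaves out.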

    You’re left with words and word counts. Do this for enough documents and you get a comprehensive set of the words that appear in them. You can then create an index for these words: ‘aardvark’ is 1, ‘apple’ is 2, …, ‘z-index’ is 70092.

    Now you can take your word bags and turn them into vectors. For example, if your document mentions ‘aardvark’ twice and contains nothing else, its vector would look like this:

    [2 0 0 ... 70k zeroes ... 0]. 
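As a rough sketch of the indexing and vector steps, assuming the word counts are already available as `Counter` objects: the helper names `build_index` and `to_vector` are made up for this example, and positions are 0-based here rather than starting at 1.

```python
from collections import Counter

def build_index(bags):
    """Give every distinct word a position, in alphabetical order (0-based)."""
    vocabulary = sorted(set().union(*bags))
    return {word: i for i, word in enumerate(vocabulary)}

def to_vector(bag, index):
    """Turn a bag of words into a dense count vector over the whole vocabulary."""
    vec = [0] * len(index)
    for word, count in bag.items():
        if word in index:  # words outside the vocabulary are ignored
            vec[index[word]] = count
    return vec

# Two toy documents, already reduced to word counts.
bags = [Counter({"aardvark": 2}), Counter({"apple": 1, "z-index": 3})]
index = build_index(bags)
vectors = [to_vector(b, index) for b in bags]
print(index)
print(vectors)
```

With a vocabulary of tens of thousands of words, almost every entry is zero, so in practice the vectors are usually kept in a sparse form (such as the `Counter` itself) rather than as full lists.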

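    After this you can measure the ‘angle’ between two such vectors: their dot product divided by the product of their lengths gives the cosine of the angle. The smaller the angle (the closer the cosine is to 1), the closer the documents are.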
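A minimal sketch of that comparison, working directly on sparse word counts instead of 70k-element lists (the result is the same, since missing words count as zero). The function names and the toy documents are assumptions for illustration only.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors stored as Counters."""
    # A Counter returns 0 for missing words, so only shared words contribute.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)

def rank(query, documents):
    """Return (score, title) pairs, most similar document first."""
    scored = [(cosine_similarity(query, doc), title)
              for title, doc in documents.items()]
    return sorted(scored, reverse=True)

# Hypothetical existing questions, already reduced to word counts.
documents = {
    "What do aardvarks eat?": Counter({"aardvark": 2, "eat": 1}),
    "How does CSS z-index work?": Counter({"z-index": 2, "css": 1}),
}
query = Counter({"aardvark": 1, "diet": 1})
ranked = rank(query, documents)
print(ranked)
```

A cosine of 1 means the two documents use the same words in the same proportions, and 0 means they share no words at all.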

    This is a simple version, and there are other, more advanced techniques. May the Wikipedia be with you.



© 2021 The Archive Base. All Rights Reserved