The Archive Base


Editorial Team
Asked: May 13, 2026

Do you guys know where I can find a search engine parser design diagram?

I need to understand how it processes user input: which functions and algorithms are used, under which conditions, and so on.

It doesn’t have to be Google’s.

Update: changed the question to ask about a search engine parser.


1 Answer

  1. Editorial Team
    Added an answer on May 13, 2026 at 9:25 am

    You need a better understanding of search engines first. There are normally four parts:

    1) A web crawler, something that gets the documents you want to add to your search data space. This is usually totally outside the scope of what you call a “search engine”.

    2) A parser, which takes each document and splits it into indexable text fragments. It usually works with different file formats and human languages, and it preprocesses the text into perhaps some fixed records plus flowing text. Linguistic algorithms (like stemmers; search for the Porter stemmer to get a simple one) are also applied here.

    3) An indexer, which might be as simple as an inverted list (for each word, the documents that contain it) or as complex as you want if you try to be as clever as Google. Building an index is the really magic part of a successful search engine. Usually multiple ranking algorithms are combined.

    4) The frontend, with an optional query language. This is where Google is really bad, but as you can see from Google's success, it might not be so important for 98% of people. But I really miss this.
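To make parts (2) and (3) concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: `crude_stem` is a hypothetical toy suffix stripper standing in for a real stemmer such as Porter's, and the three sample documents are made up. No real search engine is claimed to work exactly this way.

```python
import re
from collections import defaultdict

def crude_stem(word):
    """Strip a few common English suffixes (toy stand-in for the Porter stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def parse(text):
    """Part (2): split raw text into lowercase, stemmed index terms."""
    return [crude_stem(tok) for tok in re.findall(r"[a-z0-9]+", text.lower())]

def build_index(docs):
    """Part (3): map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in parse(text):
            index[term].add(doc_id)
    return index

# Made-up sample documents, keyed by document id.
docs = {
    1: "Crawlers fetch documents",
    2: "The parser splits documents into terms",
    3: "Indexing terms makes searching fast",
}
index = build_index(docs)
print(sorted(index["document"]))    # [1, 2]
print(parse("Searching parsers"))   # ['search', 'parser']
```

The user's query string would normally go through the same `parse` function as the documents, so that "Searching" in a query matches "searching" or "search" in the index.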

    I think you are asking about (3), the indexer. Basically, there are two different kinds of algorithms you find in classic information retrieval literature: the vector space model and Boolean search. The latter is easy: just check whether the search words are inside the document and return a boolean value. Each search term can be given a relevance probability, and for different search terms you can use Bayesian probability to combine the relevance and return the highest-ranked documents. The vector model treats a document as a vector of all its words; you can build a scalar (dot) product between documents to judge whether they are close together. This is a much more complex theory. The father of IR (information retrieval) was Gerard Salton; you will find a lot of literature under his name.
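The two classic approaches can be sketched in a few lines of Python. This is a simplified illustration, not a reference implementation: the sample documents are made up, terms are split on whitespace only, and the vector model uses raw term counts with cosine similarity (the normalized dot product) rather than any particular weighting scheme from the literature.

```python
import math
from collections import Counter

# Made-up sample documents, keyed by document id.
docs = {
    1: "search engine parser design",
    2: "parser for user input",
    3: "engine oil change",
}

def terms(text):
    """Split text into lowercase terms on whitespace."""
    return text.lower().split()

def boolean_and(query, docs):
    """Boolean search: ids of documents that contain every query term."""
    wanted = set(terms(query))
    return [doc_id for doc_id, text in docs.items() if wanted <= set(terms(text))]

def cosine(a, b):
    """Vector model: cosine of the angle between two term-count vectors."""
    va, vb = Counter(terms(a)), Counter(terms(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0

def rank(query, docs):
    """Order document ids by descending cosine similarity to the query."""
    return sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)

print(boolean_and("parser engine", docs))  # [1]
print(rank("parser input", docs))          # [2, 1, 3]
```

Boolean search gives a yes/no answer per document, while the vector model gives a score that can be sorted, which is why it is the one used for ranking.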

    This was the state of the art in IR until 1999 (I wrote my diploma thesis about a Usenet news search engine in 1998). Then Google came and all the theory went into the trashcan of academic stupidity and practical irrelevance.

    Google was not built on mainstream IR theory. Read about it in the link that Srirangan gave you. It's just an ad hoc relevance function built on many, many different sources. You will not find anything in this area besides white-paper marketing blah-blah. These algorithms are the business secret and capital of the search engine companies.

    For simple search engines, look at the Lucene library or at dtSearch, which was always my choice for an embeddable search engine library.

    There is not really a lot of example code or available information in the open source world about IR technology. Most projects, like Lucene, just implement the most primitive operations. You have to buy books and go to a university library to get access to research literature.

    As literature, I would recommend starting with this book: link text



© 2021 The Archive Base. All Rights Reserved