The Archive Base

Editorial Team
Asked: May 23, 2026

I am confused between these terms. They somehow look the same to me.
Can someone please explain the order in which these steps are performed and which libraries can do the work? To me, they all look the same.

I want to know the input and the output of each step, e.g.

Crawling
Input = URL
Output = ?

Indexing
Input = ?

1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 6:29 am

    I’ll give you a general, algorithmic description; adapt it to whichever Python libraries you use.

    Crawling: starts from a set of URLs and tries to grow that set. The crawler follows outgoing links and expands the graph as far as it can, until it has covered the part of the web graph connected to the initial set of URLs or until its resources (usually time) run out.
    So:
    input = set of URLs

    output = a bigger set of URLs that are reachable from the input
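
    The crawling step above can be sketched as a breadth-first traversal of the link graph. This is only a minimal illustration using the standard library: the `crawl` function, the `fetch` callable, the `max_pages` limit, and the `TOY_WEB` pages are all assumptions made up for the example (a real crawler would fetch over HTTP and respect robots.txt). Note that the sketch also keeps the fetched page content, since the indexing step needs it.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl: returns {url: html} for every page reached.

    `fetch` is a hypothetical callable mapping a URL to its HTML (or None
    if the page cannot be retrieved); `max_pages` stands in for the
    resource limit mentioned in the answer.
    """
    queue = deque(seed_urls)
    seen = set(seed_urls)
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return pages


# Toy in-memory "web" so the sketch runs without network access.
TOY_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> python crawler',
    "http://example.test/b": '<a href="/">home</a> python index',
}

pages = crawl(["http://example.test/"], TOY_WEB.get)
print(sorted(pages))
```

    Starting from a single seed URL, the crawl reaches all three pages of the toy web, which is the "bigger set of URLs" described above.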

    Indexing: uses the data the crawlers gathered to “index” the documents. An index is a list that maps each term (usually a word) in the collection to the documents in which that term appears.

    input: the set of crawled URLs, together with the page content fetched for them

    output: index file/library.
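
    The term-to-documents mapping described above is usually called an inverted index. A minimal sketch, assuming the documents are already available as plain text keyed by URL; `build_index`, the simple `\w+` tokenizer, and the sample `docs` are illustrative choices, not what any particular library does:

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each term to the set of document ids (here: URLs) containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        # Lowercase and split on non-word characters: a very naive tokenizer.
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(doc_id)
    return index


# Hypothetical crawled documents, keyed by URL.
docs = {
    "http://example.test/a": "Python crawler tutorial",
    "http://example.test/b": "Python index tutorial",
}
index = build_index(docs)
print(sorted(index["python"]))
```

    Here the term "python" maps to both URLs, while "crawler" maps only to the first one.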

    Search: uses the index to find documents relevant to a given query.

    input: a query (string) and the index (usually an implicit argument, since it is part of the state)

    output: the documents relevant to the query (a document here is a web page that was crawled)
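
    The search step can be illustrated with a simple boolean AND lookup over such an index. This is a sketch under assumptions: the `search` function and the hand-written `index` dictionary are hypothetical, and a real search engine would also rank the results by relevance rather than return an unordered set.

```python
import re


def search(query, index):
    """Return the documents that contain every term of the query (boolean AND)."""
    terms = re.findall(r"\w+", query.lower())
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical index of the kind produced by the indexing step.
index = {
    "python": {"http://example.test/a", "http://example.test/b"},
    "crawler": {"http://example.test/a"},
    "index": {"http://example.test/b"},
}
print(search("python crawler", index))
```

    The query "python crawler" matches only the first document, because it is the only one containing both terms.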

    I encourage you to have a look at PyLucene, which handles the indexing and search steps (and more); crawling is normally done by a separate tool. It is also worth reading more about Information Retrieval.

© 2021 The Archive Base. All Rights Reserved