The Archive Base

Editorial Team
Asked: June 12, 2026

I built a web crawler but it is single threaded. Now I am extending


I built a web crawler, but it is single-threaded. Now I am extending it to work with multiple threads. I do not understand the following:

  1. How many threads should I create? Should it be a fixed number or a dynamic one changing according to the length of the Queue holding the URIs? (Taking into consideration the available memory also)
  2. I have created a new class for the thread through the Runnable Interface and I want each thread’s run method to access an object I created in my Main class which is calling thread.start(). How should I access this object from each thread?

I am using NetBeans.

1 Answer

  1. Editorial Team
    Added an answer on June 12, 2026 at 8:04 am

    You’re definitely going to want concurrency with a web crawler 🙂

    You’re also probably going to want to set up a thread pool so that you can reuse threads rather than pay the cost of creating a new thread for each task.

    The two common thread pool options are a fixed thread pool and a cached thread pool. The benefits of each are explained in detail in the Java Concurrency Tutorial. The big drawback of the cached thread pool is that it places no limit on how many threads it creates; if a very large number of tasks are submitted at once, you might see significant performance degradation or timeouts (if you have a socket timeout defined).

    In either case, the usual way to set up a thread pool is through java.util.concurrent.Executors.

    It’s just a matter of creating an ExecutorService by calling one of the following:

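    // Pick ONE of these; declaring both in the same scope will not compile.
    ExecutorService threadPool = Executors.newCachedThreadPool();       // unbounded, grows on demand
    // ExecutorService threadPool = Executors.newFixedThreadPool(500);  // at most 500 threads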
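
The question of how many threads to create has no single answer. As a rough sketch only, the snippet below sizes a fixed pool from the number of available cores. The `poolSize` helper, the factor of 8 threads per core, and the cap of 500 are all assumptions made for illustration, not recommendations from the answer; crawler threads spend most of their time waiting on the network, so the right factor is best found by measuring.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolSizeSketch {
    // Assumed heuristic (hypothetical helper): several threads per core because
    // crawler threads mostly block on network I/O, clamped to the range [1, maxThreads].
    static int poolSize(int cores, int threadsPerCore, int maxThreads) {
        return Math.max(1, Math.min(cores * threadsPerCore, maxThreads));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int size = poolSize(cores, 8, 500);
        ExecutorService threadPool = Executors.newFixedThreadPool(size);
        System.out.println("fixed pool size: " + size);
        threadPool.shutdown(); // let the JVM exit once queued work is done
    }
}
```

A fixed pool keeps the thread count constant regardless of queue length; pending URLs simply wait in the executor's queue until a thread is free.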
    

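    Once you have the thread pool, you can hand it either a single Runnable (which doesn’t return a result) or a Callable (which does) by using the submit() method. In both cases submit() returns a Future; for a Runnable, that Future only tells you when the task has finished.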
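
As a minimal sketch of the difference, the example below submits one Runnable and one Callable to the same pool. The class name `SubmitSketch` and the `fakeFetch` method are hypothetical stand-ins; `fakeFetch` just returns the length of the URL string instead of downloading anything.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SubmitSketch {
    // Hypothetical stand-in for a page fetch: returns the URL's length, no network access.
    static int fakeFetch(String url) {
        return url.length();
    }

    static int runBoth(String url) throws Exception {
        ExecutorService threadPool = Executors.newFixedThreadPool(2);
        try {
            // Runnable: produces no value; the Future only signals completion.
            Runnable visit = () -> System.out.println("visiting " + url);
            Future<?> done = threadPool.submit(visit);
            done.get(); // returns null once the task has finished

            // Callable: the Future carries the computed value.
            Callable<Integer> fetch = () -> fakeFetch(url);
            Future<Integer> result = threadPool.submit(fetch);
            return result.get(); // blocks until the value is available
        } finally {
            threadPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runBoth("http://example.com/"));
    }
}
```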

    If you have a whole collection of callables, you can also pass them to invokeAll(), which returns one Future per task:

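    // tasks is a Collection<Callable<Result>>; Result is a placeholder for your own return type.
    // Blocks until every task has finished or the timeout expires.
    List<Future<Result>> futures = threadPool.invokeAll(tasks,
                                                        timeout,
                                                        TimeUnit.MILLISECONDS);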
    

    And then get the results:

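    // Result is a placeholder for whatever type your callables return.
    for (Future<Result> f : futures) {
       // get() throws InterruptedException and ExecutionException, so catch or declare them
       someList.add(f.get());
    }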
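
Putting the two fragments together, a self-contained sketch might look like the following. The class name `InvokeAllSketch`, the pool size of 4, and the placeholder task body (which only builds a string instead of downloading a page) are assumptions for illustration. Tasks that are still running when the timeout expires are cancelled by invokeAll(), and calling get() on them throws CancellationException, which this sketch simply skips.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class InvokeAllSketch {
    // Runs one placeholder "fetch" per URL and collects whatever finished in time.
    static List<String> crawl(List<String> urls, long timeoutMillis) throws InterruptedException {
        ExecutorService threadPool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> "fetched:" + url); // placeholder for real download logic
            }
            // Futures come back in the same order as the tasks.
            List<Future<String>> futures =
                    threadPool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);

            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                try {
                    results.add(f.get());
                } catch (CancellationException | ExecutionException e) {
                    // Timed out or failed: this sketch just skips the URL.
                }
            }
            return results;
        } finally {
            threadPool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(crawl(Arrays.asList("http://example.com/a", "http://example.com/b"), 5000));
    }
}
```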
    

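    If you want multiple threads to work on the same object, pass a reference to it into each Runnable (for example, through its constructor) and make access to it thread-safe: either mark the methods that read or modify its state as synchronized, or build it from thread-safe data types such as those in java.util.concurrent.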
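
One way to share an object created in the main class with every worker is sketched below: the object is handed to each Runnable through its constructor and is built from thread-safe collections. The names `CrawlState`, `Worker`, `frontier`, and `visited` are hypothetical, and the worker only records URLs as visited instead of fetching them.

```java
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedStateSketch {
    // Hypothetical shared crawl state, built from thread-safe collections.
    static class CrawlState {
        final Queue<String> frontier = new ConcurrentLinkedQueue<>();
        final Set<String> visited = ConcurrentHashMap.newKeySet();
    }

    // Every worker receives the same CrawlState instance through its constructor.
    static class Worker implements Runnable {
        private final CrawlState state;

        Worker(CrawlState state) {
            this.state = state;
        }

        @Override
        public void run() {
            String url;
            // poll() returns null when the queue is empty, which ends this worker.
            while ((url = state.frontier.poll()) != null) {
                state.visited.add(url); // atomic add; duplicates are ignored
            }
        }
    }

    static int crawl(int urlCount, int threads) throws InterruptedException {
        CrawlState state = new CrawlState(); // created once by the caller
        for (int i = 0; i < urlCount; i++) {
            state.frontier.add("http://example.com/page" + i);
        }
        ExecutorService threadPool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            threadPool.submit(new Worker(state));
        }
        threadPool.shutdown();
        threadPool.awaitTermination(10, TimeUnit.SECONDS);
        return state.visited.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("visited: " + crawl(100, 4));
    }
}
```

In a real crawler the workers would also add newly discovered links to the queue, so an empty queue would not necessarily mean the crawl is finished; that termination logic is left out of this sketch.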

    Hope this helps. Good luck!!

© 2021 The Archive Base. All Rights Reserved