The Archive Base

Editorial Team
Asked: May 26, 2026 (2026-05-26T08:13:10+00:00)

I am writing a multithreaded parser. Parser class is as follows. public class Parser


I am writing a multithreaded parser.
Parser class is as follows.

public class Parser extends HTMLEditorKit.ParserCallback implements Runnable {

    private static List<Item> itemList = Collections.synchronizedList(new ArrayList<Item>());
    private boolean h2Tag = false;
    private int count;
    private static int threadCount = 0;

    public static List<Item> parse() {
        for (int i = 1; i <= 1000; i++) { //1000 of the same type of pages that need to parse

            while (threadCount == 20) { //limit the number of simultaneous threads
                try {
                    Thread.sleep(50);
                } catch (InterruptedException ex) {
                    ex.printStackTrace();
                }
            }

            Thread thread = new Thread(new Parser());
            thread.setName(Integer.toString(i));
            threadCount++; //increase the number of working threads
            thread.start();            
        }

        return itemList;
    }

    public void run() {
        //Here is a piece of code responsible for creating links based on
        //the thread name and passed as a parameter remained i,
        //connection, start parsing, etc.        
        //In general, nothing special. Therefore, I won't paste it here.

        threadCount--; //reduce the number of running threads when current stops
    }

    private static void addItem(Item item) {
        itemList.add(item);
    }

    //This method retrieves the necessary information after the H2 tag is detected
    @Override
    public void handleText(char[] data, int pos) {
        if (h2Tag) {
            String itemName = new String(data).trim();

            //Item - the item on which we receive information from a Web page
            Item item = new Item();
            item.setName(itemName);
            item.setId(count);
            addItem(item);

            //Display information about an item in the console
            System.out.println(count + " = " + itemName);
        }
    }

    @Override
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        if (HTML.Tag.H2 == t) {
            h2Tag = true;
        }
    }

    @Override
    public void handleEndTag(HTML.Tag t, int pos) {
        if (HTML.Tag.H2 == t) {
            h2Tag = false;
        }
    }
}

The parser is started from another class as follows:

List<Item> list = Parser.parse();

Everything seems to work, but there is a problem. When parsing finishes, the final list itemList contains about 980 elements instead of 1000, yet the console shows all 1000 items. That is, some threads apparently did not call the addItem method from handleText.

I have already tried changing the type of itemList to ArrayList, CopyOnWriteArrayList, and Vector. I also made the addItem method synchronized, and tried wrapping its call in a synchronized block. All of this changes the number of elements only slightly; I never get the full thousand.

I also tried parsing a smaller number of pages (ten). In that case the list is empty, but the console shows all 10 items.

If I remove multi-threading, then everything works fine, but, of course, slowly. That’s not good.

If I decrease the number of concurrent threads, the number of items in the list gets closer to the desired 1000; if I increase it, the count moves further away from 1000. So I think there is contention for writing to the list. But then why is the synchronization not working?

What’s the problem?

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 8:13 am

    After your parse() call returns, all of your 1000 threads have been started, but it is not guaranteed that they have finished. In fact, they haven't, and that is the problem you see: parse() returns itemList while some threads are still adding to it. I would strongly recommend not writing this yourself but using the tools the JDK provides for this kind of job.
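To illustrate the diagnosis above, here is a minimal sketch, not the asker's actual parser. The class `JoinDemo`, the method `collect`, and the `results` list are hypothetical stand-ins for `Parser`, `parse()`, and `itemList`. It assumes the only missing piece is waiting for the workers, and shows that joining every thread before returning makes the list complete.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class JoinDemo {
    // Hypothetical stand-in for the question's shared itemList.
    static final List<Integer> results =
            Collections.synchronizedList(new ArrayList<Integer>());

    static List<Integer> collect(int tasks) {
        results.clear();
        List<Thread> threads = new ArrayList<Thread>();
        for (int i = 0; i < tasks; i++) {
            final int id = i;
            Thread t = new Thread(new Runnable() {
                @Override
                public void run() {
                    results.add(id); // stands in for parsing one page
                }
            });
            threads.add(t);
            t.start();
        }
        // Without this loop the method may return while workers are still running,
        // which is how a caller ends up seeing fewer items than were printed.
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(collect(100).size()); // prints 100
    }
}
```

This sketch starts one thread per task and does not cap concurrency; it is only meant to show the effect of waiting.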

    The Thread Pools documentation and the ThreadPoolExecutor class are a good starting point. Again, don't implement this yourself unless you are absolutely sure you have to, because writing such multithreaded code is pure pain.

    Your code should look something like this:

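    ExecutorService executor = Executors.newFixedThreadPool(20); // at most 20 tasks run at once
    List<Future<?>> futures = new ArrayList<Future<?>>(1000);
    for (int i = 0; i < 1000; i++) {
       futures.add(executor.submit(new Runnable() {...}));
    }
    for (Future<?> f : futures) {
       f.get(); // blocks until this task is done; throws InterruptedException, ExecutionException
    }
    executor.shutdown(); // let the pool's threads exit once all tasks are done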
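The outline above can be fleshed out as a self-contained sketch. Everything named here is an assumption for illustration: `PoolDemo`, `parseAll`, and `parsePage` are hypothetical, and `parsePage` just returns a placeholder string where the real code would fetch and parse a page. It also uses `Callable` instead of `Runnable`, so each task returns its result and no shared static list is needed; that is a design variation, not something the answer prescribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolDemo {
    // Hypothetical placeholder for fetching and parsing page number `page`.
    static String parsePage(int page) {
        return "item-" + page;
    }

    static List<String> parseAll(int pages, int poolSize) {
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<Future<String>>(pages);
            for (int i = 1; i <= pages; i++) {
                final int page = i;
                futures.add(executor.submit(new Callable<String>() {
                    @Override
                    public String call() {
                        return parsePage(page);
                    }
                }));
            }
            // Collect results in submission order; get() waits for each task.
            List<String> items = new ArrayList<String>(pages);
            for (Future<String> f : futures) {
                items.add(f.get());
            }
            return items;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for parsers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a parser task failed", e.getCause());
        } finally {
            executor.shutdown(); // stop accepting work and let pool threads exit
        }
    }

    public static void main(String[] args) {
        List<String> items = parseAll(1000, 20);
        System.out.println(items.size()); // prints 1000
    }
}
```

Because the method returns only after every `Future.get()` has completed, the caller always sees the full list, and the fixed pool size replaces the hand-rolled `threadCount` throttle from the question.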
    