I’m fairly new to programming and am working for my dissertation on a web

Question

Asked: May 12, 2026

I’m fairly new to programming and am working on a web crawler for my dissertation. I was given a web crawler, but I found it too slow because it is single-threaded: it took 30 minutes to crawl 1000 web pages. I tried running it on multiple threads, and with 20 threads running simultaneously the same 1000 pages took only 2 minutes. But now I’m running into “Heap Out of Memory” errors. I’m sure my approach, a for loop that creates 20 threads, was wrong. What is the right way to multi-thread the Java crawler without causing these errors? And is multi-threading the right solution to my problem in the first place?

1 Answer

Editorial Team · Answer 1 · 2026-05-12T07:40:40+00:00

The simple answer is to increase the JVM's maximum heap size (e.g. with the -Xmx option). This will help, but the real problem is likely that your web crawling algorithm builds an in-memory data structure that grows in proportion to the number of pages you visit. If that is the case, the solution may be to move the data in that structure to disk, e.g. into a file or a database.
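As a rough illustration of moving results to disk, here is a minimal sketch that appends one line per crawled page to a file instead of collecting everything in a list. The class name DiskBackedResults, the tab-separated format, and the file name are assumptions made for this example; they are not part of the crawler in the question, and a real crawler might well use a database instead.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: writes each page's data to disk as soon as it is
// crawled, so heap use does not grow with the number of pages visited.
class DiskBackedResults implements AutoCloseable {
    private final BufferedWriter writer;
    private long recordCount = 0;

    DiskBackedResults(Path file) {
        try {
            // Default open options create the file or truncate an existing one.
            writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // synchronized so that several crawler threads can share one instance.
    // Assumes url and title contain no tabs or line breaks.
    synchronized void add(String url, String title) {
        try {
            writer.write(url + "\t" + title);
            writer.newLine();
            recordCount++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Only a counter is kept in memory, not the records themselves.
    synchronized long size() {
        return recordCount;
    }

    @Override
    public synchronized void close() {
        try {
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path out = Paths.get("crawl-results.tsv"); // assumed output file
        try (DiskBackedResults results = new DiskBackedResults(out)) {
            results.add("http://example.com/", "Example Domain");
            System.out.println("Records written: " + results.size());
        }
        // The heap limit the JVM is actually running with, in bytes.
        System.out.println("Max heap: " + Runtime.getRuntime().maxMemory());
    }
}
```

Running the program with a larger heap, for example java -Xmx2g DiskBackedResults, changes the last printed value. Raising the limit only postpones the error if the crawler keeps every page in memory.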

The most appropriate solution to your problem depends on how your web crawler works, what it is collecting, and how many pages you need to crawl.
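On the threading side of the question, a common alternative to starting threads in a for loop is a fixed-size thread pool from java.util.concurrent. The sketch below is only an illustration under assumptions: BoundedCrawler, crawl, and the pageHandler callback are hypothetical names, and the handler stands in for whatever fetch-and-parse code the existing crawler has.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch: process a known list of URLs with a fixed number
// of worker threads instead of creating threads by hand.
class BoundedCrawler {

    // Runs pageHandler once per URL on at most `threads` worker threads.
    // Returns how many URLs were handled without a runtime exception.
    static int crawl(List<String> urls, int threads, Consumer<String> pageHandler) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger done = new AtomicInteger();
        for (String url : urls) {
            pool.submit(() -> {
                try {
                    pageHandler.accept(url);
                    done.incrementAndGet();
                } catch (RuntimeException e) {
                    // One bad page should not stop the whole crawl.
                    System.err.println("Failed: " + url + " (" + e + ")");
                }
            });
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
                "http://example.com/a", "http://example.com/b");
        // The handler here only prints; a real one would fetch and parse.
        int handled = crawl(urls, 20, url -> System.out.println("Would fetch " + url));
        System.out.println("Handled " + handled + " pages");
    }
}
```

Note that the pool limits the number of threads, not the amount of data. Executors.newFixedThreadPool queues pending tasks without a size limit, so if the handler keeps whole pages in memory or keeps adding newly discovered links, the heap can still fill up.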
