I have a web crawler that looks for specific information I want and returns

Question


Editorial Team

Asked: June 5, 2026


I have a web crawler that looks for specific information I want and returns it. This is run daily.

The issue is that my crawler has to do two things.

1. Get the links it has to crawl.
2. Crawl each link and push the results to the db.

The issue with #1 is that there are 700+ links in total. These links don’t change very often, maybe once a month.

So one option is just to do a separate crawl for the ‘list of links’, once a month, and dump the links into the db.

Then, have the crawler do a db hit for each of those 700 links every day.

Or, I can just have a nested crawl within my crawler, where every time the crawler runs (daily), it rebuilds this list of 700 URLs, stores it in an array, and pulls from that array to crawl each link.

Which is more efficient and less taxing on Heroku, or whichever host I end up using?


1 Answer

Editorial Team · Answer 1 · 2026-06-05T21:36:14+00:00

It depends on how you measure “efficiency” and “taxing”, but the local database hit is almost certain to be faster and “better” than an HTTP request + parsing an HTML(?) response for the links.
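As a rough sketch of the database-backed option, assuming Python and SQLite purely for illustration: the list of links is re-scraped only when the stored copy is older than about a month, and every other daily run just reads the URLs from the table. The table names (`links`, `meta`), the helper names, the 30-day threshold, and the `scrape_link_list` callable are all hypothetical and not taken from the question.

```python
import sqlite3
import time

# Assumption: "about once a month" is approximated as 30 days.
REFRESH_SECONDS = 30 * 24 * 3600


def init_db(conn):
    """Create the (hypothetical) tables used by this sketch."""
    conn.execute("CREATE TABLE IF NOT EXISTS links (url TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value REAL)")


def links_are_stale(conn, now):
    """True if the link list was never stored or is older than the threshold."""
    row = conn.execute(
        "SELECT value FROM meta WHERE key = 'links_refreshed_at'"
    ).fetchone()
    return row is None or now - row[0] > REFRESH_SECONDS


def refresh_links(conn, scrape_link_list, now):
    """Replace the stored list with a freshly scraped one in a single transaction."""
    urls = scrape_link_list()  # the expensive HTTP + parsing step
    with conn:
        conn.execute("DELETE FROM links")
        conn.executemany(
            "INSERT OR IGNORE INTO links (url) VALUES (?)", [(u,) for u in urls]
        )
        conn.execute(
            "INSERT OR REPLACE INTO meta (key, value) VALUES ('links_refreshed_at', ?)",
            (now,),
        )


def daily_links(conn, scrape_link_list, now=None):
    """Return the URLs for today's crawl, re-scraping the list only when stale."""
    now = time.time() if now is None else now
    if links_are_stale(conn, now):
        refresh_links(conn, scrape_link_list, now)
    return [row[0] for row in conn.execute("SELECT url FROM links ORDER BY url")]
```

The daily job would call `daily_links(conn, scrape_link_list)` and crawl whatever comes back; the list-scraping request then happens roughly once a month instead of every day.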

Further, not that it likely matters, but (assuming your database and adapter support it) you can begin to iterate through the DB request results and process them without waiting for or fetching the entire set into memory.
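To illustrate the streaming point, here is a minimal sketch using Python's built-in `sqlite3` module, where iterating over a cursor yields rows as they are fetched instead of building the full result list first. The `links` table and the `crawl_one` callback are assumptions made up for this example; other databases and adapters expose the same idea through their own cursor or batch APIs.

```python
import sqlite3


def iter_links(conn):
    """Yield stored URLs one at a time by iterating the cursor directly."""
    cursor = conn.execute("SELECT url FROM links ORDER BY url")
    for (url,) in cursor:
        yield url


def crawl_all(conn, crawl_one):
    """Call crawl_one(url) for each stored URL and return how many were processed."""
    count = 0
    for url in iter_links(conn):
        crawl_one(url)  # hypothetical: fetch the page and push results to the db
        count += 1
    return count
```

With only around 700 rows, loading everything at once with `fetchall()` would work just as well; the streaming form mostly pays off for much larger result sets.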

Network latency and resources are going to be much worse than poking at a DB that is already sitting there, running, and designed to be queried efficiently and quickly.

However: once per day? Is there a good reason to spend any energy optimizing this task?
