I’m using PHP to scrape a website and collect some data. It’s all done

Question


Asked: May 14, 2026 (2026-05-14T02:12:30+00:00)

I’m using PHP to scrape a website and collect some data. It’s all done without regex: I use PHP’s explode() function to locate particular HTML tags instead.

If the structure of the website (HTML, CSS) changes, the scraper may collect the wrong data. So the question is: how do I know whether the HTML structure has changed? How can I detect this before storing anything in my database, so that wrong data is never saved?

1 Answer

Editorial Team · Answer 1 · 2026-05-14T02:12:31+00:00

I don’t think there is a clean solution if you are scraping a page whose content changes.

I have developed several Python scrapers, and I know how frustrating it can be when a site makes even a subtle change to its layout.

You could try a solution à la mechanize (I don’t know the PHP counterpart); if you are lucky, you could isolate the content you need to extract (links?).
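In PHP, one way to isolate content without splitting strings is the bundled DOM extension (DOMDocument plus DOMXPath). This is only a minimal sketch, assuming the links you want sit inside a container such as a div with id "results"; the function name extractLinks, the sample markup and the XPath expression are all made up for illustration:

```php
<?php
// Hypothetical helper: return the href of every <a> found inside the
// container matched by $containerXPath.
function extractLinks(string $html, string $containerXPath): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect libxml warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query($containerXPath . '//a[@href]') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }
    return $links;
}

// Sample markup (an assumption, not the real site).
$html = '<div id="results"><a href="https://example.com/a">A</a>'
      . '<a href="/b">B</a></div><a href="/nav">Nav</a>';

$links = extractLinks($html, '//div[@id="results"]');
$layoutLooksBroken = ($links === []);
```

If the container disappears or is renamed, the result is an empty array, which is a much clearer signal of a layout change than whatever explode() happens to return.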

Another possible approach would be to code some constraints and check them before storing to the database.

For example, if you are scraping URLs, you need to verify that what the scraper has parsed is a formally valid URL; the same goes for integer IDs or anything else you scrape that can be recognized as valid.
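The constraint check described above could look roughly like this in PHP, using the built-in filter_var(). It is a sketch under assumptions: the record shape (keys 'url' and 'id'), the function name validateRecord and the rule that IDs are positive integers are all hypothetical, so adapt them to what you actually scrape:

```php
<?php
// Hypothetical validator: returns the names of the fields that failed,
// so an empty array means the record passed every check.
function validateRecord(array $record): array
{
    $errors = [];

    // The URL must be syntactically valid and use http or https.
    $url = $record['url'] ?? '';
    if (filter_var($url, FILTER_VALIDATE_URL) === false
        || !in_array(parse_url($url, PHP_URL_SCHEME), ['http', 'https'], true)) {
        $errors[] = 'url';
    }

    // The ID must be an integer >= 1 (an assumed rule).
    $id = filter_var(
        $record['id'] ?? null,
        FILTER_VALIDATE_INT,
        ['options' => ['min_range' => 1]]
    );
    if ($id === false) {
        $errors[] = 'id';
    }

    return $errors;
}

// A record that looks right, and one that looks like stray markup.
$good = validateRecord(['url' => 'https://example.com/item?id=42', 'id' => '42']);
$bad  = validateRecord(['url' => '<div class="price">', 'id' => 'abc']);
```

Only insert a record when the returned array is empty; if many records in one run fail, it is probably safer to abort the whole run and look at the page, since that pattern usually means the layout changed.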

If you are scraping plain text, it will be more difficult to check.
