How can I get all IMDB ids from a page? For example, I want to get all ids from here. On that page, URLs are of the format:
http://www.imdb.com/title/tt0948470/
I need to get all ids from the page using preg_match_all(). Can anyone help me?
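For the literal request, a minimal sketch with preg_match_all() might look like the following. The sample markup in $html is made up for illustration; in practice it would hold the downloaded page source. The pattern assumes every id is "tt" followed by digits directly after "/title/", as in the URL format above.

```php
<?php
// Hypothetical sample markup; in practice $html would hold the fetched page source.
$html = '<a href="http://www.imdb.com/title/tt0948470/">A</a>'
      . '<a href="/title/tt0111161/">B</a>'
      . '<a href="http://www.imdb.com/title/tt0948470/">A again</a>';

// Capture "tt" followed by digits wherever it appears after "/title/".
preg_match_all('~/title/(tt\d+)~', $html, $matches);

// $matches[1] holds the captured ids; drop duplicates and reindex.
$ids = array_values(array_unique($matches[1]));

print_r($ids);
```

A plain regular expression matches any occurrence of the pattern in the source, including ones outside links, so it tends to need the duplicate removal shown here.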
Okay, I’m giving cooked-up code, but I also explain it:
The approach reads the href attributes of the <a> elements on the page.
Example/Demo:
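The original demo code did not survive on this page, so what follows is only a sketch of the idea and not the answer's actual code. It assumes PHP's DOM extension is available, and it parses a made-up HTML string instead of the live page so that it runs without network access. The query selects the href of every <a> that points at a title URL, then a small regular expression pulls the id out of each href.

```php
<?php
// Hypothetical sample markup standing in for the downloaded page.
$html = '<html><body>'
      . '<a href="http://www.imdb.com/title/tt0948470/">First</a>'
      . '<a href="/title/tt0111161/?ref=x">Second</a>'
      . '<a href="/name/nm0000001/">Not a title</a>'
      . '<a href="http://www.imdb.com/title/tt0948470/">First again</a>'
      . '</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // real-world HTML is rarely valid; collect warnings quietly
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$ids = array();
// Select the href attribute of every <a> whose href contains "/title/tt".
foreach ($xpath->query('//a[contains(@href, "/title/tt")]/@href') as $href) {
    if (preg_match('~/title/(tt\d+)~', $href->nodeValue, $m)) {
        $ids[$m[1]] = true;   // use the id as the array key so duplicates collapse
    }
}
$ids = array_keys($ids);

print_r($ids);
```

To run this against a real page, the string could be replaced by a document loaded with $doc->loadHTMLFile($url), provided the PHP configuration allows opening URLs.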
(Notes: You tagged this question PHP5. The current stable PHP5 is 5.4, and so is this example. If you configure your PHP5 version with the curl wrappers, this code uses curl.)

Edit: Lower PHP Versions:
Edit2: Just seeing that IMDB tags its markup, so it's possible to retrieve the actual movie entries of that list rather than any title links on that page.
This requires a small improvement to the xpath expression used. Because the parsing is now much more intelligent, duplicates do not occur, so there is no need to remove them:
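As a rough illustration of such a narrowed expression: the markup below is hypothetical, and the class name "title" on the table cell is an assumption for the sake of the example, not something taken from the real page. The point is only that anchoring the query on the element that marks a list entry skips the image link and unrelated links, so each id shows up once.

```php
<?php
// HYPOTHETICAL markup: the class name "title" on the cell is an assumption,
// not taken from the real page; inspect the page source and adjust the query.
$html = '<table>'
      . '<tr><td class="image"><a href="/title/tt0948470/"><img src="x.jpg" alt=""></a></td>'
      . '<td class="title"><a href="/title/tt0948470/">First</a></td></tr>'
      . '<tr><td class="title"><a href="/title/tt0111161/">Second</a></td></tr>'
      . '</table>'
      . '<div class="sidebar"><a href="/title/tt9999999/">Unrelated link</a></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // tolerate imperfect HTML
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$ids = array();
// Only links sitting directly inside a cell marked as a list entry.
foreach ($xpath->query('//td[@class="title"]/a/@href') as $href) {
    if (preg_match('~/title/(tt\d+)~', $href->nodeValue, $m)) {
        $ids[] = $m[1];   // no de-duplication step needed here
    }
}

print_r($ids);
```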