I would like to write a scraping script to retrieve comments from CNN articles.

Question

Asked: May 28, 2026

I would like to write a scraping script to retrieve comments from CNN articles. For example, this article: http://www.cnn.com/2012/01/19/politics/gop-debate/index.html?hpt=hp_t1

I realize that CNN uses Disqus for its comment discussion. Because the comments are not loaded page by page (i.e., previous page, next page) but dynamically (i.e., you need to click “load next 25”), I have no idea how to retrieve all 5000+ comments for this article.

Any ideas or suggestions?

Thanks so much!

1 Answer

Editorial Team · Answer 1 · 2026-05-28T07:21:40+00:00

One option for scraping (other than fetching the raw page), which might be less robust (depending on your needs) but will solve the problem you have, is to use some kind of wrapper around a full-fledged web browser, literally code the usage pattern, and extract the relevant data. Since you didn’t mention which programming language you know, I’ll give 3 examples: 1) Watir (Ruby), 2) WatiN (IE and Firefox via .NET), 3) Selenium (IE, Firefox, and other browsers via C#/Java/Perl/PHP/Ruby/Python)

I’ll provide a little example using WatiN and C#:

using WatiN.Core;

IE browser = new IE();
browser.GoTo("YOUR CNN URL");
List visibleComments = browser.List(Find.ById("dsq-comments"));
// do your scraping thing
Link moreComments = browser.Link(Find.ByClass("dsq-paginate-append-text"));
moreComments.Click();
// wait until the AJAX call has ended by searching for some indicator
browser.WaitUntilContainsText("SOME TEXT");
// do your scraping thing

Note:
I’m not familiar with Disqus, but it might be a better option to force all the comments to show by looping over the Link and Click parts of the code I posted until all the comments are visible, and then scrape the List element dsq-comments once.
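
The loop described in the note can be sketched in Python with Selenium, one of the options listed above. This is only a rough sketch under several assumptions: the id dsq-comments and the class dsq-paginate-append-text are taken from the C# example and may not match the markup Disqus serves today, the "#dsq-comments > li" selector assumes each comment is a list item, and the names click_until_exhausted and scrape_comments are made up for illustration. The click loop is kept separate from the browser so it can be tried without launching one.

```python
import time
from typing import Callable, Optional


def click_until_exhausted(
    find_more_link: Callable[[], Optional[object]],
    count_comments: Callable[[], int],
    timeout: float = 10.0,
    poll: float = 0.25,
    max_clicks: int = 1000,
) -> int:
    """Click the "load more" link until it disappears or stops adding comments.

    Returns the number of clicks performed.
    """
    clicks = 0
    while clicks < max_clicks:
        link = find_more_link()
        if link is None:
            break  # the link is gone, so everything should be visible
        before = count_comments()
        link.click()
        clicks += 1
        # Wait for the AJAX call to finish: the comment count should grow.
        deadline = time.monotonic() + timeout
        while count_comments() <= before:
            if time.monotonic() > deadline:
                return clicks  # nothing new arrived in time; give up
            time.sleep(poll)
    return clicks


def scrape_comments(url: str) -> list:
    """Hypothetical wiring with Selenium; the selectors are assumptions."""
    # Imported lazily so the loop above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get(url)

        def find_more_link():
            links = driver.find_elements(By.CLASS_NAME, "dsq-paginate-append-text")
            visible = [link for link in links if link.is_displayed()]
            return visible[0] if visible else None

        def count_comments():
            return len(driver.find_elements(By.CSS_SELECTOR, "#dsq-comments > li"))

        click_until_exhausted(find_more_link, count_comments)
        # Scrape once, after all comments have been forced to show.
        items = driver.find_elements(By.CSS_SELECTOR, "#dsq-comments > li")
        return [item.text for item in items]
    finally:
        driver.quit()
```

Calling scrape_comments with the article URL would then return the text of every loaded comment, provided the selectors match the page. If the comments are rendered inside an iframe, the driver would need to switch into that frame first.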
