I am writing a web scraper that grabs content from decade articles from wikipedia.


Asked: May 26, 2026 (2026-05-26T00:23:50+00:00)


I am writing a web scraper that grabs content from Wikipedia's decade articles (e.g. the articles on the 10s, the 1970s, the 1670s BC, and so on).

I am using logic like this to grab the pages:

for (i = -1690; i <= 2010; i += 10)
    if (i < 0)
        page = (-i) + "s_BC"
    else
        page = i + "s"
    GrabContentFromURL("http://en.wikipedia.org/wiki/" + page)

This is working, except for one little detail that I hadn’t considered.

The problem is that there are two 0s decades. There is a 0s AD and a 0s BC. With the way my loop currently works, the program only grabs the content from the 0s AD page.
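
To make the gap concrete, here is a minimal Python sketch that mirrors the loop above and only collects the page names instead of fetching them. The function name `original_page_names` is hypothetical and used purely for illustration; the point is that the integer 0 is visited once, so only one of the two "0s" pages can ever be produced.

```python
def original_page_names(start=-1690, stop=2010, step=10):
    """Mirror the question's loop: one page name per integer decade value."""
    names = []
    for i in range(start, stop + 1, step):
        if i < 0:
            names.append(f"{-i}s_BC")
        else:
            # i == 0 lands here, so it always becomes the AD page "0s".
            names.append(f"{i}s")
    return names


names = original_page_names()
print("0s" in names)     # the 0s AD page is generated
print("0s_BC" in names)  # the 0s BC page is never generated
```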

This is a pretty simple problem, but I’m having a hard time coming up with a very nice way to fix it. I know I can extract the body of the loop to a separate function and use two separate loops, but I feel like there’s a more elegant way to do this that I’m missing.

How can I fix this problem without introducing too much complexity?


1 Answer

Editorial Team · Answer 1 · 2026-05-26T00:23:50+00:00

Do you mind hitting a few 404 pages along the way? If not, request both the AD and the BC page for every decade:

for (i = 0; i <= 2010; i+=10)
    GrabContentFromURL("http://en.wikipedia.org/wiki/" + i + "s")
    GrabContentFromURL("http://en.wikipedia.org/wiki/" + i + "s_BC")
end

If the answer to that question was “yes, I mind”, then you can still toss in an if:

for (i = 0; i <= 2010; i+=10)
    GrabContentFromURL("http://en.wikipedia.org/wiki/" + i + "s")
    // <= rather than <, so the 1690s BC page from the original range is kept
    if (i <= 1690)
        GrabContentFromURL("http://en.wikipedia.org/wiki/" + i + "s_BC")
    end
end
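
If chronological order matters, one possible alternative is to generate the page names separately from fetching them. The following is a minimal Python sketch, not a definitive implementation: `decade_page_names`, `BASE_URL`, `LAST_BC_DECADE`, and `LAST_AD_DECADE` are names assumed here for illustration, and the bounds are taken from the loop in the question. It counts the BC decades down to 0 and then the AD decades up from 0, so both "0s" pages appear exactly once.

```python
BASE_URL = "http://en.wikipedia.org/wiki/"
LAST_BC_DECADE = 1690  # earliest decade in the question's loop (1690s BC)
LAST_AD_DECADE = 2010  # latest decade in the question's loop (2010s)


def decade_page_names(last_bc=LAST_BC_DECADE, last_ad=LAST_AD_DECADE):
    """Yield decade page names in chronological order, including both 0s."""
    # BC decades count down: 1690s BC, 1680s BC, ..., 10s BC, 0s BC.
    for decade in range(last_bc, -1, -10):
        yield f"{decade}s_BC"
    # AD decades count up: 0s, 10s, ..., 2010s.
    for decade in range(0, last_ad + 1, 10):
        yield f"{decade}s"


pages = list(decade_page_names())
urls = [BASE_URL + page for page in pages]
```

The scraper can then make a single pass over `urls` and call its fetch routine once per entry, which keeps the two-direction counting in one place and leaves only one call site for the fetch.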
