The Archive Base

Editorial Team
Asked: May 14, 2026

This is a big question, that I don’t know how to start, so I

This is a “big” question that I don’t know how to start, so I hope some of you can point me in the right direction. If this is not a “good” question, I will close the thread with an apology.

I wish to go through the Wikipedia database (let’s say the English one) and compute statistics. For example, I am interested in how many active editors (a term that still needs to be defined) Wikipedia had at each point in time (let’s say over the last 2 years).

I don’t know how to build such a database, how to access it, how to find out which types of data it holds, and so on. So my questions are:

  1. What tools do I need for this (besides basic R)? MySQL on my computer? An RODBC database connection?
  2. How do you start planning for such a project?

1 Answer

  1. Editorial Team
    Added an answer on May 14, 2026 at 4:55 am

    You’ll want to start here:
    http://en.wikipedia.org/wiki/Wikipedia:Database_download

    Which will take you here:
    http://download.wikimedia.org/enwiki/20100312/

    And the file you probably want is:

    # 2010-03-17 04:33:50 done Log events to all pages.
        * This contains the log of actions performed on pages.
        * pages-logging.xml.gz 1.0 GB
    

    http://download.wikimedia.org/enwiki/20100312/enwiki-20100312-pages-logging.xml.gz
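Because the file is about 1.0 GB compressed, it helps to read it as a stream instead of loading it whole. Here is a minimal Python sketch (Python is used purely for illustration), assuming each log event is a `logitem` element with a `timestamp` child and a nested `username`. That layout is an assumption, so check it against the real file. The helper names `local` and `iter_log_events` are made up for this example.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an XML namespace prefix such as '{http://...}logitem'."""
    return tag.rsplit("}", 1)[-1]


def iter_log_events(stream):
    """Yield (timestamp, username) for each <logitem>, one element at a time."""
    for _, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "logitem":
            continue
        timestamp, username = None, None
        for node in elem.iter():
            name = local(node.tag)
            if name == "timestamp":
                timestamp = node.text
            elif name == "username":
                username = node.text
        yield timestamp, username
        elem.clear()  # drop the children of the element we just processed


# Tiny in-memory stand-in for the real dump (element names are assumed).
SAMPLE = b"""<mediawiki>
  <logitem><timestamp>2010-03-01T10:00:00Z</timestamp>
    <contributor><username>Alice</username></contributor></logitem>
  <logitem><timestamp>2010-03-02T11:30:00Z</timestamp>
    <contributor><username>Bob</username></contributor></logitem>
</mediawiki>"""

with gzip.open(io.BytesIO(gzip.compress(SAMPLE))) as stream:
    events = list(iter_log_events(stream))

print(events)
```

For the real dump, you would pass `gzip.open("enwiki-20100312-pages-logging.xml.gz")` as the stream and consume the generator lazily rather than building a list.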

    You’ll then import the XML into MySQL. Generating a histogram of users per day, week, year, etc. won’t require R. You’ll be able to do that with a single MySQL query. Something like:

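    -- Table and column names (including wiki_user) are placeholders; adapt them
    -- to the schema you import. DATE() keeps each calendar day separate, whereas
    -- DAYOFYEAR() would merge the same day number across different years.
    select DATE(wiki_edit_timestamp) as edit_day, count(distinct wiki_user) as users
    from page_logs
    group by DATE(wiki_edit_timestamp)
    order by DATE(wiki_edit_timestamp);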
    

    etc.

    (I’m not sure what their actual schema is, but it’ll be something like that.)
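Since the question leaves “active editor” to be defined, here is one possible definition as a rough Python sketch: a user counts as active in a calendar month if they have at least `min_actions` events in that month. The threshold, the function name `active_users_per_month`, and the `(timestamp, username)` input shape are all assumptions made for illustration, not anything Wikipedia prescribes.

```python
from collections import Counter, defaultdict


def active_users_per_month(events, min_actions=5):
    """Count users with at least `min_actions` events in each calendar month.

    `events` is an iterable of (iso_timestamp, username) pairs, where the
    timestamp starts with 'YYYY-MM' (for example '2010-03-01T10:00:00Z').
    """
    per_month = defaultdict(Counter)
    for timestamp, user in events:
        if not timestamp or not user:
            continue  # skip events with missing fields
        per_month[timestamp[:7]][user] += 1  # key on 'YYYY-MM'
    return {
        month: sum(1 for n in counts.values() if n >= min_actions)
        for month, counts in sorted(per_month.items())
    }


# Hypothetical sample data, only to show the shape of the result.
sample = [
    ("2010-02-27T09:00:00Z", "Alice"),
    ("2010-02-28T09:00:00Z", "Alice"),
    ("2010-02-28T12:00:00Z", "Bob"),
    ("2010-03-01T08:00:00Z", "Bob"),
    ("2010-03-02T08:00:00Z", "Bob"),
]
result = active_users_per_month(sample, min_actions=2)
print(result)
```

Changing `min_actions` or the slice length (for example `timestamp[:10]` for days) is an easy way to try other definitions before committing to one.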

    You’ll run into issues, no doubt, but you’ll learn a lot too. Good luck!

