The Archive Base

Editorial Team
Asked: May 20, 2026

I am developing a Rails 3 application that needs to extract data (a title and a short text) about any topic from Wikipedia.

I need the information to be very "clean", in other words free from HTML, wiki tags, and irrelevant data such as the reference list.

Is it possible to get only the title and some text about the topic?

I am using a gem to get the data, but the output is very ugly:

{{for|the television series|Solsidan (TV series)}}
{{Infobox settlement
|official_name = Solsidan
|image_skyline =
|image_caption =
|pushpin_map = Sweden
|pushpin_label_position =
|coordinates_region = SE
|subdivision_type = [[Country]]
|subdivision_name = [[Sweden]]
|subdivision_type3 = [[Municipalities of Sweden|Municipality]]
|subdivision_name3 = [[Nacka Municipality]]
|subdivision_type2 = [[Counties of Sweden|County]]
|subdivision_name2 = [[Stockholm County]]
|subdivision_type1 = [[Provinces of Sweden|Province]]
|subdivision_name1 = [[Uppland]]
|area_footnotes = {{cite web | title=Tätorternas landareal, folkmängd och invånare per km2 2000 och 2005 | publisher=[[Statistics Sweden]] | url=http://www.scb.se/statistik/MI/MI0810/2005A01B/T%c3%a4torternami0810tab1.xls | format=xls | language=Swedish | accessdate=2009-05-08}}
|area_total_km2 = 0.23
|population_as_of = 2005-12-31
|population_footnotes =
|population_total = 209
|population_density_km2 = 895
|timezone = [[Central European Time|CET]]
|utc_offset = +1
|timezone_DST = [[Central European Summer Time|CEST]]
|utc_offset_DST = +2
|coordinates_display = display=inline,title
|latd=59 |latm=17 |lats= |latNS=N
|longd=17 |longm=51 |longs= |longEW=E
|website =
}}
'''Solsidan''' is a [[Urban areas in Sweden|locality]] situated in [[Nacka Municipality]], [[Stockholm County]], [[Sweden]]

== References ==
{{Reflist}}

{{Stockholm-geo-stub}}
{{Localities in Nacka Municipality}}

[[Category:Populated places in Stockholm County]]

[[no:Solsidan]]
[[sv:Solsidan, Nacka kommun]]
1 Answer

  1. Editorial Team
    Added an answer on May 20, 2026 at 6:40 am

    Wikipedia publishes regular snapshots of its database at Wikipedia:Database download, both as MySQL dumps in the schema used by MediaWiki and in an XML interchange format. You can load these onto your own server (about 6 GiB to download and about 30 GB uncompressed for the current text of all English Wikipedia articles) and query or process them however you wish. The content has not been rendered to HTML, so you can process the wiki markup yourself and emit whatever output you want. The page links to many libraries in various languages that process these dumps, though I don't see a Ruby one, so you might have to write that part yourself.
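Since the dumps contain raw wiki markup, some cleanup is needed to get plain text like the question asks for. The following is a minimal Ruby sketch of one way to do that, not a full wikitext parser: it drops `{{...}}` templates (including nested ones), keeps only the lead section before the first `==` heading, and reduces `[[target|label]]` links to their label. The helper names `strip_templates` and `wikitext_to_plain` are made up for illustration, and constructs such as wiki tables (`{| ... |}`) are not handled.

```ruby
# Remove {{...}} templates, including nested ones, by tracking brace depth.
# (Hypothetical helper; a heuristic, not a complete wikitext parser.)
def strip_templates(text)
  out = +''
  depth = 0
  # Split into '{{', '}}', runs of non-brace text, and single stray braces.
  text.scan(/\{\{|\}\}|[^{}]+|[{}]/).each do |token|
    if token == '{{'
      depth += 1
    elsif token == '}}' && depth > 0
      depth -= 1
    elsif depth.zero?
      out << token # keep only text that is outside every template
    end
  end
  out
end

# Reduce an article's wikitext to the plain text of its lead section.
# (Hypothetical helper name.)
def wikitext_to_plain(text)
  plain = strip_templates(text)
  # Drop everything from the first section heading (a line starting with ==).
  plain = plain.sub(/^==.*\z/m, '')
  # Remove inline references, then any remaining HTML-like tags.
  plain = plain.gsub(/<ref[^>]*\/>/, '')
  plain = plain.gsub(/<ref[^>]*>.*?<\/ref>/m, '')
  plain = plain.gsub(/<[^>]+>/, '')
  # [[target|label]] becomes label, [[target]] becomes target.
  plain = plain.gsub(/\[\[(?:[^\]|]*\|)?([^\]]*)\]\]/, '\1')
  # Strip the quote runs used for bold and italic.
  plain = plain.gsub(/'{2,}/, '')
  # Collapse whitespace left behind by the removed markup.
  plain.gsub(/\s+/, ' ').strip
end
```

Calling `wikitext_to_plain` on the Solsidan markup from the question should leave just the sentence that follows the infobox, with the link brackets and bold markers removed.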

    Various subsets are also provided. abstract.xml contains just the titles and abstracts, which sounds like what you want, and is only 3 GB.
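As a rough illustration of reading such an abstracts file from Ruby, here is a sketch using the REXML stream parser, so the multi-gigabyte file is never loaded into memory at once. It assumes each entry is a `<doc>` element containing `<title>` and `<abstract>` children and that titles carry a `Wikipedia: ` prefix; check this against the file you actually download. The `AbstractListener` class and `parse_abstracts` method are hypothetical names.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Collects title/abstract pairs while the XML is streamed past.
# Assumed layout: <feed><doc><title>..</title><abstract>..</abstract></doc></feed>
class AbstractListener
  include REXML::StreamListener

  attr_reader :entries

  def initialize
    @entries = []
    @current = nil # hash for the <doc> currently being read
    @field = nil   # :title or :abstract while inside that element
  end

  def tag_start(name, _attrs)
    case name
    when 'doc'
      @current = { title: '', abstract: '' }
    when 'title', 'abstract'
      @field = name.to_sym if @current
    end
  end

  def text(data)
    # Text may arrive in several chunks, so append rather than assign.
    @current[@field] += data if @current && @field
  end

  def tag_end(name)
    case name
    when 'title', 'abstract'
      @field = nil
    when 'doc'
      if @current
        # Assumption: titles look like "Wikipedia: Solsidan".
        @current[:title] = @current[:title].sub(/\AWikipedia: /, '').strip
        @current[:abstract] = @current[:abstract].strip
        @entries << @current
      end
      @current = nil
    end
  end
end

# Accepts an IO (for example File.open('abstract.xml')) or a String.
def parse_abstracts(source)
  listener = AbstractListener.new
  REXML::Document.parse_stream(source, listener)
  listener.entries
end
```

For the full file you would probably not collect everything in an array as this sketch does; writing each entry to your database inside `tag_end` keeps memory use flat.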

    See also Wikipedia:Mirrors_and_forks for discussion of the licensing requirements involved in reusing Wikipedia content.

© 2021 The Archive Base. All Rights Reserved