I’m parsing some RSS feeds that aggregate what’s going on in a given city. I’m only interested in the stuff that is happening today.
At the moment I have this:
require 'rubygems'
require 'rss/1.0'
require 'rss/2.0'
require 'open-uri'
require 'shorturl'
source = "http://rss.feed.com/example.xml"
# URI.open is needed on Ruby 3.0+; older versions also accepted a bare open(source)
content = URI.open(source) { |s| s.read }
rss = RSS::Parser.parse(content, false)
t = Time.now
day = t.day.to_s
month = t.strftime("%b")
rss.items.each do |item|
  # check both the day and the month against the title
  if item.title.to_s.include?(day) && item.title.to_s.include?(month)
    # does stuff with it
  end
end
I know the title contains the date of the event in the following format: “(2nd Apr 11)”. Of course, because I only check whether the title contains the day and the month (e.g. ‘2’ and ‘May’), I also get events that happen on the 12th of May, the 20th of May and so on. How can I make it foolproof and only get today’s events?
Here’s a sample title: “Diggin Deeper @ The Big Chill House (12th May 11)”
There could potentially be problems if the title contains other numbers. Does the title have any bounding characters around the date, such as a hyphen before the date or brackets around it? Adding those to the regex could prevent trouble. Could you give us an example title? (An alternative would be to use Time#strftime to create a string which would perfectly match the date as it appears in the title and then just use String#include? with that string, but I don’t think there’s an elegant way to put the ‘th’/’nd’/’rd’/etc on the day.)
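One way to sketch the strftime-plus-include? idea is to build the ordinal suffix by hand and include the surrounding brackets in the search string, so that “(2nd” cannot match inside “(12th” or “(22nd”. This assumes every title carries its date exactly as in the sample, “(12th May 11)”. The helper names ordinal_suffix, date_tag and happening_on? are made up for illustration and are not part of the RSS library.

```ruby
# Hypothetical helper: English ordinal suffix for a day of the month.
def ordinal_suffix(day)
  return 'th' if (11..13).include?(day % 100) # 11th, 12th, 13th
  case day % 10
  when 1 then 'st'
  when 2 then 'nd'
  when 3 then 'rd'
  else 'th'
  end
end

# Hypothetical helper: the date as it is assumed to appear in a title,
# brackets included, e.g. "(12th May 11)".
def date_tag(time)
  "(#{time.day}#{ordinal_suffix(time.day)} #{time.strftime('%b %y')})"
end

# True when the title carries the bracketed date for the given day.
def happening_on?(title, time = Time.now)
  title.to_s.include?(date_tag(time))
end

title = "Diggin Deeper @ The Big Chill House (12th May 11)"
puts happening_on?(title, Time.new(2011, 5, 12)) # true
puts happening_on?(title, Time.new(2011, 5, 2))  # false
```

Inside the loop over the feed items, the check would then be happening_on?(item.title). If the feed ever formats the date differently (a four-digit year, no brackets), the tag built in date_tag would need to change to match.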