I am trying to parse og meta tags using the HTTParty gem using this


Asked: June 17, 2026 (2026-06-17T01:57:44+00:00)


I am trying to parse og meta tags using the HTTParty gem using this code:

require 'httparty'

link = 'http://www.usatoday.com/story/gameon/2013/01/08/nfl-jets-tony-sparano-fired/1817037/'
# link = 'http://news.yahoo.com/chicago-lottery-winners-death-ruled-homicide-181627271.html'
resp = HTTParty.get(link)
ret_body = resp.body

# title
og_title = ret_body.match(/\<[Mm][Ee][Tt][Aa] property\=\"og:title\"\ content\=\"(.*?)\"\/\>/)
og_title = og_title[1].to_s

The problem is that it works on some sites (Yahoo!) but not on others (USA Today).


1 Answer

Editorial Team · Answer 1 · 2026-06-17T01:57:46+00:00

Don’t parse HTML with regular expressions; they’re too fragile for anything but the simplest problems. A tiny change to the HTML can break the pattern, and you end up in a slow battle maintaining an ever-expanding pattern. It’s a war you won’t win.
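To make the fragility concrete, here is a small sketch that runs the pattern from the question against three hand-written tags. The snippets and the helper name `og_title_via_regex` are made up for illustration and are not copied from either site. A browser treats all three tags the same way, but only the first one matches:

```ruby
# The pattern from the question, unchanged.
OG_TITLE_RE = /\<[Mm][Ee][Tt][Aa] property\=\"og:title\"\ content\=\"(.*?)\"\/\>/

# Hypothetical snippets, written by hand for illustration.
compact = '<meta property="og:title" content="Example title"/>'
spaced  = '<meta property="og:title" content="Example title" />'
swapped = '<meta content="Example title" property="og:title"/>'

# Hypothetical helper: returns the captured title, or nil when nothing matches.
def og_title_via_regex(html)
  match = html.match(OG_TITLE_RE)
  match && match[1]
end

p og_title_via_regex(compact) # => "Example title"
p og_title_via_regex(spaced)  # => nil (a space before "/>")
p og_title_via_regex(swapped) # => nil (attributes in the other order)
```

Whether one of these variations is what trips up the pattern on a particular site is a guess; the point is only that harmless differences in markup are enough to defeat it.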

Instead, use an HTML parser. Ruby has Nokogiri, which is excellent. Here’s how I’d do what you want:

require 'nokogiri'
require 'httparty'

%w[
  http://www.usatoday.com/story/gameon/2013/01/08/nfl-jets-tony-sparano-fired/1817037/
  http://news.yahoo.com/chicago-lottery-winners-death-ruled-homicide-181627271.html
].each do |link|
  resp = HTTParty.get(link)

  doc = Nokogiri::HTML(resp.body)
  puts doc.at('meta[property="og:title"]')['content']
end

Which outputs:

Jets fire offensive coordinator Tony Sparano
Chicago lottery winner's death ruled a homicide
