I need to remove an image with the given src:
img_src = "http://domain/img.jpg"
@doc.xpath("//img[@src='#{img_src}']")[0].remove
It doesn’t work. I also tried it like this:
@doc.xpath("//img[@src='#{img_src}']") {|x| x.remove}
Doesn’t work either. Any ideas on what I’m doing wrong?
I got it. It was a stupid mistake. All your solutions were correct.
Nokogiri has two different parser modes, one for XML and one for HTML. XML is strict and HTML is very relaxed because, well, HTML is not always well-behaved.
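You choose the mode by the entry point you call: Nokogiri::XML for the strict XML parser or Nokogiri::HTML for the relaxed HTML one.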
This is how I generally parse an HTML file:
To strip a tag you need to locate it first, then remove it. After we parse an HTML or XML document we’ll have a Nokogiri::HTML or Nokogiri::XML document respectively, and at that point what we called “tags” are now called “nodes”. Nokogiri can find nodesets, which are all the nodes that match a search, or an individual node, which will be the first match from a search.

A call such as doc.at('img[@src="a.png"]').remove searches for the first node matching src="a.png" using a CSS accessor, which is generally easier and cleaner than XPath. Nokogiri understands both XPath and CSS very well, and the website mentions some advantages of CSS.

To locate all nodes matching the accessor, you could replace doc.at('img[@src="a.png"]').remove with a search that returns a nodeset, then remove each node in it.

The tutorials are worth reading too.