I have not found any documentation or tutorial for that. Does anything like that exist?
doc.xpath('//table/tbody[@id="threadbits_forum_251"]/tr')
The code above will get me any table, anywhere, that has a tbody child with the attribute id equal to “threadbits_forum_251”. But why does it start with a double //? Why is there /tr at the end? See “Ruby Nokogiri Parsing HTML table II” for more details.
Can anybody tell me how to extract href, id, alt, src, etc., using Nokogiri?
'td[3]/div[1]/a/text()' <--- extracts text
How can I extract other things?
It seems you need to read an XPath tutorial.
Your //table/tbody[@id="threadbits_forum_251"]/tr expression means:
// – anywhere in your XML document
table/tbody – take a table element with a tbody child
[@id="threadbits_forum_251"] – where the id attribute equals “threadbits_forum_251”
tr – and take its tr elements
So, basically, you need to know:
attribute names begin with @
conditions go inside [] brackets
If I correctly understood that API, you can go with
doc.at_xpath("td[3]/div[1]/a")["href"], or select the attribute directly with td[3]/div[1]/a/@href, if there is just one <a> element.