I am using SelectorGadget for the first time and am having trouble. When I run the code below, why is only the first result displayed in the terminal?
Also, is there an easier way to get the text after the ICD-10 code on the example page? As of now, SelectorGadget only gets the links, not the plain text.
require 'rubygems'
require 'nokogiri'
require 'open-uri'
url = "http://en.wikipedia.org/wiki/ICD-10_Chapter_XVII:_Congenital_malformations,_deformations_and_chromosomal_abnormalities"
doc = Nokogiri::HTML(open(url))
puts doc.at_css("li li:nth-child(1) li a , li li ul:nth-child(5) :nth-child(1), .new, li:nth-child(3) li a, li li li:nth-child(10) li:nth-child(9) li:nth-child(4) :nth-child(1) li:nth-child(5) :nth-child(1) :nth-child(1) li:nth-child(2) :nth-child(1), li a:nth-child(4), li li li:nth-child(1), #mw-content-text li a:nth-child(5), li :nth-child(4) ul:nth-child(4) :nth-child(1), #mw-content-text li a:nth-child(3)").text
This gets all the text following a bullet with a Q code:
The XPath matches a list item (li) which contains an external link with icd10 in the URL, then extracts the text from it.

It’s a bit of a broad brushstroke: it gets all the text, which means further manipulation will be necessary if you don’t want the code, or subitems that don’t have a code. But in any case it’s a start.