I am trying to write a Ruby script that takes the Nexpose Simple XML results export, parses it, and writes the required results out in a prettier format for easy review. I am using Nokogiri to parse the XML. My issue is with a nested loop that, for each device, iterates through each service section and pulls out the name, port, and protocol attributes. These will ultimately be written to a file, either plain text or CSV. However, my nested loop seems to pull those three attributes only from the first service section and print them repeatedly.
Sample Input (there will be more than one of these device blocks):
<device address="10.x.x.1" id="20xx">
<fingerprint certainty="0.85">
<description>Microsoft Windows</description>
<vendor>Microsoft</vendor>
<family>Windows</family>
<product>Windows</product>
<version/>
<device-class>General</device-class>
<architecture/>
</fingerprint>
<vulnerabilities>
</vulnerabilities>
<services>
<service name="NTP" port="123" protocol="udp">
<vulnerabilities>
</vulnerabilities>
</service>
<service name="HTTP" port="8080" protocol="tcp">
<fingerprint certainty="0.75">
<description>Apache</description>
</device>
<device address="10.x.x.2" id="20xx">
<fingerprint certainty="0.85">
<description>Microsoft Windows</description>
<vendor>Microsoft</vendor>
<family>Windows</family>
<product>Windows</product>
<version/>
<device-class>General</device-class>
<architecture/>
</fingerprint>
<vulnerabilities>
</vulnerabilities>
<services>
<service name="DNS" port="53" protocol="udp">
<vulnerabilities>
</vulnerabilities>
</service>
<service name="HTTP" port="80" protocol="tcp">
<fingerprint certainty="0.75">
<description>Apache</description>
</device>
Ruby Code:
#! /usr/bin/env ruby
require 'rubygems'
require 'nokogiri'
doc = Nokogiri::XML(open('report.xml').read)
device = doc.xpath('//device')
device.each do |d|
service = d.xpath('//service')
puts d.attr('address')
service.each do |s|
name = s.attr('name')
port = s.attr('port')
protocol = s.attr('protocol')
puts port
puts protocol
puts name
end
end
Desired Output:
10.x.x.1
123
udp
NTP
8080
tcp
HTTP
10.x.x.2
53
udp
DNS
80
tcp
HTTP
Actual Output:
123
NTP
udp
123
NTP
udp
So the code should show a list of service port, name, and protocol for each service of each device. However, the current code seems to just print the set for the first service (which is 123, NTP, and udp) over and over and over.
Am I missing something in the logic of my loop? Or do you see anything wrong with the loops? Any help getting this working would be helpful. Thanks.
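Note that the XPath construct // means find the element anywhere in the document. You don’t want to do that in the inner loop, because you’ve already done that for your device. There, d.xpath('//service') searches the whole report again on every pass; a relative path such as d.xpath('.//service') searches only beneath the current device.
Update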
Based on the new input document, here is one way to extract the information you need. I took the liberty of using CSV, for a nice Excel-ready output file. Note that there is a single parsing loop. Code:
Here’s the contents of devices.csv: