I’m trying to find duplicates within the XML returned by a web service call, using Ruby and Nokogiri.
The output I’m getting from the code below looks something like this:
found duplicate["1", "1"]
found duplicate["1", "1"]
found duplicate["1", "1"]
found duplicate["1", "1"]
found duplicate["2", "2"]
What I want to know is that SKUs 1 and 2 have been duplicated, so I’d like a single line such as “found duplicate skus [Duplicated skus]”.
The XML looks like this:
<Root>
  <Context>
    <ID>1234</ID>
    <Item>
      <ID>4567</ID>
    </Item>
    <Item>
      <ID>4567</ID>
    </Item>
    <Item>
      <ID>5678</ID>
    </Item>
  </Context>
</Root>
require 'nokogiri'
require 'open-uri'

# Context items that will produce duplicates.
$context = ['a', 'b', 'c']

# Reopen Array to add a method that returns every element occurring more than once.
class Array
  def only_duplicates
    duplicates = []
    self.each { |each| duplicates << each if self.count(each) > 1 }
    duplicates
  end
end

# Loop through each item in the $context array.
$context.each do |item|
  puts "C_ItemID = " + item
  # Build the URL string using the context item.
  url = "url to the call"
  # Parse the response into an XML document.
  doc = Nokogiri::XML(URI.open(url))
  # Blank array that will hold the text of each node.
  values = []
  # Loop through each item id node to find duplicates.
  doc.xpath('//item/id').each do |node|
    values << node.text
    @values = values.to_a
    if @values.only_duplicates.count > 1
      puts "found duplicate" + @values.only_duplicates.inspect
    end
  end
end
A faster way to find duplicates (credit: Ryan LeCompte), in a slightly modified and shorter version.
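
The snippet this answer refers to is not reproduced here, so what follows is only a sketch of one common approach, not necessarily the credited version. Calling `count` inside `each` rescans the whole array for every element, whereas tallying into a Hash needs a single pass. The helper name `duplicate_values` and the sample `skus` array are made up for illustration.

```ruby
# Hypothetical helper (not from the original post): returns each value that
# occurs more than once, listed a single time, in order of first appearance.
def duplicate_values(values)
  counts = Hash.new(0)                 # missing keys start at 0
  values.each { |v| counts[v] += 1 }   # one pass to count occurrences
  counts.select { |_, n| n > 1 }.keys  # keep only values seen more than once
end

# Sample data standing in for the node texts collected from the XML.
skus = %w[1 1 1 2 2 3]
dups = duplicate_values(skus)
puts "found duplicate skus #{dups.inspect}" unless dups.empty?
```

With Nokogiri, the same idea would be to collect all node texts first, for example `doc.xpath('//Item/ID').map(&:text)`, and call the helper once after the loop rather than inside it, so each duplicated SKU is reported only once. Note that XPath is case-sensitive, so the expression has to match the element names as they appear in the actual response.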