I’d like to fix a script, but I’m lost in all its nesting, so I’m asking SO.
The script is:
    from xml.dom.minidom import parse
    from itertools import groupby

    yXML = parse('/root/Desktop/gb/data/yConfig.xml')
    servers = []
    for AllConfigurations in yXML.getElementsByTagName('AllConfigurations'):
        for DeployConfigurations in AllConfigurations.getElementsByTagName('DeployConfigurations'):
            for Servers in DeployConfigurations.getElementsByTagName('Servers'):
                for Group in Servers.getElementsByTagName('Group'):
                    for GApp in Group.getElementsByTagName('GApp'):
                        for Server in Group.getElementsByTagName('Server'):
                            servers.append((Server.getAttribute('name'),
                                            Group.getAttribute('name'),
                                            Server.getAttribute('ip'),
                                            GApp.getAttribute('type')))

    def line(machine, group, ip, services):
        return " | ".join([machine.ljust(22), group.ljust(22), ip.ljust(18), services])

    print line("Machine", "Group", "IP", "Services")
    print line("----------", "----------", "----------", "----------")
    for server, services in groupby(sorted(servers), lambda server: server[0:3]):
        print line("- " + server[0], server[1], server[2],
                   ", ".join(service[3] for service in set(services)))
XML is:
    <AllConfigurations>
      <DeployConfigurations>
        <Servers>
          <Group id="1" name="The Perfect Life" username="root" password="mypasswd123" state="">
            <GApp id="1" name="JBoss Servers" type="JBoss" path="/root/Desktop/jboss-as-7.0.2.Final/" state="">
              <Server id="1" name="Jboss1" ip="192.168.1.250" path="/root/Desktop/jboss-as-7.0.2.Final/" username="" password="" state="" />
              <Server id="2" name="Jboss2" ip="192.168.1.251" path="/root/Desktop/jboss-as-7.0.2.Final/" username="" password="" state="" />
              <Server id="3" name="Jboss3" ip="192.168.1.252" path="/root/Desktop/jboss-as-7.0.2.Final/" username="" password="" state="" />
              <Server id="4" name="Jboss4" ip="192.168.1.253" path="/root/Desktop/jboss-as-7.0.2.Final/" username="" password="" state="" />
            </GApp>
            <GApp id="2" name="Tomcat Servers" type="Tomcat" path="/root/Desktop/apache-tomcat-7.0.22/" state="">
              <Server id="1" name="Tom1" ip="192.168.1.250" path="/root/Desktop/apachee/" username="" password="" state="" />
              <Server id="2" name="Tom2" ip="192.168.1.251" path="/root/Desktop/apache-tomcat-7.0.22/" username="" password="" state="" />
              <Server id="3" name="Tom3" ip="192.168.1.252" path="/root/Desktop/apache-tomcat-7.0.22/" username="" password="" state="" />
              <Server id="4" name="Tom4" ip="192.168.1.111" path="/root/Desktop/apache-tomcat-7.0.22/" username="" password="" state="" />
            </GApp>
          </Group>
        </Servers>
      </DeployConfigurations>
    </AllConfigurations>
Current output is:
    Machine                | Group                  | IP                 | Services
    ----------             | ----------             | ----------         | ----------
    - Jboss1               | The Perfect Life       | 192.168.1.250      | Tomcat, JBoss
    - Jboss2               | The Perfect Life       | 192.168.1.251      | Tomcat, JBoss
    - Jboss3               | The Perfect Life       | 192.168.1.252      | JBoss, Tomcat
    - Jboss4               | The Perfect Life       | 192.168.1.253      | JBoss, Tomcat
    - Tom1                 | The Perfect Life       | 192.168.1.250      | JBoss, Tomcat
    - Tom2                 | The Perfect Life       | 192.168.1.251      | Tomcat, JBoss
    - Tom3                 | The Perfect Life       | 192.168.1.252      | JBoss, Tomcat
    - Tom4                 | The Perfect Life       | 192.168.1.111      | JBoss, Tomcat
The issues are:
1- As you can see, Tom4 is listed with JBoss, but there is no JBoss server on 192.168.1.111; that machine runs only Tomcat. Jboss4 (253) has only JBoss, and the others (250, 251, 252) have both. The Services column is wrong.
2- The same IP is printed more than once. I can’t handle it…
3- And the Machine column…
The output should look like this:
    Machine                | Group                  | IP                 | Services
    ----------             | ----------             | ----------         | ----------
    - Jboss1 / Tom1        | The Perfect Life       | 192.168.1.250      | JBoss, Tomcat
    - Jboss2 / Tom2        | The Perfect Life       | 192.168.1.251      | JBoss, Tomcat
    - Jboss3 / Tom3        | The Perfect Life       | 192.168.1.252      | JBoss, Tomcat
    - Jboss4               | The Perfect Life       | 192.168.1.253      | JBoss
    - Tom4                 | The Perfect Life       | 192.168.1.111      | Tomcat
So, what should I do?
Thanks
Warning: this answer is huge.
You have a bunch of problems in your code.
itertools.groupby() used incorrectly

The most relevant problem is that you are sorting and grouping your servers using two different key functions. When you group a sequence, it should first be sorted by the same key function that will group it. In your case, since you are going to group by IP (which is the third element of the server tuple), your function should be:
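A minimal sketch of such a key function, assuming each server is the 4-tuple `(name, group, ip, service)` built in the question. The name `get_ip` is only an illustrative choice, and `print` is called with a single argument so the snippet also runs on Python 3:

```python
def get_ip(server):
    # server is assumed to be (name, group, ip, service); index 2 is the IP
    return server[2]


example = ("Jboss1", "The Perfect Life", "192.168.1.250", "JBoss")
print(get_ip(example))  # 192.168.1.250
```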
Now you can even sort the servers before processing them, for clarity:
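For instance, assuming a `servers` list of `(name, group, ip, service)` tuples and a hypothetical key function `get_ip` that returns the third element, the sort could be sketched like this:

```python
def get_ip(server):
    # hypothetical key function: the IP is the third element of the tuple
    return server[2]


# a shortened, deliberately unsorted sample of the question's data
servers = [
    ("Jboss1", "The Perfect Life", "192.168.1.250", "JBoss"),
    ("Jboss4", "The Perfect Life", "192.168.1.253", "JBoss"),
    ("Tom1", "The Perfect Life", "192.168.1.250", "Tomcat"),
    ("Tom4", "The Perfect Life", "192.168.1.111", "Tomcat"),
]

# sort in place so that tuples sharing an IP end up next to each other
servers.sort(key=get_ip)
```

The IPs are compared as plain strings here. That is enough for grouping, since all that matters is that equal IPs become adjacent, but it is not a numeric ordering of addresses.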
The groupby() iterator will yield a series of pairs, each consisting of a key and an iterator over all the items that share that key, as you probably already know. Since the key is the IP, I will declare the loop as follows. Note that the function is the same one that sorted the servers:

Concatenating the values to form the columns

Inside the loop, what will we do? For each IP, we will get the set of all machines, the set of all groups and the set of all services associated with the IP. First, we will create the empty sets:
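As a sketch of the loop header and the empty sets, assuming `servers` holds `(name, group, ip, service)` tuples and using a hypothetical `get_ip` key function for both sorting and grouping. The `group_sizes` dictionary is not part of the approach; it only makes the grouping visible:

```python
from itertools import groupby


def get_ip(server):
    # hypothetical key function, used for sorting AND grouping
    return server[2]


servers = [
    ("Jboss1", "The Perfect Life", "192.168.1.250", "JBoss"),
    ("Jboss4", "The Perfect Life", "192.168.1.253", "JBoss"),
    ("Tom1", "The Perfect Life", "192.168.1.250", "Tomcat"),
    ("Tom4", "The Perfect Life", "192.168.1.111", "Tomcat"),
]
servers.sort(key=get_ip)

group_sizes = {}
for ip, servers_in_ip in groupby(servers, get_ip):
    # one fresh set per output column, recreated for every IP
    machines, groups, services = set(), set(), set()
    # the sets stay empty in this sketch; here we only count the group
    group_sizes[ip] = len(list(servers_in_ip))
```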
Then, we will iterate over all servers yielded by the iterator returned by groupby() for the given IP. For each server, we will add each piece of its info to the corresponding set:

Once that is done, we will join the machines, groups and services, each set into one string. Then just pass these values to the line() function and print the result:

Checkpoint
For clarity, the resulting code follows. You can just replace all the code before the declaration of the line() function by the code below:

The printed result is the one below:
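A minimal sketch of what this checkpoint could look like, with the resulting rows summarized in the closing comment. It assumes a shortened sample of the question's data in place of the parsed XML. The names `get_ip`, `servers_in_ip` and `rows` are illustrative, and `print` is called with a single argument so the snippet also runs on Python 3:

```python
from itertools import groupby


def get_ip(server):
    # hypothetical key function: group by the third tuple element
    return server[2]


def line(machine, group, ip, services):
    return " | ".join([machine.ljust(22), group.ljust(22), ip.ljust(18), services])


servers = [
    ("Jboss1", "The Perfect Life", "192.168.1.250", "JBoss"),
    ("Jboss4", "The Perfect Life", "192.168.1.253", "JBoss"),
    ("Tom1", "The Perfect Life", "192.168.1.250", "Tomcat"),
    ("Tom4", "The Perfect Life", "192.168.1.111", "Tomcat"),
]
servers.sort(key=get_ip)

rows = []
for ip, servers_in_ip in groupby(servers, get_ip):
    machines, groups, services = set(), set(), set()
    for name, group, _, service in servers_in_ip:
        machines.add(name)
        groups.add(group)
        services.add(service)
    rows.append((machines, groups, ip, services))
    print(line("- " + " / ".join(machines), ", ".join(groups), ip, ", ".join(services)))

# Rows come out in IP order: 192.168.1.111 (Tom4, Tomcat only),
# 192.168.1.250 (Jboss1 and Tom1, JBoss and Tomcat) and
# 192.168.1.253 (Jboss4, JBoss only). The order of the names inside
# one cell is not guaranteed, because sets are unordered.
```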
Sorting by the machine names
It is not exactly what you asked for: the lines are sorted by IP, not by the machine column. Of course: this is how we sorted the servers before. To sort as you asked, I would propose this solution: just before the for, create a list in a variable. Instead of printing the line, append a tuple with the values to this list:

Then, sort the list by the first item of the tuples (the machine names):

Now iterate over all server tuples and print the lines:
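A sketch of that collect, sort and print sequence. To keep it short, it starts from the tuples the grouping loop would have appended (in IP order) rather than rebuilding them. The list name `lines` is hypothetical:

```python
def line(machine, group, ip, services):
    return " | ".join([machine.ljust(22), group.ljust(22), ip.ljust(18), services])


# tuples as the grouping loop would append them, still in IP order
lines = [
    ("- Tom4", "The Perfect Life", "192.168.1.111", "Tomcat"),
    ("- Jboss1 / Tom1", "The Perfect Life", "192.168.1.250", "JBoss, Tomcat"),
    ("- Jboss4", "The Perfect Life", "192.168.1.253", "JBoss"),
]

# sort by the first item of each tuple, i.e. the machine column
lines.sort(key=lambda row: row[0])

for row in lines:
    print(line(*row))
```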
Bonus point: removing the looping behemoth
At the beginning of the program, you have no fewer than six nested for loops. Man, this is madness (or SPARTAAA, but both are bad ideas). You can easily remove all this nesting this way: retrieve all the Server tags directly from the yXML object. From each tag, you can get the server name by calling server.getAttribute('name'). The Group tag is the grandparent of the Server tag, so you can get the group name with server.parentNode.parentNode.getAttribute('name'). The IP can be retrieved from the server tag easily: server.getAttribute('ip'). And the service name is an attribute of the parent of the server tag, so you can get it this way: server.parentNode.getAttribute('type').

Summing up, you can get all the servers with the rather smaller loop below:
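A minimal sketch of such a flat loop. It parses a trimmed-down copy of the question's XML from a string with `parseString` (an assumption made to keep the example self-contained; the question reads a file with `parse`). The variable names `gapp` and `group` are illustrative:

```python
from xml.dom.minidom import parseString

# trimmed-down version of the question's XML, for illustration only
XML = """<AllConfigurations><DeployConfigurations><Servers>
  <Group name="The Perfect Life">
    <GApp type="JBoss">
      <Server name="Jboss4" ip="192.168.1.253" />
    </GApp>
    <GApp type="Tomcat">
      <Server name="Tom4" ip="192.168.1.111" />
    </GApp>
  </Group>
</Servers></DeployConfigurations></AllConfigurations>"""

yXML = parseString(XML)

servers = []
for server in yXML.getElementsByTagName('Server'):
    gapp = server.parentNode   # the enclosing <GApp> element
    group = gapp.parentNode    # the enclosing <Group> element
    servers.append((server.getAttribute('name'),
                    group.getAttribute('name'),
                    server.getAttribute('ip'),
                    gapp.getAttribute('type')))
```

Because the service type is read from each server's own parent, a server is paired only with the GApp that actually contains it, so Tom4 no longer picks up JBoss.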
Remember the Zen of Python:
Sorting the machine names
Oh, sure, there is still a problem: the machine names are not well sorted. This is easy to repair, however: just sort the sets. Replace the lines below
by the lines below:
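A sketch of the sorted version, assuming the three sets are named `machines`, `groups` and `services` (hypothetical names) and that machine names are joined with " / " as in the desired output:

```python
machines = set(["Tom1", "Jboss1"])
groups = set(["The Perfect Life"])
services = set(["Tomcat", "JBoss"])

# sorted() turns each set into a list with a predictable order,
# unlike iterating over the set directly
machine_cell = " / ".join(sorted(machines))
group_cell = ", ".join(sorted(groups))
service_cell = ", ".join(sorted(services))

print(machine_cell)  # Jboss1 / Tom1
print(service_cell)  # JBoss, Tomcat
```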
In this example, we are sorting all sets, not only the machine names. I bet it is a good choice too.
I know this answer is inappropriately long but I hope to have both solved your problem and clarified a lot of points.