I am trying to figure out a solution for managing a set of Linux machines (OS: Ubuntu, ~40 nodes, identical hardware). These machines are supposed to be images of each other: software installed on one needs to be installed on all the others. My software requirements are Hadoop, R and ServiceMix. R packages also need to be synchronized across all machines (a package installed on one needs to be available on all the others).
Right now I am using NFS and pssh. I am hoping there is a better or easier solution out there that would make my life a bit easier. Any suggestion is appreciated.
Two popular choices are Puppet from Puppet Labs and Chef from Opscode.
Another potential mechanism is creating a new metapackage that lists the packages you want installed on all machines as its dependencies (the Depends: field in Debian packaging; Requires: is the RPM equivalent). When you modify your metapackage, running apt-get update && apt-get -u dist-upgrade on each system would install the newly added packages there.
The metapackage approach might be less work to configure and use initially, but Puppet or Chef might provide better returns on investment in the long run, as they can manage far more than just package installs.
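To make the metapackage idea concrete, here is a minimal sketch that generates a control file of the kind the equivs tool accepts. The package name cluster-node-meta, the version, the description, and the dependency list are all hypothetical placeholders. The sketch also assumes every dependency is available as a package in a repository your nodes can reach, which may not hold for Hadoop or ServiceMix without a third-party or private repository.

```python
def render_control(name, version, depends):
    """Return the text of an equivs-style control file for a metapackage.

    All values passed in are illustrative; nothing here is specific to
    the cluster described in the question.
    """
    lines = [
        "Section: misc",
        "Priority: optional",
        f"Package: {name}",
        f"Version: {version}",
        # The dependency list is what forces apt to install these packages.
        "Depends: " + ", ".join(depends),
        "Description: Packages required on every cluster node",
        # Continuation lines of the long description start with a space.
        " Installing this package pulls in the software every node needs.",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: r-base is the Ubuntu package for R; extend the list
# with whatever else the nodes should share.
control = render_control("cluster-node-meta", "1.0", ["r-base", "openssh-server"])
print(control)
```

Saving the output to a file and running equivs-build on it should produce a .deb that you can publish in a repository the nodes use. Increase the version each time you change the dependency list, otherwise apt will not see an upgrade to install.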