I'm looking for a third-party tool or solution that can monitor my server's network usage, so I can tell when server resources (e.g. bandwidth) are becoming heavily utilized and take preemptive measures before the server crashes (e.g. call on my secondary server to share the load).
Currently I have written continuous ping logic in my servlet so that my two servers (one acts as the primary and the other as the backup) stay aware of whether the other server is available/alive.
Please suggest some standard tools or solutions for my current ping-based server-liveness check.
Note that I'm avoiding solutions that manage all servers centrally, because I'm building a redundant system in which every instance (server) has to monitor and notify on its own.
Nagios and Icinga are both free, open-source monitoring systems that work in roughly the same way. Either can be set up centrally or distributed.
If you are using mutual-pinging to check the webapps’ liveness, you are probably going to be disappointed. Instead, you should be properly clustering the servers with a failover-capable load-balancer. You can use JMX to observe activity on the backup server: any spike in activity would mean that the primary one is down (plus, you can directly instrument the primary server for that matter).
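To give an idea of what "observing activity via JMX" looks like in code, here is a minimal sketch that reads two standard JVM values (heap in use and live thread count) from the platform MBean server of the current JVM. The class name JmxPeek and its helper methods are made up for illustration. Watching a remote backup server would go through a JMX connector to that machine instead of the local MBean server, and which attributes actually signal "activity" for your webapp is something you would have to decide.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

// Hypothetical helper class; the name JmxPeek is only for illustration.
class JmxPeek {

    // Read one attribute from the MBean server of the current JVM.
    static Object read(String objectName, String attribute) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return server.getAttribute(new ObjectName(objectName), attribute);
        } catch (Exception e) {
            throw new IllegalStateException(
                    "Cannot read " + objectName + " / " + attribute, e);
        }
    }

    // HeapMemoryUsage is a composite value; "used" is one of its keys.
    static long heapUsedBytes() {
        CompositeData usage =
                (CompositeData) read("java.lang:type=Memory", "HeapMemoryUsage");
        return (Long) usage.get("used");
    }

    // Number of live threads in this JVM.
    static int liveThreads() {
        return (Integer) read("java.lang:type=Threading", "ThreadCount");
    }

    public static void main(String[] args) {
        System.out.println("heap used (bytes): " + heapUsedBytes());
        System.out.println("live threads: " + liveThreads());
    }
}
```

The same object names and attribute names are what you would see when browsing the MBean tree in a JMX client, so a sketch like this is mostly useful for checking that a value you picked is readable before wiring it into a monitoring check.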
As for values to inspect, fire up jconsole on a development instance of your webapp and browse through the various data you can observe via JMX. Then, either use something like check_jmx (there are one or more Nagios plug-ins with that name) or Tomcat’s JMXProxyServlet (part of the Tomcat Manager webapp) over HTTP to grab those values on a regular basis.
We use JMXProxyServlet + Nagios + a few custom scripts to read the responses from JMXProxyServlet and convert them into meaningful responses that Nagios understands, and it has worked quite well across multiple servers and environments, with many different values being sampled.
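As a rough illustration of the "convert the response into something Nagios understands" step, here is a small sketch in Java. It is not the scripts mentioned above. It assumes the JMXProxyServlet answer is a single line of the form "OK - Attribute get '...' - ... = 12345" with a numeric value at the end (check this against your Tomcat version), and it uses the usual Nagios plug-in exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The class name, method names and thresholds are all made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical converter; names and thresholds are for illustration only.
final class JmxProxyCheck {
    // Conventional Nagios plug-in exit codes.
    static final int OK = 0, WARNING = 1, CRITICAL = 2, UNKNOWN = 3;
    private static final String[] LABELS = {"OK", "WARNING", "CRITICAL", "UNKNOWN"};

    // Assumed response shape: "OK - <description> = <integer>".
    private static final Pattern NUMERIC_VALUE =
            Pattern.compile("^OK - .* = (-?\\d{1,18})$");

    // Returns the numeric value at the end of the response, or null if absent.
    static Long parseValue(String response) {
        if (response == null) return null;
        Matcher m = NUMERIC_VALUE.matcher(response.trim());
        return m.matches() ? Long.valueOf(m.group(1)) : null;
    }

    // Maps the value onto an exit code using "at or above" thresholds.
    static int exitCode(String response, long warn, long crit) {
        Long value = parseValue(response);
        if (value == null) return UNKNOWN;
        if (value >= crit) return CRITICAL;
        if (value >= warn) return WARNING;
        return OK;
    }

    // Builds the one-line plug-in output: status text, then "|" and perfdata.
    static String statusLine(String name, String response, long warn, long crit) {
        Long value = parseValue(response);
        if (value == null) {
            return "UNKNOWN - could not parse a value for " + name;
        }
        int code = exitCode(response, warn, crit);
        return LABELS[code] + " - " + name + " = " + value
                + " | " + name + "=" + value + ";" + warn + ";" + crit;
    }
}
```

A wrapper would fetch the response over HTTP, print statusLine(...) and exit with exitCode(...). For example, with warn = 10000 and crit = 20000, a response ending in "= 12345" gives "WARNING - heap_used = 12345 | heap_used=12345;10000;20000" and exit code 1.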