I am building a device monitoring system and use the InetAddress.isReachable method to determine whether each device on the network is online. A ScheduledExecutorService works through the list of devices and pings them concurrently over ICMP.
With a small number of devices (say 60) and a pool of 10 threads, this works fine: the reported status of each device is correct. The isReachable timeout is 5000 ms.
When the number of devices increases to around 80, isReachable reports some devices as offline even though they are online. Raising the isReachable timeout to 10000 ms improves the chance of getting the correct status.
Most of the devices are Linux-based, and isReachable always returns the correct status for them. For the Windows devices, the behavior is unpredictable.
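For reference, the setup described above might look roughly like the sketch below. The class name DeviceMonitorSketch, the helper names probe and start, the 30-second delay between checks, and the placeholder host are all assumptions made for illustration; only the pool of 10 threads and the 5000 ms timeout come from the description. Note that the Javadoc for isReachable says a typical implementation uses ICMP echo requests only if the privilege can be obtained and otherwise tries a TCP connection to port 7 of the target, which may be part of why results differ between device types.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DeviceMonitorSketch {

    // One isReachable attempt. Lookup failures and I/O errors count as "offline".
    static boolean probe(String host, int timeoutMs) {
        try {
            return InetAddress.getByName(host).isReachable(timeoutMs);
        } catch (IOException e) {
            return false;
        }
    }

    // Schedules one repeating check per device on a shared pool and
    // records the latest result for each host in the given map.
    static ScheduledExecutorService start(List<String> hosts, Map<String, Boolean> status,
                                          int threads, int timeoutMs, long delaySeconds) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(threads);
        for (String host : hosts) {
            scheduler.scheduleWithFixedDelay(
                    () -> status.put(host, probe(host, timeoutMs)),
                    0, delaySeconds, TimeUnit.SECONDS);
        }
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> hosts = Arrays.asList("127.0.0.1"); // placeholder device list
        Map<String, Boolean> status = new ConcurrentHashMap<>();
        ScheduledExecutorService scheduler = start(hosts, status, 10, 5000, 30);
        Thread.sleep(6000);      // give the first round time to finish
        scheduler.shutdownNow(); // a real monitor would keep running instead
        System.out.println(status);
    }
}
```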
I need the reported status of every device on the network to be correct each time. An alternative mechanism would be to start a ping process from Java and treat an exit value of 0 as meaning the device is online.
For example: Process proc = new ProcessBuilder("ping", host).start();
What would experts advise? Would checking a device's status with a Process, as shown above, be more reliable than calling isReachable?
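To make the subprocess idea concrete, here is a minimal sketch of checking the exit value. The class and method names (PingProcessSketch, pingCommand, pingOnce) are made up for illustration. Two details are worth noting: on Linux, ping without a count option keeps running until interrupted, so the one-liner above would never produce an exit value there, and the count flag differs by platform (-n on Windows, -c on Unix-like systems). Exit codes are also not perfectly consistent across platforms, so treat 0 as a hint rather than a guarantee. Redirect.DISCARD assumes Java 9 or later.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

class PingProcessSketch {

    // Builds a ping command that sends exactly one echo request.
    static List<String> pingCommand(String host, boolean windows) {
        return Arrays.asList("ping", windows ? "-n" : "-c", "1", host);
    }

    // Runs ping once and reports whether it exited with value 0 in time.
    static boolean pingOnce(String host, long timeoutSeconds)
            throws IOException, InterruptedException {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        Process proc = new ProcessBuilder(pingCommand(host, windows))
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD) // Java 9+
                .start();
        if (!proc.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            proc.destroyForcibly(); // ping did not finish in time
            return false;
        }
        return proc.exitValue() == 0;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(pingOnce("127.0.0.1", 5));
    }
}
```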
Running ping in subprocesses is unlikely to make things much better; while there will be less load on the Java process, you’re just shifting it around within one machine. (Furthermore, I’m not sure if you can actually ping multiple machines at once from one host, due to the way that ICMP ECHO — the standardized core of ping — works.) The other issue you’re likely to run into is that a machine can be responding to ping without actually being usefully reachable; I’ve seen machines where the kernel was working (making it pingable) but where there were no working user processes, and you can easily imagine the particular interesting service on the machine being down. (Also, some firewalls block ping.) It’s far better to actually detect if each of the machines is working using some kind of do-nothing connection to the real service running on that machine.
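As a rough illustration of a do-nothing connection, the sketch below just opens a TCP connection to the port the real service listens on and closes it again. The class name ServiceCheckSketch, the method name canConnect, and the host and port in main are placeholders, not anything from the question; a stricter check would also speak a little of the service's own protocol.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class ServiceCheckSketch {

    // Returns true if a TCP connection to host:port is accepted within the timeout.
    // Refused connections, timeouts, and unknown hosts all count as "down".
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder target: an SSH daemon on the local machine.
        System.out.println(canConnect("127.0.0.1", 22, 2000));
    }
}
```

Unlike a ping, this only succeeds when a process is actually listening on that port, so it catches the case where the kernel answers but the service is dead.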
If you’re really looking into keeping track of the status of large numbers of machines, you should look into using software designed for the task (e.g., Nagios). That’s much more a question for ServerFault than Stack Overflow…