So I installed and configured SNMP on, let's say, node A. Let's also say that node A mounts a filesystem over NFS from storage node B. Finally, I have monitoring node C. When node C requests information via SNMP from node A, node A complies without trouble as long as nothing is wrong.
I've been running into this problem, though. Let's say that storage node B fails. If node C then requests info via SNMP from node A, the SNMP daemon on node A freezes because it can't reach the NFS mount point.
This behavior is very counterintuitive given the purpose of SNMP. SNMP is used for monitoring system stats. If something fails, I want to know that something failed. From node C's perspective, in this case it looks as if someone turned off node A (which clearly is not what happened).
I looked for a configuration setting that would make the SNMP daemon skip over MIBs it can't read, but found nothing. Any ideas?
Turns out that the NFS timeout hanging the SNMP daemon is an old bug. The way to fix it is basically to have snmpd skip over all the NFS mount points when it builds the Host Resources storage data. Add the following option to snmpd.conf, then restart snmpd so it rereads the file:
skipNFSInHostResources yes
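
If you want to confirm the option is actually set on each monitored node, here is a minimal Python sketch that scans the text of an snmpd.conf for the directive. The function name skips_nfs is made up for illustration. The sketch assumes that the directive name is matched case-insensitively, that "yes", "true" and "1" all count as enabled, and that the last occurrence in the file wins, so check those assumptions against your Net-SNMP version.

```python
def skips_nfs(conf_text):
    """Return True if the config text enables skipNFSInHostResources.

    Hypothetical helper: assumes "yes", "true" or "1" mean enabled and
    that a later directive overrides an earlier one.
    """
    enabled = False
    for raw_line in conf_text.splitlines():
        # Drop anything after '#' so commented-out directives are ignored.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if parts[0].lower() == "skipnfsinhostresources" and len(parts) >= 2:
            enabled = parts[1].lower() in ("yes", "true", "1")
    return enabled


# Example input: a tiny config with the option from this post.
SAMPLE_CONF = """
# skip NFS mounts so a dead storage node cannot hang snmpd
skipNFSInHostResources yes
"""

print(skips_nfs(SAMPLE_CONF))
```

To check a real node, read the file contents and pass them in, for example skips_nfs(open("/etc/snmp/snmpd.conf").read()). That path is a common default but is an assumption here; use wherever your snmpd.conf lives.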