I need urgent help here, since this is affecting our production instance.
One of the replica servers is failing due to lack of memory (see the excerpt from kern.log below):
kernel: [80110.848341] Out of memory: kill process 4643 (mongod) score 214181 or a child
kernel: [80110.848349] Killed process 4643 (mongod)
UPDATE
kernel: mongod invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
kernel: [85544.157191] mongod cpuset=/ mems_allowed=0
kernel: [85544.157195] Pid: 7545, comm: mongod Not tainted 2.6.32-318-ec2
Insight:
- The primary server's DB size is 50GB, of which 30GB is taken up by indexes.
- The primary server has 7GB of RAM, whereas the secondary server has 3.1GB of RAM.
- Both servers are 64-bit machines, running Debian and Ubuntu respectively.
- Both servers are running MongoDB 2.0.2.
Note:
I see that a similar issue was recently filed on the MongoDB Jira site, with no answer yet.
Do you have swap enabled on these instances? While swap is generally not needed for MongoDB operation, it can prevent the process from being killed by the kernel when you hit an OOM situation. That is mentioned here:
http://www.mongodb.org/display/DOCS/Production+Notes#ProductionNotes-Swap
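One quick way to check whether swap is configured is to read the SwapTotal field from /proc/meminfo. Below is a minimal Python sketch, assuming a Linux host; the function names parse_meminfo and swap_enabled are made up for illustration and are not part of any MongoDB tooling.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> numeric value.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_enabled(text):
    """Return True if the meminfo text reports a non-zero SwapTotal."""
    return parse_meminfo(text).get("SwapTotal", 0) > 0
```

On the server itself, something like swap_enabled(open("/proc/meminfo").read()) should tell you whether any swap is configured at all.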
The referenced issue happens during a full re-sync rather than ongoing production replication. Is that what you are doing as well?
Once you get things stable, take a look at the resident memory (res) in mongostat or MMS. If it exceeds or is close to 3GB, you should consider upgrading your secondary, since it only has 3.1GB of RAM.
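If mongostat or MMS is not at hand, the resident set size of the mongod process can also be read from the VmRSS field of /proc/<pid>/status on Linux. A rough Python sketch follows; the function names and the 90% threshold are assumptions chosen for illustration, not a MongoDB recommendation.

```python
def resident_kb(status_text):
    """Return the VmRSS value (in kB) from /proc/<pid>/status text, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # Line looks like "VmRSS:   3000000 kB"; the second token is the value.
            return int(line.split()[1])
    return None


def near_ram_limit(rss_kb, ram_kb, threshold=0.9):
    """Return True if resident memory is at or above `threshold` of physical RAM.

    The default of 0.9 is an arbitrary illustrative cutoff.
    """
    return rss_kb >= threshold * ram_kb
```

Pass resident_kb the contents of /proc/<pid>/status for the running mongod process, and compare the result against the machine's total RAM in kB (for example the MemTotal value from /proc/meminfo).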