We implemented MongoDB sharding for our chat module, using Node.js and MongoDB.
MongoDB Sharding Configuration
===============================
Shard1 = PRIMARY + SECONDARY + ARBITER
Shard2 = PRIMARY + SECONDARY + ARBITER
Config
Mongos
We got the following details this morning, but we don't know what is causing them.
Please let me know how we can resolve this issue.
"errmsg" : "rollback 2 error findcommonpoint waiting a while before trying again"
"errmsg" : "error RS102 too stale to catch up"
data2:PRIMARY> rs.status()
{
    "set" : "data2",
    "date" : ISODate("2012-07-27T04:30:29Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "50.52.108.16:20001",
            "health" : 1,
            "state" : 9,
            "stateStr" : "ROLLBACK",
            "uptime" : 322,
            "optime" : {
                "t" : 1343361602000,
                "i" : 155
            },
            "optimeDate" : ISODate("2012-07-27T04:00:02Z"),
            "lastHeartbeat" : ISODate("2012-07-27T04:30:29Z"),
            "errmsg" : "rollback 2 error findcommonpoint waiting a while before trying again"
        },
        {
            "_id" : 1,
            "name" : "50.52.108.17:20002",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "optime" : {
                "t" : 1343363429000,
                "i" : 7
            },
            "optimeDate" : ISODate("2012-07-27T04:30:29Z"),
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "50.52.108.17:20003",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 10880311,
            "optime" : {
                "t" : 0,
                "i" : 0
            },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2012-07-27T04:30:28Z")
        }
    ],
    "ok" : 1
}
data1:PRIMARY> rs.status()
{
    "set" : "data1",
    "date" : ISODate("2012-07-27T04:30:17Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "50.52.108.17:10001",
            "health" : 1,
            "state" : 3,
            "stateStr" : "RECOVERING",
            "uptime" : 35,
            "optime" : {
                "t" : 1343320338000,
                "i" : 3
            },
            "optimeDate" : ISODate("2012-07-26T16:32:18Z"),
            "lastHeartbeat" : ISODate("2012-07-27T04:30:16Z"),
            "errmsg" : "error RS102 too stale to catch up"
        },
        {
            "_id" : 1,
            "name" : "50.52.108.16:10002",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "optime" : {
                "t" : 1343363417000,
                "i" : 30
            },
            "optimeDate" : ISODate("2012-07-27T04:30:17Z"),
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "50.52.108.16:10003",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 10880162,
            "optime" : {
                "t" : 0,
                "i" : 0
            },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2012-07-27T04:30:16Z")
        }
    ],
    "ok" : 1
}
Kumaran
It looks like the secondary was down for a long time and can no longer sync with the primary. Syncing requires the oplog to still contain every write applied to the primary during the secondary's downtime. Because the oplog is a capped collection, its oldest entries are overwritten once it fills up, so if the secondary was down for too long, the entries it needs may already be gone. You need to do a full resync:
http://www.mongodb.org/display/DOCS/Resyncing+a+Very+Stale+Replica+Set+Member
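To see how far behind the stale member is, you can compare its optime with the primary's. The following is only a rough sketch in plain JavaScript, not an official tool: replicationLag is a hypothetical helper, and it assumes the status object has the same shape as the data1 rs.status() output in the question, with optime.t in milliseconds (other MongoDB versions may report optime differently).

```javascript
// Hypothetical helper: report how far each data-bearing member is behind the primary.
// Assumes `status` looks like the rs.status() output in the question (optime.t in ms).
function replicationLag(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return []; // no primary visible, nothing to compare against
  return status.members
    .filter((m) => m.stateStr !== "PRIMARY" && m.stateStr !== "ARBITER")
    .map((m) => ({
      name: m.name,
      state: m.stateStr,
      lagSeconds: (primary.optime.t - m.optime.t) / 1000,
    }));
}

// Only the fields needed here, copied from the data1 output in the question.
const data1 = {
  members: [
    { name: "50.52.108.17:10001", stateStr: "RECOVERING", optime: { t: 1343320338000, i: 3 } },
    { name: "50.52.108.16:10002", stateStr: "PRIMARY", optime: { t: 1343363417000, i: 30 } },
    { name: "50.52.108.16:10003", stateStr: "ARBITER", optime: { t: 0, i: 0 } },
  ],
};

console.log(replicationLag(data1));
// lagSeconds for 50.52.108.17:10001 is 43079, i.e. just under 12 hours
```

If that lag is longer than the time span the primary's oplog currently covers, the member cannot catch up on its own, which is what the RS102 message is reporting.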
After the resync, consider increasing the oplog size to avoid a similar situation in the future.
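As a rough way to pick a new oplog size, you can work backwards from the longest downtime you want a secondary to survive. This is only back-of-the-envelope arithmetic: both function names, the safety factor, and the example numbers (1024 MB oplog, 256 MB of oplog entries per hour) are made up for illustration. The real size and time span of your oplog are shown by db.printReplicationInfo() on the primary.

```javascript
// Hypothetical helper: hours of writes the oplog can hold before old entries are overwritten.
function oplogWindowHours(oplogSizeMB, oplogGrowthMBPerHour) {
  if (oplogGrowthMBPerHour <= 0) return Infinity; // no writes, so nothing is overwritten
  return oplogSizeMB / oplogGrowthMBPerHour;
}

// Hypothetical helper: oplog size needed to cover a given downtime, with some headroom.
function requiredOplogSizeMB(maxDowntimeHours, oplogGrowthMBPerHour, safetyFactor = 2) {
  return Math.ceil(maxDowntimeHours * oplogGrowthMBPerHour * safetyFactor);
}

// Illustrative numbers only, not taken from the cluster in the question.
console.log(oplogWindowHours(1024, 256));   // 4
console.log(requiredOplogSizeMB(12, 256));  // 6144
```

With those made-up numbers, a 1024 MB oplog would cover only 4 hours of writes, so a secondary that is down for 12 hours would go stale, while roughly 6 GB would cover it with a 2x margin.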