I recently had a replica set member fall a few days out of sync.

Question

Asked: June 11, 2026

I recently had a replica set member fall a few days out of sync. Using the “Resyncing a Very Stale Replica Set Member” instructions, I stopped mongod on the secondary machine, wiped out the data directories, restarted the process, and let the machine re-sync to the primary.

Everything worked perfectly, or so it seemed. The logs suggested the sync went fine, and it eventually reported as complete, resulting in the following rs.status() output on the secondary machine:

# The secondary machine's status for itself and its primary:
{
    "_id" : 0,
    "name" : "myprimary:myport",
    "health" : 1,
    "state" : 1,
    "stateStr" : "PRIMARY",
    "uptime" : 497,
    "optime" : {
        "t" : 1347562257000,
        "i" : 1
    },
    "optimeDate" : ISODate("2012-09-13T18:50:57Z"),
    "lastHeartbeat" : ISODate("2012-09-13T19:00:34Z"),
    "pingMs" : 3
},
{
    "_id" : 2,
    "name" : "mysecondary:myport",
    "health" : 1,
    "state" : 2,
    "stateStr" : "SECONDARY",
    "optime" : {
        "t" : 1347562257000,
        "i" : 1
    },
    "optimeDate" : ISODate("2012-09-13T18:50:57Z"),
    "self" : true
}

As expected, the machines are in sync and share an optime value. But the primary machine tells a different story. It still reports the secondary's old, out-of-sync optime, even though the re-sync completed and the primary's own optime has advanced since then.

# The primary machine's status for itself and its secondary:
{
    "_id" : 0,
    "name" : "myprimary:myport",
    "health" : 1,
    "state" : 1,
    "stateStr" : "PRIMARY",
    "uptime" : 497,
    "optime" : {
        "t" : 1347562257000,
        "i" : 1
    },
    "optimeDate" : ISODate("2012-09-13T18:50:57Z"),
    "self" : true
},
{
    "_id" : 2,
    "name" : "mysecondary:myport",
    "health" : 1,
    "state" : 2,
    "stateStr" : "SECONDARY",
    "optime" : {
        "t" : 1347103757000,
        "i" : 1
    },
    "optimeDate" : ISODate("2012-09-08T11:29:17Z"),
    "lastHeartbeat" : ISODate("2012-09-11T17:27:06Z"),
    "pingMs" : 3
}
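
To put a number on the disagreement, here is a minimal sketch in plain JavaScript that compares the optime values copied from the two rs.status() outputs above. The helper name optimeLagSeconds is hypothetical, and the sketch assumes the "t" field is milliseconds since the Unix epoch, which matches the optimeDate values shown.

```javascript
// Member documents copied from the rs.status() outputs above,
// reduced to the fields used here.
const primaryViewOfSelf = { name: "myprimary:myport", optime: { t: 1347562257000, i: 1 } };
const primaryViewOfSecondary = { name: "mysecondary:myport", optime: { t: 1347103757000, i: 1 } };
const secondaryViewOfSelf = { name: "mysecondary:myport", optime: { t: 1347562257000, i: 1 } };

// Hypothetical helper: how far `member` trails `reference`, in seconds.
// Assumes optime.t is in milliseconds.
function optimeLagSeconds(reference, member) {
  return (reference.optime.t - member.optime.t) / 1000;
}

const lagSeenByPrimary = optimeLagSeconds(primaryViewOfSelf, primaryViewOfSecondary);
const lagSeenBySecondary = optimeLagSeconds(primaryViewOfSelf, secondaryViewOfSelf);

console.log(lagSeenByPrimary);   // 458500 seconds, roughly 5.3 days
console.log(lagSeenBySecondary); // 0
```

So the secondary considers itself fully caught up, while the primary's record of it is about 5.3 days behind.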

What am I missing? At first I thought “wait it out”, but it’s been nearly an hour and the database had inserts in that time. Can I force the primary to heartbeat-check the secondary, or do I need to re-sync them again?

The only real oddity I can find on the primary is this:

PRIMARY> use local;
PRIMARY> db.slaves.find()
{ "_id" : ObjectId("4f675b909d8e143a90055864"), "host" : "<hostIP>", "ns" : "local.oplog.rs", "syncedTo" : { "t" : 1347395837000, "i" : 1 } }
{ "_id" : ObjectId("50522761212b77e9637ad541"), "host" : "<hostIP>", "ns" : "local.oplog.rs", "syncedTo" : { "t" : 1347562257000, "i" : 1 } }

Both entries are for the same host (the secondary machine in question). My understanding is that there should be only one entry, but I’m hesitant to touch the collection without a better understanding of what it tracks and how it updates.
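
For reference, a small sketch that converts the two syncedTo values above into readable dates, to see which entry is the stale one. The helper name syncedToDate is hypothetical, "<hostIP>" is the placeholder from the output above, and the sketch again assumes "t" is milliseconds since the Unix epoch.

```javascript
// The two local.slaves documents shown above, reduced to the fields used here.
const slaves = [
  { host: "<hostIP>", syncedTo: { t: 1347395837000, i: 1 } },
  { host: "<hostIP>", syncedTo: { t: 1347562257000, i: 1 } },
];

// Hypothetical helper: render a { t, i } timestamp as an ISO 8601 string.
// Assumes t is in milliseconds.
function syncedToDate(entry) {
  return new Date(entry.syncedTo.t).toISOString();
}

for (const entry of slaves) {
  console.log(entry.host, syncedToDate(entry));
}
// <hostIP> 2012-09-11T20:37:17.000Z
// <hostIP> 2012-09-13T18:50:57.000Z
```

The second entry matches the current optime from rs.status(). The first is from September 11, before the re-sync, which suggests it is a leftover record for the same machine.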

1 Answer

Editorial Team · Answer 1 · 2026-06-11T09:05:06+00:00

It might be worth trying this: bring down the secondary, delete both entries from the primary’s local.slaves collection (the db.slaves shown above after use local), and then restart the secondary.

Do the data files corroborate that the machines are in sync?
