Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8215889
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 7, 20262026-06-07T11:56:16+00:00 2026-06-07T11:56:16+00:00

While converting a collection to a capped collection last night, my secondary’s optime started

  • 0

While converting a collection to a capped collection last night, my secondary’s optime started lagging behind the primary. It advanced slowly, a couple seconds every several minutes, and eventually fell out of the primary’s oplog window. Per the instructions here I stopped mongod on the secondary, deleted all data files, and restarted it, though I forgot to lock the primary from writes. Secondary went through its initialization phase, which took a solid amount of time, and was finally back in business, but when I logged in, replication was now even farther behind.

Since this is the cloud, after all, I created an image of my primary (which is supposed to copy all data), though I couldn’t run db.fsyncLock() at the time, because it was taking some writes. The new image finishes, and I launch a new server based on that image, get it added to my replica set, remove the old secondary, and life is good, right? Not quite – the new secondary is lagging about an hour behind, and through the course of the day (and tonight) eventually gets to the point where it’s 14 hours behind (though strangely enough still within the oplog window).

I take the next step from the “resyncing a stale member page”. Shutdown mongod on both servers, gzip and copy my data folder from primary to secondary, unzip, fire them both up, db.fsyncLock() my primary. What blows my mind is that even with THE SAME DATA, after initializing, my secondary says it’s 1 hour behind. I add it back into the replica set, and it catches up quickly to being 5 minutes behind.

All good, right? No – flash forward, secondary is advancing sloooooowly, and is now 20 minutes behind. Mongostat has secondary pegged at 95+ locked %, iostat -xm 2 doesn’t show anything crazy – primary is currently idle from not taking writes, secondary is definitely not doing much at all (.04 wMB/second). Not sure if it’s worth mentioning, but primary currently feels dog slowunresponsive logging into the mongo shell, etc.

What gives, Mongo? Why can’t you just catch up? What am I doing wrong in trying to get my secondary caught up?

EDIT
Answering questions:

  • Version: 2.0.4
  • Hardware: Both nodes are the same hardware, near as I can tell – 8GB RAM, quad core CPU. I assume it’s something virtualized.
  • Write rate: it varies. As mentioned, last night I was converting to a capped collection, which triggered the whole thing. Overnight, there was a process writing around a couple hundred small documents (~155 bytes each) a few times an hour, so max I’d estimate around 100-200kbytes/hour. During the day, processing was more intense, updating hundreds of thousands of 500-byte documents, and writing a couple hundred thousand more. Still not talking about enormous data quantities. EDIT found some iostat output from earlier today:
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
xvda              1.00  2564.50  243.50  282.50  8986.00 11388.00    77.47    11.32   21.46    2.36   37.93   0.50  26.50

That one was particularly bursty at 11 wMB/s, saw util% hit 34% with 7 wMB/s, and 72% at 52 rMB/s. So not saturated, but definitely a read-heavy workload in the morning. It is interesting though that despite having obj. size ~5GB, and ~1GB indexes (see below), there’s so much disk activity. Shouldn’t that all be in RAM?

  • Working set: I still haven’t found the accepted methodology for calculating working set, but if it helps:
    "collections" : 21,
    "objects" : 15540092,
    "avgObjSize" : 325.26198326238995,
    "dataSize" : 5054601144,
    "storageSize" : 5874327552,
    "numExtents" : 132,
    "indexes" : 43,
    "indexSize" : 864366720,
    "fileSize" : 10666115072,
    "nsSizeMB" : 16,
    "ok" : 1

I can’t imagine that that’s overwhelming 8GB of RAM, though I could be wrong.

  • some recent mongostat samples from secondary:
insert  query update delete getmore command flushes mapped  vsize    res faults locked % idx miss %     qr|qw   ar|aw  netIn netOut  conn    set repl       time 
    *0     *0     *0     *0       0     1|0       0  22.2g  44.9g   912m      0     99.2          0       0|0     0|1     2k   303b   151 mySet  SEC   03:47:54 
    *0     *0     *0     *0       0     1|0       0  22.2g  44.9g  1.85g      0      101          0       0|0     0|1     3k   303b   151 mySet  SEC   03:48:04 

EDIT

Tried more things. I shutdown the primary (now called A, secondary will be B), deleted its data, and unzipped its snapshot (now a couple hours old, but at this point, we’re not writing anything new). Started A up with –fastsync, and it’s still like 45 seconds behind the B’s (now primary) optime, which had been hanging out at about 02:19:52UTC. Finally about an hour later, A catches up, so I call rs.stepDown() on B. Instantly, rs.status() shows me that both servers have optimes around 04:08 UTC, but B (now secondary) is again lagging by 17 seconds…then 30 seconds…now 7 minutes…

EDIT

A few minutes after taking @matulef’s suggestion and re-creating indexes on my capped collections, as well as re-starting the secondary’s mongod process, its optime has only increased a few seconds. Secondary locked % from mongostat is still hovering from 95-104%, and interestingly enough, res size was swinging pretty wildly from 100M to 2GB and back again before settling around 1GB.

EDIT (the next evening)

Conclusion to the story – @matulef was on the right track, I should have been more careful about converting a replicated collection to a capped collection. What follows is what happened, though I don’t advertise this as being data safe – I freely acknowledge I may have lost some data in this process, so YMMV.

Creating the indexes for the capped collections on the primary (A) didn’t propagate to the secondary (B), and A happened to fail over (not intentionally). Once B was primary, I manually created the indexes on the capped collections there, and the resync operation to bring A in line with B started moving quickly. Unfortunately for me, my oplog windows no longer lined up, so I did end up having to snapshot data from B to A. Once I restarted mongo with the same set of data, A & B were happy again, and replication has been back in sync ever since.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-07T11:56:18+00:00Added an answer on June 7, 2026 at 11:56 am

    The problem here is that capped collections do not have an _id index by default (and the “convertToCapped” command actually drops all indexes for that collection). This is an issue because the secondaries perform updates by applying ops from the oplog, which refer to documents by their _ids. If you are missing an _id index, each update requires a full table scan on the secondaries, causing them to lag far behind.

    The solution is to create an _id index on the capped collection. However, if you create the index on the primary, but your secondaries are already lagging behind, they won’t receive the index creation operation quickly enough. Instead, the best way to fix things is to first fix up each lagging secondary, one-by-one. For each one, shut it down and restart in standalone mode (on a different port, without the –replSet option), build the _id index, then add it back into the set. Finally, once the secondaries are fixed up, you can step down the primary and repeat the process with it too.

    Update: In mongoDB 2.0.x and earlier, capped collections do not have an _id index by default. However, the default behavior is scheduled to change in mongoDB 2.2, so that capped collections created in 2.2+ will have an _id index created automatically, just as with non-capped collections. For capped collections created prior to 2.2, you will still need to manually create an _id index using the steps outlined above, but new collections should not suffer from the problems seen above.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a problem while converting a string whose value is dd.mm.yyyy to DateTime
While decoding,I am getting NSData bytes by decoding a string.I am converting NSData bytes
I've got a problem converting an object property to string while using reflection... string
I wrote this handy, generic function for converting a collection of collections into a
Have just started converting an existing job tracking system into an ASP.NET MVC application.
I've been developing with kinect XBOX 360 for a while. Today, after connecting a
I'm new to coffeescript, and while there is extensive information concerning the proper method
While debugging and keep pressing F5, if the source code does not exist, eclipse
While trying to build the mysql2 gem with ruby 1.9.2-p320 on Fedora 16, I
While deveoping a site (using Forms authentication and InProc sessionstate) a frequently run into

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.