This question has two aspects, both related to indices. I have a dataset with

Asked: May 29, 2026 (2026-05-29T03:52:16+00:00)

This question has two aspects, both related to indices.

I have a dataset with 530 million entries, each entry has an array of 10 elements. I am using a single mongod. I am constructing an index on the array post-bulk-insert. The array has two key-value pairs of type string – int.

I have already deduced/researched that creating the index before inserting the data is what MongoDB is designed for, and that datasets this large cannot be indexed post-insert without a massive amount of RAM or swappable virtual memory.

one: phases of index construction

What are the phases of index construction? I was looking at the log and saw the progress go from 0 to 100% once, only to begin counting again after it reached 100% (something to do with sorting?). The second phase was MUCH slower than the first. Are there any more passes that need to be done?

two: Index state

I wasn’t going to watch the index construction at this rate, and I have an indexed dataset as a backup (which I can’t trust anymore, keep reading). So, I kill -9'd the process. I started the process again, and the logs show the database acknowledging that an index build operation was in progress and ended incorrectly, but nothing beyond this. The index shows up in the db.<db-name>.getIndexes() list.

I find this VERY odd, especially the getIndexes bit. I know for a fact that index construction never finished in this case, and now I can’t trust the backups I have, in which I believed indexing had ended OK.

I at least expect a database platform to be in a consistent state, or to get to one before it hands control back to me. So, either roll back the index construction, finish it, or refuse to start without a recovery operation.

So how do I find out if my database is in a consistent state, specifically the indices?

1 Answer

Editorial Team · Answer 1 · 2026-05-29T03:52:17+00:00

So how do I find out if my database is in a consistent state, specifically the indices?

For this, there is a validate command, which checks a collection's data and its indexes. It is a blocking command, like repair, but it looks like it has a few options.
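As a minimal sketch of how that check might look from the mongo shell: the collection name `entries` and the helper `summarizeValidation` are placeholders invented for illustration, and the field names (`valid`, `errors`, `warnings`, `nIndexes`, `keysPerIndex`) assume the result document of recent MongoDB versions. Older versions may report the result differently.

```javascript
// Hypothetical helper: condense the document returned by validate.
// Field names are assumed from recent MongoDB versions; older ones may differ.
function summarizeValidation(result) {
  const errors = result.errors || [];
  const warnings = result.warnings || [];
  return {
    consistent: result.valid === true && errors.length === 0,
    indexCount: result.nIndexes,
    keysPerIndex: result.keysPerIndex || {},
    errors: errors,
    warnings: warnings,
  };
}

// `db` only exists inside the mongo shell; outside it this part is skipped.
if (typeof db !== "undefined") {
  // "entries" is a placeholder collection name.
  // { full: true } requests the more thorough (and slower) check.
  const result = db.getCollection("entries").validate({ full: true });
  printjson(summarizeValidation(result));
}
```

Comparing `keysPerIndex` against the number of keys you expect is a rough sanity check. With an index on a 10-element array, a complete index would presumably hold far more keys than there are documents.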

So, either roll back the index construction, finish it, or refuse to start without a recovery operation.

Agreed. And the logs should be crystal clear about the state of the DB when it is restarted. However, MongoDB is definitely not “there” yet.

The second phase was MUCH slower than the first. Are there any more passes that need to be done?

Indeed. Once the second phase is done, the DB locks and performs a giant fsync as it flushes the newly created index to disk. It was probably at this point when you killed it.
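While the counting phases are still running, progress can also be watched from a second shell instead of the log. The sketch below assumes that `db.currentOp()` returns an `inprog` array whose entries carry `msg` and `progress.done` / `progress.total`, as in recent MongoDB versions. The helper `indexBuildProgress` and its message filter are illustrative guesses, not part of MongoDB.

```javascript
// Hypothetical helper: pick out operations whose message mentions an index
// and turn their progress counters into a percentage.
function indexBuildProgress(inprog) {
  return inprog
    .filter((op) => typeof op.msg === "string" && /index/i.test(op.msg))
    .map((op) => {
      const p = op.progress;
      const percent =
        p && p.total > 0 ? Math.round((100 * p.done) / p.total) : null;
      return { opid: op.opid, msg: op.msg, percent: percent };
    });
}

// `db` only exists inside the mongo shell; outside it this part is skipped.
if (typeof db !== "undefined") {
  printjson(indexBuildProgress(db.currentOp().inprog));
}
```

If no progress counters are reported while the index is being flushed, this helper shows `percent: null` for the operation, or nothing at all.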

The last time I watched this process happen, there was no log message during the fsync. Given the size of your data, this will represent gigs and gigs of data flushing to the disk. Run some math on the speed of your drives vs. the size of the index, but this phase could definitely represent a lot of waiting time.
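A back-of-the-envelope version of that math might look like the sketch below. Only the 530 million entries and 10 array elements come from the question. The 40 bytes per index key and the 100 MB/s sustained write speed are assumptions to be replaced with measured values, and one index key per array element is assumed as well.

```javascript
// Rough estimate of how long flushing a freshly built index could take.
// Assumes one index key per array element; all inputs except the entry
// count and array length are guesses.
function estimateFlushSeconds(entries, keysPerEntry, bytesPerKey, diskBytesPerSecond) {
  const indexBytes = entries * keysPerEntry * bytesPerKey;
  return { indexBytes: indexBytes, seconds: indexBytes / diskBytesPerSecond };
}

const estimate = estimateFlushSeconds(
  530e6, // entries, from the question
  10,    // array elements per entry, from the question
  40,    // bytes per index key (assumption)
  100e6  // sustained disk write speed in bytes per second (assumption)
);

console.log("index size (GB):", estimate.indexBytes / 1e9);
console.log("flush time (minutes):", (estimate.seconds / 60).toFixed(1));
```

Under these assumptions the index comes to about 212 GB and the flush to roughly 35 minutes, and that is with purely sequential writes. Slower or busier disks would stretch this considerably.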
