The Archive Base

Editorial Team
Asked: May 28, 2026 (2026-05-28T13:42:54+00:00)

I ran a simple experiment to test MongoDB’s performance and disk usage. I inserted 22GB of data, but it occupies 50GB on disk. The experiment is described in detail below.

Setup:

  • Version: MongoDB 2.0.2.
  • Environment: 1) Single node without any replication or sharding. 2) VM via VirtualBox. 3) Linux Ubuntu 64bit. 4) 100GB fixed virtual disk and 1GB memory
  • Language: C# with the MongoDB C# driver
  • Target and procedure: Very simple. I just keep creating a new {KEY, VALUE} pair and inserting it into MongoDB.
  • Number of insertions = 1024 * 1024 * 1024 / 3
  • Size of the KEY = 20 bytes (byte array), a counter incremented by 1 for each insertion, i.e., KEY = {1, 2, 3, …, 1024*1024*1024}
  • Size of the VALUE = 100 bytes (byte array), randomly generated through the Random class.

Results:

With this experiment I intended to insert about 40GB of data (120 bytes per insertion) into MongoDB, which I believe is simple enough. However, I stopped once the inserted data reached 22GB because I noticed the storage overhead. The data I actually inserted is about 22GB, but the indexdb.* files total 50GB, so the storage overhead is more than 100%.

My own thoughts:

I have read quite a bit of MongoDB’s documentation. From what I have read, there might be two kinds of storage overhead.

  1. The oplog. But it is meant to be capped at about 5% of disk space. In my case, that is about 5GB.
  2. Preallocated data files. I didn’t change any mongod settings, so I think each file is 2GB, allocated in advance. Assuming the latest 2GB file in use is nearly empty, that is at most 4GB of overhead in total.

So by my calculation, no matter how much data I insert, there should be at most 9GB of overhead. But the overhead is now 50GB - 22GB = 28GB, and I have no idea what is inside that 28GB. If the overhead is always more than 100%, that is quite a lot.
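
The estimate above can be written out as a short calculation. This is only a sketch of the arithmetic in the post: all inputs are the figures quoted there, the variable names are made up for illustration, and treating "GB" as 1024^3 bytes is an assumption.

```javascript
// Back-of-the-envelope check of the figures quoted in the question.
// Assumption: "GB" in the post means GiB (1024^3 bytes).
const GIB = 1024 ** 3;

const plannedInserts = Math.floor((1024 * 1024 * 1024) / 3); // loop bound from the post
const payloadBytesPerInsert = 20 + 100;                      // KEY + VALUE
const plannedGiB = (plannedInserts * payloadBytesPerInsert) / GIB;

const insertedGiB = 22;            // raw payload inserted before stopping
const onDiskGiB = 50;              // total size of the indexdb.* files
const expectedOverheadGiB = 5 + 4; // oplog cap + preallocated files, as estimated in the post

const observedOverheadGiB = onDiskGiB - insertedGiB;
const unexplainedGiB = observedOverheadGiB - expectedOverheadGiB;
const overheadRatio = observedOverheadGiB / insertedGiB;

console.log(`planned payload:   ${plannedGiB.toFixed(1)} GiB`);
console.log(`observed overhead: ${observedOverheadGiB} GiB (${Math.round(overheadRatio * 100)}% of payload)`);
console.log(`not covered by the oplog/preallocation estimate: ${unexplainedGiB} GiB`);
```

The calculation only restates the numbers in the question; it does not say where the remaining space goes.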

Can anyone please explain this to me?


Here are some MongoDB stats I obtained from the mongo shell.

db.serverStatus() {
"host" : "mongodb-VirtualBox",
"version" : "2.0.2",
"process" : "mongod",
"uptime" : 531693,
"uptimeEstimate" : 460787,
"localTime" : ISODate("2012-01-26T16:32:12.888Z"),
"globalLock" : {
     "totalTime" : 531692893756,
     "lockTime" : 454374529354,
     "ratio" : 0.8545807827977436,
     "currentQueue" : {
          "total" : 0,
          "readers" : 0,
          "writers" : 0
     },
     "activeClients" : {
          "total" : 0,
          "readers" : 0,
          "writers" : 0
     }
},
"mem" : {
     "bits" : 64,
     "resident" : 292,
     "virtual" : 98427,
     "supported" : true,
     "mapped" : 49081,
     "mappedWithJournal" : 98162
},
"connections" : {
     "current" : 3,
     "available" : 816
},
"extra_info" : {
     "note" : "fields vary by platform",
     "heap_usage_bytes" : 545216,
     "page_faults" : 14477174
},
"indexCounters" : {
     "btree" : {
          "accesses" : 3808733,
          "hits" : 3808733,
          "misses" : 0,
          "resets" : 0,
          "missRatio" : 0
     }
},
"backgroundFlushing" : {
     "flushes" : 8861,
     "total_ms" : 26121675,
     "average_ms" : 2947.93759169394,
     "last_ms" : 119,
     "last_finished" : ISODate("2012-01-26T16:32:03.825Z")
},
"cursors" : {
     "totalOpen" : 0,
     "clientCursors_size" : 0,
     "timedOut" : 0
},
"network" : {
     "bytesIn" : 44318669115,
     "bytesOut" : 50995599,
     "numRequests" : 201846471
},
"opcounters" : {
     "insert" : 0,
     "query" : 3,
     "update" : 201294849,
     "delete" : 0,
     "getmore" : 0,
     "command" : 551619
},
"asserts" : {
     "regular" : 0,
     "warning" : 0,
     "msg" : 0,
     "user" : 1,
     "rollovers" : 0
},
"writeBacksQueued" : false,
"dur" : {
     "commits" : 28,
     "journaledMB" : 0,
     "writeToDataFilesMB" : 0,
     "compression" : 0,
     "commitsInWriteLock" : 0,
     "earlyCommits" : 0,
     "timeMs" : {
          "dt" : 3062,
          "prepLogBuffer" : 0,
          "writeToJournal" : 0,
          "writeToDataFiles" : 0,
          "remapPrivateView" : 0
     }
},
"ok" : 1}

db.index.dataSize(): 29791637704

db.index.storageSize(): 33859297120

db.index.totalSize(): 45272200048

db.index.totalIndexSize(): 11412902928

db.runCommand("getCmdLineOpts"): { "argv" : [ "./mongod" ], "parsed" : { }, "ok" : 1 }
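
The size figures above can be reconciled with each other. The sketch below copies them into a plain object and splits the mapped size into its parts. It assumes that `mem.mapped` (reported in megabytes) roughly equals the total size of the data files and that a megabyte here is 1024 * 1024 bytes. The function and field names are illustrative, not part of any MongoDB API.

```javascript
// Figures copied from the shell output above (bytes unless noted).
const stats = {
  dataSize: 29791637704,
  storageSize: 33859297120,
  totalIndexSize: 11412902928,
  totalSize: 45272200048,
  mappedMB: 49081, // serverStatus().mem.mapped, in megabytes
};

const GIB = 1024 ** 3;
const toGiB = (bytes) => (bytes / GIB).toFixed(2);

// Split the mapped size into documents, unused space inside the collection's
// extents, indexes, and whatever is mapped outside this collection.
function breakdown(s) {
  const mappedBytes = s.mappedMB * 1024 * 1024; // assumption: MB means MiB
  return {
    documents: s.dataSize,
    unusedInExtents: s.storageSize - s.dataSize,
    indexes: s.totalIndexSize,
    outsideCollection: mappedBytes - s.totalSize,
  };
}

const parts = breakdown(stats);
for (const [name, bytes] of Object.entries(parts)) {
  console.log(`${name}: ${toGiB(bytes)} GiB`);
}
```

With these inputs, storageSize plus totalIndexSize equals totalSize exactly. The last entry is whatever is mapped beyond this one collection; what it contains (for example preallocated files or other databases) cannot be determined from these numbers alone.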


My code fragment. I removed the MongoDB connection code and kept only the core logic here.

// Instance method (not static) so that it can call put(), which is an instance method.
void fillupDb()
{
    for (double i = 0; i < 1024 * 1024 * 1024 / 3; i++)
    {
        // Convert the counter i to a 20-byte array used as KEY.
        // BitConverter.GetBytes(double) returns 8 bytes; the other 12 bytes stay zero.
        byte[] prekey = BitConverter.GetBytes(i);
        byte[] key = new byte[20];
        prekey.CopyTo(key, 0);

        // Generate a random 100-byte VALUE
        byte[] value = getRandomBytes(100);
        put(key, value);
    }
}

public void put(byte[] key, byte[] value)
{
    BsonDocument pair = new BsonDocument {
        { "_id", key } /* I am using _id as the index */,
        { "value", value }};
    // Save() inserts the document, or replaces an existing one with the same _id
    collection.Save(pair);
}
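
The 120 bytes counted per insertion are only the raw KEY and VALUE payloads. The sketch below estimates what the document built in put() takes up once encoded as BSON, following the BSON specification's layout for binary elements. It assumes both byte arrays are stored with the generic binary subtype (0x00); a driver using the old binary subtype (0x02) would add 4 more bytes per field. The helper names are hypothetical, and any per-record overhead added by the storage layer is not included.

```javascript
// Size in bytes of one BSON binary element:
// type byte + field name as C string + int32 length + subtype byte + payload.
function bsonBinaryElementSize(name, payloadBytes) {
  const nameBytes = new TextEncoder().encode(name).length + 1; // trailing NUL
  return 1 + nameBytes + 4 + 1 + payloadBytes;
}

// Size of a BSON document made only of binary fields:
// int32 total length + elements + trailing NUL.
function bsonDocumentSize(fields) {
  const elements = fields.reduce(
    (sum, f) => sum + bsonBinaryElementSize(f.name, f.bytes),
    0
  );
  return 4 + elements + 1;
}

const rawPayload = 20 + 100;
const docSize = bsonDocumentSize([
  { name: "_id", bytes: 20 },
  { name: "value", bytes: 100 },
]);

console.log(`raw payload: ${rawPayload} bytes, BSON document: ${docSize} bytes`);
```

Under these assumptions each document is 147 bytes of BSON rather than 120 bytes, before any padding or index entries are counted.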

1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 1:42 pm (2026-05-28T13:42:55+00:00)

    Well, first of all, how do you measure the size of your input data? A key-value pair can be two strings or a JSON object.

    Additionally, every document gets some padding to allow it to grow through subsequent updates. The average padding factor can be retrieved through db.col.stats().paddingFactor

    Finally, there’s more than just the oplog that may add to your overhead. There’s always an index on _id, which in your case (since your documents are so small) will introduce significant overhead in terms of disk space usage. Unless you disabled it (--nojournal), the journal will add quite a few bytes to the total as well.
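
To put a rough number on the points above, the sketch below derives per-document figures from an object shaped like the output of the collection's stats() call. The size and totalIndexSize values are the dataSize and totalIndexSize quoted in the question. The document count is an assumption: it is taken from opcounters.update, on the guess that each Save() was counted as one update that created one document. The function name is made up for illustration.

```javascript
// Per-document view of collection statistics.
// `stats` mimics the fields count, size and totalIndexSize of db.<collection>.stats().
function perDocument(stats) {
  return {
    avgObjSize: stats.size / stats.count,
    indexBytesPerDoc: stats.totalIndexSize / stats.count,
    indexToDataRatio: stats.totalIndexSize / stats.size,
  };
}

const sample = {
  count: 201294849,            // assumption: opcounters.update used as the document count
  size: 29791637704,           // db.index.dataSize()
  totalIndexSize: 11412902928, // db.index.totalIndexSize()
};

const rawPayloadBytes = sample.count * 120; // 20-byte KEY + 100-byte VALUE per document
const result = perDocument(sample);

console.log(`raw payload: ${(rawPayloadBytes / 1024 ** 3).toFixed(1)} GiB`);
console.log(`average stored document: ${result.avgObjSize.toFixed(1)} bytes`);
console.log(`_id index cost per document: ${result.indexBytesPerDoc.toFixed(1)} bytes`);
console.log(`index size relative to data: ${(result.indexToDataRatio * 100).toFixed(0)}%`);
```

If the count assumption holds, the raw payload works out to about 22.5 GiB, in line with the 22GB quoted in the question. Each document then occupies about 148 bytes instead of 120, and the _id index adds roughly 57 bytes per document on top of that.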

    Hope that helps.
