I’m currently working on a ‘simple’ photo system with MongoDB, using a Replica Set

Asked: June 7, 2026 (2026-06-07T22:11:19+00:00)

I’m currently working on a ‘simple’ photo system with MongoDB, using a Replica Set and GridFS.

The principle is simple: I store a lot of photos using GridFS, the client knows the filename, and from the filename I can retrieve the file.

Does GridFS index the filename? Hopefully yes, but I couldn’t find it written down in any official doc.

My stats are :

     {
        "ns" : "photos.socialphotos.files",
        "count" : 758086,
        "size" : 168295128,
        "avgObjSize" : 222.00004748801587,
        "storageSize" : 220647424,
        "numExtents" : 15,
        "nindexes" : 2,
        "lastExtentSize" : 43311104,
        "paddingFactor" : 1,
        "flags" : 1,
        "totalIndexSize" : 125084624,
        "indexSizes" : {
            "_id_" : 22925504,
            "filename_1_uploadDate_1" : 102159120
        },
        "ok" : 1
    }

EDIT: by running reIndex() on the collections, I freed up 30 GB, but it’s still way too high.

My indexes are :

{
    "v" : 1,
    "key" : {
        "_id" : 1
    },
    "ns" : "photos.socialphotos.files",
    "name" : "_id_"
},
{
    "v" : 1,
    "key" : {
        "filename" : 1,
        "uploadDate" : 1
    },
    "ns" : "photos.socialphotos.files",
    "name" : "filename_1_uploadDate_1"
}

Keys per index:

"keysPerIndex" : {
    "photos.socialphotos.files.$_id_" : 758086,
    "photos.socialphotos.files.$filename_1_uploadDate_1" : 758086
}

I never use the _id_ index, as I don’t store the _id on the client side. Is it OK to remove it?
The total index size is 125084624 bytes, which I take to mean that I should have almost all my photos in RAM, which seems a bit strange?

Additional questions:

Statistics: mongostat covers the basics. Is there another good tool for monitoring, or do I have to create my own?
Faults: I see a LOT of page faults (around 100 per second) when I’m doing lots of inserts, and nothing shows up on the console. Where should I investigate?
Connection pool with Java/Tomcat: I’m using a simple Tomcat webapp connecting to MongoDB. Would you recommend opening a new connection to MongoDB for each request (I guess not), keeping a reference to the Mongo object as a singleton (with a Holder, for example), or using a good pool? I didn’t find a standard one.

Thank you very much !

1 Answer

Editorial Team · Answer 1 · 2026-06-07T22:11:20+00:00

To address your questions:

1) When you initialize a GridFS collection using the Java driver, that driver will automatically create indexes on the .files and the .chunks collections.

2) MongoDB requires that you have an ‘_id’ field and a unique ‘_id’ index, so you cannot remove it. The default ‘_id’ is an ObjectId that is only 12 bytes long, so there’s really no significant overhead from having it present.

Reference: http://www.mongodb.org/display/DOCS/Object+IDs

3) The stats on the “filename_1_uploadDate_1” index only indicate the size of the index. This index contains only the contents of the filename and the upload date fields; it does not contain any of the photo data itself. You want to have the active portion of the index fit in RAM for performance reasons.
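To put the index figures in perspective, here is a rough back-of-the-envelope calculation using only the numbers quoted in the question. The class and method names are made up for illustration, and the per-key figures are simple averages (index size divided by key count), not something MongoDB reports directly.

```java
// Rough arithmetic on the stats from the question; names are illustrative only.
final class IndexSizeEstimate {
    // Figures copied from the stats output in the question.
    static final long DOC_COUNT = 758_086L;
    static final long ID_INDEX_BYTES = 22_925_504L;
    static final long FILENAME_INDEX_BYTES = 102_159_120L;
    static final long TOTAL_INDEX_BYTES = 125_084_624L;

    // Average number of index bytes consumed per indexed key.
    static double bytesPerKey(long indexBytes, long keys) {
        return (double) indexBytes / keys;
    }

    // Converts a byte count to mebibytes (1 MiB = 1024 * 1024 bytes).
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        System.out.printf("_id_ index: %.1f bytes per key%n",
                bytesPerKey(ID_INDEX_BYTES, DOC_COUNT));
        System.out.printf("filename_1_uploadDate_1 index: %.1f bytes per key%n",
                bytesPerKey(FILENAME_INDEX_BYTES, DOC_COUNT));
        System.out.printf("all indexes: %.1f MiB%n", toMiB(TOTAL_INDEX_BYTES));
    }
}
```

The total comes out at roughly 119 MiB for about 758,000 files, which is consistent with an index that holds only filenames and dates rather than photo data.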

References:

4) If you want to have advanced statistics and monitoring, enroll your system in the free MMS monitoring system provided by 10gen. For more information, start here: https://mms.10gen.com/help/

5) Page faults are normal when loading in new data. MongoDB uses memory-mapped files, so every time you write to a new location within the data file, the OS will need to fault in that page.
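As an illustration of the general mechanism only (this is not MongoDB code), the sketch below maps a temporary file into memory with the standard Java NIO API and writes one byte into each page-sized region. The first write to each region is the kind of access that may make the OS fault the page in. The class name and the 4096-byte page size are assumptions; the real page size depends on the OS.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: shows writes through a memory-mapped file, not MongoDB internals.
final class MappedWriteDemo {
    static final int PAGE_SIZE = 4096; // assumed; the actual page size is OS-dependent

    // Writes one byte at the start of each page-sized region, then reads the
    // bytes back and returns how many regions hold the written value.
    static int writeAndVerify(int pages) {
        try {
            Path file = Files.createTempFile("mapped-demo", ".dat");
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(
                    file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Mapping READ_WRITE past the current end of the file grows the file.
                MappedByteBuffer buffer = channel.map(
                        FileChannel.MapMode.READ_WRITE, 0, (long) pages * PAGE_SIZE);
                for (int page = 0; page < pages; page++) {
                    buffer.put(page * PAGE_SIZE, (byte) 1); // first touch of this region
                }
                int verified = 0;
                for (int page = 0; page < pages; page++) {
                    if (buffer.get(page * PAGE_SIZE) == 1) {
                        verified++;
                    }
                }
                return verified;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("regions written and read back: " + writeAndVerify(16));
    }
}
```

A bulk insert touches many new regions of the data file in a short time, which is why the fault counter climbs during loads.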

For more information about memory mapped files, look here: http://docs.mongodb.org/manual/faq/storage/

6) The MongoDB Java driver provides its own connection pool. Unless you’re doing a really high-performance application, you’re probably best off using the Mongo object as a singleton.
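A minimal sketch of the singleton-with-Holder approach mentioned in the question follows. To keep it self-contained, `PooledClient` is a hypothetical stand-in for the driver's Mongo object; in a real webapp you would construct the driver's client there instead. The holder idiom relies on the JVM's class-initialization guarantees, so the instance is created lazily and exactly once, even when many request threads ask for it at the same time.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Demonstrates that concurrent callers all receive the same shared instance.
final class SingletonDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(ClientHolder::get); // simulate request threads
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println("clients created: " + PooledClient.CREATED.get());
    }
}

// Hypothetical stand-in for the driver's Mongo object (which pools connections itself).
final class PooledClient {
    static final AtomicInteger CREATED = new AtomicInteger();

    PooledClient() {
        CREATED.incrementAndGet(); // count constructions to show there is only one
    }
}

// Initialization-on-demand holder: Lazy is loaded on the first call to get().
final class ClientHolder {
    private ClientHolder() {
    }

    private static final class Lazy {
        static final PooledClient INSTANCE = new PooledClient();
    }

    static PooledClient get() {
        return Lazy.INSTANCE;
    }
}
```

Every servlet or request handler would then call `ClientHolder.get()` instead of opening its own connection, and the driver's internal pool handles the rest.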
