The Archive Base

Editorial Team
Asked: May 30, 2026

I created a collection in MongoDB consisting of 11446615 documents.

Each document has the following form:

{ 
 "_id" : ObjectId("4e03dec7c3c365f574820835"), 
 "httpReferer" : "http://www.somewebsite.pl/art.php?id=13321&b=1", 
 "words" : ["SEX", "DRUGS", "ROCKNROLL", "WHATEVER"],     
 "howMany" : 3 
}

httpReferer: just a URL

words: words parsed from the url above. Size of the list is between 15 and 90.

I am planning to use this database to obtain a list of web pages that have similar content.

I'll be querying this collection using the words field, so I created (or rather started creating) an index on this field:

db.my_coll.ensureIndex({words: 1})
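
For a rough sense of why this index is expensive to build: an index on an array field is a multikey index, which holds a separate entry for each distinct element of each document's array. The sketch below turns the document count and list sizes given above into an estimate. It assumes the words within one document are mostly distinct, which has not been measured here, and the function name `estimateIndexEntries` is made up for this illustration.

```javascript
// Rough estimate of how many entries the multikey index on `words` holds.
// Assumption: words inside one document are mostly distinct.
function estimateIndexEntries(docCount, minWords, maxWords) {
  return {
    min: docCount * minWords, // every document has the shortest list
    max: docCount * maxWords, // every document has the longest list
  };
}

const estimate = estimateIndexEntries(11446615, 15, 90);
console.log(estimate.min.toLocaleString("en-US")); // 171,699,225
console.log(estimate.max.toLocaleString("en-US")); // 1,030,195,350
```

Under that assumption the index holds somewhere between about 172 million and about 1.03 billion entries, far more than the number of documents.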

Creating this collection takes a very long time. I tried two approaches (the tests below were done on my laptop):

  1. Inserting, then indexing: inserting took 5.5 hours, mainly due to CPU-intensive preprocessing of the data. Indexing took 30 hours.
  2. Indexing before inserting: it would take a few days to insert all the data into the collection.

My main focus is to decrease the time needed to generate the collection. I don't need replication (at least for now). Querying also doesn't have to be lightning-fast.

Now, time for a question:

I have only one machine with one disk where I can run my app. Does it make sense to run more than one instance of the database and split my data between them?

1 Answer

  1. Editorial Team
    Added an answer on May 30, 2026 at 5:26 am

    Yes, it does make sense to shard on a single server.

    1. At this time, MongoDB still uses a global lock per MongoDB server.
      Running multiple servers frees each one from the others' locks.

    2. If you run a multi-core machine with separate NUMA nodes, this can
      also increase performance.

    3. If your load eventually outgrows this server, having sharded from the start makes horizontal scaling easier later. You might as well do it now.

    Machines vary. I suggest writing your own bulk insertion benchmark program and spinning up varying numbers of MongoDB shard servers. I have a 16-core RAIDed machine, and I've found that 3-4 shards seem to be ideal for my write-heavy database. I'm finding that my two NUMA nodes are my bottleneck.
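
A minimal sketch of what such a benchmark could look like, written for Node.js. Everything named here is an assumption made for illustration: `insertBatch` is a hypothetical hook where a real driver call would go, `stubInsertBatch` only simulates latency so the sketch runs without a database, and the hash-based `shardFor` stands in for whatever routing you actually use (a real sharded cluster routes by shard key on its own).

```javascript
// Sketch of a bulk-insertion benchmark across a varying number of shards.

// Pick a shard for a document with an FNV-1a-style hash over the URL's
// UTF-16 code units. Deterministic, so a document always maps to one shard.
function shardFor(doc, shardCount) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < doc.httpReferer.length; i++) {
    hash ^= doc.httpReferer.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return hash % shardCount;
}

// Split the documents into one batch per shard.
function partition(docs, shardCount) {
  const batches = Array.from({ length: shardCount }, () => []);
  for (const doc of docs) {
    batches[shardFor(doc, shardCount)].push(doc);
  }
  return batches;
}

// Time one run: every shard receives its batch concurrently.
async function benchmark(docs, shardCount, insertBatch) {
  const batches = partition(docs, shardCount);
  const start = process.hrtime.bigint();
  await Promise.all(batches.map((batch, shard) => insertBatch(shard, batch)));
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { shardCount, docs: docs.length, elapsedMs };
}

// Hypothetical stand-in for a real insert; it only waits in proportion
// to the batch size. Replace it with a call to your MongoDB driver.
function stubInsertBatch(shard, batch) {
  return new Promise((resolve) => setTimeout(resolve, batch.length / 100));
}

// Synthetic documents in the shape shown in the question (placeholder values).
function makeDocs(count) {
  return Array.from({ length: count }, (_, i) => ({
    httpReferer: `http://www.somewebsite.pl/art.php?id=${i}&b=1`,
    words: ["WORD" + (i % 50), "WHATEVER"],
    howMany: 1,
  }));
}

async function main() {
  const docs = makeDocs(10000);
  for (const shardCount of [1, 2, 3, 4]) {
    const result = await benchmark(docs, shardCount, stubInsertBatch);
    console.log(`${result.shardCount} shard(s): ${result.elapsedMs.toFixed(1)} ms`);
  }
}

main();
```

The timings printed by the stub say nothing about MongoDB itself. The numbers only become meaningful once `stubInsertBatch` is replaced with real inserts against the instances you start on your own machine.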
