I wrote a short Ruby script to profile MongoDB, just to see how its disk space increased as I added records. I wanted it to create 100,000,000 records, but inserts started silently failing a little after 7,000,000. Any ideas why? Here is the code:
#!/usr/bin/env ruby
require 'rubygems'
require 'mongo'
@conn = Mongo::Connection.new
@conn.drop_database('benchmark')
@db = @conn['benchmark']
@reqs = @db['requests']
last_count = 0
last_elapsed = 0
total_elapsed = 0
puts
puts "inserts\tsize\tt_elapsed\tt_per_insert"
print_at = [
  1,
  1000,
  # ...
  7_000_000,
  8_000_000,
  # ...
].inject({}) {|h,x| h[x] = 1; h}
1.upto 100_000_000 do |i|
  req = {'user_id' => i,
         'role_name' => 'user',
         'day' => [2011,5,30],
         'method' => 'get',
         'page' => 'http://www.example.com/users/5/edit',
         'referrer' => 'http://www.example.com/projects/57/notes'}
  t1 = Time.new
  @reqs.insert(req)
  t2 = Time.new
  total_elapsed += t2 - t1
  if print_at[i]
    elapsed_per = (total_elapsed - last_elapsed) / (i - last_count)
    puts "#{i}\t#{@reqs.stats['storageSize']}\t#{total_elapsed}\t#{elapsed_per}\t#{@reqs.count}"
    last_count = i
    last_elapsed = total_elapsed
  end
end
Here are the results:
inserts size t_elapsed t_per_insert
1 13568 0.000333 0.000333 1
1000 284928 0.440234999999999 0.000440342342342342 1000
5000 4626688 2.399554 0.000489829750000001 5000
10000 4626688 4.04515699999996 0.00032912059999999 10000
50000 18520320 18.3045380000001 0.000356484525000004 50000
100000 35192576 36.1132420000052 0.000356174080000102 100000
250000 79207168 89.8520730000556 0.000358258873333669 250000
500000 142587904 179.141312000645 0.000357156956002356 500000
750000 184073216 262.518961001337 0.00033351059600277 750000
1000000 233855488 347.697380001333 0.000340713675999983 1000000
2000000 554531072 722.684815985293 0.00037498743598396 2000000
3000000 827051520 1122.17787597268 0.000399493059987388 3000000
4000000 1005428224 1468.68356799303 0.000346505692020353 4000000
5000000 1219480064.0 1803.55257001283 0.000334869002019792 5000000
6000000 1476342016.0 2152.29274403266 0.000348740174019833 6000000
7000000 1784576256.0 2497.58802604997 0.000345295282017315 7000000
8000000 1784576256.0 2877.84758905944 0.000380259563009462 7692111
The final column is `@reqs.count`, which the header line does not label. You can see in that last line that after 8,000,000 inserts, the collection holds only 7,692,111 documents.
Here is a little environment info:
$ ruby --version
ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
$ uname -a
Linux shiny 2.6.31-19-generic #56-Ubuntu SMP Thu Jan 28 01:26:53 UTC 2010 i686 GNU/Linux
$ mongod --version
db version v1.8.1, pdfile version 4.5
Sun May 29 21:58:20 git version: a429cd4f535b2499cc4130b06ff7c26f41c00f04
Note that my disk still has 22G free after running this test, so I guess that’s not the problem. Here are the MongoDB files:
$ ls -lh /var/lib/mongodb
total 3.0G
-rw------- 1 mongodb nogroup 16M 2011-05-29 17:24 benchmark.0
-rw------- 1 mongodb nogroup 32M 2011-05-29 16:38 benchmark.1
-rw------- 1 mongodb nogroup 64M 2011-05-29 16:36 benchmark.2
-rw------- 1 mongodb nogroup 128M 2011-05-29 16:39 benchmark.3
-rw------- 1 mongodb nogroup 256M 2011-05-29 16:48 benchmark.4
-rw------- 1 mongodb nogroup 512M 2011-05-29 16:58 benchmark.5
-rw------- 1 mongodb nogroup 512M 2011-05-29 17:09 benchmark.6
-rw------- 1 mongodb nogroup 512M 2011-05-29 17:17 benchmark.7
-rw------- 1 mongodb nogroup 512M 2011-05-29 17:24 benchmark.8
-rw------- 1 mongodb nogroup 512M 2011-05-29 17:16 benchmark.9
-rw------- 1 mongodb nogroup 16M 2011-05-29 17:24 benchmark.ns
-rwxr-xr-x 1 mongodb nogroup 6 2011-05-28 15:46 mongod.lock
drwxr-xr-x 2 mongodb nogroup 4.0K 2011-05-29 16:34 _tmp
Regardless of the specific reason for the failed inserts, I’d especially like to know why no exception was raised. I understand that with replication a write can “succeed” before all nodes have saved the data, but that shouldn’t be the issue with just a vanilla instance running on my laptop, right?
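You are running a 32-bit build of MongoDB (your `uname` output shows an i686 system), and 32-bit builds can store only about 2.5 GB of data in total. That is the maximum size, and it is the limit you are hitting. Check this link for more info.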
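As for why no exception was raised: with `Mongo::Connection`, the Ruby driver does not wait for the server to acknowledge a write by default, so a failed insert goes unnoticed on the client side. Passing `:safe => true` to `insert` makes the driver ask the server for the outcome, and it should then raise `Mongo::OperationFailure` when the server reports an error. Below is a minimal sketch, assuming the 1.x driver from the question; `insert_checked` is a hypothetical helper name used only for illustration.

```ruby
# Sketch only: assumes the 1.x Ruby driver used in the question, whose
# Collection#insert accepts a :safe option.
#
# With :safe => true the driver checks the result of the write with the
# server and raises Mongo::OperationFailure if the server reports an error,
# instead of returning immediately without checking.
#
# insert_checked is a hypothetical helper, not part of the driver.
def insert_checked(collection, doc)
  collection.insert(doc, :safe => true)
end
```

In the benchmark loop this would replace `@reqs.insert(req)` with `insert_checked(@reqs, req)`. Expect the per-insert time to go up, since every insert now waits for a round trip to the server.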