I want to execute a JavaScript file via mongos to insert data into my sharding set. In addition, I want to insert a dynamic variable and a NULL value.
I would log in (manually) to the shell with:
mongo hostip:port/admin my_script.js
My js looks like:
var amount = 1000000;
var x=1;
var doc = '';
for (i=0; i<amount; i++)
{
doc = { a: '1', b: '2', c: 'text' , d: 'x', e: 'NULL'}
db.mycol.insert(doc);
x=x + 1
}
(Rather than “x” I could just use “i”.)
Does “d” write the value of “x” or just the letter “x”?
Does “e” write the text “NULL” or, let’s say, a “database NULL”?
Is this procedure correct? (Concerning how I connect to mongos / the sharding set.)
best regards
EDIT:
And very important: how can I figure out how much time the MongoDB sharding set needs to store all the data, and to balance it?
Edit 2nd:
Hi Ross,
I have a sharding set that consists of two shards (two replicasets). At the moment I’m testing and therefore I use the loop-counter as the shard key.
Is there a way to check the time within the javascript?
Update:
So is the time needed to store the data equivalent to the time the JavaScript takes to execute? (Or the time the mongo shell is not accessible because the script is executing?)
Is that assumption acceptable for measuring the query response time?
(Where do I have to store the JavaScript file?)
You don't need to keep multiple counters, as you are already incrementing i on each iteration of the for loop. As you want the values and not strings, use i for the value of d and null instead of the string "NULL". In the cleaned-up loop the document becomes { a: '1', b: '2', c: 'text', d: i, e: null } and the separate x counter is dropped.
Regarding how long it takes to store and balance your data: that depends on a few factors.
Firstly, what is your shard key? Is it a random value or is it an increasing value (like a timestamp)? A random pattern for shard keys helps ensure an even distribution of writes, and if you know the ranges of the shard key, you could pre-split the chunks to try to ensure that the cluster stays balanced when loading data. If the shard key is increasing like a timestamp, then most likely one shard will become hot: writes will always land at the top end of the range, and that shard will have to split chunks and migrate the data to the other shards.
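As a rough illustration of pre-splitting when the key range is known in advance, the sketch below computes evenly spaced split points and, when run in the mongo shell against mongos, passes them to sh.splitAt(). The namespace "test.mycol", the shard key field "d", and the choice of four ranges are assumptions made for the example, and the collection is assumed to be already sharded on { d: 1 }. Splitting only creates chunk boundaries; distributing the chunks is still left to the balancer.

```javascript
// Return the interior boundaries that cut [min, max) into `chunks` equal ranges.
function splitPoints(min, max, chunks) {
  var points = [];
  var step = (max - min) / chunks;
  for (var k = 1; k < chunks; k++) {
    points.push(Math.floor(min + k * step));
  }
  return points;
}

// Only runs inside the mongo shell connected to mongos, where `sh` exists.
if (typeof sh !== 'undefined') {
  // Assumed names: database "test", collection "mycol", shard key field "d".
  splitPoints(0, 1000000, 4).forEach(function (p) {
    sh.splitAt('test.mycol', { d: p });
  });
}
```

Note that with a loop counter as the key, inserts still arrive in increasing order, so one range at a time receives all the writes even when the chunks are pre-split.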
At MongoDB UK there were a couple of good presentations about sharding: Overview of sharding and Sharding best practices.
Update:
Regarding how long it will take for the shards to become balanced: this depends on the load on your machines. Balancing is a lightweight process, so it should be considered a background operation. It's important to note that even with a sharded system, as soon as the data is written to the mongos it's accessible for querying. So if a shard becomes imbalanced during a data load, the data is still accessible. It may take time to rebalance the shard, depending on the load on the shard and on how much new data is being added, since chunks need to be split before they can be migrated.
Update 2:
The inserts to mongos are synchronous, so the time it takes to run the script is the time it took to apply the inserts. There are other options for the durability of writes using getLastError, which essentially control how long you block while the write is written. The shell calls getLastError() transparently, but the default in the driver for your language of choice may be asynchronous, that is, it may not wait for a server response.
Where to store the JavaScript file? Well, that's up to you: it's your application code. Most users will write an application in their preferred language and use the driver to call MongoDB.
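To tie the points above together, here is a minimal sketch of a script that uses the loop counter for d, a real null for e, and measures the elapsed time from inside the JavaScript with Date. The helper names buildDoc and timedInsert are made up for this example; the collection name mycol and the amount come from the question. It assumes the shell waits for each insert to be acknowledged, as described above.

```javascript
// Build one test document. Field names follow the question;
// d takes the loop counter's value and e is a real null, not the string 'NULL'.
function buildDoc(i) {
  return { a: '1', b: '2', c: 'text', d: i, e: null };
}

// Insert `amount` documents into `collection` and return the elapsed
// wall-clock time in milliseconds. `collection` only needs an insert() method.
function timedInsert(collection, amount) {
  var start = new Date();
  for (var i = 0; i < amount; i++) {
    collection.insert(buildDoc(i));
  }
  return new Date() - start;
}

// Only runs inside the mongo shell, where the global `db` exists.
if (typeof db !== 'undefined') {
  var elapsedMs = timedInsert(db.mycol, 1000000);
  print('Inserted 1000000 documents in ' + elapsedMs + ' ms');
}
```

It can be run the same way as in the question, for example mongo hostip:port/admin my_script.js. The measured time covers the inserts only; any chunk migrations by the balancer may continue in the background afterwards.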