I am seeing some strange behaviour while trying to fix some objects in my MongoDB. I am trying to change the language code (lc) from may to msa, and I have a unique index on text and language code, i.e. {t:1, lc:1}.
First I get the count:
db.Unit.count({lc: "may"});
Then I try:
db.Unit.find({lc: "may"}, {"t": 1}).limit(1000).forEach(function(obj) {
    try {
        db.Unit.update({ _id: obj._id }, { $set: { "lc": "msa" } });
        print('Changed :' + obj.t + '#' + obj._id);
    } catch (err) {
        print(err);
    }
});
This seems to work and prints out lots of objects, then fails with:
E11000 duplicate key error index: jerome5.Unit.$t_1_lc_1 dup key: { : "laluan", : "msa" }
Now I expected the matches before the fail would have been correctly updated, but the count returns exactly the same number.
Have I missed something obvious with my Javascript?
Update: It looks like some of the objects that print out without throwing an exception are also duplicates. So it looks like there is some delay before an error is thrown (I have journaling enabled). Is this normal behaviour?
The short answer is that the issue is with the JS code.
Updates in Mongo are fire and forget by default, so even if an individual update fails because of a duplicate key, the “try” statement will still complete successfully, and the code in the “catch” section will never be executed. It may appear that the “catch” code is being executed because, when the forEach loop ends, the JS shell prints the result of db.getLastError(), which is null if the last operation succeeded and the error message otherwise. getLastError is explained in the documentation here:
http://www.mongodb.org/display/DOCS/getLastError+Command
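To make the point concrete, here is a minimal sketch of how a silent failure could be turned into a real exception, assuming the legacy shell behaviour described above (update() does not throw, and db.getLastError() returns null or an error string). The helper name updateChecked is hypothetical and is not part of the shell:

```javascript
// Hypothetical helper: run one update, then ask the server whether it failed.
// Assumes `db` is the shell's database handle and that db.getLastError()
// returns null on success or the error message (e.g. the E11000 text).
function updateChecked(db, collectionName, query, change) {
  db[collectionName].update(query, change);
  var err = db.getLastError();
  if (err) {
    // Raise the silent failure so that a surrounding try/catch can see it.
    throw new Error(err);
  }
}
```

With a helper like this, calling updateChecked(db, "Unit", { _id: obj._id }, { $set: { lc: "msa" } }) inside the original try block would let the catch section run for each duplicate.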
This is perhaps best explained via example:
Let's create a simple collection and a unique index:
We are going to run a script to change all of the “may” values to “msa”. Before we do, let's make some changes, so that changing some values of “may” to “msa” will create duplicate values in the index:
Now when our script hits documents _id:4 and _id:5, it will not be able to change the value of “lc” to “msa” because doing so would create duplicate entries in the index.
Let's run a version of your script. I have added some extra lines to make it more verbose:
As you can see, “boo” was never printed, because the “catch” code was never executed, even though two records could not be updated. Technically, the update() did not fail, it simply was unable to change the document because of the duplicate index entry and generated a message to that effect.
All of the records that could be changed have been successfully changed.
If the script is run again, the following output is generated:
As you can see, the last error message was printed twice: once when we printed it in our script, and again when the script finished.
Forgive the verbose nature of this response. I hope that this has improved your understanding of getLastError and how operations are executed in the JS shell.
The script can be rewritten without the try/catch statement, so that it simply prints out the _ids of any documents that could not be updated:
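A minimal sketch of such a script, again assuming the legacy shell behaviour described above. The function name relabelLanguage is hypothetical; the collection and field names follow the question:

```javascript
// Hypothetical rewrite: no try/catch, just check getLastError() after each
// update and remember the _ids of the documents that could not be changed.
// Assumes `db` is the shell's database handle with a Unit collection.
function relabelLanguage(db, from, to) {
  var failedIds = [];
  db.Unit.find({ lc: from }, { t: 1 }).forEach(function (obj) {
    db.Unit.update({ _id: obj._id }, { $set: { lc: to } });
    // null means the update went through; anything else is an error message.
    if (db.getLastError() !== null) {
      failedIds.push(obj._id);
    }
  });
  return failedIds;
}
```

In the shell this could be used as relabelLanguage(db, "may", "msa").forEach(function (id) { print("Could not update: " + id); }); returning the list rather than printing inside the loop keeps the check separate from the reporting.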