Generally when I need to insert/update multiple rows, I will generate (in PHP) a statement like the following:
INSERT INTO Products (ProductID, Name, Weight, Color) VALUES
(1001, "Product A", 14.5, "Red"),
(1002, "Product B", 12.3, "Green"),
(null, "Product C", 15.3, "Yellow"),
...
(null, "Product Z", 10.4, "Blue")
ON DUPLICATE KEY UPDATE
Name = VALUES(Name),
Weight = VALUES(Weight),
Color = VALUES(Color)
(The table Products only has a key on ProductID, but has many more columns than those being updated here.)
I use this kind of query because it can be generated to insert/update any number of Products by iterating through the set of products to be inserted/updated.
The problem, however, is when some of the values should not be updated.
If possible, I’d like to generate a query that only updates certain columns on a per-row basis:
INSERT INTO Products (ProductID, Name, Weight, Color) VALUES
(1001, "Product A", DO_NOT_UPDATE, "Red"),
(1002, "Product B", 12.3, DO_NOT_UPDATE),
(null, "Product C", 15.3, "Yellow"),
...
(null, "Product Z", 10.4, "Blue")
ON DUPLICATE KEY UPDATE
Name = VALUES(Name),
Weight = VALUES(Weight),
Color = VALUES(Color)
I can think of three options:
- Generate multiple queries that insert/update one row at a time.
- Generate multiple queries that update one (non-id) column at a time. First you would have to do your insert statement separately, use mysql_insert_id() to add IDs, then do one update query per column, including only the elements that need the particular column updated. Generally you would end up with as many queries as you have columns (unless you were fortunate enough that there was some overlap in the columns not to be updated).
- For values not needing an update, get the existing values first in a SELECT query, then simply generate the INSERT ... ON DUPLICATE KEY UPDATE query as written at the top of this post.
As far as I can tell from the MySQL ‘ON DUPLICATE KEY’ documentation, there is no way to ignore a value on a row-by-row basis. (Obviously, using NULL is just going to set that column to NULL for the row.) Is there anything I am missing?
Any thoughts as to which of these methods will be most efficient, particularly in the case where the number of rows is > 50 and the number of columns is 5-10? And what about the case where only 1-2 columns need updating per row, but which columns those are varies from row to row?
Thanks.
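If you don’t need NULL as a valid value for your fields, then you can use it as the "do not update" flag: write each assignment in the ON DUPLICATE KEY UPDATE clause so that a NULL input falls back to the column's current value.
So, if you pass a NULL, then UPDATE sets the value of the field to itself (thus effectively ignoring the operation). Otherwise, the field is updated to the new value.
If you do need NULL, pick another flag value.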
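
A minimal sketch of the flag idea, applied to the Products statement from the question. It assumes COALESCE() as the fallback function (IFNULL() should behave the same way here) and that rows 1001 and 1002 already exist. The -1 and '__KEEP__' sentinels in the second statement are hypothetical flag values chosen for illustration; MySQL does not define them, and they only work if they can never be real data.

```sql
-- Variant 1: NULL means "leave this column alone" on existing rows.
INSERT INTO Products (ProductID, Name, Weight, Color) VALUES
    (1001, 'Product A', NULL, 'Red'),     -- keep the existing Weight
    (1002, 'Product B', 12.3, NULL),      -- keep the existing Color
    (NULL, 'Product C', 15.3, 'Yellow')   -- new row, inserted as given
ON DUPLICATE KEY UPDATE
    -- COALESCE returns its first non-NULL argument, so a NULL input
    -- falls back to the column's current value.
    Name   = COALESCE(VALUES(Name),   Name),
    Weight = COALESCE(VALUES(Weight), Weight),
    Color  = COALESCE(VALUES(Color),  Color);

-- Variant 2: NULL is real data, so use other flag values instead
-- (assumed here: -1 for Weight, '__KEEP__' for Color).
INSERT INTO Products (ProductID, Name, Weight, Color) VALUES
    (1001, 'Product A', -1,   'Red'),       -- keep the existing Weight
    (1002, 'Product B', 12.3, '__KEEP__')   -- keep the existing Color
ON DUPLICATE KEY UPDATE
    Name   = VALUES(Name),
    Weight = IF(VALUES(Weight) = -1,        Weight, VALUES(Weight)),
    Color  = IF(VALUES(Color) = '__KEEP__', Color,  VALUES(Color));
```

Note that the flag only has meaning on the update path: if a flagged row turns out to be new, the flag value itself (NULL or the sentinel) is what gets inserted. The statement can still be generated in a loop, since the ON DUPLICATE KEY UPDATE clause is the same for every row.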