I have been researching, writing, and rewriting a program for this task for a week now, and I need some collaboration to bring up something I haven't thought of. Specifically, we receive an auto-generated XML file daily with ~70k records (~75 MB). I was asked to create a table on one of our SQL servers containing this information so that it can be queried. The program must also update existing records (if the data has changed) and insert new records daily. Records must never be deleted from the database.
Here is the list of methods I have attempted so far, and the reasons they did not work.
- SQLXMLBulkLoad: This worked very well for importing the data.
  However, the limitation of the Bulk Load class is that it cannot
  update existing records while inserting new ones. Time for a rewrite.
- SQL OPENROWSET (using SqlCommand, etc.): This does not work
  because the server, the program, and the XML file will be on three
  different computers. These machines CAN be configured to give each
  other access to the file (specifically the server), but this method
  was deemed "not realistic, too much overhead". Time for a rewrite.
- DataSet merge, then TableAdapter.Update: This method initially
  seemed like it would definitely work. The idea is simple: use the
  DataSet.ReadXml() method to put the XML data into a table in the
  DataSet, add the SQL table to the DataSet (using SqlCommand, etc.),
  merge the two tables, and then use a TableAdapter to update/insert
  into the existing SQL table. This method does not seem to work
  because the XML file has two nodes (columns) that contain dates,
  and there is no uniform date datatype between SQL and XML. I even
  tried converting all of the dates in the XML file to the SQL
  DateTime format. The conversion itself worked, but the update still
  threw a datatype mismatch exception at runtime.
At this point, I am out of ideas. This seems to be a task that has surely been done before. I am not necessarily looking for someone to write this code for me (I am fully capable of this), I just need some collaboration on the topic.
Thank You
We do something similar with database imports received in XML format: I pass the XML directly to a stored procedure and then shred it using XQuery or OPENXML.
Both of those technologies let you query XML in SQL as if it were a table in your database. Taking that approach, you can pass your XML to a script or stored procedure, query it in SQL, and insert the results wherever you need them. Anecdotally, OPENXML is better for processing large XML files, but you could try both and see how they work for you. Below is an example using OPENXML and a simple MERGE statement.
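A minimal sketch of that approach, under stated assumptions: the table `dbo.Records`, its columns (`RecordId`, `Name`, `CreatedDate`, `ModifiedDate`), the procedure name `dbo.ImportRecords`, and the `/Records/Record` element layout are all hypothetical stand-ins for your real schema. It also assumes the dates arrive in ISO 8601 form (`yyyy-mm-ddThh:mi:ss`), so they are read as text and converted explicitly rather than relying on implicit conversion.

```sql
-- Hypothetical schema: dbo.Records keyed on RecordId; adjust names and types.
CREATE PROCEDURE dbo.ImportRecords
    @xml XML
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @hdoc INT;

    -- Parse the document once; OPENXML reads from the returned handle.
    EXEC sp_xml_preparedocument @hdoc OUTPUT, @xml;

    -- Shred the XML into a temp table. Flag 2 = element-centric mapping,
    -- i.e. each column is read from a child element of <Record>.
    -- Dates are read as text and converted with style 126 (ISO 8601),
    -- which assumes the file uses that format.
    SELECT x.RecordId,
           x.Name,
           CONVERT(DATETIME, x.CreatedDate, 126)  AS CreatedDate,
           CONVERT(DATETIME, x.ModifiedDate, 126) AS ModifiedDate
    INTO #staging
    FROM OPENXML(@hdoc, '/Records/Record', 2)
         WITH (
             RecordId     INT,
             Name         NVARCHAR(100),
             CreatedDate  VARCHAR(30),
             ModifiedDate VARCHAR(30)
         ) AS x;

    -- Release the parsed document as soon as the data is staged.
    EXEC sp_xml_removedocument @hdoc;

    -- Update changed rows and insert new ones. There is no
    -- WHEN NOT MATCHED BY SOURCE clause, so existing rows are never deleted.
    -- Note: <> does not detect a change to or from NULL; add explicit
    -- IS NULL checks if these columns are nullable.
    MERGE dbo.Records AS tgt
    USING #staging AS src
        ON tgt.RecordId = src.RecordId
    WHEN MATCHED AND (   tgt.Name         <> src.Name
                      OR tgt.CreatedDate  <> src.CreatedDate
                      OR tgt.ModifiedDate <> src.ModifiedDate)
        THEN UPDATE SET Name         = src.Name,
                        CreatedDate  = src.CreatedDate,
                        ModifiedDate = src.ModifiedDate
    WHEN NOT MATCHED BY TARGET
        THEN INSERT (RecordId, Name, CreatedDate, ModifiedDate)
             VALUES (src.RecordId, src.Name, src.CreatedDate, src.ModifiedDate);
END;
```

From the client program, the whole file can be sent as a single parameter (for example `SqlDbType.Xml` on a `SqlCommand`), which avoids the server needing access to the file share. Staging into a temp table first keeps the document handle open only briefly and gives you a place to validate or clean the dates before the MERGE runs.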