I have this issue. I have about 10k XML files containing a bunch of performance data. I need to parse them and then import them into Excel so I can generate a graph from the data.
I am trying to decide what would be the best approach to solving this. I can’t do a direct import because Excel doesn’t recognize the files as a valid XML format. (Excel gives me a "schema not recognized" error or something similar.)
The file format goes something like this (I have only included the useful information):
The name of each file goes like this: YYDDMM.startOfPMPeriod_endOfPMPeriod
And in the file:
<time stamp>
<PM category1>
<PM category2>
<PM category3>
...
<sub system 1>
<result>1</result>
<result>2.0</result>
...
<sub system 2>
<result>0.221</result>
<result>2.0</result>
...
<sub system n>
<result>1</result>
<result>2.0</result>
And there are approx. 10k of these files. Each file runs to about 6k lines. 🙂
I am not sure how to approach this. I have the basic logic of it:
while (we have more files to read)
    read a file
    parse PM categories and timestamp
    while (not end of file)
        read in the results data and the subsystems
        // store it in an array of some sort, but I am not sure about the structure of it
// once we are done with our files
pass the array to Excel (somehow, maybe as a CSV?)
What do you guys think would be the best approach to solve this? My programming skills are limited. I am familiar with Java, C++ and bash scripting. Three-dimensional arrays are beyond me; I have enough trouble with two dimensions. 🙂 My most complicated assignment was to make a multi-threaded banking application in Java.
Davy
Update: it is for Excel 2003.
The Excel table should look like this (I can’t attach images, so you will have to make do with this):
                        timestamp 1   timestamp 2   timestamp 3
subsystem 1   pm cat 1
              pm cat 2
              pm cat 3
subsystem 2   pm cat 1
              pm cat 2
              pm cat 3
I recommend you start by inserting a single piece of data using VSTO. Once you can insert a single row, you can reuse what you learned to insert multiple rows.
Converting XML to an N x N array is an overly complicated way to approach XML parsing. XML parsing can be accomplished effectively with XPath or LINQ to XML. If you have no experience with LINQ, XPath is perhaps a better place to start.
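To make the path-based idea concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree, which supports a limited subset of XPath. The question only shows an abbreviated file layout, so the element names here (report, timestamp, category, subsystem, result) and the name attribute are assumptions made up for illustration, as is the guess that the i-th result in a subsystem belongs to the i-th PM category. Adjust the paths to match the real files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the real element names are not given in the question.
SAMPLE = """<report>
  <timestamp>period-1</timestamp>
  <category>pm cat 1</category>
  <category>pm cat 2</category>
  <subsystem name="subsystem 1">
    <result>1</result>
    <result>2.0</result>
  </subsystem>
  <subsystem name="subsystem 2">
    <result>0.221</result>
    <result>2.0</result>
  </subsystem>
</report>"""


def parse_report(xml_text):
    """Return flat (timestamp, subsystem, category, value) records."""
    root = ET.fromstring(xml_text)
    timestamp = root.findtext("timestamp")
    categories = [c.text for c in root.findall("category")]

    records = []
    for sub in root.findall("subsystem"):
        results = [r.text for r in sub.findall("result")]
        # Assumption: results are listed in the same order as the categories.
        for category, value in zip(categories, results):
            records.append((timestamp, sub.get("name"), category, float(value)))
    return records


if __name__ == "__main__":
    for record in parse_report(SAMPLE):
        print(record)
```

Each file becomes a flat list of records rather than a nested array, so no two- or three-dimensional structure is needed while parsing. The same approach should carry over to Java through the javax.xml.xpath package.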
First figure out how you want the rows to look in Excel, then extract the XML accordingly. This avoids N x N arrays and gives you the goal of producing a known output.
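As a rough sketch of working backwards from the desired output, the snippet below takes flat (timestamp, subsystem, category, value) records and writes them as CSV in the layout from the question: one row per subsystem and PM category, one column per timestamp. The function name pivot_to_csv and the record shape are hypothetical choices for illustration, not anything prescribed by Excel or VSTO.

```python
import csv
import io


def pivot_to_csv(records):
    """Turn (timestamp, subsystem, category, value) records into CSV text."""
    timestamps = sorted({ts for ts, _, _, _ in records})

    cells = {}      # (subsystem, category) -> {timestamp: value}
    row_keys = []   # keeps rows in first-seen order
    for ts, subsystem, category, value in records:
        key = (subsystem, category)
        if key not in cells:
            cells[key] = {}
            row_keys.append(key)
        cells[key][ts] = value

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["subsystem", "pm category"] + timestamps)
    for key in row_keys:
        # Leave a cell empty when a timestamp has no value for this row.
        writer.writerow(list(key) + [cells[key].get(ts, "") for ts in timestamps])
    return out.getvalue()


if __name__ == "__main__":
    demo = [
        ("t1", "subsystem 1", "pm cat 1", 1.0),
        ("t2", "subsystem 1", "pm cat 1", 2.0),
        ("t1", "subsystem 1", "pm cat 2", 3.0),
    ]
    print(pivot_to_csv(demo))
```

One thing to watch: an Excel 2003 worksheet holds at most 256 columns and 65,536 rows. If each of the 10k files contributes its own timestamp, the timestamps will not fit across the columns, so it may be necessary to swap the layout (timestamps down the rows) or split the data across sheets.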