This is my problem:
I’m reading data from an Excel file in a .NET MVC app. I read all the data from the Excel file and then loop over each record, inserting the data it contains into my business model.
Everything works perfectly, except that one field sometimes returns an empty string when retrieved from the Excel file. Curiously, this field can contain either a simple string or a string that will be treated as an array (it can include ‘|’ characters to separate the array elements). In some Excel files the field comes back empty when the ‘|’ character is present, and in others when it isn’t, and this behaviour is consistent throughout each file.
There are other fields that can contain the separator and always work fine. The only difference between them is that the working ones are pure strings, while the failing one is a string of numbers, possibly with ‘|’ separating them.
I’ve tried changing the separator character (I tried ‘#’, with the same results) and explicitly formatting the cells as text, without any success.
This is the method that extracts the data from the Excel file:
    private DataSet queryData(OleDbConnection objConn) {
        string strConString = "SELECT * FROM [Hoja1$] WHERE NUMACCION <> ''";
        OleDbCommand objCmdSelect = new OleDbCommand(strConString, objConn);
        OleDbDataAdapter objAdapter1 = new OleDbDataAdapter();
        objAdapter1.SelectCommand = objCmdSelect;
        DataSet objDataset = new DataSet();
        objAdapter1.Fill(objDataset, "ExcelData");
        return objDataset;
    }
I first check the fields from the Excel file with:
fieldsDictionary.Add("Hours", table.Columns["HOURS"].Ordinal);
And later, when looping through the DataSet, I extract the data with:
string hourString = row.ItemArray[fieldsDictionary["Hours"]].ToString();
This hourString is empty for some records. In some Excel files it’s empty when the record contains ‘|’; in others it’s empty when it doesn’t. I haven’t yet found a file where it comes back empty for records of both kinds.
I’m quite confused about this. I’m pretty sure it has to be related to the numeric nature of the field data, but I cannot understand why it isn’t solved when I force the cells in the Excel file to be “text”.
Any help will be more than welcome.
Ok. I finally solved this.
It seems the data provider that reads the Excel file cannot treat a whole column as a single data type when the column holds values of possibly different kinds. This happens even if you force the cell format to text in the workbook: when you query the data, the provider assigns the column a type based on the first records it reads, and values that don’t fit that type come back empty. That is why different files emptied different kinds of records: files starting with plain text emptied the numeric values, and vice versa.
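As far as I can tell, the “empty” values arrive in the DataSet as DBNull, and calling ToString() on DBNull gives an empty string. The sketch below only imitates that with an in-memory DataTable (no Excel involved); the table, the sample values and the class name are made up for illustration, and the assumption that the provider hands back DBNull for mismatched cells is mine.

```csharp
using System;
using System.Data;

class DbNullDemo
{
    static void Main()
    {
        // Hypothetical stand-in for the table filled from Excel:
        // a HOURS column that has been typed as numeric.
        var table = new DataTable("ExcelData");
        table.Columns.Add("HOURS", typeof(double));
        table.Rows.Add(8.0);
        // Assumption: a cell such as "4|4" that does not fit the numeric
        // column type would be stored as DBNull.
        table.Rows.Add(DBNull.Value);

        int hoursOrdinal = table.Columns["HOURS"].Ordinal;
        foreach (DataRow row in table.Rows)
        {
            // IsNull distinguishes a missing value from a real empty string.
            bool missing = row.IsNull(hoursOrdinal);
            // DBNull.ToString() returns String.Empty, hence the empty hourString.
            string hourString = row[hoursOrdinal].ToString();
            Console.WriteLine("missing={0}, value='{1}'", missing, hourString);
        }
    }
}
```

Checking row.IsNull(...) before calling ToString() at least makes it visible when a value was dropped instead of silently turning it into an empty string.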
I found a solution by simply changing the connection string to Excel.
This was my original connection string
And this is the one that fixes the problem
The parameter IMEX=1 tells the Excel driver to handle all mixed-data columns as plain text. This won’t work for you if you need to edit the Excel file, as this parameter also opens it in read-only mode. However, it was perfect for my situation.
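For anyone wondering where IMEX=1 goes, here is a minimal sketch of a connection string that carries it. This is not my exact string: the ACE provider, the HDR=YES setting and the file path are assumptions chosen for illustration, so adjust them to your own setup.

```csharp
using System;

class ConnectionStringSketch
{
    static void Main()
    {
        // Hypothetical path, for illustration only.
        string path = @"C:\data\example.xlsx";

        // IMEX=1 lives inside Extended Properties, next to the Excel version
        // and the header setting. Provider and HDR value are assumptions.
        string connectionString =
            "Provider=Microsoft.ACE.OLEDB.12.0;" +
            "Data Source=" + path + ";" +
            "Extended Properties=\"Excel 12.0 Xml;HDR=YES;IMEX=1\";";

        Console.WriteLine(connectionString);
    }
}
```

The resulting string would then be passed to new OleDbConnection(connectionString) before calling queryData.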