I’ve read a bunch of stuff on the web about how to get at cell data using the OpenXML API. But there’s really not much out there that’s particularly straightforward. Most seems to be about writing to SpreadsheetML, not reading… but even that doesn’t help much.
I’ve got a spreadsheet that has a table in it. I know what the table name is, and I can find out what sheet it’s on, and what columns are in the table. But I can’t figure out how to get a collection of rows back that contain the data in the table.
I’ve got this to load the document and get a handle to the workbook:
SpreadsheetDocument document = SpreadsheetDocument.Open("file.xlsx", false);
WorkbookPart workbook = document.WorkbookPart;
I’ve got this to find the table/sheet:
Table table = null;
foreach (Sheet sheet in workbook.Workbook.GetFirstChild<Sheets>())
{
    WorksheetPart worksheetPart = (WorksheetPart)document.WorkbookPart.GetPartById(sheet.Id);
    foreach (TableDefinitionPart tableDefinitionPart in worksheetPart.TableDefinitionParts)
    {
        if (tableDefinitionPart.Table.DisplayName == this._tableName)
        {
            table = tableDefinitionPart.Table;
            break;
        }
    }
    // The break above only leaves the inner loop; stop scanning sheets once the table is found.
    if (table != null)
    {
        break;
    }
}
And I can iterate over the columns in the table by foreaching over table.TableColumns.
Reading an Excel 2007/2010 spreadsheet with the OpenXML API is really easy, in some ways even simpler than using OleDB, as we always did for a quick-and-dirty solution. It is simple but verbose, though, and I don’t think putting all the code here would be useful if it also had to be commented and explained, so I’ll just write a summary and link to a good article. Read this article on MSDN; it explains how to read XLSX documents in a very easy way.
Just to summarize, you’ll do this:
- Open the SpreadsheetDocument with SpreadsheetDocument.Open.
- Get the Sheet you need with a LINQ query from the WorkbookPart of the document.
- Get the WorksheetPart (the object you need) using the Id of the Sheet.
In code, stripping comments and error handling:
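A minimal sketch of those three steps, assuming the DocumentFormat.OpenXml package is referenced. The `SheetLookup` class, the `WithWorksheetPart` helper and the `fileName`/`sheetName` parameters are hypothetical names used only for illustration, not part of the SDK:

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class SheetLookup
{
    // Hypothetical helper: opens the file read-only, finds the sheet by name
    // and hands its WorksheetPart to the caller while the document is still open.
    public static void WithWorksheetPart(string fileName, string sheetName, Action<WorksheetPart> action)
    {
        // 1. Open the document (false = not editable).
        using (SpreadsheetDocument document = SpreadsheetDocument.Open(fileName, false))
        {
            // 2. Find the Sheet entry with a LINQ query on the workbook.
            Sheet sheet = document.WorkbookPart.Workbook
                .Descendants<Sheet>()
                .FirstOrDefault(s => s.Name != null && s.Name.Value == sheetName);

            if (sheet == null)
                throw new ArgumentException("Sheet not found: " + sheetName);

            // 3. Resolve the WorksheetPart through the sheet's relationship Id.
            WorksheetPart sheetPart =
                (WorksheetPart)document.WorkbookPart.GetPartById(sheet.Id.Value);

            action(sheetPart);
        }
    }
}
```

Passing a callback keeps every read inside the `using` block, so the parts are never touched after the document has been disposed.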
Now (but still inside the using block!) all you have to do is read a cell value:
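A sketch of reading a single cell, assuming you already hold the `WorksheetPart` and its `WorkbookPart` while the document is open. `CellReader`, `FindCell` and `GetText` are hypothetical helper names. Note that `CellValue.Text` is the raw stored value: for cells of type shared string it is an index into the workbook's shared string table, which this sketch resolves. Inline strings and number formatting are not handled here:

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class CellReader
{
    // Hypothetical helper: returns the cell at a reference such as "A1", or null.
    public static Cell FindCell(WorksheetPart sheetPart, string cellReference)
    {
        return sheetPart.Worksheet
            .Descendants<Cell>()
            .FirstOrDefault(c => c.CellReference != null
                              && c.CellReference.Value == cellReference);
    }

    // Hypothetical helper: returns the cell's text, resolving shared strings.
    // Returns null when the cell is missing or has no stored value.
    public static string GetText(WorkbookPart workbookPart, Cell cell)
    {
        if (cell == null || cell.CellValue == null)
            return null;

        string raw = cell.CellValue.Text;

        // Shared strings store an index into the shared string table.
        if (cell.DataType != null
            && cell.DataType.Value == CellValues.SharedString
            && workbookPart.SharedStringTablePart != null)
        {
            return workbookPart.SharedStringTablePart.SharedStringTable
                .Elements<SharedStringItem>()
                .ElementAt(int.Parse(raw))
                .InnerText;
        }

        return raw;
    }
}
```

For example, `CellReader.GetText(document.WorkbookPart, CellReader.FindCell(sheetPart, "A1"))` would give the text of A1, or null if the cell is empty.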
If you have to enumerate the rows (and there are a lot of them), you first have to obtain a reference to the SheetData object. Then you can ask for all the rows and cells:
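A sketch of that enumeration, assuming `sheetPart` is the `WorksheetPart` of an open document. `RowReader` and `ReadRawRows` are hypothetical names, and the values collected are the raw stored text of each cell (shared strings would still need to be resolved through the shared string table):

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class RowReader
{
    // Hypothetical helper: returns the raw stored text of every cell, row by row.
    public static List<List<string>> ReadRawRows(WorksheetPart sheetPart)
    {
        // SheetData is the element that holds all the rows of the worksheet.
        SheetData sheetData = sheetPart.Worksheet.Elements<SheetData>().First();

        var rows = new List<List<string>>();
        foreach (Row row in sheetData.Elements<Row>())
        {
            var values = new List<string>();
            foreach (Cell cell in row.Elements<Cell>())
            {
                // CellValue is null for cells without a stored value.
                values.Add(cell.CellValue == null ? null : cell.CellValue.Text);
            }
            rows.Add(values);
        }
        return rows;
    }
}
```

Empty cells are often not written to the file at all, so the position of a value in the inner list is not necessarily its column; use each cell's `CellReference` if you need to know which column it belongs to.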
To simply enumerate a normal spreadsheet, you can call Descendants<Row>() on the Worksheet of the WorksheetPart object.
If you need more resources about OpenXML, take a look at OpenXML Developer; it contains a lot of good tutorials.