I have just written what has to be considered utterly hideous code to count the rows that contain data in the worksheets called “Data” from all the spreadsheets in a given directory. Here’s the code:
    private const string _ExcelLogDirectoryPath = @"..\..\..\..\Model\ExcelLogs\";

    static void Main()
    {
        var excelLogPaths = Directory.GetFiles(_ExcelLogDirectoryPath, "*.xl*");
        var excel = new Microsoft.Office.Interop.Excel.Application();
        var excelRowCounts = new Dictionary<string, int>();
        foreach (var filePath in excelLogPaths)
        {
            var spreadsheet = excel.Workbooks.Open(Path.GetDirectoryName(System.Windows.Forms.Application.ExecutablePath) + "/" + filePath);
            var worksheet = spreadsheet.Sheets["Data"] as Worksheet;
            if (worksheet != null)
            {
                // var rowCount = UsedRange.Rows.Count - 1; DOES NOT WORK, THE number is bigger than the 'real' answer
                var rowCount = 0;
                for (var i = 1 ; i < 1000000000; i++)
                {
                    var cell = worksheet.Cells[i, 1].Value2; // "Value2", great name for a property, thanks guys
                    if (cell != null && cell.ToString() != "") // Very fragile (e.g. skipped rows will break this)
                    {
                        rowCount++;
                    }
                    else
                    {
                        break;
                    }
                }
                var name = spreadsheet.Name.Substring(spreadsheet.Name.IndexOf('p'), spreadsheet.Name.IndexOf('.') - spreadsheet.Name.IndexOf('p'));
                excelRowCounts.Add(name, rowCount - 1);
            }
        }
    }
I cannot believe this is the right way to do this. It is crazy slow and includes calls to properties with names like Value2 that do not feel like an intended part of a public API. But the method suggested elsewhere (UsedRange.Rows.Count) dramatically over-reports the number of rows with data in them.
What is the correct way to count the rows with data in them from an Excel worksheet?
========== EDIT 1 ==========
The reason that both UsedRange.Rows.Count and Sid’s ACE.OLEDB solution over-report the number of rows appears to be a pink background colour that is applied to some of the columns (but only extending to row 7091). Is there a simple/elegant way to count the rows with data in them (i.e. with non-null cell values) regardless of the display colour?
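One possible sketch of a formatting-independent count, assuming the column of interest is read into memory in a single call and counted there (Value2 on a multi-cell range returns a two-dimensional object array, so only cell values are seen, never fill colours). The helper name CountLeadingRowsWithData and its columnOffset parameter are hypothetical; the logic mirrors the loop in the question and stops at the first empty cell.

```csharp
using System;

static class RowCounting
{
    // Hypothetical helper: counts rows from the top of one column until the
    // first null or empty cell, like the loop in the question.
    // It uses the array's own bounds because arrays returned by Excel interop
    // are 1-based, while arrays created in C# are 0-based.
    public static int CountLeadingRowsWithData(object[,] values, int columnOffset = 0)
    {
        var column = values.GetLowerBound(1) + columnOffset;
        var count = 0;
        for (var row = values.GetLowerBound(0); row <= values.GetUpperBound(0); row++)
        {
            var cell = values[row, column];
            if (cell == null || cell.ToString() == "")
            {
                break; // same fragility as the original: a skipped row ends the count
            }
            count++;
        }
        return count;
    }
}
```

With interop this might be called as RowCounting.CountLeadingRowsWithData((object[,])worksheet.UsedRange.Columns[1].Value2), on the assumption that the used range starts in column A and spans more than one cell (for a single cell, Value2 returns a scalar rather than an array). This replaces one interop call per cell with one call per sheet, which is where most of the time in the original loop is likely to go.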
========== EDIT 2 ==========
Sid’s ACE.OLEDB solution works, with the addition he suggests so that the SQL line reads

    var sql = "SELECT COUNT (*) FROM [" + sheetName + "$] WHERE NOT F1 IS NULL";

I’ll mark that as the answer.
This should do the trick. You can call it with each filename to retrieve the number of rows.
There might be an even faster way of retrieving just the row count, but I know this works.
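A minimal sketch of the ACE.OLEDB approach discussed in EDIT 2, not necessarily the exact code from this answer. It assumes the Microsoft ACE OLE DB 12.0 provider is installed, that System.Data.OleDb is available (built in on .NET Framework), and that HDR=NO is set so the first column is exposed as F1. The class and method names (ExcelRowCounter, CountDataRows) and the choice of "Excel 8.0" versus "Excel 12.0 Xml" by file extension are illustrative assumptions.

```csharp
using System;
using System.Data.OleDb;
using System.IO;

static class ExcelRowCounter
{
    // Hypothetical helper: returns the number of rows in the given sheet
    // whose first column (F1) is not null.
    public static int CountDataRows(string filePath, string sheetName)
    {
        // Assumption: legacy .xls files use "Excel 8.0", newer files "Excel 12.0 Xml".
        var isLegacy = Path.GetExtension(filePath)
            .Equals(".xls", StringComparison.OrdinalIgnoreCase);
        var format = isLegacy ? "Excel 8.0" : "Excel 12.0 Xml";

        // HDR=NO treats every row as data and names the columns F1, F2, ...
        var connectionString =
            "Provider=Microsoft.ACE.OLEDB.12.0;" +
            "Data Source=" + filePath + ";" +
            "Extended Properties=\"" + format + ";HDR=NO\";";

        // sheetName is assumed to come from trusted code, not user input.
        var sql = "SELECT COUNT (*) FROM [" + sheetName + "$] WHERE NOT F1 IS NULL";

        using (var connection = new OleDbConnection(connectionString))
        using (var command = new OleDbCommand(sql, connection))
        {
            connection.Open();
            return Convert.ToInt32(command.ExecuteScalar());
        }
    }
}
```

It could be called once per file, for example ExcelRowCounter.CountDataRows(filePath, "Data"). Because HDR=NO counts a header row as data, subtract 1 if the sheet has one, as the code in the question does.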