I don’t understand it. The XLSX file is only about 3 MB, yet even 1024 MB of RAM is not enough for PHPExcel to load it into memory?
I might be doing something horribly wrong here:
    function ReadXlsxTableIntoArray($theFilePath)
    {
        require_once('PHPExcel/Classes/PHPExcel.php');
        $inputFileType = 'Excel2007';
        $objReader = PHPExcel_IOFactory::createReader($inputFileType);
        $objReader->setReadDataOnly(true);
        $objPHPExcel = $objReader->load($theFilePath);
        $rowIterator = $objPHPExcel->getActiveSheet()->getRowIterator();
        $arrayData = $arrayOriginalColumnNames = $arrayColumnNames = array();
        foreach ($rowIterator as $row) {
            $cellIterator = $row->getCellIterator();
            $cellIterator->setIterateOnlyExistingCells(false); // Loop all cells, even if it is not set
            if (1 == $row->getRowIndex()) {
                foreach ($cellIterator as $cell) {
                    $value = $cell->getCalculatedValue();
                    $arrayOriginalColumnNames[] = $value;
                    // let's remove the diacritique
                    $value = iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $value);
                    // and white spaces
                    $valueExploded = explode(' ', $value);
                    $value = '';
                    // capitalize the first letter of each word
                    foreach ($valueExploded as $word) {
                        $value .= ucfirst($word);
                    }
                    $arrayColumnNames[] = $value;
                }
                continue;
            } else {
                $rowIndex = $row->getRowIndex();
                reset($arrayColumnNames);
                foreach ($cellIterator as $cell) {
                    $arrayData[$rowIndex][current($arrayColumnNames)] = $cell->getCalculatedValue();
                    next($arrayColumnNames);
                }
            }
        }
        return array($arrayOriginalColumnNames, $arrayColumnNames, $arrayData);
    }
The function above reads data from an Excel table into an array.
Any suggestions?
At first, I allowed PHP to use 256MB of RAM. It was not enough. I then doubled the amount and then also tried 1024MB. It still runs out of memory with this error:
Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 50331648 bytes) in D:\data\o\WebLibThirdParty\src\PHPExcel\Classes\PHPExcel\Reader\Excel2007.php on line 688
Fatal error (shutdown): Allowed memory size of 1073741824 bytes exhausted (tried to allocate 50331648 bytes) in D:\data\o\WebLibThirdParty\src\PHPExcel\Classes\PHPExcel\Reader\Excel2007.php on line 688
Plenty has been written about the memory usage of PHPExcel on the PHPExcel forum, so reading through some of those previous discussions might give you a few ideas. PHPExcel holds an “in memory” representation of a spreadsheet, and is therefore subject to PHP's memory limits.
The physical size of the file is largely irrelevant… it’s much more important to know how many cells (rows*columns on each worksheet) it contains.
The “rule of thumb” that I’ve always used is an average of about 1 KB per cell, so a workbook with 5M cells is going to require about 5 GB of memory. However, there are a number of ways that you can reduce that requirement. These can be combined, depending on exactly what information you need to access within your workbook, and what you want to do with it.
If you have multiple worksheets, but don’t need to load all of them, then you can limit the worksheets that the Reader will load using the setLoadSheetsOnly() method.
To load a single named worksheet:
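A minimal sketch, assuming a hypothetical file `example.xlsx` that contains a worksheet named `Data Sheet #2` (both names are placeholders, not taken from the question):

```php
<?php
require_once 'PHPExcel/Classes/PHPExcel.php';

$inputFileType = 'Excel2007';
$inputFileName = 'example.xlsx';   // hypothetical path
$sheetName     = 'Data Sheet #2';  // hypothetical worksheet name

$objReader = PHPExcel_IOFactory::createReader($inputFileType);
// Only the named worksheet is parsed; all others are skipped by the reader.
$objReader->setLoadSheetsOnly($sheetName);
$objPHPExcel = $objReader->load($inputFileName);
```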
Or you can specify several worksheets with one call to setLoadSheetsOnly() by passing an array of names:
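Again as a sketch with placeholder file and sheet names:

```php
<?php
require_once 'PHPExcel/Classes/PHPExcel.php';

$inputFileType = 'Excel2007';
$inputFileName = 'example.xlsx';                            // hypothetical path
$sheetNames    = array('Data Sheet #1', 'Data Sheet #3');   // hypothetical names

$objReader = PHPExcel_IOFactory::createReader($inputFileType);
// Pass an array to load several worksheets and skip the rest.
$objReader->setLoadSheetsOnly($sheetNames);
$objPHPExcel = $objReader->load($inputFileName);
```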
If you only need to access part of a worksheet, then you can define a Read Filter to identify just which cells you actually want to load:
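For illustration, a read filter is a class implementing `PHPExcel_Reader_IReadFilter`; the reader calls `readCell()` for every cell and loads only those for which it returns true. The class name, cell range and file name below are assumptions chosen for the example:

```php
<?php
require_once 'PHPExcel/Classes/PHPExcel.php';

// Hypothetical filter: keep only rows 1 to 7 in columns A to E.
class MyReadFilter implements PHPExcel_Reader_IReadFilter
{
    public function readCell($column, $row, $worksheetName = '')
    {
        if ($row >= 1 && $row <= 7) {
            if (in_array($column, range('A', 'E'))) {
                return true;
            }
        }
        return false;
    }
}

$inputFileName = 'example.xlsx'; // hypothetical path

$objReader = PHPExcel_IOFactory::createReader('Excel2007');
$objReader->setReadFilter(new MyReadFilter());
$objPHPExcel = $objReader->load($inputFileName);
```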
Using read filters, you can also read a workbook in “chunks”, so that only a single chunk is memory-resident at any one time:
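A sketch of chunked reading, assuming a hypothetical file name, a chunk size of 1000 rows, and a row count that you already know (here hard-coded as `$totalRows`). Note that every chunk re-parses the file, so this trades speed for memory:

```php
<?php
require_once 'PHPExcel/Classes/PHPExcel.php';

// Hypothetical filter that lets through the heading row plus one block of rows.
class ChunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $_startRow = 0;
    private $_endRow   = 0;

    // Define which block of rows the next load() call should read.
    public function setRows($startRow, $chunkSize)
    {
        $this->_startRow = $startRow;
        $this->_endRow   = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '')
    {
        // Always read row 1 (headings), plus rows in [_startRow, _endRow).
        if ($row == 1 || ($row >= $this->_startRow && $row < $this->_endRow)) {
            return true;
        }
        return false;
    }
}

$inputFileName = 'example.xlsx'; // hypothetical path
$chunkSize     = 1000;           // assumption: tune to your memory limit
$totalRows     = 50000;          // assumption: known or estimated row count

$objReader   = PHPExcel_IOFactory::createReader('Excel2007');
$chunkFilter = new ChunkReadFilter();
$objReader->setReadFilter($chunkFilter);
$objReader->setReadDataOnly(true);

for ($startRow = 2; $startRow <= $totalRows; $startRow += $chunkSize) {
    $chunkFilter->setRows($startRow, $chunkSize);
    $objPHPExcel = $objReader->load($inputFileName);

    // ... process the rows of this chunk here ...

    // Release the workbook before loading the next chunk.
    $objPHPExcel->disconnectWorksheets();
    unset($objPHPExcel);
}
```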
If you don’t need to load formatting information, but only the worksheet data, then the setReadDataOnly() method will tell the reader only to load cell values, ignoring any cell formatting:
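The code in the question already does this; for completeness, a sketch with a placeholder file name. One side effect worth knowing: because number formats are not loaded, date cells come back as plain Excel serial numbers rather than being recognisable as dates.

```php
<?php
require_once 'PHPExcel/Classes/PHPExcel.php';

$inputFileName = 'example.xlsx'; // hypothetical path

$objReader = PHPExcel_IOFactory::createReader('Excel2007');
// Load cell values only; styles, number formats, etc. are ignored.
$objReader->setReadDataOnly(true);
$objPHPExcel = $objReader->load($inputFileName);
```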
Use cell caching. This is a method for reducing the PHP memory that is required for each cell, but at a cost in speed. It works by storing the cell objects in a compressed format, or outside of PHP’s memory (e.g. disk, APC, memcache), but the more memory you save, the slower your scripts will execute. You can, however, reduce the memory required by each cell to about 300 bytes, so the hypothetical 5M cells would require about 1.4 GB of PHP memory.
Cell caching is described in section 4.2.1 of the Developer Documentation.
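As a sketch, caching is configured through `PHPExcel_Settings` before the workbook is loaded. The choice of `cache_to_phpTemp` and the 8MB threshold are assumptions for this example; other cache methods take different settings:

```php
<?php
require_once 'PHPExcel/Classes/PHPExcel.php';

$inputFileName = 'example.xlsx'; // hypothetical path

// Keep cells in php://temp; data beyond memoryCacheSize spills to a temp file.
$cacheMethod   = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array('memoryCacheSize' => '8MB'); // assumption: example value

// Must be set before any workbook is created or loaded.
if (!PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings)) {
    die('The chosen cell caching method is not available' . PHP_EOL);
}

$objReader = PHPExcel_IOFactory::createReader('Excel2007');
$objReader->setReadDataOnly(true);
$objPHPExcel = $objReader->load($inputFileName);
```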
EDIT
Looking at your code, you’re using the iterators, which aren’t particularly efficient, and building up an array of cell data. You might want to look at the toArray() method, which is already built into PHPExcel, and does this for you. Also take a look at this recent discussion on SO about the new variant method rangeToArray() to build an associative array of row data.
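A rough sketch of that approach, assuming the first row holds the column headings and using a placeholder file name. It reads the heading row and the data rows with `rangeToArray()` and combines them into associative rows; heading clean-up like the question's `iconv`/`ucfirst` step is left out, and duplicate or empty headings would collide as array keys:

```php
<?php
require_once 'PHPExcel/Classes/PHPExcel.php';

$inputFileName = 'example.xlsx'; // hypothetical path

$objReader = PHPExcel_IOFactory::createReader('Excel2007');
$objReader->setReadDataOnly(true);
$objPHPExcel = $objReader->load($inputFileName);

$sheet         = $objPHPExcel->getActiveSheet();
$highestRow    = $sheet->getHighestRow();
$highestColumn = $sheet->getHighestColumn();

// Arguments: range, null value, calculate formulas, format data.
$headingRows = $sheet->rangeToArray('A1:' . $highestColumn . '1', null, true, false);
$headings    = $headingRows[0];

$arrayData = array();
if ($highestRow >= 2) {
    $rows = $sheet->rangeToArray('A2:' . $highestColumn . $highestRow, null, true, false);
    foreach ($rows as $rowValues) {
        // Both arrays span the same columns, so their lengths match.
        $arrayData[] = array_combine($headings, $rowValues);
    }
}
```

This still builds the whole result array in memory, so it simplifies the code rather than reducing the footprint of the loaded workbook.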