Is there any way I can optimize reading this file?
Btw, this is a Laravel application. data_fill() is a Laravel-specific helper; it's just my PHP array-building logic:
https://github.com/laravel/framework/blob/3414dcfcbe27cf0f4deee0670f022983e8016392/src/Illuminate/Support/helpers.php#L427
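For reference, this is roughly how I use it. data_fill() only writes a value when the key is missing, and the dot notation expands into nested arrays (simplified example, not taken from the report code; "MODEM" is just a made-up item name):

// data_fill() builds nested arrays from a dot-notation key, but never overwrites
$data = [];
data_fill($data, 'techs.9448.counts.7 Status.MODEM', 1);
// $data is now ['techs' => ['9448' => ['counts' => ['7 Status' => ['MODEM' => 1]]]]]
data_fill($data, 'techs.9448.counts.7 Status.MODEM', 5);
// Still 1 - the key already exists, so data_fill() leaves it alone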
I took the 150,000-row, 16-column file and chopped it down to a much smaller sample of 4,261 rows, and that takes about 10-15 seconds. The complete file takes minutes (I had to heavily modify nginx and PHP settings to allow for this).
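To give an idea of what I mean by modifying nginx and PHP, these are the kinds of limits involved (illustrative values only, not the exact numbers I used):

// Illustrative only - roughly the limits that had to be raised for the full file
ini_set('max_execution_time', '400'); // also set inside the controller method below
ini_set('memory_limit', '512M');      // assumed value; parsing a large XLSX needs headroom
// On the nginx side, fastcgi_read_timeout had to be raised as well so the
// long-running request is not cut off by the web server.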
Here is a blackfire.io report of the smaller file - https://blackfire.io/profiles/c4087f40-dd5c-42ed-9258-3c6d5a1ace51/graph
Looks like it is reading all 68,176 cells (4,261 rows * 16 columns) multiple times on the smaller file: 349,380 reads in total (the darker red boxes).
Spreadsheets attached.
spreadsheets.zip
Here is the code I am using to process the file(s). I have hard-coded some parameters for testing; normally these would be request variables to allow for different column selection based on the header row (see the sketch after the code for what that would look like).
public function report02()
{
    ini_set('max_execution_time', '400');

    // Time how long the script takes to run
    $executionStartTime = microtime(true);

    // Spout (Box\Spout) reader - ReaderFactory and Type are imported at the top of the controller
    $reader = ReaderFactory::create(Type::XLSX); // for XLSX files
    $reader->open(storage_path('smaller.xlsx'));

    $headers = [];
    $header_tech = strtoupper('INSTALLER NBR');
    $header_tech_index = 0;
    $header_equipment_type = strtoupper('ITEM');
    $header_equipment_type_index = 0;
    $header_equipment_sn = strtoupper('SERIAL NUMBER');
    $header_equipment_sn_index = 0;
    $header_equipment_status = strtoupper('EQUIP STS');
    $header_equipment_status_index = 0;
    $header_equipment_age = strtoupper('DAYS ASSIGNED');
    $header_equipment_age_index = 0;

    $show_old_equipment_limit_in_days = 21;

    // Convert comma-delimited tech numbers to an array and trim whitespace
    $techs = array_map('trim', explode(",", '9448,69091,69165,69327,69430,69445,69449,69711,70056'));

    $data = [
        'old_equipment' => [
            'total' => 0
        ]
    ];

    foreach ($reader->getSheetIterator() as $sheet) {
        foreach ($sheet->getRowIterator() as $row => $values) {
            // Check for last row - end of file
            $values2 = array_map('trim', $values);

            if (! empty($values2)) {
                // Header row position
                if ($row == 2) {
                    $headers = $values2;
                    $header_tech_index = array_search($header_tech, $headers);
                    $header_equipment_type_index = array_search($header_equipment_type, $headers);
                    $header_equipment_sn_index = array_search($header_equipment_sn, $headers);
                    $header_equipment_status_index = array_search($header_equipment_status, $headers);
                    $header_equipment_age_index = array_search($header_equipment_age, $headers);
                }

                // Data after header
                if ($row > 2) {
                    // This row contains data for a tech we requested
                    if (in_array($values2[$header_tech_index], $techs)) {
                        // 7 Status
                        if ($values2[$header_equipment_status_index] == 7) {
                            if (array_has($data, 'techs.' . $values2[$header_tech_index] . '.counts.7 Status.' . $values2[$header_equipment_type_index])) {
                                $data['techs'][$values2[$header_tech_index]]['counts']['7 Status'][$values2[$header_equipment_type_index]]++;
                            } else {
                                data_fill($data, 'techs.' . $values2[$header_tech_index] . '.counts.7 Status.' . $values2[$header_equipment_type_index], 1);
                            }

                            // All tech equipment
                            data_fill($data, 'techs.' . $values2[$header_tech_index] . '.equipment.7 Status.' . $values2[$header_equipment_sn_index], [
                                'type' => $values2[$header_equipment_type_index],
                                'age' => $values2[$header_equipment_age_index],
                            ]);
                        }

                        // T Status
                        if ($values2[$header_equipment_status_index] == 'T') {
                            if (array_has($data, 'techs.' . $values2[$header_tech_index] . '.counts.T Status.' . $values2[$header_equipment_type_index])) {
                                $data['techs'][$values2[$header_tech_index]]['counts']['T Status'][$values2[$header_equipment_type_index]]++;
                            } else {
                                data_fill($data, 'techs.' . $values2[$header_tech_index] . '.counts.T Status.' . $values2[$header_equipment_type_index], 1);
                            }

                            // All tech equipment
                            data_fill($data, 'techs.' . $values2[$header_tech_index] . '.equipment.T Status.' . $values2[$header_equipment_sn_index], [
                                'type' => $values2[$header_equipment_type_index],
                                'age' => $values2[$header_equipment_age_index],
                            ]);
                        }

                        // X days or older
                        if ($values2[$header_equipment_age_index] >= $show_old_equipment_limit_in_days && ($values2[$header_equipment_status_index] !== 'L')) {
                            data_fill($data, 'old_equipment.techs.' . $values2[$header_tech_index] . '.' . $values2[$header_equipment_sn_index], [
                                'type' => $values2[$header_equipment_type_index],
                                'status' => $values2[$header_equipment_status_index],
                                'age' => $values2[$header_equipment_age_index],
                            ]);

                            $data['old_equipment']['total']++;
                        }
                    }
                }
            }

            unset($values2);
        }
    }

    $reader->close();

    $executionEndTime = microtime(true);
    $data['runtime_in_seconds'] = round($executionEndTime - $executionStartTime, 2);

    return response()->json($data, 200);
}
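And this is roughly what I mean by the parameters being request variables. It is only a sketch: the field names and the Request parameter are made up for illustration and are not in the code above (it would need use Illuminate\Http\Request; and a report02(Request $request) signature):

// Sketch only - hypothetical request fields, not part of the working code above
$header_tech             = strtoupper($request->input('tech_column', 'INSTALLER NBR'));
$header_equipment_type   = strtoupper($request->input('type_column', 'ITEM'));
$header_equipment_sn     = strtoupper($request->input('sn_column', 'SERIAL NUMBER'));
$header_equipment_status = strtoupper($request->input('status_column', 'EQUIP STS'));
$header_equipment_age    = strtoupper($request->input('age_column', 'DAYS ASSIGNED'));
$show_old_equipment_limit_in_days = (int) $request->input('old_days', 21);

// Comma-delimited tech numbers from the request instead of the hard-coded string
$techs = array_map('trim', explode(',', $request->input('techs', '')));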
Screenshot of smaller file output

Thanks for any help I receive.