I’m dealing with a database with about 30 tables, and 10 million unique entries.

Asked: May 30, 2026 (2026-05-30T09:17:30+00:00)

I’m dealing with a database with about 30 tables, and 10 million unique entries.

I am trying to use PHP to present that data in a certain format using the echo “function” and placing the variables using {$variable}.

Also, the data is hierarchical so I used a join command in order to include several columns and that resulting table was probably about 15 columns.

I ran the php file in Google Chrome, and it ran for about 1 hour on a pretty decent core2duo machine.

But the result set stopped at about 18 thousand entries – I had put no limit on the query by the way.

The most important part of my question is how do I run this file to get all the results? I don’t want to sit there and set the offset over and over, if there is another way, I would be very grateful.

Secondarily – and I know you probably need more information, just not sure what – can I make the process faster? I’m planning on re-running it on a better machine, but are there other ways?

Thanks

Update:

<?php
    include ('includes/functions.php');
    $connection=connectdb();

    $result=runquery('
    SELECT taxonomic_rank.rank as shortrank, scientific_name_element.name_element as shortname, sne.name_element as pname, tr.rank as prank
    FROM taxon_name_element
    LEFT JOIN scientific_name_element ON taxon_name_element.scientific_name_element_id = scientific_name_element.id
    LEFT JOIN taxon ON taxon_name_element.taxon_id = taxon.id
    LEFT JOIN taxonomic_rank ON taxonomic_rank.id = taxon.taxonomic_rank_id
    LEFT JOIN taxon_name_element AS tne ON taxon_name_element.parent_id = tne.taxon_id
    LEFT JOIN scientific_name_element AS sne ON sne.id = tne.scientific_name_element_id
    LEFT JOIN taxon AS tax ON tax.id = tne.taxon_id
    LEFT JOIN taxonomic_rank AS tr ON tr.id = tax.taxonomic_rank_id');
set_time_limit(0);
ini_set('max_execution_time',0);
    while($taxon_name_element = mysql_fetch_array($result)){
        if ($taxon_name_element['shortrank'] == 'species'){
            $subitem = $taxon_name_element['pname']."_".$taxon_name_element['shortname'];}

        else{$subitem = $taxon_name_element['shortrank']."_".$taxon_name_element['shortname'];}
        $parentitem = $taxon_name_element['prank']."_".$taxon_name_element['pname'];
        echo 
"\n<!-- http://invertnet.ill/med#{$subitem}\" -->\n
<owl:Class rdf:about=\"http://invertnet.ill/med#{$subitem}\">
    <rdfs:label xml:lang=\"en\">{$subitem}</rdfs:label>
    <rdfs:subClassOf rdf:resource=\"http://invertnet.ill/med#{$parentitem}\"/>
</owl:Class>\n\n";}
echo "<br>".count($taxon_name_element)." number of stuff";
?>

1 Answer

Editorial Team · Answer 1 · 2026-05-30T09:17:31+00:00

Judging from the lines quoted below, this doesn’t look like a slow-query problem.

“I ran the php file in Google Chrome, and it ran for about 1 hour on a pretty decent core2duo machine.
But the result set stopped at about 18 thousand entries – I had put no limit on the query by the way”

A browser isn’t the best place to dump 10 million records, not Chrome at least :-). My suggestion is to add pagination to your PHP file so that you don’t have to set the offset by hand every time. A simple previous/next link showing, say, 10000 records per page would do.
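As a rough illustration of that pagination idea, here is a minimal sketch. The helper names `pageWindow` and `pagerLinks` and the `page` query parameter are made up for this example and are not part of the original script; the 10000 rows per page simply follows the figure above.

```php
<?php
// Hypothetical helper: turn a 1-based page number into LIMIT/OFFSET values.
function pageWindow(int $page, int $perPage = 10000): array
{
    $page = max(1, $page); // clamp bad input such as ?page=0
    return ['limit' => $perPage, 'offset' => ($page - 1) * $perPage];
}

// Hypothetical helper: build the previous/next links for the current page.
function pagerLinks(int $page, int $totalRows, int $perPage = 10000): string
{
    $page = max(1, $page);
    $lastPage = max(1, (int) ceil($totalRows / $perPage));
    $links = [];
    if ($page > 1) {
        $links[] = '<a href="?page=' . ($page - 1) . '">previous</a>';
    }
    if ($page < $lastPage) {
        $links[] = '<a href="?page=' . ($page + 1) . '">next</a>';
    }
    return implode(' | ', $links);
}

// Usage sketch: read the page from the URL and build the suffix for the query.
$window = pageWindow((int) ($_GET['page'] ?? 1));
$sqlSuffix = sprintf(' LIMIT %d OFFSET %d', $window['limit'], $window['offset']);
```

The suffix would be appended to the existing SELECT. LIMIT/OFFSET only pages consistently when the query also has a stable ORDER BY, so one would probably need to be added.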

If it is not absolutely required to run in a browser, another way could be to write all output to a text file.
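A minimal sketch of the write-to-file route, assuming the script is run from the command line (for example `php export.php`) rather than through the browser. The function names `owlClass` and `writeOwlClasses`, the sample rows, and the output file name are invented for illustration; the row keys match the aliases in the question's query.

```php
<?php
// Build one OWL class block from a result row (keys as aliased in the question's query).
function owlClass(array $row): string
{
    // Species are named after their parent (genus) plus their own name, as in the question.
    $sub = $row['shortrank'] === 'species'
        ? $row['pname'] . '_' . $row['shortname']
        : $row['shortrank'] . '_' . $row['shortname'];
    $parent = $row['prank'] . '_' . $row['pname'];
    $base = 'http://invertnet.ill/med#';

    return "<owl:Class rdf:about=\"{$base}{$sub}\">\n"
        . "    <rdfs:label xml:lang=\"en\">{$sub}</rdfs:label>\n"
        . "    <rdfs:subClassOf rdf:resource=\"{$base}{$parent}\"/>\n"
        . "</owl:Class>\n\n";
}

// Write rows to an open file handle one at a time; returns the number of rows written.
function writeOwlClasses(iterable $rows, $handle): int
{
    $written = 0;
    foreach ($rows as $row) {
        fwrite($handle, owlClass($row));
        $written++;
    }
    return $written;
}

// Usage sketch with made-up rows; a real run would pass the query result instead.
$sampleRows = [
    ['shortrank' => 'genus', 'shortname' => 'Apis', 'prank' => 'family', 'pname' => 'Apidae'],
    ['shortrank' => 'species', 'shortname' => 'mellifera', 'prank' => 'genus', 'pname' => 'Apis'],
];
$out = fopen('php://memory', 'w+'); // swap for fopen('taxa.owl', 'w') to write to disk
$count = writeOwlClasses($sampleRows, $out);
```

Because each row is written as soon as it is read, nothing accumulates in the browser or in one large string. The PHP command-line binary also has no execution time limit by default, which fits a job of this size.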

Some notes on the query too: is there a specific reason for LEFT JOINing each table twice? It seems to have something to do with taxon_name_element.parent_id, but since I don’t know the requirements or the table schema, I can’t comment on it. If the query itself turns out to be slow, do consider optimizing it.

EDIT 1: I’ve worked on your query a little. Since you want both the name of the element and its parent’s name, I think it can be done with a simpler query that doesn’t JOIN the same tables twice. It will need some extra logic in the PHP code, though.

A few observations from the query:

the element’s name and its parent’s name both come from the same table, scientific_name_element
the “rank” column likewise comes from the same table, taxonomic_rank, for both the element and its parent
from the join taxon_name_element.parent_id = tne.taxon_id, I gather that the element and its parent are both rows of the same table, taxon_name_element

Now let us see the simpler query:

SELECT `tr`.`rank` AS `shortrank`, `sne`.`name_element` AS `shortname`, `tne`.`parent_id`, `tne`.`taxon_id`
FROM `taxon_name_element` `tne`
LEFT JOIN `scientific_name_element` `sne` ON `tne`.`scientific_name_element_id` = `sne`.`id`
LEFT JOIN `taxon` `tax` ON `tne`.`taxon_id` = `tax`.`id`
LEFT JOIN `taxonomic_rank` `tr` ON `tr`.`id` = `tax`.`taxonomic_rank_id`;

The result set will now contain both taxon_id and parent_id. The idea is to store every row in an array keyed by its own taxon_id, so that an element’s parent_id can be used directly as the key to look up its parent. Like:

$arrOutput = $arrParent = array();
while ($row = mysql_fetch_array($result)) {
    $arr = array(
        'shortrank' => $row['shortrank'],
        'shortname' => $row['shortname'],
        'taxonid' => $row['taxon_id'],
        'parentid' => $row['parent_id']
        );
    $arrOutput[] = $arr;
    // Index each row by its own taxon_id so its children can find it via parent_id.
    $arrParent[$row['taxon_id']] = $arr;
}
// $arrOutput now holds all the results and you can loop through it like you do in your original code.
// While looping, the parent row is fetched from $arrParent using the element's parent_id as the key.
foreach ($arrOutput as $arr) {
    $elementName = $arr['shortname'];
    $elementRank = $arr['shortrank'];
    // Top-level elements have no parent row, so skip the lookup for them.
    if (empty($arr['parentid']) || !isset($arrParent[$arr['parentid']])) {
        continue;
    }
    $parentName = $arrParent[$arr['parentid']]['shortname'];
    $parentRank = $arrParent[$arr['parentid']]['shortrank'];
}

Hope that makes sense! Of course, the above is only needed if the original query is expensive.

CAUTION: the above code isn’t tested and I only hope that it works. Minor changes or fixes might be needed 😉
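To make the parent lookup concrete, here is a small self-contained sketch that runs without a database. The three rows are made-up sample data shaped like the simpler query’s result set, and the variable names `$byTaxonId` and `$pairs` are only for illustration. It assumes each taxon_id appears once in taxon_name_element.

```php
<?php
// Made-up rows shaped like the simpler query's result set.
$rows = [
    ['shortrank' => 'family',  'shortname' => 'Apidae',    'taxon_id' => 1, 'parent_id' => null],
    ['shortrank' => 'genus',   'shortname' => 'Apis',      'taxon_id' => 2, 'parent_id' => 1],
    ['shortrank' => 'species', 'shortname' => 'mellifera', 'taxon_id' => 3, 'parent_id' => 2],
];

// Index every row by its own taxon_id so a child's parent_id can find it.
$byTaxonId = [];
foreach ($rows as $row) {
    $byTaxonId[$row['taxon_id']] = $row;
}

// Resolve each element's parent; rows without a known parent are skipped.
$pairs = [];
foreach ($rows as $row) {
    $parentId = $row['parent_id'];
    if ($parentId === null || !isset($byTaxonId[$parentId])) {
        continue;
    }
    $parent = $byTaxonId[$parentId];
    $pairs[] = $row['shortrank'] . '_' . $row['shortname']
        . ' -> ' . $parent['shortrank'] . '_' . $parent['shortname'];
}
```

With 10 million rows the lookup array will need a lot of memory, so it may be worth keeping only the two fields the output actually uses (rank and name) for each taxon_id.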
