The Archive Base

Editorial Team
Asked: May 13, 2026

Background: I am parsing a 330 meg xml file into a DB (netflix catalog)

Background:

I am parsing a 330 meg XML file into a DB (Netflix catalog) using a PHP script from the console.

I can successfully add about 1,500 titles every 3 seconds until I add the logic to add actors, genres and formats. These are separate tables linked by an associative table.

Right now I have to run many, many queries for each title, in this order (I truncate all tables first, to eliminate old titles, genres, etc.):

  1. add new title to ‘titles’ and capture insert id
  2. check actor table for existing actor
  3. if present, get id; if not, insert actor and get insert id
  4. insert title id and actor id into associative table

(steps 2-4 are repeated for genres too)

This drops my speed down to about 10 per 3 seconds, which would take an eternity to add the ~250,00 titles.

So how would I combine the 4 queries into a single query, without adding duplicate actors or genres?

My goal is to just write all queries into a data file, and do a bulk insert.

I started by writing all associative queries into a data file, but it didn’t do much for performance.


I start by inserting the title and saving the ID:

function insertTitle($nfid, $title, $year){
    $query="INSERT INTO ".$this->titles_table." (nf_id, title, year ) VALUES ('$nfid','$title','$year')";
    mysql_query($query);
    $this->updatedTitleCount++;
    return mysql_insert_id();
}

That is then used in conjunction with each actor’s name to create the association:

function linkActor($value, $title_id){
    //check if we already know value
    $query="SELECT * FROM ".$this->persons_table." WHERE person = '$value' LIMIT 0,1";
    //echo "<br>".$query."<br>";
    $result=mysql_query($query);
    if($result && mysql_num_rows($result) != 0){
        while ($row = mysql_fetch_assoc($result)) {
            $value_id=$row['id'];
        }
    }else{
        //no value known, add to persons table
        $query="INSERT INTO ".$this->persons_table." (person) VALUES ('$value')";
        mysql_query($query);
        $value_id=mysql_insert_id();

    }   
    //echo "linking title:".$title_id." with rel:".$value_id;
    $query = " INSERT INTO ".$this->title_persons_table." (title_id,person_id) VALUE ('$title_id','$value_id');";
    //mysql_query($query);
    //write query to data file to be read in bulk style
    fwrite($this->fh, $query);
}
1 Answer

  1. Editorial Team
    Added an answer on May 13, 2026 at 6:45 am

    This is a perfect opportunity for using prepared statements.
    Also take a look at the tips at http://dev.mysql.com/doc/refman/5.0/en/insert-speed.html, e.g.

    To speed up INSERT operations that are performed with multiple statements for nontransactional tables, lock your tables
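As a rough illustration of the locking tip quoted above, the sketch below wraps a batch of inserts in LOCK TABLES / UNLOCK TABLES. The helper name withLockedTables() and the table names in the usage note are assumptions made for this example, not something from the question or the manual. The tip is aimed at nontransactional tables; for transactional tables, wrapping the batch in a transaction is probably the closer equivalent.

```php
<?php
// Hypothetical helper (not part of the original answer): run a batch of
// statements while holding MySQL write locks on the given tables.
// $pdo is assumed to be a PDO connection to MySQL; only exec() is used.
function withLockedTables($pdo, array $tables, callable $work)
{
    // Build "`a` WRITE, `b` WRITE", escaping backticks in the names.
    $parts = array();
    foreach ($tables as $table) {
        $parts[] = '`' . str_replace('`', '``', $table) . '` WRITE';
    }
    $pdo->exec('LOCK TABLES ' . implode(', ', $parts));
    try {
        // Every table touched inside $work must be in the lock list.
        return $work($pdo);
    } finally {
        // Release the locks even if one of the inserts throws.
        $pdo->exec('UNLOCK TABLES');
    }
}
```

A call might look like withLockedTables($pdo, array('titles', 'persons', 'title_persons'), function ($pdo) { /* inserts for a few hundred titles */ }); where the three table names are placeholders for whatever the real schema uses.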

    You can also decrease the number of queries. E.g. you can eliminate the SELECT...FROM persons_table to obtain the id by using INSERT...ON DUPLICATE KEY UPDATE and LAST_INSERT_ID(expr).

    ( sorry, running out of time for a lengthy description, but I wrote an example before noticing the time 😉 If this answer isn’t downvoted too much I can hand it in later. )

    class Foo {
      protected $persons_table='personsTemp';
      protected $pdo;
      protected $stmts = array();
    
      public function __construct($pdo) {
        $this->pdo = $pdo;
        $this->stmts['InsertPersons'] = $pdo->prepare('
          INSERT INTO
            '.$this->persons_table.'
            (person)
          VALUES
            (:person)
          ON DUPLICATE KEY UPDATE
            id=LAST_INSERT_ID(id)
        ');
      }
    
      public function getActorId($name) {
        $this->stmts['InsertPersons']->execute(array(':person'=>$name));
        return $this->pdo->lastInsertId('id');
      }
    }
    
    $pdo = new PDO("mysql:host=localhost;dbname=test", 'localonly', 'localonly'); 
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    
    // create a temporary/test table
    $pdo->exec('CREATE TEMPORARY TABLE personsTemp (id int auto_increment, person varchar(32), primary key(id), unique key idxPerson(person))');
    // and fill in some data
    foreach(range('A', 'D') as $p) {
      $pdo->exec("INSERT INTO personsTemp (person) VALUES ('Person $p')");
    }
    
    $foo = new Foo($pdo);
    foreach( array('Person A', 'Person C', 'Person Z', 'Person B', 'Person Y', 'Person A', 'Person Z', 'Person A') as $name) {
      echo $name, ' -> ', $foo->getActorId($name), "\n";
    }
    

    prints

    Person A -> 1
    Person C -> 3
    Person Z -> 5
    Person B -> 2
    Person Y -> 6
    Person A -> 1
    Person Z -> 5
    Person A -> 1
    

    (someone might want to start a discussion whether a getXYZ() function should perform an INSERT or not …but not me, not now….)
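To cut the number of queries for the associative table as well, one option is to collect all person ids for a title and send them in a single multi-row INSERT. The sketch below only builds the SQL and the parameter list. buildLinkInsert() is a hypothetical helper, and the table name title_persons in the usage note is an assumption, since the question only shows it as $this->title_persons_table. The column names title_id and person_id are taken from the question.

```php
<?php
// Hypothetical helper: build one multi-row INSERT that links a title to
// several persons. Returns array($sql, $params), or null if there is
// nothing to link.
function buildLinkInsert($table, $titleId, array $personIds)
{
    // The same actor may be listed twice for a title; link each id once.
    $personIds = array_values(array_unique($personIds));
    if (count($personIds) === 0) {
        return null;
    }
    $placeholders = array();
    $params = array();
    foreach ($personIds as $personId) {
        $placeholders[] = '(?, ?)';
        $params[] = $titleId;
        $params[] = $personId;
    }
    $sql = 'INSERT INTO `' . str_replace('`', '``', $table) . '`'
         . ' (title_id, person_id) VALUES ' . implode(', ', $placeholders);
    return array($sql, $params);
}
```

With the Foo class above, the ids could come from array_map(array($foo, 'getActorId'), $actorNames). If the helper returns a non-null pair, $pdo->prepare($sql)->execute($params) then sends all links for that title in one statement.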
