I’m learning PHP and Zend Framework. The following PHP function is supposed to fill a temporary table using an “INSERT INTO … SELECT”-style query. However, when I SELECT * from the newly filled table, I see that most, but not all, of the new records have been duplicated once. I delete the contents of the table each time before I run this script. Does anyone know why there would be duplicates?
public function fillTableByOfficeName($officeName) {
    if ($officeName != '') {
        $officePhrase = "b.oof_name ='" . $officeName . "' AND ";
    } else {
        $officePhrase = '';
    }
    $whereAddenda = $officePhrase .
        "a.fil_bool_will_file_online = false AND " .
        "a.fil_bool_confirmed = false AND " .
        "a.fil_bool_duplicate = false AND " .
        "a.fil_bool_not_found = false AND " .
        "(a.fil_res_id_fk NOT IN (4,7,10) OR a.fil_res_id_fk IS NULL) AND " .
        "a.fil_will_recorder_rec_id IS NULL AND " .
        "d.tag_description NOT IN (
        'Already a trust client',
        'Not received from local office',
        'Southtrust client (already centralized)')";
    //"a.fil_date_of_transfer_to_will_recorder IS NULL";
    $sql = "INSERT INTO adds(fil_id,REC_ID,FIRST_NAME,LAST_NAME,MIDDLE_INITIAL,SSN," .
        "MAILING_ADDRESS_1,MAILING_ADDRESS_2,CITY,STATE,ZIP_CODE,PHONE_NUMBER,BIRTH_DATE," .
        "ORIGINATION_OFFICE,FILE_LOCATION,WILL_DATE,LAST_CODICIL_DATE,TRUST_DATE,REV_TRUST,POA_DATE) " .
        "SELECT a.fil_id_pk, " .
        "a.fil_will_recorder_rec_id, " .
        "a.fil_first_name, " .
        "a.fil_last_name, " .
        "a.fil_middle_name, " .
        "a.fil_ssn, " .
        "a.fil_mailing_address_1, " .
        "a.fil_mailing_address_2, " .
        "a.fil_city_address, " .
        "a.fil_state_address, " .
        "a.fil_zip_code_fk, " .
        "a.fil_phone_number, " .
        "a.fil_date_of_birth, " .
        "b.oof_name, " .
        "a.fil_box_id_fk, " .
        "a.fil_date_of_will, " .
        "a.fil_date_of_last_codicil, " .
        "a.fil_date_of_trust, " .
        "a.fil_notes, " .
        "a.fil_date_of_poa " .
        "FROM files a, origination_offices b, nn_files_tags c, tags d " .
        "WHERE " .
        "a.fil_oof_id_fk = b.oof_id_pk AND " .
        "a.fil_id_pk = c.fil_id_fk AND " .
        "d.tag_id_pk = c.tag_id_fk AND " .
        $whereAddenda;
    $this->getAdapter()->query($sql);
    return $this;
}
Your table c (nn_files_tags) models a many-to-many relationship between files and tags, so joining through it returns one row per matching file/tag pair rather than one row per file. For example, if you have invoices between companies and customers and you select from a join of them, you will get as many rows as you have invoices. If you then select only the company name and customer name, you will get many duplicates, because the same pair has produced many invoices.
This is the same issue you have here.
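A quick way to confirm this is to look for file ids that landed in the target table more than once. This is only a sketch against the adds table and fil_id column from your INSERT; the alias copies is my own name.

```sql
-- List every file id that was inserted more than once, with its row count.
SELECT fil_id, COUNT(*) AS copies
FROM adds
GROUP BY fil_id
HAVING COUNT(*) > 1;
```

If the explanation above is right, each file listed should have as many copies as it has tags that pass the NOT IN filter. Files with exactly one such tag appear only once, which would explain why most but not all records are duplicated.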
As asc99c said, you could use an inner select in your WHERE clause instead of joining on that relationship, or you could use the DISTINCT keyword (which is effectively a GROUP BY on everything in your SELECT clause). I would expect the inner select to be more efficient (though I could be totally wrong about that), but DISTINCT is only eight key presses away…
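Roughly, the two options could look like the sketch below. I have cut the column list down to four columns and left out the a.fil_bool_* and related filters to keep it readable, so treat it as an outline to merge back into your own query, not a drop-in replacement. I have not run it against your schema.

```sql
-- Option 1: keep the joins and collapse identical rows with DISTINCT.
INSERT INTO adds (fil_id, FIRST_NAME, LAST_NAME, ORIGINATION_OFFICE)
SELECT DISTINCT a.fil_id_pk, a.fil_first_name, a.fil_last_name, b.oof_name
FROM files a, origination_offices b, nn_files_tags c, tags d
WHERE a.fil_oof_id_fk = b.oof_id_pk
  AND a.fil_id_pk = c.fil_id_fk
  AND d.tag_id_pk = c.tag_id_fk
  AND d.tag_description NOT IN (
      'Already a trust client',
      'Not received from local office',
      'Southtrust client (already centralized)');

-- Option 2: move the tag tables into an inner select, so the outer
-- query produces at most one row per file.
INSERT INTO adds (fil_id, FIRST_NAME, LAST_NAME, ORIGINATION_OFFICE)
SELECT a.fil_id_pk, a.fil_first_name, a.fil_last_name, b.oof_name
FROM files a, origination_offices b
WHERE a.fil_oof_id_fk = b.oof_id_pk
  AND a.fil_id_pk IN (
      SELECT c.fil_id_fk
      FROM nn_files_tags c, tags d
      WHERE d.tag_id_pk = c.tag_id_fk
        AND d.tag_description NOT IN (
            'Already a trust client',
            'Not received from local office',
            'Southtrust client (already centralized)'));
```

Both versions keep the behaviour of your current query: a file is inserted if it has at least one tag outside the excluded list. If what you actually want is to skip any file that carries one of those three tags (that is a guess on my part), the inner select would need to become a NOT IN over the files that have an excluded tag.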