I am using the following script to upload records to my MySQL database. The problem is that if a client record is uploaded and it already exists in the database, it gets duplicated.
I have seen lots of posts on here from people asking how to remove duplicates from the CSV file itself on upload, e.g. if there are two instances of the name bob and the postcode lh456gl in the CSV, don't upload it. What I want to know is whether it is possible to check the database for a record before adding it, so as not to insert a record that is already there.
So something like:
if exists namecolumn = $name_being_inserted and postcode = $postcode_being_inserted then
do not add that record.
Is this even possible to do?
<?php
//database connect info here
//check for file upload
if(isset($_FILES['csv_file']) && is_uploaded_file($_FILES['csv_file']['tmp_name'])){
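//upload directory (trailing slash needed, otherwise the path becomes "./csvfilename")
$upload_dir = "./csv/";
//create file name (basename() strips any path components sent by the client)
$file_path = $upload_dir . basename($_FILES['csv_file']['name']);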
//move uploaded file to upload dir
if (!move_uploaded_file($_FILES['csv_file']['tmp_name'], $file_path)) {
//error moving upload file, stop here because there is nothing to read
die("Error moving file upload");
}
//open the csv file for reading
$handle = fopen($file_path, 'r');
while (($data = fgetcsv($handle, 1000, ',')) !== FALSE) {
//Access field data in $data array ex.
$name = $data[0];
$postcode = $data[1];
//Use data to insert into db
$sql = sprintf("INSERT INTO test (name, postcode) VALUES ('%s','%s')",
mysql_real_escape_string($name),
mysql_real_escape_string($postcode)
);
mysql_query($sql) or (mysql_query("ROLLBACK") and die(mysql_error() . " - $sql"));
}
//delete csv file
unlink($file_path);
}
?>
There are two pure MySQL methods that I can think of that would deal with this issue.
REPLACE INTO and INSERT IGNORE. REPLACE INTO will overwrite the existing row, whereas INSERT IGNORE will ignore errors triggered by duplicate keys being entered in the database. This behaviour is described in the MySQL manual.
For INSERT IGNORE to work you will need to set up a UNIQUE key/index on one or more of the fields. Looking at your code sample, though, you do not have anything that could be considered unique in your insert query. What if there are two John Smiths in Wolverhampton? Ideally you would have something like an email address to define as unique.
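As a rough sketch of how this could look, assuming you decide that name plus postcode is unique enough for your data (which, as said, it may not be): add a composite UNIQUE key once, then let INSERT IGNORE skip the rows that already exist. The index name uniq_name_postcode and the helper import_rows() are made up for illustration, and I have used PDO with a prepared statement instead of the mysql_* functions from your script, so $pdo is assumed to be an already opened PDO connection to your MySQL database.

```php
<?php
// One-off schema change (run once, e.g. from the mysql client). The index name
// uniq_name_postcode is hypothetical; name and postcode are assumed to be
// VARCHAR columns on the `test` table from the question.
const ADD_UNIQUE_KEY_SQL =
    'ALTER TABLE test ADD UNIQUE KEY uniq_name_postcode (name, postcode)';

// With the unique key in place, a duplicate (name, postcode) pair is skipped
// instead of raising a duplicate-key error.
const INSERT_IGNORE_SQL =
    'INSERT IGNORE INTO test (name, postcode) VALUES (:name, :postcode)';

/**
 * Inserts each CSV row and returns how many rows were actually added.
 * $rows is a list of arrays shaped like the fgetcsv() result: [name, postcode].
 */
function import_rows(PDO $pdo, array $rows)
{
    $stmt = $pdo->prepare(INSERT_IGNORE_SQL);
    $inserted = 0;
    foreach ($rows as $row) {
        $stmt->execute(array(':name' => $row[0], ':postcode' => $row[1]));
        // rowCount() is 0 when MySQL ignored the row as a duplicate.
        $inserted += $stmt->rowCount();
    }
    return $inserted;
}
```

In your script you would collect each $data array from the fgetcsv() loop into $rows and then call import_rows($pdo, $rows). Note that the ALTER TABLE will fail if the table already contains duplicate name/postcode pairs, so those need cleaning up first. Swapping INSERT IGNORE for REPLACE INTO in the statement would overwrite the existing row instead of keeping it.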