I want to let users import a CSV file into my PHP/MySQL system, but I ran into encoding problems when the content is Russian, which Excel can only save correctly as UTF-16 tab-delimited text files.
Right now my database is in latin1, but I will change that to UTF-8 as described in the question “a-script-to-change-all-tables-and-fields-to-the-utf-8-bin-collation-in-mysql”.
But how should I import the file and store the strings?
Should I, for example, convert them with htmlentities()?
I am using the fgetcsv function to get the data out of the CSV file.
My code looks something like this right now:
file_put_contents($tmpfile, str_replace("\t", ";", file_get_contents($tmpfile)));
$filehandle = fopen($tmpfile, 'r');
while (($data = fgetcsv($filehandle, 1000, ";")) !== FALSE) {
    $values[] = array(
        'id'   => $data[0],
        'type' => $data[1],
        'text' => $data[4],
        'desc' => $data[5],
        'pdf'  => $data[7]);
}
As a note, if I save the xls file as CSV in Excel, the special characters are replaced by ‘_’, so the only way I can get the Russian characters out of the file is to save it in Excel as a tab-separated file in UTF-16 format.
Okay, the solution was to export the file from Excel as UTF-16 Unicode text, convert it from UTF-16 to UTF-8, and replace ‘\t’ with ‘;’.
file_put_contents($tmpfile, str_replace("\t", ";", iconv('UTF-16', 'UTF-8', file_get_contents($tmpfile))));
The table in MySQL has to be changed from latin1 to utf8.
And then the file could be imported as before.
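A variation on the same idea, as a minimal sketch rather than a definitive implementation: instead of rewriting tabs to semicolons (which would break on any cell that itself contains a ‘;’), the converted text can be parsed with the tab as the fgetcsv delimiter. The function name readExcelUnicodeText is hypothetical, and the sketch assumes the file starts with the byte order mark that Excel normally writes, so that iconv can detect the byte order.

```php
<?php
// Hypothetical helper: reads an Excel "Unicode Text" export
// (UTF-16, tab-delimited) and returns its rows as UTF-8 strings.
function readExcelUnicodeText(string $path): array
{
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // 'UTF-16' (no LE/BE suffix) lets iconv use the BOM to pick the byte order.
    $utf8 = iconv('UTF-16', 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException("Cannot convert $path to UTF-8");
    }

    // Drop a UTF-8 BOM in case the converter kept one.
    if (strncmp($utf8, "\xEF\xBB\xBF", 3) === 0) {
        $utf8 = substr($utf8, 3);
    }

    // Parse from an in-memory stream so the uploaded file is left untouched.
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, $utf8);
    rewind($handle);

    $rows = [];
    while (($data = fgetcsv($handle, 0, "\t", '"', '\\')) !== false) {
        if ($data === [null]) {
            continue; // fgetcsv returns [null] for a blank line
        }
        $rows[] = $data;
    }
    fclose($handle);

    return $rows;
}
```

Each returned row can then be mapped to the 'id', 'type', 'text', 'desc' and 'pdf' keys exactly as in the loop from the question.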
When I want to export the data from the database to an Excel file, the CSV version is not an option. It has to be done in Excel's HTML mode, where the data is escaped with e.g. urlencode() or htmlentities().
Here is some example code.
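A minimal sketch of what such an HTML-mode export might look like, under a few assumptions: the rows are already UTF-8 strings read from the database, the function name buildExcelHtml is hypothetical, and htmlspecialchars() is swapped in for htmlentities() because only the HTML markup characters need escaping once the charset is declared as UTF-8.

```php
<?php
// Hypothetical helper: renders rows as an HTML table that Excel can open.
// Assumes every cell value is already a UTF-8 string.
function buildExcelHtml(array $rows): string
{
    // Declaring the charset tells the reader how to decode the Russian text.
    $html = '<html><head>'
          . '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
          . "</head><body><table>\n";

    foreach ($rows as $row) {
        $html .= '<tr>';
        foreach ($row as $cell) {
            // Escape <, >, & and quotes so cell content cannot break the markup.
            $html .= '<td>' . htmlspecialchars((string) $cell, ENT_QUOTES, 'UTF-8') . '</td>';
        }
        $html .= "</tr>\n";
    }

    return $html . "</table></body></html>\n";
}
```

In a web response, the result would typically be echoed after sending a header such as Content-Type: application/vnd.ms-excel together with a Content-Disposition attachment header, so that the browser hands the file to Excel; the exact headers depend on the setup.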