I want to import a CSV file into PostgreSQL 9.2, but the file uses an empty quoted string ("") in the final column to represent a NULL value:
"2","1001","9","2","0","0","130","","2012-10-22 09:33:07.073000000",""
which is mapped to a column of type timestamp. PostgreSQL doesn’t like the "". I’ve tried to set the NULL option, but maybe I’m not doing it correctly? I’ve tried NULL as '"" and NULL '' and NULL as '' and NULL "" but without success. Here’s my command:
COPY SCH.DEPTS
FROM 'H:/backups/DEPTS.csv'
WITH (
FORMAT CSV,
DELIMITER ',' ,
NULL '',
HEADER TRUE,
QUOTE '"'
)
but it fails with an error:
ERROR: invalid input syntax for type timestamp: ""
CONTEXT: COPY depts, line 2, column expirydate: ""
P.S. Is there a way to specify the string representation of Booleans to the COPY command? The utility that produced the CSVs (of which there are many) used “false” and “true”.
The empty string ("") isn’t a valid timestamp, and COPY in 9.2 doesn’t appear to offer a FORCE NULL or FORCE EMPTY TO NULL mode; it has the reverse, FORCE NOT NULL, but that won’t do what you want.
You probably need to COPY the data into a staging table that has a text field for the timestamp, probably an UNLOGGED or TEMPORARY table, then move the rows across with something like INSERT INTO real_table SELECT col1, col2, col3, NULLIF(tscol,'')::timestamp FROM temp_table; (the explicit cast is needed because the staging column is text).
COPY should accept true and false as booleans, so you shouldn’t have any issues there.
Alternately, read the CSV with a simple Python script and the csv module, and then use psycopg2 to COPY rows into Pg. Or just write a new, cleaned-up CSV out and feed that into COPY. Or use an ETL tool that does data transforms, like Pentaho Kettle or Talend.
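As a rough illustration of the "write a cleaned-up CSV" route, here is a minimal sketch using only Python's csv module. The function name clean_csv, the choice of \N as the NULL marker, and the column index 9 (the last field of the sample row) are assumptions made for this example, not anything from the original files.

```python
import csv
import io


def clean_csv(src, dst, null_columns, null_marker=r"\N"):
    """Copy CSV rows from src to dst, replacing empty strings in the
    given zero-based columns with null_marker.

    null_columns and null_marker are illustrative; pick whatever suits
    your data, as long as the marker never occurs as a real value.
    """
    reader = csv.reader(src)
    # The default QUOTE_MINIMAL leaves the marker unquoted, which is what
    # COPY's CSV mode requires before it treats a field as NULL.
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        for i in null_columns:
            if i < len(row) and row[i] == "":
                row[i] = null_marker
        writer.writerow(row)


# Demo on the sample row from the question; for real files, open both
# source and destination with newline="" as the csv docs recommend.
sample = '"2","1001","9","2","0","0","130","","2012-10-22 09:33:07.073000000",""\n'
out = io.StringIO()
clean_csv(io.StringIO(sample), out, null_columns=[9])
print(out.getvalue())
```

The cleaned file could then be loaded with the same COPY command, but with NULL '\N' in place of NULL ''. Only the listed columns are touched, so empty strings elsewhere (such as the eighth field) stay empty strings rather than becoming NULL. If you would rather stream the rows from Python, psycopg2's cursor.copy_expert should accept the same COPY statement reading FROM STDIN along with a file-like object.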