Example CSV line:
"2012","Test User","ABC","First","71.0","","","0","0","3","3","0","0","","0","","","","","0.1","","4.0","0.1","4.2","80.8","847"
All values after “First” are numeric columns. Lots of NULL values, just written as quoted empty strings.
Attempt at COPY:
copy mytable from 'myfile.csv' with csv header quote '"';
NOPE: ERROR: invalid input syntax for type numeric: ""
Well, yeah. It’s a null value. Attempt 2 at COPY:
copy mytable from 'myfile.csv' with csv header quote '"' null '""';
NOPE: ERROR: CSV quote character must not appear in the NULL specification
What’s a fella to do? Strip out all double quotes from the file before running COPY? Can do that, but I figured there’s a proper solution to what must be an incredibly common problem.
While some database products treat an empty string as a NULL value, the standard says that they are distinct, and PostgreSQL treats them as distinct.
It would be best if you could generate your CSV file with an unambiguous representation. While you could use sed or something similar to filter the file into a usable format, the other option would be to COPY the data into a table where a text column could accept the empty strings, and then populate the target table from it. The NULLIF function may help with that (http://www.postgresql.org/docs/9.1/interactive/functions-conditional.html#FUNCTIONS-NULLIF): it returns NULL if both arguments match, and the first argument if they don’t. So, something like NULLIF(txtcol, '')::numeric might work for you.
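A minimal sketch of that staging-table approach follows. The column names and the staging table name are hypothetical, since the question doesn't show the definition of mytable, and the column list is shortened for illustration: the real staging table needs one text column per CSV field, in file order.

```sql
-- Staging table: every column is text, so quoted empty strings load as-is.
-- Column names are made up for this sketch; use one text column per CSV field.
CREATE TEMP TABLE mytable_staging (
    yr    text,
    name  text,
    code  text,
    term  text,
    score text,
    extra text
);

-- Same COPY as the first attempt, but aimed at the all-text staging table.
COPY mytable_staging FROM 'myfile.csv' WITH CSV HEADER QUOTE '"';

-- Move the rows across, turning '' into NULL before casting to numeric.
-- Assumes the first four target columns are text and the rest are numeric.
INSERT INTO mytable (yr, name, code, term, score, extra)
SELECT yr,
       name,
       code,
       term,
       NULLIF(score, '')::numeric,
       NULLIF(extra, '')::numeric
FROM mytable_staging;
```

The cast is applied after NULLIF, so the empty string never reaches the numeric input function. If you are on PostgreSQL 9.4 or later, COPY also accepts a FORCE_NULL option with a column list, which makes quoted empty strings match the NULL string for those columns and may let you skip the staging table altogether.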