I want to load a plain file into a Greenplum database using external tables.
Can I specify an input format for timestamp/date/time fields? (If you know the answer for PostgreSQL, please reply as well.)
For example, with Oracle I can use DATE_FORMAT DATE MASK 'YYYYMMDD' to tell it how to parse the date. For Netezza I can specify DATESTYLE 'YMD'. For Greenplum I cannot find the answer. I can declare the fields as char and then parse them during the load, but this is an ugly workaround.
Here is my tentative code:
CREATE EXTERNAL TABLE MY_TBL (X date, Y time, Z timestamp )
LOCATION (
'gpfdist://host:8001/file1.txt',
'gpfdist://host:8002/file2.txt'
) FORMAT 'TEXT' (DELIMITER '|' NULL '')
It appears that you can set DateStyle before SELECTing from the table. This will affect the interpretation of all dates, though, not just those from the file. If you consistently use unambiguous ISO dates elsewhere that will be fine, but it may be a problem if (for example) you need to also accept 'D/M/Y' date literals in the same query.

This is specific to Greenplum's CREATE EXTERNAL TABLE and does not apply to SQL-standard SQL/MED foreign data wrappers, as shown below.

What surprises me is that PostgreSQL proper (which does not have this CREATE EXTERNAL TABLE feature) always accepts ISO-style YYYY-MM-DD and YYYYMMDD dates, irrespective of DATESTYLE. Observe:

… so if Greenplum behaved the same way, you should not need to do anything to get these YYYYMMDD dates to be read correctly from the input file.

Here's how it works with a PostgreSQL file_fdw SQL/MED foreign data wrapper:

The CSV file contents are:
so you can see that Pg will always accept ISO dates for CSV, irrespective of datestyle.
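A minimal sketch of such a check on plain PostgreSQL (9.1 or later, where file_fdw is available). The server name, table name, and file path are hypothetical, and the file is assumed to hold one YYYYMMDD or YYYY-MM-DD date per line.

```sql
-- ISO-style literals should parse the same way under a non-ISO DateStyle.
SET datestyle = 'SQL, DMY';
SELECT '2012-01-31'::date AS dashed, '20120131'::date AS compact;

-- The same check for dates read through file_fdw.
CREATE EXTENSION IF NOT EXISTS file_fdw;
CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;   -- hypothetical server name
CREATE FOREIGN TABLE date_test (x date)                  -- hypothetical table name
    SERVER csv_files
    OPTIONS (filename '/tmp/dates.csv', format 'csv');   -- hypothetical path
SELECT x FROM date_test;
```

Only the way the dates are displayed should change with DateStyle here, not which day they refer to.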
If Greenplum doesn't, please file a bug. The idea of DateStyle changing the way a foreign table is read after creation is crazy.
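If changing DateStyle for the whole session is a concern, one option is to scope the change to a single transaction with SET LOCAL. Another is the text-column workaround from the question. The following is a rough sketch that has not been tested on Greenplum. The tables my_target and my_tbl_raw and the timestamp layout 'YYYYMMDD HH24MISS' are hypothetical, and Option 1 assumes Greenplum really does apply DateStyle when it reads MY_TBL.

```sql
-- Option 1: limit the DateStyle change to one transaction.
-- Assumes Greenplum applies DateStyle when parsing the external file.
BEGIN;
SET LOCAL datestyle = 'ISO, YMD';   -- reverts automatically at COMMIT or ROLLBACK
INSERT INTO my_target (x, y, z)     -- my_target: hypothetical regular table
SELECT x, y, z FROM my_tbl;
COMMIT;

-- Option 2: read the fields as text and parse them explicitly during the load.
CREATE EXTERNAL TABLE my_tbl_raw (x text, y text, z text)
LOCATION (
    'gpfdist://host:8001/file1.txt',
    'gpfdist://host:8002/file2.txt'
) FORMAT 'TEXT' (DELIMITER '|' NULL '');

INSERT INTO my_target (x, y, z)
SELECT to_date(x, 'YYYYMMDD'),
       CAST(y AS time),
       CAST(to_timestamp(z, 'YYYYMMDD HH24MISS') AS timestamp)  -- assumed layout
FROM my_tbl_raw;
```

Option 2 is more verbose, but it keeps the parsing rules next to the load statement, so the result does not depend on any session setting.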