I am about to embark on a PostgreSQL project for a client. They want to develop a huge professional database with many complex joins, so after consideration I have chosen to go with PostgreSQL over MySQL.
An important consideration is how to interface with the database effectively from scripts. Currently, the client uses about a million scripts to import and reshape data to their needs, but uses no database (unless you consider CSV files to be a database). Once there is a database structure with queries and views, fewer scripts will be needed, but importing will still have to be done often, and exporting/reporting as well. For me the ideal end result would be a series of standardized scripts, preferably with a web interface, so that the client can perform regular tasks quickly and error-free at the click of a button.
My question is which scripting approach will be most appropriate. Probably any scripting language with a Postgres or an ODBC plugin would suffice, but I am looking to make a smart choice for the long term. Does anybody have experience with this? Does Postgres offer an internal scripting language, and is it easy to build a GUI for that? Are there any standardized tools available for importing/exporting, and are they customizable enough to reduce routine tasks to a single click? How about PHP or Perl?
Thanks in advance. Any tips, resources, puzzled looks or pitiful gestures will be truly appreciated 😉
Since you are talking about scripts that expressly just manipulate the database, I would start with the most native tools.
- COPY FROM and COPY TO for importing from and exporting to flat files

Now, you want to provide an easy web interface for running these scripts. Here the best language is probably the one you or your team already knows. All major languages have Postgres drivers. The language you choose will have very little impact if you keep your data manipulation tasks at the database layer.
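To make that concrete, here is a rough sketch in Python of how a thin script could hand the import and export work to COPY. It assumes `conn` is a psycopg2 connection (any driver that offers a `copy_expert`-style method would do); the helper names, the table and the column names are made up for illustration, and the CSV options are just one reasonable choice.

```python
def build_copy_sql(table, columns, direction):
    """Build a COPY statement for CSV data with a header row.

    Identifiers are double-quoted by hand here to keep the sketch
    dependency-free; with psycopg2 you would normally use
    psycopg2.sql.Identifier instead.
    """
    if direction not in ("FROM STDIN", "TO STDOUT"):
        raise ValueError("direction must be 'FROM STDIN' or 'TO STDOUT'")

    def quote(name):
        return '"' + name.replace('"', '""') + '"'

    cols = ", ".join(quote(c) for c in columns)
    return (
        f"COPY {quote(table)} ({cols}) {direction} "
        "WITH (FORMAT csv, HEADER true)"
    )


def import_csv(conn, table, columns, csv_file):
    """Load an open CSV file into a table (conn: assumed psycopg2 connection)."""
    sql = build_copy_sql(table, columns, "FROM STDIN")
    with conn.cursor() as cur:
        cur.copy_expert(sql, csv_file)
    conn.commit()


def export_csv(conn, table, columns, out_file):
    """Write a table out as CSV to an open, writable file object."""
    sql = build_copy_sql(table, columns, "TO STDOUT")
    with conn.cursor() as cur:
        cur.copy_expert(sql, out_file)
```

Usage would be along the lines of `import_csv(conn, "orders", ["id", "total"], open("orders.csv"))`. The point is that the script only moves a file handle around; parsing and loading happen inside Postgres, so the same few lines would look much the same in PHP or Perl.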
One thing to consider is how long the typical script will take to execute. If it is more than a few minutes, then I suggest decoupling it from the web interface. In that case, the web interface should only let the user queue the script, so that the server can run it independently of the web request cycle.
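A minimal sketch of that decoupling, using only the Python standard library: the web handler would call `enqueue` and return the job ID right away, while a background worker runs the jobs one at a time. All names here are hypothetical, and an in-process queue like this is lost when the server restarts, so for real use you would more likely keep the queue in a database table or use a dedicated job runner.

```python
import queue
import threading
import uuid

jobs = {}                      # job_id -> "queued" | "running" | "done" | "failed: ..."
jobs_lock = threading.Lock()   # protects the jobs dict across threads
pending = queue.Queue()        # jobs waiting for the worker


def enqueue(task, *args):
    """Called from the web request: register the job and return at once."""
    job_id = uuid.uuid4().hex
    with jobs_lock:
        jobs[job_id] = "queued"
    pending.put((job_id, task, args))
    return job_id


def status(job_id):
    """Called from a status page so the user can poll for progress."""
    with jobs_lock:
        return jobs.get(job_id, "unknown")


def worker():
    """Runs outside the request cycle and processes jobs in order."""
    while True:
        job_id, task, args = pending.get()
        with jobs_lock:
            jobs[job_id] = "running"
        try:
            task(*args)
            result = "done"
        except Exception as exc:
            result = f"failed: {exc}"
        with jobs_lock:
            jobs[job_id] = result
        pending.task_done()


# A single daemon worker keeps long imports from running concurrently.
threading.Thread(target=worker, daemon=True).start()
```

With something like this in place, the "click of a button" becomes `job_id = enqueue(run_import, "orders.csv")`, where `run_import` stands for whatever long-running script you have, and the page then polls `status(job_id)` until it reports done or failed.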