At work we currently have a PostgreSQL database, and we access it via some Perl bindings that query the database and marshal the responses to Perl types. This works OK, but for various reasons we are becoming unhappy with Perl. One option we’ve been considering is to move the majority of the work in this API into the database itself as PL/pgSQL stored procedures.
Brief Example
For example, we might have the following in the database:
-- This matches our 'Entity::Artist' object
CREATE TYPE loaded_artist AS (
    artist_id uuid,
    revision_id integer,
    artist_tree_id integer,
    name text,
    sort_name text,
    artist_type_id integer,
    -- etc
);
-- This gets the latest 'master' version of an artist and joins in basic data
-- from the artist tree
CREATE FUNCTION get_latest_artist_by_mbid(in_mbid UUID)
RETURNS SETOF loaded_artist AS $$
BEGIN
    RETURN QUERY
    SELECT
        artist_id, revision_id, artist_tree_id, name.name,
        sort_name.name AS sort_name, artist_type_id
    FROM artist
    JOIN artist_revision USING (artist_id)
    JOIN artist_tree USING (artist_tree_id)
    JOIN artist_data USING (artist_data_id)
    WHERE artist.master_revision_id = revision_id
      AND artist_id = in_mbid;
END;
$$ LANGUAGE plpgsql;
Now our current Perl API can be simplified to effectively the following:
# And in Perl
package Data::Artist;
sub get_latest_by_mbid {
    my ($self, $mbid) = @_;
    return $self->new_from_row(
        $self->sql->select_single_row_hash(
            'SELECT * FROM get_latest_artist_by_mbid(?)',
            $mbid));
}
Is this sensible?
On face value, I like this. We:
- Move away from Perl, but don’t commit to another language. This means we can move our actual application to Python/whatever in the future and the majority of our API is already done.
- Get extra type safety from PostgreSQL due to specifying things like RETURNS SETOF loaded_artist
- Still have unit tests and stuff via pgTAP.
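To make the first point a bit more concrete, here is a rough sketch of what the same call might look like from Python instead of Perl. This is only an illustration, not code we have: the `Artist` class and the Python function are hypothetical names, `conn` is assumed to be a DB-API 2.0 connection from a driver that uses `%s` placeholders (psycopg2 does), and only the six columns spelled out in `loaded_artist` above are selected.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Artist:
    """Hypothetical Python counterpart of the loaded_artist composite type."""
    artist_id: str
    revision_id: int
    artist_tree_id: int
    name: str
    sort_name: str
    artist_type_id: Optional[int]


def get_latest_artist_by_mbid(conn: Any, mbid: str) -> Optional[Artist]:
    """Call the stored function and map the first row, if any, to an Artist.

    Assumption: `conn` is a DB-API 2.0 connection whose driver uses the
    `%s` parameter style.
    """
    cur = conn.cursor()
    try:
        cur.execute(
            "SELECT artist_id, revision_id, artist_tree_id, "
            "name, sort_name, artist_type_id "
            "FROM get_latest_artist_by_mbid(%s)",
            (mbid,),
        )
        row = cur.fetchone()
    finally:
        cur.close()
    # No row means no artist with that MBID has a master revision.
    return Artist(*row) if row is not None else None
```

The wrapper holds no query logic of its own, which is the point: all it has to know is the function name and the column order.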
There are a few disadvantages:
- Potentially slower development cycle, as we now have to replace functions in the database. Not the end of the world, but this effectively introduces a ‘compile’ step into our workflow that was not previously there.
- Potentially more difficult version control, but there are certainly ways of doing it
Has anyone done work like this? Would you encourage it, or was it fraught with peril?
Footnote: A little more about our case
This is for an open source website. We distribute dumps of our database for people to import into PostgreSQL databases. We have no plans to move away from PG any time soon, so database agnostic decisions don’t really apply to us. We are a very small team (2 paid developers, more open source contributors) and this lets us be quite flexible in terms of deployment strategies.
Advantages:
SELECT queries.
Disadvantages:
The best combination is when you implement a good deal of your business logic on the database side, and not only wrapper functions.
Schema version control is possible. It is trickier to version the data in the configuration tables. In one of the projects I’m involved in, this is done via an external tool (Perl based) that handles this part for us:
We’re versioning the extract files instead (which are plain SQL) and have a special step in the installation script to load the new configuration.
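A load step of that kind could look roughly like the sketch below. It is written in Python purely for illustration and is not the Perl tool mentioned above. The function name, the directory layout, and the idea of ordering files by name are all assumptions. `conn` is assumed to be a DB-API 2.0 connection whose driver accepts a whole multi-statement SQL file in a single `execute` call (psycopg2 does when no parameters are passed).

```python
from pathlib import Path
from typing import Any, List


def load_sql_files(conn: Any, directory: str) -> List[str]:
    """Execute every *.sql file in `directory`, in file-name order.

    Everything runs in one transaction: either all files are applied
    and committed, or the first failure rolls the whole batch back.
    Returns the names of the files that were applied.
    """
    applied: List[str] = []
    cur = conn.cursor()
    try:
        # Sorting by name lets prefixes like 001_, 002_ define the order.
        for path in sorted(Path(directory).glob("*.sql")):
            cur.execute(path.read_text(encoding="utf-8"))
            applied.append(path.name)
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
    return applied
```

Because the files are plain SQL kept under version control, a diff between two releases shows exactly which configuration rows changed.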