I have a table person in my PostgresSQL database, which contains data of different users.
I need to write a test case, which ensures that some routine does modify the data of user 1, and does not modify data of user 2.
For this purpose, I need to
a) calculate a hash code of all rows of user 1 and those of user 2,
b) then perform the operation under test,
c) calculate the hash code again and
d) compare hash codes from steps a) and c).
I found a way to calculate the hash code for a single row:
SELECT md5(CAST((f.*)AS text))
FROM person f;
In order to achieve my goal (find out whether rows of user 2 have been changed), I need to perform a query like this:
SELECT user_id, SOME_AGGREGATE_FUNCTION(md5(CAST((f.*)AS text)))
FROM person f
GROUP BY user_id;
What aggregate function can I use in order to calculate the hash code of a set of rows?
Note: I just want to know whether any rows of user 2 have been changed. I do not want to know, what exactly has changed.
The simplest way – just concat all the string form
md5withstring_agg. But to use this aggregate correctly you must specifyORDER BY.Or use
md5(string_agg(md5(CAST((f.*)AS text)),''))with someORDER BY– it will change if any field off.*changes and it is cheap to compare.