I’ve been using GROUP BY for all types of aggregate queries over the years. Recently, I’ve been reverse-engineering some code that uses PARTITION BY to perform aggregations.
In reading through all the documentation I can find about PARTITION BY, it sounds a lot like GROUP BY, maybe with a little extra functionality added in.
Are they two versions of the same general functionality or are they something different entirely?
They’re used in different places.
GROUP BY modifies the entire query.
But PARTITION BY just works on a window function, like ROW_NUMBER().
GROUP BY normally reduces the number of rows returned by rolling them up and calculating averages or sums for each group.
PARTITION BY does not affect the number of rows returned, but it changes how a window function’s result is calculated.
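To make the row-count difference concrete, here is a minimal sketch that runs both kinds of query against the same data. The Orders table, its columns, and the sample rows are made up for illustration, and it assumes an in-memory SQLite database (version 3.25 or later, which is when window functions were added) driven from Python. The SQL itself should carry over to other databases with little or no change.

```python
import sqlite3

# Hypothetical Orders table and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (orderId INTEGER PRIMARY KEY, customerId INTEGER)")
conn.executemany(
    "INSERT INTO Orders (orderId, customerId) VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20), (4, 10), (5, 20)],
)

# GROUP BY: the five order rows are rolled up into one row per customer.
grouped = conn.execute(
    """
    SELECT customerId, COUNT(*) AS orderCount
    FROM Orders
    GROUP BY customerId
    ORDER BY customerId
    """
).fetchall()

# PARTITION BY: all five order rows are kept; ROW_NUMBER() restarts
# at 1 for each customerId.
numbered = conn.execute(
    """
    SELECT orderId,
           customerId,
           ROW_NUMBER() OVER (PARTITION BY customerId ORDER BY orderId)
               AS orderNumberForCustomer
    FROM Orders
    ORDER BY customerId, orderId
    """
).fetchall()

print(grouped)   # [(10, 3), (20, 2)]
print(numbered)  # [(1, 10, 1), (2, 10, 2), (4, 10, 3), (3, 20, 1), (5, 20, 2)]
```

The GROUP BY query returns two rows for five orders, while the PARTITION BY query returns all five rows, each with a per-customer sequence number.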