Here’s the basic guts of my schema and problem: http://sqlfiddle.com/#!1/72ec9/4/2
Note that the periods table can refer to a variable range of time – it could be an entire season, it could be a few games or one game. For a given team and year all period rows represent exclusive ranges of time.
I’ve got a query written which joins up tables and uses a GROUP BY periods.year to aggregate scores for a season (see sqlfiddle). However, if a coach had two positions in the same year, the GROUP BY will count the same period row twice. How can I ditch the duplicates when a coach held two positions but still sum up periods when a year consists of multiple periods? If there’s a better way to do the schema I’d also appreciate it if you pointed it out to me.
The underlying problem (joining to multiple tables with multiple matches) is explained in this related answer:
To fix, I first simplified & formatted your query:
This yields the same incorrect result as your original, but it is simpler, faster, and easier to read.
- No point in joining the table `coach` as long as you don’t use its columns in the `SELECT` list. I removed it completely and replaced the `WHERE` condition with `where pp.coach = 1`.
- You don’t need `COALESCE`. `NULL` values are ignored in the aggregate function `sum()`. No need to substitute `0`.
- Use table aliases to make it easier to read.
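To illustrate the point about `COALESCE`, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for Postgres. The `periods` table and its `wins` column are assumptions made up for this example, not the schema from the fiddle. It suggests that `sum()` skips `NULL` values by itself, and that the two forms differ only when every value in a group is `NULL`.

```python
import sqlite3

# Hypothetical table and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE periods (id INTEGER PRIMARY KEY, year INTEGER, wins INTEGER)")
conn.executemany(
    "INSERT INTO periods VALUES (?, ?, ?)",
    [(1, 2010, 5), (2, 2010, None), (3, 2010, 3), (4, 2011, None)],
)

# sum() ignores the NULL in 2010, so no COALESCE is needed inside it.
plain = conn.execute(
    "SELECT year, sum(wins) FROM periods GROUP BY year ORDER BY year"
).fetchall()

# Wrapping each value changes nothing for 2010; it only matters for 2011,
# where every value in the group is NULL.
wrapped = conn.execute(
    "SELECT year, sum(COALESCE(wins, 0)) FROM periods GROUP BY year ORDER BY year"
).fetchall()

print(plain)
print(wrapped)
```

If a `0` is wanted for a group with no non-null values at all, a single `COALESCE(sum(wins), 0)` around the aggregate is enough.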
Next, I solved your problem like this:
Aggregate positions and periods separately before joining them.
- In the first sub-query `po`, list positions only once with `array_agg(DISTINCT ...)`.
- In the second sub-query `pe` … `GROUP BY period`, because a coach can have multiple positions per period.
- `JOIN` to periods-data after that, and then aggregate to get sums.

db<>fiddle here
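The idea of aggregating before joining can be sketched roughly as follows. This is a simplified illustration, not the query from the fiddle: it uses Python's built-in `sqlite3` module, the tables `periods` and `periods_positions` and their columns are hypothetical, and SQLite's `group_concat(DISTINCT ...)` stands in for Postgres' `array_agg(DISTINCT ...)`. It collapses the position rows to one row per period first, so each period is summed only once.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE periods (id INTEGER PRIMARY KEY, year INTEGER, wins INTEGER);
CREATE TABLE periods_positions (period INTEGER, position TEXT, coach INTEGER);
INSERT INTO periods VALUES (1, 2010, 5), (2, 2010, 3);
-- Coach 1 held two positions during period 1 and one during period 2.
INSERT INTO periods_positions VALUES
    (1, 'head coach', 1),
    (1, 'offensive coordinator', 1),
    (2, 'head coach', 1);
""")

# Naive join: period 1 matches two position rows, so its wins count twice.
naive = conn.execute("""
    SELECT pe.year, sum(pe.wins)
    FROM   periods_positions pp
    JOIN   periods pe ON pe.id = pp.period
    WHERE  pp.coach = 1
    GROUP  BY pe.year
""").fetchall()

# Aggregate first: one row per period, then join and sum.
fixed = conn.execute("""
    SELECT pe.year, sum(pe.wins)
    FROM  (
        SELECT period, group_concat(DISTINCT position) AS positions
        FROM   periods_positions
        WHERE  coach = 1
        GROUP  BY period
    ) po
    JOIN   periods pe ON pe.id = po.period
    GROUP  BY pe.year
""").fetchall()

print(naive)
print(fixed)
```

With this sample data the naive join reports 13 wins for 2010 (5 + 5 + 3), while the pre-aggregated version reports 8, because period 1 contributes a single row to the join.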
Old sqlfiddle