I have a PostgreSQL table that is mostly a bridge table but it also has some extra stuff.
Essentially it holds the information about players in a game. So we have a unique id for this instance of a player in a game. Then an id that is FK to game table, and an id that is FK to player table. There is also some other irrelevant stuff. Something like this:
Table players_games
| id | 12564
| player_id | 556
| game_id | 156184
What I want to do is find how many occurrences there are of a player playing with another. So, if player1 is in the same game as player2, they have played together once. There are 2+ players in a game.
So what I want to do is populate a new table, that holds three values: player_lo, player_hi, times_played.
And either have one row for each pair and the number of times they played, or if it ends up being more efficient, a row for each iteration and have the value set as 1 so these can be added together later, maybe distributed. So you might see something like:
p1, p2, 1
p1, p2, 1
And these get reduced later to:
p1, p2, 2
So I was wondering if there was some clever way to do this with SQL, or if there’s SQL that can reduce my programming effort, before starting to write a slightly complex python script to do it.
To do this, you need to do a self join on the player_games table. The first subquery is for the first player, and the second for the second player. The “first” player is the one with the lower player id.
Note that the join condition uses a “<” on the player ids. This is to prevent counting duplicates (so players A,B are not also counted as B,A).
Also, I added a “distinct” in the inner subqueries just in case a single player might appear more than once for a given game. Perhaps this is not necessary. To be sure, you should have a unique index on the composite key game_id, player_id.