I’m building a face match web application.
Note: I just found out that people don’t actually call this type of application a face match application.
Here is a basic workflow.
- users upload photos
- an admin either approves or denies each photo
- when a user accesses the page, two photos are randomly selected from the database
- the user has two options
  - choose one of the photos
  - skip to another match
There is one condition: a user must never see a duplicate match. If a user has already played 1 vs 2, that user should not see 2 vs 1 later.
Let’s say I have the following 4 photos:
table photos
id
1
2
3
4
There are 6 possible matches:
1 vs 2
1 vs 3
1 vs 4
2 vs 3
2 vs 4
3 vs 4
To generate those matches, I use the following cross join query:
select p1.id, p2.id
from photos as p1
cross join photos as p2
where p1.id < p2.id
It works without a problem. My concern is that it will get slower as the number of matches grows.
I get 1,999,000 matches with just 2,000 photos (2000 × 1999 / 2). That is a huge number.
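For reference, the growth can be checked with a small script. This is a minimal sketch assuming Python's built-in sqlite3 module and an in-memory database; the helper name count_pairs is hypothetical and not part of the application. It runs the same cross join and compares the row count with the closed form n × (n − 1) / 2.

```python
import sqlite3


def count_pairs(n_photos: int) -> int:
    """Count the rows the self cross join yields for n_photos photos."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table photos (id integer primary key)")
    conn.executemany(
        "insert into photos (id) values (?)",
        [(i,) for i in range(1, n_photos + 1)],
    )
    (count,) = conn.execute(
        "select count(*) from photos as p1 "
        "cross join photos as p2 where p1.id < p2.id"
    ).fetchone()
    conn.close()
    return count


for n in (4, 100, 2000):
    pairs = count_pairs(n)
    # The query result should agree with the closed form n * (n - 1) / 2.
    assert pairs == n * (n - 1) // 2
    print(n, "photos ->", pairs, "matches")
```

Doubling the number of photos roughly quadruples the number of matches, which is why the full result set becomes expensive to generate or store.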
So I thought about a solution and came up with the idea of creating a new table that stores all the possible matches. The rows would be created when the admin approves a photo.
table matches
id1  id2
1    2
1    3
1    4
and so on
Finally, my question is:
Should I keep using the cross join, or should I create a new table ‘matches’?
Which one would be better?
Any other better solutions would be appreciated!
I think in this case you’d be better off not storing all matches at all. As you’ve figured out, the number of matches grows quadratically with the number of photos. Based on your use case, it seems better to keep a table of the pairs each user has already seen and exclude those pairs when you query for that user. That table will likely be very sparse compared to the entire space of combinations. Unless you need to store data for every combination at the moment the admin approves a photo, there’s no reason to generate them then.
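Here is one way that could look, as a rough sketch rather than a definitive implementation. It assumes Python's built-in sqlite3 module; the seen_pairs table, the approved column, and the next_match helper are all hypothetical names. It also assumes a pair counts as seen as soon as it is shown, whether the user picks a photo or skips. Because most pairs are unseen, the sketch first tries a few random pairs and only falls back to the full exclusion query if those attempts keep hitting seen pairs. That retry step is my own addition on top of the idea above.

```python
import sqlite3

SCHEMA = """
create table photos (
    id integer primary key,
    approved integer not null default 0
);
create table seen_pairs (
    user_id integer not null,
    id1 integer not null,
    id2 integer not null,
    primary key (user_id, id1, id2),
    check (id1 < id2)  -- store each pair once, smaller id first
);
"""


def next_match(conn, user_id, max_tries=20):
    """Return an unseen (id1, id2) pair for user_id and record it, or None."""
    for _ in range(max_tries):
        rows = conn.execute(
            "select id from photos where approved = 1 "
            "order by random() limit 2"
        ).fetchall()
        if len(rows) < 2:
            return None  # fewer than two approved photos
        # Normalise the order so 2 vs 1 is treated as 1 vs 2.
        id1, id2 = sorted(r[0] for r in rows)
        seen = conn.execute(
            "select 1 from seen_pairs "
            "where user_id = ? and id1 = ? and id2 = ?",
            (user_id, id1, id2),
        ).fetchone()
        if seen is None:
            break
    else:
        # Fallback for users who have seen most pairs: exclude seen pairs
        # from the full set of combinations at query time.
        row = conn.execute(
            "select p1.id, p2.id from photos as p1 "
            "join photos as p2 on p1.id < p2.id "
            "where p1.approved = 1 and p2.approved = 1 and not exists ("
            "  select 1 from seen_pairs as s where s.user_id = ? "
            "  and s.id1 = p1.id and s.id2 = p2.id) "
            "order by random() limit 1",
            (user_id,),
        ).fetchone()
        if row is None:
            return None  # this user has seen every pair
        id1, id2 = row
    conn.execute(
        "insert into seen_pairs (user_id, id1, id2) values (?, ?, ?)",
        (user_id, id1, id2),
    )
    return id1, id2


# Small demo with the 4 photos from the question, all approved.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "insert into photos (id, approved) values (?, 1)",
    [(i,) for i in range(1, 5)],
)
shown = [next_match(conn, user_id=1) for _ in range(7)]
print(shown)
```

With 4 photos the first six calls return the six distinct pairs in some random order, and the seventh returns None because nothing is left for that user. The seen_pairs table only grows with what users actually view, not with the number of possible combinations.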