Say I have three tables:
User Table
{
UserId INT,
Username NVARCHAR
...
}
Questions
{
QuestionId INT
QuestionText NVARCHAR
}
Answers
{
AnswerId INT,
QuestionId INT,
UserId INT,
Answer NVARCHAR
}
This structure is obviously overly simplified, but for the purpose of this example it should suffice.
What would be the best way to select users who have specific answers for specific questions, for instance – Assuming the tables are populated with the following data:
User Table
UserId Username ...
--------------------------------------------------------------------------------------------------------
1 User1 ...
2 User2 ...
3 User3 ...
4 User4 ...
5 User5 ...
6 User6 ...
7 User7 ...
8 User8 ...
9 User9 ...
10 User10 ...
... ... ...
etc
Questions Table
QuestionId QuestionText
--------------------------------------------------------------------------------------------------------
1 What is your favorite color?
2 What do you prefer cats or dogs?
3 Do you prefer if it is too hot or too cold?
4 What is your favorite season (Summer, Autumn (Fall), Winter, Spring)?
5 How Old Are you?
... ...
etc
Answers Table
AnswerId QuestionId UserId Answer
--------------------------------------------------------------------------------------------------------
1 1 1 Red
2 1 2 Red
3 1 3 Blue
4 1 4 Green
5 1 5 Black
6 2 6 Cats
7 2 1 Dogs
8 3 1 Too Cold
9 4 1 Spring
10 5 1 22
11 2 4 Dogs
12 3 4 Too Hot
13 3 3 Too Cold
14 5 6 46
15 1 8 Purple
If I wanted to select users who liked dogs and red or purple and under the age of 50 etc
Would the best (most efficient) way to do this be to have multiple joins from the user table to the answers table (one for each answer condition required)
For example:
If I were wanting to get the users who liked dogs and the color red I could use the following MSSQL:
SELECT *
FROM
Users
JOIN Answers As a1
ON Users.UserId = a1.UserId
JOIN Answers as a2
ON Users.UserId = a2.UserId
WHERE
(
a1.QuestionId = 1 AND
a1.Answer = 'Red'
) AND
(
a2.QuestionId = 2 AND
a2.Answer = 'Dogs'
)
There could be numerous answer conditions.
Basically the question I am asking is what is the best way to write a query where you have conditions on multiple rows from the same table with the same columns…
Sorry if this is confusing, feel free to ask any questions I will try and answer them as best as I can…
Thanks.
Your basic query looks just fine. As you get more elaborate, you would construct your WHERE clauses slightly differently depending on how your conditions need to be combined.
For example, in the example you provided, of either red or purple being acceptable answers, you could construct the WHERE clause like this:
It gets more complicated if only certain sets of answers are acceptable, so if ‘Red’ and ‘Dogs’ or ‘Purple’ and ‘Cats’ are acceptable, it would look more like this:
If your conditions get more complicated, you might want to read Dynamic Search Conditions in T-SQL. While your conditions aren’t dynamic, there’s a lot of useful information there.
Finally, since it’s easy to get confused with what question ID goes with what answers, particularly if they aren’t nice, human-recognizable values, it can help to use CTEs to pre-select the answers: