Something is wrong with my query results: the values displayed alongside each “QuestionContent” row are not correct.
For example, suppose the database table contains these rows (not all fields are shown):
QuestionContent   OptionType   QuestionMarks   Answer   ....// other fields
What is 2+2       A-D          1               D
What is 3+3       A-D          1               B
What is 4+4       A-E          2               C
Why, then, does my query display this instead?
QuestionContent   OptionType   QuestionMarks   Answer   ....// other fields
What is 2+2       A-D          1               D
What is 3+3       A-D          1               D
What is 4+4       A-D          1               D
Below is the query I am using. How can it be fixed so that it displays the correct result?
SELECT q.QuestionContent, o.OptionType, q.NoofAnswers, a.Answer, r.ReplyType,
q.QuestionMarks
FROM Answer a
INNER JOIN Question q ON a.QuestionId = q.QuestionId
JOIN Reply r ON q.ReplyId = r.ReplyId
JOIN Question qu ON r.ReplyId = qu.ReplyId
JOIN Option_Table o ON qu.OptionId = o.OptionId
GROUP BY q.QuestionContent
UPDATE: Below is the schema of the 4 tables:
Question Table:
    SessionId (PK)    Varchar(3)
    QuestionId (PK)   INT
    QuestionContent   Varchar(250)
    NoofAnswers       INT
    QuestionMarks     INT
    ReplyId (FK)      Varchar(2)
    OptionId (FK)     Varchar(2)
Answer Table:
    SessionId (PK)    Varchar(3)
    QuestionId (PK)   INT
    Answer            Varchar(10)
Option_Table Table:
    OptionId (PK)     Varchar(2)
    OptionType        Varchar(10)
Reply Table:
    ReplyId (PK)      Varchar(2)
    ReplyType         Varchar(10)
In the query, I want to display these fields:
QuestionContent
OptionType
NoofAnswers
Answer
ReplyType
QuestionMarks
I hope that is enough information; if not, please leave a comment 🙂
Given that the primary key of the Question table is SessionId + QuestionId, and that the primary key of the Answer table is also SessionId + QuestionId, you must specify the join on those two tables on both columns:
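A minimal sketch of that join, using the table and column names from the schema in the question. Only two result columns are shown here, which is my simplification to keep the join condition in focus:

```sql
-- Match Answer rows to Question rows on the full composite key,
-- not on QuestionId alone.
SELECT q.QuestionContent, a.Answer
  FROM Answer a
  JOIN Question q
    ON a.SessionId  = q.SessionId
   AND a.QuestionId = q.QuestionId;
```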
Without both, you end up with a Cartesian product effect when a question ID appears in more than one session.
Also, since you are not selecting SessionId, you will need to deal with duplicate results where two different sessions have the same question and answer information. I think SELECT DISTINCT is probably better than GROUP BY for the purpose. Reserve GROUP BY for when you have aggregates (such as COUNT(*) or SUM(expression)) and do not use it for general ‘duplicate elimination’.
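Putting both points together, the full query might look something like the sketch below. It assumes that Reply and Option_Table can be joined directly from the first Question alias, and that the six columns listed in the question are the complete result set; I have not run it against your data.

```sql
-- Composite-key join between Answer and Question, with DISTINCT
-- (rather than GROUP BY) removing rows that are identical across sessions.
SELECT DISTINCT
       q.QuestionContent, o.OptionType, q.NoofAnswers,
       a.Answer, r.ReplyType, q.QuestionMarks
  FROM Answer a
  JOIN Question q
    ON a.SessionId  = q.SessionId
   AND a.QuestionId = q.QuestionId
  JOIN Reply r        ON q.ReplyId  = r.ReplyId
  JOIN Option_Table o ON q.OptionId = o.OptionId;
```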
Original commentary
I believe there are two parts to your problem: one possibly not crucial, the other probably crucial.
Your query is, more or less:
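Reproduced here from the question, with only the layout changed:

```sql
SELECT q.QuestionContent, o.OptionType, q.NoofAnswers, a.Answer, r.ReplyType,
       q.QuestionMarks
  FROM Answer a
  JOIN Question q     ON a.QuestionId = q.QuestionId
  JOIN Reply r        ON q.ReplyId    = r.ReplyId
  JOIN Question qu    ON r.ReplyId    = qu.ReplyId   -- second reference to Question
  JOIN Option_Table o ON qu.OptionId  = o.OptionId
 GROUP BY q.QuestionContent;
```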
For some reason, you’ve listed Question twice in that query, but there’s no evidence that you need the second alias for it. On that basis alone, you can simplify the query to:
That is the possibly not crucial change; you appeared to have a 1:1 join. The optimizer might be able to ignore the superfluous second reference to the Question table, but it is better to leave out what is not used.
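A sketch of the simplified form, assuming the second alias (qu) was meant to refer to the same Question row as q, so that Option_Table can be joined through q.OptionId:

```sql
-- Same query with the second Question alias removed;
-- the GROUP BY is left in place at this step.
SELECT q.QuestionContent, o.OptionType, q.NoofAnswers, a.Answer, r.ReplyType,
       q.QuestionMarks
  FROM Answer a
  JOIN Question q     ON a.QuestionId = q.QuestionId
  JOIN Reply r        ON q.ReplyId    = r.ReplyId
  JOIN Option_Table o ON q.OptionId   = o.OptionId
 GROUP BY q.QuestionContent;
```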
The probably crucial issue is the GROUP BY clause. In most dialects of SQL, you’d need to list all of the non-aggregate selected values in the GROUP BY clause. In this query, where there are no aggregates shown, that would mean listing all 6 result columns. When you don’t do this, MySQL takes the values for the unlisted columns from an arbitrary row in each group, so what you see is somewhat random. My suspicion is that you need to reveal whether there are any aggregates in the full query, and explain what the GROUP BY is supposed to do for you. I’m not sure whether simply omitting it will give you the answer you need, or whether you need to do something else. One reason for the lack of certainty is the lack of schema in the question; we cannot tell all that much about the tables or the primary key and foreign key relationships between them.
So, I’d recommend using (trying):
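Roughly the following, which is the single-alias query with the GROUP BY clause dropped. It was written before the schema was posted, so it still joins Answer to Question on QuestionId alone:

```sql
-- No GROUP BY: every joined row is returned as-is,
-- so no column values are picked arbitrarily from a group.
SELECT q.QuestionContent, o.OptionType, q.NoofAnswers, a.Answer, r.ReplyType,
       q.QuestionMarks
  FROM Answer a
  JOIN Question q     ON a.QuestionId = q.QuestionId
  JOIN Reply r        ON q.ReplyId    = r.ReplyId
  JOIN Option_Table o ON q.OptionId   = o.OptionId;
```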
If that gives ‘the wrong number of rows’, then you need to explain the schema and the relationships between the rows.