I have a table called words, consisting of three columns: word (VARCHAR(16)), doc_id (INT), and weight (DOUBLE).
Here is what I need to do. I have two queries:
SELECT doc_id, weight FROM words WHERE word = 'bla';
doc_id weight
------ ------
1 0.14
2 0.61
3 0.32
and
SELECT doc_id, weight FROM words WHERE word = 'blabla';
doc_id weight
------ ------
2 0.19
3 0.45
4 0.14
I need to get the intersection of the two on doc_id and select the lower weight value as the weight, i.e. I want the results to be:
doc_id weight
------ ------
2 0.19
3 0.32
Is there a way to do that in a single query? Doing it in the program makes it damn slow!
I also need to get their UNION and select the higher weight value, i.e. I want the results to be:
doc_id weight
------ ------
1 0.14
2 0.61
3 0.45
4 0.14
Keep in mind that the columns word and doc_id are not unique, so one word can be assigned to many docs.
For the intersection part, it sounds like you want "the lowest weight for each doc_id that has one row for the word 'bla' AND one row for the word 'blabla'". That can be found by:
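One way to write the intersection, as a sketch rather than a definitive answer: group the rows for both words by doc_id, keep only the groups that contain both words, and take the smallest weight in each group. The output alias weight and the ORDER BY are illustrative choices, not something the question requires.

```sql
-- Intersection on doc_id, keeping the lower of the two weights.
SELECT doc_id, MIN(weight) AS weight
FROM words
WHERE word IN ('bla', 'blabla')
GROUP BY doc_id
-- Keep only docs that contain both words.
HAVING COUNT(DISTINCT word) = 2
ORDER BY doc_id;
```

On the sample data in the question this returns doc_id 2 with weight 0.19 and doc_id 3 with weight 0.32. For more than two words, the same pattern should work by extending the IN list and changing the 2 in the HAVING clause to the number of words.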
For the union part, what you want is "the highest weight for each doc_id that has one row for the word 'bla' OR one row for the word 'blabla'". That can be found by:
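A sketch of the union follows the same idea without the HAVING filter: every doc_id that has a row for either word forms a group, and MAX picks the higher weight. As above, the alias weight and the ORDER BY are illustrative choices.

```sql
-- Union on doc_id, keeping the higher weight where both words occur.
SELECT doc_id, MAX(weight) AS weight
FROM words
WHERE word IN ('bla', 'blabla')
GROUP BY doc_id
ORDER BY doc_id;
```

On the sample data this returns the four rows listed in the question: (1, 0.14), (2, 0.61), (3, 0.45), (4, 0.14). If the queries are slow, an index that starts with the word column is likely to help, though that depends on the database engine and the data.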