I am not able to explain my question in abstract terms. It is a very simple question, but I need to walk through a concrete example. It’s completely made up, and should therefore be comparable to similar applications.
We have a bunch of tables with information about users. All the tables are, I believe, normalized; some values are only references via IDs to other tables.
I am using MySQL (and PHP with the mysqli extension, in case that matters, which I doubt).
Here is an example of what I have:
table user_data
=====================================================
|| User_ID || Name || age || gender || location_ID ||
=====================================================
|| U000001 || Paul || 30 || m || L00001 ||
|| U000002 || John || 20 || m || L00001 ||
|| U000003 || Mike || 25 || m || L00002 ||
|| U000004 || Anna || 25 || f || L00003 ||
table user_personal_info
============================================
|| User_ID || color || food || profession||
============================================
|| U000001 || red || pizza || architect ||
|| U000002 || blue || pasta || policeman ||
|| U000003 || green || steak || plumber ||
|| U000004 || pink || salad || teacher ||
table locations
========================================================
|| location_ID || country || state || city ||
========================================================
|| L00001 || USA || New York || New York ||
|| L00002 || USA || New York || Buffalo ||
|| L00003 || USA || California || Sacramento ||
|| L00004 || Canada || Ontario || Toronto ||
|| L00005 || Canada || Quebec || Montreal ||
table user_activities
=========================================
|| activity_ID || user_ID || priority ||
=========================================
|| A0003 || U000001 || 5 ||
|| A0005 || U000001 || 4 ||
|| A0004 || U000002 || 2 ||
|| A0006 || U000002 || 1 ||
|| A0001 || U000003 || 3 ||
|| A0002 || U000004 || 4 ||
|| A0001 || U000004 || 1 ||
|| A0003 || U000004 || 5 ||
table activities
=================================
|| activity_ID || description ||
=================================
|| A0001 || surfing ||
|| A0002 || exercising ||
|| A0003 || baseball ||
|| A0004 || theater ||
|| A0005 || dancing ||
|| A0006 || reading ||
Alrighty, you get the concept, right?
To DISPLAY each entry, I run the following MySQL statement and then loop through the result set in PHP:
SELECT * FROM user_data
JOIN user_personal_info USING (User_ID)
In order to also display their favorite activities, I have to run this for each user:
SELECT * FROM user_activities
WHERE user_ID = ? -- placeholder bound to the current user's ID
Of course, I have to translate what the activity ID and the location ID stand for with additional queries…
(By the way: does anyone have a better suggestion for how to display all users and all fields associated with them, rather than doing two queries?)
Now I want to build a thorough search function to find very specific users.
I would know how to filter my results using PHP, but that would require me to download the entire DB first, which would probably take very long once a couple thousand users are in the DB.
I know how to find users who are male, female, or both, who like food or a color, who are from a specific location (location_ID=L00001 or so)…
I know how to assign rules about ages (=, >, <…). I know about the LIKE %?% parameter.
My question is:
How would I find all users from a certain country or a certain state?
*How do I ask MySQL to only show those users whose location_ID matches one from an array of location_IDs?*
How do I find all users with one AND/OR more specific activities?
How do I ask MySQL to only show those users whose array of activities contains at least all activities from an array (that would be the AND version)?
*How do I ask MySQL to only show those users whose array of activities contains at least one of the activities from an array (that would be the OR version)?*
And now the really important question is:
How do I combine those statements with my normal statements from above?
Meaning: How would I find all users from NEW YORK STATE who are into SURFING and who are MALE and who like PIZZA?
or
How would I find all users from USA who are into READING, and DANCING and who are OVER 30 and who like GREEN?
or
How would I find all users from SACRAMENTO, CA who are PLUMBERS and FEMALE?
etc. etc. the examples are obviously endless!
I am sure someone will just be able to tell me “you should research this keyword”. But because I am unable to express my question in a concise manner, I wasn’t successful in finding much information…
UPDATE:
Thanks for the answer. There were a couple of useful things I was pointed to, here is a summary of the things I didn’t know but do now:
- utilizing JOIN more effectively
- the IN operator
- the GROUP BY operator combined with HAVING COUNT()
- and SUB SELECTS
Thanks for pointing those things out to me! 🙂
Well, I think one of the keywords you’re looking for is the IN operator. A condition such as country IN (...) returns all the rows where the country field matches one of the values in the IN clause, so it’s like writing a chain of country = ... comparisons joined with OR.
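As a minimal sketch of that equivalence, using the locations table from the question (the literal values are only illustrative), the two statements below should return the same rows. The third shows the same operator applied to a list of location IDs, which is what an "array of location_IDs" turns into on the SQL side.

```sql
-- IN: match any value in the list
SELECT location_ID, country, state, city
FROM locations
WHERE country IN ('USA', 'Canada');

-- The same condition written out with OR
SELECT location_ID, country, state, city
FROM locations
WHERE country = 'USA' OR country = 'Canada';

-- Users whose location_ID is one of a given set of IDs
SELECT User_ID, Name
FROM user_data
WHERE location_ID IN ('L00001', 'L00002');
```

With the sample data, the last query returns Paul, John and Mike.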
As for the rest of your questions:
Does anyone have a better suggestion for how to display all users and all fields associated with them, rather than doing two queries?
Simply join them all together in a single query.
Of course, depending on your structure you’d use LEFT JOIN or RIGHT JOIN, etc. It’s also not good practice to simply retrieve all data with SELECT *; select only the fields you really need.
Furthermore, you could (or should) create one or more views representing the joined data you need and select from them.
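A sketch of such a single joined query, assuming the table and column names from the question (the aliases u, p, l, ua and a are just shorthand introduced here). LEFT JOIN is used for the activity tables on the assumption that a user without any activities should still be listed.

```sql
SELECT u.User_ID, u.Name, u.age, u.gender,
       p.color, p.food, p.profession,
       l.country, l.state, l.city,
       a.description AS activity, ua.priority
FROM user_data AS u
JOIN user_personal_info AS p ON p.User_ID = u.User_ID
JOIN locations AS l ON l.location_ID = u.location_ID
-- LEFT JOIN keeps users that have no rows in user_activities
LEFT JOIN user_activities AS ua ON ua.user_ID = u.User_ID
LEFT JOIN activities AS a ON a.activity_ID = ua.activity_ID
ORDER BY u.User_ID, ua.priority;
```

Note that this yields one row per user and activity, so a user with three activities appears three times. If one row per user is preferred, MySQL's GROUP_CONCAT together with GROUP BY u.User_ID is one way to collapse the activities into a single column.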
How would I find all users from a certain country or a certain state?
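That depends on how you get the data from the user and how you prepare it for your statement in PHP. For instance, assuming your user searches for a country and you receive it via a POST request, you would join user_data against locations and compare the country column with the submitted value.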
The same principle goes for state, of course.
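The SQL side of this could look like the sketch below. The literal 'USA' stands in for whatever the user submitted; in PHP it would be safer to put a ? placeholder there and bind the value through a mysqli prepared statement than to concatenate the POST value into the string.

```sql
-- All users from a given country
SELECT u.User_ID, u.Name, l.country, l.state, l.city
FROM user_data AS u
JOIN locations AS l ON l.location_ID = u.location_ID
WHERE l.country = 'USA';

-- All users from a given state: same join, different column
SELECT u.User_ID, u.Name, l.country, l.state, l.city
FROM user_data AS u
JOIN locations AS l ON l.location_ID = u.location_ID
WHERE l.state = 'New York';
```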
How do I find all users with one AND/OR more specific activities?
Here you would need to join against the activities table and use the IN operator as shown above.
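For the OR version (users with at least one of the listed activities), a sketch under the question's schema might be:

```sql
-- DISTINCT because a user matching several listed activities
-- would otherwise appear once per matching activity
SELECT DISTINCT u.User_ID, u.Name
FROM user_data AS u
JOIN user_activities AS ua ON ua.user_ID = u.User_ID
JOIN activities AS a ON a.activity_ID = ua.activity_ID
WHERE a.description IN ('surfing', 'reading');
```

With the sample data this returns John (reading) plus Mike and Anna (surfing).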
How do I combine those statements with my normal statements from above?
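Taking your example, “How would I find all users from NEW YORK STATE who are into SURFING and who are MALE and who like PIZZA?”: you join all the tables involved and combine the individual conditions with AND in the WHERE clause.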
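A sketch of that combined query, again assuming the table and column names from the question:

```sql
SELECT DISTINCT u.User_ID, u.Name
FROM user_data AS u
JOIN user_personal_info AS p ON p.User_ID = u.User_ID
JOIN locations AS l ON l.location_ID = u.location_ID
JOIN user_activities AS ua ON ua.user_ID = u.User_ID
JOIN activities AS a ON a.activity_ID = ua.activity_ID
WHERE l.state = 'New York'            -- from New York State
  AND a.description IN ('surfing')    -- into surfing
  AND u.gender = 'm'                  -- male
  AND p.food = 'pizza';               -- likes pizza
```

In the sample data nobody satisfies all four conditions (Mike is the only male surfer in New York State, and his food is steak), so the result is empty. Dropping the food condition would return Mike.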
Hope this helps and gets you in the right direction.
UPDATE
Of course the IN operator here could be substituted with description = 'surfing', since it’s only one value. And you’re right: if you add another value, like description IN ('surfing', 'reading'), it would mean surfing OR reading. So if you want to get all the users who are into surfing AND reading, I guess I’d do it with a sub-select that groups the matches by user ID and keeps only the users with a count of 2.
So the sub-select means: count each user ID that appears with ‘surfing’ or ‘reading’, and if the count is equal to 2 (meaning they match both), retrieve the user ID.
And the outer select simply selects the data from every user of the subset.
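One possible shape for that query is sketched below; it is not tested against a real MySQL server. COUNT(DISTINCT ...) is used here as a guard in case the same activity is stored twice for a user, which is an assumption about the data rather than something the question states.

```sql
SELECT u.User_ID, u.Name
FROM user_data AS u
WHERE u.User_ID IN (
    -- users that have BOTH listed activities
    SELECT ua.user_ID
    FROM user_activities AS ua
    JOIN activities AS a ON a.activity_ID = ua.activity_ID
    WHERE a.description IN ('surfing', 'reading')
    GROUP BY ua.user_ID
    HAVING COUNT(DISTINCT a.activity_ID) = 2   -- 2 = number of values in the IN list
);
```

The number in the HAVING clause has to equal the number of activities in the IN list. In the sample data no user has both surfing and reading, so the result is empty; with ('surfing', 'baseball') it would return Anna.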
Now, I didn’t test this so it may vary. And there are probably simpler ways. At least something you could do to simplify this query is to create a view as I mentioned before and select from it.
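As a sketch of the view idea: user_overview is a made-up name, and the column list is just one plausible choice. The view hides the joins for the one-to-one tables, so searches on those fields become a plain WHERE clause.

```sql
-- Hypothetical view combining the per-user tables
CREATE VIEW user_overview AS
SELECT u.User_ID, u.Name, u.age, u.gender,
       p.color, p.food, p.profession,
       l.country, l.state, l.city
FROM user_data AS u
JOIN user_personal_info AS p ON p.User_ID = u.User_ID
JOIN locations AS l ON l.location_ID = u.location_ID;

-- "Users from Sacramento, CA who are plumbers and female"
SELECT User_ID, Name
FROM user_overview
WHERE city = 'Sacramento'
  AND state = 'California'
  AND profession = 'plumber'
  AND gender = 'f';
```

Activities are left out of the view on purpose, since they are one-to-many; activity filters can still be added with a sub-select on User_ID like the one above.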