I’m having trouble with a MySQL query.
We have a table that tracks the number of times a user clicks. The count is stored in a single integer field that increases by 1 with each click.
Schema:
+------------+--------------+------+-----+
| Field      | Type         | Null | Key |
+------------+--------------+------+-----+
| id         | int(11)      | NO   | PRI |
| arrival_id | int(11)      | NO   |     |
| clickouts  | int(11)      | NO   |     |
+------------+--------------+------+-----+
Example Data:
id  arrival_id  clickouts
5   22          0
6   23          1
7   24          7
If I want to determine the percentage of arrivals that generated at least one click, I can write the following query. (Note – not every arrival has a clickout, hence the LEFT JOIN):
SELECT SUM(click_tracker.clickouts > 0)/COUNT(DISTINCT arrivals.id)
FROM arrivals
LEFT JOIN click_tracker ON arrivals.id = click_tracker.arrival_id;
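To make the behaviour of this query concrete, here is a minimal sketch that runs it against the example data. It uses Python's built-in sqlite3 module rather than MySQL purely so it can be run anywhere. The contents of the arrivals table are an assumption (the post never shows them), and arrival 25 is a hypothetical arrival with no click_tracker row. The `1.0 *` factor is only needed because SQLite divides integers as integers; MySQL's `/` already returns a decimal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE arrivals (id INTEGER PRIMARY KEY);
    CREATE TABLE click_tracker (
        id INTEGER PRIMARY KEY,
        arrival_id INTEGER NOT NULL,
        clickouts INTEGER NOT NULL
    );
    -- Assumed arrivals: 22-24 match the example data, 25 has no clickout row.
    INSERT INTO arrivals (id) VALUES (22), (23), (24), (25);
    INSERT INTO click_tracker (id, arrival_id, clickouts)
    VALUES (5, 22, 0), (6, 23, 1), (7, 24, 7);
""")

# Same query as in the post; (clickouts > 0) evaluates to 1, 0, or NULL,
# and SUM ignores the NULL produced for arrival 25.
click_pct = conn.execute("""
    SELECT 1.0 * SUM(click_tracker.clickouts > 0) / COUNT(DISTINCT arrivals.id)
    FROM arrivals
    LEFT JOIN click_tracker ON arrivals.id = click_tracker.arrival_id
""").fetchone()[0]

print(click_pct)  # 0.5: arrivals 23 and 24 clicked, out of 4 arrivals
```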
The issue comes in when I also LEFT JOIN arrivals to another table (not click_tracker) during the same query.
I have another table called advertisements that tracks the various types of ads a user sees. The ads are tied to an arrival, and an arrival can have many ads. The schema is as follows:
+------------+--------------+------+-----+
| Field      | Type         | Null | Key |
+------------+--------------+------+-----+
| id         | int(11)      | NO   | PRI |
| arrival_id | int(11)      | NO   |     |
| ad_type    | varchar(255) | NO   |     |
+------------+--------------+------+-----+
Now, I want to find the percentage of arrivals with ad_type = “ZXY” along with the percentage of arrivals that generated at least one click (regardless of the ad_type). Doing the following won’t work:
SELECT COUNT(DISTINCT advertisements.id)/COUNT(DISTINCT arrivals.id) AS ad_pct,
SUM(click_tracker.clickouts > 0)/COUNT(DISTINCT arrivals.id) AS click_pct
FROM arrivals
LEFT JOIN advertisements ON arrivals.id = advertisements.arrival_id
LEFT JOIN click_tracker ON arrivals.id = click_tracker.arrival_id
The LEFT JOIN of the advertisements table creates a result set with duplicate arrival ids, one row per ad. Thus, when I subsequently LEFT JOIN onto the click_tracker table, each click_tracker row is repeated once per ad as well.
I need a way to do SUM(click_tracker.clickouts > 0) for distinct click_tracker rows only. I tried SUM(DISTINCT click_tracker.clickouts > 0) but that did not give me the correct results.
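The duplication can be reproduced with a small sketch. As before, this uses Python's sqlite3 instead of MySQL so it is runnable as-is, and the arrivals and advertisements rows are made-up sample data (arrival 23 is given three ads to force the fan-out). It also shows why SUM(DISTINCT ...) does not help: DISTINCT deduplicates the values 0 and 1, not the click_tracker rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE arrivals (id INTEGER PRIMARY KEY);
    CREATE TABLE click_tracker (
        id INTEGER PRIMARY KEY, arrival_id INTEGER NOT NULL, clickouts INTEGER NOT NULL
    );
    CREATE TABLE advertisements (
        id INTEGER PRIMARY KEY, arrival_id INTEGER NOT NULL, ad_type TEXT NOT NULL
    );
    -- Hypothetical sample rows; only the click_tracker rows come from the post.
    INSERT INTO arrivals (id) VALUES (22), (23), (24), (25);
    INSERT INTO click_tracker (id, arrival_id, clickouts)
    VALUES (5, 22, 0), (6, 23, 1), (7, 24, 7);
    INSERT INTO advertisements (id, arrival_id, ad_type)
    VALUES (1, 22, 'ZXY'), (2, 23, 'ZXY'), (3, 23, 'ABC'), (4, 23, 'ZXY');
""")

joins = """
    FROM arrivals
    LEFT JOIN advertisements ON arrivals.id = advertisements.arrival_id
    LEFT JOIN click_tracker ON arrivals.id = click_tracker.arrival_id
"""

# Arrival 23 appears three times (once per ad), so its click is counted three times.
row_count, inflated = conn.execute(
    "SELECT COUNT(*), SUM(click_tracker.clickouts > 0)" + joins
).fetchone()

# DISTINCT collapses the boolean values {0, 1}, so the sum can never exceed 1.
distinct_sum = conn.execute(
    "SELECT SUM(DISTINCT click_tracker.clickouts > 0)" + joins
).fetchone()[0]

print(row_count, inflated, distinct_sum)  # 6 4 1, while the true count is 2
```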
I hate to say it, but those two calculations are completely independent, and I think you’ll need to execute two queries for them. Here they are (and an sqlfiddle with the data I used):
Percentage of arrivals by ad_type:
(Or you could change the GROUP BY to a WHERE clause for a single ad_type.)
Percentage of arrivals with at least one click:
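The queries themselves did not survive in this copy of the answer. What follows is one possible sketch of the two-query approach, not the answerer's original code. It again runs on SQLite through Python's sqlite3 so it can be executed directly; the sample arrivals and advertisements rows are assumptions, and the `1.0 *` factor exists only to avoid SQLite's integer division (MySQL does not need it).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE arrivals (id INTEGER PRIMARY KEY);
    CREATE TABLE click_tracker (
        id INTEGER PRIMARY KEY, arrival_id INTEGER NOT NULL, clickouts INTEGER NOT NULL
    );
    CREATE TABLE advertisements (
        id INTEGER PRIMARY KEY, arrival_id INTEGER NOT NULL, ad_type TEXT NOT NULL
    );
    -- Hypothetical sample rows; only the click_tracker rows come from the question.
    INSERT INTO arrivals (id) VALUES (22), (23), (24), (25);
    INSERT INTO click_tracker (id, arrival_id, clickouts)
    VALUES (5, 22, 0), (6, 23, 1), (7, 24, 7);
    INSERT INTO advertisements (id, arrival_id, ad_type)
    VALUES (1, 22, 'ZXY'), (2, 23, 'ZXY'), (3, 23, 'ABC'), (4, 23, 'ZXY');
""")

# Query 1: share of all arrivals that saw each ad_type.
# COUNT(DISTINCT arrival_id) keeps an arrival with several ads of one type
# from being counted more than once.
ad_pct = dict(conn.execute("""
    SELECT ad_type,
           1.0 * COUNT(DISTINCT arrival_id) / (SELECT COUNT(*) FROM arrivals)
    FROM advertisements
    GROUP BY ad_type
""").fetchall())

# Query 2: share of arrivals with at least one click, with no join to
# advertisements, so click_tracker rows are not duplicated.
click_pct = conn.execute("""
    SELECT 1.0 * SUM(click_tracker.clickouts > 0) / COUNT(DISTINCT arrivals.id)
    FROM arrivals
    LEFT JOIN click_tracker ON arrivals.id = click_tracker.arrival_id
""").fetchone()[0]

print(ad_pct["ZXY"], click_pct)  # 0.5 0.5
```

For a single ad type, the GROUP BY in the first query could be swapped for `WHERE ad_type = 'ZXY'`, as the note above suggests.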