I don't know how to write the title for this question, but what I need is a query that returns the position (the Nth record) at which a specific value occurs.
The table I have contains over 5.2M records.
Each record looks like this:
- session (string, primary indexed)
- customer_id (int, indexed)
- clicks (int, indexed)
- order_number (int, indexed)
- date_entry (datetime, indexed)
- many other fields
What I need to know is how many times the same customer logged into the site (in different sessions) before placing an order (order_number is 0 unless an order is placed during that session).
Sample data (simplified):
session | c_id | clicks | ord_num | entry               |
abc     | 123  | 2      | 0       | 2012-08-01 00:00:00 |
cde     | 456  | 2      | 0       | 2012-08-01 00:00:01 |
efg     | 457  | 2      | 0       | 2012-08-01 00:00:02 |
hij     | 123  | 5      | 0       | 2012-08-01 00:00:03 |
kod     | 986  | 10     | 0       | 2012-08-01 00:00:04 |
wdg     | 123  | 2      | 9876    | 2012-08-01 00:00:05 |
qwe     | 123  | 2      | 0       | 2012-08-01 00:00:06 |
wvr     | 986  | 12     | 8656    | 2012-08-01 00:00:07 |
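For anyone who wants to reproduce this, the sample rows can be loaded with something like the script below. This is only a sketch: the column types, index names, and the default for `subsite_id` are assumptions made for illustration (the real table has many more fields), and the table name is the one used in my existing query.

```sql
-- Minimal reproduction of the sample data.
-- Column types and index names are assumed, not taken from the real schema.
CREATE TABLE customer_activity_log (
  `session`      VARCHAR(32) NOT NULL PRIMARY KEY,
  `customer_id`  INT NOT NULL,
  `clicks`       INT NOT NULL,
  `order_number` INT NOT NULL DEFAULT 0,  -- 0 means no order in this session
  `date_entry`   DATETIME NOT NULL,
  `subsite_id`   INT NOT NULL DEFAULT 0,  -- assumed default, only so the existing query runs
  KEY idx_customer (`customer_id`),
  KEY idx_clicks (`clicks`),
  KEY idx_order (`order_number`),
  KEY idx_entry (`date_entry`)
);

INSERT INTO customer_activity_log
  (`session`, `customer_id`, `clicks`, `order_number`, `date_entry`)
VALUES
  ('abc', 123,  2,    0, '2012-08-01 00:00:00'),
  ('cde', 456,  2,    0, '2012-08-01 00:00:01'),
  ('efg', 457,  2,    0, '2012-08-01 00:00:02'),
  ('hij', 123,  5,    0, '2012-08-01 00:00:03'),
  ('kod', 986, 10,    0, '2012-08-01 00:00:04'),
  ('wdg', 123,  2, 9876, '2012-08-01 00:00:05'),
  ('qwe', 123,  2,    0, '2012-08-01 00:00:06'),
  ('wvr', 986, 12, 8656, '2012-08-01 00:00:07');
```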
What I want is a query that shows something similar to:
- entry – date entry
- tot_sess – total number of sessions
- tot_cust – total number of customers
- 1sess – customers with only one session
- 2sess – customers with 2 sessions
- 3sess – customers with 3 sessions
- 4sess – customers with 4 sessions
- more4sess – customers with more than 4 sessions
- order1sess – customers that ordered on the first session
- order2sess – customers that ordered on the second session
- order3sess – customers that ordered on the third session
- order4sess – customers that ordered on the fourth session
- orderMore4Sess – customers that ordered after the fourth session
entry | tot_sess | tot_cust | 1sess | 2sess | 3sess | 4sess | more4sess | order1sess | order2sess | order3sess | order4sess | orderMore4Sess |
2012-08-01 | 8 | 4 | 2 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 |
I am already able to get the session information with the following query:
SELECT
    t.`date_entry`,
    COUNT(sess) `cust`,
    SUM(sess) `session`,
    COUNT(IF(sess>1,sess,NULL)) `more than once`,
    COUNT(IF(sess=1,sess,NULL)) `one`,
    COUNT(IF(sess=2,sess,NULL)) `two`,
    COUNT(IF(sess=3,sess,NULL)) `three`,
    COUNT(IF(sess=4,sess,NULL)) `four`,
    COUNT(IF(sess>4,sess,NULL)) `more`,
    ROUND(COUNT(IF(sess>1,sess,NULL))/COUNT(sess),2) `perc > 1`,
    ROUND(COUNT(IF(sess>2,sess,NULL))/COUNT(sess),2) `perc > 2`,
    ROUND(COUNT(IF(sess>3,sess,NULL))/COUNT(sess),2) `perc > 3`,
    ROUND(COUNT(IF(sess>4,sess,NULL))/COUNT(sess),2) `perc > 4`
FROM
(
    -- one row per customer per calendar day, with that day's session count
    SELECT
        `customer_id`,
        COUNT(`session`) `sess`,
        DATE(`date_entry`) `date_entry`
    FROM `customer_activity_log`
    WHERE
        `clicks` > 1
        AND `customer_id` > 0
        AND `date_entry` > '2012-08-01'
        AND `subsite_id` <= 1
    -- group on the day explicitly: a bare `date_entry` here can resolve to the raw datetime column rather than the alias
    GROUP BY DATE(`date_entry`), `customer_id`
) t
GROUP BY t.`date_entry`
Once I have that, I will also need to look at the data in a different way. For example, if customer 123 first showed up on 2012-01-01, came back 15 times and placed an order on 2012-08-01, then came back 5 more times and placed another order on 2012-10-12, I will need a query that is not restricted by date but only by customer. In other words, the date_entry restriction will be removed.
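To make this second requirement concrete, here is a rough sketch of the number I am after for each order: for every row that carries an order, count that customer's sessions up to and including it, with no date filter. It assumes the count does not reset after an earlier order and that a customer never has two sessions with the same date_entry (ties would be over-counted). The alias `session_no` is made up for illustration, the `clicks` and `subsite_id` filters from the query above are left out, and nothing here is tuned for a table of this size.

```sql
-- For each order, the ordinal of the session in which it was placed,
-- counted over the customer's whole history (no date restriction).
SELECT
    o.`customer_id`,
    o.`order_number`,
    o.`date_entry`,
    (
        -- sessions by the same customer up to and including the ordering one
        SELECT COUNT(*)
        FROM `customer_activity_log` s
        WHERE s.`customer_id` = o.`customer_id`
          AND s.`date_entry` <= o.`date_entry`
    ) AS session_no
FROM `customer_activity_log` o
WHERE o.`order_number` > 0
ORDER BY o.`customer_id`, o.`date_entry`;
```

On the sample data this gives session_no = 3 for customer 123 (order 9876) and session_no = 2 for customer 986 (order 8656), which lines up with order3sess = 1 and order2sess = 1 in the expected output.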
I hope this makes sense.
See it on sqlfiddle.