I have been importing my raw IIS log files into a SQL Server table using the
Log Parser tool for several months now. This is done to enable the writing of
SSRS reports based on this log data.
One of the reports that I am working on is supposed to determine the number of
Visits from each unique IP address. A Visit is defined as an IP address hitting
a page on the site and then making 4 more requests within an hour of each
other. All of the 5 requests are within one Visit to the site. Later on that
night the same IP address hits the site, except that now it’s 3 hours later, so
we count this new activity from the same IP as a new Visit. Here is a sample of
the data:
IPAddress, RequestDateTime, UriStem
10.1.1.100, 2010-10-15 13:30:30, /
10.1.1.100, 2010-10-15 13:30:31, /style.css
10.1.1.100, 2010-10-15 13:30:31, /script.js
10.1.1.100, 2010-10-15 13:30:32, /funny.gif
10.1.1.100, 2010-10-15 13:30:33, /picture.jpg
10.1.1.101, 2010-10-15 13:40:50, /page2.html
10.1.1.101, 2010-10-15 13:40:51, /style.css
10.1.1.102, 2010-10-15 14:10:20, /page4.html
10.1.1.102, 2010-10-15 14:10:21, /style.css
10.1.1.100, 2010-10-15 16:55:10, /
10.1.1.100, 2010-10-15 16:55:11, /style.css
10.1.1.100, 2010-10-15 16:55:11, /script.js
10.1.1.100, 2010-10-15 16:55:12, /funny.gif
10.1.1.100, 2010-10-15 16:55:13, /picture.jpg
By looking at the data above I can easily discern that the 10.1.1.100 IP
address has visited the site twice and had 5 hits on each visit. However, I am
at a loss as to how to express that in SQL code. Is there an easy way to group
and count these date ranges by IP address?
I understand that this information can be captured by using tools such as
AWStats, but I do not have the luxury of being able to install Perl on the
systems we use.
Give the code below a trial run. The code groups and numbers the visits from each IP address. Then it looks to see how many “uristem” hits compared with the “threshold” value. I tested the code on a table named “Foo” and you need to check your table and column names prior to running the test.
The output of my test on SQL 2008 is