Let’s say I have the following table:
CREATE TABLE `occurences` (
`object_id` int(10) NOT NULL,
`seen_timestamp` int(10) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
which contains the ID of an object (not unique, it repeats) and the timestamp at which this object ID was observed.
Observation runs 24/7 and inserts every occurrence of an object ID with the current timestamp.
Now I want to write a query that selects all object IDs which have been seen at least 7 times during any 10-minute period.
It should work like intrusion detection.
A similar algorithm is used in the DenyHosts script, which checks for invalid SSH logins.
If it finds the configured number of occurrences within the configured time period, it blocks the IP.
Any good suggestions?
This should work:
You can move @num_occurences and the period value into your code and pass them as parameters of your statement. Depending on your client, you can also move the initialisation of @rownum_start and @rownum_end in front of the query, which might improve query performance (you should test that nonetheless; it is just a gut feeling from looking at the EXPLAIN output of both versions).
Here’s how it works:
It selects the entire table twice and joins each row of offset_start with the row in offset_end which has an offset of @num_occurences. (This is done using the @rownum_* variables to create the index of each row, simulating the row_number() functionality known from other RDBMSs.)
Then it just checks whether the two rows refer to the same object_id and satisfy the period requirement.
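The pairing described above can be sketched outside the database to make the logic easier to follow. This is only an illustration in Python, not the query itself: the function name windows_hit, the tuple layout of rows, and the choice of an offset of num_occurences - 1 (so that a pair spans exactly num_occurences rows) are assumptions.

```python
def windows_hit(rows, num_occurences=7, period=600):
    """Return the object_id of every row that starts a qualifying window.

    rows is an iterable of (object_id, seen_timestamp) tuples.
    The same object_id can appear several times in the result.
    """
    # Sort by object, then by time, so each object's sightings are contiguous.
    ordered = sorted(rows)
    # Assumption: pairing row i with row i + (n - 1) spans exactly n sightings.
    offset = num_occurences - 1
    hits = []
    for i in range(len(ordered) - offset):
        start_id, start_ts = ordered[i]
        end_id, end_ts = ordered[i + offset]
        # Same object at both ends, and the span fits inside the period.
        if start_id == end_id and end_ts - start_ts <= period:
            hits.append(start_id)
    return hits
```

Because the rows are sorted by object and time, a matching pair guarantees that all rows between them belong to the same object and fall within the period, so no counting is needed.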
Since this is done for every occurrence row, the object_id would be returned multiple times if the number of occurrences is actually larger than @max_occurences, so the result is grouped at the end to make the returned object_ids unique.
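For illustration only, the whole approach can be sketched with a real row_number() instead of the @rownum_* variables. This is not the query from the answer: it assumes SQLite 3.25 or later driven from Python's sqlite3 module, and the function find_frequent_objects and the :num_occurences and :period parameter names are hypothetical. MySQL 8.0 and later also support ROW_NUMBER() and common table expressions, so the SQL should carry over with little change, though that is untested here.

```python
import sqlite3

# Number all rows by (object_id, seen_timestamp), join each row to the row
# that lies num_occurences - 1 positions later, keep pairs that belong to the
# same object and fit inside the period, then group to remove duplicates.
QUERY = """
WITH numbered AS (
    SELECT object_id,
           seen_timestamp,
           ROW_NUMBER() OVER (ORDER BY object_id, seen_timestamp) AS rn
    FROM occurences
)
SELECT s.object_id
FROM numbered AS s
JOIN numbered AS e ON e.rn = s.rn + :num_occurences - 1
WHERE e.object_id = s.object_id
  AND e.seen_timestamp - s.seen_timestamp <= :period
GROUP BY s.object_id
ORDER BY s.object_id
"""


def find_frequent_objects(conn, num_occurences=7, period=600):
    """Return the distinct object_ids seen num_occurences times within period seconds."""
    params = {"num_occurences": num_occurences, "period": period}
    return [row[0] for row in conn.execute(QUERY, params)]
```

Called with a connection to a database that holds the occurences table, find_frequent_objects(conn) returns each qualifying object_id once, however many overlapping windows it matched.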