How can I best write an application that sits in front of a generic SQL database (SQL Server, MySQL, Oracle, etc.) and listens to SQL queries?
The application needs to be able to intercept (prevent passing to the SQL database) or pass (send to SQL database) the query, based on the specific query type.
Is there a way to do this generically so that it is not tied to a specific database backend?
The basic system isn’t particularly easy, though neither is it incredibly difficult. You create a daemon that listens on a port (or a set of ports) for connection attempts. It accepts those connections, then establishes its own connection to the DBMS, forming a man-in-the-middle relay/interception point. The major issues are in how to configure things so that:
You can still run into issues, though. Most notably, if the GSL (the interception daemon described above) is on the same machine as the DBMS listener, then when the GSL connects to the DBMS, the connection looks local rather than remote to the DBMS. If the GSL is on a different machine, then all connections appear to come from the machine where the GSL is running.
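To make the relay idea concrete, here is a minimal sketch in Python of a daemon that accepts client connections, opens its own connection to the DBMS, and copies bytes in both directions while recording them. It is not tied to any DBMS protocol and does not try to parse anything. The names start_relay, pump and the log callback are made up for this illustration. Because the DBMS only ever sees the relay's address, the sketch passes the real client address to the log callback so that it is at least recorded somewhere.

```python
import asyncio


async def pump(reader, writer, record):
    """Copy bytes from reader to writer until EOF, recording each chunk."""
    try:
        while True:
            chunk = await reader.read(65536)
            if not chunk:  # EOF: the sending side closed its connection
                break
            record(chunk)
            writer.write(chunk)
            await writer.drain()
    except ConnectionError:
        pass  # the other side went away mid-transfer; just stop relaying
    finally:
        writer.close()


async def start_relay(listen_host, listen_port, dbms_host, dbms_port, log=print):
    """Listen for clients and relay each connection to the DBMS.

    log is a hypothetical callback taking (peer_address, direction, chunk).
    Returns the asyncio server object so the caller decides how long to run it.
    """

    async def on_client(client_reader, client_writer):
        # The DBMS will see the relay's address, so keep the real one here.
        peer = client_writer.get_extra_info("peername")
        try:
            dbms_reader, dbms_writer = await asyncio.open_connection(
                dbms_host, dbms_port
            )
        except OSError:
            client_writer.close()  # DBMS unreachable: drop the client
            return
        await asyncio.gather(
            pump(client_reader, dbms_writer,
                 lambda c: log(peer, "client->dbms", c)),
            pump(dbms_reader, client_writer,
                 lambda c: log(peer, "dbms->client", c)),
        )

    return await asyncio.start_server(on_client, listen_host, listen_port)
```

To run it, you would await start_relay(...) with whatever listen address and DBMS address apply to your setup, then await serve_forever() on the returned server. Clients then have to be pointed at the relay's port instead of the DBMS's port.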
Additionally, if the information is being sent encrypted, then your GSL can only intercept the encrypted traffic. If the encryption is any good, you won’t be able to read or log the queries. You may be able to handle Diffie-Hellman exchanges, but you need to know what the hell you’re up to, and what the DBMS you’re intercepting is up to, and you probably need to get buy-in from the clients that they’ll go through your middleman system. Of course, if the ‘clients’ are web servers under your control, you can deal with all this.
The details of the connection tend to be straightforward enough as long as your code is simply transmitting and logging the queries. Each DBMS has its own protocol for how SQL requests are handled, and intercepting and modifying or rejecting operations will require an understanding of each DBMS’s protocol.
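Once the protocol-specific part has pulled the SQL text out of a request, the pass-or-intercept decision itself can be kept generic. The following is a rough sketch of that decision only, assuming you already have the statement as a string; getting that string out of the wire format is the DBMS-specific work and is not shown. The function names and the BLOCKED set are invented for the example, and the keyword check is a naive heuristic, not a SQL parser.

```python
import re

# Hypothetical policy: statement types the middleman refuses to forward.
BLOCKED = {"DROP", "TRUNCATE", "GRANT"}

# Skip leading whitespace, "-- line" comments and "/* block */" comments,
# then capture the first keyword. Dialect-specific comment styles (such as
# MySQL's "#") are not handled here.
_FIRST_KEYWORD = re.compile(r"(?:\s|--[^\n]*|/\*.*?\*/)*([A-Za-z]+)", re.DOTALL)


def statement_type(sql):
    """Return the leading keyword of a statement in upper case, or ''."""
    match = _FIRST_KEYWORD.match(sql)
    return match.group(1).upper() if match else ""


def decide(sql, blocked=BLOCKED):
    """Return 'block' to intercept the statement, 'pass' to forward it.

    Only the first statement in the text is inspected, so a batch such as
    "SELECT 1; DROP TABLE t" would need to be split before calling this.
    """
    return "block" if statement_type(sql) in blocked else "pass"
```

When a statement is blocked, the daemon cannot simply drop the bytes; the client is waiting for a reply, so you have to send back an error in the format that DBMS's protocol expects, which is again specific to each backend.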
There are commercial products that do this sort of thing. I work for IBM and know that IBM’s Guardium products include those abilities for a number of DBMS (including, I believe, all those mentioned above — if there’s an issue, it is likely to be MySQL that is least supported). Handling encrypted communications is still tricky, even for systems like Guardium.
I’ve got a daemon which runs on Unix that is adapted to one specific DBMS. It handles much of this — but doesn’t attempt to interfere with encrypted communication; it simply records what the two end parties say to each other. Contact me if you want the code — see my profile. Many parts would probably be reusable with other DBMS; other parts are completely peculiar to the specific DBMS it was designed for.