I have a large PHP application (pure PHP, no frameworks, etc.) which uses an Oracle database.
All queries are executed like this:
$stmt = oci_parse($conn_id, "insert into table (bla) values ('bla')");
oci_execute($stmt);
I know this is bad! No need to point out things like “use bind variables” or something similar. I know that, but I can’t change this.
What we all also know is that you have to escape characters.
This question is specifically about the ' character.
I have many queries like this:
$query = "INSERT INTO table (field1, field2,field3,field4) VALUES ('bla,bla','blub', 'mimi'm', 'mu's'c'hle')";
$query2 = "UPDATE table SET field1 = 'bla,bla', field2 = 'blub', field3 = 'mimi'm', field4 = 'mu's'c'hle' WHERE field5 = 'lol'zj'd'";
Sure, normally they do not have so many ' in them, but that’s just for demonstration.
Now to the question:
Is there any way to validate/escape the whole query string in PHP? I can’t think of or find a way to accomplish this, no matter how I approach it.
It’s obviously easy to escape all values before building the query strings, by just replacing ' with '', but is it possible when you only have the whole query as a string (like the examples above)? I personally can’t think of a “universal solution”…
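For reference, the per-value escaping mentioned above could look like the minimal sketch below. It only works while the values are still separate from the query text. The helper name escape_oracle_literal is hypothetical (it is not an OCI8 function), and the table and field names are the placeholders from the examples.

```php
<?php
// Hypothetical helper (not part of OCI8): doubles every apostrophe so the
// value can sit inside a single-quoted Oracle string literal.
function escape_oracle_literal(string $value): string
{
    return str_replace("'", "''", $value);
}

$field3 = "mimi'm";
$field4 = "mu's'c'hle";

// Escape each value on its own, then build the query string.
$query = "INSERT INTO table (field3, field4) VALUES ('"
    . escape_oracle_literal($field3) . "', '"
    . escape_oracle_literal($field4) . "')";

echo $query, PHP_EOL;
// INSERT INTO table (field3, field4) VALUES ('mimi''m', 'mu''s''c''hle')
```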
I believe this is insoluble with traditional means once the query has already been built:
Let’s take part of your second example:
field3 = 'mimi'm',
A normal query parser would see the field3 value as 'mimi' followed by an erroneous m, where it expects a comma. This is not something a parser is designed to handle.
So suppose we write something custom to handle this. Let’s say we decide that the apostrophe, given that it isn’t followed by a comma, must be part of the value. That’s fine, but what about the next apostrophe, which is intended to be a delimiter?
How does our code know whether the apostrophe is a delimiter, as opposed to the value actually containing an apostrophe followed by a comma? In fact, the value could contain something that looks exactly like the rest of the query! (Furthermore, once we start to question the structure of the query itself in this way, how would we detect queries that actually are invalid?)
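To make the ambiguity concrete, here is a small sketch in which two different sets of values produce byte-for-byte the same query string. The function build_update is a hypothetical stand-in for whatever code concatenates the queries in the application; it deliberately does no escaping.

```php
<?php
// Hypothetical builder that, like the code in the question, does no escaping.
function build_update(array $fields): string
{
    $assignments = [];
    foreach ($fields as $name => $value) {
        $assignments[] = "$name = '$value'";
    }
    return "UPDATE table SET " . implode(", ", $assignments);
}

// Intent A: two fields, where the first value contains an apostrophe.
$a = build_update(['field3' => "mimi'm", 'field4' => "mu"]);

// Intent B: one field whose value happens to look like the rest of the query.
$b = build_update(['field3' => "mimi'm', field4 = 'mu"]);

echo $a, PHP_EOL;
// UPDATE table SET field3 = 'mimi'm', field4 = 'mu'
var_dump($a === $b);
// bool(true)
```

Since both intents collapse into the same string, no function that receives only that string can tell which one was meant.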
tl;dr
GIGO = garbage in, garbage out
You can’t write (traditional) software to sort out an arbitrary mess!