I’m trying to figure out the most efficient way to generate a WHERE query. I asked another question earlier, which was similar, but I will get right to the point on this one.
Given a collection of number ranges, e.g. 1-1000 and 1500-1600, it is quite simple to create a MySQL WHERE condition to select records that fall between these values.
For example, you would just do:
WHERE (lft BETWEEN 1 AND 1000) OR (lft BETWEEN 1500 AND 1600)
However, what if you wanted to incorporate a NOT BETWEEN as well?
For example, if you define several rules, like…
- ALLOW BETWEEN 1 – 1000
- ALLOW BETWEEN 1500 – 1600
- ALLOW BETWEEN 1250 – 1300
- DENY BETWEEN 25 – 50
How can I merge these rules in order to efficiently generate a WHERE condition?
I would like the WHERE to dissect the ALLOW BETWEEN 1 – 1000 rule in order to create a gap in it, so that it would become 1-24 and 51-1000. Because the DENY rule is defined after the first rule, it “overwrites” the previous rules.
As another example, say that you have:
- ALLOW BETWEEN 5 – 15
- DENY BETWEEN 10 – 50
- ALLOW BETWEEN 45 – 60
Then I would like to generate a WHERE condition which would allow me to do:
WHERE (lft BETWEEN 5 and 9) OR (lft BETWEEN 45 and 60).
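To make the intended behaviour concrete, here is a minimal sketch of an order-sensitive merge in Python. It is only an illustration, not code from the question: the names `apply_rules` and `to_where` are hypothetical, rules are assumed to be `(action, low, high)` tuples with inclusive integer bounds, and a later rule always overrides earlier ones.

```python
def apply_rules(rules):
    """Apply (action, low, high) rules in order; later rules override earlier ones.

    Returns a sorted list of disjoint, inclusive (low, high) ranges that stay allowed.
    Assumes integer bounds with low <= high (an assumption, not from the question).
    """
    allowed = []
    for action, low, high in rules:
        remaining = []
        # First cut [low, high] out of every range collected so far.
        for a, b in allowed:
            if b < low or a > high:
                remaining.append((a, b))          # no overlap, keep as-is
                continue
            if a < low:
                remaining.append((a, low - 1))    # part left of the cut
            if b > high:
                remaining.append((high + 1, b))   # part right of the cut
        # An ALLOW puts the range back in; a DENY leaves the gap.
        if action == "allow":
            remaining.append((low, high))
        allowed = sorted(remaining)

    # Merge ranges that touch (e.g. 1-9 and 10-20), which is valid for integers only.
    merged = []
    for a, b in allowed:
        if merged and a <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged


def to_where(ranges, column="lft"):
    """Render the ranges as a WHERE clause; 'WHERE 0' matches nothing."""
    if not ranges:
        return "WHERE 0"
    parts = [f"({column} BETWEEN {int(a)} AND {int(b)})" for a, b in ranges]
    return "WHERE " + " OR ".join(parts)


rules = [("allow", 5, 15), ("deny", 10, 50), ("allow", 45, 60)]
print(to_where(apply_rules(rules)))
# WHERE (lft BETWEEN 5 AND 9) OR (lft BETWEEN 45 AND 60)
```

Run on the first rule set instead, the same sketch yields 1-24, 51-1000, 1250-1300 and 1500-1600.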
Notes (Edits)
- Also, the maximum range that would ever be allowed is 1 – 5600000 (which would be ‘Earth’), i.e. allow everything on Earth.
- The number ranges are actually the LEFT values in a NESTED SET MODEL. These aren’t unique keys.
You can read why I want to do this in this question I asked earlier:
https://stackoverflow.com/questions/6020553/generating-a-mysql-between-where-condition-based-on-an-access-ruleset
Possible important note on my number ranges
Maybe I shouldn’t have used the sample example that I did, but one important note about the nature of the number ranges is that a range should always either entirely consume, or be entirely consumed by, a previous rule. For example, I used the example above, 10-50 allow, and deny 45-60. This wouldn’t actually ever happen in my data set. It would actually be allow 10-50, and then the DENY would have to either be entirely consumed by that range, e.g. 34-38, or entirely consume the previous rule, e.g. 9-51. This is because the ranges actually represent lft and rgt values in a nested set model, and you cannot have overlaps like the one I presented.
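As an illustration of that constraint, here is a small Python sketch that flags pairs of ranges which partially overlap. The function names are hypothetical, ranges are assumed to be inclusive `(low, high)` integer tuples, and fully disjoint ranges (such as 1-1000 and 1500-1600) are treated as acceptable alongside fully nested ones.

```python
from itertools import combinations


def is_nested_or_disjoint(a, b):
    """True if two inclusive ranges are disjoint, or one fully contains the other."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    disjoint = a_hi < b_lo or b_hi < a_lo
    a_in_b = b_lo <= a_lo and a_hi <= b_hi
    b_in_a = a_lo <= b_lo and b_hi <= a_hi
    return disjoint or a_in_b or b_in_a


def find_partial_overlaps(ranges):
    """Return every pair of ranges that overlaps without one containing the other."""
    return [(a, b) for a, b in combinations(ranges, 2)
            if not is_nested_or_disjoint(a, b)]


# 34-38 sits inside 10-50, and 9-51 swallows both: nothing to report.
print(find_partial_overlaps([(10, 50), (34, 38), (9, 51)]))   # []
# 10-50 and 45-60 overlap only partially, which a nested set cannot produce.
print(find_partial_overlaps([(10, 50), (45, 60)]))            # [((10, 50), (45, 60))]
```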
I didn’t think to mention that when asking the question, but after seeing the working sample code below, I can see that this note is actually important.
(Edited the example MySQL to use OR instead of AND, as per the comment below)
Honestly, why bother? As long as the key you’re querying against is indexed, just put the multiple queries in there:
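A minimal sketch of that simple approach, written in Python purely for illustration (the helper name `simple_where` and the `(low, high)` tuple format are assumptions, not part of the original answer): OR the allow ranges together, then AND in one NOT BETWEEN per deny range.

```python
def simple_where(allows, denies, column="lft"):
    """Build a WHERE clause: any allow range matches, and no deny range matches.

    allows and denies are lists of inclusive (low, high) integer tuples.
    Values are passed through int() so only numbers end up in the SQL text.
    """
    if not allows:
        return "WHERE 0"  # nothing allowed, so match nothing
    allow_sql = " OR ".join(
        f"({column} BETWEEN {int(lo)} AND {int(hi)})" for lo, hi in allows
    )
    clause = f"({allow_sql})"
    for lo, hi in denies:
        clause += f" AND ({column} NOT BETWEEN {int(lo)} AND {int(hi)})"
    return "WHERE " + clause


print(simple_where([(1, 1000), (1500, 1600), (1250, 1300)], [(25, 50)]))
# WHERE ((lft BETWEEN 1 AND 1000) OR (lft BETWEEN 1500 AND 1600)
#        OR (lft BETWEEN 1250 AND 1300)) AND (lft NOT BETWEEN 25 AND 50)
# (printed on one line; wrapped here for readability)
```

Note that this version ignores rule order: a deny always wins, so an allow that comes after a deny and overlaps it would not be restored.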
You could squeeze a little extra efficiency out of it by building a dissector, but I would question whether it’s worth it. All the WHERE clause items would be off of an index, so you’re not preventing any hard operation from occurring (meaning you’re not stopping a full table scan by doing it).
So rather than spending time building a system to do it for you, just implement an easy solution (ORing together the Allows, and ANDing together the Denys) and move on to more important things. Then if it becomes a problem later, revisit it then. But I really don’t think this will ever become too big of a problem…
Edit: OK, here’s a very simple algorithm for doing this. It uses strings as the data store, so it’s reasonably efficient for smaller numbers (below 1 million):
Usage:
Generates:
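For readers who want something runnable, here is a minimal Python sketch of a flag-per-position approach in the same spirit. It is not the answer's original code: the name `allowed_ranges`, the `(action, low, high)` rule format, and the use of a `bytearray` in place of a string are all assumptions made for illustration. Each rule overwrites its slice of the buffer, so later rules naturally win, and a final scan collects the runs of allowed positions.

```python
def allowed_ranges(rules, max_value):
    """Return inclusive (low, high) runs left allowed after applying rules in order.

    Uses one byte per possible value, so memory grows linearly with max_value.
    """
    # Index 0 is unused; the extra trailing 0 is a sentinel that closes the last run.
    flags = bytearray(max_value + 2)
    for action, lo, hi in rules:
        if not 1 <= lo <= hi <= max_value:
            raise ValueError(f"range {lo}-{hi} is outside 1-{max_value}")
        fill = 1 if action == "allow" else 0
        # Same-length slice assignment overwrites whatever earlier rules wrote.
        flags[lo:hi + 1] = bytes([fill]) * (hi - lo + 1)

    ranges = []
    start = None
    for i in range(1, max_value + 2):
        if flags[i] and start is None:
            start = i                        # a run of allowed values begins
        elif not flags[i] and start is not None:
            ranges.append((start, i - 1))    # the run ended at the previous index
            start = None
    return ranges


rules = [("allow", 5, 15), ("deny", 10, 50), ("allow", 45, 60)]
ranges = allowed_ranges(rules, 60)
print("WHERE " + " OR ".join(f"(lft BETWEEN {a} AND {b})" for a, b in ranges))
# WHERE (lft BETWEEN 5 AND 9) OR (lft BETWEEN 45 AND 60)
```

The trade-off is simplicity against memory: the buffer costs one byte per possible value regardless of how many rules there are.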