I see SELECT EXISTS used a lot like:
if db.query("""
SELECT EXISTS (
SELECT 1 FROM checkout
WHERE checkout_id = %s
)
""" % checkout_id).getresult()[0][0] == 't':
vs. what i prefer:
if db.query("""
SELECT 1 FROM checkout
WHERE checkout_id = %s
LIMIT 1
""" % checkout_id).getresult():
Which one is preferred and why?
P.S. i am using Python and PosgreSQL.
cert=> explain SELECT EXISTS (SELECT 1 FROM checkout WHERE checkout_id = 3);
QUERY PLAN
--------------------------------------------------------------------------------------
Result (cost=4.03..4.03 rows=1 width=0)
InitPlan
-> Index Scan using checkout_pkey on checkout (cost=0.00..4.03 rows=1 width=0)
Index Cond: (checkout_id = 3)
(4 rows)
cert=> explain SELECT 1 FROM checkout WHERE checkout_id = 3 limit 1;
QUERY PLAN
------------------------------------------------------------------------------------
Limit (cost=0.00..4.03 rows=1 width=0)
-> Index Scan using checkout_pkey on checkout (cost=0.00..4.03 rows=1 width=0)
Index Cond: (checkout_id = 3)
(3 rows)
My point is, why getting a row from the result and check if it’s first column is true, if i can just check if there are any rows at all, meaning the same?
PostgreSQL seems to clever enough to treat both statements alike as you can clearly see in your execution plans.
My tests with a local table with ~150000 rows and ~100 selected out of that by the condition also show the same behaviour
The bottomline is: it doesn’t really matter which one you use, but you should be aware that other DBMS might not behave in the same way.