I’ve got the following problem: I know SQL but not SQLAlchemy, and I need to change one query in a project that I’ve inherited.
So, I’ve got this:
ModelCategories = (
    request.sa.query(Model.category_id, Category.name, Category.alias)
    .distinct()
    .join(Category)
    .order_by(Category.alias, Category.name)
)
It generates a rather slow query:
SELECT DISTINCT
model.category_id AS model_category_id
, category.name AS category_name
, category.alias AS category_alias
FROM model
JOIN category ON category.id = model.category_id
ORDER BY category.alias, category.name
I need to change it to this:
SELECT
model.category_id AS model_category_id
, category.name AS category_name
, category.alias AS category_alias
FROM ( SELECT DISTINCT category_id FROM model ) AS model
JOIN category ON category.id = model.category_id
ORDER BY category.alias, category.name
But I need it expressed in SQLAlchemy, like the first query.
First of all, check the SQL execution plan. If you have an index on the model.category_id column, the query should not really be slow.
Otherwise, the following options are available:
Option-1: almost your current solution
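A minimal sketch of what this option could look like, assuming SQLAlchemy 1.4 or later. The Category and Model mappings, the sample rows, and the in-memory SQLite database are hypothetical stand-ins for the real project, where request.sa would take the place of session:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

# Hypothetical mappings; the inherited project will have its own.
class Category(Base):
    __tablename__ = "category"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    alias = Column(String)

class Model(Base):
    __tablename__ = "model"
    id = Column(Integer, primary_key=True)
    category_id = Column(Integer, ForeignKey("category.id"))

engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all([
    Category(id=1, name="Cars", alias="b"),
    Category(id=2, name="Boats", alias="a"),
    Category(id=3, name="Unused", alias="c"),  # no models point here
    Model(id=1, category_id=1),
    Model(id=2, category_id=1),
    Model(id=3, category_id=2),
])
session.commit()

# Select only Category columns; the join to Model restricts the result to
# categories in use, and DISTINCT collapses the duplicates the join creates.
q = (
    session.query(Category.id, Category.name, Category.alias)
    .join(Model, Model.category_id == Category.id)
    .distinct()
    .order_by(Category.alias, Category.name)
)
rows = [tuple(r) for r in q.all()]
```
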
This is like your current solution but somewhat cleaner, in my view. I assume the performance issue might come from the fact that the whole Model table is used in the query, which is also why you need to use distinct.
Option-2: use any() on relationship
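A minimal sketch of this option, assuming SQLAlchemy 1.4 or later and a one-to-many relationship from Category to Model. The relationship name models, the mappings, and the sample data are hypothetical. The any() call renders as an EXISTS subquery, so no DISTINCT is needed:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Hypothetical mappings; the inherited project will have its own.
class Category(Base):
    __tablename__ = "category"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    alias = Column(String)
    # Assumed one-to-many relationship; the real attribute name may differ.
    models = relationship("Model")

class Model(Base):
    __tablename__ = "model"
    id = Column(Integer, primary_key=True)
    category_id = Column(Integer, ForeignKey("category.id"))

engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all([
    Category(id=1, name="Cars", alias="b"),
    Category(id=2, name="Boats", alias="a"),
    Category(id=3, name="Unused", alias="c"),  # no models point here
    Model(id=1, category_id=1),
    Model(id=2, category_id=1),
    Model(id=3, category_id=2),
])
session.commit()

# Keep only categories for which at least one Model row exists.
q = (
    session.query(Category.id, Category.name, Category.alias)
    .filter(Category.models.any())
    .order_by(Category.alias, Category.name)
)
rows = [tuple(r) for r in q.all()]
sql_text = str(q)  # the generated SQL, useful for checking the plan
```
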
This should boost your performance already. I prefer this to the following Option-3, as the code is again cleaner.
Option-3: use subquery
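A minimal sketch of this option, assuming SQLAlchemy 1.4 or later. The mappings and sample data are hypothetical, and the subquery is aliased as model only to mirror the SQL in the question:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

# Hypothetical mappings; the inherited project will have its own.
class Category(Base):
    __tablename__ = "category"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    alias = Column(String)

class Model(Base):
    __tablename__ = "model"
    id = Column(Integer, primary_key=True)
    category_id = Column(Integer, ForeignKey("category.id"))

engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all([
    Category(id=1, name="Cars", alias="b"),
    Category(id=2, name="Boats", alias="a"),
    Category(id=3, name="Unused", alias="c"),  # no models point here
    Model(id=1, category_id=1),
    Model(id=2, category_id=1),
    Model(id=3, category_id=2),
])
session.commit()

# Inner query: SELECT DISTINCT category_id FROM model, aliased as "model".
subq = session.query(Model.category_id).distinct().subquery("model")

# Outer query: join the distinct ids to category and sort.
q = (
    session.query(subq.c.category_id, Category.name, Category.alias)
    .select_from(subq)
    .join(Category, Category.id == subq.c.category_id)
    .order_by(Category.alias, Category.name)
)
rows = [tuple(r) for r in q.all()]
```

Printing str(q) shows the generated statement, which can be compared against the target SQL.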
This should give you exactly the SQL you asked for. As mentioned, I personally prefer Option-2.