UPDATE: I am using the SQL query shown in my question in production, but you are welcome to read the entire thread if you want to see an alternate approach to this using a UNION.
I’ve experimented and made a result set to be used in a content search, but I want to make sure its performance is the best it can be.
I have a table named SECTIONS which holds 2 levels of sections, i.e. level 1 (a section) and level 2 (a subsection), in an Adjacency List model.
SECTIONS: id, parent_id, name
I query that table twice to get columns in the arrangement
sec_id, sec_name, subsec_id, subsec_name
(this is so I can create URI links like /section_id/subsection_id)
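A small sketch of that two-pass arrangement on its own, before PAGES comes into it. The sample rows and section names are made up for illustration, and the column types are an assumption.

```sql
-- Adjacency List table: parent_id = 0 marks a top-level section
CREATE TABLE sections (
    id        INT PRIMARY KEY,
    parent_id INT NOT NULL,
    name      VARCHAR(100) NOT NULL
);

-- made-up sample data: two sections, two subsections under the first
INSERT INTO sections (id, parent_id, name) VALUES
    (1, 0, 'Products'),
    (2, 1, 'Widgets'),
    (3, 1, 'Gadgets'),
    (4, 0, 'About');

-- read the table twice: once as sections (s), once as subsections (ss)
SELECT
    s.id    AS sec_id,
    s.name  AS sec_name,
    ss.id   AS subsec_id,
    ss.name AS subsec_name
FROM sections AS s
LEFT JOIN sections AS ss
    ON ss.parent_id = s.id
WHERE s.parent_id = 0
ORDER BY s.id, ss.id;

-- rows returned:
--   1, 'Products', 2,    'Widgets'
--   1, 'Products', 3,    'Gadgets'
--   4, 'About',    NULL, NULL
```

Note that a section with N subsections appears N times in this arrangement, which matters as soon as something is joined on the section id alone.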
Now I join a separate table named PAGES, where a page can be related to a section or a subsection (but not both) through the field section_id:
-- columns to return
SELECT
    s.id         AS section_id,
    s.name       AS section_name,
    ss.id        AS subsection_id,
    ss.parent_id AS subsection_parent_id,
    ss.name      AS subsection_name,
    p.section_id AS page_section_id,
    p.name       AS page_name
-- join SECTIONS into Sections and SubSections
FROM
    ( SELECT id, name FROM sections WHERE parent_id = 0 ) AS s
LEFT JOIN
    ( SELECT id, parent_id, name FROM sections WHERE parent_id != 0 ) AS ss
    ON ss.parent_id = s.id
-- now join to PAGES table
JOIN
    ( SELECT id, section_id, name FROM pages WHERE active = 1 ) AS p
    ON ( p.section_id = s.id OR p.section_id = ss.id )
-- need to use GROUP BY to eliminate duplicate pages
GROUP BY p.id
I get duplicate pages in the result set, so I use GROUP BY pages.id to remove the duplicates, but it degrades performance a little.
Can you suggest a better way to eliminate duplicates?
I’ve thought of creating a column in the SECTIONS join that holds the Section ID OR the Subsection ID (depending on the type of row – section or subsection), and then use that to relate to the PAGES section_id, so there would not be duplicate rows, but I can’t figure out how to do it.
Thanks
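For what it's worth, the "single key" idea from the last paragraph might be sketched by starting from PAGES instead of SECTIONS: join each page to the one SECTIONS row its section_id points at, then look up that row's parent if it has one. This is only a sketch under assumptions, not a tested replacement. It assumes sections.id is unique and that no section has id 0, so a top-level row (parent_id = 0) simply finds no parent.

```sql
SELECT
    -- if a parent exists, the parent is the section; otherwise the row itself is
    COALESCE(parent.id,   sec.id)   AS section_id,
    COALESCE(parent.name, sec.name) AS section_name,
    -- subsection columns are filled only when the page sits in a subsection
    CASE WHEN parent.id IS NULL THEN NULL ELSE sec.id   END AS subsection_id,
    CASE WHEN parent.id IS NULL THEN NULL ELSE sec.name END AS subsection_name,
    p.section_id AS page_section_id,
    p.name       AS page_name
FROM pages AS p
JOIN sections AS sec
    ON sec.id = p.section_id          -- exactly one match per page
LEFT JOIN sections AS parent
    ON parent.id = sec.parent_id      -- no match when sec.parent_id = 0
WHERE p.active = 1;
```

Because each page matches a single SECTIONS row, every active page comes back once and no GROUP BY is needed. One difference from the query in the question: a page attached directly to a section gets NULL subsection columns here, whereas the GROUP BY version returns whichever subsection row MySQL happens to keep.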
This is gonna be long 🙁
Note that I didn’t use this approach in the end because its performance was worse than my original attempt using GROUP BY.
I had to modify the data table design for the PAGES table to include a new column to hold the id of the subsection that the page belonged to, so now the PAGES table has columns that indicate the section it belongs to, and the subsection also. That structure modification was only for testing and I did not use it in the final version.
Here is the query I created using the concept of a UNION between 2 queries.
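As a rough illustration of the concept only (this is not the exact query that was timed), a UNION over the modified PAGES table might look like the sketch below. The column name subsection_id, and the convention that it holds 0 for a page attached directly to a section, are assumptions made up for this sketch.

```sql
-- hypothetical test-only column: subsection the page belongs to, 0 if none
ALTER TABLE pages ADD COLUMN subsection_id INT NOT NULL DEFAULT 0;

-- branch 1: pages attached directly to a section
SELECT
    s.id   AS section_id,
    s.name AS section_name,
    NULL   AS subsection_id,
    NULL   AS subsection_name,
    p.name AS page_name
FROM pages AS p
JOIN sections AS s
    ON s.id = p.section_id
WHERE p.active = 1
  AND p.subsection_id = 0

UNION ALL

-- branch 2: pages attached to a subsection
SELECT
    s.id,
    s.name,
    ss.id,
    ss.name,
    p.name
FROM pages AS p
JOIN sections AS ss
    ON ss.id = p.subsection_id
JOIN sections AS s
    ON s.id = ss.parent_id
WHERE p.active = 1
  AND p.subsection_id <> 0;
```

The two branches cannot return the same page, so UNION ALL is used here to skip the duplicate-removal step that a plain UNION performs.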
My UNION query took 0.0388 seconds for 5 rows of pages and 4 rows of sections/subsections, versus 0.0017 seconds for the original query, so I stuck with the original as shown above in my question. BTW, in my dev environment MySQL is running on a P3 Katmai 450 MHz with 256 MB RAM to force me to write efficient queries 🙂
Thanks for reading. If you have additional thoughts or comments, please add them.