I am trying to build a report from three tables into which data gets loaded on a daily basis. The catch is that the data is “dirty” (see the WHERE clause). Even though each of the three tables has an index on the column I use for my equijoin, the query never returns.
SELECT
A.SITE || ',' ||
A.SCRIPT || ',' ||
A.TESTRESULT || ',' ||
A.BISTERROR || ',' ||
A.HDDMANUFACTURER || ',' ||
A.TESTDATE || ',' ||
A.STB_MODEL || ',' ||
A.RECEIVERID || ',' ||
B.STB_MODEL || ',' ||
B.STB_MANUFACTURER || ',' ||
B.STB_MFRDATE || ',' ||
B.SW_VERSION || ',' ||
B.SW_NAME || ',' ||
B.HW_VERSION || ',' ||
B.RECEIVERID || ',' ||
C.HDDMODELNUMBER || ',' ||
C.HDDSERIALNUMBER || ',' ||
C.DISKPORT || ',' ||
C.DISKSIZE || ',' ||
C.SECTORSIZE || ',' ||
C.POWERONHOURS || ',' ||
C.CURRENTTEMP || ',' ||
MAX(TRUNC(A.LOAD_DATE))
FROM
HDD_SMARTLOG_FILEINFO A,
HDD_SMARTLOG_BOXINFO B,
HDD_SMARTLOG_DISKINFO C
WHERE
A.FILENAME=B.FILENAME
AND
B.FILENAME=C.FILENAME
AND
INSTR(TRANSLATE(A.RECEIVERID,'0123456789','XXXXXXXXXX'),'X') != LENGTH(A.RECEIVERID)
AND
INSTR(TRANSLATE(B.RECEIVERID,'0123456789','XXXXXXXXXX'),'X') != LENGTH(B.RECEIVERID)
AND
LOWER(C.DISKPORT)='internal'
AND
A.SITE IS NOT NULL OR A.SITE <> ''
AND
A.BISTERROR IS NOT NULL OR A.BISTERROR <> ''
AND
B.STB_MODEL IS NOT NULL OR B.STB_MODEL <> ''
AND
B.SW_VERSION IS NOT NULL OR B.SW_VERSION <> ''
AND
C.POWERONHOURS IS NOT NULL OR C.POWERONHOURS <> ''
AND
C.CURRENTTEMP IS NOT NULL OR C.CURRENTTEMP <> ''
AND
INSTR(TRANSLATE(C.POWERONHOURS,'0123456789','XXXXXXXXXX'),'X') != LENGTH(C.POWERONHOURS)
AND
INSTR(TRANSLATE(C.CURRENTTEMP,'0123456789','XXXXXXXXXX'),'X') != LENGTH(C.CURRENTTEMP)
GROUP BY
A.SITE,
A.SCRIPT,
A.TESTRESULT,
A.BISTERROR,
A.HDDMANUFACTURER,
A.TESTDATE,
A.STB_MODEL,
A.RECEIVERID,
B.STB_MODEL,
B.STB_MANUFACTURER,
B.STB_MFRDATE,
B.SW_VERSION,
B.SW_NAME,
B.HW_VERSION,
B.RECEIVERID,
C.HDDMODELNUMBER,
C.HDDSERIALNUMBER,
C.DISKPORT,
C.DISKSIZE,
C.SECTORSIZE,
C.POWERONHOURS,
C.CURRENTTEMP;
It used to work about a month ago, when we had fewer than 100K records per day. Now we are processing roughly 300K per day.
How can I optimize this query? My boss wants it to run daily. Can you please provide a few pointers?
Thanks in advance!
Do all of these:
Create indexes on the columns that appear in your WHERE clause.
Instead of the comma-separated table list in FROM (a Cartesian product that the WHERE clause then filters), use explicit joins.
In your WHERE clause, put the cheaper and less probable conditions first. For instance, A AND B is true only if both are true, so if A is false, B does not need to be evaluated at all, which saves a lot of time. This can be the difference between performing hundreds of thousands of logical checks and skipping them, so it is a real optimisation. Keep in mind that the optimizer may reorder conditions on its own, so check the execution plan to confirm the effect.
I’m not rewriting your query, because I can’t tell which conditions are less probable or which are cheaper. You will have to measure each condition and order them so that the less probable or cheaper ones are evaluated first, in the hope that the others are never evaluated.
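The first two points, at least, can be sketched without those measurements. What follows is an untested sketch, not a drop-in replacement. It keeps your tables, columns and filters, but uses explicit JOIN syntax and adds one index. It rests on assumptions of mine. First, I assume each "X IS NOT NULL OR X <> ''" pair was meant as a single condition. AND binds tighter than OR, so without parentheses the join conditions apply only to the first OR branch. Oracle also treats '' as NULL, so X <> '' can never be true and each pair reduces to X IS NOT NULL. Second, the index name HDD_DISKINFO_LPORT_IX is made up, and whether the optimizer uses that index depends on how selective 'internal' is in your data.

```sql
-- Hypothetical index name. A function-based index that matches the
-- LOWER(C.DISKPORT) = 'internal' filter used below.
CREATE INDEX HDD_DISKINFO_LPORT_IX ON HDD_SMARTLOG_DISKINFO (LOWER(DISKPORT));

SELECT
  A.SITE || ',' || A.SCRIPT || ',' || A.TESTRESULT || ',' ||
  A.BISTERROR || ',' || A.HDDMANUFACTURER || ',' || A.TESTDATE || ',' ||
  A.STB_MODEL || ',' || A.RECEIVERID || ',' ||
  B.STB_MODEL || ',' || B.STB_MANUFACTURER || ',' || B.STB_MFRDATE || ',' ||
  B.SW_VERSION || ',' || B.SW_NAME || ',' || B.HW_VERSION || ',' ||
  B.RECEIVERID || ',' ||
  C.HDDMODELNUMBER || ',' || C.HDDSERIALNUMBER || ',' || C.DISKPORT || ',' ||
  C.DISKSIZE || ',' || C.SECTORSIZE || ',' || C.POWERONHOURS || ',' ||
  C.CURRENTTEMP || ',' ||
  MAX(TRUNC(A.LOAD_DATE))
FROM HDD_SMARTLOG_FILEINFO A
  -- Explicit joins: the join conditions can no longer be bypassed by an OR.
  JOIN HDD_SMARTLOG_BOXINFO  B ON B.FILENAME = A.FILENAME
  JOIN HDD_SMARTLOG_DISKINFO C ON C.FILENAME = B.FILENAME
WHERE
  LOWER(C.DISKPORT) = 'internal'
  -- Assumption: each "IS NOT NULL OR <> ''" pair was one condition.
  -- In Oracle '' is NULL, so only the IS NOT NULL half can ever be true.
  AND A.SITE IS NOT NULL
  AND A.BISTERROR IS NOT NULL
  AND B.STB_MODEL IS NOT NULL
  AND B.SW_VERSION IS NOT NULL
  AND C.POWERONHOURS IS NOT NULL
  AND C.CURRENTTEMP IS NOT NULL
  -- Data-cleaning checks copied unchanged from the original query.
  AND INSTR(TRANSLATE(A.RECEIVERID,'0123456789','XXXXXXXXXX'),'X') != LENGTH(A.RECEIVERID)
  AND INSTR(TRANSLATE(B.RECEIVERID,'0123456789','XXXXXXXXXX'),'X') != LENGTH(B.RECEIVERID)
  AND INSTR(TRANSLATE(C.POWERONHOURS,'0123456789','XXXXXXXXXX'),'X') != LENGTH(C.POWERONHOURS)
  AND INSTR(TRANSLATE(C.CURRENTTEMP,'0123456789','XXXXXXXXXX'),'X') != LENGTH(C.CURRENTTEMP)
GROUP BY
  A.SITE, A.SCRIPT, A.TESTRESULT, A.BISTERROR, A.HDDMANUFACTURER,
  A.TESTDATE, A.STB_MODEL, A.RECEIVERID,
  B.STB_MODEL, B.STB_MANUFACTURER, B.STB_MFRDATE, B.SW_VERSION,
  B.SW_NAME, B.HW_VERSION, B.RECEIVERID,
  C.HDDMODELNUMBER, C.HDDSERIALNUMBER, C.DISKPORT, C.DISKSIZE,
  C.SECTORSIZE, C.POWERONHOURS, C.CURRENTTEMP;
```

If you really did intend the OR branches to stand on their own, keep your original logic and add parentheses to make that explicit. Either way, compare the execution plans of both versions before relying on this.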
I hope this helps.