We are using a query that uses coalesce to compare potentially null values. I want to get away from this because not only does it make queries more difficult to maintain, it’s flat out ugly. Imagine this:
where coalesce(tbl1.field, '~') <> coalesce(tbl2.field, '~')
…being repeated 30+ times in a where clause. Hecky nah.
I had been under the impression that EXISTS would allow me to circumvent this fugliness, but it turns out that I was wrong.
The best alternative is to drive the coalesced value into your physical data model as a column default in place of NULL. This requires that the business accept and understand that a token value will be substituted in the data model for whatever NULL currently represents to the business. The tricky thing about NULL is that it is sometimes hard to describe accurately, in your metadata and logical data model, what NULL actually represents. From a purist standpoint, it shouldn’t be permitted to mean multiple things. The other alternatives are workarounds for the fact that NULL cannot be compared to another value, NULL or not NULL.
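To make the column-default idea concrete, here is a minimal sketch. It runs the SQL through Python's built-in sqlite3 module purely so it is runnable; SQLite stands in for Teradata here, and the DDL syntax on Teradata will differ. The `tbl1_d`/`tbl2_d` table names and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Nullable columns: the situation described in the question.
cur.execute("CREATE TABLE tbl1 (id INTEGER PRIMARY KEY, field TEXT)")
cur.execute("CREATE TABLE tbl2 (id INTEGER PRIMARY KEY, field TEXT)")
cur.executemany("INSERT INTO tbl1 VALUES (?, ?)", [(1, "a"), (2, None), (3, None)])
cur.executemany("INSERT INTO tbl2 VALUES (?, ?)", [(1, "a"), (2, "b"), (3, None)])

# A plain <> never evaluates to true when either side is NULL,
# so the real difference on id 2 is silently missed.
plain = cur.execute(
    "SELECT tbl1.id FROM tbl1 JOIN tbl2 ON tbl1.id = tbl2.id "
    "WHERE tbl1.field <> tbl2.field ORDER BY tbl1.id"
).fetchall()

# The COALESCE workaround from the question catches it.
coalesced = cur.execute(
    "SELECT tbl1.id FROM tbl1 JOIN tbl2 ON tbl1.id = tbl2.id "
    "WHERE COALESCE(tbl1.field, '~') <> COALESCE(tbl2.field, '~') "
    "ORDER BY tbl1.id"
).fetchall()

# Hypothetical remodelled tables: NOT NULL with the token as the default.
cur.execute("CREATE TABLE tbl1_d (id INTEGER PRIMARY KEY, field TEXT NOT NULL DEFAULT '~')")
cur.execute("CREATE TABLE tbl2_d (id INTEGER PRIMARY KEY, field TEXT NOT NULL DEFAULT '~')")
cur.execute("INSERT INTO tbl1_d (id, field) VALUES (1, 'a')")
cur.execute("INSERT INTO tbl1_d (id) VALUES (2)")  # field falls back to '~'
cur.execute("INSERT INTO tbl1_d (id) VALUES (3)")  # field falls back to '~'
cur.execute("INSERT INTO tbl2_d (id, field) VALUES (1, 'a')")
cur.execute("INSERT INTO tbl2_d (id, field) VALUES (2, 'b')")
cur.execute("INSERT INTO tbl2_d (id) VALUES (3)")  # field falls back to '~'

# With the token stored in the data, the plain comparison is enough.
defaulted = cur.execute(
    "SELECT tbl1_d.id FROM tbl1_d JOIN tbl2_d ON tbl1_d.id = tbl2_d.id "
    "WHERE tbl1_d.field <> tbl2_d.field ORDER BY tbl1_d.id"
).fetchall()

print(plain, coalesced, defaulted)
```

The trade-off is the one described above: the token has to be a value the business agrees can never occur as real data.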
COALESCE is treated like an inline CASE expression in Teradata (and in other databases, for that matter). The trouble with numerous COALESCE expressions in your WHERE clause is that the optimizer may not be able to estimate the resulting cardinality accurately, because a single COALESCE can involve more than two values to test.
COALESCE(A.Col1, B.Col2, C.Col3, '~') will return the first non-NULL value it encounters. You can eliminate NULLs from consideration if they are dimensions whose reference table does not have a NULL value in its domain. In other words, NULL is not a valid primary key. However, you will likely find that the optimizer inserts a condition on the table where NULLs are permitted, so that it spools only those records where
A.Col1 IS NOT NULL. So there is some overhead associated with NULL being permitted in your data model.
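As a rough illustration of the "inline CASE" point, the sketch below puts a multi-argument COALESCE next to a hand-written CASE that does the same thing. Again, it uses Python's sqlite3 only so it can be run; it shows the semantics, not how Teradata's optimizer actually rewrites or costs the expression. The table `t` and its rows are invented for the example.

```python
import sqlite3

conn2 = sqlite3.connect(":memory:")
conn2.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, col3 TEXT)")
conn2.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("a1", "b2", "c3"),   # first argument wins
        (None, "b2", "c3"),   # falls through to col2
        (None, None, "c3"),   # falls through to col3
        (None, None, None),   # falls through to the '~' token
    ],
)

# Left column: COALESCE. Right column: the equivalent CASE expression,
# which tests each argument in order until one is not NULL.
result = conn2.execute(
    """
    SELECT COALESCE(col1, col2, col3, '~'),
           CASE WHEN col1 IS NOT NULL THEN col1
                WHEN col2 IS NOT NULL THEN col2
                WHEN col3 IS NOT NULL THEN col3
                ELSE '~'
           END
    FROM t
    ORDER BY rowid
    """
).fetchall()

for coalesce_value, case_value in result:
    print(coalesce_value, case_value)
```

Each extra argument adds another branch to test, which is why a WHERE clause with 30+ of these gives the optimizer a lot to reason about.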