I have a question because I’m really bad at SQL. I understand basic functions but when
it gets a bit more complex, I’m completly lost.
here is what I have:
tables: tA, tB
columns: tA: refA tB: refB
basically refA and refB represent the same thing (some id of a form like xxx-xxx-xxx), but
refB can have information appended (like xxx-xxx-xxx_Zxxx or xxx-xxx-xxx Zxxx)
here is what I know how to do:
querying items that are in a table but not in another (when they are exactly the same)
select refA
from tA
where not exists (select *
from tB
where tB.refB = tA.refA
)
What i want to do:
I want a query that will list items from refA that are not in refB.
BUT, Problem is if I run a “simple” query with a NOT EXISTS like I just showed, it will return everything,
because of the appends. so I thought about using some syntax like this:
SELECT refA
FROM tA
WHERE NOT EXISTS (SELECT *
FROM tB
WHERE tB.refB LIKE CONCAT(tA.refA,'%'))
but… of course, it doesn’t work.
Could someone show me how it should be done, and also explain how it works, so I can learn ?
Thanks in advance !
edit: additional info
I can’t use a left() or something alike, because the ref format is similar but not always the same (varies in number of characters).
The only way to detect the end of the id before the append, is that there is either a blank space or an underscore.
edit 2: data sample causing problems (MON, Jan. 10th)
here is some actual data from the tables, which makes most answers people have given here
miss some results :/
in tA:
B20-60-04-6A-1
B20-60-04-6A-11
B20-60-04-6A-12
B20-60-04-6A-13
in tB:
B20-60-04-6A-11_XX
B20-60-04-6A-12_XX
B20-60-04-6A-13_XX
problem with mid(), left(), etc. is that if we check “B20-60-04-6A-1” (14 chars)
against the 14 first chars, it will return 3 positives, while in fact it is not in tB…
so, how can we proceed ?
Examples of data patterns in tA are like this:
(X, XYZ: charaters. x: alphanumerical)
Xxx-xx-xx-x
Xxx-xx-xx-xx
Xxx-xx-xx-xx-xx
Xxx-xx-xx-xx-xx-x
etc
examples of data patterns in tB:
Xxx-xx-xx-xx-xx-XYZ-xx Z xxx_XX
Xxx-xx-xx-xx-xx-XYZZxxx_XX
Xxx-xx-xx-xx-xx Z xxx_XX
XYZ are always the same 3 characters. When we do not have XYZ, there is always a blank space or an underscore.
so the string of data we compare should be trimmed according to this:
– from start to -XYZ string
– or, if no -XYZ in the string, from start to the first ” ” or “_”
I’d write that lightning fast in VBA, but in SQL… well, I’ll give it a shot, but I’m really bad at it 😀
So, first off, you need a function that will change refB to not have the appended information, so it can be compared properly with refA. There will be several approaches, but something like this should work:
That will convert any refB like “123-456 123 EXTRA STUFF” or “123-456_123_EXTRA_STUFF” into “123-456”. That result should then be okay to compare directly with a refA.
EDIT: A short explanation of the expression above. What I’m doing is:
Replacefunction). Now we know that the separator is always an underscore, and we also know from step 1 that there will be at least one underscore.InStrfunction). This is the position where refB is split between refA and the additional stuff.So, that gives you something like this:
I would use this approach rather than comparing with wildcards, or trimming refB to match the length of refA, because of this scenario:
In this case, trimming or wildcard matching refA with refB will result in success for all refAs, because “123*”, “123-456*” and “123-456-789*” all match “123-456-789_This_is_a_test”.