I have a third-party table that is populated with some cluttered data, and I need to get the most recent distinct records out of it. The table is fed a new row every year, or every time the “Person” changes. The rule is that the row with the most recent ActiveDate is the correct person. I’ve created a mock table and data to show this.
CREATE TABLE `Persons` (
`PersonId` varchar(200) NOT NULL,
`Name` varchar(200) NOT NULL DEFAULT '',
`ActiveDate` varchar(25) NOT NULL,
`ExpireDate` varchar(25) DEFAULT NULL,
`Job` varchar(200) NOT NULL DEFAULT '',
`Position` varchar(200) NOT NULL DEFAULT ''
)
And some mock data:
Id |`Name` |ActiveDate |ExpireDate |Job |`Position`
---------------------------------------------------------------------------------------------------
J1234 |Doe, John |2010-08-15 00:00:00 |2011-08-15 00:00:00 |Worker |Janitor
J1234 |Doe, John |2011-08-15 00:00:00 |0000-00-00 00:00:00 |Worker |Janitor
777 |Doe, Jane |2010-06-04 00:00:00 |0000-00-00 00:00:00 |Boss |Janitor
777 |Doe, Jane |2011-04-30 00:00:00 |0000-00-00 00:00:00 |Boss |Janitor
654G |Smith, Jane |2011-01-20 00:00:00 |0000-00-00 00:00:00 |Worker |Janitor
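To make the "most recent ActiveDate wins" rule concrete, here is a minimal sketch of a query that keeps only the latest row per Id. It assumes the table and column names used in the queries further down (Person, Id) rather than those in the CREATE TABLE statement, and it relies on ActiveDate strings always being in the YYYY-MM-DD HH:MM:SS format, so that comparing them as text matches chronological order. The alias names latest and LatestActiveDate are made up for illustration.

```sql
-- Sketch only: keep the row with the greatest ActiveDate for each Id.
-- Assumes ActiveDate is consistently formatted as 'YYYY-MM-DD HH:MM:SS',
-- so MAX() on the varchar column picks the chronologically latest value.
SELECT p.Id,
       p.`Name`,
       p.ActiveDate,
       p.Job,
       p.`Position`
FROM Person AS p
INNER JOIN (
    -- One row per Id carrying that Id's most recent ActiveDate
    SELECT Id, MAX(ActiveDate) AS LatestActiveDate
    FROM Person
    GROUP BY Id
) AS latest
        ON latest.Id = p.Id
       AND latest.LatestActiveDate = p.ActiveDate;
```

Against the mock data this would keep the 2011-08-15 row for J1234, the 2011-04-30 row for 777, and the single row for 654G. If two rows for the same Id share the same ActiveDate, both come back.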
The table also has an ExpireDate column, which is set by the end user and, much to my dismay, is not always set. Currently I’m using a dummy table to pull the distinct records into and store them for the day. I would use a temporary table, but I’m not 100% sure how to in MySQL, and I dislike them anyway. The way I’m doing it is just a stopgap, in the hope of finding better SQL.
The data then has to be joined with a multitude of other tables to get the finished product. But I still need to deal with the initial set of distinct data first, and joining in the other tables right from the start just won’t work.
So here is how I’m pulling my data, storing it, and then pulling it again later and joining it to other tables:
INSERT INTO tmp_Person (Id, `Name`, Job, `Position`)
SELECT DISTINCT Id, `Name`, Job, `Position`
FROM Person;
SELECT tmp_Person.Id,
tmp_Person.`Name`,
tmp_Person.Job,
tmp_Person.`Position`,
Pricing.Cost,
Pricing.Benefit
FROM tmp_Person
LEFT OUTER JOIN Pricing AS CL ON CL.PersonId = tmp_Person.Id
AND CL.PriceScredule = 'Major-Client'
AND CL.ExpireDate = '0000-00-00 00:00:00'
LEFT OUTER JOIN Pricing AS Inter ON Inter.PersonId = tmp_Person.Id
AND Inter.PriceScredule = 'Internal-Client'
AND Inter.ExpireDate = '0000-00-00 00:00:00'
How can I write this to avoid the cost of filtering out the duplicate rows through a temporary table (in any form)? Hopefully I’ve made this clear enough; if not, I can gladly add detail or clarify.
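Replace tmp_Person with the code you have for the temp table, used as a derived table in the FROM clause.
As @Andriy spotted, using Pricing.Cost or Pricing.Benefit in the SELECT list would raise an error, because the Pricing table is only referenced through the aliases CL and Inter. I guess you forgot to change it when you posted.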
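A sketch of what that replacement could look like, assuming the SELECT DISTINCT query from the question is used unchanged as the derived table. The alias p and the output column names (ClientCost, ClientBenefit, InternalCost, InternalBenefit) are made up for illustration. The question does not say which alias's Cost and Benefit were intended, so this version returns both.

```sql
-- Sketch only: the SELECT DISTINCT that used to fill tmp_Person
-- becomes a derived table, so no intermediate table is needed.
SELECT p.Id,
       p.`Name`,
       p.Job,
       p.`Position`,
       CL.Cost       AS ClientCost,      -- hypothetical output names
       CL.Benefit    AS ClientBenefit,
       Inter.Cost    AS InternalCost,
       Inter.Benefit AS InternalBenefit
FROM (
    SELECT DISTINCT Id, `Name`, Job, `Position`
    FROM Person
) AS p
LEFT OUTER JOIN Pricing AS CL
        ON CL.PersonId = p.Id
       AND CL.PriceScredule = 'Major-Client'
       AND CL.ExpireDate = '0000-00-00 00:00:00'
LEFT OUTER JOIN Pricing AS Inter
        ON Inter.PersonId = p.Id
       AND Inter.PriceScredule = 'Internal-Client'
       AND Inter.ExpireDate = '0000-00-00 00:00:00';
```

MySQL requires every derived table to have an alias, which is why the subquery is named p. The columns are qualified with CL and Inter rather than Pricing to avoid the error mentioned above.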