I want to implement the following pseudo-code in SQL, without using a CURSOR if possible.
for each zipcode
{
-- What city is this zipcode in? A zipcode can be in multiple cities,
-- but one city's area is always far greater than the others.
-- E.g. 90210 is 96% in the city of Beverly Hills.
-- The remaining 4% falls in some other fringe cities,
-- but only just (some mapping mismatch errors).
** grab the city with the largest shape area that this zip is a part of **
}
Now, I have some SQL to help us out.
-- All zipcodes and each boundary/shapefile.
SELECT ZipCodeId, Boundary FROM ZipCodes -- Boundary is a GEOGRAPHY field type.
To determine if a zipcode boundary is in a city....
SELECT CityId, CityName,
    Boundary.STIntersection(@someZipCodeBoundary).STArea() AS Area
FROM Cities
WHERE Boundary.STIntersects(@someZipCodeBoundary) = 1
To get the area of intersection, we use the STIntersection method, because we want the city with the largest area of intersection (i.e. a TOP(1) with ORDER BY Area DESC, or a DISTINCT with an ORDER BY, sort of thing).
Note: STIntersects and STIntersection are two different SQL methods. STIntersects returns 1 or 0 depending on whether the two shapes overlap; STIntersection returns the overlapping shape itself.
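For a single zipcode, the TOP(1) idea might look something like the sketch below. It assumes the ZipCodes and Cities tables shown above; the ZipCodeId value used to load the boundary is a made-up placeholder, not real data.

```sql
-- Sketch only: ZipCodeId = 1 is a hypothetical placeholder value.
DECLARE @someZipCodeBoundary GEOGRAPHY =
    (SELECT Boundary FROM ZipCodes WHERE ZipCodeId = 1);

-- Keep only the city whose boundary overlaps this zipcode the most.
SELECT TOP (1) CityId, CityName,
    Boundary.STIntersection(@someZipCodeBoundary).STArea() AS Area
FROM Cities
WHERE Boundary.STIntersects(@someZipCodeBoundary) = 1
ORDER BY Area DESC;
```

This handles one zipcode at a time, which is exactly why running it for every zipcode would otherwise need a CURSOR or a loop.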
OK, got it. The trick was to use PARTITION BY. @In Sane gave me the idea when I realised I've done something similar before. So, here we go.
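The original query is not shown here, so the following is only a sketch of what a PARTITION BY version could look like. It assumes the ZipCodes and Cities tables from the question and uses the SQL Server geography methods STIntersects, STIntersection and STArea. The ZipCode column used in the filter and the alias [Rank] are assumptions made for illustration.

```sql
-- Sketch only: z.ZipCode is an assumed column holding the zipcode text.
SELECT z.ZipCodeId,
       c.CityId,
       c.CityName,
       c.Boundary.STIntersection(z.Boundary).STArea() AS Area,
       -- Rank the cities within each zipcode, largest overlap first.
       RANK() OVER (
           PARTITION BY z.ZipCodeId
           ORDER BY c.Boundary.STIntersection(z.Boundary).STArea() DESC
       ) AS [Rank]
FROM ZipCodes AS z
INNER JOIN Cities AS c
    ON c.Boundary.STIntersects(z.Boundary) = 1
WHERE z.ZipCode IN ('12010', '90210');
```

The ranking restarts at 1 for every ZipCodeId, so each zipcode gets its own ordered list of cities.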
So in this filtered example (filtered by ZipCode 12010 or 90210), we can see that this zipcode exists in 4 different cities/towns. Each zipcode can have 1 to many results, which are then ordered by the Area value, but the key here is the PARTITION keyword, which does this ordering by ZipCode groups or partitions. Very funky. Notice how the zipcode 90210 has its own rank results? Same with 12010?
Next, we make that a subquery, and just grab all the rows where Rank = 1.
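As a rough sketch of that last step, again assuming the ZipCodes and Cities tables from the question (the aliases ranked and [Rank] are illustrative choices), the ranked query can be wrapped in a derived table and filtered:

```sql
-- Sketch only: one row per zipcode, the city with the largest overlap.
SELECT ranked.ZipCodeId, ranked.CityId, ranked.CityName, ranked.Area
FROM (
    SELECT z.ZipCodeId,
           c.CityId,
           c.CityName,
           c.Boundary.STIntersection(z.Boundary).STArea() AS Area,
           RANK() OVER (
               PARTITION BY z.ZipCodeId
               ORDER BY c.Boundary.STIntersection(z.Boundary).STArea() DESC
           ) AS [Rank]
    FROM ZipCodes AS z
    INNER JOIN Cities AS c
        ON c.Boundary.STIntersects(z.Boundary) = 1
) AS ranked
WHERE ranked.[Rank] = 1;
```

RANK() gives tied cities the same rank, so a zipcode with two equal top areas would return two rows; swapping in ROW_NUMBER() would force exactly one row per zipcode.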
Sweet as candy.
Side Note: This also shows me that my Los Angeles city shapefile/boundary is corrupted, because it’s intersecting the zipcode 90210 far too much (which I visually confirmed :P)