I’m looking into using a CASS-Certified address validation service to correct user-provided street addresses at the time of entry. (Specifically, I’m looking at SmartyStreets’ LiveAddress.) However, USPS dictates that a correct address must be in all caps, so CASS services almost uniformly return addresses that way. When mailing to the client at that address, though, it would be preferable to use a more humane, conventional casing.
The question, of course, is how to make that happen. I know there’s no such thing as a perfect solution that doesn’t involve a complete nationwide database of correctly capitalized street and city names. A set of passable heuristics might be good enough, though, since we will probably be kicking the corrected address back to the user, ultimately leaving it up to them.
A short list of problems that I was able to come up with after a few minutes of thought:
- SW FIRST ST should be SW First St, not Sw First St.
- MCDOUGLE ST should be McDougle St, not Mcdougle St.
- MACDOUGLE ST should probably be Macdougle St rather than MacDougle St, since Macoroni Rd should usually not be MacOroni Rd.
- 1ST ST should be 1st St, not 1St St.
- Not knowing whether a street name is based on a surname, we probably cannot safely turn VAN into van, but VON can probably become von.
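To make the rules above concrete, here is a minimal sketch of them as token-level heuristics. It is in Python for brevity and should port to C# without much trouble. The exception sets and the function names `case_token` and `humanize` are my own assumptions, not taken from any library, and the lists are certainly incomplete.

```python
import re

# Directionals that stay fully upper-case (an assumed, incomplete list).
# Caveat: a street literally named "E" or "S" would be treated the same way.
KEEP_UPPER = {"N", "S", "E", "W", "NE", "NW", "SE", "SW"}

# Surname particles that seem safe to lower-case; "VAN" is deliberately absent.
LOWER_PARTICLES = {"VON"}

# Ordinals such as 1ST, 22ND, 3RD, 4TH.
ORDINAL = re.compile(r"^\d+(ST|ND|RD|TH)$")


def case_token(token: str) -> str:
    """Apply the casing heuristics to a single upper-case token."""
    if token in KEEP_UPPER:
        return token
    if token in LOWER_PARTICLES:
        return token.lower()
    if ORDINAL.match(token):
        return token.lower()                    # 1ST -> 1st
    if any(ch.isdigit() for ch in token):
        return token                            # leave unit numbers like 4B alone
    if token.startswith("MC") and len(token) > 2:
        return "Mc" + token[2:].capitalize()    # MCDOUGLE -> McDougle
    return token.capitalize()                   # MACDOUGLE -> Macdougle


def humanize(address: str) -> str:
    """Re-case a whitespace-separated, all-caps address line."""
    return " ".join(case_token(t) for t in address.split())


if __name__ == "__main__":
    for sample in ("SW FIRST ST", "MCDOUGLE ST", "MACDOUGLE ST", "1ST ST"):
        print(sample, "->", humanize(sample))
```

Treating each token independently keeps the rules easy to extend, but it cannot tell a directional from a street name, which is one reason the result would still need to go back to the user for confirmation.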
Are there any existing libraries that could at least get me started? Addresses are complicated and fickle things, and I’d rather not home-brew the whole thing if I don’t have to. I’m using C#, but I’m open to porting code from another language.
Barring that, does anyone have a decent reference of common capitalization exceptions for street and/or city names?
Great to see that you’re using the LiveAddress service to facilitate address verification and standardization. There is one thing you may want to be aware of that will help you significantly in the process of applying casing rules to your standardized address:
We recently introduced a new REST+JSON endpoint that returns the standardized form of the address as well as its component parts. Because of this, it’s very easy to apply your casing rules to the "street_name" and "city_name" values independently of the street suffix and the pre/post-directionals.
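As a rough illustration of that approach, the sketch below cases each component separately and then reassembles the street line. It is in Python for brevity and should be simple to port to C#. Only "street_name" and "city_name" are named above; the other keys, the sample payload, and the helpers `case_name` and `format_components` are assumptions made for the example, so check them against the actual response before relying on them.

```python
import json

# Hypothetical response fragment. Only "street_name" and "city_name" are
# confirmed by the text above; the remaining keys are assumed for illustration.
SAMPLE = """
{"components": {"primary_number": "123",
                "street_predirection": "SW",
                "street_name": "MCDOUGLE",
                "street_suffix": "ST",
                "city_name": "SALT LAKE CITY"}}
"""


def case_name(name: str) -> str:
    """Title-case a street or city name, with a simple 'Mc' exception."""
    words = []
    for word in name.split():
        if word.startswith("MC") and len(word) > 2:
            words.append("Mc" + word[2:].capitalize())
        else:
            words.append(word.capitalize())
    return " ".join(words)


def format_components(c: dict) -> tuple:
    """Return (street_line, city) with casing applied per component."""
    parts = (
        c.get("primary_number", ""),
        c.get("street_predirection", ""),         # directionals stay upper-case
        case_name(c.get("street_name", "")),
        c.get("street_suffix", "").capitalize(),  # ST -> St
        c.get("street_postdirection", ""),
    )
    street_line = " ".join(p for p in parts if p)  # skip missing components
    return street_line, case_name(c.get("city_name", ""))


components = json.loads(SAMPLE)["components"]
street, city = format_components(components)
print(street)  # 123 SW McDougle St
print(city)    # Salt Lake City
```

Because the directional and suffix arrive as separate fields, the name-casing rules never have to guess whether "SW" or "ST" is part of the street name.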
You’re welcome to contact SmartyStreets support for additional help with this issue in addition to questions here on Stack Overflow (which we monitor continually). I should probably also mention that I’m the founder of SmartyStreets. Lastly, we’re working on being able to return properly cased addresses, but I don’t have any kind of release time frame on it yet.