Recently there was a request to create a function that would parse out the city, state and postal code information from a single string into separate columns for both US and Canadian addresses. There are obviously many different ways to approach this problem, and in my first few attempts I tried using the PATINDEX function with regular expression to first extract the postal code and then to identify the state and city in that order. Since using regular expressions in T-SQL does not quite work the same as using regular expression in managed code, I was finding it difficult to account for all the different variations without explicitly defining each and everyone of them first. The comments in the code below explain some of these variations.

I wanted to keep it simple avoid solutions involving CLR or using OLE functions. Yes, a CLR solution would probably be the best in terms of both performance and robustness, but I wanted to keep the solution within the scope of a single T-SQL function and I think the solution below works quite well.

Below you’ll find the complete code for the function along with comments and usage documentation. Please take note of the assumptions being made about the data in the code comments notes section. I did test this against a large data set with many different variations which are also mentioned in the code comments. If you come across any examples that don’t parse correctly with this code, please let me know in a comment below. Hope you enjoy.