I have a problem, and besides it sounds trivial, it’s not simple (for me)


Asked: June 4, 2026 (2026-06-04T21:12:28+00:00)


I have a problem that sounds trivial, but it’s not simple (for me) to find a straightforward, scalable, and performant solution. I have one text input where the website user can search for locations.

Today the location can be a city, an address in a city, or a neighborhood in a city. The user must separate the address or neighborhood from the city with a comma, so it’s easy for me to split the string and work out whether the first block is an address, a neighborhood, or a city. If the user leaves out needed information, for example entering an address without a city, and more than one street matches that name, we show all the matching locations so they can choose the correct one.

From the search log we found that most users don’t use the comma, even with all the tooltips explaining how to use the location search (thx google :p).

So the location search has a new requirement: accept addresses that are not comma-separated, like:

1. "5th Avenue"
2. "Manhattan"
3. "New York"
4. "5th Avenue Manhattan"
5. "5th Avenue Manhattan New York"
6. "Manhattan New York"
7. "5th Avenue New York"

But I can’t find a way to work out the meaning of each block, or a dynamic way to make this work. For example, if I get a string like “New York”, “New” can be an address and “York” can be a city.

My question: is there some technique or framework that does what I need, or will I have to work out my own algorithm (based on the number of words, commas, etc.)?

Edit1:

Because I use SQL Server, I’m thinking about a full-text search across multiple columns, trying an exact match first and a non-exact match afterwards. But I think some incomplete addresses will return thousands of rows.


1 Answer

Editorial Team · Answer 1 · 2026-06-04T21:12:29+00:00

Isn’t the key that specificity decreases from left to right? That is, the right-most semantic element (whether “New York” or “Manhattan”) is always the least specific: if it’s a borough, we don’t have to worry about the city; if it’s a street, we don’t have to worry about the borough; and so on.

So reverse the tokens and recurse through, seeking either a complete hit (“Manhattan”) or a keyword (“Avenue”, “Street”, “New”) that indicates either the beginning or end of a semantic element. So after a pass, you might have:

"5th Avenue" -> TOKEN STREET_END_TOKEN
"Manhattan" -> BOROUGH
"New York" -> COMPOUND_BEGIN_TOKEN TOKEN
"5th Avenue Manhattan" -> TOKEN STREET_END_TOKEN BOROUGH
"5th Avenue Manhattan New York" -> TOKEN STREET_END_TOKEN BOROUGH COMPOUND_BEGIN_TOKEN TOKEN
"Manhattan New York" -> BOROUGH COMPOUND_BEGIN_TOKEN TOKEN
"5th Avenue New York" -> TOKEN STREET_END_TOKEN COMPOUND_BEGIN_TOKEN TOKEN

Which ought to give you enough to pattern-match against.
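A minimal sketch of that tagging pass in Python, to make the idea concrete. The three word sets and the function names are assumptions made for illustration; a real system would load its keywords from its own location data. The sketch tags each word independently, so it does not need the right-to-left recursion described above, which only becomes necessary once a tag depends on its neighbors.

```python
# Hypothetical lookup tables, not taken from any real gazetteer.
BOROUGHS = {"manhattan", "brooklyn", "queens"}
STREET_END_WORDS = {"avenue", "street", "road", "boulevard"}
COMPOUND_BEGIN_WORDS = {"new", "saint", "san"}


def tag_tokens(query: str) -> list[str]:
    """Return one tag per whitespace-separated word, in the original order."""
    tags = []
    for word in query.lower().split():
        if word in BOROUGHS:
            tags.append("BOROUGH")
        elif word in STREET_END_WORDS:
            tags.append("STREET_END_TOKEN")
        elif word in COMPOUND_BEGIN_WORDS:
            tags.append("COMPOUND_BEGIN_TOKEN")
        else:
            # Anything unrecognized is left as a generic token.
            tags.append("TOKEN")
    return tags


def to_pattern(query: str) -> str:
    """Join the tags into a single pattern string for later matching."""
    return " ".join(tag_tokens(query))


if __name__ == "__main__":
    for q in ["5th Avenue", "Manhattan", "5th Avenue Manhattan New York"]:
        print(f"{q!r} -> {to_pattern(q)}")
```

With these word sets, the sample queries produce the same patterns as the list above.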

UPDATE:

OK, to expand on the general strategy:

Step 1: Generate a pattern of the query structure by identifying keywords ("Manhattan") and semantically meaningful ("Street", "Avenue") or grammatically significant ("New", "Saint") tokens.
Step 2: Match the generated pattern against a set of templates: "* BOROUGH *" -> "(Street) (BOROUGH) (City)", "* STREET_END_TOKEN" -> "(Street name) (Street type)", etc.
Step 3: The result of Step 2 ought to give you a sense of what kind of query you're dealing with. You'll have to apply domain rules at that point. If you know the complete query is TOKEN STREET_END_TOKEN, then you know "this is a query that just specifies a street", and you have to apply whatever rule is appropriate (grab the locale of their browser? Use their query history to guess which neighborhood and city? etc.).
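Steps 2 and 3 could be sketched roughly like this in Python. The template list, the labels, and the `match_template` name are all hypothetical, and treating `COMPOUND_BEGIN_TOKEN TOKEN` as a city is an assumption that only holds for two-word names like "New York". The input is a pattern string such as the ones produced in Step 1.

```python
import re
from typing import Optional

# Hypothetical templates: (label, regular expression over the tag pattern).
# Each expression must match the whole pattern, so the order is not significant.
TEMPLATES = [
    ("street + borough + city",
     r"(TOKEN )+STREET_END_TOKEN BOROUGH COMPOUND_BEGIN_TOKEN TOKEN"),
    ("street + borough", r"(TOKEN )+STREET_END_TOKEN BOROUGH"),
    ("street + city", r"(TOKEN )+STREET_END_TOKEN COMPOUND_BEGIN_TOKEN TOKEN"),
    ("borough + city", r"BOROUGH COMPOUND_BEGIN_TOKEN TOKEN"),
    ("street only", r"(TOKEN )+STREET_END_TOKEN"),
    ("borough only", r"BOROUGH"),
    ("city only", r"COMPOUND_BEGIN_TOKEN TOKEN"),
]


def match_template(pattern: str) -> Optional[str]:
    """Return the label of the template that matches the whole pattern, if any."""
    for label, regex in TEMPLATES:
        if re.fullmatch(regex, pattern):
            return label
    # No template matched: fall back to another strategy, e.g. a broad search.
    return None


if __name__ == "__main__":
    kind = match_template("TOKEN STREET_END_TOKEN")
    if kind == "street only":
        # Step 3: a domain rule would go here, e.g. guess the city
        # from the browser locale or the user's query history.
        print("Street without a city; apply a disambiguation rule.")
```

Keeping the templates in a plain list makes it easy to add new query shapes without touching the matching code.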
