I have an interesting problem that I need help with. I am currently working

Question


Asked: June 5, 2026 (2026-06-05T08:43:39+00:00)


I have an interesting problem that I need help with. I am currently working on a feature of my program and ran into this issue:

I have a huge list of Indonesian street names (> 100k rows) stored in a database.
Each street name may have more than one word. For example, “Sudirman”, “Gatot Subroto”, and “Jalan Asia Afrika” are all legitimate street names.
I also have a large set of texts (> 1 million rows) in the database, which I split into sentences. The feature (a function, to be exact) that I need should test whether a sentence contains a street name or not, so just a true/false test.

I have tried to solve it by doing these steps:

a. Put the street names into a key/value hash

b. Split each sentence into words

c. Test whether each word is in the hash

This is fast, but it does not work for street names that have multiple words.
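To make the limitation concrete, here is a minimal sketch of the steps above. The function name `contains_street_single_word` and the sample sentences are made up for illustration, and a plain Python `set` stands in for the hash.

```python
# Sample street names from the question, stored in a set for O(1) lookups.
street_names = {"Sudirman", "Gatot Subroto", "Jalan Asia Afrika"}


def contains_street_single_word(sentence, names):
    """Return True if any single whitespace-separated word is a street name."""
    return any(word in names for word in sentence.split())


# A one-word street name is found.
print(contains_street_single_word("Macet di Sudirman pagi ini", street_names))

# A two-word street name is missed, because neither "Gatot" nor "Subroto"
# is a key on its own.
print(contains_street_single_word("Macet di Gatot Subroto", street_names))
```

The second call returns False even though the sentence contains “Gatot Subroto”, which is exactly the multi-word problem.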

Another alternative that I thought of is to do these steps:

a. Split each sentence into words

b. Query the database with a LIKE statement (i.e. SELECT #### FROM street_table WHERE name LIKE ‘%word%’)

c. If the query returns a row, the sentence contains a street name

However, this solution is going to be very I/O intensive.

So my question is: what is the most efficient way to do this test, regardless of the programming language? I mainly work in Python, but any language will do as long as I can grasp the concepts.

============EDIT 1 =================

Will this be periodic?

Yes, I will call this feature/function at an interval of 1 minute. Each call will take at least 100 rows of text and test them against the street name database.


1 Answer

Editorial Team · Answer 1 · 2026-06-05T08:43:41+00:00


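A simple solution would be to create a dictionary/multimap that maps the first word of each street name to the full street name(s) that start with it. As you iterate over the words in a sentence, look up each word to get the candidate street names, then check whether any candidate matches by comparing it against the words that follow.

This algorithm should be fairly easy to implement and should perform pretty well too, since each word costs one hash lookup plus a comparison against the few street names that share its first word.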
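A rough sketch of that idea in Python, since that is what you mainly use. The names `build_index` and `contains_street` are hypothetical, and the sketch assumes case-insensitive matching and plain whitespace tokenization (no punctuation stripping), which you may need to adapt to your data.

```python
from collections import defaultdict


def build_index(street_names):
    """Map the first word of each street name to the full names (as word tuples)."""
    index = defaultdict(list)
    for name in street_names:
        words = tuple(name.lower().split())
        if words:  # skip empty or whitespace-only names
            index[words[0]].append(words)
    return dict(index)


def contains_street(sentence, index):
    """Return True if any indexed street name appears as consecutive words."""
    words = sentence.lower().split()
    for i, word in enumerate(words):
        # Only street names starting with this word can match at position i.
        for candidate in index.get(word, ()):
            if tuple(words[i:i + len(candidate)]) == candidate:
                return True
    return False


# Build the index once (for example at startup), then reuse it for every call.
index = build_index(["Sudirman", "Gatot Subroto", "Jalan Asia Afrika"])
print(contains_street("Macet parah di Gatot Subroto pagi ini", index))
print(contains_street("Gatot pergi ke pasar", index))
```

With roughly 100k street names the index should fit comfortably in memory, so the periodic job would only need to read the new rows of text and not query the street table per word.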
