The Archive Base


Editorial Team
Asked: June 17, 2026

I have a CSV file with about 20 million rows that I’d like to

I have a CSV file with about 20 million rows that I’d like to use in my web application. The data is a mapping of postal/zip codes to actual street addresses in the following format:

[zip_or_postal_code] [street_number] [street_name] [city] [state_or_province] [country]

My goal is to keep my lookups (searching by zip/postal code) under 200ms.

I’m not sure if this would make a difference, but I was planning on doing the following:

  • Move the state/province, country, and city columns to their own tables and reference those in my primary table in order to avoid unnecessary bloat.
  • Some zip/postal codes cover multiple streets and addresses, so I will consolidate the data to have one row per zip/postal code and store its multiple addresses in something like a varchar column. This should cut a few million rows from the table.

What are some optimizations I could make to help with lookup speed? As an example, Google’s reverse geolocation API returns a result in under 300 ms with HTTP overhead included. How do they do it?

Also, I am open to using other databases, but since I’m already using MySQL, that would be preferable.

Edit: The lookups will always be done by zip/postal code, so as an example: given the zip 12345 I’d need to return the street #(s)/name(s), city, state, and country. The street #(s)/name(s) will be stored as a single string field, however, so my app will take care of parsing them.

1 Answer

  1. Editorial Team
    Added an answer on June 17, 2026 at 8:01 am

    20 million rows is not a lot for MySQL. Just index the zip/postal code column and lookups will be fast, well under 200 ms. There is no need to split the data across tables. MySQL does get slow when the result set is large, but it doesn’t seem like you would run into that. MySQL will do just fine with hundreds of millions of records for basic queries like yours.
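To make the indexing advice concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the table name `addresses`, the index name `idx_zip`, and the sample rows are made up for illustration. In MySQL the equivalent statement would be `CREATE INDEX idx_zip ON addresses (zip_or_postal_code);`, and `EXPLAIN` plays the role of SQLite's `EXPLAIN QUERY PLAN`.

```python
import sqlite3

# SQLite (in-memory) stands in for MySQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE addresses (
        zip_or_postal_code TEXT NOT NULL,
        street_number      TEXT,
        street_name        TEXT,
        city               TEXT,
        state_or_province  TEXT,
        country            TEXT
    )
    """
)

# Made-up sample rows, for illustration only.
rows = [
    ("12345", "1", "Main St", "Springfield", "NY", "US"),
    ("12345", "7", "Oak Ave", "Springfield", "NY", "US"),
    ("54321", "22", "Elm Rd", "Shelbyville", "WI", "US"),
]
conn.executemany("INSERT INTO addresses VALUES (?, ?, ?, ?, ?, ?)", rows)

# The one optimization the lookup needs: an index on the search column.
conn.execute("CREATE INDEX idx_zip ON addresses (zip_or_postal_code)")


def lookup(zip_code):
    """Return every address row stored for the given zip/postal code."""
    cur = conn.execute(
        "SELECT street_number, street_name, city, state_or_province, country "
        "FROM addresses WHERE zip_or_postal_code = ? "
        "ORDER BY street_name",
        (zip_code,),
    )
    return cur.fetchall()


# Ask the engine how it plans to run the lookup; the plan should name idx_zip
# rather than a full table scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM addresses WHERE zip_or_postal_code = ?",
    ("12345",),
).fetchall()
```

Checking the query plan is the quickest way to confirm that a lookup is served by the index instead of scanning all 20 million rows.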

    You will need to adjust the MySQL settings so that it uses more memory. The default settings are pretty low; for example, the InnoDB buffer pool (innodb_buffer_pool_size) defaults to 128 MB.

    MySQL does support spatial indexes, so you could pull the longitude/latitude for the postal codes and use a spatial index to do proximity searches. It doesn’t seem like you are looking for that, though.

    If you want things really, really fast, go the route you were thinking of but use Memcached or Redis, with the zip/postal code as the lookup key. You would still need a persistent, disk-based data store to load the data from. I don’t think Memcached or Redis is necessary, but it’s an option.
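The caching option could look roughly like the sketch below. It is only an illustration of the pattern: a plain Python dict stands in for Memcached or Redis, another dict stands in for the persistent store, and the names `ZipCache`, `load_from_store`, and the sample value (including its delimiters) are all hypothetical.

```python
class ZipCache:
    """Cache-aside lookup keyed by zip/postal code."""

    def __init__(self, load_from_store):
        self._load_from_store = load_from_store  # reads the persistent store
        self._cache = {}  # stand-in for Memcached/Redis
        self.store_hits = 0  # how often we had to fall back to the store

    def get(self, zip_code):
        # Fast path: the value is already cached under its zip/postal code.
        if zip_code in self._cache:
            return self._cache[zip_code]
        # Slow path: load from the persistent store, then remember the result.
        self.store_hits += 1
        value = self._load_from_store(zip_code)
        if value is not None:
            self._cache[zip_code] = value
        return value


# Hypothetical persistent data; in practice this would be the MySQL table.
PERSISTENT_STORE = {
    "12345": "1 Main St; 7 Oak Ave | Springfield | NY | US",
}


def load_from_store(zip_code):
    """Stand-in for a database query by zip/postal code."""
    return PERSISTENT_STORE.get(zip_code)


cache = ZipCache(load_from_store)
first = cache.get("12345")   # misses the cache, loads from the store
second = cache.get("12345")  # served from the cache
```

Only the first request for a given zip/postal code touches the persistent store; repeats are answered from memory.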


© 2021 The Archive Base. All Rights Reserved