The Archive Base

Editorial Team
Asked: June 5, 2026

I have a large table with 9 columns and 12 million rows, like this:

col1  col2  col3  col4  col5  col6  col7  col8  col9
12.3  37.4  7771  -675  -23   23.8  78.8  -892  67.5
79.3  -6.3  6061  -555  -24   28.1  77.1  -889  32.6
55.6  -7.3  8888  -921  -56   78.3  22.3  -443  22.9
....  ....  ....  ....  ....  ....  ....  ....  ....

The table is currently saved on my hard disk in TSV (tab-separated values) format and is 432MB in size. I want to load it into Redis so that the following kind of query runs as efficiently as possible: given a min value and a max value for each column, count the number of rows that fall within all of the given ranges, i.e.

(min_col1 <= col1 <= max_col1) &&
(min_col2 <= col2 <= max_col2) &&
(min_col3 <= col3 <= max_col3) &&
(min_col4 <= col4 <= max_col4) &&
(min_col5 <= col5 <= max_col5) &&
(min_col6 <= col6 <= max_col6) &&
(min_col7 <= col7 <= max_col7) &&
(min_col8 <= col8 <= max_col8) &&
(min_col9 <= col9 <= max_col9)
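
For reference, the condition above can be written as a per-row check. The following is only a minimal brute-force sketch in Python, useful as a correctness baseline rather than as the final solution. The names `count_in_ranges`, `SAMPLE_ROWS` and `EXAMPLE_BOUNDS`, and the bounds themselves, are made up for illustration.

```python
def row_in_ranges(row, bounds):
    """True when every column value lies within its inclusive (min, max) range."""
    return all(lo <= value <= hi for value, (lo, hi) in zip(row, bounds))


def count_in_ranges(rows, bounds):
    """Linear scan: count the rows that satisfy all per-column ranges."""
    return sum(1 for row in rows if row_in_ranges(row, bounds))


# The three sample rows from the table above.
SAMPLE_ROWS = [
    (12.3, 37.4, 7771, -675, -23, 23.8, 78.8, -892, 67.5),
    (79.3, -6.3, 6061, -555, -24, 28.1, 77.1, -889, 32.6),
    (55.6, -7.3, 8888, -921, -56, 78.3, 22.3, -443, 22.9),
]

# Hypothetical (min, max) pairs for col1..col9, chosen only for illustration.
EXAMPLE_BOUNDS = [
    (0, 100), (-10, 40), (6000, 8000), (-700, -500), (-30, 0),
    (20, 30), (70, 80), (-900, -800), (30, 70),
]

print(count_in_ranges(SAMPLE_ROWS, EXAMPLE_BOUNDS))
```

With these bounds the third row is rejected because its col3 value (8888) is above 8000, so the scan counts the first two rows.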

So my questions are:

1) How do I populate the table into Redis? Which key/value data structure should I use: hashes, lists, sets, sorted sets, or something else?

2) After populating the table, given min and max values for each of the 9 columns, how do I write the query to get the count, i.e. the number of rows falling within all 9 ranges? One way I can think of is to first find the rows that satisfy (min_colX <= colX <= max_colX) for each X from 1 to 9, and then compute their intersection. But I guess this is not the most efficient way. I just want to retrieve the count as fast as possible.
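
The "filter each column, then intersect" idea from question 2 might look like this outside of any database. This is a rough sketch in plain Python, assuming each column is kept as a sorted list of values with the matching row ids alongside it. The helper names (`build_column_index`, `ids_in_range`, `count_by_intersection`) are hypothetical, and only three columns are used to keep it short.

```python
from bisect import bisect_left, bisect_right


def build_column_index(rows, col):
    """Return (sorted values, row ids in the same order) for one column."""
    pairs = sorted((row[col], row_id) for row_id, row in enumerate(rows))
    values = [value for value, _ in pairs]
    row_ids = [row_id for _, row_id in pairs]
    return values, row_ids


def ids_in_range(index, lo, hi):
    """Row ids whose value lies in the inclusive range [lo, hi]."""
    values, row_ids = index
    start = bisect_left(values, lo)
    stop = bisect_right(values, hi)
    return set(row_ids[start:stop])


def count_by_intersection(indexes, bounds):
    """Intersect the per-column candidate sets, smallest first, and count."""
    candidate_sets = sorted(
        (ids_in_range(idx, lo, hi) for idx, (lo, hi) in zip(indexes, bounds)),
        key=len,
    )
    result = candidate_sets[0]
    for ids in candidate_sets[1:]:
        if not result:
            break  # nothing left to intersect
        result = result & ids
    return len(result)


# First three columns of the sample rows, for illustration only.
ROWS = [(12.3, 37.4, 7771), (79.3, -6.3, 6061), (55.6, -7.3, 8888)]
INDEXES = [build_column_index(ROWS, col) for col in range(3)]
print(count_by_intersection(INDEXES, [(0, 100), (-10, 40), (6000, 8000)]))
```

Finding each range is cheap thanks to the binary search, but when the ranges are wide each candidate set can still hold millions of row ids, so building and intersecting the sets is likely where the time goes.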

By the way, I have tried MongoDB. It is straightforward to populate the table using mongoimport, but it takes 10 seconds to complete my query, which is too slow and not acceptable for my real-time application. In contrast, Redis holds data in memory, so I hope Redis can shorten the query time to 1 second.

For your reference, this is what I did in MongoDB.

mongoimport -u my_username -p my_password -d my_db -c my_coll --type tsv --file my_table.tsv --headerline
use my_db
db.my_coll.ensureIndex({col1:1, col2:1, col3:1, col4:1, col5:1, col6:1, col7:1, col8:1, col9:1 })
db.my_coll.count({ col1: {$gte: min_col1, $lte: max_col1}, col2: {$gte: min_col2, $lte: max_col2}, col3: {$gte: min_col3, $lte: max_col3}, col4: {$gte: min_col4, $lte: max_col4}, col5: {$gte: min_col5, $lte: max_col5}, col6: {$gte: min_col6, $lte: max_col6}, col7: {$gte: min_col7, $lte: max_col7}, col8: {$gte: min_col8, $lte: max_col8}, col9: {$gte: min_col9, $lte: max_col9} })

I used explain() to confirm that the B-tree index was actually used rather than a table scan.

I also tried creating a RAM disk and storing my MongoDB database on it. That only shortened the query time from 10s to 9s, which is still far from acceptable for my real-time application.

mkdir ~/ram
chmod -R 755 ~/ram
mount -t tmpfs none ~/ram -o size=8192m
mongod --dbpath ~/ram --noprealloc --smallfiles
1 Answer

  1. Editorial Team
    Added an answer on June 5, 2026 at 12:37 pm

    Make each column a sorted set, using the cell value as the score and a row identifier as the member. Then run ZRANGEBYSCORE on each key, and do the intersection and count in the application. I use phpredis and do this a lot in memory, using array_intersect.

    The performance bottleneck is ZADD, which you will use to create the sorted sets.

    Once you have all the sorted sets created in Redis’ memory, the rest is really fast.


    Creating sorted sets (Redis sample)

    ZADD col1 12.3 line1
    ZADD col1 79.3 line2
    ZADD col1 55.6 line3
    
    ZADD col2 37.4 line1
    ZADD col2 -6.3 line2
    ZADD col2 -7.3 line3
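
To produce commands like the ones above from the TSV file, something along these lines could work. It is a small Python sketch that only generates the command tuples and does not talk to Redis. The function name `zadd_commands` and the in-memory `SAMPLE_TSV` are assumptions for illustration; the `line<N>` member naming follows the sample above.

```python
import csv
import io


def zadd_commands(tsv_file):
    """Yield one ('ZADD', key, score, member) tuple per cell of a TSV table.

    The header row supplies the sorted-set keys (col1, col2, ...). The member
    is 'line<N>', with N counting data rows from 1.
    """
    reader = csv.reader(tsv_file, delimiter="\t")
    header = next(reader)
    for line_no, row in enumerate(reader, start=1):
        member = f"line{line_no}"
        for key, score in zip(header, row):
            yield ("ZADD", key, score, member)


# Tiny in-memory stand-in for the real file (first two columns only).
SAMPLE_TSV = "col1\tcol2\n12.3\t37.4\n79.3\t-6.3\n55.6\t-7.3\n"
COMMANDS = list(zadd_commands(io.StringIO(SAMPLE_TSV)))

for command in COMMANDS:
    print(" ".join(command))
```

The commands come out row by row instead of column by column, but they are the same six ZADD calls. For the full table this means 12 million rows times 9 columns, i.e. 108 million ZADD calls, which is why the load is the slow part.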
    

    PHP, finding ranges, intersection and count

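    $redis = new Redis();
    $redis->connect('127.0.0.1', 6379); // assumed local server; adjust host and port
    // Members (row ids) whose score falls within each column's range
    $COL1 = $redis->zrangebyscore('col1', -10, 10);
    $COL2 = $redis->zrangebyscore('col2', 2010, 2012);
    $count = count(array_intersect($COL1, $COL2));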
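
The same range-then-intersect step, extended to any number of columns, could be sketched in Python as below. It assumes a client object with a redis-py style `zrangebyscore(name, min, max)` method. `count_rows_in_ranges` and `FakeClient` are hypothetical names, and `FakeClient` is just an in-memory stand-in so the example runs without a Redis server.

```python
def count_rows_in_ranges(client, bounds):
    """Count members present in every column's score range.

    `client` is any object with a redis-py style zrangebyscore(name, min, max)
    method; `bounds` maps each sorted-set key to its inclusive (min, max) pair.
    """
    common = None
    for key, (lo, hi) in bounds.items():
        members = set(client.zrangebyscore(key, lo, hi))
        common = members if common is None else common & members
        if not common:
            return 0  # no row can satisfy the remaining columns
    return len(common or ())


class FakeClient:
    """In-memory stand-in for a Redis client, for illustration only."""

    def __init__(self, sorted_sets):
        self._sets = sorted_sets  # {key: {member: score}}

    def zrangebyscore(self, name, min, max):
        items = self._sets.get(name, {}).items()
        ordered = sorted(items, key=lambda kv: kv[1])
        return [member for member, score in ordered if min <= score <= max]


# Same data as the ZADD sample above.
FAKE = FakeClient({
    "col1": {"line1": 12.3, "line2": 79.3, "line3": 55.6},
    "col2": {"line1": 37.4, "line2": -6.3, "line3": -7.3},
})
print(count_rows_in_ranges(FAKE, {"col1": (50, 100), "col2": (-10, 0)}))
```

Stopping as soon as the running intersection is empty avoids fetching the remaining columns when no row can match.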
    

    Hope that helps.
