I have a large table with 9 columns and 12 million rows, like this:

Question

0

Asked: June 5, 20262026-06-05T12:37:17+00:00 2026-06-05T12:37:17+00:00

I have a large table with 9 columns and 12 million rows, like this:

0

I have a large table with 9 columns and 12 million rows, like this:

col1  col2  col3  col4  col5  col6  col7  col8  col9
12.3  37.4  7771  -675  -23   23.8  78.8  -892  67.5
79.3  -6.3  6061  -555  -24   28.1  77.1  -889  32.6
55.6  -7.3  8888  -921  -56   78.3  22.3  -443  22.9
....  ....  ....  ....  ....  ....  ....  ....  ....

Currently the table is saved as TSV (tab-separated vector) format in my hard disk, 432MB in size. I want to populate the table into Redis in order to complete this kind of query most efficiently: Given a min value and a max value for each column, count the number of rows that are within the given range, i.e.

(min_col1 <= col1 <= max_col1) &&
(min_col2 <= col2 <= max_col2) &&
(min_col3 <= col3 <= max_col3) &&
(min_col4 <= col4 <= max_col4) &&
(min_col5 <= col5 <= max_col5) &&
(min_col6 <= col6 <= max_col6) &&
(min_col7 <= col7 <= max_col7) &&
(min_col8 <= col8 <= max_col8) &&
(min_col9 <= col9 <= max_col9)

So my questions are:

1) How to populate the table into Redis? What kind of key/value data structure should I use? Hashes, lists, sets, sorted sets, or what else?

2) After populating the table, given 9 min and max values for the 9 columns, how to write the query in order to get the count, i.e. number of rows falling within the 9 ranges? One way I can think of is, first find out the rows that satisfy (min_colX <= colX <= max_colX) for each X in 1 to 9, and then calculate their intersection. But I guess this is not the most efficient way. I just want to retrieve the count as fast as possible.

By the way, I have tried MongoDB. It is straightforward to populate the table using mongoimport, but it takes 10 seconds to complete my query, which is too slow and not acceptable for my real-time application. In contrast, Redis holds data in memory, so I hope Redis can shorten the query time to 1 second.

For your reference, this is what I did in MongoDB.

mongoimport -u my_username -p my_password -d my_db -c my_coll --type tsv --file my_table.tsv --headerline
use my_db
db.my_coll.ensureIndex({col1:1, col2:1, col3:1, col4:1, col5:1, col6:1, col7:1, col8:1, col9:1 }).
db.my_coll.count({ col1: {$gte: min_col1, $lte: max_col1), col2: {$gte: min_col2, $lte: max_col2}, col3: {$gte: min_col3, $lte: max_col3}, col4: {$gte: min_col4, $lte: max_col4}, col5: {$gte: min_col5, $lte: max_col5}, col6: {$gte: min_col6, $lte: max_col6}, col7: {$gte: min_col7, $lte: max_col7}, col8: {$gte: min_col8, $lte: max_col8}, col9: {$gte: min_col9, $lte: max_col9} }).

I used explain() to make sure the Btree index was actually used rather than a table scan.

I also tried creating a ram disk and saving the my MongoDB database into the ram disk, it shortened the query time from 10s to 9s, far from acceptable for my real-time application.

mkdir ~/ram
chmod -R 755 ~/ram
mount -t tmpfs none ~/ram -o size=8192m
mongod --dbpath ~/ram --noprealloc --smallfiles

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-05T12:37:18+00:00

Make each col a sorted set, then use ZRANGEBYSCORE on each key, and do the intersection and count in the application. I use phpredis and I do that a lot in memory, using array_intersect.

The perfomance problem is in ZADD, which you will use to create the sorted sets.

Once you have all the sorted sets created in Redis’ memory, the rest is really fast.

Creating sorted sets (Redis sample)

ZADD col1 12.3 line1
ZADD col1 79.3 line2
ZADD col1 55.6 line3

ZADD col2 37.4 line1
ZADD col2 -6.3 line2
ZADD col2 -7.3 line3

PHP, finding ranges, intersection and count

$COL1 = $redis->zrangebyscore('col1', -10, 10);
$COL2 = $redis->zrangebyscore('col2', 2010, 2012);
$count = count(array_intersect($COL1, $COL2));

Hope that helps.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have a large table with 9 columns and 12 million rows, like this:

Leave an answerCancel reply

1 Answer

Creating sorted sets (Redis sample)

PHP, finding ranges, intersection and count

Leave an answer
Cancel reply