I’m trying to figure out a way to gather a dataset without having to

Editorial Team

Asked: June 3, 2026 (2026-06-03T02:59:13+00:00)

I’m trying to figure out a way to gather a dataset without having to loop over 700,000 MySQL queries.

I have two tables

users with

id autoincrement, 
time timestamp, 
username varchar(200), 
email varchar(100), 
ip varchar(20)

and uniq_ip with

ip unique varchar(20), 
most_recent datetime, 
count int

users has 25 million rows and records the activity of users as they work on the site. uniq_ip lists every distinct IP address and how many times it appears in users (kept up to date by a trigger).

At the moment I get the list of all IPs from uniq_ip and loop over them, fetching the most recent 2000 records for each IP. As uniq_ip has 700,000 rows, this loop is really nasty: it makes 700,000 queries in total, each of the form

select * from users where ip = '$outerloopip' order by `time` desc limit 2000;

I’m trying to get a single query that will grab the most recent 2000 listings for each of the IPs. If 1.2.3.4 is listed 10,000 times, I just want the most recent 2000, based on the time field.

Any ideas how to do it in one query?


1 Answer

Editorial Team · Answer 1 · 2026-06-03T02:59:15+00:00

Apologies for my previous answer; I have re-read the question and updated the query. I had misread it as asking for only the 2,000 most recent IP addresses. This version covers ALL IP addresses and limits each IP to its 2,000 most recent records, newest first. First, make sure you have an index on

(IP,TIME DESC)
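
As a minimal sketch, the index could be created as follows. The index name idx_users_ip_time is a hypothetical choice; the table and column names follow the question. Note that MySQL versions before 8.0 parse the DESC keyword in an index definition but ignore it, so there the index is stored ascending and read backward when needed.

```sql
-- Composite index so rows for one IP can be read newest-first.
-- idx_users_ip_time is an assumed name, not from the original post.
-- DESC is honored from MySQL 8.0 on; earlier versions parse and ignore it.
CREATE INDEX idx_users_ip_time ON users (ip, `time` DESC);
```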

Then, try this query. The critical point I failed to clarify earlier: in this query the HAVING clause is applied AFTER the rows have been ordered. The rows are first read in order of IP address and date/time DESCENDING, then the @ variables are applied to each row. Once a row is qualified and READY to be added to the final result set, the HAVING clause is applied. At THAT moment it looks at the sequence counter: if it is greater than 2000, the row is thrown out and processing moves on to the next record. This depends on MySQL reading the rows in sorted order before assigning the variables; MySQL does not guarantee the evaluation order of user variables, so spot-check the output on your version.

My original query saved everything first, then cycled through a second time to kick out the rows numbered above 2000, which is probably why it was blowing away your disk space.

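select
      U.*,
      -- running row number within the current IP; restarts at 1 when the IP changes
      @LastSeq := IF( @LastIP = U.IP, @LastSeq + 1, 1 ) as IPSequence,
      -- remember this row's IP for the comparison on the next row
      @LastIP := U.IP as carryForNextRecord
   from 
      -- one-row derived table that initializes both variables
      ( select @LastIP := '', @LastSeq := 0 ) sqlvars,
      -- lowercase to match the table name; names can be case-sensitive on Linux
      users U
   order by
      U.IP,
      U.time DESC
   having 
      IPSequence <= 2000;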
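
If the server runs MySQL 8.0 or later (an assumption, since the question does not state a version), the same per-IP numbering can be sketched with the ROW_NUMBER() window function instead of user variables. This is a different technique from the query above, not a rewrite of it; the aliases ranked and ip_sequence are hypothetical, and id is used only as a tie-breaker for rows with identical timestamps.

```sql
-- Assumes MySQL 8.0+ (window functions are not available in earlier versions).
-- Table and column names follow the question; aliases are illustrative.
SELECT id, `time`, username, email, ip
FROM (
    SELECT
        u.*,
        -- number each IP's rows from newest to oldest
        ROW_NUMBER() OVER (
            PARTITION BY u.ip
            ORDER BY u.`time` DESC, u.id DESC
        ) AS ip_sequence
    FROM users AS u
) AS ranked
WHERE ranked.ip_sequence <= 2000        -- keep at most 2000 rows per IP
ORDER BY ranked.ip, ranked.`time` DESC;
```

Here the numbering is defined by the window's own ORDER BY, so it does not depend on the order in which the server happens to evaluate variables.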
