I have a script that reads system log files into pandas dataframes and produces

Question

Asked: June 18, 2026 (2026-06-18T08:09:29+00:00)

I have a script that reads system log files into pandas dataframes and produces charts from them. The charts are fine for small data sets, but when the data is gathered over a longer timeframe and the data sets grow, the charts become too crowded to read.

I am planning to resample the dataframe: if the data set exceeds a certain size, I will reduce it so that only SIZE_LIMIT rows remain. This means every n = actual_size/SIZE_LIMIT rows would be aggregated into a single row of the new dataframe. The aggregation can be either the average value or just every nth row taken as is.

I am not well versed in pandas, so I may have missed an obvious way to do this.

1 Answer

Editorial Team · Answer 1 · 2026-06-18T08:09:30+00:00

You could use pandas.qcut on the index to divide it into equal-sized quantile bins. The number of bins you pass to qcut is the number of rows you want to end up with, i.e. SIZE_LIMIT; each bin then holds roughly actual_size/SIZE_LIMIT rows.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'a': range(10000)})

In [3]: df.head()

Out[3]:
   a
0  0
1  1
2  2
3  3
4  4

Here, grouping by pd.qcut(df.index, 5) splits the rows into 5 equally sized bins. I then take the mean of each group.

In [4]: df.groupby(pd.qcut(df.index, 5)).mean()

Out[4]:
                       a
[0, 1999.8]        999.5
(1999.8, 3999.6]  2999.5
(3999.6, 5999.4]  4999.5
(5999.4, 7999.2]  6999.5
(7999.2, 9999]    8999.5
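
To tie this back to the SIZE_LIMIT idea in the question, here is a minimal sketch that covers both options mentioned there (averaging, or keeping every nth row). Note that it swaps qcut for plain integer division on the row position, which gives the same kind of equal-sized groups without interval labels. The function name `downsample`, its `how` parameter, and the SIZE_LIMIT value are made up for illustration, and the mean branch assumes all columns are numeric.

```python
import numpy as np
import pandas as pd

SIZE_LIMIT = 500  # hypothetical cap on the number of rows sent to the chart


def downsample(df, size_limit=SIZE_LIMIT, how="mean"):
    """Return df unchanged if it is small enough, else reduce it to at most size_limit rows."""
    if len(df) <= size_limit:
        return df
    # Rows per output row, rounded up so the result never exceeds size_limit.
    n = -(-len(df) // size_limit)
    if how == "nth":
        # Keep every n-th row as is.
        return df.iloc[::n]
    # Rows 0..n-1 form group 0, rows n..2n-1 form group 1, and so on; average each group.
    groups = np.arange(len(df)) // n
    return df.groupby(groups).mean()


df = pd.DataFrame({'a': range(10000)})
averaged = downsample(df, size_limit=5)             # 5 rows, each the mean of 2000 rows
every_nth = downsample(df, size_limit=5, how="nth")  # rows 0, 2000, 4000, 6000, 8000
```

With the mean option the result is indexed by group number rather than by the original index, so if the index carries timestamps you would probably want to aggregate those separately (or use the nth option, which keeps the original index values).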
