The Archive Base

Editorial Team
Asked: June 12, 2026 (2026-06-12T01:20:32+00:00)

Using DataFrame (pandas as pd, numpy as np):

test = pd.DataFrame({'A' : [10,11,12,13,15,25,43,70],  
                     'B' : [1,2,3,4,5,6,7,8],  
                     'C' : [1,1,1,1,2,2,2,2]})


In [39]: test
Out[39]: 
    A  B  C
0  10  1  1
1  11  2  1
2  12  3  1
3  13  4  1
4  15  5  2
5  25  6  2
6  43  7  2
7  70  8  2

Grouping the DataFrame by ‘C’ and aggregating with np.mean (or sum, min, max) produces column-wise aggregation within groups:

In [40]: test_g = test.groupby('C')

In [41]: test_g.aggregate(np.mean)
Out[41]: 
       A    B
C            
1  11.50  2.5
2  38.25  6.5

However, it looks like aggregating using np.median produces DataFrame-wise aggregation within groups:

In [42]: test_g.aggregate(np.median)
Out[42]: 
      A     B
C            
1   7.0   7.0
2  11.5  11.5

(Using the groupby.median method does seem to produce the expected column-wise results, though.)

I would appreciate answers to the following questions:

  1. What is the reason/mechanism of such an outcome?
  2. If this behaviour is confirmed, how does it affect recommended “best practices” of aggregating groupings? Could other aggregation functions work this way?
  1. Editorial Team
    Added an answer on June 12, 2026 at 1:20 am

    The reason is rather amusing. A pandas specialist may want to chime in, but it comes down to a ping-pong between numpy and pandas. Note what the documentation says about the function argument:

    Function to use for aggregating groups. If a function, must either
    work when passed a DataFrame or when passed to DataFrame.apply. If
    pass a dict, the keys must be DataFrame column names

    In the first case the function receives a 2D array-like (the group's DataFrame); in the second, DataFrame.apply hands it 1D array-likes, one column at a time.

    This means aggregate first passes in the 2D DataFrame. In the first case (np.mean), numpy sees that the object has a .mean method and, as it always does, delegates to it. However, it calls that method with axis=None (numpy's default). This makes pandas raise an exception (it wants axis to be 0 or 1, never None), so aggregate falls back to the second step, which passes the data in as 1D and is foolproof.
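The delegation step can be seen without pandas at all. Below is a minimal sketch that uses a hypothetical stand-in class, `AxisRecorder` (not part of numpy or pandas), whose only job is to record the `axis` value that `np.mean` forwards to an object's own `.mean` method. How pandas reacts to that `axis=None` depends on the pandas version, so that part is deliberately left out.

```python
import numpy as np


class AxisRecorder:
    """Hypothetical stand-in for a DataFrame: it only records how .mean is called."""

    def __init__(self):
        self.seen_axis = "never called"

    def mean(self, axis="not passed", **kwargs):
        # np.mean forwards its own arguments (axis, dtype, out) to this method.
        self.seen_axis = axis
        return 0.0


recorder = AxisRecorder()
result = np.mean(recorder)

print(result)              # 0.0, the value returned by AxisRecorder.mean
print(recorder.seen_axis)  # None, numpy's default axis
```

Because the object is not a plain ndarray and exposes `.mean`, numpy never computes anything itself here; it just calls the method with its default `axis=None`.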

    However, when you pass np.median, there is no such shortcut: numpy arrays do not have a .median method, so np.median never tries to delegate. It runs the normal numpy machinery, which converts the input to an array and flattens it (the default axis=None).
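To check that flattening really explains the numbers in the question, here is a small sketch that bypasses aggregate entirely and calls np.median on each group's 2D block of values. The variable names (`flat`, `per_column`, `block`) are only for illustration.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({"A": [10, 11, 12, 13, 15, 25, 43, 70],
                     "B": [1, 2, 3, 4, 5, 6, 7, 8],
                     "C": [1, 1, 1, 1, 2, 2, 2, 2]})

flat = {}
per_column = {}
for key, group in test.groupby("C"):
    # 2D array holding both value columns of this group
    block = group[["A", "B"]].to_numpy()
    # Default axis=None: the block is flattened, one median per group
    flat[int(key)] = float(np.median(block))
    # axis=0: one median per column
    per_column[int(key)] = np.median(block, axis=0).tolist()

print(flat)        # {1: 7.0, 2: 11.5}
print(per_column)  # {1: [11.5, 2.5], 2: [34.0, 6.5]}
```

The flattened medians 7.0 and 11.5 are exactly the values that appeared in both columns of the DataFrame-wise output above.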

    The workaround would be to use test_g.aggregate([np.median, np.median]) to force it to always take the second path. Alternatively, test_g.aggregate(np.median, axis=0) would work too: it passes axis=0 on to np.median and thus tells numpy how to handle the data correctly. In general, I wonder whether pandas should at least issue a warning; after all, broadcasting the result to both columns should almost never be what is wanted.
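Whether the two calls above behave the same in every pandas release is something I have not checked, so the sketch below takes a different route to the same goal: it asks pandas for its own per-column median, either through the groupby.median method mentioned in the question or by passing the string name "median" to agg. Treat it as an illustration of forcing column-wise aggregation, not as the workaround described above.

```python
import pandas as pd

test = pd.DataFrame({"A": [10, 11, 12, 13, 15, 25, 43, 70],
                     "B": [1, 2, 3, 4, 5, 6, 7, 8],
                     "C": [1, 1, 1, 1, 2, 2, 2, 2]})

grouped = test.groupby("C")

# pandas' own median, computed per column within each group
by_method = grouped.median()

# Same aggregation requested by name instead of by numpy function
by_name = grouped.agg("median")

print(by_method)
print(by_name)
```

Both results should hold 11.5 and 2.5 for group 1 and 34.0 and 6.5 for group 2, i.e. the column-wise medians rather than the flattened ones.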
