The Archive Base

Editorial Team
Asked: June 11, 2026 (2026-06-11T06:56:59+00:00)

I want to mark some quantiles in my data, and for each row of the DataFrame, I would like the entry in a new column called e.g. “xtile” to hold this value.

For example, suppose I create a data frame like this:

import pandas, numpy as np
dfrm = pandas.DataFrame({'A':np.random.rand(100), 
                         'B':(50+np.random.randn(100)), 
                         'C':np.random.randint(low=0, high=3, size=(100,))})

And let’s say I write my own function to compute the quintile of each element in an array. I have my own function for this, but as an example just think of something built on scipy.stats.mstats.mquantiles.

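import numpy as np
import scipy.stats as st
def mark_quintiles(x, breakpoints):
    # Assume this is filled in, using st.mstats.mquantiles. The body below is
    # only a placeholder sketch (an assumption, not the asker's real function).
    # This returns an array the same shape as x, with an integer for which
    # breakpoint-bucket that entry of x falls into.
    cuts = np.asarray(st.mstats.mquantiles(x, prob=breakpoints))
    return np.searchsorted(cuts, x, side="left")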

Now, the real question is how to use transform to add a new column to the data. Something like this:

def transformXtiles(dataFrame, inputColumnName, newColumnName, breaks):
    dataFrame[newColumnName] = mark_quintiles(dataFrame[inputColumnName].values, 
                                              breaks)
    return dataFrame

And then:

dfrm.groupby("C").transform(lambda x: transformXtiles(x, "A", "A_xtile", [0.2, 0.4, 0.6, 0.8, 1.0]))

The problem is that the above code does not add the new column “A_xtile”; it just returns my data frame unchanged. If I first add a column called “A_xtile” full of dummy values, like NaN, then it does successfully overwrite that column with the correct quintile markings.

But it is extremely inconvenient to have to first write in the column for anything like this that I may want to add on the fly.

Note that a simple apply will not work here, since it won’t know how to make sense of the possibly differently-sized result arrays for each group.
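For reference, here is a minimal sketch of one way the new column could be created without pre-filling it: call transform on the single input column and assign the result back to the frame. The helper mark_quantile_bins is a hypothetical stand-in for mark_quintiles; it uses np.quantile rather than st.mstats.mquantiles, so the exact cut points are an assumption and may differ slightly from the real function.

```python
import numpy as np
import pandas as pd

def mark_quantile_bins(values, probs):
    # Hypothetical stand-in for mark_quintiles: cut points come from
    # np.quantile, and each value gets the index of the bucket it falls into.
    cuts = np.quantile(values, probs)
    return np.searchsorted(cuts, values, side="left")

rng = np.random.default_rng(0)
dfrm = pd.DataFrame({
    "A": rng.random(100),
    "B": 50 + rng.standard_normal(100),
    "C": rng.integers(low=0, high=3, size=100),
})

breaks = [0.2, 0.4, 0.6, 0.8, 1.0]

# transform on one column returns a Series aligned with dfrm's index,
# so a plain assignment is enough to create the new column.
dfrm["A_xtile"] = dfrm.groupby("C")["A"].transform(
    lambda s: pd.Series(mark_quantile_bins(s.to_numpy(), breaks), index=s.index)
)
```

Because the result is aligned on the original index, groups of different sizes are not a problem here: each group contributes exactly as many values as it has rows.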


1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 6:57 am (2026-06-11T06:57:01+00:00)

    What problems are you running into with apply? It works for this toy example, even though the group lengths differ:

    In [82]: df
    Out[82]: 
       X         Y
    0  0 -0.631214
    1  0  0.783142
    2  0  0.526045
    3  1 -1.750058
    4  1  1.163868
    5  1  1.625538
    6  1  0.076105
    7  2  0.183492
    8  2  0.541400
    9  2 -0.672809
    
    In [83]: def func(x):
       ....:     x['NewCol'] = np.nan
       ....:     return x
       ....: 
    
    In [84]: df.groupby('X').apply(func)
    Out[84]: 
       X         Y  NewCol
    0  0 -0.631214     NaN
    1  0  0.783142     NaN
    2  0  0.526045     NaN
    3  1 -1.750058     NaN
    4  1  1.163868     NaN
    5  1  1.625538     NaN
    6  1  0.076105     NaN
    7  2  0.183492     NaN
    8  2  0.541400     NaN
    9  2 -0.672809     NaN
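
The same pattern can carry a per-group computation instead of a constant. The following is a rough sketch under assumptions: the random Y values and the function name add_half_marker are made up for illustration, and pd.qcut with two bins stands in for whatever quantile marking is actually wanted. Depending on the pandas version, the grouping column may or may not be passed to the function, so the sketch only relies on Y.

```python
import numpy as np
import pandas as pd

# Toy frame shaped like the one above: groups of unequal length in X.
df = pd.DataFrame({
    "X": [0, 0, 0, 1, 1, 1, 1, 2, 2, 2],
    "Y": np.random.default_rng(1).standard_normal(10),
})

def add_half_marker(group):
    # Hypothetical per-group computation: 0 for the lower half of Y within
    # the group, 1 for the upper half. assign() returns a new frame rather
    # than mutating the group that was passed in.
    return group.assign(Y_half=pd.qcut(group["Y"], 2, labels=False))

# group_keys=False keeps the original row index instead of prefixing
# the group label to it.
out = df.groupby("X", group_keys=False).apply(add_half_marker)
```

Each group returns a frame with its own index, and apply glues the pieces back together, so the groups do not need to be the same size.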
    
© 2021 The Archive Base. All Rights Reserved