The Archive Base

Editorial Team
Asked: June 7, 2026

Following up on a previous question, is there a preferred efficient manner to get the type of each object within a column? This is specifically for the case where the dtype of the column is object to allow for heterogeneous types among the elements of the column (in particular, allowing for numeric NaN without changing the data type of the other elements to float).

I haven’t done time benchmarking, but I am skeptical of the obvious approach below (and of variants that use map or filter). The use cases of interest need type information for all elements at once, so generators and other lazy approaches probably won’t be an efficiency boon here.

# dfrm is a pandas DataFrame with some column 'A', such that
# dfrm['A'].dtype is 'object'

dfrm['A'].apply(type)  # Or np.dtype, but this will fail for native types.
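As a minimal sketch of what the apply(type) route returns, assuming a small hypothetical column that mixes floats, None, and a string (the sample values are illustrative and not taken from the real data):

```python
import pandas as pd

# Hypothetical sample data: an object column holding floats, None, and a string.
dfrm = pd.DataFrame({'A': pd.Series([1.5, None, 'text', 2.0], dtype=object)})

# apply(type) returns a Series whose elements are the Python type objects.
types = dfrm['A'].apply(type)

# Counting by type name gives a quick summary of what the column holds.
type_counts = types.map(lambda t: t.__name__).value_counts()
print(type_counts)
```

Mapping the types to their names before counting is only a readability choice; counting the type objects directly should work as well.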

Another thought was to use the NumPy vectorize function, but is this really going to be more efficient? For example, with the same setup as above, I could try:

import numpy as np
vtype = np.vectorize(lambda x: type(x)) # Gives error without lambda

vtype(dfrm['A'])
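For context, the NumPy documentation describes vectorize as a convenience wrapper that is essentially a for loop, not a performance tool, so a speedup over a plain Python loop should not be expected from it. The sketch below, on a hypothetical sample column, checks that the vectorized call and an ordinary list comprehension agree; passing otypes=[object] is an addition here, not part of the setup above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample column (object dtype, mixed element types).
dfrm = pd.DataFrame({'A': pd.Series([0.25, None, 'x', 3], dtype=object)})

# otypes=[object] fixes the output dtype up front, so vectorize does not
# have to call the function on the first element to infer it.
vtype = np.vectorize(lambda x: type(x), otypes=[object])

# Pass the underlying NumPy array; the result is an object ndarray of types.
vec_types = vtype(dfrm['A'].to_numpy())

# A plain list comprehension produces the same answer.
loop_types = [type(x) for x in dfrm['A']]
print(list(vec_types) == loop_types)
```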

Both ideas lead to workable output, but it’s the efficiency I’m worried about.

Added

I went ahead and did a tiny benchmark in IPython. First is for vtype above, then for the apply route. I repeated it a dozen or so times, and this example run is pretty typical on my machine.

The apply() approach clearly wins, so is there a good reason to expect that I won’t get more efficient than with apply()?

For vtype()

In [49]: for ii in [100,1000,10000,100000,1000000,10000000]:
   ....:     dfrm = pandas.DataFrame({'A':np.random.rand(ii)})
   ....:     dfrm['A'] = dfrm['A'].astype(object)
   ....:     dfrm['A'][0:-1:2] = None
   ....:     st_time = time.time()
   ....:     tmp = vtype(dfrm['A'])
   ....:     ed_time = time.time()
   ....:     print "%s:\t\t %s"%(ii, ed_time-st_time)
   ....:     
100:         0.0351531505585
1000:        0.000324010848999
10000:       0.00209212303162
100000:      0.0224051475525
1000000:     0.211136102676
10000000:    2.2215731144

For apply()

In [50]: for ii in [100,1000,10000,100000,1000000,10000000]:
   ....:     dfrm = pandas.DataFrame({'A':np.random.rand(ii)})
   ....:     dfrm['A'] = dfrm['A'].astype(object)
   ....:     dfrm['A'][0:-1:2] = None
   ....:     st_time = time.time()
   ....:     tmp = dfrm['A'].apply(type)
   ....:     ed_time = time.time()
   ....:     print "%s:\t %s"%(ii, ed_time-st_time)
   ....:     
100:         0.000900983810425
1000:        0.000159025192261
10000:       0.00117015838623
100000:      0.0111050605774
1000000:     0.103563070297
10000000:    1.03093600273
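The sessions above use Python 2 print statements and chained assignment. A rough Python 3 sketch of the same comparison is given below, assuming a recent pandas and NumPy; the helper names make_column and time_call are hypothetical, the sizes are cut down to keep the run short, and no particular timings are implied.

```python
import timeit

import numpy as np
import pandas as pd


def make_column(n):
    """Build an object-dtype Series of floats with every other entry set to None."""
    values = np.random.rand(n).astype(object)
    values[0:-1:2] = None
    return pd.Series(values, dtype=object)


vtype = np.vectorize(lambda x: type(x), otypes=[object])


def time_call(func, repeats=3):
    """Return the best wall-clock time (seconds) over a few single runs."""
    return min(timeit.repeat(func, number=1, repeat=repeats))


results = {}
for n in [100, 1000, 10000, 100000]:
    col = make_column(n)
    results[n] = {
        'vectorize': time_call(lambda: vtype(col.to_numpy())),
        'apply': time_call(lambda: col.apply(type)),
    }
    print(f"{n}:\t vectorize={results[n]['vectorize']:.6f}s"
          f"\t apply={results[n]['apply']:.6f}s")
```

Taking the minimum over several runs reduces the effect of one-off startup costs, such as the unusually slow first iteration visible in the sessions above.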
1 Answer

  1. Editorial Team
    Added an answer on June 7, 2026 at 11:27 pm

    Series.apply and Series.map use a specialized Cython method (pandas.lib.map_infer) I wrote that is roughly 2x faster than using numpy.vectorize.
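A small sketch of the point, using only the public API: on an object column, Series.apply(type) and Series.map(type) should give the same result. The sample values are hypothetical, and the internal location of map_infer may differ in current pandas releases, so the sketch does not call it directly.

```python
import pandas as pd

# Hypothetical object column with mixed element types.
col = pd.Series([1.0, None, 'a', 7], dtype=object)

# Both calls run the function element by element over the column.
via_apply = col.apply(type)
via_map = col.map(type)

print(via_apply.tolist() == via_map.tolist())
```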

© 2021 The Archive Base. All Rights Reserved