The Archive Base

Editorial Team
Asked: June 2, 2026 (2026-06-02T00:09:43+00:00)

I am getting really weird timings for the following code:

import numpy as np
s = 0
for i in range(10000000):
    s += np.float64(1)  # replace with np.float32 and built-in float

  • built-in float: 4.9 s
  • float64: 10.5 s
  • float32: 45.0 s

Why is float64 twice as slow as float? And why is float32 more than four times slower than float64?

Is there any way to avoid the penalty of using np.float64, and have numpy functions return built-in float instead of float64?

I found that using numpy.float64 is much slower than Python’s float, and numpy.float32 is even slower (even though I’m on a 32-bit machine).

I have been using numpy.float32 on my 32-bit machine: every time I use numpy functions such as numpy.random.uniform, I convert the result to float32 (so that further operations are performed at 32-bit precision).

Is there any way to set a single variable somewhere in the program or in the command line, and make all numpy functions return float32 instead of float64?

EDIT #1:

numpy.float64 is 10 times slower than float in arithmetic calculations. It’s so bad that even converting to float and back before the calculations makes the program run 3 times faster. Why? Is there anything I can do to fix it?

I want to emphasize that my timings are not due to any of the following:

  • the function calls
  • the conversion between numpy and python float
  • the creation of objects

I updated my code to make it clearer where the problem lies. With the new code, it would seem I see a ten-fold performance hit from using numpy data types:

from datetime import datetime
import numpy as np

START_TIME = datetime.now()

# one of the following lines is uncommented before execution
#s = np.float64(1)
#s = np.float32(1)
#s = 1.0

for i in range(10000000):
    s = (s + 8) * s % 2399232

print(s)
print('Runtime:', datetime.now() - START_TIME)

The timings are:

  • float64: 34.56s
  • float32: 35.11s
  • float: 3.53s

Just for the hell of it, I also tried:

from datetime import datetime
import numpy as np

START_TIME = datetime.now()

s = np.float64(1)
for i in range(10000000):
    s = float(s)
    s = (s + 8) * s % 2399232
    s = np.float64(s)

print(s)
print('Runtime:', datetime.now() - START_TIME)

The execution time is 13.28 s; it’s actually 3 times faster to convert the float64 to float and back than to use it as is. Still, the conversion takes its toll, so overall it’s more than 3 times slower compared to the pure-python float.

My machine is:

  • Intel Core 2 Duo T9300 (2.5GHz)
  • WinXP Professional (32-bit)
  • ActiveState Python 3.1.3.5
  • Numpy 1.5.1

EDIT #2:

Thank you for the answers, they help me understand how to deal with this problem.

But I would still like to know the precise reason (based on the source code, perhaps) why the code below runs 10 times slower with float64 than with float.

EDIT #3:

I reran the code under Windows 7 x64 (Intel Core i7 930 @ 3.8GHz).

Again, the code is:

from datetime import datetime
import numpy as np

START_TIME = datetime.now()

# one of the following lines is uncommented before execution
#s = np.float64(1)
#s = np.float32(1)
#s = 1.0

for i in range(10000000):
    s = (s + 8) * s % 2399232

print(s)
print('Runtime:', datetime.now() - START_TIME)

The timings are:

  • float64: 16.1s
  • float32: 16.1s
  • float: 3.2s

Now both np floats (either 64 or 32) are 5 times slower than the built-in float. Still, a significant difference. I’m trying to figure out where it comes from.

END OF EDITS


1 Answer

  1. Editorial Team
    Added an answer on June 2, 2026 at 12:09 am (2026-06-02T00:09:45+00:00)

    Summary

    If an arithmetic expression contains both numpy and built-in numbers, the arithmetic runs slower. Avoiding this mixing of types removes almost all of the performance degradation I reported.

    Details

    Note that in my original code:

    s = np.float64(1)
    for i in range(10000000):
      s = (s + 8) * s % 2399232
    

    built-in numbers (the literals 8 and 2399232) and numpy.float64 are mixed in one expression. Perhaps Python had to convert them all to one type? To test this, I made the conversions explicit:

    s = np.float64(1)
    for i in range(10000000):
      s = (s + np.float64(8)) * s % np.float64(2399232)
    

    If the runtime is unchanged (rather than increased), it would suggest that’s what Python indeed was doing under the hood, explaining the performance drag.

    Actually, the runtime fell by a factor of 1.5! How is that possible? Surely the worst thing Python could have to do is perform these two conversions?

    I don’t really know. Perhaps Python had to dynamically check what needs to be converted into what, which takes time, and being told exactly which conversions to perform makes it faster. Perhaps some entirely different mechanism is used for the arithmetic (one that doesn’t involve conversions at all), and it happens to be very slow on mismatched types. Reading the numpy source code might help, but that is beyond my skill.
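One part of this can be checked without reading the numpy source: what type a mixed expression actually produces. The following is a minimal sketch, not an explanation of the slowdown. The helper name `describe` is made up for illustration. It only shows that every intermediate result in the loop body is still a numpy scalar, so the built-in literals have to be reconciled with a numpy type on each operation.

```python
import numpy as np


def describe(value):
    """Return the type name of a value (hypothetical helper for this sketch)."""
    return type(value).__name__


s = np.float64(1)

# Mixing a numpy scalar with built-in int literals still yields a numpy scalar.
mixed = (s + 8) * s % 2399232
print(describe(s + 8))   # type of the first intermediate result
print(describe(mixed))   # type of the whole expression

# np.float64 is a subclass of the built-in float, while np.float32 is not.
print(issubclass(np.float64, float))
print(issubclass(np.float32, float))
```

Only np.float64 is used in the mixed expression here, because the result type of mixing np.float32 with built-in numbers depends on the numpy version.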

    Anyway, now we can obviously speed things up more by moving the conversions out of the loop:

    s = np.float64(1)
    # convert the constants once, outside the loop
    q = np.float64(8)
    r = np.float64(2399232)
    for i in range(10000000):
      s = (s + q) * s % r
    

    As expected, the runtime is reduced substantially: by a further factor of 2.3.

    To be fair, we now need to change the float version slightly, by moving the literal constants out of the loop. This results in a tiny (10%) slowdown.

    Accounting for all these changes, the np.float64 version of the code is now only 30% slower than the equivalent float version; the ridiculous 5-fold performance hit is largely gone.
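For anyone who wants to repeat the comparison, here is a rough sketch of the three variants side by side, timed with the standard `timeit` module. The function names and the reduced iteration count `N` are my own choices for illustration. The ratios will differ between machines and numpy versions, so treat the output as indicative only.

```python
import timeit

import numpy as np

N = 100_000  # far fewer iterations than the 10,000,000 used above, to keep the run short


def run_builtin(n=N):
    """Pure built-in float arithmetic."""
    s = 1.0
    for _ in range(n):
        s = (s + 8) * s % 2399232
    return s


def run_mixed(n=N):
    """np.float64 mixed with built-in literals inside the loop."""
    s = np.float64(1)
    for _ in range(n):
        s = (s + 8) * s % 2399232
    return s


def run_hoisted(n=N):
    """np.float64 with the constants converted once, outside the loop."""
    s = np.float64(1)
    q = np.float64(8)
    r = np.float64(2399232)
    for _ in range(n):
        s = (s + q) * s % r
    return s


for fn in (run_builtin, run_mixed, run_hoisted):
    # Take the best of three runs to reduce noise from other processes.
    seconds = min(timeit.repeat(fn, number=1, repeat=3))
    print(f"{fn.__name__}: {seconds:.3f} s")
```

All three functions compute the same value, so any difference in the printed times comes from how the arithmetic is dispatched rather than from the amount of work done.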

    Why do we still see the 30% delay? numpy.float64 numbers take the same amount of space as float, so that won’t be the reason. Perhaps the resolution of the arithmetic operators takes longer for user-defined types. Certainly not a major concern.
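As for getting built-in floats back from numpy in the first place: as far as I know there is no global switch for this, but individual results can be converted explicitly. Here is a minimal sketch using `item()`, `float()` and `tolist()`. The variable names are illustrative only.

```python
import numpy as np

x = np.float64(2.5)

as_float_1 = x.item()   # numpy scalar -> built-in float
as_float_2 = float(x)   # same conversion via the float constructor

values = np.random.uniform(0.0, 1.0, size=3)
as_list = values.tolist()  # array -> list of built-in floats

print(type(as_float_1), type(as_float_2), type(as_list[0]))
```

Converting once, before a tight loop, keeps the loop body free of numpy scalars, which is the same idea as moving the constants out of the loop above.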
