Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 7000225
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 27, 20262026-05-27T20:38:37+00:00 2026-05-27T20:38:37+00:00

I have two algorithms of finding primes, in Python. The inner loop of each

  • 0

I have two algorithms of finding primes, in Python. The inner loop of each one seems to be executed the same number of times, and is equally simple. However, one of them takes 10 times as much as the other. My question is:

Why? Is this some quirk of Python that can be optimized away (how?), or am I missing something else?

The problem I am solving is essentially from http://www.spoj.pl/problems/PRIME1/. In my case, I have N = 10 ** 9, delta = 10 ** 5, and I want to find all primes between N-delta and delta. I also have smallprimes, a pre-made list of all primes less than or equal to square root of N. The first algorithm is very simple:

def range_f1(lo, hi, smallprimes):
  """Finds all primes p with lo <= p <= hi. 

  smallprimes is the sorted list of all primes up to (at least) square root of hi.
  hi & lo might be large, but hi-lo+1 miust fit into a long."""

  primes =[]
  for i in xrange(hi-lo+1):
    n = lo + i

    isprime = True
    for p in smallprimes:
        if n % p == 0:
            isprime = False
            break

    if isprime:
        primes.append(n)

  return primes

Calling range_f1(N-delta,N,smallprimes) takes a long time (about 10 seconds). The inner loop is called 195170 times. I also have a version of this algorithm that replaces the loop with a list comprehension (That is the one I actually use for profiling; see the end of the question) but that is no faster.

The second version is (an ugly implementation of) the sieve of Eratosthenes:

def range_sieve(lo, hi, smallprimes):
  """Parameters as before"""

  # So ugly! But SO FAST!! How??

  delta = hi-lo+1
  iamprime = [True] * delta      # iamprime[i] says whether lo + i is prime
  if lo<= 1:
    iamprime[1 - lo] = False

  def sillyfun():      # For profiling & speed comparison
    pass

  for p in smallprimes:
    rem = lo % p
    if rem == 0:
        iamprime[0] = False
    for i in xrange(p - rem, delta, p):
        iamprime[i] = False
        sillyfun()

    if p >= lo and p <= hi:
        iamprime[p - lo] = True

  return [p + lo for (p, ami) in enumerate(iamprime) if ami]

This is about 10 times as fast, takes less than 2 seconds. However, the inner loop (sillyfun()) is called 259982 times, more than in the last case. I am at a loss to explain why this is fast.

I thought that maybe the reason is because the inner loop of the first algorithm contains modular arithmetic, while the second only has an assignment. However, the following seems to imply that assignment is no faster than modular arithmetic:

>>> from timeit import timeit
>>> timeit("10**9 % 2341234")
0.23445186469234613
>>> timeit("a[5000]=False", "a = [True] * 10000")
0.47924750212666822

Here’s the (less readable) version of the first algorithm I actually use:

def range_f2(lo, hi, smallprimes):

  primes =[]
  for i in xrange(hi-lo+1):
    n = lo + i

    try:
        (1 for p in smallprimes if n % p ==0).next()
    except StopIteration:
        primes.append(n)

  return primes

Here are the result of calling the profiler for range_f2() (note the number of time generating expression is evaluated):

>>> from cProfile import run as prof
>>> prof("range_f2(N-delta,N,sp)")
 200005 function calls in 13.866 CPU seconds

 Ordered by: standard name

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      1    0.000    0.000   13.866   13.866 <string>:1(<module>)
 195170   12.632    0.000   12.632    0.000 prime1.py:101(<genexpr>)
      1    1.224    1.224   13.865   13.865 prime1.py:90(range_f2)
   4832    0.009    0.000    0.009    0.000 {method 'append' of 'list' objects}
      1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Here is the profiler result for range_sieve():

>>> prof("range_sieve(N-delta,N,sp)")
259985 function calls in 1.370 CPU seconds

Ordered by: standard name
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.003    0.003    1.370    1.370 <string>:1(<module>)
     1    0.877    0.877    1.367    1.367 prime1.py:39(range_sieve)
259982    0.490    0.000    0.490    0.000 prime1.py:51(sillyfun)
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Finally,here is he complete code that generates the lists of small primes (in a very silly way) so that you can check what results you get: http://pastebin.com/7sfN4mG4

UPDATE By popular demand, the profiling data for the first chunk of code. No data on how many times the inner loop is executed, but it seems pretty clear it’s the same as the third.

>>> prof("range_f1(N-delta,N,sp)")
      4835 function calls in 14.184 CPU seconds
Ordered by: standard name
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000   14.184   14.184 <string>:1(<module>)
     1   14.162   14.162   14.183   14.183 prime1.py:69(range_f1)
  4832    0.021    0.000    0.021    0.000 {method 'append' of 'list' objects}
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-27T20:38:38+00:00Added an answer on May 27, 2026 at 8:38 pm

    The difference is an algorithmic one. In the first version, trial division, you test each candidate against all small primes – that you don’t stop when the small prime exceeds candidate ** 0.5 doesn’t matter for the range(10**9 - 10**5 ,10**9) if smallprimes has a good upper bound, but it would if the length of the range were much larger in relation to the upper bound. For composites, that doesn’t incur much cost, since most of them have at least one small prime divisor. But for primes, you have to go all the way to N**0.5. There are roughly 10**5/log(10**9) primes in that interval, each of them is trial-divided by about 10**4.5/log(10**4.5) primes, so that makes about 1.47*10**7 trial divisions against primes.

    On the other hand, with the sieve, you only cross off composites, each composite is crossed off as many times as it has prime divisors, while primes aren’t crossed off at all (so primes are free). The number of prime divisors of n is bounded by (a multiple of) log(n) (that’s a crude upper bound, usually greatly overestimating), so that gives an upper bound of 10**5*log(10**9) (times a small constant) crossings-off, about 2*10**6. In addition to that, the crossing off may be less work than a division (don’t know about Python, it is for C arrays). So you’re doing less work with the sieve.

    Edit: collected the actual numbers for 10**9-10**5 to 10**9.

    Ticks: 259987
    4832
    Divisions: 20353799
    4832
    

    The sieve does only 259987 crossings-off, you see that the crude upper bound above is overestimating by a large factor. The trial division needs more than 20 million divisions, 16433632 of those for primes (x/log x underestimates the number of primes, for x = 10**4.5 by about 10%), 3435183 are used for the 3297 numbers in that range whose smallest prime factor is larger than n**(1/3).

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

Do the two algorithms have the same theta characterization of Θ(n^2)? int sum =
I have two applications written in Java that communicate with each other using XML
I have two classes, and want to include a static instance of one class
I have two bitmaps, produced by different variations of an algorithm. I'd like to
I have a two part question Best-Practice I have an algorithm that performs some
What's the best algorithm for comparing two arrays to see if they have the
Have two folders with approx. 150 java property files. In a shell script, how
I have two arrays of animals (for example). $array = array( array( 'id' =>
I have two arrays of System.Data.DataRow objects which I want to compare. The rows
I have two elements: <input a> <input b onclick=...> When b is clicked, I

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.