The Archive Base

Editorial Team
Asked: May 27, 2026

Is there any fast way to obtain unique elements in numpy?

Is there any fast way to obtain unique elements in numpy? I have code similar to this (the last line is the one in question):

import numpy

tab = numpy.arange(100000000)

indices1 = numpy.random.permutation(10000)
indices2 = indices1.copy()
indices3 = indices1.copy()
indices4 = indices1.copy()

result = numpy.unique(numpy.array([tab[indices1], tab[indices2], tab[indices3], tab[indices4]]))

This is just an example; in my situation indices1, indices2, ..., indices4 contain different sets of indices and have various sizes. The last line is executed many times and I noticed that it’s actually the bottleneck in my code ({numpy.core.multiarray.arange} to be precise). Besides, ordering is not important and the elements in the index arrays are of int32 type. I was thinking about using a hash table with the element value as the key and tried:

import itertools

seq = itertools.chain(tab[indices1].flatten(), tab[indices2].flatten(), tab[indices3].flatten(), tab[indices4].flatten())
# Use the element values as dict keys so that duplicates collapse.
myset = dict.fromkeys(seq)
result = numpy.array(list(myset.keys()))

but it was even worse.

Is there any way to speed this up? I guess the performance penalty comes from ‘fancy indexing’, which copies the array, but I only need to read the resulting elements (I don’t modify anything).


1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 6:00 pm

    Sorry, I don’t completely understand your question, but I’ll do my best to help.

    First, {numpy.core.multiarray.arange} is numpy.arange, not fancy indexing. Unfortunately, fancy indexing does not show up as a separate line item in the profiler. If you’re calling np.arange in the loop, you should see if you can move it outside.

    In [27]: prun tab[tab]
         2 function calls in 1.551 CPU seconds
    
    Ordered by: internal time
    
    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.551    1.551    1.551    1.551 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    
    In [28]: prun numpy.arange(10000000)
         3 function calls in 0.051 CPU seconds
    
    Ordered by: internal time
    
    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.047    0.047    0.047    0.047 {numpy.core.multiarray.arange}
        1    0.003    0.003    0.051    0.051 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    

    Second, I assume that tab is not np.arange(a, b) in your real code, because if it is, then tab[index] == index + a and no lookup is needed at all. I assume that was just for your example.
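
As a quick check of that identity, here is a minimal sketch. The bounds a and b and the random index array are assumptions made up for illustration; they are not taken from the question.

```python
import numpy as np

# Hypothetical bounds, chosen only for illustration.
a, b = 5, 1000
tab = np.arange(a, b)

# Random valid positions into tab (seeded so the run is repeatable).
rng = np.random.default_rng(0)
index = rng.integers(0, b - a, size=20)

# For tab = np.arange(a, b), a fancy-indexing lookup is just an offset.
lookup = tab[index]
offset = index + a
same = np.array_equal(lookup, offset)
```

If tab really were such a range, the lookup could be replaced by the addition.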

    Third, np.concatenate is about 10 times faster than np.array here:

    In [47]: timeit numpy.array([tab[indices1], tab[indices2], tab[indices3], tab[indices4]])
    100 loops, best of 3: 5.11 ms per loop
    
    In [48]: timeit numpy.concatenate([tab[indices1], tab[indices2], tab[indices3], tab[indices4]])
    1000 loops, best of 3: 544 us per loop
    

    (Also, np.concatenate gives a (4*n,) array and np.array gives a (4, n) array, where n is the length of each of indices1-4. The latter will only work if indices1-4 are all the same length.)
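
The shape difference can be seen on a small example. This is only a sketch: the lookup table and the index arrays (idx_a, idx_b, idx_c) are made-up stand-ins, with idx_c deliberately a different length to mimic index arrays of various sizes.

```python
import numpy as np

tab = np.arange(100) * 3          # stand-in lookup table
idx_a = np.array([1, 2, 3, 4])
idx_b = np.array([2, 3, 4, 5])
idx_c = np.array([7, 8])          # deliberately a different length

# Equal lengths: np.array stacks into 2-D, np.concatenate stays 1-D.
stacked = np.array([tab[idx_a], tab[idx_b]])       # shape (2, 4)
flat = np.concatenate([tab[idx_a], tab[idx_b]])    # shape (8,)

# Unequal lengths: np.concatenate still produces a flat array.
ragged = np.concatenate([tab[idx_a], tab[idx_c]])  # shape (6,)

# np.unique flattens its input, so either layout gives the same values.
unique_values = np.unique(ragged)
```

Since np.unique works on the flattened data anyway, the 2-D layout from np.array buys nothing here, and np.concatenate also copes with index arrays of different sizes.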

    And last, you could also save even more time if you can do the following:

    indices = np.unique(np.concatenate((indices1, indices2, indices3, indices4)))
    result = tab[indices]
    

    Doing it in this order is faster because you reduce the number of indices you need to look up in tab, but it’ll only work if you know that the elements of tab are unique (otherwise you could get repeats in result even if the indices are unique).
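
A small sketch of both cases follows. The arrays are made up for illustration and are much smaller than anything in the question.

```python
import numpy as np

indices1 = np.array([0, 1, 2])
indices2 = np.array([2, 3])
# Deduplicate the indices first: [0 1 2 3]
indices = np.unique(np.concatenate((indices1, indices2)))

# Case 1: all elements of tab are distinct.
tab_unique = np.array([10, 20, 30, 40])
fast = tab_unique[indices]
slow = np.unique(np.concatenate((tab_unique[indices1], tab_unique[indices2])))
# Same set of values; sort before comparing because np.unique returns sorted output.
agree = np.array_equal(np.sort(fast), slow)

# Case 2: tab contains a repeated value (10 at positions 0 and 3).
tab_repeats = np.array([10, 20, 30, 10])
with_repeats = tab_repeats[indices]      # 10 appears twice
deduplicated = np.unique(with_repeats)   # an extra pass restores uniqueness
```

In the second case the shortcut alone is not enough, and a final np.unique on the (already smaller) result is still needed.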

    Hope that helps
