I have two arrays, which can look like this: X = np.array([ 157, 262,

Asked: June 15, 2026 (2026-06-15T19:58:53+00:00)

I have two arrays, which can look like this:

X = np.array([ 157,  262,  368,  472,  577,  682,  786,  891,  996, 1100, 1204,
       1310, 1415, 1520, 1625, 1731, 1879])

Y = np.array([  30,  135,  240,  345,  450,  555,  660,  765,  870,  975, 1080,
       1185, 1290, 1395, 1500, 1605])

The arrays will:

Have values sorted in ascending order from start.
Be of unequal length at times.

I want to interleave these two into a new array Z based on the following:

Each element may only be used once
All elements need not be used
An element Xi may only be included in Z if there is an element Yj in Y such that no other element of Y is closer to Xi than Yj is, and no other element of X is closer to Yj than Xi is. In other words, Xi and Yj must be each other's nearest neighbors. (The same rule applies to elements in Y.)

I see that I can do this with a bunch of nested for loops, but I wonder if there is some smarter, neater way of doing this?

(I realize that, the way I put it, the question sounds like it was cut from a textbook. It is not. Maybe it is a classic sort function, who knows, but as a biologist… all I can say is that I’m at a loss as to how to solve it in an efficient, neat way.)

Edit: A not-so-pretty example

new_list = list()
for i in X:
    delta_i = np.abs(Y - i)
    delta_reciprocal = np.abs(X - Y[delta_i.argmin()])
    if delta_i.min() == delta_reciprocal.min():
        new_list += sorted([Y[delta_i.argmin()],
                            X[delta_reciprocal.argmin()]])
Z = np.array(new_list)

I’m not even totally sure it fulfills all the criteria, but when rewriting the old code I got down to just one loop… still there must be some nicer way!

1 Answer

Editorial Team · Answer 1 · 2026-06-15T19:58:54+00:00

Let’s try to work out the solution for this example:

In [1]: import numpy as np

In [5]: X = np.array([1879, 1731])

In [6]: Y = np.array([1481, 1691, 1586, 1796])

We can compute all the distances between values in X and values in Y like this:

In [7]: dist = np.abs(np.subtract.outer(X,Y))

In [8]: dist
Out[8]: 
array([[398, 188, 293,  83],
       [250,  40, 145,  65]])

The rows correspond to X values, the columns correspond to Y values.
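
As a quick check of that layout, the same matrix can also be built with broadcasting. `X[:, None] - Y` is a standard NumPy equivalent of `np.subtract.outer(X, Y)`. It is shown here only as an illustration; the name `dist_broadcast` is introduced for this sketch:

```python
import numpy as np

X = np.array([1879, 1731])
Y = np.array([1481, 1691, 1586, 1796])

dist = np.abs(np.subtract.outer(X, Y))
dist_broadcast = np.abs(X[:, None] - Y)   # column vector minus row vector

print(dist.shape)                            # (2, 4): len(X) rows, len(Y) columns
print(np.array_equal(dist, dist_broadcast))  # True
print(dist[1, 3])                            # 65, i.e. |X[1] - Y[3]| = |1731 - 1796|
```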

To find the X value which is closest to a given element of Y, we look for
the minimum value in the corresponding column of the dist matrix. Each column
corresponds to a particular Y, so the minimum of a column is the smallest
distance between any X and that particular Y.

Visually speaking, what we are looking for are values in dist which are
minimums for both the row that they are in, and the column that they are
in. Let’s call them “row-column minimums”.

In the dist array above, 40 is a row-column minimum. 65 is a column-minimum,
but not a row-column minimum.
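
One way to see the row-column minimums directly is to compare each entry against the minimum of its row and of its column. This is only a sketch for inspection: the mask names are introduced here, and unlike argmin this comparison marks every tied entry rather than just the first.

```python
import numpy as np

X = np.array([1879, 1731])
Y = np.array([1481, 1691, 1586, 1796])
dist = np.abs(np.subtract.outer(X, Y))

is_col_min = dist == dist.min(axis=0)                 # entry is the minimum of its column
is_row_min = dist == dist.min(axis=1, keepdims=True)  # entry is the minimum of its row
both = is_col_min & is_row_min                        # row-column minimums

print(dist[both])         # [40]
print(np.argwhere(both))  # [[1 1]] -> X[1] = 1731 pairs with Y[1] = 1691
```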

For each column, we can find the X-index which minimizes the column this way:

In [9]: idx1 = np.argmin(dist, axis = 0)

In [10]: idx1
Out[10]: array([1, 1, 1, 1])

Similarly, for each row, we can find the Y-index this way:

In [11]: idx2 = np.argmin(dist, axis = 1)

In [12]: idx2
Out[12]: array([3, 1])

Now, let’s forget about this example for a second and suppose idx1 looked like this:

        0,1,2,3,4,5   # the index value 
idx1 = (_,_,_,_,_,2,...)

This says that in column 5 (counting from 0), row 2 holds the minimum value.

If row 2, column 5 is to be a row-column minimum, then idx2
would have to look like this:

        0,1,2        # index value
idx2 = (_,_,5,...)

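We can express this relationship in NumPy with the two boolean arrays below. The first is True at position i when the nearest Y to X[i] has X[i] as its own nearest X; the second is the same test for each Y[j]: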

idx1[idx2] == np.arange(len(X))
idx2[idx1] == np.arange(len(Y))

So the X, Y values which correspond to row-column minimums are

X[idx1[idx2] == np.arange(len(X))]

and

Y[idx2[idx1] == np.arange(len(Y))]
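
Applied to the small example above, the two masks work out as follows. The names `x_keep`, `y_keep` and the final `np.sort` are additions for this sketch; sorting is one assumed way to interleave the kept values into Z.

```python
import numpy as np

X = np.array([1879, 1731])
Y = np.array([1481, 1691, 1586, 1796])

dist = np.abs(np.subtract.outer(X, Y))
idx1 = np.argmin(dist, axis=0)   # array([1, 1, 1, 1])
idx2 = np.argmin(dist, axis=1)   # array([3, 1])

x_keep = idx1[idx2] == np.arange(len(X))   # [False, True]
y_keep = idx2[idx1] == np.arange(len(Y))   # [False, True, False, False]

# Sorting the kept values interleaves them in ascending order.
Z = np.sort(np.r_[X[x_keep], Y[y_keep]])
print(Z)   # [1691 1731]
```

Only 1731 and 1691 survive: 1879 is closest to 1796, but 1796 is closer to 1731, so that pair is not mutual.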

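The following script (saved as a module named test, which the timing commands below assume) checks that the loop version and the vectorized version agree:

import numpy as np

# Each test case is an (X, Y) pair.
tests = [
    (np.array([1879, 1731]),
     np.array([1481, 1691, 1586, 1806])),
    (np.array([1879, 1731]),
     np.array([1481, 1691, 1586, 1796])),
    (np.array([ 157,  262,  368,  472,  577,  682,  786,  891,  996, 1100, 1204]),
     np.array([  30,  135,  240,  345,  450,  555,  660,  765,  870,  975])),
    (np.array([ 157, 262, 368, 472, 577, 682, 786, 891, 996, 1100, 1204, 1310,
                1415, 1520, 1625, 1731, 1879]),
     np.array([ 221, 326, 431, 536, 641, 746, 851, 956, 1061, 1166, 1271, 1376,
                1481, 1586, 1691, 1796]))]

def find_close(X, Y):
    """Loop version from the question."""
    new_list = []
    for i in X:
        # distances from this X value to every Y
        delta_i = np.abs(Y - i)
        # distances from the nearest Y back to every X
        delta_reciprocal = np.abs(X - Y[delta_i.argmin()])
        # keep the pair when the nearest Y's own nearest X is just as close
        if delta_i.min() == delta_reciprocal.min():
            new_list += sorted([Y[delta_i.argmin()],
                                X[delta_reciprocal.argmin()]])
    Z = np.array(new_list)
    return Z

def alt_find_close(X, Y):
    """Vectorized version: keep the row-column minimums of the distance matrix."""
    dist = np.abs(np.subtract.outer(X, Y))
    idx1 = np.argmin(dist, axis=0)   # for each Y (column), index of the nearest X
    idx2 = np.argmin(dist, axis=1)   # for each X (row), index of the nearest Y
    Z = np.r_[X[idx1[idx2] == np.arange(len(X))],
              Y[idx2[idx1] == np.arange(len(Y))]]
    return Z

# After this loop the module-level X and Y hold the last test case,
# which is what the timeit commands use.
for X, Y in tests:
    assert np.allclose(sorted(find_close(X, Y)), sorted(alt_find_close(X, Y)))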

Timeit results:

% python -mtimeit -s'import test' 'test.find_close(test.X, test.Y)'
1000 loops, best of 3: 454 usec per loop
% python -mtimeit -s'import test' 'test.alt_find_close(test.X, test.Y)'
10000 loops, best of 3: 40.6 usec per loop

So alt_find_close is about 11 times faster than find_close on this input (40.6 usec versus 454 usec per loop).
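
The distance matrix holds len(X) * len(Y) entries, which may matter for long arrays. Since the question states that both arrays are sorted, one possible variation, not part of the answer above, is to look up nearest neighbors with np.searchsorted instead. The names `nearest_index` and `sorted_find_close` are hypothetical, and the sketch assumes strictly ascending signed-integer or float arrays with at least two elements each:

```python
import numpy as np

def nearest_index(a, b):
    """For each value in a, return the index of the nearest value in sorted b.

    Ties go to the lower index, matching np.argmin on a row of distances.
    """
    # Insertion points, clipped so that both pos - 1 and pos are valid indices.
    pos = np.clip(np.searchsorted(b, a), 1, len(b) - 1)
    left, right = b[pos - 1], b[pos]
    return np.where(a - left <= right - a, pos - 1, pos)

def sorted_find_close(X, Y):
    idx2 = nearest_index(X, Y)   # for each X, index of the nearest Y
    idx1 = nearest_index(Y, X)   # for each Y, index of the nearest X
    return np.r_[X[idx1[idx2] == np.arange(len(X))],
                 Y[idx2[idx1] == np.arange(len(Y))]]

# Arrays from the question.
X = np.array([ 157,  262,  368,  472,  577,  682,  786,  891,  996, 1100, 1204,
              1310, 1415, 1520, 1625, 1731, 1879])
Y = np.array([  30,  135,  240,  345,  450,  555,  660,  765,  870,  975, 1080,
              1185, 1290, 1395, 1500, 1605])

# Reference result from the distance-matrix approach, for comparison.
dist = np.abs(np.subtract.outer(X, Y))
ref_idx1 = np.argmin(dist, axis=0)
ref_idx2 = np.argmin(dist, axis=1)
reference = np.r_[X[ref_idx1[ref_idx2] == np.arange(len(X))],
                  Y[ref_idx2[ref_idx1] == np.arange(len(Y))]]

print(np.array_equal(sorted_find_close(X, Y), reference))
```

This keeps memory proportional to len(X) + len(Y), at the cost of requiring sorted input, which the small unsorted example earlier in the answer does not satisfy.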
