I have some board numpy arrays like that: array([[0, 0, 0, 1, 0, 0,

Question

0

Asked: June 4, 20262026-06-04T15:26:59+00:00 2026-06-04T15:26:59+00:00

I have some board numpy arrays like that: array([[0, 0, 0, 1, 0, 0,

0

I have some board numpy arrays like that:

array([[0, 0, 0, 1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 1, 0, 1],
       [0, 0, 0, 0, 0, 0, 0, 1],
       [0, 1, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 1, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 1, 0],
       [1, 0, 0, 0, 0, 1, 0, 0]])

And I’m using the following code to find the sum of elements on each nth diagonal from -7 to 8 of the board (and the mirrored version of it).

n = 8
rate = [b.diagonal(i).sum()
        for b in (board, board[::-1])
        for i in range(-n+1, n)]

After some profiling, this operation is taking about 2/3 of overall running time and it seems to be because of 2 factors:

The .diagonal method builds a new array instead of a view (looks like numpy 1.7 will have a new .diag method to solve that)
The iteration is done in python inside the list comprehension

So, there are any methods to find these sums faster (possibly in the C layer of numpy)?

After some more tests, I could reduce 7.5x the total time by caching this operation… Maybe I was looking for the wrong bottleneck?

One more thing:

Just found the .trace method that replaces the diagonal(i).sum() thing and… There wasn’t much improvement in performance (about 2 to 4%).

So the problem should be the comprehension. Any ideas?

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-04T15:27:01+00:00

There’s a possible solution using stride_tricks. This is based in part on the plethora of information available in the answers to this question, but the problem is just different enough, I think, not to count as a duplicate. Here’s the basic idea, applied to a square matrix — see below for a function implementing the more general solution.

>>> cols = 8
>>> a = numpy.arange(cols * cols).reshape((cols, cols))
>>> fill = numpy.zeros((cols - 1) * cols, dtype='i8').reshape((cols - 1, cols))
>>> stacked = numpy.vstack((a, fill, a))
>>> major_stride, minor_stride = stacked.strides
>>> strides = major_stride, minor_stride * (cols + 1)
>>> shape = (cols * 2 - 1, cols)
>>> numpy.lib.stride_tricks.as_strided(stacked, shape, strides)
array([[ 0,  9, 18, 27, 36, 45, 54, 63],
       [ 8, 17, 26, 35, 44, 53, 62,  0],
       [16, 25, 34, 43, 52, 61,  0,  0],
       [24, 33, 42, 51, 60,  0,  0,  0],
       [32, 41, 50, 59,  0,  0,  0,  0],
       [40, 49, 58,  0,  0,  0,  0,  0],
       [48, 57,  0,  0,  0,  0,  0,  0],
       [56,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  7],
       [ 0,  0,  0,  0,  0,  0,  6, 15],
       [ 0,  0,  0,  0,  0,  5, 14, 23],
       [ 0,  0,  0,  0,  4, 13, 22, 31],
       [ 0,  0,  0,  3, 12, 21, 30, 39],
       [ 0,  0,  2, 11, 20, 29, 38, 47],
       [ 0,  1, 10, 19, 28, 37, 46, 55]])
>>> diags = numpy.lib.stride_tricks.as_strided(stacked, shape, strides)
>>> diags.sum(axis=1)
array([252, 245, 231, 210, 182, 147, 105,  56,   7,  21,  42,  70, 105,
       147, 196])

Of course, I have no idea how fast this will actually be. But I bet it will be faster than a Python list comprehension.

For convenience, here’s a fully general diagonals function. It assumes you want to move the diagonal along the longest axis:

def diagonals(a):
    rows, cols = a.shape
    if cols > rows:
        a = a.T
        rows, cols = a.shape
    fill = numpy.zeros(((cols - 1), cols), dtype=a.dtype)
    stacked = numpy.vstack((a, fill, a))
    major_stride, minor_stride = stacked.strides
    strides = major_stride, minor_stride * (cols + 1)
    shape = (rows + cols - 1, cols)
    return numpy.lib.stride_tricks.as_strided(stacked, shape, strides)

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have some board numpy arrays like that: array([[0, 0, 0, 1, 0, 0,

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply