[An earlier version of this post got absolutely no response, so, in case this

Asked: May 28, 2026

[An earlier version of this post got absolutely no response, so, in case this was due to a lack of clarity, I’ve reworked it, with additional explanations and code comments.]

I want to compute a mean and standard deviation over the elements of an n-dimensional numpy array that span not a single axis but k > 1 (not necessarily consecutive) axes, and collect the results in a new (n – k + 1)-dimensional array.

Does numpy include standard constructs to perform this operation efficiently?

The function mu_sigma copied below is my best attempt at solving this problem, but it has two glaring inefficiencies: 1) it requires making a copy of the original data; 2) it computes the mean twice (since the computation of the standard deviation requires computing the mean).
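To make the first inefficiency concrete, here is a minimal sketch (the small array and the names is_view and is_copy are made up for illustration). It assumes a numpy version that provides np.shares_memory. It suggests that the transpose step is only a view, while the reshape that collapses the moved axes allocates a new buffer.

```python
import numpy as np

# Hypothetical input with the same shape as the example used in this post.
box = np.arange(4 * 2 * 4 * 3 * 4, dtype=float).reshape(4, 2, 4, 3, 4)

# Moving axes 1 and 3 to the end only rearranges strides: still a view.
permuted = box.transpose(0, 2, 4, 1, 3)
is_view = np.shares_memory(permuted, box)

# The last two axes of the view are not contiguous with each other in memory,
# so merging them into one axis forces reshape to copy the data.
reshaped = permuted.reshape(4, 4, 4, -1)
is_copy = not np.shares_memory(reshaped, box)

print(is_view, is_copy)
```

On a C-contiguous input like this one, both flags should come out True.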

The mu_sigma function takes two arguments: box and axes. box is an n-dimensional numpy array (aka “ndarray”), and axes is a k-tuple of integers, representing (not necessarily consecutive) dimensions of box. The function returns a new (n – k + 1)-dimensional ndarray containing the mean and standard deviation computed over the “hyperslabs” of box represented by the k specified axes.

The code below also includes an example of mu_sigma in action. In this example, the box argument is a 4 x 2 x 4 x 3 x 4 ndarray of floating-point numbers, and the axes argument is the tuple (1, 3). (Hence, we have n == len(box.shape) == 5, and k == len(axes) == 2.) The result (which here I’ll call outbox) returned for this sample input is a 4 x 4 x 4 x 2 ndarray of floating-point numbers. For each triplet of indices i, j, k (where each index ranges over the set {0, 1, 2, 3}), the element outbox[i, j, k, 0] is the mean of the 6 elements specified by the numpy expression box[i, 0:2, j, 0:3, k]. Similarly, outbox[i, j, k, 1] is the standard deviation of the same 6 elements. This means that the first n – k == 3 dimensions of the result range over the same indices as do the n – k non-axes dimensions of the input ndarray box, which in this case are dimensions 0, 2 and 4.
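As a small worked illustration of that index correspondence, the sketch below picks one arbitrary triplet (i, j, k) and computes directly, by slicing, the two values that outbox[i, j, k, 0] and outbox[i, j, k, 1] are expected to hold. The names slab, expected_mean and expected_std are only for illustration.

```python
import numpy as np

inshape = (4, 2, 4, 3, 4)
box = np.arange(np.prod(inshape), dtype=float).reshape(inshape)

# An arbitrary choice of indices into the non-axes dimensions 0, 2 and 4.
i, j, k = 1, 2, 3

# The 2 x 3 "hyperslab" spanned by axes 1 and 3 at that position.
slab = box[i, 0:2, j, 0:3, k]

expected_mean = slab.mean()  # what outbox[i, j, k, 0] should contain
expected_std = slab.std()    # what outbox[i, j, k, 1] should contain

print(slab.shape, expected_mean, expected_std)
```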

The strategy used in mu_sigma is to

1. permute the dimensions (using the transpose method) so that the axes specified in the function’s second argument are all put at the end; the remaining (non-axes) dimensions are left at the beginning (in their original ordering);
2. collapse the axes dimensions into one (by using the reshape method); the new “collapsed” dimension is now the last dimension of the reshaped ndarray;
3. compute an ndarray of the means using the last “collapsed” dimension as axis;
4. compute an ndarray of the standard deviations using the last “collapsed” dimension as axis;
5. return an ndarray obtained from concatenating the ndarrays produced in (3) and (4).

import numpy as np

def mu_sigma(box, axes):
    inshape = box.shape

    # determine the permutation needed to put all the dimensions given in axes
    # at the end (otherwise preserving the relative ordering of the dimensions)
    nonaxes = tuple([i for i in range(len(inshape)) if i not in set(axes)])

    # permute the dimensions
    permuted = box.transpose(nonaxes + axes)

    # determine the shape of the ndarray after permuting the dimensions and
    # collapsing the axes-dimensions; thanks to Bago for the "+ (-1,)"
    newshape = tuple(inshape[i] for i in nonaxes) + (-1,)

    # collapse the axes-dimensions
    # NB: the next line results in copying the input array
    reshaped = permuted.reshape(newshape)

    # determine the shape for the mean and std ndarrays, as required by
    # the subsequent call to np.concatenate (this reshaping is not necessary
    # if the available mean and std methods support the keepdims keyword;
    # instead, just set keepdims to True in both calls).
    outshape = newshape[:-1] + (1,)

    # compute the means and standard deviations
    mean = reshaped.mean(axis=-1).reshape(outshape)
    std = reshaped.std(axis=-1).reshape(outshape)

    # collect the results in a single ndarray, and return it
    return np.concatenate((mean, std), axis=-1)

inshape = 4, 2, 4, 3, 4
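# np.arange works on both Python 2 and 3 (np.array(map(...)) does not build a
# float array on Python 3, and np.product is gone from recent numpy versions)
inbuf = np.arange(np.prod(inshape), dtype=float)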
inbox = np.ndarray(inshape, buffer=inbuf)
outbox = mu_sigma(inbox, tuple(range(len(inshape))[1::2]))

# "inline tests"
assert all(outbox[..., 1].ravel() ==
           [inbox[0, :, 0, :, 0].std()] * outbox[..., 1].size)
assert all(outbox[..., 0].ravel() == [float(4*(v + 3*w) + x)
                                      for v in [8*y - 1
                                                for y in [3*z + 1
                                                          for z in range(4)]]
                                      for w in range(4)
                                      for x in range(4)])
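For the second inefficiency, one possible direction is sketched below. The function name mu_sigma_once is hypothetical, and the sketch assumes mean supports the keepdims keyword. It computes the mean once and derives the standard deviation from it, and keepdims removes the need for the extra outshape reshape. It still copies the data in the reshape step and allocates a temporary for the squared deviations, so I can't claim it is actually faster.

```python
import numpy as np

def mu_sigma_once(box, axes):
    """Hypothetical variant of mu_sigma that computes the mean only once."""
    axes = tuple(axes)
    nonaxes = tuple(i for i in range(box.ndim) if i not in axes)

    # Same permute-and-collapse step as before (this still copies the data).
    newshape = tuple(box.shape[i] for i in nonaxes) + (-1,)
    reshaped = box.transpose(nonaxes + axes).reshape(newshape)

    # keepdims=True leaves a trailing axis of length 1, ready for concatenate.
    mean = reshaped.mean(axis=-1, keepdims=True)

    # Population standard deviation (ddof=0, the default of ndarray.std),
    # computed from the mean obtained above instead of recomputing it.
    std = np.sqrt(((reshaped - mean) ** 2).mean(axis=-1, keepdims=True))

    return np.concatenate((mean, std), axis=-1)

box = np.arange(4 * 2 * 4 * 3 * 4, dtype=float).reshape(4, 2, 4, 3, 4)
out = mu_sigma_once(box, (1, 3))
print(out.shape)
```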