Given any n x n matrix of real coefficients A, we can define a bilinear form b_A : R^n x R^n → R by
b_A(x, y) = x^T A y,
and a quadratic form q_A : R^n → R by
q_A(x) = b_A(x, x) = x^T A x.
(For most common applications of quadratic forms q_A, the matrix A is symmetric, or even symmetric positive definite, so feel free to assume that either one of these is the case, if it matters for your answer.)
(Also, FWIW, b_I and q_I (where I is the n x n identity matrix) are, respectively, the standard inner product and the squared L2-norm on R^n, i.e. x^T y and x^T x.)
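For concreteness, here is a minimal numpy sketch of these definitions for single vectors. The function names b and q and the sample vectors are purely illustrative assumptions, not part of any library.

```python
import numpy as np

def b(A, x, y):
    """Bilinear form b_A(x, y) = x^T A y for 1-D arrays x and y."""
    return x @ A @ y

def q(A, x):
    """Quadratic form q_A(x) = b_A(x, x) = x^T A x."""
    return b(A, x, x)

# Illustrative vectors in R^2 (arbitrary choice).
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
I = np.eye(2)

inner = b(I, x, y)    # standard inner product x^T y = 1*3 + 2*(-1) = 1
sq_norm = q(I, x)     # squared L2-norm x^T x = 1 + 4 = 5
```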
Now suppose I have two n x m matrices, X and Y, and an n x n matrix A. I would like to optimize the computation of both b_A(x_i, y_i) and q_A(x_i) (where x_i and y_i denote the i-th column of X and Y, respectively), and I surmise that, at least in some environments like numpy, R, or Matlab, this will involve some form of vectorization.
The only solution I can think of requires generating block-diagonal matrices [X], [Y] and [A], with dimensions mn x m, mn x m, and mn x mn, respectively, and with (block) diagonal elements x_i, y_i, and A, respectively. Then the desired values would be the diagonal entries of the matrix products [X]^T [A] [Y] and [X]^T [A] [X]. This strategy is most definitely uninspired, but if there is a way to do it that is efficient in terms of both time and space, I’d like to see it. (It goes without saying that any implementation of it that does not exploit the sparsity of these block matrices would be doomed.)
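To make the block-diagonal idea concrete, here is a rough sketch that assumes scipy.sparse is available (the question itself only mentions numpy). The sizes, the random data, and the names bX, bY and bA are arbitrary illustrative choices.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
n, m = 3, 4                       # arbitrary small sizes for illustration
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, m))
Y = rng.standard_normal((n, m))

# [X] and [Y]: (m*n) x m, where column i holds x_i (or y_i) in the i-th block of n rows.
bX = sparse.block_diag([X[:, [i]] for i in range(m)], format="csr")
bY = sparse.block_diag([Y[:, [i]] for i in range(m)], format="csr")

# [A]: (m*n) x (m*n), with m copies of A on the block diagonal.
bA = sparse.block_diag([A] * m, format="csr")

# The m x m products are diagonal; their diagonals hold the desired values.
b_vals = (bX.T @ bA @ bY).diagonal()   # b_A(x_i, y_i) for each i
q_vals = (bX.T @ bA @ bX).diagonal()   # q_A(x_i) for each i
```

Even in sparse form, [A] stores m copies of A, so this costs noticeably more memory than working with X, Y and A directly.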
Is there a better approach?
My preferred system for doing this is numpy, but answers in terms of some other system that supports efficient matrix computations, such as R or Matlab, may be OK too (assuming that I can figure out how to port them to numpy).
Thanks!
Of course, computing the products X^T A Y and X^T A X would compute the desired b_A(x_i, y_i) and q_A(x_i) (as the diagonal elements of the resulting m x m matrices), along with the O(m^2) irrelevant b_A(x_i, y_j) and b_A(x_i, x_j) (for i ≠ j), so this is a non-starter.
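A small sketch of that naive approach shows the waste. The sizes and random data are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 5                       # arbitrary small sizes for illustration
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, m))
Y = rng.standard_normal((n, m))

full = X.T @ A @ Y                # m x m: every b_A(x_i, y_j), wanted or not
b_vals = np.diag(full)            # only these m diagonal entries are needed

wasted = full.size - b_vals.size  # m*m - m off-diagonal entries computed for nothing
```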
Here’s a solution in numpy that should give you what you’re looking for:
This does a matrix multiplication for X^T A, then does an element-by-element array multiplication by Y^T. Each row of the resulting array is then summed to yield a 1-D array of length m, whose i-th entry is b_A(x_i, y_i).
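The steps described above can be sketched as follows. This is a reconstruction from the description rather than a verbatim snippet. The function names bilinear_columns and quadratic_columns are hypothetical, and the einsum line is an alternative formulation included only as a cross-check.

```python
import numpy as np

def bilinear_columns(X, A, Y):
    """Return the 1-D array of b_A(x_i, y_i) for the columns x_i of X and y_i of Y."""
    # (X^T A) is m x n; multiplying elementwise by Y^T (also m x n) and
    # summing each row gives x_i^T A y_i in position i.
    return (np.dot(X.T, A) * Y.T).sum(axis=1)

def quadratic_columns(X, A):
    """Return the 1-D array of q_A(x_i) = x_i^T A x_i for the columns x_i of X."""
    return bilinear_columns(X, A, X)

rng = np.random.default_rng(0)
n, m = 4, 6                       # arbitrary small sizes for illustration
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, m))
Y = rng.standard_normal((n, m))

b_vals = bilinear_columns(X, A, Y)
q_vals = quadratic_columns(X, A)

# Alternative (not the approach described above): the same sums via np.einsum.
b_einsum = np.einsum("ji,jk,ki->i", X, A, Y)
```

Apart from the inputs, this only allocates two temporary m x n arrays, so the m x m product is never formed.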