Suppose I have a matrix in the CSR format. What is the most efficient way to set a row (or rows) to zeros?
The following code runs quite slowly:
A = A.tolil()        # convert to LIL, which supports row assignment
A[indices, :] = 0    # zero the selected rows
A = A.tocsr()        # convert back to CSR
I had to convert to scipy.sparse.lil_matrix because the CSR format seems to support neither fancy indexing nor setting values to slices.
I guess scipy just does not implement it, but the CSR format would support this quite well (please read the Wikipedia article on “Sparse matrix” about what indptr, etc. are): the stored values of row i are A.data[A.indptr[i]:A.indptr[i+1]], so they can be set to 0 in place.
Of course, calling eliminate_zeros afterwards also removes from the sparsity pattern any 0s that were set from another place. Whether you want to do that (at this point) really depends on what you are doing, i.e. it might make sense to delay elimination until all other calculations that might add new zeros are done as well, or in some cases you may have 0 values that you want to change again later, so it would be very bad to eliminate them!
You could in principle of course short-circuit eliminate_zeros and prune, but that would be a lot of hassle, and might be even slower (because you won’t do it in C).
Details about eliminate_zeros (and prune)
A sparse matrix generally does not store zero elements; it just stores where the nonzero elements are (roughly, and with various methods).
eliminate_zeros removes all zeros in your matrix from the sparsity pattern (i.e. no value is stored for that position any more, whereas before a value was stored, but it was 0). Eliminating is bad if you want to change a 0 to a different value later on; otherwise, it saves space.
prune would just shrink the stored data arrays when they are longer than necessary. Note that while I first had A.prune() in there, A.eliminate_zeros() already includes prune.
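For illustration, here is a minimal sketch of zeroing rows directly on the CSR arrays and then cleaning up the sparsity pattern. The helper name csr_zero_rows and its eliminate flag are made up for this example (they are not part of scipy), and it assumes non-negative row indices; treat it as one possible way to do it rather than a definitive implementation.

```python
import numpy as np
from scipy.sparse import csr_matrix


def csr_zero_rows(csr, rows, eliminate=True):
    """Set all stored entries of the given rows to 0, in place.

    Hypothetical helper (not a scipy function); assumes non-negative
    row indices.
    """
    for row in rows:
        # The stored values of `row` are data[indptr[row]:indptr[row + 1]].
        csr.data[csr.indptr[row]:csr.indptr[row + 1]] = 0
    if eliminate:
        # Remove the explicit zeros from the sparsity pattern.
        csr.eliminate_zeros()


A = csr_matrix(np.array([[1, 0, 2],
                         [0, 3, 0],
                         [4, 5, 6]], dtype=float))

# Zero rows 0 and 2 but keep the (now zero) entries stored.
csr_zero_rows(A, [0, 2], eliminate=False)
nnz_with_explicit_zeros = A.nnz  # 6: the zeros are still stored

# Now drop them from the sparsity pattern.
A.eliminate_zeros()
nnz_after_eliminate = A.nnz      # 1: only the entry in row 1 remains
```

Passing eliminate=False keeps the zeroed positions in the sparsity pattern, which may be preferable if you plan to write other values there later.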