Economical SVD
Posted in Computing 1 year, 5 months agoPrinciple Component Analysis (PCA) generates a rotation for the covariance matrix of random variables so that it becomes diagonal. You can do this by computing the eigenvalue decomposition of the covariance matrix. An easier method is to compute the singular value decomposition (SVD) of the data itself (which in general is non-square.) The eigenvalues of the covariance are the square of the singular values and the eigenvectors are only off by a constant factor.

Let’s do a quick check of the dimensions: If the original matrix with
the data Y is of the dimension m-by-n, then U is m-by-m, S is
m-by-n and V is n-by-n.
If your data comes from images, then for a 1024×1024 image, the number of rows in Y is on the order of a million. U in this case would be a million by a million. Ouch! Moreover, S is always diagonal and of the same dimensions as the original matrix. In general, this isn’t stored in a sparse fashion.
With PCA, you’re only interested in the first few principle components anyways, so there’s no point in computing all of the basis vectors when you don’t need them.
Enter Economical SVD.
I first came across this while reading the documentation for
VXL. Using the “economy” mode for the SVD only computes the first n
eigenvectors and eigenvalues. This results in U being m-by-n, S
being n-by-n and V being n-by-n. A huge saving in memory and
computation time!
In languages such as Matlab, calling this routine is as simple as sending a zero in the second argument. Wikipedia has a whole section on reduced SVDs.
