Matrix decomposition, also known as matrix factorization, involves describing a given matrix using its constituent elements.
Perhaps the best known and most widely used matrix decomposition method is the Singular-Value Decomposition, or SVD.
Every matrix has an SVD, which makes it more stable than other methods, such as the eigendecomposition, which only exists for certain square matrices.
As such, it is often used in a wide array of applications including compression, denoising, and data reduction.
This tutorial is divided into 5 parts; they are:
- What is the Singular-Value Decomposition
- Calculate Singular-Value Decomposition
- Reconstruct Matrix
- Pseudoinverse
- Dimensionality Reduction
A. What is the Singular-Value Decomposition
The Singular-Value Decomposition, or SVD for short, is a matrix decomposition method for reducing a matrix to its constituent parts in order to make certain subsequent matrix calculations simpler.
For simplicity, we will focus on the SVD for real-valued matrices and ignore the case of complex numbers.
A = U · Σ · V^T
Where A is the real m × n matrix that we wish to decompose, U is an m × m matrix, Σ (represented by the uppercase Greek letter sigma) is an m × n diagonal matrix, and V^T is the transpose of an n × n matrix V.
The diagonal values in the Σ matrix are known as the singular values of the original matrix A.
The columns of the U matrix are called the left-singular vectors of A, and the columns of V are called the right-singular vectors of A.
The SVD is calculated via iterative numerical methods.
The singular value decomposition (SVD) provides another way to factorize a matrix, into singular vectors and singular values. The SVD allows us to discover some of the same kind of information as the eigendecomposition. However, the SVD is more generally applicable.
The singular value decomposition (SVD) has numerous applications in statistics, machine learning, and computer science. Applying the SVD to a matrix is like looking inside it with X-ray vision...
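The snippet below is a small illustrative check (not part of the worked examples that follow) of this connection to the eigendecomposition: the singular values of a matrix equal the square roots of the eigenvalues of A^T · A. The example matrix and the use of numpy.linalg.eig are assumptions chosen purely for illustration.
# sketch: singular values vs. eigenvalues of A^T A (illustrative check)
from numpy import array
from numpy import sort
from numpy import sqrt
from numpy.linalg import eig
from scipy.linalg import svd
# small example matrix (chosen arbitrarily for this check)
A = array([
[1, 2],
[3, 4],
[5, 6]])
# singular values of A
_, s, _ = svd(A)
print(s)
# square roots of the eigenvalues of A^T A, sorted largest first,
# should match the singular values above
values, _ = eig(A.T.dot(A))
print(sqrt(sort(values)[::-1]))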
B. Calculate Singular-Value Decomposition
# singular-value decomposition
from numpy import array
from scipy.linalg import svd
# define a 3 x 2 matrix
A = array([
[1, 2],
[3, 4],
[5, 6]])
print(A)
# factorize: svd() returns U, the singular values s, and V transpose
U, s, VT = svd(A)
print(U)
print(s)
print(VT)
-----Result-----
[[1 2]
[3 4]
[5 6]]
[[-0.2298477 0.88346102 0.40824829]
[-0.52474482 0.24078249 -0.81649658]
[-0.81964194 -0.40189603 0.40824829]]
[ 9.52551809 0.51430058]
[[-0.61962948 -0.78489445]
[-0.78489445 0.61962948]]
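Note that NumPy also provides an svd() function in numpy.linalg with the same return values. The snippet below is a minimal sketch of the same factorization using that function, assuming the same example matrix as above.
# sketch: the same factorization via numpy.linalg.svd
from numpy import array
from numpy.linalg import svd
A = array([
[1, 2],
[3, 4],
[5, 6]])
# numpy.linalg.svd also returns U, the singular values, and V transpose
U, s, VT = svd(A)
print(U)
print(s)
print(VT)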
C. Reconstruct Matrix
The original matrix can be reconstructed from the U, Σ, and V^T elements. The U, s, and V^T elements returned from svd() cannot be multiplied directly.
The s vector must be converted into a diagonal matrix using the diag() function. By default, this function will create a square matrix that is n × n, relative to our original matrix.
This causes a problem as the size of the matrices do not fit the rules of matrix multiplication, where the number of columns in a matrix must match the number of rows in the subsequent matrix.
After creating the square Σ diagonal matrix, the sizes of the matrices are relative to the original m × n matrix that we are decomposing, as follows:
U (m × m) · Σ (n × n) · V^T (n × n)
Where, in fact, we require:
U (m × m) · Σ (m × n) · V^T (n × n)
We can achieve this by creating a new Σ matrix of all zero values that is m × n (i.e. more rows) and populating the first n × n part of the matrix with the square diagonal matrix calculated via diag().
# reconstruct rectangular matrix from svd
from numpy import array
from numpy import diag
from numpy import zeros
from scipy.linalg import svd
# define a 3 x 2 matrix
A = array([
[1, 2],
[3, 4],
[5, 6]])
print(A)
# factorize: svd() returns U, the singular values s, and V transpose
U, s, VT = svd(A)
# create m x n Sigma matrix of zeros
Sigma = zeros((A.shape[0], A.shape[1]))
# populate Sigma with the n x n diagonal matrix of singular values
Sigma[:A.shape[1], :A.shape[1]] = diag(s)
# reconstruct matrix
B = U.dot(Sigma.dot(VT))
print(B)
-----Result-----
[[1 2]
[3 4]
[5 6]]
[[1. 2.]
[ 3. 4.]
[ 5. 6.]]
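As a quick check (a sketch, assuming the variables A and B from the reconstruction listing above are still defined), NumPy's allclose() can confirm that the reconstruction matches the original matrix:
# verify the reconstruction numerically
from numpy import allclose
print(allclose(A, B))  # expected: True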
D. Pseudoinverse
Matrix inversion is not defined for matrices that are not square.
When A has more columns than rows, then solving a linear equation using the pseudoinverse provides one of the many possible solutions.
The pseudoinverse is denoted as A^+, where A is the matrix that is being inverted and + is a superscript.
The pseudoinverse is calculated using the singular value decomposition of A:
A^+ = V · D^+ · U^T
Where A^+ is the pseudoinverse, D^+ is the pseudoinverse of the diagonal matrix Σ, and U^T is the transpose of U.
We can get U and V from the SVD operation:
A = U · Σ · V^T
D^+ can be calculated by creating a diagonal matrix from Σ, calculating the reciprocal of each non-zero element, and taking the transpose if the original matrix was rectangular.
The pseudoinverse provides one way of solving the linear regression equation, specifically when there are more rows than there are columns, which is often the case.
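The pseudoinverse can also be computed manually via the SVD using the formula above. The listing below is a minimal sketch of this, assuming the same 4 × 2 matrix used in the next example and only the NumPy/SciPy functions already used in this tutorial.
# sketch: pseudoinverse via the svd, A+ = V . D+ . U^T
from numpy import array
from numpy import diag
from numpy import zeros
from scipy.linalg import svd
# define matrix
A = array([
[0.1, 0.2],
[0.3, 0.4],
[0.5, 0.6],
[0.7, 0.8]])
# factorize
U, s, VT = svd(A)
# reciprocals of the non-zero singular values
d = 1.0 / s
# create m x n D matrix of zeros and place the reciprocals on the diagonal
D = zeros(A.shape)
D[:A.shape[1], :A.shape[1]] = diag(d)
# A+ = V . D+ . U^T, where D+ is the transpose of the rectangular D
B = VT.T.dot(D.T).dot(U.T)
print(B)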
# pseudoinverse
from numpy import array
from numpy.linalg import pinv
# define matrix
A = array([
[0.1, 0.2],
[0.3, 0.4],
[0.5, 0.6],
[0.7, 0.8]])
print(A)
# calculate pseudoinverse
B = pinv(A)
print(B)
-----Result-----
[[ 0.1 0.2]
[ 0.3 0.4]
[ 0.5 0.6]
[ 0.7 0.8]]
[[ -1.00000000e+01 -5.00000000e+00 9.04289323e-15 5.00000000e+00]
[ 8.50000000e+00 4.50000000e+00 5.00000000e-01 -3.50000000e+00]]
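Because the pseudoinverse gives the least-squares solution to an overdetermined system, it offers a direct way to fit the linear regression mentioned above. The sketch below is a hypothetical illustration: the data matrix X and targets y are made-up values, and the coefficients are obtained as pinv(X) · y.
# sketch: least-squares linear regression via the pseudoinverse
from numpy import array
from numpy.linalg import pinv
# made-up data: 4 observations, 2 input features
X = array([
[0.1, 0.2],
[0.3, 0.4],
[0.5, 0.6],
[0.7, 0.8]])
# made-up target values
y = array([0.5, 1.1, 1.7, 2.3])
# least-squares coefficients: b = X+ . y
b = pinv(X).dot(y)
print(b)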
E. Dimensionality Reduction
A popular application of SVD is for dimensionality reduction.
Data with a large number of features, such as more features (columns) than observations (rows), may be reduced to a smaller subset of features that are most relevant to the prediction problem.
The result is a matrix with a lower rank that is said to approximate the original matrix.
To do this, we can perform an SVD operation on the original data and select the top k largest singular values in Σ. These define a smaller Σ, together with the corresponding rows of V^T.
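A minimal sketch of this manual selection is shown below, assuming the same 3 × 10 example matrix used in the scikit-learn listing and keeping the two largest singular values.
# sketch: manual data reduction with the svd, keeping the top k singular values
from numpy import array
from numpy import diag
from numpy import zeros
from scipy.linalg import svd
# define a 3 x 10 matrix
A = array([
[1,2,3,4,5,6,7,8,9,10],
[11,12,13,14,15,16,17,18,19,20],
[21,22,23,24,25,26,27,28,29,30]])
# factorize
U, s, VT = svd(A)
# create m x n Sigma matrix and populate its diagonal
Sigma = zeros((A.shape[0], A.shape[1]))
Sigma[:A.shape[0], :A.shape[0]] = diag(s)
# select the top k largest singular values
k = 2
Sigma = Sigma[:, :k]
VT = VT[:k, :]
# reconstruct a rank-k approximation of the original matrix
B = U.dot(Sigma.dot(VT))
print(B)
# dense reduced representation of the data (matches the scikit-learn transform below up to sign)
T = U.dot(Sigma)
print(T)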
The scikit-learn library provides a TruncatedSVD class that implements this capability directly.
# svd data reduction in scikit-learn
from numpy import array
from sklearn.decomposition import TruncatedSVD
# define matrix
A = array([
[1,2,3,4,5,6,7,8,9,10],
[11,12,13,14,15,16,17,18,19,20],
[21,22,23,24,25,26,27,28,29,30]])
print(A)
# create transform
svd = TruncatedSVD(n_components=2)
# fit transform
svd.fit(A)
# apply transform
result = svd.transform(A)
print(result)
-----Result-----
[[ 1 2 3 4 5 6 7 8 9 10]
[11 12 13 14 15 16 17 18 19 20]
[21 22 23 24 25 26 27 28 29 30]]
[[ 18.52157747 6.47697214]
[ 49.81310011 1.91182038]
[ 81.10462276 -2.65333138]]