Vector Norms
5 minute read
In this section, we will learn about vector norms.
Vector Norm:
It is a measure of the size of a vector, or its distance from the origin.
A vector norm is a function that maps a vector to a real number, i.e., \({\| \cdot \|} : \mathbb{R}^n \rightarrow \mathbb{R}\).
A vector norm should satisfy the following 3 properties:
- Non-Negativity:
Norm is always greater than or equal to zero,
\( {\| x \|} \ge 0\), and \( {\| x \|} = 0\) if and only if \(\vec{x} = \vec{0}\).
- Homogeneity (or Scaling):
\( {\| \alpha x \|} = |\alpha| {\| x \|} \).
- Triangle Inequality:
\( {\| x + y \|} \le {\| x \|} + {\| y \|} \).
P-Norm:
It is a generalised form of the most common family of vector norms.
It is defined as:
\( {\| x \|_p} = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p} \), for \(p \ge 1\).
We can change the value of \(p\) to get different norms.
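As a quick sketch, the general p-norm can be computed directly and checked against NumPy's `np.linalg.norm` (assuming NumPy is available; the helper name `p_norm` below is just illustrative):

```python
import numpy as np

def p_norm(x, p):
    # General p-norm: (sum of |x_i|^p) raised to the power 1/p
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0])
print(p_norm(x, 1))  # 7.0 (L1 norm)
print(p_norm(x, 2))  # 5.0 (L2 norm)

# Agrees with NumPy's built-in norm for any p >= 1
print(np.isclose(p_norm(x, 3), np.linalg.norm(x, 3)))  # True
```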
L1 Norm:
It is the sum of absolute values of all the elements of a vector; also known as Manhattan distance.
p=1:
\( {\| x \|_1} = \sum_{i=1}^{n} |x_i| \)
L2 Norm:
It is the square root of the sum of squares of all the elements of a vector; also known as Euclidean distance.
p=2:
\( {\| x \|_2} = \sqrt{ \sum_{i=1}^{n} x_i^2 } \)
L-\(\infty\) Norm:
It is the maximum of absolute values of all the elements of a vector; also known as Chebyshev distance.
p=\(\infty\):
\( {\| x \|_\infty} = \max_{i} |x_i| \)
- Let, vector \(\mathbf{x} = \begin{bmatrix} 3 \\ \\ -4 \end{bmatrix}\), then
\({\| x \|_1} = |3| + |-4| = 7\)
\({\| x \|_2} = \sqrt{3^2 + (-4)^2} = \sqrt{25} = 5\)
\({\| x \|_\infty} = \max(|3|, |-4|) = \max(3, 4) = 4\)
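The three norms computed above can be verified with NumPy (a sketch, assuming NumPy is available):

```python
import numpy as np

# The vector from the example above
x = np.array([3, -4])

l1 = np.linalg.norm(x, 1)         # sum of absolute values
l2 = np.linalg.norm(x, 2)         # Euclidean length
linf = np.linalg.norm(x, np.inf)  # largest absolute entry

print(l1, l2, linf)  # 7.0 5.0 4.0
```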
Matrix Norm:
It is a function that assigns non-negative size or magnitude to a matrix.
A matrix norm is a function that maps a matrix to a non-negative real number, i.e., \({\| \cdot \|} : \mathbb{R}^{m \times n} \rightarrow \mathbb{R}\).
It should satisfy the following 3 properties:
- Non-Negativity:
Norm is always greater than or equal to zero,
\( {\| A \|} \ge 0\), and \( {\| A \|} = 0\) if and only if \(A\) is the zero matrix.
- Homogeneity (or Scaling):
\( {\| \alpha A \|} = |\alpha| {\| A \|} \).
- Triangle Inequality:
\( {\| A + B \|} \le {\| A \|} + {\| B \|} \).
There are 2 types of matrix norms:
- Element-wise norms, e.g., Frobenius norm
- Vector-induced norms
Frobenius Norm:
It is equivalent to the Euclidean norm of the matrix if it were flattened into a single vector.
If A is a matrix of size \(m \times n\), then the Frobenius norm is defined as:
\( {\| A \|_F} = \sqrt{ \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 } = \sqrt{Trace(\mathbf{A^TA})} = \sqrt{ \sum_{i} \sigma_i^2 } \)
where \(\sigma_i\) is the \(i\)th singular value of matrix A.
Vector Induced Norm:
It measures the maximum stretching a matrix can apply when multiplied with a vector,
where the vector has a unit length under the chosen vector norm.
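The "maximum stretching" picture can be sketched numerically: sample many unit vectors, apply the matrix, and take the largest resulting length (the matrix here is an arbitrary example; assuming NumPy is available):

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # example matrix

# Sample many random unit vectors (unit Euclidean norm), apply A,
# and record how much each one gets stretched; the supremum over
# all unit vectors is the induced 2-norm of A.
v = rng.normal(size=(10_000, 2))
v /= np.linalg.norm(v, axis=1, keepdims=True)
stretch = np.linalg.norm(v @ A.T, axis=1)

print(stretch.max())  # approaches 3.0, the true induced 2-norm of this A
```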
Matrix Norm Induced by Vector P-Norm:
P-Norm:
\( {\| A \|_p} = \max_{x \ne 0} \frac{ {\| Ax \|_p} }{ {\| x \|_p} } \)
P=1 Norm:
\( {\| A \|_1} = \max_{j} \sum_{i=1}^{m} |a_{ij}| \), i.e., the maximum absolute column sum.
P=\(\infty\) Norm:
\( {\| A \|_\infty} = \max_{i} \sum_{j=1}^{n} |a_{ij}| \), i.e., the maximum absolute row sum.
P=2 Norm:
\( {\| A \|_2} = \sigma_{max}(A) \)
Also called the Spectral norm, i.e., the maximum factor by which the matrix can stretch a unit vector in the Euclidean norm.
For example:
Let, matrix \(\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ \\ a_{21} & a_{22} \end{bmatrix}\), then find Frobenius norm.
\({\| A \|_F} = \sqrt{a_{11}^2 + a_{12}^2 + a_{21}^2 + a_{22}^2}\)
\(\mathbf{A}^T = \begin{bmatrix} a_{11} & a_{21} \\ \\ a_{12} & a_{22} \end{bmatrix}\)
=> \(\mathbf{A^TA} = \begin{bmatrix} a_{11} & a_{21} \\ \\ a_{12} & a_{22} \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} a_{11}^2 + a_{21}^2 & a_{11} a_{12} + a_{21} a_{22} \\ \\ a_{12} a_{11} + a_{22} a_{21} & a_{12}^2 + a_{22}^2 \end{bmatrix} \)
Therefore, \(Trace(\mathbf{A^TA}) = a_{11}^2 + a_{12}^2 + a_{21}^2 + a_{22}^2\)
=> \({\| A \|_F} = \sqrt{Trace(\mathbf{A^TA})} = \sqrt{a_{11}^2 + a_{12}^2 + a_{21}^2 + a_{22}^2}\)
- Let, matrix \(\mathbf{A} = \begin{bmatrix} 1 & -2 & 3 \\ \\ 4 & 5 & -6 \end{bmatrix}\), then
Column 1 absolute value sum = |1|+|4| = 5
Column 2 absolute value sum = |-2|+|5|= 7
Column 3 absolute value sum = |3|+|-6|= 9
Row 1 absolute value sum = |1|+|-2|+|3| = 6
Row 2 absolute value sum = |4|+|5|+|-6| = 15
\({\| A \|_1} = max(5,7,9) = 9\) = max column absolute value sum.
\({\| A \|_\infty} = max(6,15) = 15\) = max row absolute value sum.
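NumPy's `np.linalg.norm` implements these matrix norms directly, which lets us check the column and row sums above (a sketch, assuming NumPy is available):

```python
import numpy as np

A = np.array([[1, -2, 3],
              [4, 5, -6]])

one_norm = np.linalg.norm(A, 1)       # max absolute column sum
inf_norm = np.linalg.norm(A, np.inf)  # max absolute row sum

print(one_norm, inf_norm)  # 9.0 15.0
```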
- Let, matrix \(\mathbf{A} = \begin{bmatrix} 2 & 1 \\ \\ 1 & 2 \end{bmatrix}\), then find spectral norm.
Spectral norm can be found using the singular value decomposition, in order to get the largest singular value.
\({\| A \|_2} = \sigma_{max}(A) \)
\(\mathbf{A} = \mathbf{U \Sigma V^T} \), where the columns of \(\mathbf{U}\) are the eigenvectors of \(\mathbf{AA^T}\), the columns of \(\mathbf{V}\) are the eigenvectors of \(\mathbf{A^TA}\), and \(\mathbf{\Sigma}\) holds the singular values on its diagonal.
Let's find the largest eigenvalue of \(\mathbf{A^TA}\), the square root of which gives the largest singular value of \(\mathbf{A}\).
\( \mathbf{A^TA} =
\begin{bmatrix} 2 & 1 \\ \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 2 & 1 \\ \\ 1 & 2 \end{bmatrix}
= \begin{bmatrix} 5 & 4 \\ \\ 4 & 5 \end{bmatrix}
\)
Now, let's find the eigenvalues of \(\mathbf{A^TA}\):
\(det(\mathbf{A^TA}-\lambda I) = 0 \)
=> \(
det\begin{bmatrix} 5 - \lambda & 4 \\ \\ 4 & 5- \lambda \end{bmatrix} = 0
\)
=> \( (5 - \lambda)^2 - 16 = 0 \)
=> \( (5 - \lambda) = \pm 4 \)
=> \( (5 - \lambda) = 4 \) or \( (5 - \lambda) = -4 \)
=> \( \lambda = 1 \) or \( \lambda = 9 \)
=> Largest singular value = square root of the largest eigenvalue = \(\sqrt{9} = 3\)
Therefore, \({\| A \|_2} = \sigma_{max}(A) = 3\)
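The same answer can be cross-checked numerically (a sketch, assuming NumPy is available): `np.linalg.svd` returns the singular values, and the eigenvalues of \(\mathbf{A^TA}\) confirm the hand derivation above.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Spectral norm = largest singular value of A
sigma = np.linalg.svd(A, compute_uv=False)
spectral = sigma.max()

# Cross-check: square root of the largest eigenvalue of A^T A
# (eigvalsh is appropriate because A^T A is symmetric)
eigvals = np.linalg.eigvalsh(A.T @ A)

print(spectral)                # 3.0 (up to floating-point error)
print(np.sqrt(eigvals.max()))  # 3.0 as well
```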
End of Section