Notes of Andrew Ng’s Machine Learning —— (2) Linear Algebra Review
Matrices and Vectors
- Matrices are 2-dimensional arrays:
$$\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \\ j & k & l \end{bmatrix}$$
The above matrix has four rows and three columns, so it is a 4 x 3 matrix.
- Vectors are matrices with one column and many rows:
$$\begin{bmatrix} w \\ x \\ y \\ z \end{bmatrix}$$
The above vector is a 4 x 1 matrix.
Notation and terms
- $A_{ij}$ refers to the element in the ith row and jth column of matrix A.
- A vector with n rows is referred to as an n-dimensional vector.
- $v_i$ refers to the element in the ith row of the vector.
- In general, all our vectors and matrices will be 1-indexed, which means that indexing starts from 1. Note that this differs from many programming languages, which start from 0.
- Matrices are usually denoted by uppercase names while vectors are lowercase.
- Scalar means that an object is a single value, not a vector or matrix.
- $\mathbb{R}$ refers to the set of scalar real numbers.
- $\mathbb{R}^n$ refers to the set of n-dimensional vectors of real numbers.
in Octave/Matlab
% The ; denotes we are going back to a new row.
A = [1, 2, 3; 4, 5, 6; 7, 8, 9; 10, 11, 12]
% Initialize a vector
v = [1;2;3]
% Get the dimension of the matrix A where m = rows and n = columns
[m,n] = size(A)
% You could also store it this way
dim_A = size(A)
% Get the dimension of the vector v
dim_v = size(v)
% Now let's index into the 2nd row 3rd column of matrix A
A_23 = A(2,3)
Output:
A =
1 2 3
4 5 6
7 8 9
10 11 12
v =
1
2
3
m = 4
n = 3
dim_A =
4 3
dim_v =
3 1
A_23 = 6
Addition and Scalar Multiplication
Addition
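Addition and subtraction are element-wise, so you simply add or subtract each corresponding element:
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} w & x \\ y & z \end{bmatrix} = \begin{bmatrix} a + w & b + x \\ c + y & d + z \end{bmatrix}$$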
To add or subtract two matrices, their dimensions must be the same.
Scalar multiplication
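In scalar multiplication, we simply multiply every element by the scalar value:
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \times x = \begin{bmatrix} ax & bx \\ cx & dx \end{bmatrix}$$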
in Octave/Matlab
% Initialize matrix A and B
A = [1, 2, 4; 5, 3, 2]
B = [1, 3, 4; 1, 1, 1]
% Initialize constant s
s = 2
% See how element-wise addition works
add_AB = A + B
% See how element-wise subtraction works
sub_AB = A - B
% See how scalar multiplication works
mult_As = A * s
% Divide A by s
div_As = A / s
% Matrix + scalar adds the scalar to every element
% (the same as adding a matrix whose elements all equal the scalar)
add_As = A + s
Output:
A =
1 2 4
5 3 2
B =
1 3 4
1 1 1
s = 2
add_AB =
2 5 8
6 4 3
sub_AB =
0 -1 0
4 2 1
mult_As =
2 4 8
10 6 4
div_As =
0.50000 1.00000 2.00000
2.50000 1.50000 1.00000
add_As =
3 4 6
7 5 4
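The requirement that both matrices have the same dimensions can be checked directly. The sketch below is only illustrative: the matrix `C` and the flag `add_failed` are names made up for this example, and the exact error text depends on the Octave/Matlab version.

```octave
% A is 2 x 3 and C is 3 x 2 (C is a hypothetical matrix for this example),
% so the element-wise sum A + C is not defined.
A = [1, 2, 4; 5, 3, 2];
C = [1, 2; 3, 4; 5, 6];

add_failed = false;
try
  bad_sum = A + C;        % the dimensions do not match, so this raises an error
catch err
  add_failed = true;
  disp(err.message)       % the message wording differs between versions
end
add_failed
```

Note that a scalar is the exception shown above: `A + s` works because the scalar is applied to every element.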
Matrix-Vector Multiplication
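We map the column of the vector onto each row of the matrix, multiplying each element and summing the result:
$$\begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix} \times \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} ax + by \\ cx + dy \\ ex + fy \end{bmatrix}$$
The result is a vector. The number of columns of the matrix must equal the number of rows of the vector.
An m x n matrix multiplied by an n x 1 vector results in an m x 1 vector.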
in Octave/Matlab
% Initialize matrix A
A = [1, 2, 3; 4, 5, 6;7, 8, 9]
% Initialize vector v
v = [1; 1; 1]
% Multiply A * v
Av = A * v
Output:
A =
1 2 3
4 5 6
7 8 9
v =
1
1
1
Av =
6
15
24
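To make the "multiply each element and sum the result" rule concrete, here is a small sketch that computes the same product with explicit loops. The names `Av_loop` and `same_result` are introduced only for this illustration.

```octave
A = [1, 2, 3; 4, 5, 6; 7, 8, 9];
v = [1; 1; 1];

[m, n] = size(A);
Av_loop = zeros(m, 1);            % one result entry per row of A
for i = 1:m
  for j = 1:n
    % multiply row i of A element by element with v and accumulate the sum
    Av_loop(i) = Av_loop(i) + A(i, j) * v(j);
  end
end
Av_loop

% Compare with the built-in matrix-vector product
same_result = isequal(Av_loop, A * v)
```

Both versions should agree; the built-in `A * v` is simply shorter and usually faster.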
Neat Trick
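Say we have a set of four house sizes and a hypothesis for predicting the price of a house. We are going to compute $h_\theta(x)$ for each of our 4 houses:
House sizes: $x^{(1)}, x^{(2)}, x^{(3)}, x^{(4)}$
Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$
It turns out there's a neat way of posing this: we can apply the hypothesis to all of the houses at the same time via a matrix-vector multiplication.
- Construct a DataMatrix, with a column of 1's (for the intercept term) followed by a column of house sizes:
$$\text{DataMatrix} = \begin{bmatrix} 1 & x^{(1)} \\ 1 & x^{(2)} \\ 1 & x^{(3)} \\ 1 & x^{(4)} \end{bmatrix}$$
- Put the Parameters into a vector:
$$\text{Parameters} = \begin{bmatrix} \theta_0 \\ \theta_1 \end{bmatrix}$$
- Then the Predictions follow from a single matrix-vector multiplication:
$$\text{Predictions} = \text{DataMatrix} \times \text{Parameters}$$
The result will be something like this:
$$\text{Predictions} = \begin{bmatrix} \theta_0 + \theta_1 x^{(1)} \\ \theta_0 + \theta_1 x^{(2)} \\ \theta_0 + \theta_1 x^{(3)} \\ \theta_0 + \theta_1 x^{(4)} \end{bmatrix} = \begin{bmatrix} h_\theta(x^{(1)}) \\ h_\theta(x^{(2)}) \\ h_\theta(x^{(3)}) \\ h_\theta(x^{(4)}) \end{bmatrix}$$
This is equivalent to the following loop: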
for (int i = 0; i < X.size(); i++) {
    Predictions[i] = h(X[i]);
}
However, the matrix form simplifies the code, makes it more readable, and typically runs faster in most programming languages, since it can rely on optimized linear algebra routines. We just construct a matrix and a vector and do one multiplication:
DataMatrix = [...]
Parameters = [...]
Predictions = DataMatrix * Parameters
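As a runnable sketch of the same idea, the snippet below fills in the elided values with made-up numbers. The house sizes and the parameters (an intercept of -40 and a slope of 0.25) are assumptions chosen only for illustration, as are the names `sizes` and `Predictions_loop`.

```octave
% Hypothetical house sizes and hypothesis h(x) = -40 + 0.25 * x
sizes = [2104; 1416; 1534; 852];

% Each row is [1, size]; the leading 1 multiplies the intercept term
DataMatrix = [ones(length(sizes), 1), sizes];
Parameters = [-40; 0.25];

% One matrix-vector multiplication predicts all four prices
Predictions = DataMatrix * Parameters

% The equivalent loop, for comparison
Predictions_loop = zeros(length(sizes), 1);
for i = 1:length(sizes)
  Predictions_loop(i) = Parameters(1) + Parameters(2) * sizes(i);
end
Predictions_loop
```

With these assumed numbers, both versions give 486, 314, 343.5 and 173.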
Matrix-Matrix Multiplication
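We multiply two matrices by breaking the second matrix into several column vectors, multiplying the first matrix by each of them, and concatenating the results:
$$\begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix} \times \begin{bmatrix} w & x \\ y & z \end{bmatrix} = \begin{bmatrix} aw + by & ax + bz \\ cw + dy & cx + dz \\ ew + fy & ex + fz \end{bmatrix}$$
An m x n matrix multiplied by an n x o matrix results in an m x o matrix. In the above example, a 3 x 2 matrix times a 2 x 2 matrix resulted in a 3 x 2 matrix.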
To multiply two matrices, the number of columns of the first matrix must equal the number of rows of the second matrix.
in Octave/Matlab
A = [1, 2; 3, 4; 5, 6]
B = [7, 8; 9, 10]
A*B
Output:
A =
1 2
3 4
5 6
B =
7 8
9 10
ans =
25 28
57 64
89 100
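The "break it into vector multiplications and concatenate" description can be reproduced step by step. This is a minimal sketch using the same `A` and `B`; the names `col1`, `col2` and `AB_by_columns` are introduced here for illustration.

```octave
A = [1, 2; 3, 4; 5, 6];
B = [7, 8; 9, 10];

% Multiply A by each column of B separately (two matrix-vector products)
col1 = A * B(:, 1);
col2 = A * B(:, 2);

% Concatenate the resulting column vectors side by side
AB_by_columns = [col1, col2]

% Compare with the built-in matrix-matrix product
same_result = isequal(AB_by_columns, A * B)
```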
Neat Trick
Let's say, as before, that we have four houses and we want to predict their prices. Only now, we have three competing hypotheses. We want to apply all three hypotheses to all four house sizes. It turns out we can do that very efficiently using a matrix-matrix multiplication: each column of the parameter matrix holds one hypothesis, and each column of the result holds that hypothesis's predictions.
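A possible sketch of this trick follows. The house sizes and the three sets of parameters are invented for the example and are not taken from the text above; each column of `Parameters` is one hypothesis in the form [intercept; slope].

```octave
% Hypothetical house sizes
sizes = [2104; 1416; 1534; 852];
DataMatrix = [ones(length(sizes), 1), sizes];   % 4 x 2

% Three hypothetical hypotheses, one per column:
%   h1(x) = -40 + 0.25x,  h2(x) = 200 + 0.1x,  h3(x) = -150 + 0.4x
Parameters = [-40, 200, -150; 0.25, 0.1, 0.4];  % 2 x 3

% (4 x 2) * (2 x 3) gives a 4 x 3 matrix:
% row i = house i, column k = prediction of hypothesis k
Predictions = DataMatrix * Parameters
```

Reading the result column by column gives the four predictions of each hypothesis in turn.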
Matrix Multiplication Properties
Non-commutative
Matrix multiplication is not commutative. In general:
$$A \times B \neq B \times A$$
Associative
Matrix multiplication is associative:
$$(A \times B) \times C = A \times (B \times C)$$
Identity matrix
Identity matrix: a matrix that simply has 1's on the diagonal (upper left to lower right) and 0's elsewhere.
The identity matrix, when multiplied by any matrix of compatible dimensions, results in the original matrix. It's just like multiplying numbers by 1.
Notice that when doing A*I, the size of I should match the number of columns of A, and when doing I*A, it should match the number of rows of A:
$$A_{m \times n} \times I_{n \times n} = I_{m \times m} \times A_{m \times n} = A_{m \times n}$$
in Octave/Matlab
% Initialize example matrices A and B
A = [1,2;4,5]
B = [1,1;0,2]
% Initialize a 2 by 2 identity matrix
I = eye(2)
% The above notation is the same as I = [1,0;0,1]
% What happens when we multiply I*A ?
IA = I*A
% How about A*I ?
AI = A*I
% Compute A*B
AB = A*B
% Is it equal to B*A?
BA = B*A
% Note that IA = AI but AB != BA
Output:
A =
1 2
4 5
B =
1 1
0 2
I =
Diagonal Matrix
1 0
0 1
IA =
1 2
4 5
AI =
1 2
4 5
AB =
1 5
4 14
BA =
5 7
8 10
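Two properties from this section are not exercised by the listing above: associativity, and the way the identity matrix's size depends on which side it is multiplied from. The sketch below checks both; the matrices `C` and `M` and the flag names are made up for the example.

```octave
A = [1, 2; 4, 5];
B = [1, 1; 0, 2];
C = [2, 0; 1, 3];            % hypothetical third matrix

% Associativity: the grouping does not change the result
left  = (A * B) * C;
right = A * (B * C);
assoc_holds = isequal(left, right)

% Identity with a non-square matrix: M is 2 x 3,
% so M * I needs a 3 x 3 identity and I * M needs a 2 x 2 identity
M = [1, 2, 3; 4, 5, 6];
MI = M * eye(3);
IM = eye(2) * M;
identity_holds = isequal(MI, M) && isequal(IM, M)
```

The exact comparison with `isequal` is safe here only because all entries are small integers; with general floating-point values a tolerance would be a better choice.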
Inverse and Transpose
Inverse
The inverse of a matrix $A$ is denoted $A^{-1}$. Multiplying by the inverse results in the identity matrix:
$$A \times A^{-1} = A^{-1} \times A = I$$
A non-square matrix does not have an inverse. In both Octave and Matlab, inv(A) computes the inverse of a square matrix, and pinv(A) computes the pseudoinverse, which equals the inverse whenever A is invertible. Matrices that don't have an inverse are called singular or degenerate.
In practice, when using the normal equation in Octave, there are two functions for inverting a matrix: pinv and inv. Because pinv(A) computes the pseudoinverse, it still returns a usable value even if A is non-invertible.
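The difference shows up on a singular matrix. In this small sketch the matrix `S` is a made-up example whose second row is twice the first; `inv(S)` is deliberately not called because it would only produce a warning and non-finite values.

```octave
% Hypothetical singular matrix: row 2 = 2 * row 1, so the determinant is 0
S = [1, 2; 2, 4];
det_S = det(S)

% The pseudoinverse still exists and contains only finite values
P = pinv(S)

% It satisfies S * pinv(S) * S = S (up to rounding),
% one of the defining properties of the pseudoinverse
check = S * P * S
```

For an invertible matrix, `pinv` and `inv` should agree up to rounding error.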
Transpose
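The transpose of a matrix turns its rows into columns. It is like rotating the matrix 90° clockwise and then reversing the order of the elements in each row.
In other words: let $A$ be an $m \times n$ matrix, and let $B = A^T$. Then $B$ is an $n \times m$ matrix, and $B_{ij} = A_{ji}$.
We can compute the transpose of a matrix in Octave/Matlab with the transpose(A) function or with A' (for real matrices the two give the same result; for complex matrices A' also takes the complex conjugate):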
in Octave/Matlab
% Initialize matrix A
A = [1,2,0;0,5,6;7,0,9]
% Transpose A
A_trans = A'
% Take the inverse of A
A_inv = inv(A)
% What is A^(-1)*A?
A_invA = inv(A)*A
Output:
A =
1 2 0
0 5 6
7 0 9
A_trans =
1 0 7
2 5 0
0 6 9
A_inv =
0.348837 -0.139535 0.093023
0.325581 0.069767 -0.046512
-0.271318 0.108527 0.038760
A_invA =
1.00000 -0.00000 0.00000
0.00000 1.00000 -0.00000
-0.00000 0.00000 1.00000
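The index rule for the transpose, that element (i, j) of the transpose equals element (j, i) of the original, can also be checked on a non-square matrix. This sketch uses a made-up 2 x 3 matrix; the names `size_B` and `matches` are introduced only here.

```octave
A = [1, 2, 0; 0, 5, 6];     % hypothetical 2 x 3 matrix
B = A';                      % its transpose

[m, n] = size(A);
size_B = size(B)             % the dimensions are swapped

% Check B(i, j) == A(j, i) for every element
matches = true;
for i = 1:n
  for j = 1:m
    if B(i, j) ~= A(j, i)
      matches = false;
    end
  end
end
matches
```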