PyTorch Autograd Mechanism (1)


torch.autograd.backward

1. Jacobian matrix

In vector calculus, the Jacobian matrix is the matrix of all first-order partial derivatives of a vector-valued function. When the matrix is square, both the matrix and its determinant are referred to as the Jacobian in the literature.

Suppose {\bf f}:{\Bbb R}^{n} \to {\Bbb R}^{m} is a function which takes as input the vector {\bf x} \in {\Bbb R}^{n} and produces as output the vector {\bf f}({\bf x}) \in {\Bbb R}^{m}. Then the Jacobian matrix {\bf J} of {\bf f} is an m \times n matrix, usually defined and arranged as follows:

{\bf J}= 
    \begin{bmatrix}
        \frac{\partial {\bf f}}{\partial x_1} & \cdots & \frac{\partial {\bf f}}{\partial x_n}
    \end{bmatrix}=
    \begin{bmatrix}
        \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\
        \vdots & \ddots & \vdots \\
        \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n}
    \end{bmatrix}

or, component-wise:

{\bf J}_{ij} = \frac{\partial f_i}{\partial x_j}
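As a sanity check on this definition, the sketch below computes a Jacobian with torch.autograd.functional.jacobian (available in recent PyTorch releases); the function f here is hypothetical, chosen only for illustration:

import torch
from torch.autograd.functional import jacobian

# f: R^2 -> R^2, a hypothetical example function
def f(x):
    return torch.stack([x[0] ** 2 * x[1],
                        5 * x[0] + torch.sin(x[1])])

x = torch.tensor([1., 2.])
J = jacobian(f, x)   # shape (2, 2), with J[i, j] = d f_i / d x_j
# symbolically J = [[2*x1*x2, x1**2], [5, cos(x2)]], i.e. [[4, 1], [5, cos(2)]] at x = (1, 2)
print(J)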

2. Examples

Suppose:

x \to y \to z, \\
    y=f(x), z=g(y), \\
    f: {\Bbb R}^{a} \to {\Bbb R}^{b}, \\
    g: {\Bbb R}^{b} \to {\Bbb R}^{c}

Then:

{\bf J_{x \to y}}=
    \begin{bmatrix}
        \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_a} \\
        \vdots & \ddots & \vdots \\
        \frac{\partial f_b}{\partial x_1} & \cdots & \frac{\partial f_b}{\partial x_a}
    \end{bmatrix},
    {\bf J_{y \to z}}=
    \begin{bmatrix}
        \frac{\partial g_1}{\partial y_1} & \cdots & \frac{\partial g_1}{\partial y_b} \\
        \vdots & \ddots & \vdots \\
        \frac{\partial g_c}{\partial y_1} & \cdots & \frac{\partial g_c}{\partial y_b}
    \end{bmatrix}
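The reason for setting up the composition x \to y \to z is the chain rule: the Jacobian of the composite map is the matrix product {\bf J_{x \to z}} = {\bf J_{y \to z}} \, {\bf J_{x \to y}}, and reverse-mode autograd exploits exactly this relationship, one vector-Jacobian product at a time. A minimal sketch checking the identity numerically, using the f from the example below and a hypothetical g chosen only for illustration:

import torch
from torch.autograd.functional import jacobian

def f(x):
    # same f as in the concrete example below
    return torch.stack([2 * x[0] + x[1] ** 2,
                        x[0] ** 2 + 2 * x[1] ** 3])

def g(y):
    # hypothetical g: R^2 -> R^2, only for illustration
    return torch.stack([y[0] * y[1], y[0] + y[1]])

x = torch.tensor([1., 2.])
y = f(x)

J_xy = jacobian(f, x)                  # J_{x -> y}
J_yz = jacobian(g, y)                  # J_{y -> z}, evaluated at y = f(x)
J_xz = jacobian(lambda t: g(f(t)), x)  # Jacobian of the composition

print(torch.allclose(J_xz, J_yz @ J_xy))   # chain rule: should print True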

For a concrete example, take:

{\bf x} 
    =
    \begin{bmatrix}
        x_1, & x_2
    \end{bmatrix}
    =
    \begin{bmatrix}
        1, & 2
    \end{bmatrix},
    {\bf y}
    =
    \begin{bmatrix}
    2x_1 + x_2^2, & x_1^2 + 2x_2^3
    \end{bmatrix}

Therefore:

{\bf J_{x \to y}}
    =
    \begin{bmatrix}
        2, & 2x_2 \\
        2x_1, & 6x_2^2
    \end{bmatrix}
    =
    \begin{bmatrix}
    2, & 4 \\
    2, & 24
    \end{bmatrix}

If we call y.backward(torch.tensor([[k_1, k_2]])), the result accumulated in x.grad is k_1 * [2, 2x_2] + k_2 * [2x_1, 6x_2^2], i.e. the vector-Jacobian product [k_1, k_2] \cdot {\bf J_{x \to y}}.

>>> import torch

>>> x = torch.tensor([[1., 2.]], requires_grad=True)
>>> y = torch.zeros(1, 2)
>>> y[0, 0] = 2 * x[0, 0] + x[0, 1] ** 2
>>> y[0, 1] = x[0, 0] ** 2 + 2 * x[0, 1] ** 3
>>> print(x, y)
tensor([[1., 2.]], requires_grad=True) tensor([[ 6., 17.]], grad_fn=<CopySlices>)

>>> y.backward(torch.tensor([[1., 0.]]), retain_graph=True)
>>> print(x.grad)
tensor([[2., 4.]])

>>> x.grad.zero_()
>>> y.backward(torch.tensor([[0., 1.]]), retain_graph=True)
>>> print(x.grad)
tensor([[ 2., 24.]])

>>> x.grad.zero_()
>>> y.backward(torch.tensor([[1., 2.]]), retain_graph=True)
>>> print(x.grad)
tensor([[ 6., 52.]])

>>> x.grad.zero_()
>>> y.backward(torch.tensor([[2., 1.]]), retain_graph=True)
>>> print(x.grad)
tensor([[ 6., 32.]])

That is:

  • (k_1, k_2) = (1, 0) \longrightarrow 1 * [2, 4] + 0 * [2, 24] = [2, 4]
  • (k_1, k_2) = (0, 1) \longrightarrow 0 * [2, 4] + 1 * [2, 24] = [2, 24]
  • (k_1, k_2) = (1, 2) \longrightarrow 1 * [2, 4] + 2 * [2, 24] = [6, 52]
  • (k_1, k_2) = (2, 1) \longrightarrow 2 * [2, 4] + 1 * [2, 24] = [6, 32]
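The same four results can be reproduced directly as vector-Jacobian products: with {\bf J_{x \to y}} evaluated at x = (1, 2), calling y.backward(v) leaves v \cdot {\bf J_{x \to y}} in x.grad. A minimal sketch checking the four cases above:

import torch

# J_{x -> y} evaluated at x = (1, 2), taken from the worked example above
J = torch.tensor([[2., 4.],
                  [2., 24.]])

for v in ([1., 0.], [0., 1.], [1., 2.], [2., 1.]):
    print(torch.tensor(v) @ J)   # matches x.grad from the session above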