如何为ANN梯度下降绘制矢量场和梯度图5/5 - (1票) 这是《神经网络中的梯度下降--ANN学习简单指南--Finx

在这篇文章中，你将学习如何制作该文章中的图形，特别是矢量场

数据可视化是探索性数据分析中的一项启蒙任务，它是基于在现有数据中寻找关系和结构。

然而，还有很多东西需要绘制。

一个例子是成本函数的梯度。正如你在机器学习的道路上可能看到的那样，寻找正确模型的一个基本部分是最小化一个函数。要做到这一点，通常需要依靠梯度下降法。

那么，为什么不绘制梯度图，以了解它要带我们去哪里？

如何绘制二维矢量图

矢量和矢量场是通过PyPlot对象中的 matplotlib.pyplot.quiver()方法绘制。

让我们从创建一个图形开始，然后根据需要对quiver进行微调，以获得正确的比例。

plt.figure()
ax = plt.subplot(111)

用四分仪绘制的矢量主要有以下输入。X、Y、U、V。X、Y是其尾部的坐标，U、V有两种不同的作用，取决于 "*角度 "*参数的值。首先，当我们不指定角度时，U、V代表矢量头部的坐标。

我们用一个例子来说明。

假设我们想绘制一个从原点[0, 1] ，指向[2,3] 的矢量。然后我们把每一个条目作为一个一维数组。

ax.quiver([0], [1], [2], [3])

然后，我们的第一个矢量就出来了!

(如果你不是在笔记本上，请不要忘记plt.show() )

更有趣的是绘制几个箭头（也就是一个箭垛）。例如，让我们把矢量的起点设为[1，0]、[2，0]、[3，0]、[4，0]，它们的头部设为[5，3]、[6，3]、[7，3]和[8，3]。

我们将尾部的所有第一和第二坐标收集为1D数组。

X = [1, 2, 3, 4] 
Y = [0, 0, 0, 0]

头部的情况也一样。

U = [5, 6, 7, 8]
V = [3, 3, 3, 3]

所得的颤音通过命令被添加到一个新的图形中。

plt.figure()
ax1 = plt.subplot(111)
ax1.quiver(X, Y, U, V)

不过，它的外观并不令人满意。

X坐标肯定被弄乱了，因为第一个头（应该在[5,3]处）应该在最右边的尾巴（正确地绘制在[4,0]处）上方交叉。

纠正这种非对称性的最简单方法是设置参数angles,scale 和scale_units 。最后，我们的代码将看起来像。

ax1.quiver(X, Y, U, V, angles='xy', scale=1, scale_units='xy')

然而，当我们设置angles='xy' ，U, V的作用更类似于切线矢量，而不是头部的坐标。

在以下意义上：认为你坐在[4，0]坐标上，从那里你想移动到[8，3]。那么，你的X，Y坐标将是[4，0]（因为它已经是了），但新的U，V坐标将是你从一个坐标到另一个坐标的运动矢量。

U, V = [8 - 4, 3 - 0]

使用np.array，可以很容易地得到新的U、V坐标。

X = np.array(X)
Y = np.array(Y)
U = np.array(U)
V = np.array(V)

U_new = U - X
V_new = V - X

回头再看情节。

plt.figure()
ax2 = plt.subplot(111)
ax2.quiver(X, Y, U_new, V_new, angles='xy', scale=1, scale_units='xy')

哎呀......超出了范围

plt.figure()
ax2 = plt.subplot(111)
ax2.quiver(X, Y, U_new, V_new, angles='xy', scale=1, scale_units='xy')
ax2.set_xlim((-1,10))
ax2.set_ylim((-1,5))
plt.show()

(耶！)

正如你所看到的，你可以像你习惯的那样管理pyplot.Axes 对象：包括一个标题，设置刻度线标签，添加坐标标签，等等。

在继续讨论矢量字段之前，我们观察到，当你把你的矢量放在一个列表/np.array 中时，列切片就有了用武之地。

XY = np.array( [ [1,0], [2,0], [3,0], [4,0] ])
UV = np.array( [ [5,3], [6,3], [7,3], [8,3] ])

UV_new = vectors_uv - vectors_xy


plt.figure()
ax2 = plt.subplot(111)
ax2.quiver(XY[:,0], XY[:,1], UV_new[:,0], UV_new[:,1], angles='xy', scale=1, scale_units='xy')
ax2.set_xlim((-1,10))
ax2.set_ylim((-1,5))
plt.show()

上面的代码产生了相同的结果。

我们的下一步是绘制一个矢量场。我们现在继续使用angles='xy' ，但在绘制三维图时要回到原来的角度约定。

如何绘制矢量场：np.meshgrid()

一个子集的向量场是一个向量族，向量场中的每个点都有一个。它通常由一个代数表达式给出，梯度场就是一个例子。

就像一个（切向）矢量代表一个（无限小的）运动一样，一个矢量场代表一个流（意味着，每一个点都在运动）。

我们用两个矢量场来说明：一个代表绕原点旋转，另一个是钟摆的相位空间场（关于最后一个的更多信息在这里--有趣的是，第一个是第二个的第一近似值，用于小运动）。

我们首先定义描述这两个场的函数。这些函数将有一个点的坐标作为输入，并将输出应该连接到该点的矢量。

def infinitesimal_rotation(x, y):
    u = y
    v = -x
    return [u,v] 

def phase_pendulum(x, y):
    u = y
    v = -np.sin(x)
    return[u, v]

注意，我们使用np.sin ，而不是math.sin ，因为我们将把x作为一个数组输入。

绘图的第二个成分是X、Y尾部坐标。幸运的是，Numpy为此提供了一个很好的解决方案。

我们所需要的是提供X和Y坐标族，比如说每个方向上从-5到5的30个标记。

该函数 [np.meshgrid()](https://blog.finxter.com/numpy-meshgrid-understanding-np-meshgrid-simple-guide/)将对这些坐标集进行交叉求导，并输出两个矩阵，其中包含X标记和Y标记的所有可能组合。

# x and y markers
x = np.linspace(-5,5,30)
y = np.linspace(-5,5,30)

# np.meshgrid(x,y) outputs two 30x30 matrices which will be our X and Y inputs
X, Y = np.meshgrid(x,y)

这里我们有30×30=900个点，均匀地分布在xy平面的[-5,5]x[-5,5]的正方形里。每个点都对应于两个矩阵的一个条目（i,j）：例如，网格中第34个点的x和y坐标分别由X中的（2,4）个条目和Y中的（2,4）个条目给出。

此外，如前所述，我们可以直接将X、Y应用于我们的函数，直接获得U、V坐标。

U_rot, V_rot = infinitesimal_rotation(X,Y)
U_pend, V_pend = phase_pendulum(X,Y)

最后，quiver方法允许二维数组输入（urra!），所以我们实际上已经准备好将字段付诸行动了（）。

plt.figure() 
ax_rot = plt.subplot(111)
ax_rot.quiver(X, Y, U_rot, V_rot, angles='xy', scale=1, scale_units='xy')
ax_rot.set_xlim((-5,5))
ax_rot.set_ylim((-5,5))
plt.show()

plt.figure() 
ax_pend = plt.subplot(111)
ax_pend.quiver(X, Y, U_pend, V_pend, angles='xy', scale=1, scale_units='xy')
ax_pend.set_xlim((-5,5))
ax_pend.set_ylim((-5,5))
plt.show()

很酷，不是吗？

尽管如此，我们还是要做一些调整，以示有趣。

plt.figure() 
ax_rot = plt.subplot(111)
ax_rot.quiver(X, Y, U_rot, V_rot, color='r', angles='xy', scale=1, scale_units='xy', alpha=.6)
ax_rot.set_xlim((-5,5))
ax_rot.set_ylim((-5,5))
plt.show()

x_pend = np.linspace(-15,15,90)
y_pend = np.linspace(-5,5,30)

X, Y = np.meshgrid(x_pend,y_pend)

U_pend,V_pend = phase_pendulum(X,Y)



plt.figure(figsize=(15,5)) 
ax_pend = plt.subplot(111)
ax_pend.quiver(X, Y, U_pend, V_pend, angles='xy', scale=4, scale_units='xy', alpha=.8, headaxislength=3, headlength=3, width=.001)
ax_pend.set_xlim((-15,15))
ax_pend.set_ylim((-5,5))
ax_pend.set_title('Phase Space - Non-linear Pendulum')
ax_pend.set_xlabel('Position')
ax_pend.set_ylabel('Momentum')
plt.show()

三维图谱和梯度

最后，matplotlib为3D绘图提供了类似的实现。唯一需要注意的是，没有选项可以将角度参数设置为 "xy"，我们必须明确提供头部和尾部的坐标。

在介绍该方法之前，让我们导入matplotlib的3D Axes对象并将其实例化。

from mpl_toolkits.mplot3d import axes3d

fig = plt.figure(figsize=(14,10))
ax = fig.add_subplot(111,projection='3d')

Quiver方法现在有6个主要输入。X、Y、Z、U、V、W。同样，X、Y、Z代表尾部的坐标，U、V、W代表头部的坐标。

它们可以作为任何类似数组的对象来传递（例如，meshgrid ，可以用三个数组来完成）。

P = [0, 1, 1]
Q = [1, 2, 1.5]

ax.quiver(*P, *Q)
ax.set_xlim((-1,3))
ax.set_ylim((-1,3))
ax.set_zlim((-1,3))
plt.show()

虽然现在不存在缩放问题了，但在二维平面屏幕上实现三维可视化始终是一个问题。不过，如果你使用的是Notebook，你可以求助于这个神奇的命令

%matplotlib notebook

把它包含在笔记本的最开始。它将允许你旋转绘图，这极大地改善了可视化效果。

接下来，我们回顾一下绘制的曲面。

我们首先定义这个图形的功能。

def f(x, y):
    return ((x-1)*(x+2)*x*(x-2) + 2*y ** 2)

曲面上的每一个点的形式都是[x, y, f(x,y)] 。因此，为了绘制它，我们定义一个新的网格，并通过对其应用函数来恢复第三个坐标。

x = np.linspace(-2, 2, 30)
y = np.linspace(-2, 2, 30)

X, Y = np.meshgrid(x, y)
Z = f(X, Y)

这样就得到了实际的曲面图。

surface = ax.plot_surface(X, Y, Z,  cmap='Blues', edgecolor='none', antialiased=False, alpha=.7)

为了增加右边的水平色块条，请输入。

fig.colorbar(surface, shrink=0.5, aspect=5)

现在，通过使用X、Y、Z三个表面点的坐标，可以无害地将一个向量添加到绘图的表面。

px = 2
py = 1.5
P = [px, py, f(px,py)]

Q = [1.5, 2, 1]

ax.quiver(*P, *Q, color='r')  # we change color for better visualization

但要看到矢量，就必须旋转曲面。如果你不在Jupyter Notebooks中，或者需要保存图形，知道以下命令是很有用的。

ax.view_init(elev=0., azim=30)

尽管如此，这个矢量并不是与曲面相切的。当一个矢量定义了在曲面内运动的粒子的速度时，它就是与曲面相切的。

另一方面，一个向上述方向运动的粒子会立即逃脱。也就是说，纠正最后一个矢量，使其相切是一件很容易的事。为了理解在这种情况下什么是切向，假设你是以.为坐标运动的。

特别是，你的速度必须是这样的：当且仅当每一瞬间t，你将停留在曲面上。

其中最后一个坐标的表达式来自于链式规则。让我们把这个向量表达式包在一个函数中。

def tangent_vector(x,y,u,v):
    '''
    Inputs a point in the plane xy and a direction vector and outputs the corresponding vector tangent to the graph
    '''
    grad_f = np.array([(x+2)*x*(x-2)+(x-1)*x*(x-2)+(x-1)*(x+2)*(x-2)+(x-1)*(x+2)*x , 4*y])
    q = np.array([u,v])
    return [u,v,np.dot(q,grad_f)]

如果我们保留Q的两个第一坐标方向，我们得到。

Q_tangent = tangent_vector(2, 1.5, 1.5, 2)

ax.quiver(*P, *Q_tangent, color='r', arrow_length_ratio=0.15) # some aesthetical touches

# and we keep a side view
ax.view_init(elev=9., azim=95)

作为最后一项任务，我们在三维图的xy平面上绘制梯度场。我们首先定义一个函数，返回f 的二维梯度向量。

def grad_f(x,y):
    '''
    Input a point in the plane and outputs the gradient at that point
    '''
    return np.array([(x+2)*x*(x-2)+(x-1)*x*(x-2)+(x-1)*(x+2)*(x-2)+(x-1)*(x+2)*x , 4*y])

不要忘记：由于我们是在xy平面上绘图，第三个坐标必须对应于z轴上的最低值。在本例中是-8。

U0,V0 = grad_f(X,Y)

Z0 = np.zeros_like(X) - 8 

ax.quiver(X, Y , Z0, X+U0, Y+V0, Z0, color='r', arrow_length_ratio=0.2, length=.1, normalize=True, linewidth=.6)

还要记住，这里的五线谱参数U、V代表箭头的头部。因此，我们必须将基点添加到grad_f 坐标中。此外，如果你想要反向梯度，请使用-U0，-V0。

ax.quiver(X, Y , Z0, X-U0, Y-V0, Z0, color='r', arrow_length_ratio=0.2, length=.1, normalize=True, linewidth=.6)

没错!我们已经做到了!

简而言之

我们刚刚回顾了如何在二维和三维轴图中绘制箭形（=一组箭形）。我们还可以用 [np.meshgrid()](https://blog.finxter.com/numpy-meshgrid-understanding-np-meshgrid-simple-guide/)作为一个辅助工具来绘制矢量场，就像下面的例子（重复）。

def phase_pendulum(x, y):
    u = y
    v = -np.sin(x)
    return[u, v]

x_pend = np.linspace(-15,15,90)
y_pend = np.linspace(-5,5,30)

X, Y = np.meshgrid(x_pend,y_pend)

U_pend,V_pend = phase_pendulum(X,Y)

plt.figure(figsize=(15,5)) 
ax_pend = plt.subplot(111)
ax_pend.quiver(X, Y, U_pend, V_pend, angles='xy', scale=4, scale_units='xy', alpha=.8, headaxislength=3, headlength=3, width=.001)
ax_pend.set_xlim((-15,15))
ax_pend.set_ylim((-5,5))
ax_pend.set_title('Phase Space - Non-linear Pendulum')
ax_pend.set_xlabel('Position')
ax_pend.set_ylabel('Momentum')
plt.show()

此外，我们可以用它来绘制与曲面相切的向量和梯度场。下面是我们例子中的完整代码。

# Importing and instanciating a 3D axes
from mpl_toolkits.mplot3d import axes3d

fig = plt.figure(figsize=(14,10))
ax = fig.add_subplot(111,projection='3d')

# Defining the desired function
def f(x, y):
    return ((x-1)*(x+2)*x*(x-2) + 2*y ** 2)

# Getting a grid to plot
x = np.linspace(-2, 2, 30)
y = np.linspace(-2, 2, 30)

X, Y = np.meshgrid(x, y)
Z = f(X, Y)

# The actual plot
surface = ax.plot_surface(X, Y, Z,  cmap='Blues', edgecolor='none', antialiased=False, alpha=.7)

# To add the horizontal colormap bar on the right, type:
fig.colorbar(surface, shrink=0.5, aspect=5)

# Making a vector tangent
def tangent_vector(x,y,u,v):
    '''
    Inputs a point in the plane xy and a direction vector and outputs the corresponding vector tangent to the graph
    '''
    grad_f = np.array([(x+2)*x*(x-2)+(x-1)*x*(x-2)+(x-1)*(x+2)*(x-2)+(x-1)*(x+2)*x , 4*y])
    q = np.array([u,v])
    return [u,v,np.dot(q,grad_f)]

# Plotting a tangent vector
Q_tangent = tangent_vector(2, 1.5, 1.5, 2)
ax.quiver(*P, *Q_tangent, color='r', arrow_length_ratio=0.15) # some aesthetical touches

# Making a side view
ax.view_init(elev=9., azim=95)

# Plotting the Gradient field
def grad_f(x,y):
    '''
    Input a point in the plane and outputs the gradient at that point
    '''
    return np.array([(x+2)*x*(x-2)+(x-1)*x*(x-2)+(x-1)*(x+2)*(x-2)+(x-1)*(x+2)*x , 4*y])

U0,V0 = grad_f(X,Y)
Z0 = np.zeros_like(X) - 8 # Putting it in the lower bottom of the figure
ax.quiver(X, Y , Z0, X+U0, Y+V0, Z0, color='r', arrow_length_ratio=0.2, length=.1, normalize=True, linewidth=.6)

如果你还没有玩够，我们在这里为读者留下两个练习题

不把梯度场作为xy平面上的矢量场来绘制，而是作为与图形相切的矢量场来绘制
绘制梯度下降的路径（作为一个曲线）。

编码愉快!