Python Data Analysis Series: Common NumPy Operations, Part 7



NumPy offers richer ways to index an array than basic slicing.
Selecting data with an array of integer indices:

In [1]: import numpy as np

In [2]: data = np.arange(10)

In [3]: data
Out[3]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [4]: data = data ** 2

In [5]: data
Out[5]: array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

In [8]: indices = np.array([0, 3, 5, 7])

In [9]: data[indices]
Out[9]: array([ 0,  9, 25, 49])

In [10]: indices = np.array([[0, 1], [2, 3]])

In [11]: data[indices]
Out[11]:
array([[0, 1],
       [4, 9]])
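
Two properties of integer-array indexing are easy to miss in the session above: the result takes the shape of the index array, and it is a copy rather than a view. A minimal sketch to illustrate this (the names `squares`, `idx`, and `picked` are only illustrative):

```python
import numpy as np

squares = np.arange(10) ** 2

# The result has the shape of the index array, not of the source array.
idx = np.array([[0, 1], [2, 3]])
picked = squares[idx]
print(picked.shape)  # (2, 2)

# Integer-array indexing returns a copy, so editing the result
# leaves the original array untouched.
picked[0, 0] = -1
print(squares[0])  # 0

# Negative indices count from the end, as with Python lists.
print(squares[np.array([-1, -2])])  # [81 64]
```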

Selecting data with a boolean array:

In [12]: data = np.arange(15).reshape(3, -1)

In [13]: data
Out[13]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [14]: data2 = data > 7

In [15]: data2
Out[15]:
array([[False, False, False, False, False],
       [False, False, False,  True,  True],
       [ True,  True,  True,  True,  True]])

In [16]: result = data[data2]

In [17]: result
Out[17]: array([ 8,  9, 10, 11, 12, 13, 14])
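
Boolean masks can also be combined before indexing. The sketch below assumes the same 3x5 array as in the session; the names `mask` and `evens_above_7` are made up for the example:

```python
import numpy as np

data = np.arange(15).reshape(3, -1)

# Combine conditions with & (and), | (or) and ~ (not).
# The parentheses are required because & binds tighter than comparisons.
mask = (data > 7) & (data % 2 == 0)
evens_above_7 = data[mask]
print(evens_above_7)  # [ 8 10 12 14]

# A mask with the same shape as the array yields a 1-D result,
# with the selected elements in row-major order.
print(data[~(data > 7)])  # [0 1 2 3 4 5 6 7]
```

Python's `and` and `or` keywords do not work element-wise on arrays, which is why the bitwise operators are used here.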

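Every element at a position where the boolean array is True can be set to a single value in one assignment; this modifies data in place: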
In [18]: data[data2] = 99

In [19]: data
Out[19]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7, 99, 99],
       [99, 99, 99, 99, 99]])
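
When the original array should stay unchanged, `np.where` is one possible alternative to masked assignment. A small sketch, with `replaced` as an illustrative name:

```python
import numpy as np

data = np.arange(15).reshape(3, -1)

# np.where(condition, x, y) picks x where the condition is True, else y,
# and returns a new array instead of modifying data in place.
replaced = np.where(data > 7, 99, data)
print(replaced[1])  # [ 5  6  7 99 99]
print(data[1])      # [5 6 7 8 9]
```
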

Basic array manipulation functions:

In [20]: data
Out[20]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7, 99, 99],
       [99, 99, 99, 99, 99]])

Transposing the matrix:
In [21]: data.transpose()
Out[21]:
array([[ 0,  5, 99],
       [ 1,  6, 99],
       [ 2,  7, 99],
       [ 3, 99, 99],
       [ 4, 99, 99]])
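
The transpose does not copy the data: it is a view onto the same memory. The following sketch uses a smaller 2x3 array (an assumption made to keep the output short) and the `.T` shorthand:

```python
import numpy as np

data = np.arange(6).reshape(2, 3)

# .T is shorthand for .transpose() on a 2-D array.
flipped = data.T
print(flipped.shape)  # (3, 2)

# The transpose is a view: it shares memory with the original,
# so a write through one is visible through the other.
flipped[0, 1] = 100
print(data[1, 0])  # 100
print(np.shares_memory(data, flipped))  # True
```

Call `.copy()` on the result if an independent array is needed.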

A NumPy tip:
In [22]: data
Out[22]:
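array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7, 99, 99],
       [99, 99, 99, 99, 99]])

When reshaping an array, one dimension can be given as -1, and NumPy infers its size from the total number of elements and the other dimensions: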
In [23]: data.reshape(5, -1)
Out[23]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7, 99],
       [99, 99, 99],
       [99, 99, 99]])
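
A short sketch of how the -1 placeholder behaves, including the case where no valid size exists. It assumes a flat array of 15 elements:

```python
import numpy as np

data = np.arange(15)

# 15 elements / 5 rows -> NumPy infers 3 columns.
print(data.reshape(5, -1).shape)  # (5, 3)

# -1 can stand for any one dimension, including the first.
print(data.reshape(-1, 5).shape)  # (3, 5)

# If the total size is not evenly divisible, reshape raises ValueError.
try:
    data.reshape(4, -1)
except ValueError as exc:
    print("reshape failed:", exc)
```
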

How to combine arrays:
Join two 1-D arrays of the same size end to end into a single 1-D array:
In [25]: data = np.arange(10)

In [26]: data
Out[26]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [27]: data2 = np.arange(10)**2

In [28]: data2
Out[28]: array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

In [30]: np.hstack([data, data2])
Out[30]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  0,  1,  4,  9, 16, 25, 36, 49, 64, 81])
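
For 1-D inputs, `np.hstack` behaves like `np.concatenate`, and the inputs do not actually need to have the same length. A minimal sketch with two illustrative arrays `a` and `b`:

```python
import numpy as np

a = np.arange(3)
b = np.arange(5) ** 2

# For 1-D inputs, hstack gives the same result as concatenate.
joined = np.hstack([a, b])
print(joined)  # [ 0  1  2  0  1  4  9 16]
print(np.array_equal(joined, np.concatenate([a, b])))  # True
```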

Stack two 1-D arrays of the same size into one 2-D array:
In [31]: data
Out[31]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [32]: data2
Out[32]: array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

In [33]: np.vstack([data, data2])
Out[33]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81]])
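
`np.vstack` turns each 1-D input into a row. If columns are wanted instead, `np.column_stack` is one option. The sketch below uses shorter arrays than the session above, and the names `rows` and `cols` are illustrative:

```python
import numpy as np

a = np.arange(4)
b = np.arange(4) ** 2

rows = np.vstack([a, b])        # shape (2, 4): each input becomes a row
cols = np.column_stack([a, b])  # shape (4, 2): each input becomes a column
print(rows.shape, cols.shape)   # (2, 4) (4, 2)

# column_stack on 1-D inputs is the transpose of vstack.
print(np.array_equal(cols, rows.T))  # True

# vstack needs the 1-D inputs to have equal length.
try:
    np.vstack([a, np.arange(3)])
except ValueError as exc:
    print("vstack failed:", exc)
```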