Tensors: The Cornerstone of PyTorch

1. What Is a Tensor?

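Core definition: In PyTorch, a tensor is a specialized data structure that closely resembles NumPy's ndarray. You can think of it as a multi-dimensional array. It is the basic unit for storing and transforming data in PyTorch.
Relationship to NumPy: If you know NumPy, PyTorch tensors will feel familiar, because the two share many functions and operations. One key difference is that PyTorch tensors can run computations on a GPU, which matters a great deal for deep learning.
Dimensions / rank:
- 0-D tensor (scalar): a single number. Example: torch.tensor(5)
- 1-D tensor (vector): a sequence of numbers, similar to a list. Example: torch.tensor([1, 2, 3])
- 2-D tensor (matrix): a table of numbers with rows and columns. Example: torch.tensor([[1, 2], [3, 4]])
- 3-D tensor: can be pictured as a cube of numbers. In computer vision, for example, a color image is usually represented as a 3-D tensor of shape (channels, height, width) or (height, width, channels).
- N-D tensor: can have any number of dimensions. A batch of color images, for example, is usually a 4-D tensor of shape (batch_size, channels, height, width).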
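The ranks listed above can be checked directly with ndim and shape. The following is a minimal sketch; the variable names and the 32x32 image size are illustrative assumptions, not values from the text.

```python
import torch

scalar = torch.tensor(5)                  # 0-D
vector = torch.tensor([1, 2, 3])          # 1-D
matrix = torch.tensor([[1, 2], [3, 4]])   # 2-D
image = torch.zeros(3, 32, 32)            # 3-D: (channels, height, width)
batch = torch.zeros(8, 3, 32, 32)         # 4-D: (batch_size, channels, height, width)

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix),
                ("image", image), ("batch", batch)]:
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
```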

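2. Creating Tensors

PyTorch offers several ways to create tensors:

Directly from data: torch.tensor() (recommended)

This is the most common approach. It builds a tensor from a Python list or a NumPy array. PyTorch infers the data type automatically, or you can specify it explicitly.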

import torch
import numpy as np

# Create from a Python list
data_list = [[1, 2], [3, 4]]
t1 = torch.tensor(data_list)
print("t1 (from a list):\n", t1)
print("t1 dtype:", t1.dtype) # torch.int64 for Python integers

# Specify the data type
t2 = torch.tensor(data_list, dtype=torch.float32)
print("\nt2 (float32 specified):\n", t2)
print("t2 dtype:", t2.dtype)

# Create from a NumPy array
np_array = np.array([[5, 6], [7, 8]])
t3 = torch.tensor(np_array) # copies the data; torch.from_numpy(np_array) would share memory instead
print("\nt3 (from a NumPy array):\n", t3)
print("t3 dtype:", t3.dtype)

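Creating tensors with a given shape and values:
- torch.empty(rows, cols, ...): creates an uninitialized tensor. Its values are arbitrary (whatever happens to be in that memory).
- torch.zeros(rows, cols, ...): creates a tensor filled with zeros.
- torch.ones(rows, cols, ...): creates a tensor filled with ones.
- torch.rand(rows, cols, ...): creates a tensor of random numbers drawn uniformly from [0, 1).
- torch.randn(rows, cols, ...): creates a tensor of random numbers sampled from the standard normal distribution (mean 0, variance 1).
- torch.full((rows, cols, ...), fill_value): creates a tensor filled with fill_value.
- torch.arange(start, end, step): creates a 1-D tensor holding the sequence from start up to but not including end, with step size step.
- torch.linspace(start, end, steps): creates a 1-D tensor of steps evenly spaced points from start to end, both endpoints included.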
```
t_empty = torch.empty(2, 3)
print("\nt_empty (uninitialized):\n", t_empty)

t_zeros = torch.zeros(2, 3)
print("\nt_zeros:\n", t_zeros)

t_ones = torch.ones(2, 3, dtype=torch.double) # dtype can be specified
print("\nt_ones (double):\n", t_ones)

t_rand = torch.rand(2, 3)
print("\nt_rand:\n", t_rand)

t_randn = torch.randn(2, 3)
print("\nt_randn:\n", t_randn)

t_full = torch.full((2,3), 7)
print("\nt_full:\n", t_full)

t_arange = torch.arange(0, 5, 1)
print("\nt_arange:\n", t_arange)

t_linspace = torch.linspace(0, 10, steps=5)
print("\nt_linspace:\n", t_linspace)
```
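The endpoint handling of arange and linspace is easy to mix up, so here is a small sketch that contrasts the two. The range 0 to 1 and the names half_open and closed are chosen only for illustration.

```python
import torch

half_open = torch.arange(0, 1, 0.25)     # end value 1.0 is excluded
closed = torch.linspace(0, 1, steps=5)   # end value 1.0 is included

print(half_open)  # tensor([0.0000, 0.2500, 0.5000, 0.7500])
print(closed)     # tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])
```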
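Use the _like functions to create a tensor with the same shape as another tensor (by default the dtype and device are copied as well):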
- torch.zeros_like(input_tensor)
- torch.ones_like(input_tensor)
- torch.rand_like(input_tensor)
- torch.randn_like(input_tensor)
```
x = torch.tensor([[1,2,3],[4,5,6]])
t_zeros_like_x = torch.zeros_like(x)
print("\nt_zeros_like_x:\n", t_zeros_like_x)
```
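torch.Tensor() vs torch.tensor() (take note!)
- torch.tensor(data): recommended. It always copies data, and it infers the data type from the input.
- torch.Tensor(shape_or_data): this is a constructor.
  - Given a shape (such as torch.Tensor(2, 3)), it is equivalent to torch.empty(2, 3) and creates an uninitialized tensor.
  - Given data (such as torch.Tensor([1, 2])), it creates a tensor with the global default data type (usually torch.float32), which may involve a type conversion.
- To avoid confusion and accidental use of uninitialized values, use torch.tensor() to create tensors from data, and torch.empty(), torch.zeros(), and similar functions to create tensors of a given shape.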
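The difference between the two spellings shows up most clearly in the resulting dtype. The sketch below assumes the global default dtype has not been changed; the variable names are illustrative.

```python
import torch

from_function = torch.tensor([1, 2])      # dtype inferred from the data
from_constructor = torch.Tensor([1, 2])   # converted to the default dtype

print(from_function.dtype)     # torch.int64
print(from_constructor.dtype)  # torch.float32 unless the default dtype was changed

uninitialized = torch.Tensor(2, 3)        # two ints are read as a shape, not as data
print(uninitialized.shape)     # torch.Size([2, 3])
```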

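3. Tensor Attributes

Every tensor has the following important attributes:

tensor.dtype: the data type of the tensor's elements (for example torch.float32, torch.int64, torch.bool).
- Common types: torch.float32 (or torch.float), torch.float64 (or torch.double), torch.int32 (or torch.int), torch.int64 (or torch.long), torch.bool.
tensor.shape or tensor.size(): the shape of the tensor, returned as a torch.Size object (which behaves like a tuple).
tensor.ndim or len(tensor.shape): the number of dimensions (rank).
tensor.device: the device the tensor lives on (for example cpu, cuda:0).
tensor.requires_grad: a boolean indicating whether gradients should be computed for this tensor. Defaults to False. The Autograd section covers this in detail.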

t = torch.randn(3, 4, dtype=torch.float32, device="cpu")

print("\nTensor t:\n", t)
print("t.dtype:", t.dtype)
print("t.shape:", t.shape)
print("t.size():", t.size())
print("t.ndim:", t.ndim)
print("t.device:", t.device)
print("t.requires_grad:", t.requires_grad) # False by default

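4. Tensor Operations

PyTorch provides a very rich set of tensor operations, most of which mirror NumPy.

Arithmetic:
- Addition, subtraction, multiplication, division, power, modulo, and so on: +, -, *, /, **, %. All of these are element-wise.
- Function forms are also available: torch.add(), torch.sub(), torch.mul(), torch.div(), torch.pow(), torch.remainder().
- In-place operations: most operations have an in-place version with a trailing underscore. For example, t.add_(5) modifies t directly instead of returning a new tensor.
  - Note: In-place operations can save memory, but they can cause problems when gradients are computed, because they may overwrite values that autograd needs for the backward pass.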
```
a = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
b = torch.tensor([[5, 6], [7, 8]], dtype=torch.float32)

print("\na:\n", a)
print("b:\n", b)

# Addition
c1 = a + b
c2 = torch.add(a, b) # same result as a + b
print("\na + b:\n", c1)

# Multiplication (element-wise)
d = a * b
print("\na * b (element-wise):\n", d)

# In-place addition
print("\nOriginal a:\n", a)
a.add_(b) # a itself is modified
print("a after a.add_(b):\n", a)
```
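The caveat about in-place operations and gradients can be made concrete with a small sketch. It assumes a leaf tensor created with requires_grad=True (an attribute introduced later in this document); the names w and in_place_failed are illustrative.

```python
import torch

w = torch.ones(3, requires_grad=True)  # a leaf tensor tracked by autograd

try:
    w.add_(1)  # in-place update of a leaf that requires grad
    in_place_failed = False
except RuntimeError as err:
    in_place_failed = True
    print("In-place op rejected:", err)

# The out-of-place form creates a new tensor and leaves w untouched.
w_plus_one = w + 1
print(w_plus_one.requires_grad)  # True
```

When an in-place update of such a tensor is really intended (for example a manual parameter update), it is usually wrapped in a torch.no_grad() block.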

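Indexing and Slicing:

The rules largely follow NumPy's indexing and slicing. One notable difference is that negative slice steps (such as t[::-1]) are not supported.
You can use integer indexing, slices (start:end:step), boolean indexing, and fancy indexing (indexing with a list or tensor of integers).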

t = torch.arange(1, 13).reshape(3, 4) # the numbers 1 to 12, shape 3x4
# t:
# tensor([[ 1,  2,  3,  4],
#         [ 5,  6,  7,  8],
#         [ 9, 10, 11, 12]])
print("\nOriginal tensor t:\n", t)

# First row
print("First row t[0]:", t[0])

# Last column
print("Last column t[:, -1]:", t[:, -1])

# Sub-matrix (first two rows, columns 1 and 2, zero-based)
print("Sub-matrix t[0:2, 1:3]:\n", t[0:2, 1:3])

# Boolean indexing (elements greater than 5)
print("Elements > 5 in t:", t[t > 5])

# Fancy indexing (rows 0 and 2)
print("Rows 0 and 2:\n", t[[0, 2]])
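One detail worth seeing in code is which indexing forms return a view and which return a copy. The sketch below reuses the 3x4 tensor from the example above and also shows torch.flip as a stand-in for a negative slice step; the variable names are illustrative.

```python
import torch

t = torch.arange(1, 13).reshape(3, 4)

first_row = t[0]         # basic indexing: a view into t
first_row[0] = 100       # also changes t[0, 0]
print(t[0, 0])           # tensor(100)

selected = t[t > 5]      # boolean indexing: a new tensor holding copies
selected[0] = -1         # does not change t
print((t == -1).any())   # tensor(False)

# Negative slice steps are not supported; torch.flip reverses along a dimension.
reversed_rows = torch.flip(t, dims=[0])
print(reversed_rows[0])  # tensor([ 9, 10, 11, 12])
```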

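Reshaping and Manipulating Dimensions:
- tensor.reshape(new_shape) or torch.reshape(tensor, new_shape): changes the shape of a tensor, as long as the total number of elements stays the same. The returned tensor shares data with the original when possible and is a copy otherwise.
- tensor.view(new_shape): like reshape, but it never copies, so the returned tensor always shares data with the original. This requires the requested shape to be compatible with the tensor's memory layout, which is always the case for a contiguous tensor. If the tensor is not contiguous, call .contiguous() first.
  - A -1 in reshape or view means that the size of that dimension is inferred from the other dimensions and the total number of elements.
- tensor.squeeze(): removes all dimensions of size 1.
- tensor.squeeze(dim): removes the given dimension if its size is 1.
- tensor.unsqueeze(dim): inserts a new dimension of size 1 at the given position.
- tensor.permute(dims_tuple): reorders the dimensions of a tensor. For image data, for example, it converts (C, H, W) to (H, W, C).
- tensor.transpose(dim0, dim1): swaps the two given dimensions. A special case of permute.
- tensor.t(): transposes a 2-D tensor. Equivalent to tensor.transpose(0, 1).
- tensor.flatten(start_dim=0, end_dim=-1): flattens the dimensions from start_dim through end_dim into a single dimension.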
```
x = torch.randn(2, 3, 4) # Shape: (2, 3, 4)
print("\nOriginal x shape:", x.shape)

# Reshape
y = x.reshape(2, 12)
print("y (reshaped) shape:", y.shape)
z = x.reshape(2, -1) # -1 is inferred as 12
print("z (reshaped with -1) shape:", z.shape)

# View (x is contiguous here, so this works)
w = x.view(6, 4)
print("w (view) shape:", w.shape)

# Squeeze and Unsqueeze
a = torch.randn(1, 3, 1, 5) # Shape: (1, 3, 1, 5)
print("\nOriginal a shape:", a.shape)
b = a.squeeze() # Shape: (3, 5)
print("b (squeezed) shape:", b.shape)
c = b.unsqueeze(0) # Shape: (1, 3, 5)
print("c (unsqueezed at dim 0) shape:", c.shape)
d = b.unsqueeze(-1) # Shape: (3, 5, 1)
print("d (unsqueezed at last dim) shape:", d.shape)

# Permute
img_chw = torch.randn(3, 224, 224) # (Channels, Height, Width)
img_hwc = img_chw.permute(1, 2, 0) # (Height, Width, Channels)
print("\nimg_chw shape:", img_chw.shape)
print("img_hwc (permuted) shape:", img_hwc.shape)

# Flatten
f = torch.randn(2,3,4)
f_flat = f.flatten(start_dim=1) # flatten every dimension from dim 1 onward
print("\nf shape:", f.shape)      # torch.Size([2, 3, 4])
print("f_flat shape:", f_flat.shape) # torch.Size([2, 12])
```
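- Contiguous tensors: a tensor is contiguous if its elements are laid out in memory in the order implied by its dimensions. Some operations (such as view) depend on this layout. Use tensor.is_contiguous() to check, and tensor.contiguous() to obtain a contiguous tensor (a copy is made only if the tensor is not already contiguous).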
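A transpose is a simple way to obtain a non-contiguous tensor, which makes the difference between view and reshape visible. This is a minimal sketch with illustrative names.

```python
import torch

x = torch.arange(6).reshape(2, 3)
xt = x.t()                         # transposing changes strides, not the stored data
print(xt.is_contiguous())          # False

try:
    xt.view(6)                     # incompatible with the current memory layout
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)                 # True

flat_a = xt.reshape(6)             # reshape copies when it has to
flat_b = xt.contiguous().view(6)   # explicit contiguous copy, then a view of it
print(flat_a)                      # tensor([0, 3, 1, 4, 2, 5])
```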

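Joining and Splitting:

torch.cat(tensors_tuple, dim=0): concatenates a sequence of tensors along an existing dimension. The tensors must have the same shape in every dimension except the one being concatenated.
torch.stack(tensors_tuple, dim=0): stacks a sequence of tensors along a new dimension. The tensors must all have exactly the same shape.
torch.split(tensor, split_size_or_sections, dim=0): splits a tensor into pieces along the given dimension, either by a fixed piece size or by a list of section sizes.
torch.chunk(tensor, chunks, dim=0): splits a tensor along the given dimension into the requested number of chunks (it may return fewer if the size does not divide that way).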

t1 = torch.randn(2, 3)
t2 = torch.randn(2, 3)
t3 = torch.randn(2, 3)

# Concatenate
cat_dim0 = torch.cat((t1, t2, t3), dim=0) # Shape: (6, 3)
cat_dim1 = torch.cat((t1, t2, t3), dim=1) # Shape: (2, 9)
print("\nt1 shape:", t1.shape)
print("cat_dim0 shape:", cat_dim0.shape)
print("cat_dim1 shape:", cat_dim1.shape)

# Stack
stack_dim0 = torch.stack((t1, t2, t3), dim=0) # Shape: (3, 2, 3)
stack_dim1 = torch.stack((t1, t2, t3), dim=1) # Shape: (2, 3, 3)
print("stack_dim0 shape:", stack_dim0.shape)
print("stack_dim1 shape:", stack_dim1.shape)

# Split
a = torch.arange(10).reshape(5,2) # Shape (5,2)
# tensor([[0, 1],
#         [2, 3],
#         [4, 5],
#         [6, 7],
#         [8, 9]])
split_res = torch.split(a, 2, dim=0) # split along dim 0 into pieces of size 2
# split_res is a tuple: (tensor([[0,1],[2,3]]), tensor([[4,5],[6,7]]), tensor([[8,9]]))
print("\nSplitting 'a':")
for i, chunk in enumerate(split_res):
    print(f"Chunk {i}:\n{chunk}")

# Chunk
chunk_res = torch.chunk(a, 3, dim=0) # split along dim 0 into 3 chunks (as evenly as possible)
print("\nChunking 'a' into 3 parts:")
for i, c in enumerate(chunk_res):
    print(f"Chunk {i}:\n{c}")
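Two behaviors of split and chunk are not covered by the example above: split also accepts a list of section sizes, and chunk can return fewer pieces than requested. The sketch below illustrates both on the same 5x2 tensor; the section sizes [1, 4] are an arbitrary choice.

```python
import torch

a = torch.arange(10).reshape(5, 2)

# Explicit section sizes must add up to the size of the split dimension (5 here).
head, tail = torch.split(a, [1, 4], dim=0)
print(head.shape, tail.shape)  # torch.Size([1, 2]) torch.Size([4, 2])

# Asking for 4 chunks of 5 rows gives pieces of 2, 2, and 1 rows.
parts = torch.chunk(a, 4, dim=0)
print(len(parts))              # 3
```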

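Other common operations:
- Math functions: torch.abs(), torch.sqrt(), torch.exp(), torch.log(), torch.sin(), torch.cos(), torch.sigmoid(), torch.tanh(), and so on.
- Comparisons: torch.eq(), torch.ne(), torch.gt(), torch.ge(), torch.lt(), torch.le() compare element-wise and return a boolean tensor. torch.equal(t1, t2) returns a single boolean that is True only if the two tensors have the same shape and the same elements.
- Reduction ops:
  - torch.sum(tensor, dim=None, keepdim=False): sum.
  - torch.mean(tensor, dim=None, keepdim=False): mean.
  - torch.prod(tensor, dim=None, keepdim=False): product.
  - torch.max(tensor, dim=None, keepdim=False): maximum (when dim is given, the indices are returned as well).
  - torch.min(tensor, dim=None, keepdim=False): minimum (when dim is given, the indices are returned as well).
  - torch.argmax(tensor, dim=None, keepdim=False): index of the maximum.
  - torch.argmin(tensor, dim=None, keepdim=False): index of the minimum.
  - torch.std(tensor, dim=None, unbiased=True, keepdim=False): standard deviation (unbiased=True applies Bessel's correction).
  - torch.var(tensor, dim=None, unbiased=True, keepdim=False): variance (unbiased=True applies Bessel's correction).
  - keepdim=True keeps the reduced dimension with size 1, which makes later broadcasting easier.
- Matrix operations:
  - torch.matmul(t1, t2) or t1 @ t2: matrix multiplication.
  - torch.mm(mat1, mat2): matrix multiplication for 2-D matrices only. matmul is more general.
  - torch.bmm(batch1, batch2): batched matrix multiplication.
  - torch.dot(vec1, vec2): dot product of two 1-D tensors.
  - torch.inverse(matrix): matrix inverse (an alias of torch.linalg.inv).
  - torch.svd(matrix): singular value decomposition (deprecated in recent releases in favor of torch.linalg.svd).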
```
m = torch.tensor([[1., 2.], [3., 4.]])
print("\nMatrix m:\n", m)

# Sum
print("Sum of all elements in m:", torch.sum(m))
print("Sum along dim 0 (columns sum):", torch.sum(m, dim=0))
print("Sum along dim 1 (rows sum) with keepdim=True:\n", torch.sum(m, dim=1, keepdim=True))

# Max
max_val, max_idx = torch.max(m, dim=1)
print("Max values in each row:", max_val)
print("Indices of max values in each row:", max_idx)

# Matrix multiplication
n = torch.tensor([[5., 6.], [7., 8.]])
mat_mul_res = torch.matmul(m, n)
print("\nMatrix m @ n:\n", mat_mul_res)
```
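The remark that keepdim=True helps with broadcasting, and the shape rule for bmm, can both be shown briefly. This is a sketch with illustrative names and arbitrary batch sizes.

```python
import torch

m = torch.tensor([[1., 2.], [3., 4.]])

row_sums = m.sum(dim=1, keepdim=True)  # shape (2, 1), broadcasts across columns
normalized = m / row_sums              # each row now sums to (approximately) 1
print(row_sums.shape)                  # torch.Size([2, 1])

# Without dim, torch.max returns only the maximum value, no indices.
print(torch.max(m))                    # tensor(4.)

# bmm multiplies matching matrices in two batches: (b, n, k) @ (b, k, p) -> (b, n, p)
batch1 = torch.randn(10, 3, 4)
batch2 = torch.randn(10, 4, 5)
batched = torch.bmm(batch1, batch2)
print(batched.shape)                   # torch.Size([10, 3, 5])
```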

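5. The NumPy Bridge

PyTorch tensors and NumPy arrays can be converted to each other efficiently.

Tensor to NumPy array: tensor.numpy()
- If the tensor is on the CPU, the resulting NumPy array shares its underlying memory with the PyTorch tensor. Modifying one therefore affects the other.
- If the tensor is on a GPU, move it to the CPU first (tensor.cpu()) and then call .numpy().
NumPy array to Tensor: torch.from_numpy(ndarray)
- The resulting PyTorch tensor shares its underlying memory with the NumPy array. Modifying one affects the other.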

# Tensor to NumPy
cpu_tensor = torch.ones(5)
np_array_shared = cpu_tensor.numpy()
print("\nCPU Tensor:", cpu_tensor)
print("NumPy array (shared):", np_array_shared)

cpu_tensor.add_(1) # Modify PyTorch tensor
print("CPU Tensor after modification:", cpu_tensor)
print("NumPy array (shared) after PyTorch tensor modification:", np_array_shared) # NumPy array also changed

np_array_shared[0] = 100 # Modify NumPy array
print("NumPy array (shared) after its modification:", np_array_shared)
print("CPU Tensor after NumPy array modification:", cpu_tensor) # PyTorch tensor also changed

# NumPy to Tensor
np_array = np.arange(5)
tensor_shared = torch.from_numpy(np_array)
print("\nNumPy array:", np_array)
print("Tensor (shared):", tensor_shared)

np_array += 10 # Modify NumPy array
print("NumPy array after modification:", np_array)
print("Tensor (shared) after NumPy array modification:", tensor_shared) # Tensor also changed

tensor_shared.mul_(2) # Modify Tensor
print("Tensor (shared) after its modification:", tensor_shared)
print("NumPy array after Tensor modification:", np_array) # NumPy array also changed

# If tensor is on GPU
if torch.cuda.is_available():
    gpu_tensor = torch.ones(3, device="cuda")
    # np_from_gpu = gpu_tensor.numpy() # This would cause an error
    np_from_gpu = gpu_tensor.cpu().numpy() # Correct way
    print("\nNumPy array from GPU tensor (after .cpu()):", np_from_gpu)

This memory sharing is very efficient, but it calls for extra care when modifying data. If you do not want shared memory, use tensor.clone().numpy() or torch.tensor(np_array) (which copies the data).
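The two copying alternatives mentioned above can be verified with a short sketch. The variable names are illustrative.

```python
import numpy as np
import torch

source = torch.ones(3)
shared = source.numpy()               # shares memory with source
independent = source.clone().numpy()  # separate copy

source.add_(1)
print(shared)        # [2. 2. 2.]
print(independent)   # [1. 1. 1.]

np_source = np.zeros(3)
copied_tensor = torch.tensor(np_source)  # torch.tensor copies the data
np_source += 5
print(copied_tensor)  # tensor([0., 0., 0.], dtype=torch.float64)
```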

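6. Tensors on the GPU (CUDA Tensors)

One of PyTorch's core strengths is how easily computations can run on a GPU.

Check whether a GPU is available: torch.cuda.is_available()
Get the number of GPUs: torch.cuda.device_count()
Get the name of a GPU device: torch.cuda.get_device_name(0) (0 is the device index)

Moving a tensor to the GPU:

Specify the device at creation time: torch.tensor(data, device="cuda") or torch.randn(2, 3, device="cuda")

Use the .to(device) method: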

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
my_tensor = torch.randn(2,3)
my_tensor_gpu = my_tensor.to(device)
print(f"\nmy_tensor device: {my_tensor.device}")
print(f"my_tensor_gpu device: {my_tensor_gpu.device}")

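Use the .cuda() method (only when a GPU is available): my_tensor_gpu = my_tensor.cuda()

Moving a tensor back to the CPU:
- my_tensor_cpu = my_tensor_gpu.to("cpu")
- my_tensor_cpu = my_tensor_gpu.cpu()
Note: Tensors that take part in the same operation must be on the same device. Operating directly on a CPU tensor and a GPU tensor raises an error.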

if torch.cuda.is_available():
    print(f"\nCUDA is available! Using GPU: {torch.cuda.get_device_name(0)}")
    device = torch.device("cuda")

    # Create tensor directly on GPU
    x_gpu = torch.ones(2, 2, device=device)
    y_gpu = torch.rand(2, 2, device=device)

    # Perform operation on GPU
    z_gpu = x_gpu + y_gpu
    print("z_gpu (on GPU):\n", z_gpu)
    print("z_gpu device:", z_gpu.device)

    # Move result back to CPU
    z_cpu = z_gpu.cpu()
    print("z_cpu (moved to CPU):\n", z_cpu)
    print("z_cpu device:", z_cpu.device)

    # Trying to operate on tensors on different devices will error
    # x_cpu = torch.ones(2,2)
    # result = x_cpu + x_gpu # This would raise a RuntimeError
else:
    print("\nCUDA is not available. Operations will be on CPU.")
    device = torch.device("cpu")
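Because operands must share a device, a common pattern is to pick the device once and create or move every tensor accordingly, so the same script runs with or without a GPU. A minimal sketch, with illustrative names:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(2, 3, device=device)  # created directly on the chosen device
y = torch.ones(2, 3).to(device)       # created on the CPU, then moved
z = x + y                             # valid: both operands are on the same device

print(z.device.type == device.type)   # True
```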

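7. The requires_grad Attribute (Brief Introduction)

Every tensor has a requires_grad attribute, which defaults to False.
If it is set to True, PyTorch's Autograd system tracks every operation performed on that tensor so that gradients can be computed automatically later.
This is essential for training neural networks. The next section, "Automatic Differentiation (Autograd)", covers it in detail.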

a = torch.tensor([1.0, 2.0, 3.0], requires_grad=False) # Default
b = torch.tensor([4.0, 5.0, 6.0], requires_grad=True) # Track operations on b
c = a + b
print(f"\na.requires_grad: {a.requires_grad}")
print(f"b.requires_grad: {b.requires_grad}")
print(f"c.requires_grad: {c.requires_grad}") # True, because b requires_grad
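As a small preview of what the tracking is used for, the sketch below computes a gradient for a tensor with requires_grad=True. The scalar function (b * 2).sum() is an arbitrary illustrative choice.

```python
import torch

b = torch.tensor([4.0, 5.0, 6.0], requires_grad=True)

loss = (b * 2).sum()   # a scalar built from b
loss.backward()        # fills b.grad with d(loss)/d(b)
print(b.grad)          # tensor([2., 2., 2.])

detached = b.detach()  # same values, no gradient tracking
print(detached.requires_grad)  # False
```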

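Practice and Exercises:

Create different kinds of tensors:
- Create a 1-D tensor containing the numbers 0 to 9.
- Create a 3x3 tensor of zeros with data type torch.float32.
- Create a 2x4 tensor of random numbers in [0, 1).
- Create a tensor from the NumPy array np.array([[1,1],[2,2]]).
Inspect tensor attributes:
- Print the shape, dtype, and device of every tensor you created.
Tensor operations:
- Reshape your 1-D tensor (0-9) into a 2x5 tensor.
- Extract the first row and the last column of that 2x5 tensor.
- Create a 2x5 tensor of ones and add it to your reshaped tensor.
- Permute a 3x4 tensor into a 4x3 tensor.
NumPy bridge:
- Convert a PyTorch tensor to a NumPy array. Modify the NumPy array and check whether the PyTorch tensor changed as well.
- Conversely, create a PyTorch tensor from a NumPy array, modify the tensor, and check the NumPy array.
GPU operations (if your environment supports CUDA):
- Create a tensor and move it to the GPU.
- Create two tensors on the GPU and add them.
- Move the result back to the CPU and print it.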