Background reading:

- PyTorch: text generation with a Bi-LSTM
- How to create a poet / writer using Deep Learning (Text Generation using Python)? (plus its Chinese translation: writing poetry with deep learning / text generation in Python)
- Essentials of Deep Learning: Introduction to Long Short Term Memory (for LSTM theory)
- PyTorch LSTM: Text Generation Tutorial (original and its Chinese translation)
- Text generation on "Anna Karenina": building an LSTM model with TensorFlow

All of the above is earlier background reading, some of it with excellent code. Below, I work through the code following the 磐创AI article.
Download the raw data
Project Gutenberg is a library of over 60,000 free eBooks
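A minimal download sketch (the URL here is only a placeholder, not from the article; substitute the plain-text link of whichever Gutenberg book you actually use):

import urllib.request

# Placeholder URL -- replace with the actual plain-text file from gutenberg.org
url = "https://www.gutenberg.org/files/XXXX/XXXX-0.txt"
raw_text = urllib.request.urlopen(url).read().decode("utf-8")
print(len(raw_text))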
Processing steps for parsing the text data: the final text keeps only the 27 characters defined in letters (the 26 lowercase letters plus the space).
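A hedged sketch of that cleaning step, continuing from the download sketch above (the variable names and exact filtering rule are my assumptions; the point is just to keep the 26 lowercase letters plus the space):

import string

letters = string.ascii_lowercase + " "   # the 27 characters that are kept

def clean_text(raw_text):
    # Lowercase everything, then drop any character outside `letters`
    return "".join(ch for ch in raw_text.lower() if ch in letters)

text = clean_text(raw_text)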
Characters cannot be fed to the model directly; they have to be mapped to corresponding numeric values:
def create_dictionary(text):
    char_to_idx = dict()
    idx_to_char = dict()
    idx = 0
    for char in text:
        if char not in char_to_idx:
            # Build both mappings: character -> index and index -> character
            char_to_idx[char] = idx
            idx_to_char[idx] = char
            idx += 1
    return char_to_idx, idx_to_char
Let's test it:
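A quick check (assuming text is the cleaned corpus from above):

char_to_idx, idx_to_char = create_dictionary(text)
print(len(char_to_idx))   # 27: 26 letters plus the space
print(idx_to_char[0])     # whichever character happens to appear first in the corpus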
This just assigns an index to each of the 27 characters. Frankly, I think this could be written out by hand: there are only 26 letters plus one space character, so there is no real need to traverse the whole document. My guess is the author is worried that the document's 439,522 characters might not actually contain every one of the 26 letters + ' '.
Sequence generation

The sliding window in the article is 4; but if the space character is included, shouldn't that be 5 characters?
import numpy as np

def build_sequences(text, char_to_idx, window):
    # (hypothetical wrapper name -- the article shows only the loop body)
    x = list()
    y = list()
    # Stop `window` characters early so text[i + window] always exists
    for i in range(len(text) - window):
        # Take a window of characters from the text
        # and convert it to its idx representation
        sequence1 = text[i:i + window]
        sequence = [char_to_idx[char] for char in sequence1]
        # Get the target: the character right after the window,
        # also converted to its idx representation
        target1 = text[i + window]
        target = char_to_idx[target1]
        # Save the sequence and the target
        x.append(sequence)
        y.append(target)
    x = np.array(x)
    y = np.array(y)
    return x, y
My understanding: the input is the previous four characters (sequence), and the output to predict is the next character (target). In the end x is an N×4 matrix, where N = len(text) - window, one training pair per window position.
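A toy run (with window = 4, as in the article; the string is my own example) makes the shapes concrete:

toy = "hello world"
toy_to_idx, _ = create_dictionary(toy)
x, y = build_sequences(toy, toy_to_idx, window=4)
print(x.shape)        # (7, 4): N = len(toy) - window = 11 - 4
print(x[0], y[0])     # indices of 'hell' -> index of 'o'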
Bi-LSTM recurrent neural network
The approach consists of passing each character sequence into an embedding layer, which produces a vector-form representation for each element of the sequence, giving us an embedded character sequence. Each element of the embedded sequence is then fed into the Bi-LSTM layer. Next, the outputs of the two LSTMs that make up the Bi-LSTM (the forward LSTM and the backward LSTM) are concatenated. Each forward+backward concatenated vector is then passed into an LSTM layer, and the last hidden state from that layer is fed to a linear layer. The final linear layer uses a Softmax as its activation function, to represent the probability of each character.

The key difference between a standard LSTM and a Bi-LSTM is that the Bi-LSTM is made up of 2 LSTMs, usually called the "forward LSTM" and the "backward LSTM".
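For intuition, PyTorch's built-in nn.LSTM can produce the same forward+backward concatenation in one call; a minimal sketch (all sizes here are arbitrary values I picked, not from the article):

import torch
import torch.nn as nn

seq_len, batch, hidden_dim = 4, 2, 8
embedded = torch.randn(seq_len, batch, hidden_dim)        # an embedded character sequence
bi_lstm = nn.LSTM(hidden_dim, hidden_dim, bidirectional=True)
out, _ = bi_lstm(embedded)
print(out.shape)  # torch.Size([4, 2, 16]): forward and backward states concatenated per position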
First, let's look at how the constructor of the TextGenerator class is built.

Code snippet 4: the TextGenerator class constructor
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextGenerator(nn.ModuleList):
    def __init__(self, args, vocab_size):
        super(TextGenerator, self).__init__()
        self.batch_size = args.batch_size
        self.hidden_dim = args.hidden_dim
        self.input_size = vocab_size
        self.num_classes = vocab_size
        self.sequence_len = args.window
        # Dropout
        self.dropout = nn.Dropout(0.25)
        # Embedding layer
        self.embedding = nn.Embedding(self.input_size, self.hidden_dim, padding_idx=0)
        # Bi-LSTM
        # Forward and backward cells
        self.lstm_cell_forward = nn.LSTMCell(self.hidden_dim, self.hidden_dim)
        self.lstm_cell_backward = nn.LSTMCell(self.hidden_dim, self.hidden_dim)
        # LSTM layer fed with the concatenated Bi-LSTM outputs
        self.lstm_cell = nn.LSTMCell(self.hidden_dim * 2, self.hidden_dim * 2)
        # Linear layer
        self.linear = nn.Linear(self.hidden_dim * 2, self.num_classes)
First time running PyTorch in Jupyter. Following the warning (ipywidgets.readthedocs.io/en/stable/u…), install ipywidgets:

pip install ipywidgets -i https://pypi.douban.com/simple
jupyter nbextension enable --py widgetsnbextension

I use a virtual environment with Jupyter; after this, importing PyTorch no longer shows the warning. Solved.
On the other hand, we define the two "LSTMCells" that make up the Bi-LSTM (forward and backward). We then define the LSTMCell that will be fed with the output of the "Bi-LSTM". It is worth mentioning that its hidden state size is twice that of the Bi-LSTM cells; this is because the outputs of the Bi-LSTM are concatenated. The linear layer is defined last; its output will later be filtered by a softmax function.
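To see where the hidden_dim * 2 comes from, here is a tiny shape check (the sizes are arbitrary values I picked for illustration):

import torch

h_fwd = torch.zeros(2, 8)                # forward hidden state: (batch, hidden_dim)
h_bwd = torch.zeros(2, 8)                # backward hidden state: (batch, hidden_dim)
cat = torch.cat((h_fwd, h_bwd), dim=1)   # concatenate along the feature dimension
print(cat.shape)                         # torch.Size([2, 16]): hence an LSTMCell of size hidden_dim * 2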
Full code
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextGenerator(nn.ModuleList):
    def __init__(self, args, vocab_size):
        super(TextGenerator, self).__init__()
        self.batch_size = args.batch_size
        self.hidden_dim = args.hidden_dim
        self.input_size = vocab_size
        self.num_classes = vocab_size
        self.sequence_len = args.window
        # Dropout
        self.dropout = nn.Dropout(0.25)
        # Embedding layer
        self.embedding = nn.Embedding(self.input_size, self.hidden_dim, padding_idx=0)
        # Bi-LSTM
        # Forward and backward cells
        self.lstm_cell_forward = nn.LSTMCell(self.hidden_dim, self.hidden_dim)
        self.lstm_cell_backward = nn.LSTMCell(self.hidden_dim, self.hidden_dim)
        # LSTM layer
        self.lstm_cell = nn.LSTMCell(self.hidden_dim * 2, self.hidden_dim * 2)
        # Linear layer
        self.linear = nn.Linear(self.hidden_dim * 2, self.num_classes)

    def forward(self, x):
        # Code snippet 5: weight initialization
        # Bi-LSTM
        # hs = [batch_size x hidden_size]
        # cs = [batch_size x hidden_size]
        hs_forward = torch.zeros(x.size(0), self.hidden_dim)
        cs_forward = torch.zeros(x.size(0), self.hidden_dim)
        hs_backward = torch.zeros(x.size(0), self.hidden_dim)
        cs_backward = torch.zeros(x.size(0), self.hidden_dim)
        # LSTM
        # hs = [batch_size x (hidden_size * 2)]
        # cs = [batch_size x (hidden_size * 2)]
        hs_lstm = torch.zeros(x.size(0), self.hidden_dim * 2)
        cs_lstm = torch.zeros(x.size(0), self.hidden_dim * 2)
        # Weights initialization
        torch.nn.init.kaiming_normal_(hs_forward)
        torch.nn.init.kaiming_normal_(cs_forward)
        torch.nn.init.kaiming_normal_(hs_backward)
        torch.nn.init.kaiming_normal_(cs_backward)
        torch.nn.init.kaiming_normal_(hs_lstm)
        torch.nn.init.kaiming_normal_(cs_lstm)
        # From idx to embedding: (batch, seq_len, hidden_dim)
        out = self.embedding(x)
        # Prepare the shape for the LSTM cells: (seq_len, batch, hidden_dim).
        # transpose (rather than the original view) keeps each batch element intact
        out = out.transpose(0, 1)
        forward = []
        backward = []
        # Unfolding the Bi-LSTM
        # Forward pass over the sequence
        for i in range(self.sequence_len):
            hs_forward, cs_forward = self.lstm_cell_forward(out[i], (hs_forward, cs_forward))
            forward.append(hs_forward)
        # Backward pass over the sequence (from last character to first)
        for i in reversed(range(self.sequence_len)):
            hs_backward, cs_backward = self.lstm_cell_backward(out[i], (hs_backward, cs_backward))
            backward.append(hs_backward)
        # Reverse so that backward[t] lines up with forward[t] for the same position
        backward.reverse()
        # LSTM over the concatenated forward+backward states
        for fwd, bwd in zip(forward, backward):
            input_tensor = torch.cat((fwd, bwd), 1)
            hs_lstm, cs_lstm = self.lstm_cell(input_tensor, (hs_lstm, cs_lstm))
        # The last hidden state is passed through a linear layer
        out = self.linear(hs_lstm)
        return out
Apologies: I could not follow the rest. I need to properly understand the LSTM model first; I lost track of the code flow here.
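To at least sanity-check the data flow, here is a minimal usage sketch; the args object is not shown in the article, so I build a stand-in with argparse.Namespace and pick arbitrary hyperparameters:

import torch
from argparse import Namespace

args = Namespace(batch_size=2, hidden_dim=8, window=4)  # assumed values
vocab_size = 27

model = TextGenerator(args, vocab_size)
x = torch.randint(0, vocab_size, (args.batch_size, args.window))  # a dummy batch of index windows
logits = model(x)
print(logits.shape)  # torch.Size([2, 27]): one score per character for each sequence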