ACL 2023 TemplateGEC: Improving Grammatical Error Correction with Detection Template
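1. Model (TemplateGEC = Seq2Edit + Seq2Seq)
- Combines Seq2Edit and Seq2Seq: a detection template feeds the detection labels from the first stage into the second stage to assist prediction. Because Seq2Edit can produce inaccurate labels, consistency learning based on gold labels is added to make the model more robust.
- Loss = cross-entropy loss + consistency loss
- Consistency loss: KL divergence, which measures the difference between two probability distributions.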
- In the inference stage, only the predicted detection labels are used to generate the template.
- (1) A Seq2Edit model based on ELECTRA was tried (it performs worse than GECToR); the RoBERTa model of GECToR was finally chosen for the English datasets. (2) Gold labels are extracted by ERRANT. (3) The Seq2Seq model is based on Transformer, BART, and T5.
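The loss described above (cross-entropy plus a KL-based consistency term) can be sketched for a single output position as follows. This is a minimal illustration, not the paper's implementation: the function names, the weight `alpha`, the choice to apply cross-entropy to both passes, and the symmetric form of the KL term are all assumptions made for this sketch.

```python
import math

def cross_entropy(probs, gold_index):
    """Negative log-likelihood of the gold token under one distribution."""
    return -math.log(probs[gold_index])

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def combined_loss(p_pred, p_gold, gold_index, alpha=1.0):
    """Cross-entropy plus consistency loss for one target position.

    p_pred: output distribution when the template uses predicted labels
    p_gold: output distribution when the template uses gold labels
    alpha:  hypothetical weight on the consistency term (assumption)
    """
    # Assumption: both passes are trained against the same gold token.
    ce = cross_entropy(p_pred, gold_index) + cross_entropy(p_gold, gold_index)
    # Assumption: a symmetric KL term pulls the two distributions together.
    consistency = 0.5 * (kl_divergence(p_pred, p_gold)
                         + kl_divergence(p_gold, p_pred))
    return ce + alpha * consistency
```

When the two distributions agree, the consistency term vanishes and only the cross-entropy remains; the more the predicted-label pass diverges from the gold-label pass, the larger the penalty.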
1.1 Definition of error detection labels
1.2 Predicted labels vs. gold labels
- Gold label: we use ERRANT (Bryant et al., 2017b) to extract the edits, from which we can obtain the gold label of the source sentence.
- ERRANT is used to extract the gold detection labels from all the training data.
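To make the idea of deriving gold detection labels from edits concrete, here is a small sketch. It does not call ERRANT: the standard-library `difflib` is swapped in as a stand-in aligner so the example stays self-contained. The binary label set (`"C"` for correct, `"I"` for incorrect) and the rule for attaching insertions to a neighbouring source token are assumptions for illustration, since these notes do not list the actual label inventory.

```python
from difflib import SequenceMatcher

def detection_labels(source_tokens, target_tokens):
    """Token-level detection labels for the source sentence.

    "C" = token left unchanged, "I" = token involved in an edit.
    The edits come from difflib here (a stand-in for ERRANT).
    """
    labels = ["C"] * len(source_tokens)
    matcher = SequenceMatcher(a=source_tokens, b=target_tokens, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            # Every source token inside the edited span is marked incorrect.
            for i in range(i1, i2):
                labels[i] = "I"
        elif tag == "insert" and source_tokens:
            # Assumption: an insertion has no source token of its own, so it
            # is attached to the token after the gap (or the last token).
            labels[min(i1, len(source_tokens) - 1)] = "I"
    return labels

source = "He go to school yesterday".split()
target = "He went to school yesterday".split()
print(detection_labels(source, target))  # ['C', 'I', 'C', 'C', 'C']
```

The same function could be applied to every source/target pair in the training data to obtain gold labels in bulk.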
1.3 Detection Template Construction
- Detection Prefix
- Detection Template
1.4 Gold Label-Assisted Consistency Learning
- Motivation
- Training Objective
- Consistency Learning
- KL divergence loss
2. Background
Grammatical error correction (GEC) can be divided into sequence-to-edit (Seq2Edit) and sequence-to-sequence (Seq2Seq) frameworks.
- Seq2Edit: Seq2Edit GEC typically converts a source sentence into a sequence of token-level editing operations, such as insertion and deletion.
- Seq2Seq: Seq2Seq GEC treats GEC as a monolingual translation problem and has the advantage of stronger generation ability for the corrected sentence. However, it still faces the challenge of over-correction (Park et al., 2020).
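As an illustration of the Seq2Edit view, the sketch below applies one edit tag per source token to produce the corrected sentence. The tag names (`KEEP`, `DELETE`, `REPLACE_x`, `APPEND_x`) are loosely modelled on GECToR-style tags and are illustrative only; `apply_edits` is a hypothetical helper, not part of any library.

```python
def apply_edits(tokens, tags):
    """Apply one token-level edit tag to each source token."""
    if len(tokens) != len(tags):
        raise ValueError("need exactly one tag per token")
    output = []
    for token, tag in zip(tokens, tags):
        if tag == "KEEP":
            output.append(token)
        elif tag == "DELETE":
            continue  # drop the token
        elif tag.startswith("REPLACE_"):
            output.append(tag[len("REPLACE_"):])  # substitute a new token
        elif tag.startswith("APPEND_"):
            output.append(token)
            output.append(tag[len("APPEND_"):])  # insert after this token
        else:
            raise ValueError(f"unknown tag: {tag}")
    return output

tokens = "He go to to school".split()
tags = ["KEEP", "REPLACE_goes", "KEEP", "DELETE", "KEEP"]
print(" ".join(apply_edits(tokens, tags)))  # He goes to school
```

A Seq2Seq model, by contrast, would generate the corrected sentence token by token without such an explicit tag sequence, which is where the risk of over-correction comes from.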