This paper shows how a multi-class grammatical error detection (GED) system can be used to improve grammatical error correction (GEC) for English.
(Note: a missing (M) error is annotated on the adjacent correct token, e.g. after the token "quite".)
1. Model (multi-class classification, single NMT-based model)
- We treat GED as a multi-class problem and investigate ways to use multi-class GED predictions to inform GEC.
- Specifically, we develop a binary detection system based on pre-trained ELECTRA, and then extend it to multi-class detection using different error type tagsets derived from the ERRANT framework.
- Two methods of using GED information: i) as auxiliary input, and ii) for re-ranking.
- Two-step training: the source MHA and the GED MHA are trained separately, each with the other's parameters frozen. Since GED is trained on comparatively little data, this prevents the GED parameters from overfitting.
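The two-step training idea above can be sketched as follows. This is a minimal PyTorch sketch under our own assumptions: the module names (`source_encoder`, `ged_encoder`) are illustrative stand-ins for the paper's multi-head-attention branches, not the authors' actual architecture.

```python
import torch
import torch.nn as nn

class TwoEncoderGEC(nn.Module):
    """Toy model with a source branch and a GED branch (illustrative)."""
    def __init__(self, d_model=16, n_ged_classes=4):
        super().__init__()
        self.source_encoder = nn.Linear(d_model, d_model)     # stands in for the source MHA stack
        self.ged_encoder = nn.Linear(n_ged_classes, d_model)  # stands in for the GED MHA stack
        self.out = nn.Linear(d_model, d_model)

    def forward(self, src, ged_tags):
        return self.out(self.source_encoder(src) + self.ged_encoder(ged_tags))

def set_trainable(module, trainable):
    for p in module.parameters():
        p.requires_grad = trainable

model = TwoEncoderGEC()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# Step 1: train the source branch while the GED branch is frozen
# (GED sees little data, so freezing guards against overfitting it).
set_trainable(model.ged_encoder, False)
before = model.ged_encoder.weight.detach().clone()

src, ged = torch.randn(2, 5, 16), torch.randn(2, 5, 4)
loss = model(src, ged).pow(2).mean()
opt.zero_grad()
loss.backward()
opt.step()

assert torch.equal(before, model.ged_encoder.weight)  # frozen branch untouched
```

Step 2 would mirror this: re-enable the GED branch and freeze the source branch before continuing training.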
2. Background
- Definition: Following Rei and Yannakoudakis (2016), we treat GED as a sequence labelling (or token classification) task and assign a label to each token in the input sentence.
- Early GED methods focused on specific error types, especially article and preposition errors.
- Treat GED as a sequence labeling task and GEC as a sequence-to-sequence task.
- The mainstream approaches are still Transformer-based Seq2Seq modelling and iterative tagging.
- Multi-task learning (joint GED & GEC) or a pipeline (GED -> GEC)
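The two task formats above can be contrasted on a toy example (the sentence and labels are invented for illustration): GED assigns one label per source token, while GEC maps the whole source sentence to a corrected sequence.

```python
# Sequence labelling (GED): one label per input token.
source = ["I", "goed", "to", "school", "yesterday"]
ged_labels = ["c", "i", "c", "c", "c"]  # c = correct, i = incorrect;
                                        # multi-class variants would use
                                        # error-type tags such as R:VERB
assert len(ged_labels) == len(source)

# Sequence-to-sequence (GEC): a free-form corrected sentence,
# not aligned token-by-token with the input.
gec_output = ["I", "went", "to", "school", "yesterday"]
print(" ".join(gec_output))
```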
3. Datasets
- Lang8
- The 4-class tagset consists of operation type only (missing (M), replacement (R), unnecessary (U) and correct (C)); the 25-class tagset consists of main type only (e.g. noun, noun number, verb tense); the 55-class tagset combines the full tags (see Appendix A in Bryant et al. (2017) for all combinations).
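One way to derive the three tagsets from full ERRANT-style labels (`OP:MAINTYPE`, e.g. `R:VERB:TENSE`) is to keep the operation only, the main type only, or the full tag. A minimal sketch, assuming this tag layout; the function and scheme names are ours, not the paper's:

```python
def collapse(tag, scheme):
    """Collapse a full ERRANT-style tag into one of the three tagsets."""
    if tag == "C":                    # correct token: same in every tagset
        return "C"
    op, _, main = tag.partition(":")  # "R:VERB:TENSE" -> "R", "VERB:TENSE"
    if scheme == "op":                # 4-class: M / R / U / C
        return op
    if scheme == "main":              # 25-class: main type only
        return main
    return tag                        # 55-class: full combined tag

assert collapse("R:VERB:TENSE", "op") == "R"
assert collapse("R:VERB:TENSE", "main") == "VERB:TENSE"
assert collapse("M:DET", "full") == "M:DET"
```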
4. Experiments
- First compare binary detection performance, then extend to multi-class detection.
- Binary GED: the conventional binary classifier outperforms existing baselines.
- Multi-class GED: comparing the 4-class, 25-class and 55-class results shows that, despite the different numbers of classes, all settings detect roughly the same number of errors; the difficulty lies only in assigning the specific class labels.
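The observation above can be made concrete by scoring detection (was an erroneous token flagged at all?) separately from labelling (was the right error type chosen?). The toy data below is invented, and the F0.5 choice simply follows the usual precision-weighted metric in GED/GEC evaluation:

```python
def f_beta(tp, fp, fn, beta=0.5):
    """F-beta score; beta=0.5 weights precision over recall."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r) if p + r else 0.0

gold = ["C", "R:VERB", "C", "M:DET", "C"]
pred = ["C", "R:NOUN", "C", "M:DET", "C"]  # flags the right tokens, one wrong type

# Detection: collapse everything to error vs. correct.
det_tp = sum(g != "C" and p != "C" for g, p in zip(gold, pred))
det_fp = sum(g == "C" and p != "C" for g, p in zip(gold, pred))
det_fn = sum(g != "C" and p == "C" for g, p in zip(gold, pred))

# Labelling: the predicted error type must also match.
lab_tp = sum(g != "C" and p == g for g, p in zip(gold, pred))
lab_fp = sum(p != "C" and p != g for g, p in zip(gold, pred))
lab_fn = sum(g != "C" and p != g for g, p in zip(gold, pred))

print(f_beta(det_tp, det_fp, det_fn))  # detection F0.5 is perfect here
print(f_beta(lab_tp, lab_fp, lab_fn))  # labelled F0.5 is lower
```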
5. Summary
- Fine-tuning pre-trained models, e.g. ELECTRA, helps improve binary GED.
- Next, we employ a multi-encoder GEC model and present two methods of integrating GED predictions into GEC systems: first during GEC fine-tuning, and second as a post-processing re-ranking step.
- Fine-tuning and post-processing each improve final performance independently, and combining them is complementary and yields further gains.
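The re-ranking step can be sketched as follows. This is a hedged illustration, not the paper's exact formulation: we score each n-best GEC hypothesis by its model score plus a bonus for agreeing with the GED predictions (editing flagged tokens, leaving unflagged ones alone). The agreement heuristic and weight are ours.

```python
def ged_agreement(source, hypothesis, ged_labels):
    """Fraction of source tokens whose treatment matches GED:
    flagged tokens should change, unflagged tokens should survive."""
    hyp = set(hypothesis)
    ok = 0
    for tok, lab in zip(source, ged_labels):
        changed = tok not in hyp
        ok += (lab != "C") == changed
    return ok / len(source)

def rerank(source, ged_labels, nbest, weight=1.0):
    """nbest: list of (hypothesis_tokens, model_score); return the best pair."""
    return max(nbest, key=lambda h: h[1] + weight * ged_agreement(source, h[0], ged_labels))

source = ["I", "goed", "to", "school"]
labels = ["C", "R:VERB", "C", "C"]
nbest = [(["I", "goed", "to", "school"], -0.9),  # higher model score, no correction
         (["I", "went", "to", "school"], -1.0)]  # fixes the GED-flagged token
best, _ = rerank(source, labels, nbest)
print(" ".join(best))  # the corrected hypothesis wins after re-ranking
```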