21 ACL A Tale of Two Systems: Multi-Class Classification

This paper shows how a multi-class grammatical error detection (GED) system can be used to improve grammatical error correction (GEC) for English.

[Figure: example of token-level GED labels. M = missing: the label is placed after the correct token "quite".]

1. Model (multi-class classification, single-model NMT-based)

  • We treat GED as a multi-class problem and investigate ways to use multi-class GED predictions to inform GEC.
  • Specifically, we develop a binary detection system based on pre-trained ELECTRA, and then extend it to multi-class detection using different error type tagsets derived from the ERRANT framework.
  • Two methods of using GED information: i) as auxiliary input, and ii) for re-ranking.
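The re-ranking idea can be sketched as follows. This is a minimal illustration and not the paper's scoring function: it assumes each n-best hypothesis comes with a GEC model score, that GED has flagged a set of source token positions as erroneous, and that the two are combined by penalizing disagreement. The names `edited_positions`, `rerank`, and `weight` are hypothetical.

```python
from difflib import SequenceMatcher


def edited_positions(source, hypothesis):
    """Return the set of source token indices that a hypothesis changes."""
    changed = set()
    matcher = SequenceMatcher(a=source, b=hypothesis, autojunk=False)
    for op, i1, i2, _, _ in matcher.get_opcodes():
        if op in ("replace", "delete"):
            changed.update(range(i1, i2))
        elif op == "insert" and i1 < len(source):
            # Assumption: an insertion is attributed to the following source token.
            changed.add(i1)
    return changed


def rerank(source, nbest, ged_error_positions, weight=1.0):
    """Sort (hypothesis_tokens, gec_score) pairs, best first.

    Assumed scoring: GEC score minus a penalty for every source position
    where the hypothesis edits and the GED predictions disagree.
    """
    def combined(item):
        hypothesis, gec_score = item
        changed = edited_positions(source, hypothesis)
        disagreement = len(changed ^ ged_error_positions)
        return gec_score - weight * disagreement

    return sorted(nbest, key=combined, reverse=True)


source = "He go to school".split()
nbest = [
    (source, -0.1),                        # hypothesis that changes nothing
    ("He goes to school".split(), -0.3),   # hypothesis that edits token 1
]
ged_error_positions = {1}                  # GED flags "go" as an error
for hypothesis, score in rerank(source, nbest, ged_error_positions):
    print(" ".join(hypothesis), score)
```

With a multi-class detector, the same comparison could be made on error type labels instead of plain positions; the binary version is used here only to keep the sketch short.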

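  • Two-step training: the source MHA (multi-head attention) and the GED MHA are trained separately by freezing parameters. Because less data is available for GED, this keeps the GED-side parameters from overfitting.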

2. Background

  • Definition: Following Rei and Yannakoudakis (2016), we treat GED as a sequence labelling (or token classification) task and assign a label to each token in the input sentence.
  • Early GED methods focused on specific error types, especially article and preposition errors.
  • Treat GED as a sequence labeling task and GEC as a sequence-to-sequence task.
  • The mainstream approaches are still based on Transformers, with either Seq2Seq modelling or iterative tagging.
  • Multi-task (GED & GEC) or pipeline (GED -> GEC)
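To make the sequence labelling view concrete, here is a minimal sketch that derives one binary label per source token from a source/correction pair. It uses a plain token alignment from `difflib` rather than ERRANT, and it assumes that a missing word is marked on the source token that follows the gap; both choices, and the function name `binary_ged_labels`, are illustrative assumptions.

```python
from difflib import SequenceMatcher


def binary_ged_labels(source, target):
    """Label each source token "C" (correct) or "I" (incorrect)."""
    labels = ["C"] * len(source)
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    for op, i1, i2, _, _ in matcher.get_opcodes():
        if op in ("replace", "delete"):
            for i in range(i1, i2):
                labels[i] = "I"
        elif op == "insert" and i1 < len(source):
            # Assumed convention: a missing word is marked on the next source token.
            # An insertion at the very end of the sentence is left unlabelled here.
            labels[i1] = "I"
    return labels


source = "She is student".split()
target = "She is a student".split()
for token, label in zip(source, binary_ged_labels(source, target)):
    print(token, label)
```

A GED model is then trained to predict this label sequence from the source alone, whereas a GEC model maps the source sequence to the corrected sequence.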

3. Datasets

  • Lang8

  • The 4-class tagset uses the operation type only (i.e. missing (M), replacement (R), unnecessary (U) and correct (C)); the 25-class tagset uses the main type only (e.g. noun, noun number, verb tense, etc.); the 55-class tagset uses the full combined tags (see Appendix A in Bryant et al. (2017) for all combinations).
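The relationship between the three tagsets can be sketched by collapsing a full tag into its coarser forms. The sketch assumes tags are written in the ERRANT style `OPERATION:TYPE` (for example `R:VERB:TENSE`) and that correct tokens carry the label `C`; the helper name `collapse_tag` is hypothetical.

```python
def collapse_tag(tag: str, tagset: int) -> str:
    """Map a full tag such as "R:VERB:TENSE" to the 4-, 25- or 55-class tagset."""
    if tag == "C":
        # Correct tokens keep the same label in every tagset.
        return "C"
    operation, _, main_type = tag.partition(":")
    if tagset == 4:
        return operation               # operation only: M, R or U
    if tagset == 25:
        return main_type or operation  # main type only, e.g. "VERB:TENSE"
    if tagset == 55:
        return tag                     # full tag: operation plus main type
    raise ValueError(f"unsupported tagset size: {tagset}")


full_tags = ["C", "R:VERB:TENSE", "C", "M:DET", "U:PREP"]
for size in (4, 25, 55):
    print(size, [collapse_tag(tag, size) for tag in full_tags])
```

Because the coarser labels are derived from the full ones, a single annotated corpus can supply training labels for all three detectors.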

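4. Experimental analysis

  • Binary classification performance is compared first, and the system is then extended to multi-class classification.
  • Binary GED: the conventional binary classifier outperforms existing baselines.
  • Multi-class GED: comparing the 4-class, 25-class and 55-class results shows that, despite the different number of classes, all systems detect roughly the same number of errors and only have difficulty classifying certain class labels.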

5. Summary

  1. Fine-tuning a pre-trained model, such as ELECTRA, helps improve binary GED.
  2. Next, we employ a multi-encoder GEC model and present two methods of integrating GED predictions into GEC systems: first during GEC fine-tuning, and second as a post-processing re-ranking step.
  3. Fine-tuning and post-processing each improve final performance on their own, and they are complementary: combining them gives the strongest results.