This paper shows how a multi-class grammatical error detection (GED) system can be used to improve grammatical error correction (GEC) for English.
(Note: a missing (M) error is annotated on the adjacent correct token, e.g. after the token "quite".)
1. Model (multi-class classification, single NMT-based model)
- We treat GED as a multi-class problem and investigate ways to use multi-class GED predictions to inform GEC.
- Specifically, we develop a binary detection system based on pre-trained ELECTRA, and then extend it to multi-class detection using different error type tagsets derived from the ERRANT framework.
- Two methods of using GED information: i) as auxiliary input, and ii) for re-ranking.
- Two-step training: the source MHA and the GED MHA are trained separately, each with the other's parameters frozen. Since GED is trained on comparatively little data, this prevents the GED parameters from overfitting.
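The two-step training idea above can be sketched as follows. This is a minimal PyTorch sketch under our own assumptions: the module names (`source_encoder`, `ged_encoder`) are illustrative stand-ins for the paper's multi-head-attention branches, not the authors' actual architecture.

```python
import torch
import torch.nn as nn

class TwoEncoderGEC(nn.Module):
    """Toy model with a source branch and a GED branch (illustrative)."""
    def __init__(self, d_model=16, n_ged_classes=4):
        super().__init__()
        self.source_encoder = nn.Linear(d_model, d_model)     # stands in for the source MHA stack
        self.ged_encoder = nn.Linear(n_ged_classes, d_model)  # stands in for the GED MHA stack
        self.out = nn.Linear(d_model, d_model)

    def forward(self, src, ged_tags):
        return self.out(self.source_encoder(src) + self.ged_encoder(ged_tags))

def set_trainable(module, trainable):
    for p in module.parameters():
        p.requires_grad = trainable

model = TwoEncoderGEC()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# Step 1: train the source branch while the GED branch is frozen
# (GED sees little data, so freezing guards against overfitting it).
set_trainable(model.ged_encoder, False)
before = model.ged_encoder.weight.detach().clone()

src, ged = torch.randn(2, 5, 16), torch.randn(2, 5, 4)
loss = model(src, ged).pow(2).mean()
opt.zero_grad()
loss.backward()
opt.step()

assert torch.equal(before, model.ged_encoder.weight)  # frozen branch untouched
```

Step 2 would mirror this: re-enable the GED branch and freeze the source branch before continuing training.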
2. Background
- Definition: Following Rei and Yannakoudakis (2016), we treat GED as a sequence labelling (or token classification) task and assign a label to each token in the input sentence.
- Early GED methods focused on specific error types, especially article and preposition errors.
- Treat GED as a sequence labeling task and GEC as a sequence-to-sequence task.
- The mainstream approaches are still Transformer-based Seq2Seq modelling and iterative tagging.
- Multi-task learning (joint GED & GEC) or a pipeline (GED -> GEC)
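The two task formats above can be contrasted on a toy example (the sentence and labels are invented for illustration): GED assigns one label per source token, while GEC maps the whole source sentence to a corrected sequence.

```python
# Sequence labelling (GED): one label per input token.
source = ["I", "goed", "to", "school", "yesterday"]
ged_labels = ["c", "i", "c", "c", "c"]  # c = correct, i = incorrect;
                                        # multi-class variants would use
                                        # error-type tags such as R:VERB
assert len(ged_labels) == len(source)

# Sequence-to-sequence (GEC): a free-form corrected sentence,
# not aligned token-by-token with the input.
gec_output = ["I", "went", "to", "school", "yesterday"]
print(" ".join(gec_output))
```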
3. Datasets
- Lang8
- The 4-class tagset consists of operation type only (missing (M), replacement (R), unnecessary (U) and correct (C)); the 25-class tagset consists of main type only (e.g. noun, noun number, verb tense); the 55-class tagset combines the full tags (see Appendix A in Bryant et al. (2017) for all combinations).
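One way to derive the three tagsets from full ERRANT-style labels (`OP:MAINTYPE`, e.g. `R:VERB:TENSE`) is to keep the operation only, the main type only, or the full tag. A minimal sketch, assuming this tag layout; the function and scheme names are ours, not the paper's:

```python
def collapse(tag, scheme):
    """Collapse a full ERRANT-style tag into one of the three tagsets."""
    if tag == "C":                    # correct token: same in every tagset
        return "C"
    op, _, main = tag.partition(":")  # "R:VERB:TENSE" -> "R", "VERB:TENSE"
    if scheme == "op":                # 4-class: M / R / U / C
        return op
    if scheme == "main":              # 25-class: main type only
        return main
    return tag                        # 55-class: full combined tag

assert collapse("R:VERB:TENSE", "op") == "R"
assert collapse("R:VERB:TENSE", "main") == "VERB:TENSE"
assert collapse("M:DET", "full") == "M:DET"
```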
4. Experiments
- First compare binary detection performance, then extend to multi-class detection.
- Binary GED: the conventional binary classifier outperforms existing baselines.
- Multi-class GED: comparing the 4-class, 25-class and 55-class results shows that, despite the different numbers of classes, all settings detect roughly the same number of errors; the difficulty lies only in assigning the specific class labels.
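The observation above can be made concrete by scoring detection (was an erroneous token flagged at all?) separately from labelling (was the right error type chosen?). The toy data below is invented, and the F0.5 choice simply follows the usual precision-weighted metric in GED/GEC evaluation:

```python
def f_beta(tp, fp, fn, beta=0.5):
    """F-beta score; beta=0.5 weights precision over recall."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r) if p + r else 0.0

gold = ["C", "R:VERB", "C", "M:DET", "C"]
pred = ["C", "R:NOUN", "C", "M:DET", "C"]  # flags the right tokens, one wrong type

# Detection: collapse everything to error vs. correct.
det_tp = sum(g != "C" and p != "C" for g, p in zip(gold, pred))
det_fp = sum(g == "C" and p != "C" for g, p in zip(gold, pred))
det_fn = sum(g != "C" and p == "C" for g, p in zip(gold, pred))

# Labelling: the predicted error type must also match.
lab_tp = sum(g != "C" and p == g for g, p in zip(gold, pred))
lab_fp = sum(p != "C" and p != g for g, p in zip(gold, pred))
lab_fn = sum(g != "C" and p != g for g, p in zip(gold, pred))

print(f_beta(det_tp, det_fp, det_fn))  # detection F0.5 is perfect here
print(f_beta(lab_tp, lab_fp, lab_fn))  # labelled F0.5 is lower
```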
5. Summary
- Fine-tuning pre-trained models, e.g. ELECTRA, helps improve binary GED.
- Next, we employ a multi-encoder GEC model and present two methods of integrating GED predictions into GEC systems: first during GEC fine-tuning, and second as a post-processing re-ranking step.
- Fine-tuning and post-processing each improve final performance independently, and combining them is complementary and yields further gains.
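The re-ranking step can be sketched as follows. This is a hedged illustration, not the paper's exact formulation: we score each n-best GEC hypothesis by its model score plus a bonus for agreeing with the GED predictions (editing flagged tokens, leaving unflagged ones alone). The agreement heuristic and weight are ours.

```python
def ged_agreement(source, hypothesis, ged_labels):
    """Fraction of source tokens whose treatment matches GED:
    flagged tokens should change, unflagged tokens should survive."""
    hyp = set(hypothesis)
    ok = 0
    for tok, lab in zip(source, ged_labels):
        changed = tok not in hyp
        ok += (lab != "C") == changed
    return ok / len(source)

def rerank(source, ged_labels, nbest, weight=1.0):
    """nbest: list of (hypothesis_tokens, model_score); return the best pair."""
    return max(nbest, key=lambda h: h[1] + weight * ged_agreement(source, h[0], ged_labels))

source = ["I", "goed", "to", "school"]
labels = ["C", "R:VERB", "C", "C"]
nbest = [(["I", "goed", "to", "school"], -0.9),  # higher model score, no correction
         (["I", "went", "to", "school"], -1.0)]  # fixes the GED-flagged token
best, _ = rerank(source, labels, nbest)
print(" ".join(best))  # the corrected hypothesis wins after re-ranking
```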