25 COLING End-to-End Refined Alignment-Based Approach


Refined Evaluation for End-to-End Grammatical Error Correction Using an Alignment-Based Approach (open-writing-evaluation.github.io/)

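  • The work aims to reproduce and improve on the results of existing evaluation tools such as errant, and addresses the challenge posed by sentence boundary detection discrepancies in text preprocessing.
  • The authors also propose a potential multilingual errant, which they use to present GEC results for Chinese and Korean.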


1. Model (Alignment-based errant)

Our work adapts errant by incorporating an alignment-based preprocessing approach.

  • Our primary contribution is the refined alignment process, which addresses discrepancies in sentence boundaries between gold-standard and system-generated results.

Unlike recursive editing for MT (juejin.cn/post/749156… ), this work uses a jointly preprocessed (jp) algorithm, which uses alignment to preprocess sentence boundaries and tokenization between source and target.
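To make the sentence boundary problem concrete, here is a minimal Python sketch that reconciles two different segmentations of the same text by grouping sentences until their boundaries agree. It is not the paper's jp algorithm: it assumes both segmentations cover identical text (ignoring whitespace), whereas the paper aligns source and target text that differ. The function names `char_boundaries` and `align_boundaries` are hypothetical.

```python
def char_boundaries(sentences):
    """Cumulative count of non-whitespace characters at the end of each sentence."""
    ends, total = [], 0
    for sent in sentences:
        total += sum(1 for ch in sent if not ch.isspace())
        ends.append(total)
    return ends


def align_boundaries(seg_a, seg_b):
    """Group sentences from two segmentations of the same text so that each
    pair of groups covers exactly the same character span.

    Assumption: seg_a and seg_b contain the same characters once whitespace
    is ignored; only the sentence boundaries differ.
    """
    # Drop empty sentences so the cumulative offsets are strictly increasing.
    seg_a = [s for s in seg_a if s.strip()]
    seg_b = [s for s in seg_b if s.strip()]
    if not seg_a and not seg_b:
        return []
    ends_a, ends_b = char_boundaries(seg_a), char_boundaries(seg_b)
    if not ends_a or not ends_b or ends_a[-1] != ends_b[-1]:
        raise ValueError("segmentations do not cover the same text")

    # Boundaries present in both segmentations are the only safe cut points.
    shared = sorted(set(ends_a) & set(ends_b))
    groups, i, j = [], 0, 0
    for end in shared:
        next_i = ends_a.index(end) + 1
        next_j = ends_b.index(end) + 1
        groups.append((seg_a[i:next_i], seg_b[j:next_j]))
        i, j = next_i, next_j
    return groups


gold = ["Mr. Smith left.", "He was late."]
system = ["Mr.", "Smith left.", "He was late."]  # splitter broke after "Mr."
for gold_group, system_group in align_boundaries(gold, system):
    print(gold_group, "<->", system_group)
```

In this example the wrongly split "Mr." / "Smith left." pair is regrouped against the single gold sentence, so a downstream evaluator would compare like with like.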

jp-errant: 1) uses stanza for improved Chinese word segmentation and POS tagging, and 2) adheres to the original errant conventions (Missing, Unnecessary, and Replacement) for consistent multilingual grammatical error annotation.
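The Missing / Unnecessary / Replacement convention can be illustrated with a small Python sketch. It uses the standard library's `difflib.SequenceMatcher` as a stand-in aligner, which is an assumption made for illustration: errant itself uses its own linguistically informed alignment and adds fine-grained error types that are not reproduced here. The helper `extract_edits` is hypothetical.

```python
from difflib import SequenceMatcher

# Map difflib opcodes to errant-style operation codes:
# an insertion in the correction means a token was Missing (M),
# a deletion means a token was Unnecessary (U),
# anything else that changed is a Replacement (R).
OP_CODES = {"insert": "M", "delete": "U", "replace": "R"}


def extract_edits(orig_tokens, cor_tokens):
    """Return (operation, original span, corrected span) for each changed region."""
    matcher = SequenceMatcher(a=orig_tokens, b=cor_tokens, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        edits.append((OP_CODES[tag], orig_tokens[i1:i2], cor_tokens[j1:j2]))
    return edits


orig = "I go to school yesterday".split()
cor = "I went to the school yesterday".split()
for op, before, after in extract_edits(orig, cor):
    print(op, before, "->", after)
```

Because the operation code depends only on which side of the edit is empty, the same convention carries over to any language once tokenization is fixed, which is what makes it usable for multilingual annotation.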


2. Background

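  1. Tokenization changes may introduce grammatical morphemes that do not exist in the gold-standard sentence, and vice versa. Grammatical morphemes here are function words such as prepositions and conjunctions; their counterpart is content words, i.e. lexical words.

    • can’t can be tokenized as either can not or ca n’t
  2. String similarity (Jaro-Winkler distance): the original Jaro similarity is computed from the number of matching characters between two sequences (and their transpositions). Jaro-Winkler adds a prefix scaling factor p applied to a common prefix of length l. Similarly, this work aims to introduce a suffix scaling factor to balance the weighting.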

  3. Gale and Church (1993) proposed a length-based algorithm for sentence alignment.

  4. the Gale and Church algorithm

  5. Write & Improve (W&I) corpus (Bryant et al., 2019).
    • the distribution of grammatical errors (Zeng et al., 2024).

  6. extended Gale-Church algorithm (EGC) and the jp algorithm (JP)
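The Jaro and Jaro-Winkler similarities from the list above can be sketched in Python as follows. This is the textbook formulation; the defaults `p=0.1` and `max_prefix=4` are the conventional Winkler values rather than anything stated in these notes, and the paper's suffix-weighted variant is not reproduced because its formula is not given here.

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1], based on matching characters and transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo, hi = max(0, i - window), min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Matched characters that appear in a different order are transpositions.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, p=0.1, max_prefix=4):
    """Jaro similarity boosted by the length l of the common prefix, scaled by p."""
    sim = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return sim + prefix * p * (1 - sim)


print(round(jaro("MARTHA", "MARHTA"), 4))
print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
```

The prefix boost rewards strings that agree at the start, which is why a suffix counterpart is a natural extension when the end of a token (for example an inflectional ending) carries the difference.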

3. Evaluation metrics

To measure the performance and reliability of GEC systems, there are several evaluation metrics:

  • M2 (Dahlmeier and Ng, 2012)
  • GLEU (Napoles et al., 2015)
  • errant (Bryant et al., 2017; Bryant, 2019)
  • PT-M2 (Gong et al., 2022)

errant (ERRor ANnotation Toolkit) is currently the de facto standard for evaluating GEC.

  • It provides both overall performance analysis and analysis by specific error category.
    • It has already been applied to multiple languages, including German, Chinese, and Korean, among others (Boyd, 2018; Hinson et al., 2020; Zhang et al., 2022; Sonawane et al., 2020; Belkebir and Habash, 2021; Náplava et al., 2022; Katinskaia et al., 2022; Yoon et al., 2023).
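Edit-based evaluation of this kind compares the set of system edits with the set of gold edits and reports precision, recall, and F0.5. A minimal Python sketch is shown below. The edit representation `(start, end, correction)` and the function name `score_edits` are assumptions for illustration, and the sketch handles a single reference only.

```python
def score_edits(system_edits, gold_edits, beta=0.5):
    """Return (precision, recall, F_beta) for two collections of edits.

    Each edit is assumed to be a hashable tuple such as
    (start_token, end_token, correction_string).
    """
    system_edits, gold_edits = set(system_edits), set(gold_edits)
    tp = len(system_edits & gold_edits)   # edits the system got right
    fp = len(system_edits - gold_edits)   # edits the system should not have made
    fn = len(gold_edits - system_edits)   # gold edits the system missed

    # With no false positives (or negatives) the score is defined as perfect.
    precision = tp / (tp + fp) if fp else 1.0
    recall = tp / (tp + fn) if fn else 1.0
    denom = beta ** 2 * precision + recall
    f_beta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_beta


gold = {(1, 2, "went"), (3, 3, "the"), (5, 6, "")}
system = {(1, 2, "went"), (4, 5, "a")}
precision, recall, f05 = score_edits(system, gold)
print(f"P={precision:.4f} R={recall:.4f} F0.5={f05:.4f}")
```

With beta set to 0.5, precision is weighted more heavily than recall, so a system is penalized more for making a wrong correction than for missing one. Because edits are matched by span, any mismatch in sentence boundaries or tokenization shifts the offsets and turns otherwise correct edits into false positives and false negatives.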

4. Tools

5. Datasets

References