24 ESWA_Enhancing rumor detection with data augmentation and generative pre-trained transformer

  • data augmentation
  • generative pre-trained transformer

GAP

However, existing methods fail to capture the deeper semantics of rumor texts needed for detection. In addition, class imbalance in rumor datasets reduces the effectiveness of these algorithms.

Idea

Leverage the Generative Pre-trained Transformer 2 (GPT-2) model to generate rumor-like texts and thereby create a balanced dataset (i.e., use GPT-2 for data augmentation); a generation sketch follows the bullet below.

  • GPT-2 captures rich semantic information and can produce diverse, high-quality synthetic text samples.
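A minimal sketch of the generation step using the Hugging Face transformers library. The base "gpt2" checkpoint, the prompt, and the sampling hyperparameters below are illustrative assumptions, not the paper's exact setup (the authors presumably fine-tune GPT-2 on rumor texts before sampling).

```python
# Sketch: sampling rumor-like texts from GPT-2 for data augmentation.
# Assumption: base "gpt2" checkpoint; the paper's fine-tuned model would be used instead.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def generate_rumor_like(prompt, n_samples=5, max_length=60):
    """Sample several continuations of a seed rumor text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,            # stochastic sampling keeps augmentations diverse
        top_k=50,
        top_p=0.95,
        temperature=0.9,
        max_length=max_length,
        num_return_sequences=n_samples,
        pad_token_id=tokenizer.eos_token_id,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

# Example: seed generation with an existing minority-class (rumor) post.
for text in generate_rumor_like("BREAKING: reports claim that"):
    print(text)
```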

Datasets

PHEME, Twitter15, and Twitter16 datasets.
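Since the GAP above points to class imbalance in these datasets, a minimal sketch of topping up the minority class with generated texts until the labels are balanced. The helper generate_rumor_like() comes from the previous sketch, and the label names are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: assembling a balanced training set from (text, label) pairs.
# Assumptions: binary labels "rumor"/"non-rumor" and the generate_rumor_like() helper above.
import random

def balance_with_augmentation(examples, minority_label="rumor", majority_label="non-rumor"):
    """Add GPT-2 generated texts to the minority class until both classes are the same size."""
    minority = [text for text, label in examples if label == minority_label]
    majority = [text for text, label in examples if label == majority_label]
    augmented = list(minority)
    while len(augmented) < len(majority):
        seed = random.choice(minority)            # seed each generation with a real minority example
        augmented.extend(generate_rumor_like(seed, n_samples=1))
    balanced = ([(text, minority_label) for text in augmented]
                + [(text, majority_label) for text in majority])
    random.shuffle(balanced)
    return balanced
```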


Experimental Results
