Paper: Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings
Venue: EMNLP 2022
Code: Dial2vec
1. Motivation
Task: learning unsupervised dialogue embeddings;
Trivial approaches: combining pre-trained word or sentence embeddings, or encoding the dialogue with pre-trained language models (PLMs);
Defect: these approaches ignore the conversational interactions between interlocutors, resulting in poor performance.
Previous studies have extensively demonstrated the importance of encoding token-level interactions for learning semantic textual embeddings.
However, for dialogue embeddings, encoding interlocutor-level interactions is also essential, yet it is precisely what the trivial approaches overlook.
2. Model
Dial2vec constructs the positive and negative samples needed for contrastive learning in a clever, self-guided way: as shown in the figure above, it applies several operations to the learned embedding representations to build the positive and negative samples.
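As a rough illustration of this self-guided contrastive setup, the sketch below treats the two interlocutors' pooled turn embeddings as two views of the same dialogue (a positive pair) and embeddings drawn from other dialogues as negatives, scored with an InfoNCE-style loss. The function names, the mean-pooling aggregation, and the toy vectors are all illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss: pull the positive pair together,
    push the anchor away from every negative sample."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = np.array(sims) / temperature
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-np.log(probs[0]))             # positive pair sits at index 0

# Toy turn embeddings (hypothetical values), one row per utterance.
turns_a = np.array([[1.0, 0.1], [0.9, 0.0]])    # speaker A, dialogue 1
turns_b = np.array([[0.8, 0.2], [1.0, 0.1]])    # speaker B, dialogue 1
anchor = turns_a.mean(axis=0)                   # view 1 of the dialogue
positive = turns_b.mean(axis=0)                 # view 2 = positive sample
negatives = [np.array([0.0, 1.0])]              # embedding from another dialogue
loss = info_nce_loss(anchor, positive, negatives)
```

The key design choice mirrored here is that the positives come from within the same dialogue (the other interlocutor's side), so no labels are needed: the dialogue supervises itself.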
3. Performance
The experimental results demonstrate the superiority of the method.
Dialogues contain noise: for ordinary models, training on both interlocutors' sides is not necessarily better than training on a single side, whereas dial2vec combines the two sides and achieves better results.
We conclude that, under the guidance of the conversational interactions, dial2vec eliminates the interlocutor-level interaction-free information and highlights the interaction-aware information, thus achieving better performance. In short, the model performs well precisely because it accounts for the interactions between interlocutors.
4. References
[1] Liu C, Wang R, Jiang J, et al. Dial2vec: Self-Guided Contrastive Learning of Unsupervised Dialogue Embeddings[J]. arXiv preprint arXiv:2210.15332, 2022.