Paper reading: A COMPREHENSIVE REVIEW OF VISUAL-TEXTUAL SENTIMENT ANALY

A COMPREHENSIVE REVIEW OF VISUAL-TEXTUAL SENTIMENT ANALYSIS FROM SOCIAL MEDIA NETWORKS

The paper is from arXiv: link to the original paper

Overall structure of the paper. Figure 1: table of contents of the paper

1. Introduction

Current research on multimodal sentiment analysis falls into three main categories:
• Social media image and tagged content analysis.
• Audio and visual data analysis.
• Human–human and human–machine interaction analysis.

Contributions of the paper:
• It reviews current works to offer researchers a complete understanding of the methodologies and resources available for visual and textual SA.
• It provides a comprehensive overview of visual and textual SA, including data pre-processing, feature extraction techniques and sentiment benchmark datasets.
• It categorises and summarises the most common SA methodologies, namely, ML, lexicon-based, hybrid and DL methods.
• It provides a brief introduction to the most widely used data fusion strategies and summarises the existing research on visual–textual SA by referencing previously published works.
• It summarises the applications and challenges associated with SA.

2. Visual-textual SA framework


2.1 Textual sentiment analysis

2.1.1 Text pre-processing

Pre-processing seeks to improve the analysis and minimise the dimensionality of the input data. The procedure typically includes the following tasks.

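Lowercasing. Convert all letters in the text to the same case.
Replacing negations. Tweets contain various forms of negation. This step converts won't and can't into will not and cannot, respectively.
Removing unnecessary information (including punctuation, hashtags (#), special characters, extra whitespace, stop words, URL references, @username mentions, numbers and non-ASCII characters, so that only English-language content is retained). Such information is not expected to express the user's sentiment.
Translating emoticons. Users now rely on emoticons to express their thoughts, feelings and emotions, so translating every emoticon into its corresponding word gives better results.
Restoring words with repeated characters to their standard English form. People often stretch words with repeated letters (for example 'coooool') to express their feelings.
Expanding acronyms to their original words using an acronym dictionary. Abbreviations and slang are ill-formed words that appear frequently in tweets and must be restored to their original form.
Tokenisation.
Part-of-speech (PoS) tagging.
Lemmatisation. Reduce a word to its simplest form. This is similar to stemming, but it preserves information associated with the word, such as its PoS tag.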
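
The steps above can be sketched as a small pipeline. This is only a minimal illustration using the Python standard library, not the method of the paper: the function name preprocess, the tiny lookup tables (NEGATIONS, EMOTICONS, ACRONYMS, STOP_WORDS) and the rule of collapsing three or more repeated letters down to two are all assumptions made for the example. PoS tagging and lemmatisation are left out because they would need an NLP library.

```python
import re

# Hypothetical, deliberately tiny lookup tables (assumptions for illustration);
# a real system would load much larger dictionaries.
NEGATIONS = {"won't": "will not", "can't": "cannot"}
EMOTICONS = {":)": "happy", ":(": "sad"}
ACRONYMS = {"lol": "laughing out loud", "idk": "i do not know"}
STOP_WORDS = {"the", "a", "an", "is", "this", "to", "and"}


def preprocess(tweet: str) -> list[str]:
    """Return a list of cleaned tokens for one tweet."""
    # 1. Lowercasing (also normalise the typographic apostrophe).
    text = tweet.lower().replace("\u2019", "'")

    # 2. Remove URL references and @username mentions.
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    text = re.sub(r"@\w+", " ", text)

    # 3. Replace negation contractions with their full forms.
    for short, full in NEGATIONS.items():
        text = text.replace(short, full)

    # 4. Translate emoticons into words before punctuation is stripped.
    for emoticon, word in EMOTICONS.items():
        text = text.replace(emoticon, f" {word} ")

    # 5. Drop non-ASCII characters, then everything that is not a letter
    #    or whitespace (punctuation, '#', digits, special characters).
    text = text.encode("ascii", "ignore").decode("ascii")
    text = re.sub(r"[^a-z\s]", " ", text)

    # 6. Collapse runs of three or more identical letters down to two
    #    (a common heuristic, assumed here): 'coooool' becomes 'cool'.
    text = re.sub(r"([a-z])\1{2,}", r"\1\1", text)

    # 7. Tokenise on whitespace, which also discards extra spaces.
    tokens = text.split()

    # 8. Expand acronyms, then remove stop words.
    expanded: list[str] = []
    for token in tokens:
        expanded.extend(ACRONYMS.get(token, token).split())
    return [t for t in expanded if t not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("@bob This movie is COOOOOL :) can't wait!!! http://t.co/x #fun lol"))
    # ['movie', 'cool', 'happy', 'cannot', 'wait', 'fun', 'laughing', 'out', 'loud']
```

The order of the steps matters in this sketch: emoticons and negations are handled before punctuation is stripped, because stripping first would destroy ':)' and the apostrophe in can't. The repeated-letter rule is crude (it turns 'sooooo' into 'soo'), so a dictionary lookup would be needed to recover the exact English word.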