UPDATED: 2024-01-07
Classification/Detection/Recognition/Segmentation
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-07 | Big Data and Deep Learning in Smart Cities: A Comprehensive Dataset for AI-Driven Traffic Accident Detection and Computer Vision Systems | 智慧城市中的大数据和深度学习:人工智能驱动的交通事故检测和计算机视觉系统的综合数据集 | Victor Adewopo, Nelly Elsayed, Zag Elsayed, Murat Ozer, Constantinos Zekios, Ahmed Abdelgawad, Magdy Bayoumi | arxiv.org/pdf/2401.03… | null |
| 2024-01-07 | Invisible Reflections: Leveraging Infrared Laser Reflections to Target Traffic Sign Perception | 看不见的反射:利用红外激光反射来实现交通标志感知 | Takami Sato, Sri Hrushikesh Varma Bhupathiraju, Michael Clifford, Takeshi Sugawara, Qi Alfred Chen, Sara Rampazzi | arxiv.org/pdf/2401.03… | null |
| 2024-01-07 | SeTformer is What You Need for Vision and Language | SeTformer 是您视觉和语言所需的工具 | Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Michael Felsberg | arxiv.org/pdf/2401.03… | null |
| 2024-01-07 | Text-Driven Traffic Anomaly Detection with Temporal High-Frequency Modeling in Driving Videos | 在驾驶视频中使用时间高频建模进行文本驱动的交通异常检测 | Rongqin Liang, Yuanman Li, Jiantao Zhou, Xia Li | arxiv.org/pdf/2401.03… | null |
| 2024-01-07 | Re:Draw -- Context Aware Translation as a Controllable Method for Artistic Production | Re:Draw——语境感知翻译作为艺术生产的可控方法 | Joao Liborio Cardoso, Francesco Banterle, Paolo Cignoni, Michael Wimmer | arxiv.org/pdf/2401.03… | null |
| 2024-01-07 | Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions | 医学图像分割的 Segment Anything 模型:当前应用和未来方向 | Yichi Zhang, Zhenrong Shen, Rushi Jiao | arxiv.org/pdf/2401.03… | link |
| 2024-01-07 | A Classification of Critical Configurations for any Number of Projective Views | 任意数量的投影视图的关键配置的分类 | Martin Bråtelund | arxiv.org/pdf/2401.03… | null |
| 2024-01-07 | Bilateral Reference for High-Resolution Dichotomous Image Segmentation | 高分辨率二分图像分割的双边参考 | Peng Zheng, Dehong Gao, Deng-Ping Fan, Li Liu, Jorma Laaksonen, Wanli Ouyang, Nicu Sebe | arxiv.org/pdf/2401.03… | null |
| 2024-01-07 | conv_einsum: A Framework for Representation and Fast Evaluation of Multilinear Operations in Convolutional Tensorial Neural Networks | conv_einsum:卷积张量神经网络中多线性运算的表示和快速评估框架 | Tahseen Rabbani, Jiahao Su, Xiaoyu Liu, David Chan, Geoffrey Sangston, Furong Huang | arxiv.org/pdf/2401.03… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-07 | BCLNet: Bilateral Consensus Learning for Two-View Correspondence Pruning | BCLNet:双视图对应剪枝的双边共识学习 | Xiangyang Miao, Guobao Xiao, Shiping Wang, Jun Yu | arxiv.org/pdf/2401.03… | link |
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-07 | SpecRef: A Fast Training-free Baseline of Specific Reference-Condition Real Image Editing | SpecRef:特定参考条件真实图像编辑的快速免训练基线 | Songyan Chen, Jiancheng Huang | arxiv.org/pdf/2401.03… | link |
| 2024-01-07 | Deep Learning-based Image and Video Inpainting: A Survey | 基于深度学习的图像和视频修复:一项调查 | Weize Quan, Jiaxi Chen, Yanli Liu, Dong-Ming Yan, Peter Wonka | arxiv.org/pdf/2401.03… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-07 | GRAM: Global Reasoning for Multi-Page VQA | GRAM:多页面 VQA 的全局推理 | Tsachi Blau, Sharon Fogel, Roi Ronen, Alona Golts, Roy Ganz, Elad Ben Avraham, Aviad Aberdam, Shahar Tsiper, Ron Litman | arxiv.org/pdf/2401.03… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-07 | Involution Fused ConvNet for Classifying Eye-Tracking Patterns of Children with Autism Spectrum Disorder | 用于对自闭症谱系障碍儿童的眼动追踪模式进行分类的对合融合卷积网络 | Md. Farhadul Islam, Meem Arafat Manab, Joyanta Jyoti Mondal, Sarah Zabeen, Fardin Bin Rahman, Md. Zahidul Hasan, Farig Sadeque, Jannatun Noor | arxiv.org/pdf/2401.03… | null |
| 2024-01-07 | FurniScene: A Large-scale 3D Room Dataset with Intricate Furnishing Scenes | FurniScene:具有复杂家具场景的大型 3D 房间数据集 | Genghao Zhang, Yuxi Wang, Chuanchen Luo, Shibiao Xu, Junran Peng, Zhaoxiang Zhang, Man Zhang | arxiv.org/pdf/2401.03… | null |
| 2024-01-07 | See360: Novel Panoramic View Interpolation | See360:新颖的全景插值 | Zhi-Song Liu, Marie-Paule Cani, Wan-Chi Siu | arxiv.org/pdf/2401.03… | link |
| 2024-01-07 | Towards Effective Multiple-in-One Image Restoration: A Sequential and Prompt Learning Strategy | 实现有效的多合一图像恢复:一种顺序且快速的学习策略 | Xiangtao Kong, Chao Dong, Lei Zhang | arxiv.org/pdf/2401.03… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-07 | Amirkabir campus dataset: Real-world challenges and scenarios of Visual Inertial Odometry (VIO) for visually impaired people | Amirkabir 校园数据集:视障人士视觉惯性里程计 (VIO) 的现实挑战和场景 | Ali Samadzadeh, Mohammad Hassan Mojab, Heydar Soudani, Seyed Hesamoddin Mireshghollah, Ahmad Nickabadi | arxiv.org/pdf/2401.03… | null |