[UPDATED!] 2024-02-17 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-17 | TC-DiffRecon: Texture coordination MRI reconstruction method based on diffusion model and modified MF-UNet method | TC-DiffRecon:基于扩散模型和改进的MF-UNet方法的纹理协调MRI重建方法 | Chenyan Zhang, Yifei Chen, Zhenxiong Fan, Yiyu Huang, Wenchao Weng, Ruiquan Ge, Dong Zeng, Changmiao Wang | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | DiffPoint: Single and Multi-view Point Cloud Reconstruction with ViT Based Diffusion Model | DiffPoint:使用基于 ViT 的扩散模型进行单视点和多视点云重建 | Yu Feng, Xing Shi, Mengli Cheng, Yun Xiong | arxiv.org/pdf/2402.11… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-17 | Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models | Asclepius:医学多模态大语言模型的频谱评估基准 | Wenxuan Wang, Yihang Su, Jingyuan Huan, Jie Liu, Wenting Chen, Yudi Zhang, Cheng-Yi Li, Kao-Jung Chang, Xiaohan Xin, Linlin Shen, et al. | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | Hand Biometrics in Digital Forensics | 数字取证中的手部生物识别技术 | Asish Bera, Debotosh Bhattacharjee, Mita Nasipuri | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | Supporting Experts with a Multimodal Machine-Learning-Based Tool for Human Behavior Analysis of Conversational Videos | 使用基于多模态机器学习的工具支持专家对对话视频进行人类行为分析 | Riku Arakawa, Kiyosu Maeda, Hiromu Yakura | arxiv.org/pdf/2402.11… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-17 | Semantically-aware Neural Radiance Fields for Visual Scene Understanding: A Comprehensive Review | 用于视觉场景理解的语义感知神经辐射场:综合综述 | Thang-Anh-Quan Nguyen, Amine Bourki, Mátyás Macudzinski, Anthony Brunel, Mohammed Bennamoun | arxiv.org/pdf/2402.11… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-17 | GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation | GraphKD:通过结构化图创建探索知识蒸馏以实现文档对象检测 | Ayan Banerjee, Sanket Biswas, Josep Lladós, Umapada Pal | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | On Good Practices for Task-Specific Distillation of Large Pretrained Models | 关于大型预训练模型的特定任务蒸馏的良好实践 | Juliette Marrie, Michael Arbel, Julien Mairal, Diane Larlus | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | Hierarchical Prior-based Super Resolution for Point Cloud Geometry Compression | 用于点云几何压缩的基于分层先验的超分辨率 | Dingquan Li, Kede Ma, Jing Wang, Ge Li | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | Knowledge Distillation Based on Transformed Teacher Matching | 基于变革型教师匹配的知识蒸馏 | Kaixiang Zheng, En-Hui Yang | arxiv.org/pdf/2402.11… | link |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-17 | Exploiting T-norms for Deep Learning in Autonomous Driving | 利用 T 范数进行自动驾驶深度学习 | Mihaela Cătălina Stoian, Eleonora Giunchiglia, Thomas Lukasiewicz | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | ChatEarthNet: A Global-Scale, High-Quality Image-Text Dataset for Remote Sensing | ChatEarthNet:全球范围的高质量遥感图像文本数据集 | Zhenghang Yuan, Zhitong Xiong, Lichao Mou, Xiao Xiang Zhu | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | ICHPro: Intracerebral Hemorrhage Prognosis Classification Via Joint-attention Fusion-based 3d Cross-modal Network | ICHPro:通过基于联合注意力融合的 3d 跨模态网络对脑出血预后进行分类 | Xinlei Yu, Xinyang Li, Ruiquan Ge, Shibin Wu, Ahmed Elazab, Jichao Zhu, Lingyan Zhang, Gangyong Jia, Taosheng Xu, Xiang Wan, et al. | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | ReViT: Enhancing Vision Transformers with Attention Residual Connections for Visual Recognition | ReViT:通过视觉识别的注意力残留连接增强视觉变压器 | Anxhelo Diko, Danilo Avola, Marco Cascio, Luigi Cinque | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | Semi-supervised Medical Image Segmentation Method Based on Cross-pseudo Labeling Leveraging Strong and Weak Data Augmentation Strategies | 基于交叉伪标记利用强弱数据增强策略的半监督医学图像分割方法 | Yifei Chen, Chenyan Zhang, Yifan Ke, Yiyu Huang, Xuezhou Dai, Feiwei Qin, Yongquan Zhang, Xiaodong Zhang, Changmiao Wang | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | Training-free image style alignment for self-adapting domain shift on handheld ultrasound devices | 无需训练的图像风格对齐,可在手持式超声设备上实现自适应域移动 | Hongye Zeng, Ke Zou, Zhihao Chen, Yuchong Gao, Hongbo Chen, Haibin Zhang, Kang Zhou, Meng Wang, Rick Siow Mong Goh, Yong Liu, et al. | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | A Decoding Scheme with Successive Aggregation of Multi-Level Features for Light-Weight Semantic Segmentation | 一种用于轻量级语义分割的多级特征连续聚合的解码方案 | Jiwon Yoo, Jangwon Lee, Gyeonghwan Kim | arxiv.org/pdf/2402.11… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-17 | Dense Matchers for Dense Tracking | 用于密集跟踪的密集匹配器 | Tomáš Jelínek, Jonáš Šerých, Jiří Matas | arxiv.org/pdf/2402.11… | null |
LLM
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-17 | CoLLaVO: Crayon Large Language and Vision mOdel | CoLLaVO:Crayon 大语言和视觉模型 | Byung-Kwan Lee, Beomchan Park, Chae Won Kim, Yong Man Ro | arxiv.org/pdf/2402.11… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-17 | FViT: A Focal Vision Transformer with Gabor Filter | FViT:带有 Gabor 滤波器的焦点视觉转换器 | Yulong Shi, Mingwei Sun, Yongshuai Wang, Rui Wang, Hui Sun, Zengqiang Chen | arxiv.org/pdf/2402.11… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-17 | Probabilistic Routing for Graph-Based Approximate Nearest Neighbor Search | 基于图的近似最近邻搜索的概率路由 | Kejing Lu, Chuan Xiao, Yoshiharu Ishikawa | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | Learning by Reconstruction Produces Uninformative Features For Perception | 通过重构学习会产生无信息的感知特征 | Randall Balestriero, Yann LeCun | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | Enhancing Surgical Performance in Cardiothoracic Surgery with Innovations from Computer Vision and Artificial Intelligence: A Narrative Review | 利用计算机视觉和人工智能的创新提高心胸外科的手术表现:叙述性回顾 | Merryn D. Constable, Hubert P. H. Shum, Stephen Clark | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions | 超越文字描述:理解和定位符合人类意图的开放世界物体 | Wenxuan Wang, Yisi Zhang, Xingjian He, Yichen Yan, Zijia Zhao, Xinlong Wang, Jing Liu | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | Be Persistent: Towards a Unified Solution for Mitigating Shortcuts in Deep Learning | 坚持不懈:寻求统一解决方案以减少深度学习中的捷径 | Hadi M. Dolatabadi, Sarah M. Erfani, Christopher Leckie | arxiv.org/pdf/2402.11… | null |
| 2024-02-17 | Understanding News Thumbnail Representativeness by Counterfactual Text-Guided Contrastive Language-Image Pretraining | 通过反事实文本引导对比语言-图像预训练了解新闻缩略图的代表性 | Yejun Yoon, Seunghyun Yoon, Kunwoo Park | arxiv.org/pdf/2402.11… | null |