[UPDATED!] 2024-03-10 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-10 | An End-to-End Deep Learning Generative Framework for Refinable Shape Matching and Generation | 用于可细化形状匹配和生成的端到端深度学习生成框架 | Soodeh Kalaie, Andy Bulpitt, Alejandro F. Frangi, Ali Gooya | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing | FastVideoEdit:利用一致性模型进行高效的文本到视频编辑 | Youyuan Zhang, Xuan Ju, James J. Clark | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | On depth prediction for autonomous driving using self-supervised learning | 基于自监督学习的自动驾驶深度预测 | Houssem Boulahbal | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation | DiffuMatting:使用抠图级注释合成任意对象 | Xiaobin Hu, Xu Peng, Donghao Luo, Xiaozhong Ji, Jinlong Peng, Zhengkai Jiang, Jiangning Zhang, Taisong Jin, Chengjie Wang, Rongrong Ji | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation | Platypose:校准的零样本多假设 3D 人体运动估计 | Paweł A. Pierzchlewicz, Caio da Silva, R. James Cotton, Fabian H. Sinz | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | MACE: Mass Concept Erasure in Diffusion Models | MACE:扩散模型中的质量概念擦除 | Shilin Lu, Zilan Wang, Leyang Li, Yanzhu Liu, Adams Wai-Kin Kong | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Coherent Temporal Synthesis for Incremental Action Segmentation | 用于增量动作分割的相干时间合成 | Guodong Ding, Hans Golong, Angela Yao | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models | VidProM:用于文本到视频扩散模型的百万级真实提示图库数据集 | Wenhao Wang, Yi Yang | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Diffusion Models Trained with Large Data Are Transferable Visual Models | 用大数据训练的扩散模型是可迁移的视觉模型 | Guangkai Xu, Yongtao Ge, Mingyu Liu, Chengxiang Fan, Kangyang Xie, Zhiyue Zhao, Hao Chen, Chunhua Shen | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Implicit Image-to-Image Schrodinger Bridge for CT Super-Resolution and Denoising | 用于 CT 超分辨率和降噪的隐式图像到图像薛定谔电桥 | Yuang Wang, Siyeop Yoon, Pengfei Jin, Matthew Tivnan, Zhennong Chen, Rui Hu, Li Zhang, Zhiqiang Chen, Quanzheng Li, Dufan Wu | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Decoupled Data Consistency with Diffusion Purification for Image Restoration | 通过扩散净化解耦数据一致性以进行图像恢复 | Xiang Li, Soo Min Kwon, Ismail R. Alkhouri, Saiprasad Ravishankar, Qing Qu | arxiv.org/pdf/2403.06… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-10 | FOAA: Flattened Outer Arithmetic Attention For Multimodal Tumor Classification | FOAA:多模态肿瘤分类的扁平化外部算术注意力 | Omnia Alwazzan, Ioannis Patras, Gregory Slabaugh | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | A streamlined Approach to Multimodal Few-Shot Class Incremental Learning for Fine-Grained Datasets | 细粒度数据集的多模态少样本类增量学习的简化方法 | Thang Doan, Sima Behpour, Xin Li, Wenbin He, Liang Gou, Liu Ren | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | A Comprehensive Overhaul of Multimodal Assistant with Small Language Models | 小语言模型多模态助手的全面改造 | Minjie Zhu, Yichen Zhu, Xin Liu, Ning Liu, Zhiyuan Xu, Chaomin Shen, Yaxin Peng, Zhicai Ou, Feifei Feng, Jian Tang | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency | DrFuse:学习具有缺失模态和模态不一致的临床多模态融合的解缠结表示 | Wenfang Yao, Kejing Yin, William K. Cheung, Jia Liu, Jing Qin | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | RESTORE: Towards Feature Shift for Vision-Language Prompt Learning | RESTORE:迈向视觉语言即时学习的功能转变 | Yuncheng Yang, Chuyan Zhang, Zuopeng Yang, Yuting Gao, Yulei Qin, Ke Li, Xing Sun, Jie Yang, Yun Gu | arxiv.org/pdf/2403.06… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-10 | S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes | S-DyRF:动态场景的基于参考的程式化辐射场 | Xingyi Li, Zhiguo Cao, Yizheng Wu, Kewei Wang, Ke Xian, Zhe Wang, Guosheng Lin | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Is Vanilla MLP in Neural Radiance Field Enough for Few-shot View Synthesis? | 神经辐射场中的普通 MLP 是否足以进行少量视图合成? | Hanxin Zhu, Tianyu He, Xin Li, Bingchen Li, Zhibo Chen | arxiv.org/pdf/2403.06… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-10 | | | Roy Miles, Ismail Elezi, Jiankang Deng | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Decoupled Contrastive Learning for Long-Tailed Recognition | 用于长尾识别的解耦对比学习 | Shiyu Xuan, Shiliang Zhang | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Knowledge Distillation of Convolutional Neural Networks through Feature Map Transformation using Decision Trees | 使用决策树通过特征图变换进行卷积神经网络的知识蒸馏 | Maddimsetti Srinivas, Debdoot Sheet | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing | 用于无监督语义哈希的位掩码鲁棒对比知识蒸馏 | Liyang He, Zhenya Huang, Jiayu Liu, Enhong Chen, Fei Wang, Jing Sha, Shijin Wang | arxiv.org/pdf/2403.06… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-10 | Transformer based Multitask Learning for Image Captioning and Object Detection | 基于 Transformer 的图像描述和对象检测多任务学习 | Debolena Basak, P. K. Srijith, Maunendra Sankar Desarkar | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Probing Image Compression For Class-Incremental Learning | 探索用于类增量学习的图像压缩 | Justin Yang, Zhihao Duan, Andrew Peng, Yuning Huang, Jiangpeng He, Fengqing Zhu | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Physics-Guided Abnormal Trajectory Gap Detection | 物理引导的异常轨迹间隙检测 | Arun Sharma, Shashi Shekhar | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Poly Kernel Inception Network for Remote Sensing Detection | 用于遥感检测的多核初始网络 | Xinhao Cai, Qiuxia Lai, Yuwei Wang, Wenguan Wang, Zeren Sun, Yazhou Yao | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Text-Guided Variational Image Generation for Industrial Anomaly Detection and Segmentation | 用于工业异常检测和分割的文本引导变分图像生成 | Mingyu Lee, Jongwon Choi | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | COVID-19 Computer-aided Diagnosis through AI-assisted CT Imaging Analysis: Deploying a Medical AI System | 通过人工智能辅助 CT 成像分析进行 COVID-19 计算机辅助诊断:部署医疗人工智能系统 | Demetris Gerogiannis, Anastasios Arsenos, Dimitrios Kollias, Dimitris Nikitopoulos, Stefanos Kollias | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Finding Visual Saliency in Continuous Spike Stream | 在连续尖峰流中寻找视觉显着性 | Lin Zhu, Xianzhang Chen, Xiao Wang, Hua Huang | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | PEPSI: Pathology-Enhanced Pulse-Sequence-Invariant Representations for Brain MRI | PEPSI:脑 MRI 的病理学增强脉冲序列不变表示 | Peirong Liu, Oula Puonti, Annabel Sorby-Adams, William T. Kimberly, Juan E. Iglesias | arxiv.org/pdf/2403.06… | link |
| 2024-03-10 | SuPRA: Surgical Phase Recognition and Anticipation for Intra-Operative Planning | SuPRA:手术阶段识别和术中规划预测 | Maxence Boels, Yang Liu, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Cross-Cluster Shifting for Efficient and Effective 3D Object Detection in Autonomous Driving | 跨集群转移可实现自动驾驶中高效且有效的 3D 物体检测 | Zhili Chen, Kien T. Pham, Maosheng Ye, Zhiqiang Shen, Qifeng Chen | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Cracking the neural code for word recognition in convolutional neural networks | 破解卷积神经网络中单词识别的神经代码 | Aakash Agrawal, Stanislas Dehaene | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | GlanceVAD: Exploring Glance Supervision for Label-efficient Video Anomaly Detection | GlanceVAD:探索 Glance 监督以实现标签高效的视频异常检测 | Huaxin Zhang, Xiang Wang, Xiaohao Xu, Xiaonan Huang, Chuchu Han, Yuehuan Wang, Changxin Gao, Shanjun Zhang, Nong Sang | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Bayesian Random Semantic Data Augmentation for Medical Image Classification | 用于医学图像分类的贝叶斯随机语义数据增强 | Yaoyao Zhu, Xiuding Cai, Xueyao Wang, Yu Yao | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | ClickVOS: Click Video Object Segmentation | ClickVOS:点击视频对象分割 | Pinxue Guo, Lingyi Hong, Xinyu Zhou, Shuyong Gao, Wanyun Li, Jinglun Li, Zhaoyu Chen, Xiaoqiang Li, Wei Zhang, Wenqiang Zhang | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | In-context Prompt Learning for Test-time Vision Recognition with Frozen Vision-language Model | 使用冻结视觉语言模型进行测试时视觉识别的上下文提示学习 | Junhui Yin, Xinyu Zhang, Lin Wu, Xianghua Xie, Xiaojie Wang | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning | 通过协方差对齐和语义一致性对比学习的风格盲域广义语义分割 | Woo-Jin Ahn, Geun-Yeong Yang, Hyun-Duck Choi, Myo-Taeg Lim | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | CLEAR: Cross-Transformers with Pre-trained Language Model is All you need for Person Attribute Recognition and Retrieval | 明确:具有预训练语言模型的跨变压器是人物属性识别和检索所需的全部 | Doanh C. Bui, Thinh V. Le, Hung Ba Ngo, Tae Jong Choi | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Textureless Object Recognition: An Edge-based Approach | 无纹理对象识别:基于边缘的方法 | Frincy Clement, Kirtan Shah, Dhara Pancholi, Gabriel Lugo Bustillo, Dr. Irene Cheng | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Universal Debiased Editing for Fair Medical Image Classification | 用于公平医学图像分类的通用去偏编辑 | Ruinan Jin, Wenlong Deng, Minghui Chen, Xiaoxiao Li | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors | 使用 2D 检测引导的查询锚点增强 3D 对象检测 | Haoxuanye Ji, Pengpeng Liang, Erkang Cheng | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Towards In-Vehicle Multi-Task Facial Attribute Recognition: Investigating Synthetic Data and Vision Foundation Models | 迈向车载多任务面部属性识别:研究合成数据和视觉基础模型 | Esmaeil Seraj, Walter Talamonti | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Reframe Anything: LLM Agent for Open World Video Reframing | 重构一切:用于开放世界视频重构的 LLM 代理 | Jiawang Cao, Yongliang Wu, Weiheng Chi, Wenbo Zhu, Ziyue Su, Jay Wu | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | CausalCellSegmenter: Causal Inference inspired Diversified Aggregation Convolution for Pathology Image Segmentation | CausalCellSegmenter:因果推理启发病理图像分割的多样化聚合卷积 | Dawei Fan, Yifan Gao, Jiaming Yu, Yanping Chen, Wencheng Li, Chuancong Lin, Kaibin Li, Changcai Yang, Riqing Chen, Lifang Wei | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Texture image retrieval using a classification and contourlet-based features | 使用分类和基于轮廓波的特征进行纹理图像检索 | Asal Rouhafzay, Nadia Baaziz, Mohand Said Allili | arxiv.org/pdf/2403.06… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-10 | BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering | BlazeBVD:让缩放时间均衡再次发挥作用,实现盲视频去闪烁 | Xinmin Qiu, Congying Han, Zicheng Zhang, Bonan Li, Tiande Guo, Pingyu Wang, Xuecheng Nie | arxiv.org/pdf/2403.06… | null |
LLM
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-10 | Low-dose CT Denoising with Language-engaged Dual-space Alignment | 通过语言参与的双空间对齐进行低剂量 CT 去噪 | Zhihao Chen, Tao Chen, Chenhui Wang, Chuang Niu, Ge Wang, Hongming Shan | arxiv.org/pdf/2403.06… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-10 | MoST: Motion Style Transformer between Diverse Action Contents | MoST:不同动作内容之间的动作风格转换器 | Boeun Kim, Jungho Kim, Hyung Jin Chang, Jin Young Choi | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Harmonious Group Choreography with Trajectory-Controllable Diffusion | 具有轨迹可控扩散的和谐团体编排 | Yuqin Dai, Wanlu Zhu, Ronghui Li, Zeping Ren, Xiangzheng Zhou, Xiu Li, Jun Li, Jian Yang | arxiv.org/pdf/2403.06… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-10 | PSS-BA: LiDAR Bundle Adjustment with Progressive Spatial Smoothing | PSS-BA:具有渐进式空间平滑功能的 LiDAR 束调整 | Jianping Li, Thien-Minh Nguyen, Shenghai Yuan, Lihua Xie | arxiv.org/pdf/2403.06… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-10 | Understanding and Mitigating Human-Labelling Errors in Supervised Contrastive Learning | 理解和减少监督对比学习中的人类标记错误 | Zijun Long, Lipeng Zhuang, George Killick, Richard McCreadie, Gerardo Aragon Camarasa, Paul Henderson | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Test-time Distribution Learning Adapter for Cross-modal Visual Reasoning | 用于跨模态视觉推理的测试时间分布学习适配器 | Yi Zhang, Ce Zhang | arxiv.org/pdf/2403.06… | null |
Others
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-10 | Leveraging Computer Vision in the Intensive Care Unit (ICU) for Examining Visitation and Mobility | 在重症监护病房 (ICU) 中利用计算机视觉检查探视和活动情况 | Scott Siegel, Jiaqing Zhang, Sabyasachi Bandyopadhyay, Subhash Nerella, Brandon Silva, Tezcan Baslanti, Azra Bihorac, Parisa Rashidi | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | UNICORN: Ultrasound Nakagami Imaging via Score Matching and Adaptation | UNICORN:通过分数匹配和适应进行超声 Nakagami 成像 | Kwanyoung Kim, Jaa-Yeon Lee, Jong Chul Ye | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Online Multi-spectral Neuron Tracing | 在线多光谱神经元追踪 | Bin Duan, Yuzhang Shang, Dawen Cai, Yan Yan | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | All-in-one platform for AI R&D in medical imaging, encompassing data collection, selection, annotation, and pre-processing | 集数据采集、选择、标注、预处理为一体的医学影像AI研发一体化平台 | Changhee Han, Kyohei Shibano, Wataru Ozaki, Keishiro Osaki, Takafumi Haraguchi, Daisuke Hirahara, Shumon Kimura, Yasuyuki Kobayashi, Gento Mogi | arxiv.org/pdf/2403.06… | null |
| 2024-03-10 | Multisize Dataset Condensation | 多尺寸数据集压缩 | Yang He, Lingao Xiao, Joey Tianyi Zhou, Ivor Tsang | arxiv.org/pdf/2403.06… | null |