[Share][Daily Update][2024.02.23][CV_arxiv_papers]

[UPDATED!] 2024-02-23 (Publish Time)

Generative Models

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-23 | A Study of Shape Modeling Against Noise | Cheng Long, Adrian Barbu | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition | Chun-Hsiao Yeh, Ta-Ying Cheng, He-Yen Hsieh, Chuan-En Lin, Yi Ma, Andrew Markham, Niki Trigoni, H. T. Kung, Yubei Chen | arxiv.org/pdf/2402.15… | link |
| 2024-02-23 | ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation | Yi Zhang, Yun Tang, Wenjie Ruan, Xiaowei Huang, Siddartha Khastgir, Paul Jennings, Xingyu Zhao | arxiv.org/pdf/2402.15… | link |
| 2024-02-23 | On normalization-equivariance properties of supervised and unsupervised denoising methods: a survey | Sébastien Herbreteau, Charles Kervrann | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Label-efficient Multi-organ Segmentation Method with Diffusion Model | Yongzhi Huang, Jinxin Zhu, Haseeb Hassan, Liyilei Su, Jingyu Li, Binding Huang | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Modified CycleGAN for the synthesization of samples for wheat head segmentation | Jaden Myers, Keyhan Najafian, Farhad Maleki, Katie Ovens | arxiv.org/pdf/2402.15… | null |

Multimodal

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-23 | RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation | Hanxiao Jiang, Binghao Huang, Ruihai Wu, Zhuoran Li, Shubham Garg, Hooshang Nayyeri, Shenlong Wang, Yunzhu Li | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Text2Pic Swift: Enhancing Long-Text to Image Retrieval for Large-Scale Libraries | Zijun Long, Xuri Ge, Richard Mccreadie, Joemon Jose | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Large Multimodal Agents: A Survey | Junlin Xie, Zhihong Chen, Ruifei Zhang, Xiang Wan, Guanbin Li | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Multimodal Transformer With a Low-Computational-Cost Guarantee | Sungjin Park, Edward Choi | arxiv.org/pdf/2402.15… | null |

Model Compression/Optimization

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-23 | Distilling Adversarial Robustness Using Heterogeneous Teachers | Jieren Deng, Aaron Palmer, Rigel Mahmood, Ethan Rathbun, Jinbo Bi, Kaleel Mahmood, Derek Aguiar | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Hierarchical Invariance for Robust and Interpretable Vision Tasks at Larger Scales | Shuren Qi, Yushu Zhang, Chao Wang, Zhihua Xia, Jian Weng, Xiaochun Cao | arxiv.org/pdf/2402.15… | link |
| 2024-02-23 | Optimized Deployment of Deep Neural Networks for Visual Pose Estimation on Nano-drones | Matteo Risso, Francesco Daghero, Beatrice Alessandra Motetti, Daniele Jahier Pagliari, Enrico Macii, Massimo Poncino, Alessio Burrello | arxiv.org/pdf/2402.15… | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-23 | Self-Supervised Pre-Training for Table Structure Recognition Transformer | ShengYun Peng, Seongmin Lee, Xiaojing Wang, Rajarajeswari Balasubramaniyan, Duen Horng Chau | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Closing the AI generalization gap by adjusting for dermatology condition distribution differences across clinical settings | Rajeev V. Rikhye, Aaron Loh, Grace Eunhae Hong, Preeti Singh, Margaret Ann Smith, Vijaytha Muralidharan, Doris Wong, Rory Sayres, Michelle Phung, Nicolas Betancourt, et al. | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Deep Networks Always Grok and Here is Why | Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts | Yuejiang Liu, Alexandre Alahi | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Retinotopic Mapping Enhances the Robustness of Convolutional Neural Networks | Jean-Nicolas Jérémie, Emmanuel Daucé, Laurent U Perrinet | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Benchmarking the Robustness of Panoptic Segmentation for Automated Driving | Yiting Wang, Haonan Zhao, Daniel Gummadi, Mehrdad Dianati, Kurt Debattista, Valentina Donzella | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Outlier detection by ensembling uncertainty with negative objectness | Anja Delić, Matej Grcić, Siniša Šegvić | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks | Zekang Yang, Wang Zeng, Sheng Jin, Chen Qian, Ping Luo, Wentao Liu | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Low-Rank Representations Meets Deep Unfolding: A Generalized and Interpretable Network for Hyperspectral Anomaly Detection | Chenyu Li, Bing Zhang, Danfeng Hong, Jing Yao, Jocelyn Chanussot | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding | Francis Engelmann, Ayca Takmaz, Jonas Schult, Elisabetta Fedele, Johanna Wald, Songyou Peng, Xi Wang, Or Litany, Siyu Tang, Federico Tombari, et al. | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Representing Online Handwriting for Recognition in Large Vision-Language Models | Anastasiia Fadeeva, Philippe Schlattner, Andrii Maksai, Mark Collier, Efi Kokiopoulou, Jesse Berent, Claudiu Musat | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection | Zhe Wang, Siqi Fan, Xiaoliang Huo, Tongda Xu, Yan Wang, Jingjing Liu, Yilun Chen, Ya-Qin Zhang | arxiv.org/pdf/2402.15… | link |
| 2024-02-23 | GS-EMA: Integrating Gradient Surgery Exponential Moving Average with Boundary-Aware Contrastive Learning for Enhanced Domain Generalization in Aneurysm Segmentation | Fengming Lin, Yan Xia, Michael MacRaild, Yash Deo, Haoran Dou, Qiongyao Liu, Nina Cheng, Nishant Ravikumar, Alejandro F. Frangi | arxiv.org/pdf/2402.15… | link |
| 2024-02-23 | Unsupervised Domain Adaptation for Brain Vessel Segmentation through Transwarp Contrastive Learning | Fengming Lin, Yan Xia, Michael MacRaild, Yash Deo, Haoran Dou, Qiongyao Liu, Kun Wu, Nishant Ravikumar, Alejandro F. Frangi | arxiv.org/pdf/2402.15… | link |
| 2024-02-23 | Attention-Guided Masked Autoencoders For Learning Image Representations | Leon Sick, Dominik Engel, Pedro Hermosilla, Timo Ropinski | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing | Jeong Hun Yeo, Seunghee Han, Minsu Kim, Yong Man Ro | arxiv.org/pdf/2402.15… | link |
| 2024-02-23 | PUAD: Frustratingly Simple Method for Robust Anomaly Detection | Shota Sugawara, Ryuji Imamura | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Fiducial Focus Augmentation for Facial Landmark Detection | Purbayan Kar, Vishal Chudasama, Naoyuki Onoe, Pankaj Wasnik, Vineeth Balasubramanian | arxiv.org/pdf/2402.15… | null |

OCR

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-23 | DeepSet SimCLR: Self-supervised deep sets for improved pathology representation learning | David Torpey, Richard Klein | arxiv.org/pdf/2402.15… | null |

Image Understanding

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-23 | State Space Models for Event Cameras | Nikola Zubić, Mathias Gehrig, Davide Scaramuzza | arxiv.org/pdf/2402.15… | null |

Transformer

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-23 | MambaIR: A Simple Baseline for Image Restoration with State-Space Model | Hang Guo, Jinmin Li, Tao Dai, Zhihao Ouyang, Xudong Ren, Shu-Tao Xia | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Seamless Human Motion Composition with Blended Positional Encodings | German Barquero, Sergio Escalera, Cristina Palmero | arxiv.org/pdf/2402.15… | link |
| 2024-02-23 | Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling | Hui Lin, Zhiheng Ma, Rongrong Ji, Yaowei Wang, Zhou Su, Xiaopeng Hong, Deyu Meng | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Descripción automática de secciones delgadas de rocas: una aplicación Web (Automatic description of rock thin sections: a Web application) | Stalyn Paucar, Christian Mejía-Escobar, Víctor Collaguazo | arxiv.org/pdf/2402.15… | null |

3D/CG

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-23 | Cohere3D: Exploiting Temporal Coherence for Unsupervised Representation Learning of Vision-based Autonomous Driving | Yichen Xie, Hongge Chen, Gregory P. Meyer, Yong Jae Lee, Eric M. Wolff, Masayoshi Tomizuka, Wei Zhan, Yuning Chai, Xin Huang | arxiv.org/pdf/2402.15… | null |

Learning Paradigms

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-23 | CI w/o TN: Context Injection without Task Name for Procedure Planning | Xinjie Li | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Does Combining Parameter-efficient Modules Improve Few-shot Transfer Accuracy? | Nader Asadi, Mahdi Beitollahi, Yasser Khalil, Yinchuan Li, Guojun Zhang, Xi Chen | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Genie: Generative Interactive Environments | Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps, et al. | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Source-Guided Similarity Preservation for Online Person Re-Identification | Hamza Rami, Jhony H. Giraldo, Nicolas Winckler, Stéphane Lathuilière | arxiv.org/pdf/2402.15… | link |

Other

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-23 | Low-Frequency Black-Box Backdoor Attack via Evolutionary Algorithm | Yanqi Qiao, Dazhuang Liu, Rui Wang, Kaitai Liang | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Bagged Deep Image Prior for Recovering Images in the Presence of Speckle Noise | Xi Chen, Zhewen Hou, Christopher A. Metzler, Arian Maleki, Shirin Jalali | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Improving Explainable Object-induced Model through Uncertainty for Automated Vehicles | Shihong Ling, Yue Wan, Xiaowei Jia, Na Du | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | CLIPPER+: A Fast Maximal Clique Algorithm for Robust Global Registration | Kaveh Fathian, Tyler Summers | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Computer Vision for Multimedia Geolocation in Human Trafficking Investigation: A Systematic Literature Review | Opeyemi Bamigbade, John Sheppard, Mark Scanlon | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Optimal Transport on the Lie Group of Roto-translations | Daan Bon, Gautam Pai, Gijs Bellaard, Olga Mula, Remco Duits | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Seeing is Believing: Mitigating Hallucination in Large Vision-Language Models via CLIP-Guided Decoding | Ailin Deng, Zhirui Chen, Bryan Hooi | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Font Impression Estimation in the Wild | Kazuki Kitajima, Daichi Haraguchi, Seiichi Uchida | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Which Model to Transfer? A Survey on Transferability Estimation | Yuhe Ding, Bo Jiang, Aijing Yu, Aihua Zheng, Jian Liang | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | BSPA: Exploring Black-box Stealthy Prompt Attacks against Image Generators | Yu Tian, Xiao Yang, Yinpeng Dong, Heming Yang, Hang Su, Jun Zhu | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Convergence Analysis of Blurring Mean Shift | Ryoya Yamasaki, Toshiyuki Tanaka | arxiv.org/pdf/2402.15… | null |
| 2024-02-23 | Fine-tuning CLIP Text Encoders with Two-step Paraphrasing | Hyunjae Kim, Seunghyun Yoon, Trung Bui, Handong Zhao, Quan Tran, Franck Dernoncourt, Jaewoo Kang | arxiv.org/pdf/2402.15… | null |