[Share][Daily Update][2024.02.06][CV_arxiv_papers]


[UPDATED!] 2024-02-06 (Publish Time)

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-06 | EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters | EVA-CLIP-18B:将 CLIP 扩展到 180 亿个参数 | Quan Sun, Jinsheng Wang, Qiying Yu, Yufeng Cui, Fan Zhang, Xiaosong Zhang, Xinlong Wang | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | SHIELD : An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models | SHIELD:使用多模态大语言模型进行人脸欺骗和伪造检测的评估基准 | Yichen Shi, Yuhao Gao, Yingxin Lai, Hongyang Wang, Jun Feng, Lei He, Jun Wan, Changsheng Chen, Zitong Yu, Xiaochun Cao | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation | 无需培训、基于 CLIP 的适应的难以超越的基线 | Zhengbo Wang, Jian Liang, Lijun Sheng, Ran He, Zilei Wang, Tieniu Tan | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | Multi-class Road Defect Detection and Segmentation using Spatial and Channel-wise Attention for Autonomous Road Repairing | 使用空间和通道注意进行多类道路缺陷检测和分割以进行自主道路修复 | Jongmin Yu, Chen Bene Chi, Sebastiano Fichera, Paolo Paoletti, Devansh Mehta, Shan Luo | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | Connecting the Dots: Collaborative Fine-tuning for Black-Box Vision-Language Models | 连接点:黑盒视觉语言模型的协作微调 | Zhengbo Wang, Jian Liang, Ran He, Zilei Wang, Tieniu Tan | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | Polyp-DDPM: Diffusion-Based Semantic Polyp Synthesis for Enhanced Segmentation | Polyp-DDPM:基于扩散的语义息肉合成以增强分割 | Zolnamar Dorjsembe, Hsing-Kuo Pao, Furen Xiao | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | YOLOPoint Joint Keypoint and Object Detection | YOLOPoint 联合关键点和物体检测 | Anton Backhaus, Thorsten Luettel, Hans-Joachim Wuensche | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Humans Beat Deep Networks at Recognizing Objects in Unusual Poses, Given Enough Time | 只要有足够的时间,人类就能在识别异常姿势的物体方面击败深度网络 | Netta Ollikka, Amro Abbas, Andrea Perin, Markku Kilpeläinen, Stéphane Deny | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping | 通过变形约束翘曲提高跨模型属的对抗性可迁移性 | Qinliang Lin, Cheng Luo, Zenghao Niu, Xilin He, Weicheng Xie, Yuanbo Hou, Linlin Shen, Siyang Song | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | A new method for optical steel rope non-destructive damage detection | 一种光学钢丝绳无损损伤检测新方法 | Yunqing Bao, Bin Hu | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | An SVD-free Approach to Nonlinear Dictionary Learning based on RVFL | 基于RVFL的无SVD非线性字典学习方法 | G. Madhuri, Atul Negi | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Face Detection: Present State and Research Directions | 人脸检测:现状和研究方向 | Purnendu Prabhat, Himanshu Gupta, Ajeet Kumar Vishwakarma | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Energy-based Domain-Adaptive Segmentation with Depth Guidance | 具有深度引导的基于能量的域自适应分割 | Jinjing Zhu, Zhedong Hu, Tae-Kyun Kim, Lin Wang | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Exploring Low-Resource Medical Image Classification with Weakly Supervised Prompt Learning | 通过弱监督即时学习探索低资源医学图像分类 | Fudan Zheng, Jindong Cao, Weijiang Yu, Zhiguang Chen, Nong Xiao, Yutong Lu | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | AttackNet: Enhancing Biometric Security via Tailored Convolutional Neural Network Architectures for Liveness Detection | AttackNet:通过定制的活体检测卷积神经网络架构增强生物识别安全性 | Oleksandr Kuznetsov, Dmytro Zakharov, Emanuele Frontoni, Andrea Maranesi | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | MoD-SLAM: Monocular Dense Mapping for Unbounded 3D Scene Reconstruction | MoD-SLAM:用于无界 3D 场景重建的单目密集建图 | Heng Zhou, Zhetao Guo, Shuhong Liu, Lechen Zhang, Qihao Wang, Yuxiang Ren, Mingrui Li | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Virtual Classification: Modulating Domain-Specific Knowledge for Multidomain Crowd Counting | 虚拟分类:调整多域人群计数的特定领域知识 | Mingyue Guo, Binghui Chen, Zhaoyi Yan, Yaowei Wang, Qixiang Ye | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | SISP: A Benchmark Dataset for Fine-grained Ship Instance Segmentation in Panchromatic Satellite Images | SISP:全色卫星图像中细粒度船舶实例分割的基准数据集 | Pengming Feng, Mingjie Xie, Hongning Liu, Xuanjia Zhao, Guangjun He, Xueliang Zhang, Jian Guan | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | MMAUD: A Comprehensive Multi-Modal Anti-UAV Dataset for Modern Miniature Drone Threats | MMAUD:针对现代微型无人机威胁的综合多模式反无人机数据集 | Shenghai Yuan, Yizhuo Yang, Thien Hoang Nguyen, Thien-Minh Nguyen, Jianfei Yang, Fen Liu, Jianping Li, Han Wang, Lihua Xie | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | SHMC-Net: A Mask-guided Feature Fusion Network for Sperm Head Morphology Classification | SHMC-Net:用于精子头部形态分类的掩模引导特征融合网络 | Nishchal Sapkota, Yejia Zhang, Sirui Li, Peixian Liang, Zhuo Zhao, Danny Z Chen | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | ConUNETR: A Conditional Transformer Network for 3D Micro-CT Embryonic Cartilage Segmentation | ConUNETR:用于 3D Micro-CT 胚胎软骨分割的条件变压器网络 | Nishchal Sapkota, Yejia Zhang, Susan M. Motch Perrine, Yuhan Hsi, Sirui Li, Meng Wu, Greg Holmes, Abdul R. Abdulai, Ethylin W. Jabs, Joan T. Richtsmeier, et al. | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | BEAM: Beta Distribution Ray Denoising for Multi-view 3D Object Detection | BEAM:用于多视图 3D 物体检测的 Beta 分布射线去噪 | Feng Liu, Tengteng Huang, Qianjing Zhang, Haotian Yao, Chi Zhang, Fang Wan, Qixiang Ye, Yanzhao Zhou | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | CAT-SAM: Conditional Tuning Network for Few-Shot Adaptation of Segmentation Anything Model | CAT-SAM:用于分段任意模型的少样本自适应的条件调整网络 | Aoran Xiao, Weihao Xuan, Heli Qi, Yun Xing, Ruijie Ren, Xiaoqin Zhang, Shijian Lu | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Improving Contextual Congruence Across Modalities for Effective Multimodal Marketing using Knowledge-infused Learning | 利用注入知识的学习提高跨模式的上下文一致性,以实现有效的多模式营销 | Trilok Padhi, Ugur Kursuncu, Yaman Kumar, Valerie L. Shalin, Lane Peterson Fronczek | arxiv.org/pdf/2402.03… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-06 | VRMM: A Volumetric Relightable Morphable Head Model | VRMM:体积可重复照明可变形头部模型 | Haotian Yang, Mingwu Zheng, Chongyang Ma, Yu-Kun Lai, Pengfei Wan, Haibin Huang | arxiv.org/pdf/2402.04… | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-06 | HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal | HarmBench:自动化红队和强力拒绝的标准化评估框架 | Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, et al. | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs | 本能偏见:虚假图像导致 MLLM 产生幻觉 | Tianyang Han, Qing Lian, Rui Pan, Renjie Pi, Jipeng Zhang, Shizhe Diao, Yong Lin, Tong Zhang | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback | 使用 AI 反馈的强化学习来调整视频的大型多模态模型 | Daechul Ahn, Yura Choi, Youngjae Yu, Dongyeop Kang, Jonghyun Choi | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Automatic Robotic Development through Collaborative Framework by Large Language Models | 通过大型语言模型的协作框架进行自动机器人开发 | Zhirong Luan, Yujun Lai | arxiv.org/pdf/2402.03… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-06 | U-shaped Vision Mamba for Single Image Dehazing | 用于单图像去雾的 U 形 Vision Mamba | Zhuoran Zheng, Chen Wu | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | Low-rank Attention Side-Tuning for Parameter-Efficient Fine-Tuning | 用于参数高效微调的低阶注意力侧调 | Ningyuan Tang, Minghao Fu, Ke Zhu, Jianxin Wu | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting | 基于扩散的运动行为预测的可控多样化采样 | Yiming Xu, Hao Cheng, Monika Sester | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Intensive Vision-guided Network for Radiology Report Generation | 用于生成放射学报告的强化视觉引导网络 | Fudan Zheng, Mengfei Li, Ying Wang, Weijiang Yu, Ruixuan Wang, Zhiguang Chen, Nong Xiao, Yutong Lu | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Pre-training of Lightweight Vision Transformers on Small Datasets with Minimally Scaled Images | 在具有最小缩放图像的小数据集上预训练轻量级视觉变压器 | Jen Hong Tan | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Attention-based Shape and Gait Representations Learning for Video-based Cloth-Changing Person Re-Identification | 基于注意力的形状和步态表示学习,用于基于视频的换衣人员重新识别 | Vuong D. Nguyen, Samiha Mirza, Pranav Mantini, Shishir K. Shah | arxiv.org/pdf/2402.03… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-06 | Instance by Instance: An Iterative Framework for Multi-instance 3D Registration | 逐个实例:多实例 3D 配准的迭代框架 | Xinyue Cao, Xiyu Zhang, Yuxin Cheng, Zhaoshuai Qi, Yanning Zhang, Jiaqi Yang | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | 3D Volumetric Super-Resolution in Radiology Using 3D RRDB-GAN | 使用 3D RRDB-GAN 实现放射学中的 3D 体积超分辨率 | Juhyung Ha, Nian Wang, Surendra Maharjan, Xuhong Zhang | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | EscherNet: A Generative Model for Scalable View Synthesis | EscherNet:可扩展视图合成的生成模型 | Xin Kong, Shikun Liu, Xiaoyang Lyu, Marwan Taher, Xiaojuan Qi, Andrew J. Davison | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Belief Scene Graphs: Expanding Partial Scenes with Objects through Computation of Expectation | 信念场景图:通过期望计算用对象扩展部分场景 | Mario A. V. Saucedo, Akash Patel, Akshit Saradagi, Christoforos Kanellakis, George Nikolakopoulos | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving | OASim:基于神经渲染的自动驾驶开放自适应模拟器 | Guohang Yan, Jiahao Pi, Jianfei Guo, Zhaotong Luo, Min Dou, Nianchen Deng, Qiusheng Huang, Daocheng Fu, Licheng Wen, Pinlong Cai, et al. | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos | Rig3DGS:从休闲单目视频创建可控肖像 | Alfredo Rivero, ShahRukh Athar, Zhixin Shu, Dimitris Samaras | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | 3Doodle: Compact Abstraction of Objects with 3D Strokes | 3Doodle:使用 3D 笔画对对象进行紧凑抽象 | Changwoon Choi, Jaeah Lee, Jaesik Park, Young Min Kim | arxiv.org/pdf/2402.03… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-06 | OVOR: OnePrompt with Virtual Outlier Regularization for Rehearsal-Free Class-Incremental Learning | OVOR:OnePrompt 具有虚拟离群值正则化功能,可实现免排练的课堂增量学习 | Wei-Cheng Huang, Chun-Fu Chen, Hsiang Hsu | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | Elastic Feature Consolidation for Cold Start Exemplar-free Incremental Learning | 用于冷启动无范例增量学习的弹性特征整合 | Simone Magistri, Tomaso Trinci, Albin Soutif-Cormerais, Joost van de Weijer, Andrew D. Bagdanov | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Deep MSFOP: Multiple Spectral filter Operators Preservation in Deep Functional Maps for Unsupervised Shape Matching | Deep MSFOP:在深度函数图中保留多个光谱滤波器算子以实现无监督形状匹配 | Feifan Luo, Qingsong Li, Ling Hu, Xinru Liu, Haojun Xu, Haibo Wang, Ting Li, Shengjun Liu | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Convincing Rationales for Visual Question Answering Reasoning | 视觉问答推理的令人信服的理由 | Kun Li, George Vosselman, Michael Ying Yang | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models | 视觉超对齐:视觉基础模型的弱到强泛化 | Jianyuan Guo, Hanting Chen, Chengcheng Wang, Kai Han, Chang Xu, Yunhe Wang | arxiv.org/pdf/2402.03… | null |

Others

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-06 | CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations | CogCoM:训练大型视觉语言模型,通过操作链深入细节 | Ji Qi, Ming Ding, Weihan Wang, Yushi Bai, Qingsong Lv, Wenyi Hong, Bin Xu, Lei Hou, Juanzi Li, Yuxiao Dong, et al. | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | Informed Reinforcement Learning for Situation-Aware Traffic Rule Exceptions | 针对情境感知交通规则异常的知情强化学习 | Daniel Bogdoll, Jing Qin, Moritz Nekolla, Ahmed Abouelazm, Tim Joseph, J. Marius Zöllner | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | Analysis of Deep Image Prior and Exploiting Self-Guidance for Image Reconstruction | 深度图像先验分析和利用自引导进行图像重建 | Shijun Liang, Evan Bell, Qing Qu, Rongrong Wang, Saiprasad Ravishankar | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | Privacy Leakage on DNNs: A Survey of Model Inversion Attacks and Defenses | DNN 上的隐私泄露:模型反转攻击和防御的调查 | Hao Fang, Yixiang Qiu, Hongyao Yu, Wenbo Yu, Jiawei Kong, Baoli Chong, Bin Chen, Xuan Wang, Shu-Tao Xia | arxiv.org/pdf/2402.04… | null |
| 2024-02-06 | MobileVLM V2: Faster and Stronger Baseline for Vision Language Model | MobileVLM V2:更快更强的视觉语言模型基线 | Xiangxiang Chu, Limeng Qiao, Xinyu Zhang, Shuang Xu, Fei Wei, Yang Yang, Xiaofei Sun, Yiming Hu, Xinyang Lin, Bo Zhang, et al. | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | AoSRNet: All-in-One Scene Recovery Networks via Multi-knowledge Integration | AoSRNet:通过多知识集成的多合一场景恢复网络 | Yuxu Lu, Dong Yang, Yuan Gao, Ryan Wen Liu, Jun Liu, Yu Guo | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | FoolSDEdit: Deceptively Steering Your Edits Towards Targeted Attribute-aware Distribution | FoolSDEdit:欺骗性地将您的编辑引向有针对性的属性感知分发 | Qi Zhou, Dongxia Wang, Tianlin Li, Zhihong Xu, Yang Liu, Kui Ren, Wenhai Wang, Qing Guo | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning | QuEST:通过高效选择性微调进行低位扩散模型量化 | Haoxuan Wang, Yuzhang Shang, Zhihang Yuan, Junyi Wu, Yan Yan | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | Reviewing FID and SID Metrics on Generative Adversarial Networks | 审查生成对抗网络的 FID 和 SID 指标 | Ricardo de Deijn, Aishwarya Batra, Brandon Koch, Naseef Mansoor, Hema Makkena | arxiv.org/pdf/2402.03… | null |
| 2024-02-06 | GRASP: GRAph-Structured Pyramidal Whole Slide Image Representation | GRASP:图结构金字塔整体幻灯片图像表示 | Ali Khajegili Mirabadi, Graham Archibald, Amirali Darbandsari, Alberto Contreras-Sanz, Ramin Ebrahim Nakhli, Maryam Asadi, Allen Zhang, C. Blake Gilks, Peter Black, Gang Wang, et al. | arxiv.org/pdf/2402.03… | null |