[Share][Daily Update][2024.03.15][CV_arxiv_papers]

[UPDATED!] 2024-03-15 (Publish Time)

Generative Models

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives | Ronghui Li, YuXiang Zhang, Yachao Zhang, Hongwen Zhang, Jie Guo, Yan Zhang, Yebin Liu, Xiu Li | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding | Pengkun Liu, Yikai Wang, Fuchun Sun, Jiafang Li, Hang Xiao, Hongxiang Xue, Xinzhou Wang | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | Denoising Task Difficulty-based Curriculum for Training Diffusion Models | Jin-Young Kim, Hyojun Go, Soonwoo Kwon, Hyun-Gyoon Kim | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Towards Generalizable Deepfake Video Detection with Thumbnail Layout and Graph Reasoning | Yuting Xu, Jian Liang, Lijun Sheng, Xiao-Yu Zhang | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder | Jinseok Kim, Tae-Kyun Kim | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model | Qijun Feng, Zhen Xing, Zuxuan Wu, Yu-Gang Jiang | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | BlindDiff: Empowering Degradation Modelling in Diffusion Models for Blind Image Super-Resolution | Feng Li, Yixuan Wu, Zichao Liang, Runmin Cong, Huihui Bai, Yao Zhao, Meng Wang | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | Animate Your Motion: Turning Still Images into Dynamic Videos | Mingxiao Li, Bo Wan, Marie-Francine Moens, Tinne Tuytelaars | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | SemanticHuman-HD: High-Resolution Semantic Disentangled 3D Human Generation | Peng Zheng, Tao Liu, Zili Yi, Rui Ma | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | DiffMAC: Diffusion Manifold Hallucination Correction for High Generalization Blind Face Restoration | Nan Gao, Jia Li, Huaibo Huang, Zhi Zeng, Ke Shang, Shuwu Zhang, Ran He | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | RangeLDM: Fast Realistic LiDAR Point Cloud Generation | Qianjiang Hu, Zhimin Zhang, Wei Hu | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | A survey of synthetic data augmentation methods in computer vision | Alhassan Mumuni, Fuseini Mumuni, Nana Kobina Gerrar | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | RID-TWIN: An end-to-end pipeline for automatic face de-identification in videos | Anirban Mukherjee, Monjoy Narayan Choudhury, Dinesh Babu Jayagopi | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model | Tao Wu, Xuewei Li, Zhongang Qi, Di Hu, Xintao Wang, Ying Shan, Xi Li | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Real-World Computational Aberration Correction via Quantized Domain-Mixing Representation | Qi Jiang, Zhonghua Yi, Shaohua Gao, Yao Gao, Xiaolong Qian, Hao Shi, Lei Sun, Zhijie Xu, Kailun Yang, Kaiwei Wang | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | ST-LDM: A Universal Framework for Text-Grounded Object Generation in Real Images | Xiangtian Xue, Jiasong Wu, Youyong Kong, Lotfi Senhadji, Huazhong Shu | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting | Zhiqi Li, Yiming Chen, Lingzhe Zhao, Peidong Liu | arxiv.org/pdf/2403.09… | null |

Multimodal

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | VideoAgent: Long-form Video Understanding with Large Language Model as Agent | Xiaohan Wang, Yuhui Zhang, Orr Zohar, Serena Yeung-Levy | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study | Chenguang Wang, Ruoxi Jia, Xin Liu, Dawn Song | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | Mitigating Dialogue Hallucination for Large Multi-modal Models via Adversarial Instruction Tuning | Dongmin Park, Zhaofang Qian, Guangxing Han, Ser-Nam Lim | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Joint Multimodal Transformer for Dimensional Emotional Recognition in the Wild | Paul Waligora, Osama Zeeshan, Haseeb Aslam, Soufiane Belharbi, Alessandro Lameiras Koerich, Marco Pedersoli, Simon Bacon, Eric Granger | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models | Rocktim Jyoti Das, Simeon Emilov Hristov, Haonan Li, Dimitar Iliyanov Dimitrov, Ivan Koychev, Preslav Nakov | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | ANIM: Accurate Neural Implicit Model for Human Reconstruction from a single RGB-D image | Marco Pesavento, Yuanlu Xu, Nikolaos Sarafianos, Robert Maier, Ziyan Wang, Chun-Han Yao, Marco Volino, Edmond Boyer, Adrian Hilton, Tony Tung | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Uni-SMART: Universal Science Multimodal Analysis and Research Transformer | Hengxing Cai, Xiaochen Cai, Shuwen Yang, Jiankun Wang, Lin Yao, Zhifeng Gao, Junhan Chang, Sihang Li, Mingjun Xu, Changxin Wang, et al. | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models | Tian Meng, Yang Tao, Ruilin Lyu, Wuliang Yin | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification | Pingping Zhang, Yuhao Wang, Yang Liu, Zhengzheng Tu, Huchuan Lu | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | HawkEye: Training Video-Text LLMs for Grounding Text in Videos | Yueqian Wang, Xiaojun Meng, Jianxin Liang, Yuxuan Wang, Qun Liu, Dongyan Zhao | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Improving Medical Multi-modal Contrastive Learning with Expert Annotations | Yogesh Kumar, Pekka Marttinen | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | CSDNet: Detect Salient Object in Depth-Thermal via A Lightweight Cross Shallow and Deep Perception Network | Xiaotong Yu, Ruihan Xie, Zhihe Zhao, Chang-Wen Chen | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Histo-Genomic Knowledge Distillation For Cancer Prognosis From Histopathology Whole Slide Images | Zhikang Wang, Yumeng Zhang, Yingxue Xu, Seiya Imoto, Hao Chen, Jiangning Song | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | Knowledge Condensation and Reasoning for Knowledge-based VQA | Dongze Hao, Jian Jia, Longteng Guo, Qunbo Wang, Te Yang, Yan Li, Yanhua Cheng, Bo Wang, Quan Chen, Han Li, et al. | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | SparseFusion: Efficient Sparse Multi-Modal Fusion Framework for Long-Range 3D Perception | Yiheng Li, Hongyang Li, Zehao Huang, Hong Chang, Naiyan Wang | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Visual Foundation Models Boost Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation | Jingyi Xu, Weidong Yang, Lingdong Kong, Youquan Liu, Rui Zhang, Qingyuan Zhou, Ben Fei | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | GET: Unlocking the Multi-modal Potential of CLIP for Generalized Category Discovery | Enguang Wang, Zhimao Peng, Zhengyuan Xie, Xialei Liu, Ming-Ming Cheng | arxiv.org/pdf/2403.09… | link |

NeRF

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | FeatUp: A Model-Agnostic Framework for Features at Any Resolution | Stephanie Fu, Mark Hamilton, Laura Brandt, Axel Feldman, Zhoutong Zhang, William T. Freeman | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Thermal-NeRF: Neural Radiance Fields from an Infrared Camera | Tianxiang Ye, Qi Wu, Junyuan Deng, Guoqing Liu, Liu Liu, Songpengcheng Xia, Liang Pang, Wenxian Yu, Ling Pei | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Leveraging Neural Radiance Field in Descriptor Synthesis for Keypoints Scene Coordinate Regression | Huy-Hoang Bui, Bach-Thuan Bui, Dinh-Tuan Tran, Joo-Ho Lee | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | GGRt: Towards Generalizable 3D Gaussians without Pose Priors in Real-Time | Hao Li, Yuanyuan Gao, Dingwen Zhang, Chenming Wu, Yalun Dai, Chen Zhao, Haocheng Feng, Errui Ding, Jingdong Wang, Junwei Han | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | URS-NeRF: Unordered Rolling Shutter Bundle Adjustment for Neural Radiance Fields | Bo Xu, Ziao Liu, Mengqi Guo, Jiancheng Li, Gim Hee Li | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | DyBluRF: Dynamic Neural Radiance Fields from Blurry Monocular Video | Huiqiang Sun, Xingyi Li, Liao Shen, Xinyi Ye, Ke Xian, Zhiguo Cao | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Den-SOFT: Dense Space-Oriented Light Field DataseT for 6-DOF Immersive Experience | Xiaohang Yu, Zhengxian Yang, Shi Pan, Yuqi Han, Haoxiang Wang, Jun Zhang, Shi Yan, Borong Lin, Lei Yang, Tao Yu, et al. | arxiv.org/pdf/2403.09… | null |

3DGS

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | SWAG: Splatting in the Wild images with Appearance-conditioned Gaussians | Hiba Dahmani, Moussab Bennehar, Nathan Piasco, Luis Roldao, Dzmitry Tsishkou | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Texture-GS: Disentangling the Geometry and Texture for 3D Gaussian Splatting Editing | Tian-Xing Xu, Wenbo Hu, Yu-Kun Lai, Ying Shan, Song-Hai Zhang | arxiv.org/pdf/2403.10… | null |

Model Compression/Optimization

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | Group-Mix SAM: Lightweight Solution for Industrial Assembly Line Applications | Wu Liang, X. -G. Ma | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Towards Adversarially Robust Dataset Distillation by Curvature Regularization | Eric Xue, Yijiang Li, Haoyang Liu, Yifan Shen, Haohan Wang | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers | Sanghyeok Lee, Joonmyung Choi, Hyunwoo J. Kim | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Quantization Effects on Neural Networks Perception: How would quantization change the perceptual field of vision models? | Mohamed Amine Kerkouri, Marouane Tliba, Aladine Chetouani, Alessandro Bruno | arxiv.org/pdf/2403.09… | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | Frozen Feature Augmentation for Few-Shot Image Classification | Andreas Bär, Neil Houlsby, Mostafa Dehghani, Manoj Kumar | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices | Zhiyong Zhang, Huaizu Jiang, Hanumant Singh | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | Real-Time Image Segmentation via Hybrid Convolutional-Transformer Architecture Search | Hongyuan Yu, Cheng Wan, Mengchen Liu, Dongdong Chen, Bin Xiao, Xiyang Dai | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | A comparative study on machine learning approaches for rock mass classification using drilling data | Tom F. Hansen, Georg H. Erharter, Zhongqiang Liu, Jim Torresen | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Energy Correction Model in the Feature Space for Out-of-Distribution Detection | Marc Lafon, Clément Rambour, Nicolas Thome | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Open Stamped Parts Dataset | Sara Antiles, Sachin S. Talathi | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | SimPB: A Single Model for 2D and 3D Object Detection from Multiple Cameras | Yingqi Tang, Zhaotie Meng, Guoliang Chen, Erkang Cheng | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Deep Learning for Multi-Level Detection and Localization of Myocardial Scars Based on Regional Strain Validated on Virtual Patients | Müjde Akdeniz, Claudia Alessandra Manetti, Tijmen Koopsen, Hani Nozari Mirar, Sten Roar Snare, Svein Arne Aase, Joost Lumens, Jurica Šprem, Kristin Sarah McLeod | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Local positional graphs and attentive local features for a data and runtime-efficient hierarchical place recognition pipeline | Fangming Yuan, Stefan Schubert, Peter Protzel, Peer Neubert | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning | Meixuan Li, Tianyu Li, Guoqing Wang, Peng Wang, Yang Yang, Heng Tao Shen | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | CoLeCLIP: Open-Domain Continual Learning via Joint Task Prompt and Vocabulary Learning | Yukun Li, Guansong Pang, Wei Suo, Chenchen Jing, Yuling Xi, Lingqiao Liu, Hao Chen, Guoqiang Liang, Peng Wang | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Exploring Optical Flow Inclusion into nnU-Net Framework for Surgical Instrument Segmentation | Marcos Fernández-Rodríguez, Bruno Silva, Sandro Queirós, Helena R. Torres, Bruno Oliveira, Pedro Morais, Lukas R. Buschle, Jorge Correia-Pinto, Estevão Lima, João L. Vilaça | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | A Data-Driven Approach for Mitigating Dark Current Noise and Bad Pixels in Complementary Metal Oxide Semiconductor Cameras for Space-based Telescopes | Peng Jia, Chao Lv, Yushan Li, Yongyang Sun, Shu Niu, Zhuoxiao Wang | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Learning on JPEG-LDPC Compressed Images: Classifying with Syndromes | Ahcen Aliouat, Elsa Dupraz | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Generative Region-Language Pretraining for Open-Ended Object Detection | Chuang Lin, Yi Jiang, Lizhen Qu, Zehuan Yuan, Jianfei Cai | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | A Hybrid SNN-ANN Network for Event-based Object Detection with Spatial and Temporal Attention | Soikat Hasan Ahmed, Jan Finkbeiner, Emre Neftci | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Computer User Interface Understanding. A New Dataset and a Learning Framework | Andrés Muñoz, Daniel Borrajo | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Cardiac valve event timing in echocardiography using deep learning and triplane recordings | Benjamin Strandli Fermann, John Nyberg, Espen W. Remme, Jahn Frederik Grue, Helén Grue, Roger Håland, Lasse Lovstakken, Håvard Dalen, Bjørnar Grenne, Svein Arne Aase, et al. | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | RCooper: A Real-world Large-scale Dataset for Roadside Cooperative Perception | Ruiyang Hao, Siqi Fan, Yingru Dai, Zhenlin Zhang, Chenxi Li, Yuntian Wang, Haibao Yu, Wenxian Yang, Jirui Yuan, Zaiqing Nie | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | TransLandSeg: A Transfer Learning Approach for Landslide Semantic Segmentation Based on Vision Foundation Model | Changhong Hou, Junchuan Yu, Daqing Ge, Liu Yang, Laidian Xi, Yunxuan Pang, Yi Wen | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Enhancing Human-Centered Dynamic Scene Understanding via Multiple LLMs Collaborated Reasoning | Hang Zhang, Wenxiao Zhang, Haoxuan Qu, Jun Liu | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Adaptive Random Feature Regularization on Fine-tuning Deep Neural Networks | Shin'ya Yamaguchi, Sekitoshi Kanai, Kazuki Adachi, Daiki Chijiwa | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Monkeypox disease recognition model based on improved SE-InceptionV3 | Junzhuo Chen, Zonghan Lu, Shitong Kang | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner | Tingbing Yan, Wenzheng Zeng, Yang Xiao, Xingyu Tong, Bo Tan, Zhiwen Fang, Zhiguo Cao, Joey Tianyi Zhou | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Control and Automation for Industrial Production Storage Zone: Generation of Optimal Route Using Image Processing | Bejamin A. Huerfano, Fernando Jimenez | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | TextBlockV2: Towards Precise-Detection-Free Scene Text Spotting with Pre-trained Language Model | Jiahao Lyu, Jin Wei, Gangyan Zeng, Zeng Li, Enze Xie, Wei Wang, Yu Zhou | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Rethinking Low-quality Optical Flow in Unsupervised Surgical Instrument Segmentation | Peiran Wu, Yang Liu, Jiayu Huo, Gongyu Zhang, Christos Bergeles, Rachel Sparks, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | Lifelong Person Re-Identification with Backward-Compatibility | Minyoung Oh, Jae-Young Sim | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Linear optimal transport subspaces for point set classification | Mohammad Shifat E Rabbi, Naqib Sad Pathan, Shiying Li, Yan Zhuang, Abu Hasnat Mohammad Rubaiyat, Gustavo K Rohde | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Cardiac Magnetic Resonance 2D+T Short- and Long-axis Segmentation via Spatio-temporal SAM Adaptation | Zhennong Chen, Sekeun Kim, Hui Ren, Quanzheng Li, Xiang Li | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | FBPT: A Fully Binary Point Transformer | Zhixing Hou, Yuzhang Shang, Yan Yan | arxiv.org/pdf/2403.09… | null |
| 2024-03-15 | Skeleton-Based Human Action Recognition with Noisy Labels | Yi Xu, Kunyu Peng, Di Wen, Ruiping Liu, Junwei Zheng, Yufan Chen, Jiaming Zhang, Alina Roitberg, Kailun Yang, Rainer Stiefelhagen | arxiv.org/pdf/2403.09… | null |
| 2024-03-15 | ViTCN: Vision Transformer Contrastive Network For Reasoning | Bo Song, Yuanhao Xu, Yichao Wu | arxiv.org/pdf/2403.09… | null |
| 2024-03-15 | Shifting Focus: From Global Semantics to Local Prominent Features in Swin-Transformer for Knee Osteoarthritis Severity Assessment | Aymen Sekhri, Marouane Tliba, Mohamed Amine Kerkouri, Yassine Nasser, Aladine Chetouani, Alessandro Bruno, Rachid Jennane | arxiv.org/pdf/2403.09… | null |
| 2024-03-15 | Attention-Enhanced Hybrid Feature Aggregation Network for 3D Brain Tumor Segmentation | Ziya Ata Yazıcı, İlkay Öksüz, Hazım Kemal Ekenel | arxiv.org/pdf/2403.09… | link |

Image Understanding

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | Robust Shape Fitting for 3D Scene Abstraction | Florian Kluger, Eric Brachmann, Michael Ying Yang, Bodo Rosenhahn | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | NECA: Neural Customizable Human Avatar | Junjin Xiao, Qing Zhang, Zhan Xu, Wei-Shi Zheng | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | AUTONODE: A Neuro-Graphic Self-Learnable Engine for Cognitive GUI Automation | Arkajit Datta, Tushar Verma, Rajat Chawla | arxiv.org/pdf/2403.10… | null |

LLM

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | Using an LLM to Turn Sign Spottings into Spoken Language Sentences | Ozge Mercanoglu Sincan, Necati Cihan Camgoz, Richard Bowden | arxiv.org/pdf/2403.10… | null |

Transformer

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | P-MapNet: Far-seeing Map Generator Enhanced by both SDMap and HDMap Priors | Zhou Jiang, Zhenxin Zhu, Pengfei Li, Huan-ang Gao, Tianyuan Yuan, Yongliang Shi, Hang Zhao, Hao Zhao | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | A Novel Framework for Multi-Person Temporal Gaze Following and Social Gaze Prediction | Anshul Gupta, Samy Tafasca, Arya Farkhondeh, Pierre Vuillecard, Jean-Marc Odobez | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Approximate Nullspace Augmented Finetuning for Robust Vision Transformers | Haoyang Liu, Aditya Singh, Yijiang Li, Haohan Wang | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | PASTA: Towards Flexible and Efficient HDR Imaging Via Progressively Aggregated Spatio-Temporal Aligment | Xiaoning Liu, Ao Li, Zongwei Wu, Yapeng Du, Le Zhang, Yulun Zhang, Radu Timofte, Ce Zhu | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | How Powerful Potential of Attention on Image Restoration? | Cong Wang, Jinshan Pan, Yeying Jin, Liyan Wang, Wei Wang, Gang Fu, Wenqi Ren, Xiaochun Cao | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Context-Semantic Quality Awareness Network for Fine-Grained Visual Categorization | Qin Xu, Sitong Li, Jiahui Wang, Bo Jiang, Jinhui Tang | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Depth-induced Saliency Comparison Network for Diagnosis of Alzheimer's Disease via Jointly Analysis of Visual Stimuli and Eye Movements | Yu Liu, Wenlin Zhang, Shaochu Wang, Fangyu Zuo, Peiguang Jing, Yong Ji | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising | Shuai Hu, Feng Gao, Xiaowei Zhou, Junyu Dong, Qian Du | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | MEDPNet: Achieving High-Precision Adaptive Registration for Complex Die Castings | Yu Du, Yu Song, Ce Guo, Xiaojing Tian, Dong Liu, Ming Cong | arxiv.org/pdf/2403.09… | null |
| 2024-03-15 | EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba | Xiaohuan Pei, Tao Huang, Chang Xu | arxiv.org/pdf/2403.09… | null |

3D/CG

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | ParaPoint: Learning Global Free-Boundary Surface Parameterization of 3D Point Clouds | Qijian Zhang, Junhui Hou, Ying He | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | SCILLA: SurfaCe Implicit Learning for Large Urban Area, a volumetric hybrid solution | Hala Djeghim, Nathan Piasco, Moussab Bennehar, Luis Roldão, Dzmitry Tsishkou, Désiré Sidibé | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | KP-RED: Exploiting Semantic Keypoints for Joint 3D Shape Retrieval and Deformation | Ruida Zhang, Chenyangguang Zhang, Yan Di, Fabian Manhardt, Xingyu Liu, Federico Tombari, Xiangyang Ji | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | VRHCF: Cross-Source Point Cloud Registration via Voxel Representation and Hierarchical Correspondence Filtering | Guiyu Zhao, Zewen Du, Zhentao Guo, Hongbin Ma | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | Codebook Transfer with Part-of-Speech for Vector-Quantized Image Modeling | Baoquan Zhang, Huaibin Wang, Luo Chuyao, Xutao Li, Liang Guotao, Yunming Ye, Xiaochen Qi, Yao He | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | T4P: Test-Time Training of Trajectory Prediction via Masked Autoencoder and Actor-specific Token Memory | Daehee Park, Jaeseok Jeong, Sung-Hoon Yoon, Jaewoo Jeong, Kuk-Jin Yoon | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | TRG-Net: An Interpretable and Controllable Rain Generator | Zhiqiang Pang, Hong Wang, Qi Xie, Deyu Meng, Zongben Xu | arxiv.org/pdf/2403.09… | null |
| 2024-03-15 | Boundary Constraint-free Biomechanical Model-Based Surface Matching for Intraoperative Liver Deformation Correction | Zixin Yang, Richard Simon, Kelly Merrell, Cristian. A. Linte | arxiv.org/pdf/2403.09… | null |
| 2024-03-15 | RadCLIP: Enhancing Radiologic Image Analysis through Contrastive Language-Image Pre-training | Zhixiu Lu, Hailong Li, Lili He | arxiv.org/pdf/2403.09… | null |

Learning Paradigms

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | CDMAD: Class-Distribution-Mismatch-Aware Debiasing for Class-Imbalanced Semi-Supervised Learning | Hyuck Lee, Heeyoung Kim | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | E4C: Enhance Editability for Text-Based Image Editing by Harnessing Efficient CLIP Guidance | Tianrui Huang, Pu Cao, Lu Yang, Chun Liu, Mengjie Hu, Zhiwei Liu, Qing Song | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Learning Physical Dynamics for Object-centric Visual Prediction | Huilin Xu, Tao Chen, Feng Xu | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | What Makes Good Collaborative Views? Contrastive Mutual Information Maximization for Multi-Agent Perception | Wanfang Su, Lixing Chen, Yang Bai, Xi Lin, Gaolei Li, Zhe Qu, Pan Zhou | arxiv.org/pdf/2403.10… | link |

Other

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-03-15 | Strong and Controllable Blind Image Decomposition | Zeyu Zhang, Junlin Han, Chenhui Gou, Hongdong Li, Liang Zheng | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | Understanding the Double Descent Phenomenon in Deep Learning | Marc Lafon, Alexandre Thomas | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Evaluating Perceptual Distances by Fitting Binomial Distributions to Two-Alternative Forced Choice Data | Alexander Hepburn, Raul Santos-Rodriguez, Javier Portilla | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Overcoming Distribution Shifts in Plug-and-Play Methods with Test-Time Training | Edward P. Chandler, Shirin Shoushtari, Jiaming Liu, M. Salman Asif, Ulugbek S. Kamilov | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Testing MediaPipe Holistic for Linguistic Analysis of Nonmanual Markers in Sign Languages | Anna Kuznetsova, Vadim Kimmelman | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | CPGA: Coding Priors-Guided Aggregation Network for Compressed Video Quality Enhancement | Qiang Zhu, Jinhua Hao, Yukang Ding, Yu Liu, Qiao Mo, Ming Sun, Chao Zhou, Shuyuan Zhu | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | End-to-end Adaptive Dynamic Subsampling and Reconstruction for Cardiac MRI | George Yiasemis, Jan-Jakob Sonke, Jonas Teuwen | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | A Fixed-Point Approach to Unified Prompt-Based Counting | Wei Lin, Antoni B. Chan | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Perceptual Quality-based Model Training under Annotator Label Uncertainty | Chen Zhou, Mohit Prabhushankar, Ghassan AlRegib | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | CoReEcho: Continuous Representation Learning for 2D+time Echocardiography Analysis | Fadillah Adamsyah Maani, Numan Saeed, Aleksandr Matsun, Mohammad Yaqub | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | PQDynamicISP: Dynamically Controlled Image Signal Processor for Any Image Sensors Pursuing Perceptual Quality | Masakazu Yoshimura, Junji Otsuka, Takeshi Ohashi | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Approximation and bounding techniques for the Fisher-Rao distances | Frank Nielsen | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-adaptive Attacks | Chong Wang, Yi Yu, Lanqing Guo, Bihan Wen | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Revisiting Adversarial Training under Long-Tailed Distributions | Xinli Yue, Ningping Mou, Qian Wang, Lingchen Zhao | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | Boundary Matters: A Bi-Level Active Finetuning Framework | Han Lu, Yichen Xie, Xiaokang Yang, Junchi Yan | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment | Ziyu Shan, Yujie Zhang, Qi Yang, Haichen Yang, Yiling Xu, Jenq-Neng Hwang, Xiaozhong Xu, Shan Liu | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | Progressive Divide-and-Conquer via Subsampling Decomposition for Accelerated MRI | Chong Wang, Lanqing Guo, Yufei Wang, Hao Cheng, Yi Yu, Bihan Wen | arxiv.org/pdf/2403.10… | link |
| 2024-03-15 | PAME: Self-Supervised Masked Autoencoder for No-Reference Point Cloud Quality Assessment | Ziyu Shan, Yujie Zhang, Qi Yang, Haichen Yang, Yiling Xu, Shan Liu | arxiv.org/pdf/2403.10… | null |
| 2024-03-15 | AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors | Yucen Wang, Shenghua Wan, Le Gan, Shuai Feng, De-Chuan Zhan | arxiv.org/pdf/2403.09… | null |