[Share][Daily Update][2024.03.11][CV_arxiv_papers]


[UPDATED!] 2024-03-11 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion | BrushNet:一种具有分解双分支扩散的即插即用图像修复模型 | Xuan Ju, Xian Liu, Xintao Wang, Yuxuan Bian, Ying Shan, Qiang Xu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Bayesian Diffusion Models for 3D Shape Reconstruction | 用于 3D 形状重建的贝叶斯扩散模型 | Haiyang Xu, Yu Lei, Zeyuan Chen, Xiang Zhang, Yue Zhao, Yilin Wang, Zhuowen Tu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data | SELMA:学习特定技能的文本到图像专家并将其与自动生成的数据合并 | Jialu Li, Jaemin Cho, Yi-Lin Sung, Jaehong Yoon, Mohit Bansal | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations | DEADiff:一种具有解缠结表示的高效风格化扩散模型 | Tianhao Qi, Shancheng Fang, Yanze Wu, Hongtao Xie, Jiawei Liu, Lang Chen, Qian He, Yongdong Zhang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | A Geospatial Approach to Predicting Desert Locust Breeding Grounds in Africa | 预测非洲沙漠蝗虫繁殖地的地理空间方法 | Ibrahim Salihu Yusuf, Mukhtar Opeyemi Yusuf, Kobby Panford-Quainoo, Arnu Pretorius | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Medical Image Synthesis via Fine-Grained Image-Text Alignment and Anatomy-Pathology Prompting | 通过细粒度图像文本对齐和解剖病理学提示进行医学图像合成 | Wenting Chen, Pengyu Wang, Hui Ren, Lichao Sun, Quanzheng Li, Yixuan Yuan, Xiang Li | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Multistep Consistency Models | 多步一致性模型 | Jonathan Heek, Emiel Hoogeboom, Tim Salimans | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Data-Independent Operator: A Training-Free Artifact Representation Extractor for Generalizable Deepfake Detection | 数据无关的算子:用于通用 Deepfake 检测的免训练伪像表示提取器 | Chuangchuang Tan, Ping Liu, RenShuai Tao, Huan Liu, Yao Zhao, Baoyuan Wu, Yunchao Wei | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Distribution-Aware Data Expansion with Diffusion Models | 使用扩散模型进行分布感知数据扩展 | Haowei Zhu, Ling Yang, Jun-Hai Yong, Wentao Zhang, Bin Wang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | V3D: Video Diffusion Models are Effective 3D Generators | V3D:视频扩散模型是有效的 3D 生成器 | Zilong Chen, Yikai Wang, Feng Wang, Zhengyi Wang, Huaping Liu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Enhancing Image Caption Generation Using Reinforcement Learning with Human Feedback | 使用强化学习和人类反馈来增强图像标题生成 | Adarsh N L, Arun P V, Aravindh N L | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Distributionally Generative Augmentation for Fair Facial Attribute Classification | 用于公平面部属性分类的分布式生成增强 | Fengda Zhang, Qianpei He, Kun Kuang, Jiashuo Liu, Long Chen, Chao Wu, Jun Xiao, Hanwang Zhang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition | 基于 Transformer 的 2D 姿态和时空嵌入融合,用于分心驾驶员动作识别 | Erkut Akdag, Zeqi Zhu, Egor Bondarev, Peter H. N. De With | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | ReStainGAN: Leveraging IHC to IF Stain Domain Translation for in-silico Data Generation | ReStainGAN:利用 IHC 到 IF 染色域转换进行计算机数据生成 | Dominik Winter, Nicolas Triltsch, Philipp Plewa, Marco Rosati, Thomas Padel, Ross Hill, Markus Schick, Nicolas Brieu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Active Generation for Image Classification | 图像分类的主动生成 | Tao Huang, Jiaqi Liu, Shan You, Chang Xu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Advancing Text-Driven Chest X-Ray Generation with Policy-Based Reinforcement Learning | 通过基于策略的强化学习推进文本驱动的胸部 X 射线生成 | Woojung Han, Chanyoung Kim, Dayun Ju, Yumin Shim, Seong Jae Hwang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Incorporating Improved Sinusoidal Threshold-based Semi-supervised Method and Diffusion Models for Osteoporosis Diagnosis | 结合改进的基于正弦阈值的半监督方法和扩散模型进行骨质疏松症诊断 | Wenchi Ke | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | 3D-aware Image Generation and Editing with Multi-modal Conditions | 多模态条件下的 3D 感知图像生成和编辑 | Bo Li, Yi-ke Li, Zhi-fen He, Bin Liu, Yun-Kun Lai | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | From Pixel to Cancer: Cellular Automata in Computed Tomography | 从像素到癌症:计算机断层扫描中的细胞自动机 | Yuxiang Lai, Xiaoxi Chen, Angtian Wang, Alan Yuille, Zongwei Zhou | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation | Text2QR:协调美学定制和扫描鲁棒性以生成文本引导的 QR 码 | Guangyang Wu, Xiaohong Liu, Jun Jia, Xuehao Cui, Guangtao Zhai | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | DivCon: Divide and Conquer for Progressive Text-to-Image Generation | DivCon:分而治之,逐步生成文本到图像 | Yuhao Jia, Wenhan Tan | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | A Segmentation Foundation Model for Diverse-type Tumors | 多种肿瘤的分割基础模型 | Jianhao Xie, Ziang Zhang, Guibo Luo, Yuesheng Zhu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | FSViewFusion: Few-Shots View Generation of Novel Objects | FSViewFusion:新对象的少量视图生成 | Rukhshanda Hussain, Hui Xian Grace Lim, Borchun Chen, Mubarak Shah, Ser Nam Lim | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models | 增强文本到图像合成中的语义保真度:扩散模型中的注意力调节 | Yang Zhang, Teoh Tze Tzun, Lim Wei Hern, Tiviatis Sim, Kenji Kawaguchi | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style | Style2Talker:具有情感风格和艺术风格的高分辨率说话头生成 | Shuai Tan, Bin Ji, Ye Pan | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Say Anything with Any Style | 以任何风格说任何话 | Shuai Tan, Bin Ji, Yu Ding, Ye Pan | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | MOAB: Multi-Modal Outer Arithmetic Block For Fusion Of Histopathological Images And Genetic Data For Brain Tumor Grading | MOAB:多模态外部算术块,用于融合组织病理学图像和遗传数据以进行脑肿瘤分级 | Omnia Alwazzan, Abbas Khan, Ioannis Patras, Gregory Slabaugh | arxiv.org/pdf/2403.06… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | VideoMamba: State Space Model for Efficient Video Understanding | VideoMamba:用于高效视频理解的状态空间模型 | Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, Limin Wang, Yu Qiao | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks | FocusCLIP:以人为中心的任务中零样本迁移的多模态主体级引导 | Muhammad Saif Ullah Khan, Muhammad Ferjad Naeem, Federico Tombari, Luc Van Gool, Didier Stricker, Muhammad Zeshan Afzal | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | DiaLoc: An Iterative Approach to Embodied Dialog Localization | DiaLoc:一种实现对话本地化的迭代方法 | Chao Zhang, Mohan Li, Ignas Budvytis, Stephan Liwicki | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | CT2Rep: Automated Radiology Report Generation for 3D Medical Imaging | CT2Rep:自动生成 3D 医学成像放射学报告 | Ibrahim Ethem Hamamci, Sezgin Er, Bjoern Menze | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Real-Time Multimodal Cognitive Assistant for Emergency Medical Services | 紧急医疗服务实时多模态认知助手 | Keshara Weerasinghe, Saahith Janapati, Xueren Ge, Sion Kim, Sneha Iyer, John A. Stankovic, Homa Alemzadeh | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Large Model driven Radiology Report Generation with Clinical Quality Reinforcement Learning | 通过临床质量强化学习生成大型模型驱动的放射学报告 | Zijian Zhou, Miaojing Shi, Meng Wei, Oluwatosin Alabi, Zijie Yue, Tom Vercauteren | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Restoring Ancient Ideograph: A Multimodal Multitask Neural Network Approach | 恢复古代表意文字:多模态多任务神经网络方法 | Siyu Duan, Jun Wang, Qi Su | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Answering Diverse Questions via Text Attached with Key Audio-Visual Clues | 通过附有关键视听线索的文字回答各种问题 | Qilang Ye, Zitong Yu, Xin Liu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | 3DRef: 3D Dataset and Benchmark for Reflection Detection in RGB and Lidar Data | 3DRef:RGB 和激光雷达数据中反射检测的 3D 数据集和基准 | Xiting Zhao, Sören Schwertfeger | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | 3D Semantic Segmentation-Driven Representations for 3D Object Detection | 用于 3D 对象检测的 3D 语义分割驱动表示 | Hayeon O, Kunsoo Huh | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Reliable Spatial-Temporal Voxels For Multi-Modal Test-Time Adaptation | 用于多模态测试时间适应的可靠时空体素 | Haozhi Cao, Yuecong Xu, Jianfei Yang, Pengyu Yin, Xingyu Ji, Shenghai Yuan, Lihua Xie | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Can LLMs' Tuning Methods Work in Medical Multimodal Domain? | LLM 的微调方法可以在医学多模态领域发挥作用吗? | Jiawei Chen, Yue Jiang, Dingkang Yang, Mingcheng Li, Jinjie Wei, Ziyun Qian, Lihua Zhang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | See Through Their Minds: Learning Transferable Neural Representation from Cross-Subject fMRI | 看透他们的想法:从跨主题功能磁共振成像中学习可迁移的神经表征 | Yulong Liu, Yongqiang Ma, Guibo Zhu, Haodong Jing, Nanning Zheng | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Multi-modal Semantic Understanding with Contrastive Cross-modal Feature Alignment | 具有对比跨模态特征对齐的多模态语义理解 | Ming Zhang, Ke Chang, Yunfang Wu | arxiv.org/pdf/2403.06… | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | FreGS: 3D Gaussian Splatting with Progressive Frequency Regularization | FreGS:具有渐进频率正则化的 3D 高斯泼溅 | Jiahui Zhang, Fangneng Zhan, Muyu Xu, Shijian Lu, Eric Xing | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | SiLVR: Scalable Lidar-Visual Reconstruction with Neural Radiance Fields for Robotic Inspection | SiLVR:用于机器人检查的具有神经辐射场的可扩展激光雷达视觉重建 | Yifu Tao, Yash Bhalgat, Lanke Frank Tarimo Fu, Matias Mattamala, Nived Chebrolu, Maurice Fallon | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Vosh: Voxel-Mesh Hybrid Representation for Real-Time View Synthesis | Vosh:用于实时视图合成的体素网格混合表示 | Chenhao Zhang, Yongyang Zhou, Lei Zhang | arxiv.org/pdf/2403.06… | null |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization | DNGaussian:通过全局局部深度归一化优化稀疏视图 3D 高斯辐射场 | Jiahe Li, Jiawei Zhang, Xiao Bai, Jin Zheng, Xin Ning, Jun Zhou, Lin Gu | arxiv.org/pdf/2403.06… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | GRITv2: Efficient and Light-weight Social Relation Recognition | GRITv2:高效、轻量级的社交关系识别 | N K Sagar Reddy, Neeraj Kasera, Avinash Thakur | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models | 第 2 层之后,一张图像就值 1/2 个 token:大型视觉语言模型的即插即用推理加速 | Liang Chen, Haozhe Zhao, Tianyu Liu, Shuai Bai, Junyang Lin, Chang Zhou, Baobao Chang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | PeerAiD: Improving Adversarial Distillation from a Specialized Peer Tutor | PeerAiD:由专业同伴导师改进对抗性蒸馏 | Jaewon Jung, Hongsun Jang, Jaeyong Song, Jinho Lee | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning | QuantTune:通过自适应离群值驱动微调来优化模型量化 | Jiun-Man Chen, Yu-Hsuan Chao, Yu-Jie Wang, Ming-Der Shieh, Chih-Chung Hsu, Wei-Fen Lin | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Enhanced Sparsification via Stimulative Training | 通过刺激训练增强稀疏化 | Shengji Tang, Weihao Lin, Hancheng Ye, Peng Ye, Chong Yu, Baopu Li, Tao Chen | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization | FlowVQTalker:通过规范化流和量化生成高质量的情感面部表情 | Shuai Tan, Bin Ji, Ye Pan | arxiv.org/pdf/2403.06… | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling | 注意力提示调优:时空建模预训练模型的参数高效适应 | Wele Gedara Chaminda Bandara, Vishal M. Patel | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Explainable Transformer Prototypes for Medical Diagnoses | 用于医学诊断的可解释 Transformer 原型 | Ugur Demir, Debesh Jha, Zheyuan Zhang, Elif Keles, Bradley Allen, Aggelos K. Katsaggelos, Ulas Bagci | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer | 优化手术场景的潜在图表示以实现零样本域转移 | Siddhant Satyanaik, Aditya Murali, Deepak Alapatt, Xin Wang, Pietro Mascagni, Nicolas Padoy | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Advancing Generalizable Remote Physiological Measurement through the Integration of Explicit and Implicit Prior Knowledge | 通过整合显性和隐性先验知识推进可推广的远程生理测量 | Yuting Zhang, Hao Lu, Xin Liu, Yingcong Chen, Kaishun Wu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head | 具有高效融合头的基于 Transformer 的实时开放词汇检测 | Tiancheng Zhao, Peng Liu, Xuan He, Lu Zhang, Kyusong Lee | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | COOD: Combined out-of-distribution detection using multiple measures for anomaly & novel class detection in large-scale hierarchical classification | COOD:使用多种措施组合分布外检测,用于大规模分层分类中的异常和新类检测 | L. E. Hogeweg, R. Gangireddy, D. Brunink, V. J. Kalkman, L. Cornelissen, J. W. Kamminga | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation | DriveDreamer-2:用于生成多样化驾驶视频的 LLM 增强世界模型 | Guosheng Zhao, Xiaofeng Wang, Zheng Zhu, Xinze Chen, Guan Huang, Xiaoyi Bao, Xingang Wang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations | LeOCLR:利用原始图像进行视觉表示的对比学习 | Mohammad Alkhalefi, Georgios Leontidis, Mingjun Zhong | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Deep Learning Approaches for Human Action Recognition in Video Data | 视频数据中人类动作识别的深度学习方法 | Yufei Xie | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Dynamic Perturbation-Adaptive Adversarial Training on Medical Image Classification | 医学图像分类的动态扰动自适应对抗训练 | Shuai Li, Xiaoguang Ma, Shancheng Jiang, Lu Meng | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Leveraging Internal Representations of Model for Magnetic Image Classification | 利用模型的内部表示进行磁图像分类 | Adarsh N L, Arun P V, Alok Porwal, Malcolm Aranha | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Genetic Learning for Designing Sim-to-Real Data Augmentations | 用于设计模拟到真实数据增强的遗传学习 | Bram Vanherle, Nick Michiels, Frank Van Reeth | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Average Calibration Error: A Differentiable Loss for Improved Reliability in Image Segmentation | 平均校准误差:提高图像分割可靠性的可微分损失 | Theodore Barfoot, Luis Garcia-Peraza-Herrera, Ben Glocker, Tom Vercauteren | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Shortcut Learning in Medical Image Segmentation | 医学图像分割中的捷径学习 | Manxi Lin, Nina Weng, Kamil Mikolaj, Zahra Bashir, Morten Bo Søndergaard Svendsen, Martin Tolsgaard, Anders Nymark Christensen, Aasa Feragen | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Probabilistic Contrastive Learning for Long-Tailed Visual Recognition | 长尾视觉识别的概率对比学习 | Chaoqun Du, Yulin Wang, Shiji Song, Gao Huang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Advancing Graph Neural Networks with HL-HGAT: A Hodge-Laplacian and Attention Mechanism Approach for Heterogeneous Graph-Structured Data | 使用 HL-HGAT 推进图神经网络:异构图结构数据的 Hodge-Laplacian 和注意力机制方法 | Jinghan Huang, Qiufeng Chen, Yijun Bian, Pengli Zhu, Nanguang Chen, Moo K. Chung, Anqi Qiu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Trustworthy Partial Label Learning with Out-of-distribution Detection | 具有分布外检测的可信部分标签学习 | Jintao Huang, Yiu-Ming Cheung | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective | CAM 再次回归:从弱监督对象定位角度看大型内核 CNN | Shunsuke Yasuki, Masato Taki | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Car Damage Detection and Patch-to-Patch Self-supervised Image Alignment | 汽车损坏检测和逐块自监督图像对齐 | Hanxiao Chen | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | epsilon-Mesh Attack: A Surface-based Adversarial Point Cloud Attack for Facial Expression Recognition | epsilon-Mesh 攻击:基于表面的对抗性点云攻击,用于面部表情识别 | Batuhan Cengiz, Mert Gulsen, Yusuf H. Sahin, Gozde Unal | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Towards Zero-Shot Interpretable Human Recognition: A 2D-3D Registration Framework | 迈向零样本可解释人类识别:2D-3D 配准框架 | Henrique Jesus, Hugo Proença | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Ricci flow-based brain surface covariance descriptors for Alzheimer disease | 基于 Ricci 流的阿尔茨海默病脑表面协方差描述符 | Fatemeh Ahmadi, Mohamad Ebrahim Shiri, Behroz Bidabad, Maral Sedaghat, Pooran Memari | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Evaluating the Energy Efficiency of Few-Shot Learning for Object Detection in Industrial Settings | 评估工业环境中目标检测的少样本学习的能源效率 | Georgios Tsoumplekas, Vladislav Li, Ilias Siniosoglou, Vasileios Argyriou, Sotirios K. Goudos, Ioannis D. Moscholios, Panagiotis Radoglou-Grammatikis, Panagiotis Sarigiannidis | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Forest Inspection Dataset for Aerial Semantic Segmentation and Depth Estimation | 用于航空语义分割和深度估计的森林检查数据集 | Bianca-Cerasela-Zelia Blaga, Sergiu Nedevschi | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Density-Guided Label Smoothing for Temporal Localization of Driving Actions | 用于驾驶行为时间定位的密度引导标签平滑 | Tunc Alkanat, Erkut Akdag, Egor Bondarev, Peter H. N. De With | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers | 图像到图转换器的跨域和跨维度学习 | Alexander H. Berger, Laurin Lux, Suprosanna Shit, Ivan Ezhov, Georgios Kaissis, Martin J. Menten, Daniel Rueckert, Johannes C. Paetzold | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | BEV2PR: BEV-Enhanced Visual Place Recognition with Structural Cues | BEV2PR:带有结构提示的 BEV 增强视觉位置识别 | Fudong Ge, Yiwei Zhang, Shuhan Shen, Yue Wang, Weiming Hu, Jin Gao | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Exploiting Style Latent Flows for Generalizing Deepfake Detection Video Detection | 利用风格潜在流来推广 Deepfake 检测视频检测 | Jongwook Choi, Taehoon Kim, Yonghyun Jeong, Seungryul Baek, Jongwon Choi | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Detection of Object Throwing Behavior in Surveillance Videos | 监控视频中物体投掷行为的检测 | Ivo P. C. Kersten, Erkut Akdag, Egor Bondarev, Peter H. N. De With | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | OMH: Structured Sparsity via Optimally Matched Hierarchy for Unsupervised Semantic Segmentation | OMH:通过最佳匹配层次结构实现无监督语义分割的结构化稀疏性 | Baran Ozaydin, Tong Zhang, Deblina Bhattacharjee, Sabine Süsstrunk, Mathieu Salzmann | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection | SARDet-100K:迈向大规模 SAR 物体检测的开源基准和工具包 | Yuxuan Li, Xiang Li, Weijie Li, Qibin Hou, Li Liu, Ming-Ming Cheng, Jian Yang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Confidence-Aware RGB-D Face Recognition via Virtual Depth Synthesis | 通过虚拟深度合成进行置信度感知 RGB-D 人脸识别 | Zijian Chen, Mei Wang, Weihong Deng, Hongzhi Shi, Dongchao Wen, Yingjie Zhang, Xingchen Cui, Jian Zhao | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Skeleton Supervised Airway Segmentation | 骨骼监督气道分割 | Mingyue Zhao, Han Li, Li Fan, Shiyuan Liu, Xiaolan Qiu, S. Kevin Zhou | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts | 通过带有少量样本提示的上下文残差学习实现通用异常检测 | Jiawen Zhu, Guansong Pang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Query-guided Prototype Evolution Network for Few-Shot Segmentation | 用于少样本分割的查询引导原型进化网络 | Runmin Cong, Hang Xiong, Jinpeng Chen, Wei Zhang, Qingming Huang, Yao Zhao | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Toward Robust Canine Cardiac Diagnosis: Deep Prototype Alignment Network-Based Few-Shot Segmentation in Veterinary Medicine | 实现稳健的犬心脏诊断:兽医医学中基于深度原型对齐网络的少样本分割 | Jun-Young Oh, In-Gyu Lee, Tae-Eui Kam, Ji-Hoon Jeong | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy | Point Mamba:基于状态空间模型和基于八叉树的排序策略的新型点云主干 | Jiuming Liu, Ruiji Yu, Yian Wang, Yu Zheng, Tianchen Deng, Weicai Ye, Hesheng Wang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Towards the Uncharted: Density-Descending Feature Perturbation for Semi-supervised Semantic Segmentation | 走向未知:半监督语义分割的密度下降特征扰动 | Xiaoyang Wang, Huihui Bai, Limin Yu, Yao Zhao, Jimin Xiao | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Ensemble Quadratic Assignment Network for Graph Matching | 用于图匹配的集成二次分配网络 | Haoru Tan, Chuang Wang, Sitong Wu, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Fine-Grained Pillar Feature Encoding Via Spatio-Temporal Virtual Grid for 3D Object Detection | 通过时空虚拟网格进行细粒度支柱特征编码,用于 3D 对象检测 | Konyul Park, Yecheol Kim, Junho Koh, Byungwoo Park, Jun Won Choi | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | AS-FIBA: Adaptive Selective Frequency-Injection for Backdoor Attack on Deep Face Restoration | AS-FIBA:自适应选择性频率注入用于深度面部恢复的后门攻击 | Zhenbo Song, Wenhao Gao, Kaihao Zhang, Wenhan Luo, Zhaoxin Fan, Jianfeng Lu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | PointSeg: A Training-Free Paradigm for 3D Scene Segmentation via Foundation Models | PointSeg:通过基础模型进行 3D 场景分割的免训练范式 | Qingdong He, Jinlong Peng, Zhengkai Jiang, Xiaobin Hu, Jiangning Zhang, Qiang Nie, Yabiao Wang, Chengjie Wang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Refining Segmentation On-the-Fly: An Interactive Framework for Point Cloud Semantic Segmentation | 实时细化分割:点云语义分割的交互式框架 | Peng Zhang, Ting Wu, Jinsheng Sun, Weiqing Li, Zhiyong Su | arxiv.org/pdf/2403.06… | null |

GNN

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | Structure Your Data: Towards Semantic Graph Counterfactuals | 构建数据:走向语义图反事实 | Angeliki Dimitriou, Maria Lymperaiou, Giorgos Filandrianos, Konstantinos Thomas, Giorgos Stamou | arxiv.org/pdf/2403.06… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | Applicability of oculomics for individual risk prediction: Repeatability and robustness of retinal Fractal Dimension using DART and AutoMorph | 眼组学在个体风险预测中的适用性:使用 DART 和 AutoMorph 的视网膜分形维度的重复性和鲁棒性 | Justin Engelmann, Diana Moukaddem, Lucas Gago, Niall Strang, Miguel O. Bernabeu | arxiv.org/pdf/2403.06… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | HDRTransDC: High Dynamic Range Image Reconstruction with Transformer Deformation Convolution | HDRTransDC:使用 Transformer 变形卷积进行高动态范围图像重建 | Shuaikang Shang, Xuejing Kang, Anlong Ming | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Boosting Image Restoration via Priors from Pre-trained Models | 通过预训练模型的先验促进图像恢复 | Xiaogang Xu, Shu Kong, Tao Hu, Zhe Liu, Hujun Bao | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | CEAT: Continual Expansion and Absorption Transformer for Non-Exemplar Class-Incremental Learning | CEAT:非典范类增量学习的持续扩展和吸收 Transformer | Xinyuan Gao, Songlin Dong, Yuhang He, Xing Wei, Yihong Gong | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Multi-Scale Implicit Transformer with Re-parameterize for Arbitrary-Scale Super-Resolution | 具有任意尺度超分辨率重新参数化功能的多尺度隐式 Transformer | Jinchen Zhu, Mingjian Zhang, Ling Zheng, Shizhuang Weng | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos | 音频驱动的头像视频感知质量指标的比较研究 | Weixia Zhang, Chengguang Zhu, Jingnan Gao, Yichao Yan, Guangtao Zhai, Xiaokang Yang | arxiv.org/pdf/2403.06… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | Memory-based Adapters for Online 3D Scene Perception | 用于在线 3D 场景感知的基于内存的适配器 | Xiuwei Xu, Chong Xia, Ziwei Wang, Linqing Zhao, Yueqi Duan, Jie Zhou, Jiwen Lu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology | MambaMIL:通过计算病理学中的序列重排序增强长序列建模 | Shu Yang, Yihui Wang, Hao Chen | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation | FaceChain-SuDe:构建派生类以继承类别属性以实现一次性主题驱动生成 | Pengchong Qiao, Lei Shang, Chang Liu, Baigui Sun, Xiangyang Ji, Jie Chen | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Fast Text-to-3D-Aware Face Generation and Manipulation via Direct Cross-modal Mapping and Geometric Regularization | 通过直接跨模式映射和几何正则化快速生成文本到 3D 感知的人脸并进行操作 | Jinlu Zhang, Yiyi Zhou, Qiancheng Zheng, Xiaoxiong Du, Gen Luo, Jun Peng, Xiaoshuai Sun, Rongrong Ji | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | PCLD: Point Cloud Layerwise Diffusion for Adversarial Purification | PCLD:用于对抗性净化的点云分层扩散 | Mert Gulsen, Batuhan Cengiz, Yusuf H. Sahin, Gozde Unal | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Ada-Tracker: Soft Tissue Tracking via Inter-Frame and Adaptive-Template Matching | Ada-Tracker:通过帧间和自适应模板匹配进行软组织跟踪 | Jiaxin Guo, Jiangliu Wang, Zhaoshuo Li, Tongyu Jia, Qi Dou, Yun-Hui Liu | arxiv.org/pdf/2403.06… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | Split to Merge: Unifying Separated Modalities for Unsupervised Domain Adaptation | 拆分到合并:统一无监督域适应的分离模式 | Xinyao Li, Yuke Li, Zhekai Du, Fengling Li, Ke Lu, Jingjing Li | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Shape Non-rigid Kinematics (SNK): A Zero-Shot Method for Non-Rigid Shape Matching via Unsupervised Functional Map Regularized Reconstruction | 形状非刚性运动学 (SNK):通过无监督功能图正则化重建实现非刚性形状匹配的零样本方法 | Souhaib Attaiki, Maks Ovsjanikov | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Eliminating Warping Shakes for Unsupervised Online Video Stitching | 消除无监督在线视频拼接的扭曲抖动 | Lang Nie, Chunyu Lin, Kang Liao, Yun Zhang, Shuaicheng Liu, Yao Zhao | arxiv.org/pdf/2403.06… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-11 | Deep adaptative spectral zoom for improved remote heart rate estimation | 深度自适应光谱变焦可改善远程心率估计 | Joaquim Comas, Adria Ruiz, Federico Sukno | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | A Holistic Framework Towards Vision-based Traffic Signal Control with Microscopic Simulation | 基于视觉的微观仿真交通信号控制的整体框架 | Pan He, Quanyi Li, Xiaoyong Yuan, Bolei Zhou | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Learning with Noisy Foundation Models | 使用嘈杂的基础模型进行学习 | Hao Chen, Jindong Wang, Zihan Wang, Ran Tao, Hongxin Wei, Xing Xie, Masashi Sugiyama, Bhiksha Raj | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | QUASAR: QUality and Aesthetics Scoring with Advanced Representations | QUASAR:使用高级表示进行质量和美观评分 | Sergey Kastryulin, Denis Prokopenko, Artem Babenko, Dmitry V. Dylov | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Real-Time Simulated Avatar from Head-Mounted Sensors | 来自头戴式传感器的实时模拟头像 | Zhengyi Luo, Jinkun Cao, Rawal Khirodkar, Alexander Winkler, Kris Kitani, Weipeng Xu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Stochastic Cortical Self-Reconstruction | 随机皮质自我重建 | Christian Wachinger, Dennis Hedderich, Fabian Bongratz | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | EarthLoc: Astronaut Photography Localization by Indexing Earth from Space | EarthLoc:通过从太空索引地球来进行宇航员摄影定位 | Gabriele Berton, Alex Stoken, Barbara Caputo, Carlo Masone | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Transferring Relative Monocular Depth to Surgical Vision with Temporal Consistency | 将相对单眼深度转换为具有时间一致性的手术视觉 | Charlie Budd, Tom Vercauteren | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Leveraging Foundation Models for Content-Based Medical Image Retrieval in Radiology | 利用基础模型进行放射学中基于内容的医学图像检索 | Stefan Denner, David Zimmerer, Dimitrios Bounias, Markus Bujotzek, Shuhan Xiao, Lisa Kausch, Philipp Schader, Tobias Penzkofer, Paul F. Jäger, Klaus Maier-Hein | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Reconstructing Visual Stimulus Images from EEG Signals Based on Deep Visual Representation Model | 基于深度视觉表示模型的脑电信号重建视觉刺激图像 | Hongguang Pan, Zhuoyi Li, Yunpeng Fu, Xuebin Qin, Jianchen Hu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | FontCLIP: A Semantic Typography Visual-Language Model for Multilingual Font Applications | FontCLIP:用于多语言字体应用的语义排版视觉语言模型 | Yuki Tatsukawa, I-Chao Shen, Anran Qi, Yuki Koyama, Takeo Igarashi, Ariel Shamir | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Latent Semantic Consensus For Deterministic Geometric Model Fitting | 确定性几何模型拟合的潜在语义共识 | Guobao Xiao, Jun Yu, Jiayi Ma, Deng-Ping Fan, Ling Shao | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Temporal-Mapping Photography for Event Cameras | 用于事件相机的时间映射摄影 | Yuhan Bao, Lei Sun, Yuqin Ma, Kaiwei Wang | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Bridging Domains with Approximately Shared Features | 桥接具有近似共享功能的域 | Ziliang Samuel Zhong, Xiang Pan, Qi Lei | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Comparison of No-Reference Image Quality Models via MAP Estimation in Diffusion Latents | 通过扩散潜伏中的 MAP 估计比较无参考图像质量模型 | Weixia Zhang, Dingquan Li, Guangtao Zhai, Xiaokang Yang, Kede Ma | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Pre-Trained Model Recommendation for Downstream Fine-tuning | 用于下游微调的预训练模型推荐 | Jiameng Bai, Sai Wu, Jie Song, Junbo Zhao, Gang Chen | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Video Generation with Consistency Tuning | 具有一致性调整的视频生成 | Chaoyi Wang, Yaozhe Song, Yafeng Zhang, Jun Pei, Lijie Xia, Jianpo Liu | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Exploring Hardware Friendly Bottleneck Architecture in CNN for Embedded Computing Systems | 探索嵌入式计算系统 CNN 中的硬件友好瓶颈架构 | Xing Lei, Longjun Liu, Zhiheng Zhou, Hongbin Sun, Nanning Zheng | arxiv.org/pdf/2403.06… | null |
| 2024-03-11 | Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos | 设身处地为你着想:从外中心视频中提升自我中心视角 | Mi Luo, Zihui Xue, Alex Dimakis, Kristen Grauman | arxiv.org/pdf/2403.06… | null |