[Share][Daily Update][2024.03.12][CV_arxiv_papers]


[UPDATED!] 2024-03-12 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | 连接不同语言模型和生成视觉模型以生成文本到图像 | Shihao Zhao, Shaozhe Hao, Bojia Zi, Huaizhe Xu, Kwan-Yee K. Wong | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | SemCity: Semantic Scene Generation with Triplane Diffusion | SemCity:利用三平面扩散生成语义场景 | Jumin Lee, Sebin Lee, Changho Jo, Woobin Im, Juhyeong Seon, Sung-Eui Yoon | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model | 稳定妆容:当现实世界的妆容转移遇到扩散模型时 | Yuxuan Zhang, Lifu Wei, Qing Zhang, Yiren Song, Jiaming Liu, Huaxia Li, Xu Tang, Yao Hu, Haibo Zhao | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces | SSM 与视频扩散模型的结合:利用结构化状态空间高效生成视频 | Yuta Oshima, Shohei Taniguchi, Masahiro Suzuki, Yutaka Matsuo | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Annotations on a Budget: Leveraging Geo-Data Similarity to Balance Model Performance and Annotation Cost | 预算注释:利用地理数据相似性来平衡模型性能和注释成本 | Oana Ignat, Longju Bai, Joan Nwatu, Rada Mihalcea | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Multiple Latent Space Mapping for Compressed Dark Image Enhancement | 用于压缩暗图像增强的多重潜在空间映射 | Yi Zeng, Zhengning Wang, Yuxuan Liu, Tianjiao Zeng, Xuhang Liu, Xinglong Luo, Shuaicheng Liu, Shuyuan Zhu, Bing Zeng | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Accurate Spatial Gene Expression Prediction by integrating Multi-resolution features | 通过集成多分辨率特征进行准确的空间基因表达预测 | Youngmin Chung, Ji Hun Ha, Kyeong Chan Im, Joo Sang Lee | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | The future of document indexing: GPT and Donut revolutionize table of content processing | 文档索引的未来:GPT 和 Donut 彻底改变了内容处理表 | Degaga Wolde Feyisa, Haylemicheal Berihun, Amanuel Zewdu, Mahsa Najimoghadam, Marzieh Zare | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | D4D: An RGBD diffusion model to boost monocular depth estimation | D4D:用于增强单目深度估计的 RGBD 扩散模型 | L. Papa, P. Russo, I. Amerini | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Block-wise LoRA: Revisiting Fine-grained LoRA for Effective Personalization and Stylization in Text-to-Image Generation | 分块 LoRA:重新审视细粒度 LoRA,以实现文本到图像生成中的有效个性化和风格化 | Likun Li, Haoqi Zeng, Changpeng Yang, Haozhe Jia, Di Xu | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | DragAnything: Motion Control for Anything using Entity Representation | DragAnything:使用实体表示对任何物体进行运动控制 | Wejia Wu, Zhuang Li, Yuchao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Di Zhang | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Auxiliary CycleGAN-guidance for Task-Aware Domain Translation from Duplex to Monoplex IHC Images | 用于任务感知域从双工到单工 IHC 图像转换的辅助 CycleGAN 指南 | Nicolas Brieu, Nicolas Triltsch, Philipp Wortmann, Dominik Winter, Shashank Saran, Marlon Rebelatto, Günter Schmidt | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Time-Efficient and Identity-Consistent Virtual Try-On Using A Variant of Altered Diffusion Models | 使用改变的扩散模型变体进行省时且身份一致的虚拟试穿 | Phuong Dam, Jihoon Jeong, Anh Tran, Daeyoung Kim | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning | 挑战遗忘:揭示机器遗忘中最坏情况的遗忘集 | Chongyu Fan, Jiancheng Liu, Alfred Hero, Sijia Liu | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Premonition: Using Generative Models to Preempt Future Data Changes in Continual Learning | 预感:使用生成模型来预防持续学习中的未来数据变化 | Mark D. McDonnell, Dong Gong, Ehsan Abbasnejad, Anton van den Hengel | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Vector Quantization for Deep-Learning-Based CSI Feedback in Massive MIMO Systems | 大规模 MIMO 系统中基于深度学习的 CSI 反馈的矢量量化 | Junyong Shin, Yujin Kang, Yo-Seb Jeon | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Large Window-based Mamba UNet for Medical Image Segmentation: Beyond Convolution and Self-attention | 用于医学图像分割的基于大窗口的 Mamba UNet:超越卷积和自注意力 | Jinhong Wang, Jintai Chen, Danny Chen, Jian Wu | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Efficient Diffusion Model for Image Restoration by Residual Shifting | 通过残差移位进行图像恢复的高效扩散模型 | Zongsheng Yue, Jianyi Wang, Chen Change Loy | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Dynamic U-Net: Adaptively Calibrate Features for Abdominal Multi-organ Segmentation | 动态 U-Net:腹部多器官分割的自适应校准特征 | Jin Yang, Daniel S. Marcus, Aristeidis Sotiras | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | A Bayesian Approach to OOD Robustness in Image Classification | 图像分类中 OOD 鲁棒性的贝叶斯方法 | Prakhar Kaushik, Adam Kortylewski, Alan Yuille | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | GuideGen: A Text-guided Framework for Joint CT Volume and Anatomical structure Generation | GuideGen:用于联合 CT 体积和解剖结构生成的文本引导框架 | Linrui Dai, Rongzhao Zhang, Zhongzhen Huang, Xiaofan Zhang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Frequency-Aware Deepfake Detection: Improving Generalizability through Frequency Space Learning | 频率感知 Deepfake 检测:通过频率空间学习提高通用性 | Chuangchuang Tan, Yao Zhao, Shikui Wei, Guanghua Gu, Ping Liu, Yunchao Wei | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | It's All About Your Sketch: Democratising Sketch Control in Diffusion Models | 一切都与您的草图有关:扩散模型中的草图控制民主化 | Subhadeep Koley, Ayan Kumar Bhunia, Deeptanshu Sekhri, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Learn and Search: An Elegant Technique for Object Lookup using Contrastive Learning | 学习和搜索:使用对比学习进行对象查找的优雅技术 | Chandan Kumar, Jansel Herrera-Gerena, John Just, Matthew Darr, Ali Jannesari | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers | 文本到图像的扩散模型是很棒的素描-照片匹配器 | Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song | arxiv.org/pdf/2403.07… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | Beyond Text: Frozen Large Language Models in Visual Signal Comprehension | 超越文本:视觉信号理解中冻结的大型语言模型 | Lei Zhu, Fangyun Wei, Yanye Lu | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Multi-modal Auto-regressive Modeling via Visual Words | 通过视觉词进行多模态自回归建模 | Tianshuo Peng, Zuchao Li, Lefei Zhang, Hai Zhao, Ping Wang, Bo Du | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Unleashing Network Potentials for Semantic Scene Completion | 释放网络潜力以完成语义场景 | Fengyun Wang, Qianru Sun, Dong Zhang, Jinhui Tang | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | DALSA: Domain Adaptation for Supervised Learning From Sparsely Annotated MR Images | DALSA:从稀疏注释的 MR 图像中进行监督学习的领域适应 | Michael Götz, Christian Weber, Franciszek Binczyk, Joanna Polanska, Rafal Tarnawski, Barbara Bobek-Billewicz, Ullrich Köthe, Jens Kleesiek, Bram Stieltjes, Klaus H. Maier-Hein | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Bring Event into RGB and LiDAR: Hierarchical Visual-Motion Fusion for Scene Flow | 将事件引入 RGB 和 LiDAR:场景流的分层视觉运动融合 | Hanyu Zhou, Yi Chang, Zhiwei Shi, Luxin Yan | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | In-context learning enables multimodal large language models to classify cancer pathology images | 上下文学习使多模态大语言模型能够对癌症病理图像进行分类 | Dyke Ferber, Georg Wölflein, Isabella C. Wiest, Marta Ligero, Srividhya Sainath, Narmin Ghaffari Laleh, Omar S. M. El Nahhas, Gustav Müller-Franzes, Dirk Jäger, Daniel Truhn, et.al. | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Eliminating Cross-modal Conflicts in BEV Space for LiDAR-Camera 3D Object Detection | 消除 BEV 空间中 LiDAR 相机 3D 物体检测的跨模态冲突 | Jiahui Fu, Chen Gao, Zitian Wang, Lirong Yang, Xiaofei Wang, Beipeng Mu, Si Liu | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Textual Knowledge Matters: Cross-Modality Co-Teaching for Generalized Visual Class Discovery | 文本知识很重要:跨模态协同教学促进广义视觉类发现 | Haiyang Zheng, Nan Pu, Wenjing Li, Nicu Sebe, Zhun Zhong | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | KEBench: A Benchmark on Knowledge Editing for Large Vision-Language Models | KEBench:大型视觉语言模型知识编辑的基准 | Han Huang, Haitian Zhong, Qiang Liu, Shu Wu, Liang Wang, Tieniu Tan | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Lumen: Unleashing Versatile Vision-Centric Capabilities of Large Multimodal Models | Lumen:释放大型多模态模型以视觉为中心的多功能功能 | Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Let Storytelling Tell Vivid Stories: An Expressive and Fluent Multimodal Storyteller | 让讲故事讲生动的故事:富有表现力、流利的多模式讲故事者 | Chuanqi Zang, Jiji Tang, Rongsheng Zhang, Zeng Zhao, Tangjie Lv, Mingtao Pei, Wei Liang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection | SparseLIF:用于 3D 物体检测的高性能稀疏 LiDAR 相机融合 | Hongcheng Zhang, Liu Liang, Pengxin Zeng, Xiao Song, Zhe Wang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Calibrating Multi-modal Representations: A Pursuit of Group Robustness without Annotations | 校准多模态表示:追求无注释的群体鲁棒性 | Chenyu You, Yifei Min, Weicheng Dai, Jasjeet S. Sekhon, Lawrence Staib, James S. Duncan | arxiv.org/pdf/2403.07… | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | SMURF: Continuous Dynamics for Motion-Deblurring Radiance Fields | SMURF:运动去模糊辐射场的连续动力学 | Jungho Lee, Dogyoon Lee, Minhyeok Lee, Donghyung Kim, Sangyoun Lee | arxiv.org/pdf/2403.07… | link |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting | StyleGaussian:使用高斯泼溅进行即时 3D 风格转移 | Kunhao Liu, Fangneng Zhan, Muyu Xu, Christian Theobalt, Ling Shao, Shijian Lu | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | SemGauss-SLAM: Dense Semantic Gaussian Splatting SLAM | SemGauss-SLAM:密集语义高斯泼溅 SLAM | Siting Zhu, Renjie Qin, Guangming Wang, Jiuming Liu, Hesheng Wang | arxiv.org/pdf/2403.07… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | Distilling the Knowledge in Data Pruning | 提炼数据修剪的知识 | Emanuel Ben-Baruch, Adam Botach, Igor Kviatkovsky, Manoj Aggarwal, Gérard Medioni | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric | MoPE-CLIP:使用模块式剪枝误差度量对高效视觉语言模型进行结构化剪枝 | Haokun Lin, Haoli Bai, Zhili Liu, Lu Hou, Muyi Sun, Linqi Song, Ying Wei, Zhenan Sun | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Learning Generalizable Feature Fields for Mobile Manipulation | 学习移动操作的可推广特征字段 | Ri-Zhao Qiu, Yafei Hu, Ge Yang, Yuchen Song, Yang Fu, Jianglong Ye, Jiteng Mu, Ruihan Yang, Nikolay Atanasov, Sebastian Scherer, et.al. | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Continual All-in-One Adverse Weather Removal with Knowledge Replay on a Unified Network Structure | 在统一的网络结构上通过知识重放持续进行多合一的恶劣天气消除 | De Cheng, Yanling Ji, Dong Gong, Yan Li, Nannan Wang, Junwei Han, Dingwen Zhang | arxiv.org/pdf/2403.07… | link |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | Label Dropout: Improved Deep Learning Echocardiography Segmentation Using Multiple Datasets With Domain Shift and Partial Labelling | 标签丢失:使用具有域转移和部分标签的多个数据集改进深度学习超声心动图分割 | Iman Islam, Esther Puyol-Antón, Bram Ruijsink, Andrew J. Reader, Andrew P. King | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | BraSyn 2023 challenge: Missing MRI synthesis and the effect of different learning objectives | BraSyn 2023 挑战:缺少 MRI 综合以及不同学习目标的影响 | Ivo M. Baltruschat, Parvaneh Janbakhshi, Matthias Lenga | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Vision-based Vehicle Re-identification in Bridge Scenario using Flock Similarity | 使用群体相似度进行桥梁场景中基于视觉的车辆重识别 | Chunfeng Zhang, Ping Wang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Unleashing HyDRa: Hybrid Fusion, Depth Consistency and Radar for Unified 3D Perception | 释放 HyDRa:混合融合、深度一致性和雷达,实现统一 3D 感知 | Philipp Wolters, Johannes Gilg, Torben Teepe, Fabian Herzog, Anouar Laouichi, Martin Hofmann, Gerhard Rigoll | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Equipping Computational Pathology Systems with Artifact Processing Pipelines: A Showcase for Computation and Performance Trade-offs | 为计算病理学系统配备伪影处理管道:计算和性能权衡的展示 | Neel Kanwal, Farbod Khoraminia, Umay Kiraz, Andres Mosquera-Zamudio, Carlos Monteagudo, Emiel A. M. Janssen, Tahlita C. M. Zuiverloon, Chunmig Rong, Kjersti Engan | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | DSEG-LIME -- Improving Image Explanation by Hierarchical Data-Driven Segmentation | DSEG-LIME——通过分层数据驱动的分割改进图像解释 | Patrick Knab, Sascha Marton, Christian Bartelt | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Dynamic Graph Representation with Knowledge-aware Attention for Histopathology Whole Slide Image Analysis | 具有知识意识关注的动态图表示用于组织病理学全幻灯片图像分析 | Jiawen Li, Yuxuan Chen, Hongbo Chu, Qiehe Sun, Tian Guan, Anjia Han, Yonghong He | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Intra-video Positive Pairs in Self-Supervised Learning for Ultrasound | 超声自我监督学习中的视频内正对 | Blake VanBerlo, Alexander Wong, Jesse Hoey, Robert Arntfield | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Fast and Simple Explainability for Point Cloud Networks | 点云网络快速且简单的可解释性 | Meir Yossef Levi, Guy Gilboa | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | CuVLER: Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers | CuVLER:通过详尽的自监督 Transformer 增强无监督对象发现 | Shahaf Arica, Or Rubin, Sapir Gershov, Shlomi Laufer | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Matching Framework | 分解疾病描述以增强病理学检测:多方面视觉语言匹配框架 | Minh Hieu Phan, Yutong Xie, Yuankai Qi, Lingqiao Liu, Liyang Liu, Bowen Zhang, Zhibin Liao, Qi Wu, Minh-Son To, Johan W. Verjans | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Hunting Attributes: Context Prototype-Aware Learning for Weakly Supervised Semantic Segmentation | 狩猎属性:弱监督语义分割的上下文原型感知学习 | Feilong Tang, Zhongxing Xu, Zhaojun Qu, Wei Feng, Xingjian Jiang, Zongyuan Ge | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Mondrian: On-Device High-Performance Video Analytics with Compressive Packed Inference | Mondrian:具有压缩打包推理的设备上高性能视频分析 | Changmin Jeon, Seonjun Kim, Juheon Yi, Youngki Lee | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | MinkUNeXt: Point Cloud-based Large-scale Place Recognition using 3D Sparse Convolutions | MinkUNeXt:使用 3D 稀疏卷积的基于点云的大规模地点识别 | J. J. Cabrera, A. Santo, A. Gil, C. Viegas, L. Payá | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution | PeLK:具有外围卷积的参数高效的大型内核卷积网络 | Honghao Chen, Xiangxiang Chu, Yongjian Ren, Xin Zhao, Kaiqi Huang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | FPT: Fine-grained Prompt Tuning for Parameter and Memory Efficient Fine Tuning in High-resolution Medical Image Classification | FPT:用于高分辨率医学图像分类中参数和内存高效微调的细粒度提示调整 | Yijin Huang, Pujin Cheng, Roger Tam, Xiaoying Tang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | An Active Contour Model Driven By the Hybrid Signed Pressure Function | 混合符号压力函数驱动的主动轮廓模型 | Jing Zhao | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Exploring Challenges in Deep Learning of Single-Station Ground Motion Records | 探索单站地面运动记录深度学习的挑战 | Ümit Mert Çağlar, Baris Yilmaz, Melek Türkmen, Erdem Akagündüz, Salih Tileylioglu | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | RSBuilding: Towards General Remote Sensing Image Building Extraction and Change Detection with Foundation Model | RSBuilding:利用基础模型实现通用遥感图像建筑物提取和变化检测 | Mingze Wang, Keyan Chen, Lili Su, Cilin Yan, Sheng Xu, Haotian Zhang, Pengcheng Yuan, Xiaolong Jiang, Baochang Zhang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | A Survey of Vision Transformers in Autonomous Driving: Current Trends and Future Directions | 自动驾驶中视觉转换器的调查:当前趋势和未来方向 | Quoc-Vinh Lai-Dang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Open-World Semantic Segmentation Including Class Similarity | 包括类相似性的开放世界语义分割 | Matteo Sodano, Federico Magistri, Lucas Nunes, Jens Behley, Cyrill Stachniss | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss | 通过伪图像标签和边缘损失进行开放词汇场景文本识别 | Xuhua Ren, Hengcan Shi, Jin Li | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Spatiotemporal Representation Learning for Short and Long Medical Image Time Series | 短和长医学图像时间序列的时空表示学习 | Chengzhi Shen, Martin J. Menten, Hrvoje Bogunović, Ursula Schmidt-Erfurth, Hendrik Scholl, Sobha Sivaprasad, Andrew Lotery, Daniel Rueckert, Paul Hager, Robbie Holland | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | MoAI: Mixture of All Intelligence for Large Language and Vision Models | MoAI:大型语言和视觉模型的所有智能的混合 | Byung-Kwan Lee, Beomchan Park, Chae Won Kim, Yong Man Ro | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes | 3D 密集字幕的综合综述:定位和描述 3D 场景中的对象 | Ting Yu, Xiaojun Lin, Shuhui Wang, Weiguo Sheng, Qingming Huang, Jun Yu | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Backdoor Attack with Mode Mixture Latent Modification | 模式混合潜在修改的后门攻击 | Hongwei Zhang, Xiaoyin Xu, Dongsheng An, Xianfeng Gu, Min Zhang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | JSTR: Joint Spatio-Temporal Reasoning for Event-based Moving Object Detection | JSTR:基于事件的移动物体检测的联合时空推理 | Hanyu Zhou, Zhiwei Shi, Hao Dong, Shihan Peng, Yi Chang, Luxin Yan | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Input Data Adaptive Learning (IDAL) for Sub-acute Ischemic Stroke Lesion Segmentation | 用于亚急性缺血性中风病变分割的输入数据自适应学习 (IDAL) | Michael Götz, Christian Weber, Christoph Kolb, Klaus Maier-Hein | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios | 从食堂饭菜到日常膳食:将食物识别推广到更实际的场景 | Guoshan Liu, Yang Jiao, Jingjing Chen, Bin Zhu, Yu-Gang Jiang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | BID: Boundary-Interior Decoding for Unsupervised Temporal Action Localization Pre-Trainin | BID:无监督时间动作定位预训练的边界内部解码 | Qihang Fang, Chengcheng Tang, Shugao Ma, Yanchao Yang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Customizable Avatars with Dynamic Facial Action Coded Expressions (CADyFACE) for Improved User Engagement | 具有动态面部动作编码表达式 (CADyFACE) 的可定制化身,可提高用户参与度 | Megan A. Witherow, Crystal Butler, Winston J. Shields, Furkan Ilgin, Norou Diawara, Janice Keener, John W. Harrington, Khan M. Iftekharuddin | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Advancements in Continuous Glucose Monitoring: Integrating Deep Learning and ECG Signal | 连续血糖监测的进展:深度学习和心电图信号的集成 | MohammadReza Hosseinzadehketilateh, Banafsheh Adami, Nima Karimian | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Rediscovering BCE Loss for Uniform Classification | 重新发现统一分类的 BCE 损失 | Qiufu Li, Xi Jia, Jiancan Zhou, Linlin Shen, Jinming Duan | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | MENTOR: Multilingual tExt detectioN TOward leaRning by analogy | 导师:通过类比学习进行多语言文本检测 | Hsin-Ju Lin, Tsu-Chun Chung, Ching-Chun Hsiao, Pin-Yu Chen, Wei-Chen Chiu, Ching-Chun Huang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction | 通过两步共形预测的自适应边界框不确定性 | Alexander Timans, Christoph-Nikolas Straehle, Kaspar Sakmann, Eric Nalisnick | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Towards Zero-shot Human-Object Interaction Detection via Vision-Language Integration | 通过视觉语言集成实现零样本人机交互检测 | Weiying Xue, Qi Liu, Qiwei Xiong, Yuxiao Wang, Zhenao Wei, Xiaofen Xing, Xiangmin Xu | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Monocular Microscope to CT Registration using Pose Estimation of the Incus for Augmented Reality Cochlear Implant Surgery | 单目显微镜到 CT 配准,使用砧骨姿势估计进行增强现实人工耳蜗植入手术 | Yike Zhang, Eduardo Davalos, Dingjie Su, Ange Lou, Jack H. Noble | arxiv.org/pdf/2403.07… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving | 自动驾驶的单视图和多视图深度自适应融合 | JunDa Cheng, Wei Yin, Kaixuan Wang, Xiaozhi Chen, Shijie Wang, Xin Yang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | SGE: Structured Light System Based on Gray Code with an Event Camera | SGE:基于格雷码和事件相机的结构光系统 | Xingyu Lu, Lei Sun, Diyang Gu, Zhijie Xu, Kaiwei Wang | arxiv.org/pdf/2403.07… | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings | Synth$^2$:通过合成字幕和图像嵌入增强视觉语言模型 | Sahand Sharifzadeh, Christos Kaplanis, Shreya Pathak, Dharshan Kumaran, Anastasija Ilic, Jovana Mitrovic, Charles Blundell, Andrea Banino | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning | NavCoT:通过学习解缠推理促进基于 LLM 的视觉和语言导航 | Bingqian Lin, Yunshuang Nie, Ziming Wei, Jiaqi Chen, Shikui Ma, Jianhua Han, Hang Xu, Xiaojun Chang, Xiaodan Liang | arxiv.org/pdf/2403.07… | link |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | When Eye-Tracking Meets Machine Learning: A Systematic Review on Applications in Medical Image Analysis | 当眼动追踪遇到机器学习:医学图像分析应用的系统回顾 | Sahar Moradizeyveh, Mehnaz Tabassum, Sidong Liu, Robert Ahadizad Newport, Amin Beheshti, Antonio Di Ieva | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Masked AutoDecoder is Effective Multi-Task Vision Generalist | Masked AutoDecoder 是高效的多任务视觉通才 | Han Qiu, Jiaxing Huang, Peng Gao, Lewei Lu, Xiaoqin Zhang, Shijian Lu | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Genuine Knowledge from Practice: Diffusion Test-Time Adaptation for Video Adverse Weather Removal | 实践中的真知:视频恶劣天气去除的扩散测试时间适应 | Yijun Yang, Hongtao Wu, Angelica I. Aviles-Rivero, Yulun Zhang, Jing Qin, Lei Zhu | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Smartphone region-wise image indoor localization using deep learning for indoor tourist attraction | 使用深度学习对室内旅游景点进行智能手机区域图像室内定位 | Gabriel Toshio Hirokawa Higa, Rodrigo Stuqui Monzani, Jorge Fernando da Silva Cecatto, Maria Fernanda Balestieri Mariano de Souza, Vanessa Aparecida de Moraes Weber, Hemerson Pistori, Edson Takashi Matsubara | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | LaB-GATr: geometric algebra transformers for large biomedical surface and volume meshes | LaB-GATr:用于大型生物医学表面和体积网格的几何代数转换器 | Julian Suk, Baris Imre, Jelmer M. Wolterink | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | ViT-CoMer:具有卷积多尺度特征交互的视觉变压器,用于密集预测 | Chunlong Xia, Xinliang Wang, Feng Lv, Xin Hao, Yifeng Shi | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Learning Correction Errors via Frequency-Self Attention for Blind Image Super-Resolution | 通过频率自注意力学习校正误差以实现盲图像超分辨率 | Haochen Sun, Yan Yuan, Lijuan Su, Haotian Shao | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Gabor-guided transformer for single image deraining | 用于单图像去雨的 Gabor 引导变压器 | Sijin He, Guangfeng Lin | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers | IM-Unpack:使用任意低精度整数进行训练和推理 | Zhanpeng Zeng, Karthikeyan Sankaralingam, Vikas Singh | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Learning Hierarchical Color Guidance for Depth Map Super-Resolution | 学习深度图超分辨率的分层颜色指导 | Runmin Cong, Ronghui Sheng, Hao Wu, Yulan Guo, Yunchao Wei, Wangmeng Zuo, Yao Zhao, Sam Kwong | arxiv.org/pdf/2403.07… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | DexCap: Scalable and Portable Mocap Data Collection System for Dexterous Manipulation | DexCap:可扩展且便携式的 Mocap 数据收集系统,用于灵巧操作 | Chen Wang, Haochen Shi, Weizhuo Wang, Ruohan Zhang, Li Fei-Fei, C. Karen Liu | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Generative deep learning-enabled ultra-large field-of-view lens-free imaging | 支持生成式深度学习的超大视场无镜头成像 | Ronald B. Liu, Zhe Liu, Max G. A. Wolf, Krishna P. Purohit, Gregor Fritz, Yi Feng, Carsten G. Hansen, Pierre O. Bagnaninchi, Xavier Casadevall i Solvas, Yunjie Yang | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM | Motion Mamba:利用分层和双向选择性 SSM 生成高效、长序列的运动 | Zeyu Zhang, Akide Liu, Ian Reid, Richard Hartley, Bohan Zhuang, Hao Tang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | FSC: Few-point Shape Completion | FSC:少点形状完成 | Xianzu Wu, Xianfeng Wu, Tianyu Luan, Yajing Bai, Zhongyuan Lai, Junsong Yuan | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Frequency Decoupling for Motion Magnification via Multi-Level Isomorphic Architecture | 通过多级同构架构进行频率解耦以实现运动放大 | Fei Wang, Dan Guo, Kun Li, Zhun Zhong, Meng Wang | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction | 补充事件流和 RGB 帧以进行手部网格重建 | Jianping Jiang, Xinyu Zhou, Bingxuan Wang, Xiaoming Deng, Chao Xu, Boxin Shi | arxiv.org/pdf/2403.07… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | 12 mJ per Class On-Device Online Few-Shot Class-Incremental Learning | 每节课 12 mJ 设备上在线少样本课程 - 增量学习 | Yoga Esa Wibowo, Cristian Cioflan, Thorir Mar Ingolfsson, Michael Hersche, Leo Zhao, Abbas Rahimi, Luca Benini | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | A Fourier Transform Framework for Domain Adaptation | 用于域适应的傅立叶变换框架 | Le Luo, Bingrong Xu, Qingyong Zhang, Cheng Lian, Jie Luo | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Uncertainty-guided Contrastive Learning for Single Source Domain Generalisation | 用于单源域泛化的不确定性引导对比学习 | Anastasios Arsenos, Dimitrios Kollias, Evangelos Petrongonas, Christos Skliros, Stefanos Kollias | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | NightHaze: Nighttime Image Dehazing via Self-Prior Learning | NightHaze:通过自先学习进行夜间图像去雾 | Beibei Lin, Yeying Jin, Wending Yan, Wei Ye, Yuan Yuan, Robby T. Tan | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | FeTrIL++: Feature Translation for Exemplar-Free Class-Incremental Learning with Hill-Climbing | FeTrIL++:通过爬山实现无示例类增量学习的特征翻译 | Eduard Hogea, Adrian Popescu, Darian Onchis, Grégoire Petit | arxiv.org/pdf/2403.07… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-12 | Uncertainty Quantification with Deep Ensembles for 6D Object Pose Estimation | 用于 6D 物体姿态估计的深度集成的不确定性量化 | Kira Wursthorn, Markus Hillemann, Markus Ulrich | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Robust Synthetic-to-Real Transfer for Stereo Matching | 用于立体匹配的强大的合成到真实传输 | Jiawei Zhang, Jiahe Li, Lei Huang, Xiaohan Yu, Lin Gu, Jin Zheng, Xiao Bai | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | Optimizing Negative Prompts for Enhanced Aesthetics and Fidelity in Text-To-Image Generation | 优化负面提示以增强文本到图像生成的美观性和保真度 | Michael Ogezi, Ning Shi | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Unified Source-Free Domain Adaptation | 统一无源域适配 | Song Tang, Wenxin Su, Mao Ye, Jianwei Zhang, Xiatian Zhu | arxiv.org/pdf/2403.07… | link |
| 2024-03-12 | AACP: Aesthetics assessment of children's paintings based on self-supervised learning | AACP:基于自我监督学习的儿童绘画美学评估 | Shiqi Jiang, Ning Li, Chen Shi, Liping Guo, Changbo Wang, Chenhui Li | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Category-Agnostic Pose Estimation for Point Clouds | 点云的类别无关姿态估计 | Bowen Liu, Wei Liu, Siang Chen, Pengwei Xie, Guijin Wang | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Entropy is not Enough for Test-Time Adaptation: From the Perspective of Disentangled Factors | 熵不足以适应测试时间:从解开因素的角度来看 | Jonghyun Lee, Dahuin Jung, Saehyung Lee, Junsung Park, Juhyeon Shin, Uiwon Hwang, Sungroh Yoon | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | Time-Efficient Light-Field Acquisition Using Coded Aperture and Events | 使用编码孔径和事件进行高效的光场采集 | Shuji Habuchi, Keita Takahashi, Chihiro Tsutake, Toshiaki Fujii, Hajime Nagahara | arxiv.org/pdf/2403.07… | null |
| 2024-03-12 | You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval | 你永远不会独行:用于细粒度图像检索的草图和文本二重奏 | Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song | arxiv.org/pdf/2403.07… | null |