[Share][Daily Update][2024.03.17][CV_arxiv_papers]

[UPDATED!] 2024-03-17 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | StainDiffuser: MultiTask Dual Diffusion Model for Virtual Staining | StainDiffuser:用于虚拟染色的多任务双扩散模型 | Tushar Kataria, Beatrice Knudsen, Shireen Y. Elhabian | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Enhancing Bandwidth Efficiency for Video Motion Transfer Applications using Deep Learning Based Keypoint Prediction | 使用基于深度学习的关键点预测提高视频运动传输应用的带宽效率 | Xue Bai, Tasmiah Haque, Sumit Mohan, Yuliang Cai, Byungheon Jeong, Adam Halasz, Srinjoy Das | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Advanced Knowledge Extraction of Physical Design Drawings, Translation and conversion to CAD formats using Deep Learning | 使用深度学习提取物理设计图纸的高级知识、翻译并转换为 CAD 格式 | Jesher Joshua M, Ragav V, Syed Ibrahim S P | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Fast Personalized Text-to-Image Syntheses With Attention Injection | 通过注意力注入进行快速个性化文本到图像合成 | Yuxuan Zhang, Yiren Song, Jinpeng Yu, Han Pan, Zhongliang Jing | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Stylized Face Sketch Extraction via Generative Prior with Limited Data | 通过有限数据的生成先验提取程式化脸部草图 | Kwan Yun, Kwanggyoon Seo, Chang Wook Seo, Soyeon Yoon, Seongcheol Kim, Soohyun Ji, Amirsaman Ashtari, Junyong Noh | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | THOR: Text to Human-Object Interaction Diffusion via Relation Intervention | THOR:通过关系干预将文本传播到人机交互 | Qianyang Wu, Ye Shi, Xiaoshui Huang, Jingyi Yu, Lan Xu, Jingya Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation | MaskDiffusion:利用预训练的扩散模型进行语义分割 | Yasufumi Kawano, Yoshimitsu Aoki | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Artifact Feature Purification for Cross-domain Detection of AI-generated Images | 用于 AI 生成图像跨域检测的伪影特征纯化 | Zheling Meng, Bo Peng, Jing Dong, Tieniu Tan | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | CGI-DM: Digital Copyright Authentication for Diffusion Models via Contrasting Gradient Inversion | CGI-DM:通过对比梯度反转进行扩散模型的数字版权认证 | Xiaoyu Wu, Yang Hua, Chumeng Liang, Jiaru Zhang, Hao Wang, Tao Song, Haibing Guan | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model | 基于扩散模型的选择性沙漏映射通用图像恢复 | Dian Zheng, Xiao-Ming Wu, Shuzhou Yang, Jian Zhang, Jian-Fang Hu, Wei-Shi Zheng | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Omni-Recon: Towards General-Purpose Neural Radiance Fields for Versatile 3D Applications | Omni-Recon:面向多功能 3D 应用的通用神经辐射场 | Yonggan Fu, Huaizhi Qu, Zhifan Ye, Chaojian Li, Kevin Zhao, Yingyan Lin | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | 3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models | 使用生成模型利用合成数据在野外进行 3D 人体重建 | Yongtao Ge, Wenjia Wang, Yongfan Chen, Hao Chen, Chunhua Shen | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models | 源提示解缠结反演,通过扩散模型提高图像可编辑性 | Ruibin Li, Ruihuang Li, Song Guo, Lei Zhang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Adaptive Semantic-Enhanced Denoising Diffusion Probabilistic Model for Remote Sensing Image Super-Resolution | 遥感图像超分辨率自适应语义增强去噪扩散概率模型 | Jialu Sui, Xianping Ma, Xiaokang Zhang, Man-On Pun | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Zippo: Zipping Color and Transparency Distributions into a Single Diffusion Model | Zippo:将颜色和透明度分布压缩到单个扩散模型中 | Kangyang Xie, Binbin Yang, Hao Chen, Meng Wang, Cheng Zou, Hui Xue, Ming Yang, Chunhua Shen | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention | 通过交叉注意力揭示和减轻文本到图像扩散模型中的记忆 | Jie Ren, Yaxin Li, Shenglai Zen, Han Xu, Lingjuan Lyu, Yue Xing, Jiliang Tang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Endora: Video Generation Models as Endoscopy Simulators | Endora:作为内窥镜模拟器的视频生成模型 | Chenxin Li, Hengyu Liu, Yifan Liu, Brandon Y. Feng, Wuyang Li, Xinyu Liu, Zhen Chen, Jing Shao, Yixuan Yuan | arxiv.org/pdf/2403.11… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | Reconstruct before Query: Continual Missing Modality Learning with Decomposed Prompt Collaboration | 查询前重建:通过分解的即时协作进行持续缺失的模态学习 | Shu Zhao, Xiaohan Zou, Tan Yu, Huijuan Xu | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Few-Shot VQA with Frozen LLMs: A Tale of Two Approaches | 冻结法学硕士的少样本 VQA:两种方法的故事 | Igor Sterner, Weizhe Lin, Jinghong Chen, Bill Byrne | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Bilateral Propagation Network for Depth Completion | 用于深度补全的双边传播网络 | Jie Tang, Fei-Peng Tian, Boshi An, Jian Li, Ping Tan | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | FORCE: Dataset and Method for Intuitive Physics Guided Human-object Interaction | FORCE:直观物理引导人机交互的数据集和方法 | Xiaohan Zhang, Bharat Lal Bhatnagar, Sebastian Starke, Ilya Petrov, Vladimir Guzov, Helisa Dhamo, Eduardo Pérez-Pellitero, Gerard Pons-Moll | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | PhD: A Prompted Visual Hallucination Evaluation Dataset | 博士:提示视幻觉评估数据集 | Jiazhen Liu, Yuhan Fu, Ruobing Xie, Runquan Xie, Xingwu Sun, Fengzong Lian, Zhanhui Kang, Xirong Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks | m&m's:评估多步骤多模式任务工具使用的基准 | Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Customizing Visual-Language Foundation Models for Multi-modal Anomaly Detection and Reasoning | 定制用于多模态异常检测和推理的视觉语言基础模型 | Xiaohao Xu, Yunkang Cao, Yongqi Chen, Weiming Shen, Xiaonan Huang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | From Pixels to Predictions: Spectrogram and Vision Transformer for Better Time Series Forecasting | 从像素到预测:频谱图和视觉转换器可实现更好的时间序列预测 | Zhen Zeng, Rachneet Kaur, Suchetha Siddagangappa, Tucker Balch, Manuela Veloso | arxiv.org/pdf/2403.11… | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | Creating Seamless 3D Maps Using Radiance Fields | 使用辐射场创建无缝 3D 地图 | Sai Tarun Sathyan, Thomas B. Kinsman | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | SpikeNeRF: Learning Neural Radiance Fields from Continuous Spike Stream | SpikeNeRF:从连续尖峰流中学习神经辐射场 | Lin Zhu, Kangmin Jia, Yifan Zhao, Yunshan Qi, Lizhi Wang, Hua Huang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Recent Advances in 3D Gaussian Splatting | 3D 高斯分布的最新进展 | Tong Wu, Yu-Jie Yuan, Ling-Xiao Zhang, Jie Yang, Yan-Pei Cao, Ling-Qi Yan, Lin Gao | arxiv.org/pdf/2403.11… | null |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | 3DGS-ReLoc: 3D Gaussian Splatting for Map Representation and Visual ReLocalization | 3DGS-ReLoc:用于地图表示和视觉重定位的 3D 高斯泼溅 | Peng Jiang, Gaurav Pandey, Srikanth Saripalli | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | GeoGaussian: Geometry-aware Gaussian Splatting for Scene Rendering | GeoGaussian:用于场景渲染的几何感知高斯泼溅 | Yanyan Li, Chenyu Lyu, Yan Di, Guangyao Zhai, Gim Hee Lee, Federico Tombari | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | BrightDreamer: Generic 3D Gaussian Generative Framework for Fast Text-to-3D Synthesis | BrightDreamer:用于快速文本到 3D 合成的通用 3D 高斯生成框架 | Lutao Jiang, Lin Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Compact 3D Gaussian Splatting For Dense Visual SLAM | 用于密集视觉 SLAM 的紧凑 3D 高斯泼溅 | Tianchen Deng, Yaohui Chen, Leyan Zhang, Jianfei Yang, Shenghai Yuan, Danwei Wang, Weidong Chen | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration | 解析扩散:通过解析积分进行抗锯齿 3D 高斯扩散 | Zhihao Liang, Qi Zhang, Wenbo Hu, Ying Feng, Lei Zhu, Kui Jia | arxiv.org/pdf/2403.11… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | Self-Supervised Quantization-Aware Knowledge Distillation | 自监督量化感知知识蒸馏 | Kaiqi Zhao, Ming Zhao | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Graph Expansion in Pruned Recurrent Neural Network Layers Preserve Performance | 修剪后的循环神经网络层中的图扩展可保持性能 | Suryam Arnav Kalra, Arindam Biswas, Pabitra Mitra, Biswajit Basu | arxiv.org/pdf/2403.11… | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | V2X-DGW: Domain Generalization for Multi-agent Perception under Adverse Weather Conditions | V2X-DGW:恶劣天气条件下多智能体感知的领域泛化 | Baolu Li, Jinlong Li, Xinyu Liu, Runsheng Xu, Zhengzhong Tu, Jiacheng Guo, Xiaopeng Li, Hongkai Yu | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Ensembling and Test Augmentation for Covid-19 Detection and Covid-19 Domain Adaptation from 3D CT-Scans | 通过 3D CT 扫描进行 Covid-19 检测和 Covid-19 域适应的集成和测试增强 | Fares Bougourzi, Feryal Windal Moula, Halim Benhabiles, Fadi Dornaika, Abdelmalik Taleb-Ahmed | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Domain-Guided Masked Autoencoders for Unique Player Identification | 用于唯一玩家识别的域引导屏蔽自动编码器 | Bavesh Balaji, Jerrin Bright, Sirisha Rambhatla, Yuhao Chen, Alexander Wong, John Zelek, David A Clausi | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | NeoNeXt: Novel neural network operator and architecture based on the patch-wise matrix multiplications | NeoNeXt:基于分片矩阵乘法的新型神经网络算子和架构 | Vladimir Korviakov, Denis Koposov | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | YOLOv9 for Fracture Detection in Pediatric Wrist Trauma X-ray Images | YOLOv9 用于儿童手腕创伤 X 射线图像中的骨折检测 | Chun-Tse Chien, Rui-Yang Ju, Kuang-Yi Chou, Jen-Shiun Chiang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Simple 2D Convolutional Neural Network-based Approach for COVID-19 Detection | 基于简单 2D 卷积神经网络的 COVID-19 检测方法 | Chih-Chung Hsu, Chia-Ming Lee, Yang Fan Chiang, Yi-Shiuan Chou, Chih-Yu Jiang, Shen-Chieh Tai, Chi-Han Tsai | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Concatenate, Fine-tuning, Re-training: A SAM-enabled Framework for Semi-supervised 3D Medical Image Segmentation | 连接、微调、再训练:支持 SAM 的半监督 3D 医学图像分割框架 | Shumeng Li, Lei Qi, Qian Yu, Jing Huo, Yinghuan Shi, Yang Gao | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations | CPA-Enhancer:用于未知退化下目标检测的思想链提示自适应增强器 | Yuwei Zhang, Yan Wu, Yanming Liu, Xinyue Peng | arxiv.org/pdf/2403.11… | link |
| 2024-03-17 | RCdpia: A Renal Carcinoma Digital Pathology Image Annotation dataset based on pathologists | RCdpia:基于病理学家的肾癌数字病理学图像注释数据集 | Qingrong Sun, Weixiang Zhong, Jie Zhou, Chong Lai, Xiaodong Teng, Maode Lai | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | TAG: Guidance-free Open-Vocabulary Semantic Segmentation | 标签:无指导开放词汇语义分割 | Yasufumi Kawano, Yoshimitsu Aoki | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | NetTrack: Tracking Highly Dynamic Objects with a Net | NetTrack:使用网络跟踪高度动态的物体 | Guangze Zheng, Shijie Lin, Haobo Zuo, Changhong Fu, Jia Pan | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation | DuPL:具有值得信赖的渐进式学习的双重学生,用于鲁棒的弱监督语义分割 | Yuanchen Wu, Xichen Ye, Kequan Yang, Jide Li, Xiaoqiang Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | A lightweight deep learning pipeline with DRDA-Net and MobileNet for breast cancer classification | 使用 DRDA-Net 和 MobileNet 进行乳腺癌分类的轻量级深度学习管道 | Mahdie Ahmadi, Nader Karimi, Shadrokh Samavi | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | GRA: Detecting Oriented Objects through Group-wise Rotating and Attention | GRA:通过分组旋转和注意力检测定向物体 | Jiangshan Wang, Yifan Pu, Yizeng Han, Jiayi Guo, Yiru Wang, Xiu Li, Gao Huang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | LERENet: Eliminating Intra-class Differences for Metal Surface Defect Few-shot Semantic Segmentation | LERENet:消除金属表面缺陷的类内差异少样本语义分割 | Hanze Ding, Zhangkai Wu, Jiyan Zhang, Ming Ping, Yanfang Liu | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Local-consistent Transformation Learning for Rotation-invariant Point Cloud Analysis | 用于旋转不变点云分析的局部一致变换学习 | Yiyang Chen, Lunhao Duan, Shanshan Zhao, Changxing Ding, Dacheng Tao | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Self-supervised co-salient object detection via feature correspondence at multiple scales | 通过多尺度特征对应进行自监督共显着目标检测 | Souradeep Chakraborty, Dimitris Samaras | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Hierarchical Generative Network for Face Morphing Attacks | 用于面部变形攻击的分层生成网络 | Zuyuan He, Zongyong Deng, Qiaoyun He, Qijun Zhao | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Multitask frame-level learning for few-shot sound event detection | 用于少样本声音事件检测的多任务帧级学习 | Liang Zou, Genwei Yan, Ruoyu Wang, Jun Du, Meng Lei, Tian Gao, Xin Fang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Audio-Visual Segmentation via Unlabeled Frame Exploitation | 通过未标记帧利用进行视听分割 | Jinxiang Liu, Yikun Liu, Fei Zhang, Chen Ju, Ya Zhang, Yanfeng Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Tokensome: Towards a Genetic Vision-Language GPT for Explainable and Cognitive Karyotyping | Tokensome:迈向可解释和认知核型分析的遗传视觉语言 GPT | Haoxi Zhang, Xinxu Zhang, Yuanxin Lin, Maiqi Wang, Yi Lai, Yu Wang, Linfeng Yu, Yufeng Xu, Ran Cheng, Edward Szczerbicki | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Intelligent Railroad Grade Crossing: Leveraging Semantic Segmentation and Object Detection for Enhanced Safety | 智能铁路平交道口:利用语义分割和对象检测来增强安全性 | Al Amin, Deo Chimba, Kamrul Hasan, Emmanuel Samson | arxiv.org/pdf/2403.11… | null |

GNN

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | DynamicGlue: Epipolar and Time-Informed Data Association in Dynamic Environments using Graph Neural Networks | DynamicGlue:使用图神经网络在动态环境中进行极线和时间通知数据关联 | Theresa Huber, Simon Schaefer, Stefan Leutenegger | arxiv.org/pdf/2403.11… | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | Large Language Models Powered Context-aware Motion Prediction | 大型语言模型支持上下文感知运动预测 | Xiaoji Zheng, Lixiu Wu, Zhijie Yan, Yuanrong Tang, Hao Zhao, Chen Zhong, Bokui Chen, Jiangtao Gong | arxiv.org/pdf/2403.11… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | Unifying Feature and Cost Aggregation with Transformers for Semantic and Visual Correspondence | 使用 Transformers 统一功能和成本聚合,以实现语义和视觉对应 | Sunghwan Hong, Seokju Cho, Seungryong Kim, Stephen Lin | arxiv.org/pdf/2403.11… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation | 3D 人体姿态估计中域泛化的双增强器框架 | Qucheng Peng, Ce Zheng, Chen Chen | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | STAIR: Semantic-Targeted Active Implicit Reconstruction | STAIR:以语义为目标的主动隐式重建 | Liren Jin, Haofei Kuang, Yue Pan, Cyrill Stachniss, Marija Popović | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Neural Markov Random Field for Stereo Matching | 用于立体匹配的神经马尔可夫随机场 | Tongfan Guan, Chen Wang, Yun-Hui Liu | arxiv.org/pdf/2403.11… | link |
| 2024-03-17 | Boosting Semi-Supervised Temporal Action Localization by Learning from Non-Target Classes | 通过向非目标类学习来促进半监督时间动作本地化 | Kun Xia, Le Wang, Sanping Zhou, Gang Hua, Wei Tang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Training A Small Emotional Vision Language Model for Visual Art Comprehension | 训练用于视觉艺术理解的小型情感视觉语言模型 | Jing Zhang, Liang Zheng, Dan Guo, Meng Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | A Versatile Framework for Multi-scene Person Re-identification | 多场景行人重识别的多功能框架 | Wei-Shi Zheng, Junkai Yan, Yi-Xing Peng | arxiv.org/pdf/2403.11… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | Uncertainty-Aware Pseudo-Label Filtering for Source-Free Unsupervised Domain Adaptation | 用于无源无监督域适应的不确定性感知伪标签过滤 | Xi Chen, Haosen Yang, Huicong Zhang, Hongxun Yao, Xiatian Zhu | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Universal Semi-Supervised Domain Adaptation by Mitigating Common-Class Bias | 通过减轻共同类偏差进行通用半监督域适应 | Wenyu Zhang, Qingmu Liu, Felix Ong Wei Cong, Mohamed Ragab, Chuan-Sheng Foo | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Controllable Relation Disentanglement for Few-Shot Class-Incremental Learning | 小样本类增量学习的可控关系解开 | Yuan Zhou, Richang Hong, Yanrong Guo, Lin Liu, Shijie Hao, Hanwang Zhang | arxiv.org/pdf/2403.11… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-17 | SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant | SQ-LLaVA:大视野语言助手的自我提问 | Guohao Sun, Can Qin, Jiamian Wang, Zeyuan Chen, Ran Xu, Zhiqiang Tao | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Order-One Rolling Shutter Cameras | 一阶卷帘快门相机 | Marvin Anas Hahn, Kathlén Kohn, Orlando Marigliano, Tomas Pajdla | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data | MindEye2:共享受试者模型可通过 1 小时的数据实现 fMRI 转图像 | Paul S. Scotti, Mihir Tripathy, Cesar Kadir Torrico Villanueva, Reese Kneeland, Tong Chen, Ashutosh Narang, Charan Santhirasegaran, Jonathan Xu, Thomas Naselaris, Kenneth A. Norman, et al. | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Self-Supervised Video Desmoking for Laparoscopic Surgery | 腹腔镜手术的自我监督视频除烟 | Renlong Wu, Zhilu Zhang, Shuohao Zhang, Longfei Gou, Haobin Chen, Lei Zhang, Hao Chen, Wangmeng Zuo | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Quality-Aware Image-Text Alignment for Real-World Image Quality Assessment | 用于真实世界图像质量评估的质量感知图像文本对齐 | Lorenzo Agnolucci, Leonardo Galteri, Marco Bertini | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts | 迷失在翻译中?多语言概念的文本到图像模型公平评估的翻译错误和挑战 | Michael Saxon, Yiran Luo, Sharon Levy, Chitta Baral, Yezhou Yang, William Yang Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | OSTAF: A One-Shot Tuning Method for Improved Attribute-Focused T2I Personalization | OSTAF:一种用于改进以属性为中心的 T2I 个性化的一次性调整方法 | Ye Wang, Zili Yi, Rui Ma | arxiv.org/pdf/2403.11… | null |