[Share][Daily Update][2024.03.25][CV_arxiv_papers]

[UPDATED!] 2024-03-25 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-25 | Multi-Scale Texture Loss for CT denoising with GANs | 使用 GAN 进行 CT 去噪的多尺度纹理损失 | Francesco Di Feola, Lorenzo Tronchin, Valerio Guarrasi, Paolo Soda | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions | SDXS:具有图像条件的实时一步潜扩散模型 | Yuda Song, Zehao Sun, Xuanwu Yin | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation | SatSynth:通过扩散模型增强图像掩模对以进行空中语义分割 | Aysim Toker, Marvin Eisenberger, Daniel Cremers, Laura Leal-Taixé | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | An Intermediate Fusion ViT Enables Efficient Text-Image Alignment in Diffusion Models | 中间融合 ViT 可在扩散模型中实现高效的文本-图像对齐 | Zizhao Hu, Shaochong Jia, Mohammad Rostami | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Let Real Images be as a Judger, Spotting Fake Images Synthesized with Generative Models | 让真实图像作为评判者,发现生成模型合成的假图像 | Ziyou Liang, Run Wang, Weifeng Liu, Yuyang Zhang, Wenyuan Yang, Lina Wang, Xingkai Wang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework | Make-Your-Anchor:基于扩散的 2D 头像生成框架 | Ziyao Huang, Fan Tang, Yong Zhang, Xiaodong Cun, Juan Cao, Jintao Li, Tong-Yee Lee | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Self-Supervised Learning for Medical Image Data with Anatomy-Oriented Imaging Planes | 具有面向解剖学成像平面的医学图像数据的自监督学习 | Tianwei Zhang, Dong Wei, Mengmeng Zhua, Shi Gu, Yefeng Zheng | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation | 细化文本到图像的生成:实现准确的免训练字形增强图像生成 | Sanyam Lakhanpal, Shivang Chopra, Vinija Jain, Aman Chadha, Man Luo | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Multi-attention Associate Prediction Network for Visual Tracking | 用于视觉跟踪的多注意关联预测网络 | Xinglong Sun, Haijiang Sun, Shan Jiang, Jiacheng Wang, Xilai Wei, Zhonghe Hu | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Residual Dense Swin Transformer for Continuous Depth-Independent Ultrasound Imaging | 用于连续深度无关超声成像的残余密集 Swin 变压器 | Jintong Hu, Hui Che, Zishuo Li, Wenming Yang | arxiv.org/pdf/2403.16… | link |
| 2024-03-25 | FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models | FlashEval:快速准确地评估文本到图像的扩散生成模型 | Lin Zhao, Tianchen Zhao, Zinan Lin, Xuefei Ning, Guohao Dai, Huazhong Yang, Yu Wang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | 3D-EffiViTCaps: 3D Efficient Vision Transformer with Capsule for Medical Image Segmentation | 3D-EffiViTCaps:用于医学图像分割的带胶囊的 3D 高效视觉转换器 | Dongwei Gan, Ming Chang, Juan Chen | arxiv.org/pdf/2403.16… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-25 | Elysium: Exploring Object-level Perception in Videos via MLLM | Elysium:通过 MLLM 探索视频中的对象级感知 | Han Wang, Yanjie Wang, Yongjie Ye, Yuxiang Nie, Can Huang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | CMViM: Contrastive Masked Vim Autoencoder for 3D Multi-modal Representation Learning for AD classification | CMViM:用于 AD 分类的 3D 多模态表示学习的对比屏蔽 Vim 自动编码器 | Guangqian Yang, Kangrui Du, Zhihan Yang, Ye Du, Yongping Zheng, Shujun Wang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | PathoTune: Adapting Visual Foundation Model to Pathological Specialists | PathoTune:使视觉基础模型适应病理专家 | Jiaxuan Lu, Fang Yan, Xiaofan Zhang, Yue Gao, Shaoting Zhang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection | RCBEVDet:鸟瞰图中的雷达相机融合用于 3D 物体检测 | Zhiwei Lin, Zhe Liu, Zhongyu Xia, Xinhao Wang, Yongtao Wang, Shengxiang Qi, Yang Dong, Nan Dong, Le Zhang, Ce Zhu | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Text-IF: Leveraging Semantic Text Guidance for Degradation-Aware and Interactive Image Fusion | Text-IF:利用语义文本指导进行退化感知和交互式图像融合 | Xunpeng Yi, Han Xu, Hao Zhang, Linfeng Tang, Jiayi Ma | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA | 逐步综合:工具、模板和 LLM 作为基于推理的图表 VQA 的数据生成器 | Li Zhuowan, Jasani Bhavan, Tang Peng, Ghadar Shabnam | arxiv.org/pdf/2403.16… | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-25 | Spike-NeRF: Neural Radiance Field Based On Spike Camera | Spike-NeRF:基于 Spike 相机的神经辐射场 | Yijia Guo, Yuanxi Bai, Liwen Hu, Mianzhi Liu, Ziyi Guo, Lei Ma, Tiejun Huang | arxiv.org/pdf/2403.16… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-25 | Distilling Semantic Priors from SAM to Efficient Image Restoration Models | 从 SAM 中提取语义先验,形成高效的图像恢复模型 | Quan Zhang, Xiaoyu Liu, Wei Li, Hanting Chen, Junchao Liu, Jie Hu, Zhiwei Xiong, Chun Yuan, Yunhe Wang | arxiv.org/pdf/2403.16… | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-25 | Assessing the Performance of Deep Learning for Automated Gleason Grading in Prostate Cancer | 评估深度学习在前列腺癌自动格里森分级中的表现 | Dominik Müller, Philip Meyer, Lukas Rentschler, Robin Manz, Daniel Hieber, Jonas Bäcker, Samantha Cramer, Christoph Wengenmayr, Bruno Märkl, Ralf Huss, et al. | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks | DeepGleason:使用深度神经网络对前列腺癌进行自动格里森分级的系统 | Dominik Müller, Philip Meyer, Lukas Rentschler, Robin Manz, Jonas Bäcker, Samantha Cramer, Christoph Wengenmayr, Bruno Märkl, Ralf Huss, Iñaki Soto-Rey, et al. | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Domain Adaptive Detection of MAVs: A Benchmark and Noise Suppression Network | MAV 的域自适应检测:基准和噪声抑制网络 | Yin Zhang, Jinhong Deng, Peidong Liu, Wen Li, Shiyu Zhao | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Clustering Propagation for Universal Medical Image Segmentation | 通用医学图像分割的聚类传播 | Yuhang Ding, Liulei Li, Wenguan Wang, Yi Yang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | AI-Generated Video Detection via Spatio-Temporal Anomaly Learning | 通过时空异常学习进行人工智能生成的视频检测 | Jianfa Bai, Man Lin, Gang Cao | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | EDUE: Expert Disagreement-Guided One-Pass Uncertainty Estimation for Medical Image Segmentation | EDUE:专家分歧引导的医学图像分割一次性不确定性估计 | Kudaibergen Abutalip, Numan Saeed, Ikboljon Sobirov, Vincent Andrearczyk, Adrien Depeursinge, Mohammad Yaqub | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | In the Search for Optimal Multi-view Learning Models for Crop Classification with Global Remote Sensing Data | 利用全球遥感数据寻找作物分类的最佳多视图学习模型 | Francisco Mena, Diego Arenas, Andreas Dengel | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | SegICL: A Universal In-context Learning Framework for Enhanced Segmentation in Medical Imaging | SegICL:用于增强医学成像分割的通用上下文学习框架 | Lingdong Shen, Fangxin Shang, Yehui Yang, Xiaoshuang Huang, Shining Xiang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Open-Set Recognition in the Age of Vision-Language Models | 视觉语言模型时代的开放集识别 | Dimity Miller, Niko Sünderhauf, Alex Kenna, Keita Mason | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Visually Guided Generative Text-Layout Pre-training for Document Intelligence | 文档智能的视觉引导生成文本布局预训练 | Zhiming Mao, Haoli Bai, Lu Hou, Jiansheng Wei, Xin Jiang, Qun Liu, Kam-Fai Wong | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | CT-Bound: Fast Boundary Estimation From Noisy Images Via Hybrid Convolution and Transformer Neural Networks | CT-Bound:通过混合卷积和 Transformer 神经网络从噪声图像中进行快速边界估计 | Wei Xu, Junjie Luo, Qi Guo | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Real-time Neuron Segmentation for Voltage Imaging | 用于电压成像的实时神经元分割 | Yosuke Bando, Ramdas Pillai, Atsushi Kajita, Farhan Abdul Hakeem, Yves Quemener, Hua-an Tseng, Kiryl D. Piatkevich, Changyang Linghu, Xue Han, Edward S. Boyden | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | DOCTR: Disentangled Object-Centric Transformer for Point Scene Understanding | DOCTR:用于点场景理解的以对象为中心的解缠变压器 | Xiaoxuan Yu, Hao Wang, Weiming Li, Qiang Wang, Soonyong Cho, Younghun Sung | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects | 以自我为中心的手部与物体交互的姿势估计的基准和挑战 | Zicong Fan, Takehiko Ohkawa, Linlin Yang, Nie Lin, Zhishan Zhou, Shihao Zhou, Jiajun Liang, Zhong Gao, Xuanyang Zhang, Xue Zhang, et al. | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Enhancing Visual Place Recognition via Fast and Slow Adaptive Biasing in Event Cameras | 通过事件摄像机中的快速和慢速自适应偏置增强视觉位置识别 | Gokul B. Nair, Michael Milford, Tobias Fischer | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | A Survey on Long Video Generation: Challenges, Methods, and Prospects | 长视频生成综述:挑战、方法与前景 | Chengxuan Li, Di Huang, Zeyu Lu, Yang Xiao, Qingqi Pei, Lei Bai | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | ASDF: Assembly State Detection Utilizing Late Fusion by Integrating 6D Pose Estimation | ASDF:通过集成 6D 姿态估计利用后期融合进行装配状态检测 | Hannah Schieber, Shiyu Li, Niklas Corell, Philipp Beckerle, Julian Kreimeier, Daniel Roth | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation | GoodSAM:通过 Segment Anything 模型弥合域和容量差距,实现失真感知全景语义分割 | Weiming Zhang, Yexin Liu, Xu Zheng, Lin Wang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | ChebMixer: Efficient Graph Representation Learning with MLP Mixer | ChebMixer:使用 MLP 混合器进行高效图表示学习 | Xiaoyan Kui, Haonan Yan, Qinsong Li, Liming Chen, Beiji Zou | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Impact of Video Compression Artifacts on Fisheye Camera Visual Perception Tasks | 视频压缩伪影对鱼眼相机视觉感知任务的影响 | Madhumitha Sakthi, Louis Kerofsky, Varun Ravi Kumar, Senthil Yogamani | arxiv.org/pdf/2403.16… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-25 | Enhancing Industrial Transfer Learning with Style Filter: Cost Reduction and Defect-Focus | 通过风格过滤器增强工业转移学习:降低成本和聚焦缺陷 | Chen Li, Ruijie Ma, Xiang Qian, Xiaohao Wang, Xinghui Li | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Elite360D: Towards Efficient 360 Depth Estimation via Semantic- and Distance-Aware Bi-Projection Fusion | Elite360D:通过语义和距离感知双投影融合实现高效 360 度深度估计 | Hao Ai, Lin Wang | arxiv.org/pdf/2403.16… | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-25 | DOrA: 3D Visual Grounding with Order-Aware Referring | DOrA:具有订单感知参考功能的 3D 视觉基础 | Tung-Yu Wu, Sheng-Yu Huang, Yu-Chiang Frank Wang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Dia-LLaMA: Towards Large Language Model-driven CT Report Generation | Dia-LLaMA:迈向大型语言模型驱动的 CT 报告生成 | Zhixuan Chen, Luyang Luo, Yequan Bie, Hao Chen | arxiv.org/pdf/2403.16… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-25 | QKFormer: Hierarchical Spiking Transformer using Q-K Attention | QKFormer:使用 Q-K Attention 的分层尖峰变压器 | Chenlin Zhou, Han Zhang, Zhaokun Zhou, Liutao Yu, Liwei Huang, Xiaopeng Fan, Li Yuan, Zhengyu Ma, Huihui Zhou, Yonghong Tian | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting | VMRNN:集成 Vision Mamba 和 LSTM,实现高效准确的时空预测 | Yujin Tang, Peijie Dong, Zhenheng Tang, Xiaowen Chu, Junwei Liang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | ModeTv2: GPU-accelerated Motion Decomposition Transformer for Pairwise Optimization in Medical Image Registration | ModeTv2:GPU 加速运动分解变压器,用于医学图像配准中的成对优化 | Haiqiao Wang, Zhuoyuan Wang, Dong Ni, Yi Wang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Medical Image Registration and Its Application in Retinal Images: A Review | 医学图像配准及其在视网膜图像中的应用:综述 | Qiushi Nie, Xiaoqing Zhang, Yan Hu, Mingdao Gong, Jiang Liu | arxiv.org/pdf/2403.16… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-25 | Creating a Digital Twin of Spinal Surgery: A Proof of Concept | 创建脊柱手术的数字孪生:概念证明 | Jonas Hein, Frederic Giraud, Lilian Calvet, Alexander Schwarz, Nicola Alessandro Cavalcanti, Sergey Prokudin, Mazda Farshad, Siyu Tang, Marc Pollefeys, Fabio Carrillo, et al. | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | V2X-PC: Vehicle-to-everything Collaborative Perception via Point Cluster | V2X-PC:通过点集群实现车对万物的协同感知 | Si Liu, Zihan Ding, Jiahui Fu, Hongyu Li, Siheng Chen, Shifeng Zhang, Xu Zhou | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | REFRAME: Reflective Surface Real-Time Rendering for Mobile Devices | REFRAME:移动设备的反射表面实时渲染 | Chaojie Ji, Yufeng Li, Yiyi Liao | arxiv.org/pdf/2403.16… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-25 | Camera-aware Label Refinement for Unsupervised Person Re-identification | 用于无人监督人员重新识别的相机感知标签细化 | Pengna Li, Kangyi Wu, Wenli Huang, Sanping Zhou, Jinjun Wang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Unsupervised Template-assisted Point Cloud Shape Correspondence Network | 无监督模板辅助点云形状对应网络 | Jiacheng Deng, Jiahao Lu, Tianzhu Zhang | arxiv.org/pdf/2403.16… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-25 | DPStyler: Dynamic PromptStyler for Source-Free Domain Generalization | DPStyler:用于无源域泛化的动态 PromptStyler | Yunlong Tang, Yuxuan Wan, Lei Qi, Xin Geng | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Synapse: Learning Preferential Concepts from Visual Demonstrations | Synapse:从视觉演示中学习优先概念 | Sadanand Modak, Noah Patton, Isil Dillig, Joydeep Biswas | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | FOOL: Addressing the Downlink Bottleneck in Satellite Computing with Neural Feature Compression | FOOL:利用神经特征压缩解决卫星计算中的下行链路瓶颈 | Alireza Furutanpey, Qiyang Zhang, Philipp Raith, Tobias Pfandzelter, Shangguang Wang, Schahram Dustdar | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Self-Adaptive Reality-Guided Diffusion for Artifact-Free Super-Resolution | 自适应现实引导扩散,实现无伪影超分辨率 | Qingping Zheng, Ling Zheng, Yuanfan Guo, Ying Li, Songcen Xu, Jiankang Deng, Hang Xu | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Calibrating Bayesian UNet++ for Sub-Seasonal Forecasting | 校准贝叶斯 UNet++ 以进行次季节预测 | Busra Asan, Abdullah Akgul, Alper Unal, Melih Kandemir, Gozde Unal | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Revealing Vulnerabilities of Neural Networks in Parameter Learning and Defense Against Explanation-Aware Backdoors | 揭示神经网络在参数学习和防御解释感知后门方面的漏洞 | Md Abdul Kadir, GowthamKrishna Addluri, Daniel Sonntag | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions | 如果 CLIP 会说话:通过首选概念描述理解视觉语言模型表示 | Reza Esfandiarpoor, Cristina Menghini, Stephen H. Bach | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Producing and Leveraging Online Map Uncertainty in Trajectory Prediction | 轨迹预测中在线地图不确定性的产生和利用 | Xunjiang Gu, Guanyu Song, Igor Gilitschenski, Marco Pavone, Boris Ivanovic | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Ensemble Adversarial Defense via Integration of Multiple Dispersed Low Curvature Models | 通过集成多个分散的低曲率模型进行整体对抗防御 | Kaikang Zhao, Xi Chen, Wei Huang, Liuxin Ding, Xianglong Kong, Fan Zhang | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | Generating Potent Poisons and Backdoors from Scratch with Guided Diffusion | 通过引导扩散从头开始生成强效毒药和后门 | Hossein Souri, Arpit Bansal, Hamid Kazemi, Liam Fowl, Aniruddha Saha, Jonas Geiping, Andrew Gordon Wilson, Rama Chellappa, Tom Goldstein, Micah Goldblum | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | RSTAR: Rotational Streak Artifact Reduction in 4D CBCT using Separable and Circular Convolutions | RSTAR:使用可分离和循环卷积减少 4D CBCT 中的旋转条纹伪影 | Ziheng Deng, Hua Chen, Haibo Hu, Zhiyong Xu, Tianling Lyu, Yan Xi, Yang Chen, Jun Zhao | arxiv.org/pdf/2403.16… | null |
| 2024-03-25 | MEDDAP: Medical Dataset Enhancement via Diversified Augmentation Pipeline | MEDDAP:通过多样化的增强管道增强医疗数据集 | Yasamin Medghalchi, Niloufar Zakariaei, Arman Rahmim, Ilker Hacihaliloglu | arxiv.org/pdf/2403.16… | null |