[Share][Daily Update][2024.02.14][CV_arxiv_papers]


[UPDATED!] 2024-02-14 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-14 | Magic-Me: Identity-Specific Video Customized Diffusion | Magic-Me:针对特定身份的视频定制扩散 | Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, Xue-She Wang, Xiuyu Li, Huanrui Yang, Zhen Dong, Kurt Keutzer, Jiashi Feng | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Synthesizing Knowledge-enhanced Features for Real-world Zero-shot Food Detection | 综合知识增强特征以实现现实世界的零样本食品检测 | Pengfei Zhou, Weiqing Min, Jiajun Song, Yang Zhang, Shuqiang Jiang | arxiv.org/pdf/2402.09… | link |
| 2024-02-14 | Semi-Supervised Diffusion Model for Brain Age Prediction | 用于脑年龄预测的半监督扩散模型 | Ayodeji Ijishakin, Sophie Martin, Florence Townend, Federica Agosta, Edoardo Gioele Spinelli, Silvia Basaia, Paride Schito, Yuri Falzone, Massimo Filippi, James Cole, et al. | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | DestripeCycleGAN: Stripe Simulation CycleGAN for Unsupervised Infrared Image Destriping | DestripeCycleGAN:用于无监督红外图像去条纹的条纹模拟 CycleGAN | Shiqi Yang, Hanlin Qin, Shuai Yuan, Xiang Yan, Hossein Rahmani | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Towards Realistic Landmark-Guided Facial Video Inpainting Based on GANs | 基于 GAN 的逼真地标引导面部视频修复 | Fatemeh Ghorbani Lohesara, Karen Egiazarian, Sebastian Knorr | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Extreme Video Compression with Pre-trained Diffusion Models | 使用预先训练的扩散模型进行极限视频压缩 | Bohan Li, Yiming Liu, Xueyan Niu, Bo Bai, Lei Deng, Deniz Gündüz | arxiv.org/pdf/2402.08… | link |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-14 | MultiMedEval: A Benchmark and a Toolkit for Evaluating Medical Vision-Language Models | MultiMedEval:评估医学视觉语言模型的基准和工具包 | Corentin Royer, Bjoern Menze, Anjany Sekuboyina | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM | OmniMedVQA:医疗 LVLM 的新型大规模综合评估基准 | Yutao Hu, Tianbin Li, Quanfeng Lu, Wenqi Shao, Junjun He, Yu Qiao, Ping Luo | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Headset: Human emotion awareness under partial occlusions multimodal dataset | Headset:部分遮挡下的人类情感感知多模态数据集 | Fatemeh Ghorbani Lohesara, Davi Rabbouni Freitas, Christine Guillemot, Karen Eguiazarian, Sebastian Knorr | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Comment-aided Video-Language Alignment via Contrastive Pre-training for Short-form Video Humor Detection | 通过对比预训练进行短视频幽默检测的评论辅助视频语言对齐 | Yang Liu, Tongfei Shen, Dong Zhang, Qingying Sun, Shoushan Li, Guodong Zhou | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing? | 文本到图像模型能否辅助视觉模态缺失的视觉识别的多模态学习? | Tiantian Feng, Daniel Yang, Digbalay Bose, Shrikanth Narayanan | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Multi-modality transrectal ultrasound video classification for identification of clinically significant prostate cancer | 多模态经直肠超声视频分类用于识别有临床意义的前列腺癌 | Hong Wu, Juan Fu, Hongsheng Ye, Yuming Zhong, Xuebin Zhou, Jianhua Zhou, Yi Wang | arxiv.org/pdf/2402.08… | link |
| 2024-02-14 | Pretraining Vision-Language Model for Difference Visual Question Answering in Longitudinal Chest X-rays | 纵向胸部 X 射线差异视觉问答的预训练视觉语言模型 | Yeongjae Cho, Taehee Kim, Heejun Shin, Sungzoon Cho, Dongmyung Shin | arxiv.org/pdf/2402.08… | null |
| 2024-02-14 | Interpretable Measures of Conceptual Similarity by Complexity-Constrained Descriptive Auto-Encoding | 通过复杂性约束的描述性自动编码来测量概念相似性的可解释性 | Alessandro Achille, Greg Ver Steeg, Tian Yu Liu, Matthew Trager, Carson Klingenberg, Stefano Soatto | arxiv.org/pdf/2402.08… | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-14 | PC-NeRF: Parent-Child Neural Radiance Fields Using Sparse LiDAR Frames in Autonomous Driving Environments | PC-NeRF:在自动驾驶环境中使用稀疏 LiDAR 帧的亲子神经辐射场 | Xiuzhong Hu, Guangming Xiong, Zheng Zang, Peng Jia, Yuxuan Han, Junyi Ma | arxiv.org/pdf/2402.09… | link |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-14 | Pruning Sparse Tensor Neural Networks Enables Deep Learning for 3D Ultrasound Localization Microscopy | 修剪稀疏张量神经网络支持 3D 超声定位显微镜的深度学习 | Brice Rauby, Paul Xing, Jonathan Porée, Maxime Gasse, Jean Provost | arxiv.org/pdf/2402.09… | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-14 | Deep Rib Fracture Instance Segmentation and Classification from CT on the RibFrac Challenge | RibFrac 挑战赛中 CT 的深部肋骨骨折实例分割和分类 | Jiancheng Yang, Rui Shi, Liang Jin, Xiaoyang Huang, Kaiming Kuang, Donglai Wei, Shixuan Gu, Jianying Liu, Pengfei Liu, Zhizhong Chai, et al. | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | YOLOv8-AM: YOLOv8 with Attention Mechanisms for Pediatric Wrist Fracture Detection | YOLOv8-AM:带有注意力机制的 YOLOv8,用于儿童手腕骨折检测 | Chun-Tse Chien, Rui-Yang Ju, Kuang-Yi Chou, Chien-Sheng Lin, Jen-Shiun Chiang | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Only My Model On My Data: A Privacy Preserving Approach Protecting one Model and Deceiving Unauthorized Black-Box Models | 只有我的模型在我的数据上:一种保护一个模型并欺骗未经授权的黑盒模型的隐私保护方法 | Weiheng Chai, Brian Testa, Huantao Ren, Asif Salekin, Senem Velipasalar | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Few-Shot Object Detection with Sparse Context Transformers | 使用稀疏上下文转换器进行少样本目标检测 | Jie Mei, Mingyuan Jiu, Hichem Sahbi, Xiaoheng Jiang, Mingliang Xu | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Immediate generalisation in humans but a generalisation lag in deep neural networks - evidence for representational divergence? | 在人类中可以立即泛化,但在深度神经网络中泛化滞后:表征分歧的证据? | Lukas S. Huber, Fred W. Mast, Felix A. Wichmann | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | TDViT: Temporal Dilated Video Transformer for Dense Video Tasks | TDViT:用于密集视频任务的时间扩张视频转换器 | Guanxiong Sun, Yang Hua, Guosheng Hu, Neil Robertson | arxiv.org/pdf/2402.09… | link |
| 2024-02-14 | Efficient One-stage Video Object Detection by Exploiting Temporal Consistency | 利用时间一致性的高效单阶段视频目标检测 | Guanxiong Sun, Yang Hua, Guosheng Hu, Neil Robertson | arxiv.org/pdf/2402.09… | link |
| 2024-02-14 | Switch EMA: A Free Lunch for Better Flatness and Sharpness | Switch EMA:更好的平坦度和清晰度的免费午餐 | Siyuan Li, Zicheng Liu, Juanxi Tian, Ge Wang, Zedong Wang, Weiyang Jin, Di Wu, Cheng Tan, Tao Lin, Yang Liu, et al. | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Is my Data in your AI Model? Membership Inference Test with Application to Face Images | 我的数据在你们的人工智能模型中吗?应用于人脸图像的隶属推理测试 | Daniel DeAlcala, Aythami Morales, Gonzalo Mancera, Julian Fierrez, Ruben Tolosana, Javier Ortega-Garcia | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Domain-adaptive and Subgroup-specific Cascaded Temperature Regression for Out-of-distribution Calibration | 用于分布外校准的域自适应和子组特定级联温度回归 | Jiexin Wang, Jiahao Chen, Bing Su | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Crop and Couple: cardiac image segmentation using interlinked specialist networks | Crop and Couple:使用互连的专家网络进行心脏图像分割 | Abbas Khan, Muhammad Asad, Martin Benning, Caroline Roney, Gregory Slabaugh | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Solid Waste Detection in Remote Sensing Images: A Survey | 遥感图像中的固体废物检测:调查 | Piero Fraternali, Luca Morandini, Sergio Luis Herrera González | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Open-Vocabulary Segmentation with Unpaired Mask-Text Supervision | 具有不配对掩码文本监督的开放词汇分割 | Zhaoqing Wang, Xiaobo Xia, Ziye Chen, Xiao He, Yandong Guo, Mingming Gong, Tongliang Liu | arxiv.org/pdf/2402.08… | link |
| 2024-02-14 | Learning-based Bone Quality Classification Method for Spinal Metastasis | 基于学习的脊柱转移骨质量分类方法 | Shiqi Peng, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya Zhang, Yan-Feng Wang, Hui Zhao | arxiv.org/pdf/2402.08… | null |
| 2024-02-14 | Weakly Supervised Segmentation of Vertebral Bodies with Iterative Slice-propagation | 迭代切片传播的椎体弱监督分割 | Shiqi Peng, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya Zhang, Yan-Feng Wang, Hui Zhao | arxiv.org/pdf/2402.08… | null |
| 2024-02-14 | Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation | 用于视频对象分割的深度学习光流的移动对象建议 | Ge Shi, Zhili Yang | arxiv.org/pdf/2402.08… | null |
| 2024-02-14 | TikTokActions: A TikTok-Derived Video Dataset for Human Action Recognition | TikTokActions:用于人类动作识别的 TikTok 衍生视频数据集 | Yang Qian, Yinan Sun, Ali Kargarandehkordi, Onur Cezmi Mutlu, Saimourya Surabhi, Pingyi Chen, Zain Jabbar, Dennis Paul Wall, Peter Washington | arxiv.org/pdf/2402.08… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-14 | Depth-aware Volume Attention for Texture-less Stereo Matching | 用于无纹理立体匹配的深度感知体积注意 | Tong Zhao, Mingyu Ding, Wei Zhan, Masayoshi Tomizuka, Yintao Wei | arxiv.org/pdf/2402.08… | link |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-14 | Less is More: Fewer Interpretable Region via Submodular Subset Selection | 少即是多:通过子模子集选择减少可解释区域 | Ruoyu Chen, Hua Zhang, Siyuan Liang, Jingzhi Li, Xiaochun Cao | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Pyramid Attention Network for Medical Image Registration | 用于医学图像配准的金字塔注意力网络 | Zhuoyuan Wang, Haiqiao Wang, Yi Wang | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | CLIP-MUSED: CLIP-Guided Multi-Subject Visual Neural Information Semantic Decoding | CLIP-MUSED:CLIP引导的多主题视觉神经信息语义解码 | Qiongyi Zhou, Changde Du, Shengpei Wang, Huiguang He | arxiv.org/pdf/2402.08… | null |
| 2024-02-14 | Predictive Temporal Attention on Event-based Video Stream for Energy-efficient Situation Awareness | 基于事件的视频流的预测时间注意力以实现节能态势感知 | Yiming Bu, Jiayang Liu, Qinru Qiu | arxiv.org/pdf/2402.08… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-14 | Registration of Longitudinal Spine CTs for Monitoring Lesion Growth | 用于监测病变生长的纵向脊柱 CT 配准 | Malika Sanhinova, Nazim Haouchine, Steve D. Pieper, William M. Wells III, Tracy A. Balboni, Alexander Spektor, Mai Anh Huynh, Jeffrey P. Guenette, Bryan Czajkowski, Sarah Caplan, et al. | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | DUDF: Differentiable Unsigned Distance Fields with Hyperbolic Scaling | DUDF:具有双曲标度的可微分无符号距离场 | Miguel Fainstein, Viviana Siless, Emmanuel Iarussi | arxiv.org/pdf/2402.08… | null |

Others

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-14 | Weatherproofing Retrieval for Localization with Generative AI and Geometric Consistency | 利用生成式人工智能和几何一致性进行本地化防风雨检索 | Yannis Kalantidis, Mert Bülent Sarıyıldız, Rafael S. Rezende, Philippe Weinzaepfel, Diane Larlus, Gabriela Csurka | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Traj-LIO: A Resilient Multi-LiDAR Multi-IMU State Estimator Through Sparse Gaussian Process | Traj-LIO:通过稀疏高斯过程的弹性多 LiDAR 多 IMU 状态估计器 | Xin Zheng, Jianke Zhu | arxiv.org/pdf/2402.09… | null |
| 2024-02-14 | Generalized Portrait Quality Assessment | 广义肖像质量评估 | Nicolas Chahine, Sira Ferradans, Javier Vazquez-Corral, Jean Ponce | arxiv.org/pdf/2402.09… | link |