[Share][Daily Update][2024.01.23][CV_arxiv_papers]

[UPDATED!] 2024-01-23 (Publish Time)

Classification / Detection / Recognition / Segmentation

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-23 | GALA: Generating Animatable Layered Assets from a Single Scan | GALA:通过单次扫描生成可动画化的分层资源 | Taeksoo Kim, Byungjun Kim, Shunsuke Saito, Hanbyul Joo | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | SegmentAnyBone: A Universal Model that Segments Any Bone at Any Location on MRI | SegmentAnyBone:一种通用模型,可在 MRI 上的任何位置分割任何骨骼 | Hanxue Gu, Roy Colglazier, Haoyu Dong, Jikai Zhang, Yaqian Chen, Zafer Yildiz, Yuwen Chen, Lin Li, Jichen Yang, Jay Willhite, et al. | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Neural deformation fields for template-based reconstruction of cortical surfaces from MRI | 用于基于 MRI 皮质表面模板重建的神经变形场 | Fabian Bongratz, Anne-Marie Rickmann, Christian Wachinger | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Segmentation of tibiofemoral joint tissues from knee MRI using MtRA-Unet and incorporating shape information: Data from the Osteoarthritis Initiative | 使用 MtRA-Unet 对膝 MRI 中的胫股关节组织进行分割并结合形状信息:来自骨关节炎倡议的数据 | Akshay Daydar, Alik Pramanick, Arijit Sur, Subramani Kanagaraj | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Facing the Elephant in the Room: Visual Prompt Tuning or Full Finetuning? | 面对房间里的大象:视觉提示调整还是全面微调? | Cheng Han, Qifan Wang, Yiming Cui, Wenguan Wang, Lifu Huang, Siyuan Qi, Dongfang Liu | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Unlocking the Potential: Multi-task Deep Learning for Spaceborne Quantitative Monitoring of Fugitive Methane Plumes | 释放潜力:用于星载逃逸甲烷羽流定量监测的多任务深度学习 | Guoxin Si, Shiliang Fu, Wei Yao | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Classification of grapevine varieties using UAV hyperspectral imaging | 利用无人机高光谱成像对葡萄品种进行分类 | Alfonso López, Carlos Javier Ogayar, Francisco Ramón Feito, Joaquim João Sousa | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | DatUS^2: Data-driven Unsupervised Semantic Segmentation with Pre-trained Self-supervised Vision Transformer | DatUS^2:数据驱动的无监督语义分割与预训练的自监督视觉 Transformer | Sonal Kumar, Arijit Sur, Rashmi Dutta Baruah | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty | MUSES:用于不确定性驾驶的多传感器语义感知数据集 | Tim Brödermann, David Bruggemann, Christos Sakaridis, Kevin Ta, Odysseas Liagouris, Jason Corkill, Luc Van Gool | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Correlation-Embedded Transformer Tracking: A Single-Branch Framework | 相关嵌入式变压器跟踪:单分支框架 | Fei Xie, Wankou Yang, Chunyu Wang, Lei Chu, Yue Cao, Chao Ma, Wenjun Zeng | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Enhancing Object Detection Performance for Small Objects through Synthetic Data Generation and Proportional Class-Balancing Technique: A Comparative Study in Industrial Scenarios | 通过合成数据生成和比例类平衡技术增强小物体的物体检测性能:工业场景的比较研究 | Jibinraj Antony, Vinit Hegiste, Ali Nazeri, Hooman Tavakoli, Snehal Walunj, Christiane Plociennik, Martin Ruskowski | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Two-View Topogram-Based Anatomy-Guided CT Reconstruction for Prospective Risk Minimization | 基于双视图拓扑图的解剖引导 CT 重建,实现前瞻性风险最小化 | Chang Liu, Laura Klein, Yixing Huang, Edith Baader, Michael Lell, Marc Kachelrieß, Andreas Maier | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Pragmatic Communication in Multi-Agent Collaborative Perception | 多智能体协作感知中的语用沟通 | Yue Hu, Xianghe Pang, Xiaoqi Qin, Yonina C. Eldar, Siheng Chen, Ping Zhang, Wenjun Zhang | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Energy-based Automated Model Evaluation | 基于能量的自动化模型评估 | Ru Peng, Heming Zou, Haobo Wang, Yawen Zeng, Zenan Huang, Junbo Zhao | arxiv.org/pdf/2401.12… | link |
| 2024-01-23 | ClipSAM: CLIP and SAM Collaboration for Zero-Shot Anomaly Segmentation | ClipSAM:CLIP 和 SAM 协作进行零样本异常分割 | Shengze Li, Jianjian Cao, Peng Ye, Yuhan Ding, Chongjun Tu, Tao Chen | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Self-Supervised Vision Transformers Are Efficient Segmentation Learners for Imperfect Labels | 自监督视觉变压器是不完美标签的有效分割学习器 | Seungho Lee, Seoungyoon Kang, Hyunjung Shim | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Detecting and recognizing characters in Greek papyri with YOLOv8, DeiT and SimCLR | 使用 YOLOv8、DeiT 和 SimCLR 检测和识别希腊纸莎草中的字符 | Robert Turnbull, Evelyn Mannix | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Open-Set Facial Expression Recognition | 开放集面部表情识别 | Yuhang Zhang, Yue Yao, Xuannan Liu, Lixiong Qin, Wenjing Wang, Weihong Deng | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Small Language Model Meets with Reinforced Vision Vocabulary | 小语言模型与强化视觉词汇的结合 | Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, En Yu, Jianjian Sun, Chunrui Han, Xiangyu Zhang | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | An Automated Real-Time Approach for Image Processing and Segmentation of Fluoroscopic Images and Videos Using a Single Deep Learning Network | 使用单个深度学习网络对荧光图像和视频进行图像处理和分割的自动化实时方法 | Viet Dung Nguyen, Michael T. LaCour, Richard D. Komistek | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Explore Synergistic Interaction Across Frames for Interactive Video Object Segmentation | 探索交互式视频对象分割的跨帧协同交互 | Kexin Li, Tao Jiang, Zongxin Yang, Yi Yang, Yueting Zhuang, Jun Xiao | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | TD^2-Net: Toward Denoising and Debiasing for Dynamic Scene Graph Generation | TD^2-Net:动态场景图生成的去噪和去偏 | Xin Lin, Chong Shi, Yibing Zhan, Zuopeng Yang, Yaqi Wu, Dacheng Tao | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Zero Shot Open-ended Video Inference | 零镜头开放式视频推理 | Ee Yeo Keat, Zhang Hao, Alexander Matyasko, Basura Fernando | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration | 通过 2D-3D 神经校准进行 LiDAR 3D 点云的自监督学习 | Yifan Zhang, Siyu Ren, Junhui Hou, Jinjian Wu, Guangming Shi | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | NIV-SSD: Neighbor IoU-Voting Single-Stage Object Detector From Point Cloud | NIV-SSD:来自点云的邻居 IoU 投票单级物体检测器 | Shuai Liu, Di Wang, Quan Wang, Kai Huang | arxiv.org/pdf/2401.12… | link |
| 2024-01-23 | MAST: Video Polyp Segmentation with a Mixture-Attention Siamese Transformer | MAST:使用混合注意力 Siamese Transformer 进行视频息肉分割 | Geng Chen, Junqing Yang, Xiaozhou Pu, Ge-Peng Ji, Huan Xiong, Yongsheng Pan, Hengfei Cui, Yong Xia | arxiv.org/pdf/2401.12… | link |
| 2024-01-23 | The Neglected Tails of Vision-Language Models | 视觉语言模型被忽视的尾巴 | Shubham Parashar, Zhiqiu Lin, Tian Liu, Xiangjue Dong, Yanan Li, Deva Ramanan, James Caverlee, Shu Kong | arxiv.org/pdf/2401.12… | null |

Model Compression / Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-23 | A Novel Garment Transfer Method Supervised by Distilled Knowledge of Virtual Try-on Model | 虚拟试穿模型蒸馏知识监督下的新型服装传输方法 | Naiyu Fang, Lemiao Qiu, Shuyou Zhang, Zili Wang, Kerui Hu, Jianrong Tan | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Icy Moon Surface Simulation and Stereo Depth Estimation for Sampling Autonomy | 用于采样自主性的冰月表面模拟和立体深度估计 | Ramchander Bhaskara, Georgios Georgakis, Jeremy Nash, Marissa Cameron, Joseph Bowkett, Adnan Ansar, Manoranjan Majji, Paul Backes | arxiv.org/pdf/2401.12… | link |

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-23 | Zero-Shot Learning for the Primitives of 3D Affordance in General Objects | 一般对象中 3D 可供性基元的零样本学习 | Hyeonwoo Kim, Sookwan Han, Patrick Kwon, Hanbyul Joo | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Lumiere: A Space-Time Diffusion Model for Video Generation | Lumiere:用于视频生成的时空扩散模型 | Omer Bar-Tal, Hila Chefer, Omer Tov, Charles Herrmann, Roni Paiss, Shiran Zada, Ariel Ephrat, Junhwa Hur, Yuanzhen Li, Tomer Michaeli, et al. | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | UniHDA: Towards Universal Hybrid Domain Adaptation of Image Generators | UniHDA:迈向图像生成器的通用混合域适应 | Hengjia Li, Yang Liu, Yuqi Lin, Zhanwei Zhang, Yibo Zhao, weihang Pan, Tu Zheng, Zheng Yang, Yuchun Jiang, Boxi Wu, et al. | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Exploration and Improvement of Nerf-based 3D Scene Editing Techniques | 基于Nerf的3D场景编辑技术的探索与改进 | Shun Fang, Ming Cui, Xing Feng, Yanan Zhang | arxiv.org/pdf/2401.12… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-23 | On the Efficacy of Text-Based Input Modalities for Action Anticipation | 基于文本的输入方式对动作预期的功效 | Apoorva Beedu, Karan Samel, Irfan Essa | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Red Teaming Visual Language Models | 红队视觉语言模型 | Mukai Li, Lei Li, Yuwei Yin, Masood Ahmed, Zhenguang Liu, Qi Liu | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | FedRSU: Federated Learning for Scene Flow Estimation on Roadside Units | FedRSU:路边场景流估计的联邦学习 | Shaoheng Fang, Rui Ye, Wenhao Wang, Zuhong Liu, Yuxiao Wang, Yafei Wang, Siheng Chen, Yanfeng Wang | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | NeRF-AD: Neural Radiance Field with Attention-based Disentanglement for Talking Face Synthesis | NeRF-AD:具有基于注意力的解开的神经辐射场,用于说话人脸合成 | Chongke Bi, Xiaoxing Liu, Zhilei Liu | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Multi-modal News Understanding with Professionally Labelled Videos (ReutersViLNews) | 通过专业标记的视频进行多模式新闻理解 (ReutersViLNews) | Shih-Han Chou, Matthew Kowal, Yasmin Niknam, Diana Moyano, Shayaan Mehdi, Richard Pito, Cheng Zhang, Ian Knopke, Sedef Akinli Kocak, Leonid Sigal, et al. | arxiv.org/pdf/2401.12… | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-23 | HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments | HAZARD 挑战:动态变化环境中的具体决策 | Qinhong Zhou, Sunli Chen, Yisong Wang, Haozhe Xu, Weihua Du, Hongxin Zhang, Yilun Du, Joshua B. Tenenbaum, Chuang Gan | arxiv.org/pdf/2401.12… | link |
| 2024-01-23 | AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents | AutoRT:机器人代理大规模编排的具体基础模型 | Michael Ahn, Debidatta Dwibedi, Chelsea Finn, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Karol Hausman, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, et al. | arxiv.org/pdf/2401.12… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-23 | SGTR+: End-to-end Scene Graph Generation with Transformer | SGTR+:使用 Transformer 生成端到端场景图 | Rongjie Li, Songyang Zhang, Xuming He | arxiv.org/pdf/2401.12… | link |
| 2024-01-23 | Shift-ConvNets: Small Convolutional Kernel with Large Kernel Effects | Shift-ConvNets:具有大核效应的小卷积核 | Dachong Li, Li Li, Zhuangzhuang Chen, Jianqiang Li | arxiv.org/pdf/2401.12… | link |
| 2024-01-23 | Convolutional Initialization for Data-Efficient Vision Transformers | 数据高效视觉转换器的卷积初始化 | Jianqiao Zheng, Xueqian Li, Simon Lucey | arxiv.org/pdf/2401.12… | link |
| 2024-01-23 | Methods and strategies for improving the novel view synthesis quality of neural radiation field | 提高神经辐射场新视合成质量的方法与策略 | Shun Fang, Ming Cui, Xing Feng, Yanna Lv | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | InverseMatrixVT3D: An Efficient Projection Matrix-Based Approach for 3D Occupancy Prediction | InverseMatrixVT3D:一种基于投影矩阵的高效 3D 占用预测方法 | Zhenxing Ming, Julie Stephany Berrio, Mao Shan, Stewart Worrall | arxiv.org/pdf/2401.12… | null |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-23 | PSAvatar: A Point-based Morphable Shape Model for Real-Time Head Avatar Creation with 3D Gaussian Splatting | PSAvatar:基于点的可变形形状模型,用于通过 3D 高斯泼溅创建实时头部头像 | Zhongyuan Zhao, Zhenyu Bao, Qing Li, Guoping Qiu, Kanglin Liu | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | EndoGaussian: Gaussian Splatting for Deformable Surgical Scene Reconstruction | EndoGaussian:用于可变形手术场景重建的高斯喷射 | Yifan Liu, Chenxin Li, Chen Yang, Yixuan Yuan | arxiv.org/pdf/2401.12… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-23 | IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images | IRIS:低动态范围图像的室内场景逆渲染 | Zhi-Hao Lin, Jia-Bin Huang, Zhengqin Li, Zhao Dong, Christian Richardt, Tuotuo Li, Michael Zollhöfer, Johannes Kopf, Shenlong Wang, Changil Kim | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Coverage Axis++: Efficient Inner Point Selection for 3D Shape Skeletonization | Coverage Axis++:3D 形状骨架化的高效内点选择 | Zimeng Wang, Zhiyang Dou, Rui Xu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Shiqing Xin, Lingjie Liu, Taku Komura, Xiaoming Yuan, et al. | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | PSDF: Prior-Driven Neural Implicit Surface Learning for Multi-view Reconstruction | PSDF:用于多视图重建的先验驱动神经隐式表面学习 | Wanjuan Su, Chen Zhang, Qingshan Xu, Wenbing Tao | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos | 野外 RGBD 对象:通过 RGB-D 视频缩放真实世界 3D 对象学习 | Hongchi Xia, Yang Fu, Sifei Liu, Xiaolong Wang | arxiv.org/pdf/2401.12… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-23 | Consistency Enhancement-Based Deep Multiview Clustering via Contrastive Learning | 通过对比学习进行基于一致性增强的深度多视图聚类 | Hao Yang, Hua Mao, Wai Lok Woo, Jie Chen, Xi Peng | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Fast Semi-supervised Unmixing using Non-convex Optimization | 使用非凸优化的快速半监督分解 | Behnood Rasti, Alexandre Zouaoui, Julien Mairal, Jocelyn Chanussot | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | AdaEmbed: Semi-supervised Domain Adaptation in the Embedding Space | AdaEmbed:嵌入空间中的半监督域适应 | Ali Mottaghi, Mohammad Abdullah Jamal, Serena Yeung, Omid Mohareri | arxiv.org/pdf/2401.12… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-23 | Data-Centric Evolution in Autonomous Driving: A Comprehensive Survey of Big Data System, Data Mining, and Closed-Loop Technologies | 自动驾驶中以数据为中心的演进:大数据系统、数据挖掘和闭环技术的全面综述 | Lincan Li, Wei Shao, Wei Dong, Yijun Tian, Kaixiang Yang, Wenjie Zhang | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Fast Implicit Neural Representation Image Codec in Resource-limited Devices | 资源有限设备中的快速隐式神经表示图像编解码器 | Xiang Liu, Jiahong Chen, Bin Chen, Zimo Liu, Baoyi An, Shu-Tao Xia | arxiv.org/pdf/2401.12… | null |
| 2024-01-23 | Secure Federated Learning Approaches to Diagnosing COVID-19 | 用于诊断 COVID-19 的安全联合学习方法 | Rittika Adhikari, Christopher Settles | arxiv.org/pdf/2401.12… | null |