[Share][Daily Update][2024.01.30][CV_arxiv_papers]


[UPDATED!] 2024-01-30 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation | 您只需一步:通过刻度蒸馏实现快速超分辨率和稳定扩散 | Mehdi Noroozi, Isma Hadji, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | ContactGen: Contact-Guided Interactive 3D Human Generation for Partners | ContactGen:为合作伙伴提供接触引导的交互式 3D 人类生成 | Dongjun Gu, Jaehyeok Shim, Jaehoon Jang, Changwoo Kang, Kyungdon Joo | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Self-Supervised Representation Learning for Nerve Fiber Distribution Patterns in 3D-PLI | 3D-PLI 中神经纤维分布模式的自监督表示学习 | Alexander Oberstrass, Sascha E. A. Muenzing, Meiqi Niu, Nicola Palomero-Gallagher, Christian Schiffer, Markus Axer, Katrin Amunts, Timo Dickscheid | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | An Open Software Suite for Event-Based Video | 用于基于事件的视频的开放软件套件 | Andrew C. Freeman | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Repositioning the Subject within Image | 重新定位图像中的主体 | Yikai Wang, Chenjie Cao, Qiaole Dong, Yifan Li, Yanwei Fu | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | BoostDream: Efficient Refining for High-Quality Text-to-3D Generation from Multi-View Diffusion | BoostDream:通过多视图扩散高效细化高质量文本到 3D 生成 | Yonghao Yu, Shunan Zhu, Huai Qin, Haorui Li | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization | Pick-and-Draw:用于文本到图像个性化的免训练语义指导 | Henglei Lv, Jiayu Xiao, Liang Li, Qingming Huang | arxiv.org/pdf/2401.16… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | GazeGPT: Augmenting Human Capabilities using Gaze-contingent Contextual AI for Smart Eyewear | GazeGPT:使用智能眼镜的注视相关情境 AI 增强人类能力 | Robert Konrad, Nitish Padmanaban, J. Gabriel Buckmaster, Kevin C. Boyle, Gordon Wetzstein | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning | 通过持续语言学习,拥抱 CLIP 中的语言包容性和多样性 | Bang Yang, Yong Dai, Xuxin Cheng, Yaowei Li, Asif Raza, Yuexian Zou | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation | M2CURL:通过用于机器人操作的自监督表示学习实现样本高效的多模态强化学习 | Fotios Lygerakis, Vedant Dave, Elmar Rueckert | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Multi-modal Representation Learning for Cross-modal Prediction of Continuous Weather Patterns from Discrete Low-Dimensional Data | 基于离散低维数据的连续天气模式跨模态预测的多模态表示学习 | Alif Bin Abdul Qayyum, Xihaier Luo, Nathan M. Urban, Xiaoning Qian, Byung-Jun Yoon | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation | 模态不完整场景分割的傅立叶快速调整 | Ruiping Liu, Jiaming Zhang, Kunyu Peng, Yufan Chen, Ke Cao, Junwei Zheng, M. Saquib Sarfraz, Kailun Yang, Rainer Stiefelhagen | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | EarthGPT: A Universal Multi-modal Large Language Model for Multi-sensor Image Comprehension in Remote Sensing Domain | EarthGPT:遥感领域多传感器图像理解的通用多模态大语言模型 | Wei Zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao | arxiv.org/pdf/2401.16… | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | MouSi: Poly-Visual-Expert Vision-Language Models | MouSi:多视觉专家视觉语言模型 | Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, et al. | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis | StrokeNUWA:用于矢量图形合成的笔画标记化 | Zecheng Tang, Chenfei Wu, Zekai Zhang, Mingheng Ni, Shengming Yin, Yu Liu, Zhengyuan Yang, Lijuan Wang, Zicheng Liu, Juntao Li, et al. | arxiv.org/pdf/2401.17… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | CPR++: Object Localization via Single Coarse Point Supervision | CPR++:通过单粗点监督进行对象定位 | Xuehui Yu, Pengfei Chen, Kuiran Wang, Xumeng Han, Guorong Li, Zhenjun Han, Qixiang Ye, Jianbin Jiao | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | OmniSCV: An Omnidirectional Synthetic Image Generator for Computer Vision | OmniSCV:用于计算机视觉的全方位合成图像生成器 | Bruno Berenguel-Baeta, Jesus Bermudez-Cameo, Jose J. Guerrero | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | ViTree: Single-path Neural Tree for Step-wise Interpretable Fine-grained Visual Categorization | ViTree:用于逐步可解释的细粒度视觉分类的单路径神经树 | Danning Lao, Qi Liu, Jiazi Bu, Junchi Yan, Wei Shen | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Deep 3D World Models for Multi-Image Super-Resolution Beyond Optical Flow | 超越光流的多图像超分辨率深度 3D 世界模型 | Luca Savant Aira, Diego Valsesia, Andrea Bordone Molini, Giulia Fracastoro, Enrico Magli, Andrea Mirabile | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | CAFCT: Contextual and Attentional Feature Fusions of Convolutional Neural Networks and Transformer for Liver Tumor Segmentation | CAFCT:用于肝脏肿瘤分割的卷积神经网络和 Transformer 的上下文和注意力特征融合 | Ming Kang, Chee-Ming Ting, Fung Fung Ting, Raphaël Phan | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing | SmartFRZ:使用基于注意力的层冻结的高效训练框架 | Sheng Li, Geng Yuan, Yue Dai, Youtao Zhang, Yanzhi Wang, Xulong Tang | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers | 利用多视角时空关系变换器实现精确的 3D 人体姿势估计 | Jianbin Jiao, Xina Cheng, Weijie Chen, Xiaoting Yin, Hao Shi, Kailun Yang | arxiv.org/pdf/2401.16… | null |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | VR-GS: A Physical Dynamics-Aware Interactive Gaussian Splatting System in Virtual Reality | VR-GS:虚拟现实中的物理动力学感知交互式高斯溅射系统 | Ying Jiang, Chang Yu, Tianyi Xie, Xuan Li, Yutao Feng, Huamin Wang, Minchen Li, Henry Lau, Feng Gao, Yin Yang, et al. | arxiv.org/pdf/2401.16… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | YOLO-World: Real-Time Open-Vocabulary Object Detection | YOLO-World:实时开放词汇目标检测 | Tianheng Cheng, Lin Song, Yixiao Ge, Wenyu Liu, Xinggang Wang, Ying Shan | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Multi-Camera Asynchronous Ball Localization and Trajectory Prediction with Factor Graphs and Human Poses | 使用因子图和人体姿势进行多摄像机异步球定位和轨迹预测 | Qingyu Xiao, Zulfiqar Zaidi, Matthew Gombolay | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Non-central panorama indoor dataset | 非中心全景室内数据集 | Bruno Berenguel-Baeta, Jesus Bermudez-Cameo, Jose J. Guerrero | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Atlanta Scaled layouts from non-central panoramas | 亚特兰大 非中心全景的比例布局 | Bruno Berenguel-Baeta, Jesus Bermudez-Cameo, Jose J. Guerrero | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation | BlockFusion:使用潜在三平面外推法生成可扩展的 3D 场景 | Zhennan Wu, Yang Li, Han Yan, Taizhang Shang, Weixuan Sun, Senbo Wang, Ruikai Cui, Weizhe Liu, Hiroyuki Sato, Hongdong Li, et al. | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | An Embeddable Implicit IUVD Representation for Part-based 3D Human Surface Reconstruction | 基于部件的 3D 人体表面重建的可嵌入隐式 IUVD 表示 | Baoxing Li, Yong Deng, Yehui Yang, Xu Zhao | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | All-optical complex field imaging using diffractive processors | 使用衍射处理器的全光学复杂场成像 | Jingxi Li, Yuhang Li, Tianyi Gan, Che-Yung Shen, Mona Jarrahi, Aydogan Ozcan | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | Multi-granularity Correspondence Learning from Long-term Noisy Videos | 从长期噪声视频中进行多粒度对应学习 | Yijie Lin, Jie Zhang, Zhenyu Huang, Jia Liu, Zujie Wen, Xi Peng | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | The Why, When, and How to Use Active Learning in Large-Data-Driven 3D Object Detection for Safe Autonomous Driving: An Empirical Exploration | 为什么、何时以及如何在大数据驱动的 3D 物体检测中使用主动学习来实现安全自动驾驶:实证探索 | Ross Greer, Bjørk Antoniussen, Mathias V. Andersen, Andreas Møgelmose, Mohan M. Trivedi | arxiv.org/pdf/2401.16… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | Zero-shot Classification using Hyperdimensional Computing | 使用超维计算的零样本分类 | Samuele Ruffino, Geethan Karunaratne, Michael Hersche, Luca Benini, Abu Sebastian, Abbas Rahimi | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | Reviving Undersampling for Long-Tailed Learning | 恢复欠采样以实现长尾学习 | Hao Yu, Yingxiao Du, Jianxin Wu | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | Detection and Recovery Against Deep Neural Network Fault Injection Attacks Based on Contrastive Learning | 基于对比学习的深度神经网络故障注入攻击检测与恢复 | Chenan Wang, Pu Zhao, Siyue Wang, Xue Lin | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images | MuSc:零样本工业异常分类和分割以及未标记图像的相互评分 | Xurui Li, Ziming Huang, Feng Xue, Yu Zhou | arxiv.org/pdf/2401.16… | link |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-30 | A simple, strong baseline for building damage detection on the xBD dataset | 用于在 xBD 数据集上构建损伤检测的简单而强大的基线 | Sebastian Gerard, Paul Borne-Pons, Josephine Sullivan | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Robust Prompt Optimization for Defending Language Models Against Jailbreaking Attacks | 防御语言模型免受越狱攻击的稳健提示优化 | Andy Zhou, Bo Li, Haohan Wang | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | SLIC: A Learned Image Codec Using Structure and Color | SLIC:使用结构和颜色的学习图像编解码器 | Srivatsa Prativadibhayankaram, Mahadev Prasad Panda, Thomas Richter, Heiko Sparenberg, Siegfried Fößel, André Kaup | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | ReAlnet: Achieving More Human Brain-Like Vision via Human Neural Representational Alignment | ReAlnet:通过人类神经表征对齐实现更像人脑的视觉 | Zitong Lu, Yile Wang, Julie D. Golomb | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | NormEnsembleXAI: Unveiling the Strengths and Weaknesses of XAI Ensemble Techniques | NormEnsembleXAI:揭示 XAI 集成技术的优点和缺点 | Weronika Hryniewska-Guzik, Bartosz Sawicki, Przemysław Biecek | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Evaluation in Neural Style Transfer: A Review | 神经风格迁移评估:回顾 | Eleftherios Ioannou, Steve Maddock | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | H-SynEx: Using synthetic images and ultra-high resolution ex vivo MRI for hypothalamus subregion segmentation | H-SynEx:使用合成图像和超高分辨率离体 MRI 进行下丘脑分区分割 | Livia Rodrigues, Martina Bocchetta, Oula Puonti, Douglas Greve, Ana Carolina Londe, Marcondes França, Simone Appenzeller, Juan Eugenio Iglesias, Leticia Rittner | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | CharNet: Generalized Approach for High-Complexity Character Classification | CharNet:高复杂性字符分类的通用方法 | Boris Kriuk | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Active Generation Network of Human Skeleton for Action Recognition | 用于动作识别的人体骨骼主动生成网络 | Long Liu, Xin Wang, Fangming Li, Jiayu Chen | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Efficient Gesture Recognition on Spiking Convolutional Networks Through Sensor Fusion of Event-Based and Depth Data | 通过基于事件和深度数据的传感器融合在尖峰卷积网络上进行高效手势识别 | Lea Steffen, Thomas Trapp, Arne Roennau, Rüdiger Dillmann | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Floor extraction and door detection for visually impaired guidance | 楼层提取和门检测,为视障人士提供引导 | Bruno Berenguel-Baeta, Manuel Guerrero-Viu, Alejandro de Nova, Jesus Bermudez-Cameo, Alejandro Perez-Yus, Jose J. Guerrero | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Towards Assessing the Synthetic-to-Measured Adversarial Vulnerability of SAR ATR | 评估 SAR ATR 的综合测量对抗漏洞 | Bowen Peng, Bo Peng, Jingyuan Xia, Tianpeng Liu, Yongxiang Liu, Li Liu | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Multilayer Graph Approach to Deep Subspace Clustering | 深层子空间聚类的多层图方法 | Lovro Sindičić, Ivica Kopriva | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Static and Dynamic Synthesis of Bengali and Devanagari Signatures | 孟加拉语和梵文签名的静态和动态合成 | Miguel A. Ferrer, Sukalpa Chanda, Moises Diaz, Chayan Kr. Banerjee, Anirban Majumdar, Cristina Carmona-Duarte, Parikshit Acharya, Umapada Pal | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | MF-MOS: A Motion-Focused Model for Moving Object Segmentation | MF-MOS:用于运动物体分割的运动聚焦模型 | Jintao Cheng, Kang Zeng, Zhuoxu Huang, Xiaoyu Tang, Jin Wu, Chengxi Zhang, Xieyuanli Chen, Rui Fan | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Evaluation of Out-of-Distribution Detection Performance on Autonomous Driving Datasets | 自动驾驶数据集上的分布外检测性能评估 | Jens Henriksson, Christian Berger, Stig Ursing, Markus Borg | arxiv.org/pdf/2401.17… | null |
| 2024-01-30 | Category-wise Fine-Tuning: Resisting Incorrect Pseudo-Labels in Multi-Label Image Classification with Partial Labels | 按类别微调:在具有部分标签的多标签图像分类中抵抗不正确的伪标签 | Chak Fong Chong, Xinyi Fang, Jielong Guo, Yapeng Wang, Wei Ke, Chan-Tong Lam, Sio-Kei Im | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | Segmentation and Characterization of Macerated Fibers and Vessels Using Deep Learning | 使用深度学习对浸渍纤维和血管进行分割和表征 | Saqib Qamar, Abu Imran Baba, Stéphane Verger, Magnus Andersson | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | Dynamic MRI reconstruction using low-rank plus sparse decomposition with smoothness regularization | 使用低秩加稀疏分解和平滑正则化进行动态 MRI 重建 | Chee-Ming Ting, Fuad Noman, Raphaël C. -W. Phan, Hernando Ombao | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | A Tournament of Transformation Models: B-Spline-based vs. Mesh-based Multi-Objective Deformable Image Registration | 变换模型锦标赛:基于 B 样条与基于网格的多目标可变形图像配准 | Georgios Andreadis, Joas I. Mulder, Anton Bouter, Peter A. N. Bosman, Tanja Alderliesten | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | MESA: Matching Everything by Segmenting Anything | MESA:通过分割任何内容来匹配所有内容 | Yesheng Zhang, Xu Zhao | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | Optimal-Landmark-Guided Image Blending for Face Morphing Attacks | 用于面部变形攻击的最佳地标引导图像混合 | Qiaoyun He, Zongyong Deng, Zuyuan He, Qijun Zhao | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | LF Tracy: A Unified Single-Pipeline Approach for Salient Object Detection in Light Field Cameras | LF Tracy:用于光场相机中显着物体检测的统一单管道方法 | Fei Teng, Jiaming Zhang, Jiawei Liu, Kunyu Peng, Xina Cheng, Zhiyong Li, Kailun Yang | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | EdgeOL: Efficient in-situ Online Learning on Edge Devices | EdgeOL:边缘设备上的高效原位在线学习 | Sheng Li, Geng Yuan, Yawen Wu, Yue Dai, Chao Wu, Alex K. Jones, Jingtong Hu, Yanzhi Wang, Xulong Tang | arxiv.org/pdf/2401.16… | null |
| 2024-01-30 | Characterization of Magnetic Labyrinthine Structures through Junctions and Terminals Detection using Template Matching and CNN | 使用模板匹配和 CNN 通过连接和终端检测来表征磁性迷宫结构 | Vinícius Yu Okubo, Kotaro Shimizu, B. S. Shivaram, Hae Yong Kim | arxiv.org/pdf/2401.16… | null |