[Share][Daily Update][2024.02.13][CV_arxiv_papers]

[UPDATED!] 2024-02-13 (Publish Time)

Generative Models

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | Learning Continuous 3D Words for Text-to-Image Generation | Ta-Ying Cheng, Matheus Gadelha, Thibault Groueix, Matthew Fisher, Radomir Mech, Andrew Markham, Niki Trigoni | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Denoising Diffusion Restoration Tackles Forward and Inverse Problems for the Laplace Operator | Amartya Mukherjee, Melissa M. Stadt, Lena Podina, Mohammad Kohandel, Jun Liu | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases | Ziyi Zhang, Sen Zhang, Yibing Zhan, Yong Luo, Yonggang Wen, Dacheng Tao | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Taking Training Seriously: Human Guidance and Management-Based Regulation of Artificial Intelligence | Cary Coglianese, Colton R. Crum | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | One-to-many Reconstruction of 3D Geometry of cultural Artifacts using a synthetically trained Generative Model | Thomas Pöllabauer, Julius Kühn, Jiayi Li, Arjan Kuijper | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | A Dense Reward View on Aligning Text-to-Image Diffusion with Preference | Shentao Yang, Tianqi Chen, Mingyuan Zhou | arxiv.org/pdf/2402.08… | link |
| 2024-02-13 | Fine-Tuning Text-To-Image Diffusion Models for Class-Wise Spurious Feature Generation | AprilPyone MaungMaung, Huy H. Nguyen, Hitoshi Kiya, Isao Echizen | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Poisson flow consistency models for low-dose CT image denoising | Dennis Hein, Adam Wang, Ge Wang | arxiv.org/pdf/2402.08… | none |

Multimodal

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs | Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek, Yuki M. Asano | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Test-Time Backdoor Attacks on Multimodal Large Language Models | Dong Lu, Tianyu Pang, Chao Du, Qian Liu, Xianjun Yang, Min Lin | arxiv.org/pdf/2402.08… | link |
| 2024-02-13 | Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast | Xiangming Gu, Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin | arxiv.org/pdf/2402.08… | link |
| 2024-02-13 | Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks | Jusung Lee, Sungguk Cha, Younghyun Lee, Cheoljong Yang | arxiv.org/pdf/2402.08… | none |

NeRF

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | NeRF Analogies: Example-Based Visual Attribute Transfer for NeRFs | Michael Fischer, Zhengqin Li, Thu Nguyen-Phuoc, Aljaz Bozic, Zhao Dong, Carl Marshall, Tobias Ritschel | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | H2O-SDF: Two-phase Learning for 3D Indoor Reconstruction using Object Surface Fields | Minyoung Park, Mirae Do, YeonJae Shin, Jaeseok Yoo, Jongkwang Hong, Joongrock Kim, Chul Lee | arxiv.org/pdf/2402.08… | none |

3DGS

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation | Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht, Natalia Neverova, Andrea Vedaldi, Oran Gafni, Filippos Kokkinos | arxiv.org/pdf/2402.08… | none |

Model Compression/Optimization

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | BdSLW60: A Word-Level Bangla Sign Language Dataset | Husne Ara Rubaiyeat, Hasan Mahmud, Ahsan Habib, Md. Kamrul Hasan | arxiv.org/pdf/2402.08… | link |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | Convolutional Neural Networks Towards Facial Skin Lesions Detection | Reza Sarshar, Mohammad Heydari, Elham Akhondzadeh Noughabi | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | FESS Loss: Feature-Enhanced Spatial Segmentation Loss for Optimizing Medical Image Analysis | Charulkumar Chodvadiya, Navyansh Mahla, Kinshuk Gaurav Singh, Kshitij Sharad Jadhav | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Glass Segmentation with Multi Scales and Primary Prediction Guiding | Zhiyu Xu, Qingliang Chen | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Approximately Piecewise E(3) Equivariant Point Networks | Matan Atzmon, Jiahui Huang, Francis Williams, Or Litany | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | P-Mamba: Marrying Perona Malik Diffusion with Mamba for Efficient Pediatric Echocardiographic Left Ventricular Segmentation | Zi Ye, Tianxiang Chen | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models | Shaeke Salman, Md Montasir Bin Shams, Xiuwen Liu, Lingjiong Zhu | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Latent space configuration for improved generalization in supervised autoencoder neural networks | Nikita Gabdullin | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Camera Calibration through Geometric Constraints from Rotation and Projection Matrices | Muhammad Waleed, Abdul Rauf, Murtaza Taj | arxiv.org/pdf/2402.08… | link |
| 2024-02-13 | Leveraging Self-Supervised Instance Contrastive Learning for Radar Object Detection | Colin Decourt, Rufin VanRullen, Didier Salle, Thomas Oberlin | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Transferring Ultrahigh-Field Representations for Intensity-Guided Brain Segmentation of Low-Field Magnetic Resonance Imaging | Kwanseok Oh, Jieun Lee, Da-Woon Heo, Dinggang Shen, Heung-Il Suk | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Adaptive Hierarchical Certification for Segmentation using Randomized Smoothing | Alaa Anani, Tobias Lorenz, Bernt Schiele, Mario Fritz | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Visually Dehallucinative Instruction Generation | Sungguk Cha, Jusung Lee, Younghyun Lee, Cheoljong Yang | arxiv.org/pdf/2402.08… | link |
| 2024-02-13 | Conditional Information Gain Trellis | Ufuk Can Bicici, Tuna Han Salih Meral, Lale Akarun | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Scribble-based fast weak-supervision and interactive corrections for segmenting whole slide images | Antoine Habis, Roy Rosman Nathanson, Vannary Meas-Yedid, Elsa D. Angelini, Jean-Christophe Olivo-Marin | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | The Paradox of Motion: Evidence for Spurious Correlations in Skeleton-based Gait Recognition Models | Andy Cătrună, Adrian Cosma, Emilian Rădoi | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Rethinking U-net Skip Connections for Biomedical Image Segmentation | Frauke Wilm, Jonas Ammeling, Mathias Öttl, Rutger H. J. Fick, Marc Aubreville, Katharina Breininger | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Improving Image Coding for Machines through Optimizing Encoder via Auxiliary Loss | Kei Iino, Shunsuke Akamatsu, Hiroshi Watanabe, Shohei Enomoto, Akira Sakamoto, Takeharu Eda | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Object Detection in Thermal Images Using Deep Learning for Unmanned Aerial Vehicles | Minh Dang Tu, Kieu Trang Le, Manh Duong Phung | arxiv.org/pdf/2402.08… | none |

GNN

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | Pix2Code: Learning to Compose Neural Visual Concepts as Programs | Antonia Wüst, Wolfgang Stammer, Quentin Delfosse, Devendra Singh Dhami, Kristian Kersting | arxiv.org/pdf/2402.08… | link |

LLM

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance | Linxi Zhao, Yihe Deng, Weitong Zhang, Quanquan Gu | arxiv.org/pdf/2402.08… | none |

Transformer

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | Are Semi-Dense Detector-Free Methods Good at Matching Local Features? | Matthieu Vilain, Rémi Giraud, Hugo Germain, Guillaume Bourmaud | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Peeking Behind the Curtains of Residual Learning | Tunhou Zhang, Feng Yan, Hai Li, Yiran Chen | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | CrossGaze: A Strong Method for 3D Gaze Estimation in the Wild | Andy Cătrună, Adrian Cosma, Emilian Rădoi | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | MetaTra: Meta-Learning for Generalized Trajectory Prediction in Unseen Domain | Xiaohe Li, Feilong Huang, Zide Fan, Fangli Mou, Yingyan Hou, Chen Qian, Lijie Wen | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Translating Images to Road Network: A Non-Autoregressive Sequence-to-Sequence Approach | Jiachen Lu, Renyuan Peng, Xinyue Cai, Hang Xu, Hongyang Li, Feng Wen, Wei Zhang, Li Zhang | arxiv.org/pdf/2402.08… | link |
| 2024-02-13 | Optimized Information Flow for Transformer Tracking | Janani Kugarajeevan, Thanikasalam Kokul, Amirthalingam Ramanan, Subha Fernando | arxiv.org/pdf/2402.08… | link |

3D/CG

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | Learning to Produce Semi-dense Correspondences for Visual Localization | Khang Truong Giang, Soohwan Song, Sungho Jo | arxiv.org/pdf/2402.08… | link |
| 2024-02-13 | Color Image Denoising Using The Green Channel Prior | Zhaoming Kong, Xiaowei Yang | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Advancing Data-driven Weather Forecasting: Time-Sliding Data Augmentation of ERA5 | Minjong Cheon, Daehyun Kang, Yo-Hwan Choi, Seon-Yu Kang | arxiv.org/pdf/2402.08… | none |

Learning Paradigms

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | Pixel Sentence Representation Learning | Chenghao Xiao, Zhuoxu Huang, Danlu Chen, G Thomas Hudson, Yizhi Li, Haoran Duan, Chenghua Lin, Jie Fu, Jungong Han, Noura Al Moubayed | arxiv.org/pdf/2402.08… | none |

Other

| Publish Date | Title | Authors | PDF | Code |
| --- | --- | --- | --- | --- |
| 2024-02-13 | Learned Image Compression with Text Quality Enhancement | Chih-Yu Lai, Dung Tran, Kazuhito Koishida | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Latent Inversion with Timestep-aware Sampling for Training-free Non-rigid Editing | Yunji Jung, Seokju Lee, Tair Djanibekov, Hyunjung Shim, Jongchul Ye | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | JeFaPaTo -- A joint toolbox for blinking analysis and facial features extraction | Tim Büchner, Oliver Mothes, Orlando Guntinas-Lichius, Joachim Denzler | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | A Neural-network Enhanced Video Coding Framework beyond ECM | Yanchen Zhao, Wenxuan He, Chuanmin Jia, Qizhe Wang, Junru Li, Yue Li, Chaoyi Lin, Kai Zhang, Li Zhang, Siwei Ma | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | An Order-Complexity Aesthetic Assessment Model for Aesthetic-aware Music Recommendation | Xin Jin, Wu Zhou, Jingyu Wang, Duo Xu, Yongsen Zheng | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | Learning semantic image quality for fetal ultrasound from noisy ranking annotation | Manxi Lin, Jakob Ambsdorf, Emilie Pi Fogtmann Sejer, Zahra Bashir, Chun Kit Wong, Paraskevas Pegios, Alberto Raheli, Morten Bo Søndergaard Svendsen, Mads Nielsen, Martin Grønnebæk Tolsgaard, et al. | arxiv.org/pdf/2402.08… | none |
| 2024-02-13 | SepRep-Net: Multi-source Free Domain Adaptation via Model Separation And Reparameterization | Ying Jin, Jiaqi Wang, Dahua Lin | arxiv.org/pdf/2402.08… | none |