[UPDATED!] 2024-02-09 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-09 | Sequential Flow Matching for Generative Modeling | 用于生成建模的顺序流匹配 | Jongmin Yoon, Juho Lee | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | ControlUDA: Controllable Diffusion-assisted Unsupervised Domain Adaptation for Cross-Weather Semantic Segmentation | ControlUDA:用于跨天气语义分割的可控扩散辅助无监督域适应 | Fengyi Shen, Li Zhou, Kagan Kucukaytekin, Ziyuan Liu, He Wang, Alois Knoll | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Improving 2D-3D Dense Correspondences with Diffusion Models for 6D Object Pose Estimation | 使用扩散模型改进 2D-3D 密集对应以进行 6D 物体姿态估计 | Peter Hönig, Stefan Thalhammer, Markus Vincze | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | ImplicitDeepfake: Plausible Face-Swapping through Implicit Deepfake Generation using NeRF and Gaussian Splatting | ImplicitDeepfake:使用 NeRF 和高斯泼溅通过隐式 Deepfake 生成进行合理的换脸 | Georgii Stanishevskii, Jakub Steczkiewicz, Tomasz Szczepanik, Sławomir Tadeja, Jacek Tabor, Przemysław Spurek | arxiv.org/pdf/2402.06… | link |
| 2024-02-09 | Multisource Semisupervised Adversarial Domain Generalization Network for Cross-Scene Sea–Land Clutter Classification | 用于跨场景海–陆地杂波分类的多源半监督对抗域泛化网络 | Xiaoxuan Zhang, Quan Pan, Salvador García | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Masked LoGoNet: Fast and Accurate 3D Image Analysis for Medical Domain | Masked LoGoNet:医疗领域快速准确的 3D 图像分析 | Amin Karimi Monsefi, Payam Karisani, Mengxi Zhou, Stacey Choi, Nathan Doble, Heng Ji, Srinivasan Parthasarathy, Rajiv Ramnath | arxiv.org/pdf/2402.06… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-09 | On the Out-Of-Distribution Generalization of Multimodal Large Language Models | 多模态大语言模型的分布外泛化 | Xingxuan Zhang, Jiansheng Li, Wenjing Chu, Junjia Hai, Renzhe Xu, Yuqing Yang, Shikai Guan, Jiazheng Xu, Peng Cui | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Quantifying and Enhancing Multi-modal Robustness with Modality Preference | 通过模态偏好量化和增强多模态鲁棒性 | Zequn Yang, Yake Wei, Ce Liang, Di Hu | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Revealing Multimodal Contrastive Representation Learning through Latent Partial Causal Models | 通过潜在部分因果模型揭示多模态对比表示学习 | Yuhang Liu, Zhen Zhang, Dong Gong, Biwei Huang, Mingming Gong, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | GS-CLIP: Gaussian Splatting for Contrastive Language-Image-3D Pretraining from Real-World Data | GS-CLIP:根据真实世界数据进行对比语言-图像-3D 预训练的高斯泼溅 | Haoyuan Li, Yanpeng Zhou, Yihan Zeng, Hang Xu, Xiaodan Liang | arxiv.org/pdf/2402.06… | null |
3DGS
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-09 | HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting | HeadStudio:使用 3D 高斯泼溅将文本转换为可动画头部头像 | Zhenglin Zhou, Fan Ma, Hehe Fan, Yi Yang | arxiv.org/pdf/2402.06… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-09 | Multi-source-free Domain Adaptation via Uncertainty-aware Adaptive Distillation | 通过不确定性感知自适应蒸馏进行多源自由域适应 | Yaxuan Song, Jianan Fan, Dongnan Liu, Weidong Cai | arxiv.org/pdf/2402.06… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-09 | More than the Sum of Its Parts: Ensembling Backbone Networks for Few-Shot Segmentation | 不仅仅是各个部分的总和:集成主干网络以实现少样本分割 | Nico Catalano, Alessandro Maranelli, Agnese Chiatti, Matteo Matteucci | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning | 视频注释器:使用视觉语言模型和主动学习有效构建视频分类器的框架 | Amir Ziai, Aneesh Vartakavi | arxiv.org/pdf/2402.06… | link |
| 2024-02-09 | Hybridnet for depth estimation and semantic segmentation | 用于深度估计和语义分割的混合网络 | Dalila Sánchez-Escobedo, Xiao Lin, Josep R. Casas, Montse Pardàs | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Feature Density Estimation for Out-of-Distribution Detection via Normalizing Flows | 通过标准化流进行分布外检测的特征密度估计 | Evan D. Cook, Marc-Antoine Lavoie, Steven L. Waslander | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Transferring facade labels between point clouds with semantic octrees while considering change detection | 在考虑变化检测的同时,使用语义八叉树在点云之间传输立面标签 | Sophia Schwarz, Tanja Pilz, Olaf Wysocki, Ludwig Hoegner, Uwe Stilla | arxiv.org/pdf/2402.06… | link |
| 2024-02-09 | Classifying point clouds at the facade-level using geometric features and deep learning networks | 使用几何特征和深度学习网络在立面级别对点云进行分类 | Yue Tan, Olaf Wysocki, Ludwig Hoegner, Uwe Stilla | arxiv.org/pdf/2402.06… | link |
| 2024-02-09 | Iris-SAM: Iris Segmentation Using a Foundational Model | Iris-SAM:使用基础模型进行虹膜分割 | Parisa Farmanifard, Arun Ross | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Deep Learning-Based Auto-Segmentation of Planning Target Volume for Total Marrow and Lymph Node Irradiation | 基于深度学习的全骨髓和淋巴结照射规划目标体积的自动分割 | Ricardo Coimbra Brioso, Damiano Dei, Nicola Lambri, Daniele Loiacono, Pietro Mancosu, Marta Scorsetti | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Cardiac ultrasound simulation for autonomous ultrasound navigation | 用于自主超声导航的心脏超声模拟 | Abdoul Aziz Amadou, Laura Peralta, Paul Dryburgh, Paul Klein, Kaloian Petkov, Richard James Housden, Vivek Singh, Rui Liao, Young-Ho Kim, Florin Christian Ghesu, et al. | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | CurveFormer++: 3D Lane Detection by Curve Propagation with Temporal Curve Queries and Attention | CurveFormer++:利用时间曲线查询和注意力的曲线传播进行 3D 车道检测 | Yifeng Bai, Zhirong Chen, Pengpeng Liang, Erkang Cheng | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Learning using privileged information for segmenting tumors on digital mammograms | 学习使用特权信息在数字乳房X光照片上分割肿瘤 | Ioannis N. Tzortzis, Konstantinos Makantasis, Ioannis Rallis, Nikolaos Bakalos, Anastasios Doulamis, Nikolaos Doulamis | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Taking Class Imbalance Into Account in Open Set Recognition Evaluation | 在开集识别评估中考虑类别不平衡 | Joanna Komorniczak, Pawel Ksieniewicz | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | MLS2LoD3: Refining low LoDs building models with MLS point clouds to reconstruct semantic LoD3 building models | MLS2LoD3:使用 MLS 点云细化低 LoD 建筑模型以重建语义 LoD3 建筑模型 | Olaf Wysocki, Ludwig Hoegner, Uwe Stilla | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Insomnia Identification via Electroencephalography | 通过脑电图识别失眠 | Olviya Udeshika, Dilshan Lakshitha, Nilantha Premakumara, Surangani Bandara | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Anomaly Unveiled: Securing Image Classification against Adversarial Patch Attacks | 揭秘异常:保护图像分类免受对抗性补丁攻击 | Nandish Chattopadhyay, Amira Guesmi, Muhammad Shafique | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Learning Contrastive Feature Representations for Facial Action Unit Detection | 学习面部动作单元检测的对比特征表示 | Ziqiao Shang, Bin Liu, Fei Teng, Tianrui Li | arxiv.org/pdf/2402.06… | link |
| 2024-02-09 | Target Recognition Algorithm for Monitoring Images in Electric Power Construction Process | 电力施工过程监控图像目标识别算法 | Hao Song, Wei Lin, Wei Song, Man Wang | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | TETRIS: Towards Exploring the Robustness of Interactive Segmentation | TETRIS:探索交互式分割的鲁棒性 | Andrey Moskalenko, Vlad Shakhuro, Anna Vorontsova, Anton Konushin, Anton Antonov, Alexander Krapukhin, Denis Shepelev, Konstantin Soshin | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Multiple Instance Learning for Cheating Detection and Localization in Online Examinations | 用于在线考试作弊检测和定位的多实例学习 | Yemeng Liu, Jing Ren, Jianshuo Xu, Xiaomei Bai, Roopdeep Kaur, Feng Xia | arxiv.org/pdf/2402.06… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-09 | SIR: Multi-view Inverse Rendering with Decomposable Shadow for Indoor Scenes | SIR:室内场景的可分解阴影多视图逆渲染 | Xiaokang Wei, Zhuoman Liu, Yan Luximon | arxiv.org/pdf/2402.06… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-09 | Maia: A Real-time Non-Verbal Chat for Human-AI Interaction | Maia:用于人机交互的实时非语言聊天 | Dragos Costea, Alina Marcu, Cristina Lazar, Marius Leordeanu | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | A self-supervised framework for learning whole slide representations | 用于学习全切片表示的自监督框架 | Xinhai Hou, Cheng Jiang, Akhil Kondepudi, Yiwei Lyu, Asadur Zaman Chowdury, Honglak Lee, Todd C. Hollon | arxiv.org/pdf/2402.06… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-09 | Reconstructing facade details using MLS point clouds and Bag-of-Words approach | 使用 MLS 点云和词袋方法重建立面细节 | Thomas Froech, Olaf Wysocki, Ludwig Hoegner, Uwe Stilla | arxiv.org/pdf/2402.06… | link |
| 2024-02-09 | A Network for structural dense displacement based on 3D deformable mesh model and optical flow | 基于3D变形网格模型和光流的结构密集位移网络 | Peimian Du, Qicheng Guo, Yanru Li | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Halo Reduction in Display Systems through Smoothed Local Histogram Equalization and Human Visual System Modeling | 通过平滑局部直方图均衡和人类视觉系统建模减少显示系统中的光晕 | Prasoon Ambalathankandy, Yafei Ou, Masayuki Ikebe | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling | ViGoR:通过细粒度奖励模型改善大视觉语言模型的视觉基础 | Siming Yan, Min Bai, Weifeng Chen, Xiong Zhou, Qixing Huang, Li Erran Li | arxiv.org/pdf/2402.06… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-09 | Image-based Deep Learning for the time-dependent prediction of fresh concrete properties | 基于图像的深度学习,用于新拌混凝土性能随时间的预测 | Max Meyer, Amadeus Langer, Max Mehltretter, Dries Beyer, Max Coenen, Tobias Schack, Michael Haist, Christian Heipke | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | BarlowTwins-CXR : Enhancing Chest X-Ray abnormality localization in heterogeneous data with cross-domain self-supervised learning | BarlowTwins-CXR:通过跨域自监督学习增强异构数据中的胸部 X 射线异常定位 | Haoyue Sheng, Linrui Ma, Jean-Francois Samson, Dianbo Liu | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Large Language Models for Captioning and Retrieving Remote Sensing Images | 用于字幕和检索遥感图像的大型语言模型 | João Daniel Silva, João Magalhães, Devis Tuia, Bruno Martins | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | FD-Vision Mamba for Endoscopic Exposure Correction | 用于内窥镜曝光校正的 FD-Vision Mamba | Zhuoran Zheng, Jun Zhang | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Towards actionability for open medical imaging datasets: lessons from community-contributed platforms for data management and stewardship | 实现开放医学成像数据集的可操作性:社区贡献的数据管理和管理平台的经验教训 | Amelia Jiménez-Sánchez, Natalia-Rozalia Avlona, Dovile Juodelyte, Théo Sourget, Caroline Vang-Larsen, Hubert Dariusz Zając, Veronika Cheplygina | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Towards Chip-in-the-loop Spiking Neural Network Training via Metropolis-Hastings Sampling | 通过 Metropolis-Hastings 采样进行芯片在环尖峰神经网络训练 | Ali Safa, Vikrant Jaltare, Samira Sebt, Kameron Gano, Johannes Leugering, Georges Gielen, Gert Cauwenberghs | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | The Berkeley Single Cell Computational Microscopy (BSCCM) Dataset | 伯克利单细胞计算显微镜 (BSCCM) 数据集 | Henry Pinkard, Cherry Liu, Fanice Nyatigo, Daniel A. Fletcher, Laura Waller | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Development and validation of an artificial intelligence model to accurately predict spinopelvic parameters | 开发和验证人工智能模型以准确预测脊柱骨盆参数 | Edward S. Harake, Joseph R. Linzey, Cheng Jiang, Rushikesh S. Joshi, Mark M. Zaki, Jaes C. Jones, Siri S. Khalsa, John H. Lee, Zachary Wilseck, Jacob R. Joseph, et al. | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Domain Generalization with Small Data | 小数据领域泛化 | Kecheng Chen, Elena Gal, Hong Yan, Haoliang Li | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | ContPhy: Continuum Physical Concept Learning and Reasoning from Videos | ContPhy:从视频中学习和推理连续物理概念 | Zhicheng Zheng, Xin Yan, Zhenfang Chen, Jingzhou Wang, Qin Zhi Eddie Lim, Joshua B. Tenenbaum, Chuang Gan | arxiv.org/pdf/2402.06… | null |
| 2024-02-09 | Spatially-Attentive Patch-Hierarchical Network with Adaptive Sampling for Motion Deblurring | 具有自适应采样运动去模糊功能的空间注意力补丁分层网络 | Maitreya Suin, Kuldeep Purohit, A. N. Rajagopalan | arxiv.org/pdf/2402.06… | null |