[UPDATED!] 2024-03-06 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-06 | 3D Diffusion Policy | 3D 扩散策略 | Yanjie Ze, Gu Zhang, Kangning Zhang, Chenyuan Hu, Muhan Wang, Huazhe Xu | arxiv.org/pdf/2403.03… | link |
| 2024-03-06 | Latent Dataset Distillation with Diffusion Models | 使用扩散模型进行潜在数据集蒸馏 | Brian B. Moser, Federico Raue, Sebastian Palacio, Stanislav Frolov, Andreas Dengel | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer | 统一生成和压缩:通过多级变压器进行超低比特率图像编码 | Naifu Xue, Qi Mao, Zijian Wang, Yuan Zhang, Siwei Ma | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Generative Active Learning with Variational Autoencoder for Radiology Data Generation in Veterinary Medicine | 使用变分自动编码器生成主动学习,用于兽医放射学数据生成 | In-Gyu Lee, Jun-Young Oh, Hee-Jung Yu, Jae-Hwan Kim, Ki-Dong Eom, Ji-Hoon Jeong | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Dcl-Net: Dual Contrastive Learning Network for Semi-Supervised Multi-Organ Segmentation | Dcl-Net:用于半监督多器官分割的双对比学习网络 | Lu Wen, Zhenghao Feng, Yun Hou, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging | NoiseCollage:基于噪声裁剪和合并的布局感知文本到图像扩散模型 | Takahiro Shirakawa, Seiichi Uchida | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | FLAME Diffuser: Grounded Wildfire Image Synthesis using Mask Guided Diffusion | FLAME Diffuser:使用掩模引导扩散的接地野火图像合成 | Hao Wang, Sayed Pedram Haeri Boroujeni, Xiwen Chen, Ashish Bastola, Huayu Li, Abolfazl Razi | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | DLP-GAN: Learning to Draw Modern Chinese Landscape Photos with Generative Adversarial Network | DLP-GAN:学习用生成对抗网络绘制现代中国风景照片 | Xiangquan Gui, Binxuan Zhang, Li Li, Yi Yang | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Towards Understanding Cross and Self-Attention in Stable Diffusion for Text-Guided Image Editing | 理解文本引导图像编辑稳定扩散中的交叉和自注意力 | Bingyan Liu, Chengyu Wang, Tingfeng Cao, Kui Jia, Jun Huang | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Scene Depth Estimation from Traditional Oriental Landscape Paintings | 东方传统山水画的场景深度估计 | Sungho Kang, YeongHyeon Park, Hyunkyu Park, Juneho Yi | arxiv.org/pdf/2403.03… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-06 | Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning | 语言模型是谜题神童吗?算法难题给多模态推理带来了严峻的挑战 | Deepanway Ghosal, Vernon Toh Yan Han, Chia Yew Ken, Soujanya Poria | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Multimodal Transformer for Comics Text-Cloze | 用于漫画文本完形填空的多模态 Transformer | Emanuele Vivoli, Joan Lafuente Baeza, Ernest Valveny Llobet, Dimosthenis Karatzas | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Causality-based Cross-Modal Representation Learning for Vision-and-Language Navigation | 用于视觉和语言导航的基于因果关系的跨模态表示学习 | Liuyi Wang, Zongtao He, Ronghao Dang, Huiyi Chen, Chengju Liu, Qijun Chen | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Multi-modal Deep Learning | 多模态深度学习 | Chen Yuhua | arxiv.org/pdf/2403.03… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-06 | GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding | GSNeRF:具有增强 3D 场景理解的可泛化语义神经辐射场 | Zi-Ting Chou, Sheng-Yu Huang, I-Jieh Liu, Yu-Chiang Frank Wang | arxiv.org/pdf/2403.03… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-06 | Continual Segmentation with Disentangled Objectness Learning and Class Recognition | 通过解开对象学习和类别识别进行持续分割 | Yizheng Gong, Siyue Yu, Xiaoyang Wang, Jimin Xiao | arxiv.org/pdf/2403.03… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-06 | DART: Implicit Doppler Tomography for Radar Novel View Synthesis | DART:用于雷达新颖视图合成的隐式多普勒断层扫描 | Tianshu Huang, John Miller, Akarsh Prabhakara, Tao Jin, Tarana Laroia, Zico Kolter, Anthony Rowe | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Self and Mixed Supervision to Improve Training Labels for Multi-Class Medical Image Segmentation | 自监督和混合监督改进多类医学图像分割的训练标签 | Jianfei Liu, Christopher Parnell, Ronald M. Summers | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Redefining cystoscopy with ai: bladder cancer diagnosis using an efficient hybrid cnn-transformer model | 用人工智能重新定义膀胱镜检查:使用高效的混合 cnn-transformer 模型诊断膀胱癌 | Meryem Amaouche, Ouassim Karrakchou, Mounir Ghogho, Anouar El Ghazzaly, Mohamed Alami, Ahmed Ameur | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | ECAP: Extensive Cut-and-Paste Augmentation for Unsupervised Domain Adaptive Semantic Segmentation | ECAP:无监督域自适应语义分割的广泛剪切和粘贴增强 | Erik Brorsson, Knut Åkesson, Lennart Svensson, Kristofer Bengtsson | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | MedMamba: Vision Mamba for Medical Image Classification | MedMamba:用于医学图像分类的 Vision Mamba | Yubiao Yue, Zhenzhang Li | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Temporal Enhanced Floating Car Observers | 时间增强型浮动汽车观察器 | Jeremias Gerner, Klaus Bogenberger, Stefanie Schmidtner | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery | Popeye:用于遥感图像多源船舶检测的统一视觉语言模型 | Wei Zhang, Miaoxin Cai, Tong Zhang, Guoqiang Lei, Yin Zhuang, Xuerui Mao | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Learning 3D object-centric representation through prediction | 通过预测学习以 3D 对象为中心的表示 | John Day, Tushar Arora, Jirui Liu, Li Erran Li, Ming Bo Cai | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection | CMDA:基于 LiDAR 的 3D 物体检测的跨模态和域对抗适应 | Gyusam Chang, Wonseok Roh, Sujin Jang, Dongwook Lee, Daehyun Ji, Gyeongrok Oh, Jinsun Park, Jinkyu Kim, Sangpil Kim | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Multi-Grained Cross-modal Alignment for Learning Open-vocabulary Semantic Segmentation from Text Supervision | 用于从文本监督学习开放词汇语义分割的多粒度跨模态对齐 | Yajie Liu, Pu Ge, Qingjie Liu, Di Huang | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Causal Prototype-inspired Contrast Adaptation for Unsupervised Domain Adaptive Semantic Segmentation of High-resolution Remote Sensing Imagery | 高分辨率遥感图像无监督域自适应语义分割的因果原型启发的对比度适应 | Jingru Zhu, Ya Guo, Geng Sun, Liang Hong, Jie Chen | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | MolNexTR: A Generalized Deep Learning Model for Molecular Image Recognition | MolNexTR:分子图像识别的通用深度学习模型 | Yufan Chen, Ching Ting Leung, Yong Huang, Jianwei Sun, Hao Chen, Hanyu Gao | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | 3D Object Visibility Prediction in Autonomous Driving | 自动驾驶中的 3D 物体可见性预测 | Chuanyu Luo, Nuo Cheng, Ren Zhong, Haipeng Jiang, Wenyu Chen, Aoli Wang, Pu Li | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Adversarial Infrared Geometry: Using Geometry to Perform Adversarial Attack against Infrared Pedestrian Detectors | 对抗性红外几何:利用几何对红外行人探测器进行对抗性攻击 | Kalibinuer Tiliwalidi | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Portraying the Need for Temporal Data in Flood Detection via Sentinel-1 | 通过 Sentinel-1 描绘洪水检测中对时间数据的需求 | Xavier Bou, Thibaud Ehret, Rafael Grompone von Gioi, Jeremy Anger | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | On Transfer in Classification: How Well do Subsets of Classes Generalize? | 关于分类中的迁移:类子集的泛化能力如何? | Raphael Baena, Lucas Drumetz, Vincent Gripon | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | VastTrack: Vast Category Visual Object Tracking | VastTrack:大类别视觉对象跟踪 | Liang Peng, Junyuan Gao, Xinran Liu, Weihong Li, Shaohua Dong, Zhipeng Zhang, Heng Fan, Libo Zhang | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Inverse-Free Fast Natural Gradient Descent Method for Deep Learning | 深度学习的无逆快速自然梯度下降法 | Xinwei Ou, Ce Zhu, Xiaolin Huang, Yipeng Liu | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator | 利用任务自适应注意力生成器进行实时自动驾驶的多任务学习 | Wonhyeok Choi, Mingyu Shin, Hyukzae Lee, Jaehoon Cho, Jaehyeon Park, Sunghoon Im | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Interactive Continual Learning Architecture for Long-Term Personalization of Home Service Robots | 用于家庭服务机器人长期个性化的交互式持续学习架构 | Ali Ayub, Chrystopher Nehaniv, Kerstin Dautenhahn | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Kernel Correlation-Dissimilarity for Multiple Kernel k-Means Clustering | 多核 k 均值聚类的核相关性相异性 | Rina Su, Yu Guo, Caiying Wu, Qiyu Jin, Tieyong Zeng | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Advancing Out-of-Distribution Detection through Data Purification and Dynamic Activation Function Design | 通过数据净化和动态激活函数设计推进分布外检测 | Yingrui Ji, Yao Zhu, Zhigang Li, Jiansheng Chen, Yunlong Kong, Jingbo Chen | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Contrastive Learning of Person-independent Representations for Facial Action Unit Detection | 用于面部动作单元检测的独立于人的表示的对比学习 | Yong Li, Shiguang Shan | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Performance Evaluation of Semi-supervised Learning Frameworks for Multi-Class Weed Detection | 多类杂草检测半监督学习框架的性能评估 | Jiajia Li, Dong Chen, Xunyuan Yin, Zhaojian Li | arxiv.org/pdf/2403.03… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-06 | Joint multi-task learning improves weakly-supervised biomarker prediction in computational pathology | 联合多任务学习改善了计算病理学中的弱监督生物标志物预测 | Omar S. M. El Nahhas, Georg Wölflein, Marta Ligero, Tim Lenz, Marko van Treeck, Firas Khader, Daniel Truhn, Jakob Nikolas Kather | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video | 用于水下视频中难以辨别的物体计数的密度引导时间注意力变换器 | Cheng-Yen Yang, Hsiang-Wei Huang, Zhongyu Jiang, Hao Wang, Farron Wallace, Jenq-Neng Hwang | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Slot Abstractors: Toward Scalable Abstract Visual Reasoning | Slot Abstractors:迈向可扩展的抽象视觉推理 | Shanka Subhra Mondal, Jonathan D. Cohen, Taylor W. Webb | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | HDRFlow: Real-Time HDR Video Reconstruction with Large Motions | HDRFlow:大运动的实时 HDR 视频重建 | Gangwei Xu, Yujin Wang, Jinwei Gu, Tianfan Xue, Xin Yang | arxiv.org/pdf/2403.03… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-06 | Self-supervised Photographic Image Layout Representation Learning | 自监督摄影图像布局表示学习 | Zhaoran Zhao, Peng Lu, Xujun Peng, Wenhao Guo | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Extend Your Own Correspondences: Unsupervised Distant Point Cloud Registration by Progressive Distance Extension | 扩展您自己的对应关系:通过渐进距离扩展进行无监督的远程点云配准 | Quan Liu, Hongzi Zhu, Zhenxi Wang, Yunsong Zhou, Shan Chang, Minyi Guo | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Fast, nonlocal and neural: a lightweight high quality solution to image denoising | 快速、非局部和神经:轻量级高质量图像去噪解决方案 | Yu Guo, Axel Davy, Gabriele Facciolo, Jean-Michel Morel, Qiyu Jin | arxiv.org/pdf/2403.03… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-06 | MeaCap: Memory-Augmented Zero-shot Image Captioning | MeaCap:内存增强零样本图像字幕 | Zequn Zeng, Yan Xie, Hao Zhang, Chiyu Chen, Zhengjue Wang, Bo Chen | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Task Attribute Distance for Few-Shot Learning: Theoretical Analysis and Applications | 小样本学习的任务属性距离:理论分析与应用 | Minyang Hu, Hong Chang, Zong Guo, Bingpeng Ma, Shiguan Shan, Xilin Chen | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Boosting Meta-Training with Base Class Information for Few-Shot Learning | 利用基类信息促进元训练以实现少样本学习 | Weihao Jiang, Guodong Liu, Di He, Kun He | arxiv.org/pdf/2403.03… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-06 | Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation | 运动学感知多任务机器人操作的分层扩散策略 | Xiao Ma, Sumit Patidar, Iain Haughton, Stephen James | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | A Precision Drone Landing System using Visual and IR Fiducial Markers and a Multi-Payload Camera | 使用视觉和红外基准标记以及多有效负载相机的精密无人机着陆系统 | Joshua Springer, Gylfi Þór Guðmundsson, Marcel Kyas | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | SUPClust: Active Learning at the Boundaries | SUPClust:边界上的主动学习 | Yuta Ono, Till Aczel, Benjamin Estermann, Roger Wattenhofer | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Bridging Diversity and Uncertainty in Active learning with Self-Supervised Pre-Training | 通过自我监督预训练弥合主动学习的多样性和不确定性 | Paul Doucet, Benjamin Estermann, Till Aczel, Roger Wattenhofer | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Harnessing Meta-Learning for Improving Full-Frame Video Stabilization | 利用元学习提高全帧视频稳定性 | Muhammad Kashif Ali, Eun Woo Im, Dongjin Kim, Tae Hyun Kim | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | HMD-Poser: On-Device Real-time Human Motion Tracking from Scalable Sparse Observations | HMD-Poser:通过可扩展稀疏观测进行设备上实时人体运动跟踪 | Peng Dai, Yang Zhang, Tao Liu, Zhen Fan, Tianyuan Du, Zhuo Su, Xiaozheng Zheng, Zeming Li | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Low-Dose CT Image Reconstruction by Fine-Tuning a UNet Pretrained for Gaussian Denoising for the Downstream Task of Image Enhancement | 通过微调预训练的 UNet 进行低剂量 CT 图像重建,用于图像增强的下游任务的高斯去噪 | Tim Selig, Thomas März, Martin Storath, Andreas Weinmann | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | Gadolinium dose reduction for brain MRI using conditional deep learning | 使用条件深度学习减少脑 MRI 的钆剂量 | Thomas Pinetz, Erich Kobler, Robert Haase, Julian A. Luetkens, Mathias Meetschen, Johannes Haubold, Cornelius Deuschl, Alexander Radbruch, Katerina Deike, Alexander Effland | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | D4C glove-train: solving the RPM and Bongard-logo problem by distributing and Circumscribing concepts | D4C 手套系:通过分布和划线概念解决 RPM 和 Bongard-logo 问题 | Ruizhuo Song, Beiming Yuan | arxiv.org/pdf/2403.03… | null |
| 2024-03-06 | LEAD: Learning Decomposition for Source-free Universal Domain Adaptation | LEAD:无源通用域适应的学习分解 | Sanqing Qu, Tianpei Zou, Lianghua He, Florian Röhrbein, Alois Knoll, Guang Chen, Changjun Jiang | arxiv.org/pdf/2403.03… | null |