[UPDATED!] 2024-03-03 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-03 | Critical windows: non-asymptotic theory for feature emergence in diffusion models | 临界窗口:扩散模型中特征出现的非渐近理论 | Marvin Li, Sitan Chen | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation | SCott:通过随机一致性蒸馏加速扩散模型 | Hongjian Liu, Qingsong Xie, Zhijie Deng, Chen Chen, Shixiang Tang, Fueyang Fu, Zheng-jun Zha, Haonan Lu | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Learning A Physical-aware Diffusion Model Based on Transformer for Underwater Image Enhancement | 学习基于 Transformer 的物理感知扩散模型,用于水下图像增强 | Chen Zhao, Chenyu Dong, Weiling Cai | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Regeneration Based Training-free Attribution of Fake Images Generated by Text-to-Image Generative Models | 基于再生的文本到图像生成模型生成的假图像的免训练归因 | Meiling Li, Zhenxing Qian, Xinpeng Zhang | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Approximations to the Fisher Information Metric of Deep Generative Models for Out-Of-Distribution Detection | 用于分布外检测的深度生成模型的 Fisher 信息度量的近似 | Sam Dauncey, Chris Holmes, Christopher Williams, Fabian Falck | arxiv.org/pdf/2403.01… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-03 | InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding | InfiMM-HD:高分辨率多模态理解的飞跃 | Haogeng Liu, Quanzeng You, Xiaotian Han, Yiqi Wang, Bohan Zhai, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies | MovieLLM:通过人工智能生成的电影增强长视频理解 | Zhende Song, Chenchen Wang, Jiamu Sheng, Chi Zhang, Gang Yu, Jiayuan Fan, Tao Chen | arxiv.org/pdf/2403.01… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-03 | Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval | 基于Image2Sentence的非对称零样本合成图像检索 | Yongchao Du, Min Wang, Wengang Zhou, Shuping Hui, Houqiang Li | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Logit Standardization in Knowledge Distillation | 知识蒸馏中的 Logit 标准化 | Shangquan Sun, Wenqi Ren, Jingzhi Li, Rui Wang, Xiaochun Cao | arxiv.org/pdf/2403.01… | link |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-03 | AIO2: Online Correction of Object Labels for Deep Learning with Incomplete Annotation in Remote Sensing Image Segmentation | AIO2:遥感图像分割中标注不完整的深度学习对象标签在线校正 | Chenying Liu, Conrad M Albrecht, Yi Wang, Qingyu Li, Xiao Xiang Zhu | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | A Unified Model Selection Technique for Spectral Clustering Based Motion Segmentation | 基于谱聚类的运动分割统一模型选择技术 | Yuxiang Huang, John Zelek | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition | 重新思考跨域开放词汇动作识别中基于 CLIP 的视频学习器 | Kun-Yu Lin, Henghui Ding, Jiaming Zhou, Yi-Xing Peng, Zhilin Zhao, Chen Change Loy, Wei-Shi Zheng | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | Self-Supervised Representation Learning with Meta Comprehensive Regularization | 具有元综合正则化的自监督表示学习 | Huijie Guo, Ying Ba, Jie Hu, Lingyu Si, Wenwen Qiang, Lei Shi | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | CDSE-UNet: Enhancing COVID-19 CT Image Segmentation with Canny Edge Detection and Dual-Path SENet Feature Fusion | CDSE-UNet:通过 Canny 边缘检测和双路径 SENet 特征融合增强 COVID-19 CT 图像分割 | Jiao Ding, Jie Chang, Renrui Han, Li Yang | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | End-to-End Human Instance Matting | 端到端人体实例抠图 | Qinglin Liu, Shengping Zhang, Quanling Meng, Bineng Zhong, Peiqiang Liu, Hongxun Yao | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | EAGLE: Eigen Aggregation Learning for Object-Centric Unsupervised Semantic Segmentation | EAGLE:以对象为中心的无监督语义分割的特征聚合学习 | Chanyoung Kim, Woojung Han, Dayun Ju, Seong Jae Hwang | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | CCC: Color Classified Colorization | CCC:颜色分类着色 | Mrityunjoy Gain, Avi Deb Raha, Rameswar Debnath | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Is in-domain data beneficial in transfer learning for landmarks detection in x-ray images? | 域内数据对于 X 射线图像中地标检测的迁移学习是否有益? | Roberto Di Via, Matteo Santacesaria, Francesca Odone, Vito Paolo Pastore | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Multiview Subspace Clustering of Hyperspectral Images based on Graph Convolutional Networks | 基于图卷积网络的高光谱图像多视图子空间聚类 | Xianju Li, Renxiang Guan, Zihao Li, Hao Liu, Jing Yang | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | GuardT2I: Defending Text-to-Image Models from Adversarial Prompts | GuardT2I:保护文本到图像模型免受对抗性提示的影响 | Yijun Yang, Ruiyuan Gao, Xiao Yang, Jianyuan Zhong, Qiang Xu | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | GPTSee: Enhancing Moment Retrieval and Highlight Detection via Description-Based Similarity Features | GPTSee:通过基于描述的相似性特征增强时刻检索和亮点检测 | Yunzhuo Sun, Yifang Xu, Zien Xie, Yukun Shu, Sidan Du | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | A Simple-but-effective Baseline for Training-free Class-Agnostic Counting | 简单但有效的免训练与类别无关计数的基线 | Yuhao Lin, Haiming Xu, Lingqiao Liu, Javen Qinfeng Shi | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | LUM-ViT: Learnable Under-sampling Mask Vision Transformer for Bandwidth Limited Optical Signal Acquisition | LUM-ViT:用于带宽有限光信号采集的可学习欠采样掩模视觉 Transformer | Lingfeng Liu, Dong Ni, Hangjie Yuan | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | Region-Transformer: Self-Attention Region Based Class-Agnostic Point Cloud Segmentation | Region-Transformer:基于自注意力区域的与类无关的点云分割 | Dipesh Gyawali, Jian Zhang, BB Karki | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Enhancing Retinal Vascular Structure Segmentation in Images With a Novel Design Two-Path Interactive Fusion Module Model | 通过新颖设计的两路交互式融合模块模型增强图像中的视网膜血管结构分割 | Rui Yang, Shunpu Zhang | arxiv.org/pdf/2403.01… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-03 | OccFusion: A Straightforward and Effective Multi-Sensor Fusion Framework for 3D Occupancy Prediction | OccFusion:用于 3D 占用预测的简单有效的多传感器融合框架 | Zhenxing Ming, Julie Stephany Berrio, Mao Shan, Stewart Worrall | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Kick Back & Relax++: Scaling Beyond Ground-Truth Depth with SlowTV & CribsTV | Kick Back & Relax++:利用 SlowTV 和 CribsTV 超越真实深度 | Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | Pyramid Feature Attention Network for Monocular Depth Prediction | 用于单目深度预测的金字塔特征注意网络 | Yifang Xu, Chenglei Peng, Ming Li, Yang Li, Sidan Du | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Depth Estimation Algorithm Based on Transformer-Encoder and Feature Fusion | 基于Transformer-Encoder和特征融合的深度估计算法 | Linhan Xia, Junbang Liu, Tong Wu | arxiv.org/pdf/2403.01… | null |
LLM
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-03 | SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos | SCHEMA:状态变化对于教学视频中的程序规划很重要 | Yulei Niu, Wenliang Guo, Long Chen, Xudong Lin, Shih-Fu Chang | arxiv.org/pdf/2403.01… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-03 | You Need to Pay Better Attention | 你需要更加注意 | Mehran Hosseini, Peyman Hosseini | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | APISR: Anime Production Inspired Real-World Anime Super-Resolution | APISR:动漫制作启发现实世界动漫超分辨率 | Boyang Wang, Fengyu Yang, Xihang Yu, Chao Zhang, Hanbin Zhao | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images | MatchU:匹配看不见的对象以从 RGB-D 图像进行 6D 姿势估计 | Junwen Huang, Hao Yu, Kuan-Ting Yu, Nassir Navab, Slobodan Ilic, Benjamin Busam | arxiv.org/pdf/2403.01… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-03 | Spectrum AUC Difference (SAUCD): Human-aligned 3D Shape Evaluation | 频谱 AUC 差异 (SAUCD):与人类对齐的 3D 形状评估 | Tianyu Luan, Zhong Li, Lele Chen, Xuan Gong, Lichang Chen, Yi Xu, Junsong Yuan | arxiv.org/pdf/2403.01… | null |
| 2024-03-05 | 3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos | 3DGStream:3D 高斯的动态训练,用于高效流式传输逼真的自由视点视频 | Jiakai Sun, Han Jiao, Guangyuan Li, Zhanjie Zhang, Lei Zhao, Wei Xing | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Unsigned Orthogonal Distance Fields: An Accurate Neural Implicit Representation for Diverse 3D Shapes | 无符号正交距离场:各种 3D 形状的准确神经隐式表示 | Yujie Lu, Long Wan, Nayu Ding, Yulong Wang, Shuhan Shen, Shen Cai, Lin Gao | arxiv.org/pdf/2403.01… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-05 | Efficient Action Counting with Dynamic Queries | 通过动态查询进行高效的动作计数 | Zishi Li, Xiaoxuan Ma, Qiuyan Shang, Wentao Zhu, Hai Ci, Yu Qiao, Yizhou Wang | arxiv.org/pdf/2403.01… | link |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-03 | DUFOMap: Efficient Dynamic Awareness Mapping | DUFOMap:高效的动态感知绘图 | Daniel Duberg, Qingwen Zhang, MingKai Jia, Patric Jensfelt | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis | 动态适配器遇上提示微调:用于点云分析的参数高效迁移学习 | Xin Zhou, Dingkang Liang, Wei Xu, Xingkui Zhu, Yihan Xu, Zhikang Zou, Xiang Bai | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | SA-MixNet: Structure-aware Mixup and Invariance Learning for Scribble-supervised Road Extraction in Remote Sensing Images | SA-MixNet:遥感图像中涂鸦监督道路提取的结构感知混合和不变学习 | Jie Feng, Hao Huang, Junpeng Zhang, Weisheng Dong, Dingwen Zhang, Licheng Jiao | arxiv.org/pdf/2403.01… | null |