[Share][Daily Update][2024.03.03][CV_arxiv_papers]

[UPDATED!] 2024-03-03 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-03 | Critical windows: non-asymptotic theory for feature emergence in diffusion models | 临界窗口:扩散模型中特征出现的非渐近理论 | Marvin Li, Sitan Chen | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation | SCott:通过随机一致性蒸馏加速扩散模型 | Hongjian Liu, Qingsong Xie, Zhijie Deng, Chen Chen, Shixiang Tang, Fueyang Fu, Zheng-jun Zha, Haonan Lu | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Learning A Physical-aware Diffusion Model Based on Transformer for Underwater Image Enhancement | 学习基于 Transformer 的物理感知扩散模型,用于水下图像增强 | Chen Zhao, Chenyu Dong, Weiling Cai | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Regeneration Based Training-free Attribution of Fake Images Generated by Text-to-Image Generative Models | 基于再生的文本到图像生成模型生成的假图像的免训练归因 | Meiling Li, Zhenxing Qian, Xinpeng Zhang | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Approximations to the Fisher Information Metric of Deep Generative Models for Out-Of-Distribution Detection | 用于分布外检测的深度生成模型的 Fisher 信息度量的近似 | Sam Dauncey, Chris Holmes, Christopher Williams, Fabian Falck | arxiv.org/pdf/2403.01… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-03 | InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding | InfiMM-HD:高分辨率多模态理解的飞跃 | Haogeng Liu, Quanzeng You, Xiaotian Han, Yiqi Wang, Bohan Zhai, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies | MovieLLM:通过人工智能生成的电影增强长视频理解 | Zhende Song, Chenchen Wang, Jiamu Sheng, Chi Zhang, Gang Yu, Jiayuan Fan, Tao Chen | arxiv.org/pdf/2403.01… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-03 | Image2Sentence based Asymmetrical Zero-shot Composed Image Retrieval | 基于Image2Sentence的非对称零样本合成图像检索 | Yongchao Du, Min Wang, Wengang Zhou, Shuping Hui, Houqiang Li | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Logit Standardization in Knowledge Distillation | 知识蒸馏中的 Logit 标准化 | Shangquan Sun, Wenqi Ren, Jingzhi Li, Rui Wang, Xiaochun Cao | arxiv.org/pdf/2403.01… | link |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-03 | AIO2: Online Correction of Object Labels for Deep Learning with Incomplete Annotation in Remote Sensing Image Segmentation | AIO2:遥感图像分割中标注不完整的深度学习对象标签在线校正 | Chenying Liu, Conrad M Albrecht, Yi Wang, Qingyu Li, Xiao Xiang Zhu | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | A Unified Model Selection Technique for Spectral Clustering Based Motion Segmentation | 基于谱聚类的运动分割统一模型选择技术 | Yuxiang Huang, John Zelek | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition | 重新思考跨域开放词汇动作识别中基于 CLIP 的视频学习器 | Kun-Yu Lin, Henghui Ding, Jiaming Zhou, Yi-Xing Peng, Zhilin Zhao, Chen Change Loy, Wei-Shi Zheng | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | Self-Supervised Representation Learning with Meta Comprehensive Regularization | 具有元综合正则化的自监督表示学习 | Huijie Guo, Ying Ba, Jie Hu, Lingyu Si, Wenwen Qiang, Lei Shi | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | CDSE-UNet: Enhancing COVID-19 CT Image Segmentation with Canny Edge Detection and Dual-Path SENet Feature Fusion | CDSE-UNet:通过 Canny 边缘检测和双路径 SENet 特征融合增强 COVID-19 CT 图像分割 | Jiao Ding, Jie Chang, Renrui Han, Li Yang | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | End-to-End Human Instance Matting | 端到端人体实例抠图 | Qinglin Liu, Shengping Zhang, Quanling Meng, Bineng Zhong, Peiqiang Liu, Hongxun Yao | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | EAGLE: Eigen Aggregation Learning for Object-Centric Unsupervised Semantic Segmentation | EAGLE:以对象为中心的无监督语义分割的特征聚合学习 | Chanyoung Kim, Woojung Han, Dayun Ju, Seong Jae Hwang | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | CCC: Color Classified Colorization | CCC:颜色分类着色 | Mrityunjoy Gain, Avi Deb Raha, Rameswar Debnath | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Is in-domain data beneficial in transfer learning for landmarks detection in x-ray images? | 域内数据对于 X 射线图像中地标检测的迁移学习是否有益? | Roberto Di Via, Matteo Santacesaria, Francesca Odone, Vito Paolo Pastore | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Multiview Subspace Clustering of Hyperspectral Images based on Graph Convolutional Networks | 基于图卷积网络的高光谱图像多视图子空间聚类 | Xianju Li, Renxiang Guan, Zihao Li, Hao Liu, Jing Yang | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | GuardT2I: Defending Text-to-Image Models from Adversarial Prompts | GuardT2I:保护文本到图像模型免受对抗性提示的影响 | Yijun Yang, Ruiyuan Gao, Xiao Yang, Jianyuan Zhong, Qiang Xu | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | GPTSee: Enhancing Moment Retrieval and Highlight Detection via Description-Based Similarity Features | GPTSee:通过基于描述的相似性特征增强时刻检索和亮点检测 | Yunzhuo Sun, Yifang Xu, Zien Xie, Yukun Shu, Sidan Du | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | A Simple-but-effective Baseline for Training-free Class-Agnostic Counting | 简单但有效的免训练与类别无关计数的基线 | Yuhao Lin, Haiming Xu, Lingqiao Liu, Javen Qinfeng Shi | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | LUM-ViT: Learnable Under-sampling Mask Vision Transformer for Bandwidth Limited Optical Signal Acquisition | LUM-ViT:用于带宽有限光信号采集的可学习欠采样掩模视觉变压器 | Lingfeng Liu, Dong Ni, Hangjie Yuan | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | Region-Transformer: Self-Attention Region Based Class-Agnostic Point Cloud Segmentation | Region-Transformer:基于自注意力区域的与类无关的点云分割 | Dipesh Gyawali, Jian Zhang, BB Karki | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Enhancing Retinal Vascular Structure Segmentation in Images With a Novel Design Two-Path Interactive Fusion Module Model | 通过新颖设计的两路交互式融合模块模型增强图像中的视网膜血管结构分割 | Rui Yang, Shunpu Zhang | arxiv.org/pdf/2403.01… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-03 | OccFusion: A Straightforward and Effective Multi-Sensor Fusion Framework for 3D Occupancy Prediction | OccFusion:用于 3D 占用预测的简单有效的多传感器融合框架 | Zhenxing Ming, Julie Stephany Berrio, Mao Shan, Stewart Worrall | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Kick Back & Relax++: Scaling Beyond Ground-Truth Depth with SlowTV & CribsTV | Kick Back & Relax++:利用 SlowTV 和 CribsTV 超越真实深度 | Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | Pyramid Feature Attention Network for Monocular Depth Prediction | 用于单目深度预测的金字塔特征注意网络 | Yifang Xu, Chenglei Peng, Ming Li, Yang Li, Sidan Du | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Depth Estimation Algorithm Based on Transformer-Encoder and Feature Fusion | 基于Transformer-Encoder和特征融合的深度估计算法 | Linhan Xia, Junbang Liu, Tong Wu | arxiv.org/pdf/2403.01… | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-03 | SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos | 模式:状态变化对于教学视频中的程序规划很重要 | Yulei Niu, Wenliang Guo, Long Chen, Xudong Lin, Shih-Fu Chang | arxiv.org/pdf/2403.01… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-03 | You Need to Pay Better Attention | 你需要更加注意 | Mehran Hosseini, Peyman Hosseini | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | APISR: Anime Production Inspired Real-World Anime Super-Resolution | APISR:动漫制作启发现实世界动漫超分辨率 | Boyang Wang, Fengyu Yang, Xihang Yu, Chao Zhang, Hanbin Zhao | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images | MatchU:匹配看不见的对象以从 RGB-D 图像进行 6D 姿势估计 | Junwen Huang, Hao Yu, Kuan-Ting Yu, Nassir Navab, Slobodan Ilic, Benjamin Busam | arxiv.org/pdf/2403.01… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-03 | Spectrum AUC Difference (SAUCD): Human-aligned 3D Shape Evaluation | 频谱 AUC 差异 (SAUCD):人体对齐 3D 形状评估 | Tianyu Luan, Zhong Li, Lele Chen, Xuan Gong, Lichang Chen, Yi Xu, Junsong Yuan | arxiv.org/pdf/2403.01… | null |
| 2024-03-05 | 3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos | 3DGStream:3D 高斯的动态训练,用于高效流式传输逼真的自由视点视频 | Jiakai Sun, Han Jiao, Guangyuan Li, Zhanjie Zhang, Lei Zhao, Wei Xing | arxiv.org/pdf/2403.01… | null |
| 2024-03-03 | Unsigned Orthogonal Distance Fields: An Accurate Neural Implicit Representation for Diverse 3D Shapes | 无符号正交距离场:各种 3D 形状的准确神经隐式表示 | Yujie Lu, Long Wan, Nayu Ding, Yulong Wang, Shuhan Shen, Shen Cai, Lin Gao | arxiv.org/pdf/2403.01… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-05 | Efficient Action Counting with Dynamic Queries | 通过动态查询进行高效的操作计数 | Zishi Li, Xiaoxuan Ma, Qiuyan Shang, Wentao Zhu, Hai Ci, Yu Qiao, Yizhou Wang | arxiv.org/pdf/2403.01… | link |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-03 | DUFOMap: Efficient Dynamic Awareness Mapping | DUFOMap:高效的动态感知绘图 | Daniel Duberg, Qingwen Zhang, MingKai Jia, Patric Jensfelt | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis | 动态适配器满足快速调整:用于点云分析的参数高效迁移学习 | Xin Zhou, Dingkang Liang, Wei Xu, Xingkui Zhu, Yihan Xu, Zhikang Zou, Xiang Bai | arxiv.org/pdf/2403.01… | link |
| 2024-03-03 | SA-MixNet: Structure-aware Mixup and Invariance Learning for Scribble-supervised Road Extraction in Remote Sensing Images | SA-MixNet:遥感图像中涂鸦监督道路提取的结构感知混合和不变学习 | Jie Feng, Hao Huang, Junpeng Zhang, Weisheng Dong, Dingwen Zhang, Licheng Jiao | arxiv.org/pdf/2403.01… | null |