[Share][Daily Update][2024.03.18][CV_arxiv_papers]

[UPDATED!] 2024-03-18 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images | LLaVA-UHD:感知任何长宽比和高分辨率图像的 LMM | Ruyi Xu, Yuan Yao, Zonghao Guo, Junbo Cui, Zanlin Ni, Chunjiang Ge, Tat-Seng Chua, Zhiyuan Liu, Maosong Sun, Gao Huang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Urban Scene Diffusion through Semantic Occupancy Map | 通过语义占用图进行城市场景扩散 | Junge Zhang, Qihang Zhang, Li Zhang, Ramana Rao Kompella, Gaowen Liu, Bolei Zhou | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection | 用于二元任务的二元噪声:用于无监督异常检测的掩蔽伯努利扩散 | Julia Wolleb, Florentin Bieder, Paul Friedrich, Peter Zhang, Alicia Durrer, Philippe C. Cattin | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Diffusion-Based Environment-Aware Trajectory Prediction | 基于扩散的环境感知轨迹预测 | Theodor Westny, Björn Olofsson, Erik Frisk | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Arc2Face: A Foundation Model of Human Faces | Arc2Face:人脸基础模型 | Foivos Paraperas Papantoniou, Alexandros Lattas, Stylianos Moschoglou, Jiankang Deng, Bernhard Kainz, Stefanos Zafeiriou | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models | LoRA-Composer:利用低秩适应在免训练扩散模型中实现多概念定制 | Yang Yang, Wen Wang, Liang Peng, Chaotian Song, Yao Chen, Hengjia Li, Xiaolong Yang, Qinglin Lu, Deng Cai, Boxi Wu, et.al. | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | CRS-Diff: Controllable Generative Remote Sensing Foundation Model | CRS-Diff:可控生成遥感基础模型 | Datao Tang, Xiangyong Cao, Xingsong Hou, Zhongyuan Jiang, Deyu Meng | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | EffiVED:Efficient Video Editing via Text-instruction Diffusion Models | EffiVED:通过文本指令扩散模型进行高效视频编辑 | Zhenghao Zhang, Zuozhuo Dai, Long Qin, Weizhi Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | EchoReel: Enhancing Action Generation of Existing Video Diffusion Models | EchoReel:增强现有视频传播模型的动作生成 | Jianzhi liu, Junchen Zhu, Lianli Gao, Jingkuan Song | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors | 扩散模型是几何批评家:使用预先训练的扩散先验进行单图像 3D 编辑 | Ruicheng Wang, Jianfeng Xiang, Jiaolong Yang, Xin Tong | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | CasSR: Activating Image Power for Real-World Image Super-Resolution | CasSR:激活图像能力以实现真实世界图像超分辨率 | Haolan Chen, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou, Wei Hu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | VmambaIR: Visual State Space Model for Image Restoration | VmambaIR:用于图像恢复的视觉状态空间模型 | Yuan Shi, Bin Xia, Xiaoyu Jin, Xing Wang, Tianyu Zhao, Xin Xia, Xuefeng Xiao, Wenming Yang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | DreamSampler: Unifying Diffusion Sampling and Score Distillation for Image Manipulation | DreamSampler:统一图像处理的扩散采样和分数蒸馏 | Jeongsol Kim, Geon Yeong Park, Jong Chul Ye | arxiv.org/pdf/2403.11… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | MASSTAR: A Multi-Modal and Large-Scale Scene Dataset with a Versatile Toolchain for Surface Prediction and Completion | MASSTAR:多模态和大规模场景数据集,具有用于表面预测和完成的多功能工具链 | Guiyong Zheng, Jinqi Jiang, Chen Feng, Shaojie Shen, Boyu Zhou | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | End-to-end multi-modal product matching in fashion e-commerce | 时尚电商端到端多模态产品匹配 | Sándor Tóth, Stephen Wilson, Alexia Tsoukara, Enric Moreu, Anton Masalovich, Lars Roemheld | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | 3DGS-Calib: 3D Gaussian Splatting for Multimodal SpatioTemporal Calibration | 3DGS-Calib:用于多模态时空校准的 3D 高斯泼溅 | Quentin Herau, Moussab Bennehar, Arthur Moreau, Nathan Piasco, Luis Roldao, Dzmitry Tsishkou, Cyrille Migniot, Pascal Vasseur, Cédric Demonceaux | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | OCR is All you need: Importing Multi-Modality into Image-based Defect Detection System | OCR 就是您所需要的:将多模态导入基于图像的缺陷检测系统 | Chih-Chung Hsu, Chia-Ming Lee, Chun-Hung Sun, Kuang-Ming Wu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Sim-to-Real Grasp Detection with Global-to-Local RGB-D Adaptation | 具有全局到局部 RGB-D 适应的模拟到真实抓取检测 | Haoxiang Ma, Ran Qin, Modi shi, Boyang Gao, Di Huang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding | VideoAgent:用于视频理解的内存增强多模态代理 | Yue Fan, Xiaojian Ma, Rujie Wu, Yuntao Du, Jiaqi Li, Zhi Gao, Qing Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Generative Motion Stylization within Canonical Motion Space | 规范运动空间内的生成运动风格化 | Jiaxu Zhang, Xin Chen, Gang Yu, Zhigang Tu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Path-GPTOmic: A Balanced Multi-modal Learning Framework for Survival Outcome Prediction | Path-GPTOmic:用于生存结果预测的平衡多模态学习框架 | Hongxiao Wang, Yang Yang, Zhuo Zhao, Pengfei Gu, Nishchal Sapkota, Danny Z. Chen | arxiv.org/pdf/2403.11… | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | Exploring 3D-aware Latent Spaces for Efficiently Learning Numerous Scenes | 探索 3D 感知潜在空间以有效学习众多场景 | Antoine Schnepf, Karim Kassab, Jean-Yves Franceschi, Laurent Caraffa, Flavian Vasile, Jeremie Mary, Andrew Comport, Valérie Gouet-Brunet | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | UV Gaussians: Joint Learning of Mesh Deformation and Gaussian Textures for Human Avatar Modeling | UV 高斯:用于人体头像建模的网格变形和高斯纹理的联合学习 | Yujiao Jiang, Qingmin Liao, Xiaoyu Li, Li Ma, Qi Zhang, Chaopeng Zhang, Zongqing Lu, Ying Shan | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Just Add $100 More: Augmenting NeRF-based Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem | 只需再加 100 美元:增强基于 NeRF 的伪 LiDAR 点云以解决类不平衡问题 | Mincheol Chang, Siyeong Lee, Jinkyu Kim, Namil Kim | arxiv.org/pdf/2403.11… | null |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | NEDS-SLAM: A Novel Neural Explicit Dense Semantic SLAM Framework using 3D Gaussian Splatting | NEDS-SLAM:使用 3D 高斯分布的新型神经显式密集语义 SLAM 框架 | Yiming Ji, Yang Liu, Guanghu Xie, Boyu Ma, Zongwu Xie | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | GaussNav: Gaussian Splatting for Visual Navigation | GaussNav:用于视觉导航的高斯泼溅 | Xiaohan Lei, Min Wang, Wengang Zhou, Houqiang Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Fed3DGS: Scalable 3D Gaussian Splatting with Federated Learning | Fed3DGS:使用联邦学习的可扩展 3D 高斯分布 | Teppei Suzuki | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Bridging 3D Gaussian and Mesh for Freeview Video Rendering | 桥接 3D 高斯和网格以进行 Freeview 视频渲染 | Yuting Xiao, Xuan Wang, Jiafei Li, Hongrui Cai, Yanbo Fan, Nan Xue, Minghui Yang, Yujun Shen, Shenghua Gao | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction | 用于高效动态场景重建的运动感知 3D 高斯泼溅 | Zhiyang Guo, Wengang Zhou, Li Li, Min Wang, Houqiang Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | BAGS: Building Animatable Gaussian Splatting from a Monocular Video with Diffusion Priors | BAGS:利用扩散先验从单目视频构建可动画化的高斯泼溅 | Tingyang Zhang, Qingzhe Gao, Weiyu Li, Libin Liu, Baoquan Chen | arxiv.org/pdf/2403.11… | null |

Model Compression / Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | TrajectoryNAS: A Neural Architecture Search for Trajectory Prediction | TrajectoryNAS:用于轨迹预测的神经架构搜索 | Ali Asghar Sharifi, Ali Zoljodi, Masoud Daneshtalab | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models | TTT-KD:通过基础模型的知识蒸馏进行 3D 语义分割的测试时训练 | Lisa Weijler, Muhammad Jehanzeb Mirza, Leon Sick, Can Ekkazan, Pedro Hermosilla | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Better (pseudo-)labels for semi-supervised instance segmentation | 用于半监督实例分割的更好的(伪)标签 | François Porcher, Camille Couprie, Marc Szafraniec, Jakob Verbeek | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Hierarchical Frequency-based Upsampling and Refining for Compressed Video Quality Enhancement | 用于增强压缩视频质量的基于频率的分层上采样和细化 | Qianyu Zhang, Bolun Zheng, Xinying Chen, Quan Chen, Zhunjie Zhu, Canjin Wang, Zongpeng Li, Chengang Yan | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach | 提高神经架构搜索的保序性和可转移性:联合架构细化搜索和微调方法 | Beichen Zhang, Xiaoxing Wang, Xiaohan Qin, Junchi Yan | arxiv.org/pdf/2403.11… | null |

Classification / Detection / Recognition / Segmentation / ...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | A Spatial-Temporal Progressive Fusion Network for Breast Lesion Segmentation in Ultrasound Videos | 用于超声视频中乳腺病变分割的时空渐进融合网络 | Zhengzheng Tu, Zigang Zhu, Yayang Duan, Bo Jiang, Qishun Wang, Chaoxue Zhang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Object Segmentation-Assisted Inter Prediction for Versatile Video Coding | 适用于多功能视频编码的对象分割辅助帧间预测 | Zhuoyuan Li, Zikun Yuan, Li Li, Dong Liu, Xiaohu Tang, Feng Wu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | MoreStyle: Relax Low-frequency Constraint of Fourier-based Image Reconstruction in Generalizable Medical Image Segmentation | MoreStyle:放宽基于傅里叶的图像重建在广义医学图像分割中的低频约束 | Haoyu Zhao, Wenhui Dong, Rui Yu, Zhou Zhao, Du Bo, Yongchao Xu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Normalized Validity Scores for DNNs in Regression based Eye Feature Extraction | 基于回归的眼睛特征提取中 DNN 的归一化有效性得分 | Wolfgang Fuhl | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything Model | LocalStyleFool:使用分段任意模型的区域视频风格转移攻击 | Yuxin Cao, Jinghao Li, Xi Xiao, Derui Wang, Minhui Xue, Hao Ge, Wei Liu, Guangwu Hu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Gridless 2D Recovery of Lines using the Sliding Frank-Wolfe Algorithm | 使用滑动 Frank-Wolfe 算法对线路进行无网格二维恢复 | Kévin Polisano, Basile Dubois-Bonnaire, Sylvain Meignen | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Compositional Kronecker Context Optimization for Vision-Language Models | 视觉语言模型的组合克罗内克上下文优化 | Kun Ding, Xiaohui Li, Qiang Yu, Ying Wang, Haojian Zhang, Shiming Xiang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Multi-View Video-Based Learning: Leveraging Weak Labels for Frame-Level Perception | 基于多视图视频的学习:利用弱标签进行帧级感知 | Vijay John, Yasutomo Kawanishi | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | OurDB: Ouroboric Domain Bridging for Multi-Target Domain Adaptive Semantic Segmentation | OurDB:用于多目标域自适应语义分割的 Ouroboric 域桥接 | Seungbeom Woo, Geonwoo Baek, Taehoon Kim, Jaemin Na, Joong-won Hwang, Wonjun Hwang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation | MISS:通过视觉归纳先验流传播实现内存高效实例分割框架 | Chih-Chung Hsu, Chia-Ming Lee | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes | 复制粘贴前的增强:面向数据和内存效率的体育场景实例分割框架 | Chih-Chung Hsu, Chia-Ming Lee, Ming-Shyen Wu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection | 学习无监督多类异常检测的统一参考表示 | Liren He, Zhengkai Jiang, Jinlong Peng, Liang Liu, Qiangang Du, Xiaobin Hu, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | 通过混合专家适配器促进视觉语言模型的持续学习 | Jiazuo Yu, Yunzhi Zhuge, Lu Zhang, Dong Wang, Huchuan Lu, You He | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Out-of-Distribution Detection Should Use Conformal Prediction (and Vice-versa?) | 分布外检测应使用保形预测(反之亦然?) | Paul Novello, Joseba Dalmau, Léo Andeol | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Continual Forgetting for Pre-trained Vision Models | 预训练视觉模型的持续遗忘 | Hongbo Zhao, Bolin Ni, Haochen Wang, Junsong Fan, Fei Zhu, Yuxi Wang, Yuntao Chen, Gaofeng Meng, Zhaoxiang Zhang | arxiv.org/pdf/2403.11… | link |
| 2024-03-18 | Video Object Segmentation with Dynamic Query Modulation | 使用动态查询调制进行视频对象分割 | Hantao Zhou, Runze Hu, Xiu Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Circle Representation for Medical Instance Object Segmentation | 用于医疗实例对象分割的圆形表示 | Juming Xiong, Ethan H. Nguyen, Yilin Liu, Ruining Deng, Regina N Tyree, Hernan Correa, Girish Hiremath, Yaohong Wang, Haichun Yang, Agnes B. Fogo, et.al. | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Covid-19 detection from CT scans using EfficientNet and Attention mechanism | 使用 EfficientNet 和注意力机制从 CT 扫描中检测 Covid-19 | Ramy Farag, Parth Upadhyay, Guilhermen DeSouza | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Domain Adaptation Using Pseudo Labels for COVID-19 Detection | 使用伪标签进行域适应进行 COVID-19 检测 | Runtian Yuan, Qingqiu Li, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | CCC++: Optimized Color Classified Colorization with Segment Anything Model (SAM) Empowered Object Selective Color Harmonization | CCC++:使用分段任意模型 (SAM) 增强的对象选择性颜色协调来优化颜色分类着色 | Mrityunjoy Gain, Avi Deb Raha, Rameswar Debnath | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Towards understanding the nature of direct functional connectivity in visual brain network | 理解视觉大脑网络中直接功能连接的本质 | Debanjali Bhattacharya, Neelam Sinha | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Collage Prompting: Budget-Friendly Visual Recognition with GPT-4V | 拼贴提示:使用 GPT-4V 进行经济实惠的视觉识别 | Siyu Xu, Yunke Wang, Daochang Liu, Chang Xu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Zero-shot Compound Expression Recognition with Visual Language Model at the 6th ABAW Challenge | 第六届 ABAW 挑战赛上使用视觉语言模型的零样本复合表达识别 | Jiahe Wang, Jiale Huang, Bingzhao Cai, Yifan Cao, Xin Yun, Shangfei Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Robust Overfitting Does Matter: Test-Time Adversarial Purification With FGSM | 鲁棒的过度拟合确实很重要:使用 FGSM 进行测试时对抗性纯化 | Linyu Tang, Lei Zhang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Boosting Continuous Emotion Recognition with Self-Pretraining using Masked Autoencoders, Temporal Convolutional Networks, and Transformers | 使用 Masked Autoencoders、Temporal Convolutional Network 和 Transformers 进行自我预训练来增强连续情绪识别 | Weiwei Zhou, Jiada Lu, Chenkun Ling, Weifeng Wang, Shaowei Liu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation | ShapeFormer:基于形状先验可见到非模态转换器的非模态实例分割 | Minh Tran, Winston Bounsavy, Khoa Vo, Anh Nguyen, Tri Nguyen, Ngan Le | arxiv.org/pdf/2403.11… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | SSAP: A Shape-Sensitive Adversarial Patch for Comprehensive Disruption of Monocular Depth Estimation in Autonomous Navigation Applications | SSAP:一种形状敏感的对抗补丁,用于全面破坏自主导航应用中的单目深度估计 | Amira Guesmi, Muhammad Abdullah Hanif, Ihsen Alouani, Bassem Ouni, Muhammad Shafique | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Investigating the Benefits of Projection Head for Representation Learning | 研究投影头对于表征学习的好处 | Yihao Xue, Eric Gan, Jiayi Ni, Siddharth Joshi, Baharan Mirzasoleiman | arxiv.org/pdf/2403.11… | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning | Scene-LLM:扩展 3D 视觉理解和推理的语言模型 | Rao Fu, Jingyu Liu, Xilun Chen, Yixin Nie, Wenhan Xiong | arxiv.org/pdf/2403.11… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | QEAN: Quaternion-Enhanced Attention Network for Visual Dance Generation | QEAN:用于视觉舞蹈生成的四元数增强注意力网络 | Zhizhen Zhou, Yejing Huo, Guoheng Huang, An Zeng, Xuhang Chen, Lian Huang, Zinuo Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation | 用于视觉和语言导航的分层空间邻近推理 | Ming Xu, Zilong Xie | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding | 用于弱监督视频段落接地的联合对齐和回归的连体学习 | Chaolei Tan, Jianhuang Lai, Wei-Shi Zheng, Jian-Fang Hu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Benchmarking the Robustness of UAV Tracking Against Common Corruptions | 针对常见腐败情况对无人机跟踪的鲁棒性进行基准测试 | Xiaoqiong Liu, Yunhe Feng, Shu Hu, Xiaohui Yuan, Heng Fan | arxiv.org/pdf/2403.11… | link |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | Prioritized Semantic Learning for Zero-shot Instance Navigation | 零样本实例导航的优先语义学习 | Xander Sun, Louis Lau, Hoyard Zhi, Ronghe Qiu, Junwei Liang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | An Accurate and Real-time Relative Pose Estimation from Triple Point-line Images by Decoupling Rotation and Translation | 通过解耦旋转和平移从三点线图像进行准确实时的相对位姿估计 | Zewen Xu, Yijia He, Hao Wei, Bo Xu, BinJian Xie, Yihong Wu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Personalized 3D Human Pose and Shape Refinement | 个性化 3D 人体姿势和形状细化 | Tom Wehrbein, Bodo Rosenhahn, Iain Matthews, Carsten Stoll | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | DynoSurf: Neural Deformation-based Temporally Consistent Dynamic Surface Reconstruction | DynoSurf:基于神经变形的时间一致动态表面重建 | Yuxin Yao, Siyu Ren, Junhui Hou, Zhi Deng, Juyong Zhang, Wenping Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | TARN-VIST: Topic Aware Reinforcement Network for Visual Storytelling | TARN-VIST:用于视觉叙事的主题感知强化网络 | Weiran Chen, Xin Li, Jiaqi Su, Guiqian Zhu, Ying Li, Yi Ji, Chunping Liu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects | GenFlow:用于新物体 6D 姿态细化的可推广循环流 | Sungphill Moon, Hyeontae Son, Dongcheol Hur, Sangwook Kim | arxiv.org/pdf/2403.11… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | Towards Generalizing to Unseen Domains with Few Labels | 用很少的标签推广到看不见的领域 | Chamuditha Jayanga Galappaththige, Sanoojan Baliah, Malitha Gunawardhana, Muhammad Haris Khan | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Semantic Prompting with Image-Token for Continual Learning | 使用图像令牌进行语义提示以进行持续学习 | Jisu Han, Jaemin Na, Wonjun Hwang | arxiv.org/pdf/2403.11… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-18 | Implicit Discriminative Knowledge Learning for Visible-Infrared Person Re-Identification | 可见光-红外行人重识别的隐式判别知识学习 | Kaijie Ren, Lei Zhang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising | WIA-LD2ND:基于小波的图像对齐,用于自监督低剂量 CT 去噪 | Haoyu Zhao, Guyu Liang, Zhou Zhao, Bo Du, Yongchao Xu, Rui Yu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | MedMerge: Merging Models for Effective Transfer Learning to Medical Imaging Tasks | MedMerge:将有效迁移学习的模型合并到医学成像任务 | Ibrahim Almakky, Santosh Sanjeev, Anees Ur Rehman Hashmi, Mohammad Areeb Qazi, Mohammad Yaqub | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | HSEmotion Team at the 6th ABAW Competition: Facial Expressions, Valence-Arousal and Emotion Intensity Prediction | HSEmotion团队参加第六届ABAW竞赛:面部表情、效价唤醒和情绪强度预测 | Andrey V. Savchenko | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | LogicalDefender: Discovering, Extracting, and Utilizing Common-Sense Knowledge | LogicalDefender:发现、提取和利用常识知识 | Yuhe Liu, Mengxue Kang, Zengchang Qin, Xiangxiang Chu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | End-To-End Underwater Video Enhancement: Dataset and Model | 端到端水下视频增强:数据集和模型 | Dazhao Du, Enhan Li, Lingyu Si, Fanjiang Xu, Jianwei Niu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | MLVICX: Multi-Level Variance-Covariance Exploration for Chest X-ray Self-Supervised Representation Learning | MLVICX:胸部 X 射线自监督表示学习的多级方差-协方差探索 | Azad Singh, Vandan Gorade, Deepak Mishra | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Do CLIPs Always Generalize Better than ImageNet Models? | CLIP 是否总是比 ImageNet 模型具有更好的泛化能力? | Qizhou Wang, Yong Lin, Yongqiang Chen, Ludwig Schmidt, Bo Han, Tong Zhang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | SmartRefine: An Scenario-Adaptive Refinement Framework for Efficient Motion Prediction | SmartRefine:用于高效运动预测的场景自适应细化框架 | Yang Zhou, Hao Shao, Letian Wang, Steven L. Waslander, Hongsheng Li, Yu Liu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Defense Against Adversarial Attacks on No-Reference Image Quality Models with Gradient Norm Regularization | 利用梯度范数正则化防御对无参考图像质量模型的对抗性攻击 | Yujia Liu, Chenxi Yang, Dingquan Li, Jianhao Ding, Tingting Jiang | arxiv.org/pdf/2403.11… | null |