[UPDATED!] 2024-03-18 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images | LLaVA-UHD:感知任何长宽比和高分辨率图像的 LMM | Ruyi Xu, Yuan Yao, Zonghao Guo, Junbo Cui, Zanlin Ni, Chunjiang Ge, Tat-Seng Chua, Zhiyuan Liu, Maosong Sun, Gao Huang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Urban Scene Diffusion through Semantic Occupancy Map | 通过语义占用图进行城市场景扩散 | Junge Zhang, Qihang Zhang, Li Zhang, Ramana Rao Kompella, Gaowen Liu, Bolei Zhou | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection | 用于二元任务的二元噪声:用于无监督异常检测的掩蔽伯努利扩散 | Julia Wolleb, Florentin Bieder, Paul Friedrich, Peter Zhang, Alicia Durrer, Philippe C. Cattin | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Diffusion-Based Environment-Aware Trajectory Prediction | 基于扩散的环境感知轨迹预测 | Theodor Westny, Björn Olofsson, Erik Frisk | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Arc2Face: A Foundation Model of Human Faces | Arc2Face:人脸基础模型 | Foivos Paraperas Papantoniou, Alexandros Lattas, Stylianos Moschoglou, Jiankang Deng, Bernhard Kainz, Stefanos Zafeiriou | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models | LoRA-Composer:利用低秩适应在免训练扩散模型中实现多概念定制 | Yang Yang, Wen Wang, Liang Peng, Chaotian Song, Yao Chen, Hengjia Li, Xiaolong Yang, Qinglin Lu, Deng Cai, Boxi Wu, et al. | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | CRS-Diff: Controllable Generative Remote Sensing Foundation Model | CRS-Diff:可控生成遥感基础模型 | Datao Tang, Xiangyong Cao, Xingsong Hou, Zhongyuan Jiang, Deyu Meng | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | EffiVED:Efficient Video Editing via Text-instruction Diffusion Models | EffiVED:通过文本指令扩散模型进行高效视频编辑 | Zhenghao Zhang, Zuozhuo Dai, Long Qin, Weizhi Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | EchoReel: Enhancing Action Generation of Existing Video Diffusion Models | EchoReel:增强现有视频扩散模型的动作生成 | Jianzhi Liu, Junchen Zhu, Lianli Gao, Jingkuan Song | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors | 扩散模型是几何批评家:使用预先训练的扩散先验进行单图像 3D 编辑 | Ruicheng Wang, Jianfeng Xiang, Jiaolong Yang, Xin Tong | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | CasSR: Activating Image Power for Real-World Image Super-Resolution | CasSR:激活图像能力以实现真实世界图像超分辨率 | Haolan Chen, Jinhua Hao, Kai Zhao, Kun Yuan, Ming Sun, Chao Zhou, Wei Hu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | VmambaIR: Visual State Space Model for Image Restoration | VmambaIR:用于图像恢复的视觉状态空间模型 | Yuan Shi, Bin Xia, Xiaoyu Jin, Xing Wang, Tianyu Zhao, Xin Xia, Xuefeng Xiao, Wenming Yang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | DreamSampler: Unifying Diffusion Sampling and Score Distillation for Image Manipulation | DreamSampler:统一图像处理的扩散采样和分数蒸馏 | Jeongsol Kim, Geon Yeong Park, Jong Chul Ye | arxiv.org/pdf/2403.11… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | MASSTAR: A Multi-Modal and Large-Scale Scene Dataset with a Versatile Toolchain for Surface Prediction and Completion | MASSTAR:多模态和大规模场景数据集,具有用于表面预测和完成的多功能工具链 | Guiyong Zheng, Jinqi Jiang, Chen Feng, Shaojie Shen, Boyu Zhou | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | End-to-end multi-modal product matching in fashion e-commerce | 时尚电商端到端多模态产品匹配 | Sándor Tóth, Stephen Wilson, Alexia Tsoukara, Enric Moreu, Anton Masalovich, Lars Roemheld | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | 3DGS-Calib: 3D Gaussian Splatting for Multimodal SpatioTemporal Calibration | 3DGS-Calib:用于多模态时空校准的 3D 高斯泼溅 | Quentin Herau, Moussab Bennehar, Arthur Moreau, Nathan Piasco, Luis Roldao, Dzmitry Tsishkou, Cyrille Migniot, Pascal Vasseur, Cédric Demonceaux | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | OCR is All you need: Importing Multi-Modality into Image-based Defect Detection System | OCR 就是您所需要的:将多模态导入基于图像的缺陷检测系统 | Chih-Chung Hsu, Chia-Ming Lee, Chun-Hung Sun, Kuang-Ming Wu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Sim-to-Real Grasp Detection with Global-to-Local RGB-D Adaptation | 具有全局到局部 RGB-D 适应的模拟到真实抓取检测 | Haoxiang Ma, Ran Qin, Modi Shi, Boyang Gao, Di Huang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding | VideoAgent:用于视频理解的内存增强多模态代理 | Yue Fan, Xiaojian Ma, Rujie Wu, Yuntao Du, Jiaqi Li, Zhi Gao, Qing Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Generative Motion Stylization within Canonical Motion Space | 规范运动空间内的生成运动风格化 | Jiaxu Zhang, Xin Chen, Gang Yu, Zhigang Tu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Path-GPTOmic: A Balanced Multi-modal Learning Framework for Survival Outcome Prediction | Path-GPTOmic:用于生存结果预测的平衡多模态学习框架 | Hongxiao Wang, Yang Yang, Zhuo Zhao, Pengfei Gu, Nishchal Sapkota, Danny Z. Chen | arxiv.org/pdf/2403.11… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | Exploring 3D-aware Latent Spaces for Efficiently Learning Numerous Scenes | 探索 3D 感知潜在空间以有效学习众多场景 | Antoine Schnepf, Karim Kassab, Jean-Yves Franceschi, Laurent Caraffa, Flavian Vasile, Jeremie Mary, Andrew Comport, Valérie Gouet-Brunet | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | UV Gaussians: Joint Learning of Mesh Deformation and Gaussian Textures for Human Avatar Modeling | UV 高斯:用于人体头像建模的网格变形和高斯纹理的联合学习 | Yujiao Jiang, Qingmin Liao, Xiaoyu Li, Li Ma, Qi Zhang, Chaopeng Zhang, Zongqing Lu, Ying Shan | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Just Add $100 More: Augmenting NeRF-based Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem | 只需再加 100 美元:增强基于 NeRF 的伪 LiDAR 点云以解决类不平衡问题 | Mincheol Chang, Siyeong Lee, Jinkyu Kim, Namil Kim | arxiv.org/pdf/2403.11… | null |
3DGS
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | NEDS-SLAM: A Novel Neural Explicit Dense Semantic SLAM Framework using 3D Gaussian Splatting | NEDS-SLAM:使用 3D 高斯泼溅的新型神经显式密集语义 SLAM 框架 | Yiming Ji, Yang Liu, Guanghu Xie, Boyu Ma, Zongwu Xie | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | GaussNav: Gaussian Splatting for Visual Navigation | GaussNav:用于视觉导航的高斯泼溅 | Xiaohan Lei, Min Wang, Wengang Zhou, Houqiang Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Fed3DGS: Scalable 3D Gaussian Splatting with Federated Learning | Fed3DGS:使用联邦学习的可扩展 3D 高斯泼溅 | Teppei Suzuki | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Bridging 3D Gaussian and Mesh for Freeview Video Rendering | 桥接 3D 高斯和网格以进行 Freeview 视频渲染 | Yuting Xiao, Xuan Wang, Jiafei Li, Hongrui Cai, Yanbo Fan, Nan Xue, Minghui Yang, Yujun Shen, Shenghua Gao | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction | 用于高效动态场景重建的运动感知 3D 高斯泼溅 | Zhiyang Guo, Wengang Zhou, Li Li, Min Wang, Houqiang Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | BAGS: Building Animatable Gaussian Splatting from a Monocular Video with Diffusion Priors | BAGS:利用扩散先验从单目视频构建可动画化的高斯泼溅 | Tingyang Zhang, Qingzhe Gao, Weiyu Li, Libin Liu, Baoquan Chen | arxiv.org/pdf/2403.11… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | TrajectoryNAS: A Neural Architecture Search for Trajectory Prediction | TrajectoryNAS:用于轨迹预测的神经架构搜索 | Ali Asghar Sharifi, Ali Zoljodi, Masoud Daneshtalab | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models | TTT-KD:通过基础模型的知识蒸馏进行 3D 语义分割的测试时训练 | Lisa Weijler, Muhammad Jehanzeb Mirza, Leon Sick, Can Ekkazan, Pedro Hermosilla | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Better (pseudo-)labels for semi-supervised instance segmentation | 用于半监督实例分割的更好的(伪)标签 | François Porcher, Camille Couprie, Marc Szafraniec, Jakob Verbeek | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Hierarchical Frequency-based Upsampling and Refining for Compressed Video Quality Enhancement | 用于增强压缩视频质量的基于频率的分层上采样和细化 | Qianyu Zhang, Bolun Zheng, Xinying Chen, Quan Chen, Zhunjie Zhu, Canjin Wang, Zongpeng Li, Chengang Yan | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach | 提高神经架构搜索的保序性和可转移性:联合架构细化搜索和微调方法 | Beichen Zhang, Xiaoxing Wang, Xiaohan Qin, Junchi Yan | arxiv.org/pdf/2403.11… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | A Spatial-Temporal Progressive Fusion Network for Breast Lesion Segmentation in Ultrasound Videos | 用于超声视频中乳腺病变分割的时空渐进融合网络 | Zhengzheng Tu, Zigang Zhu, Yayang Duan, Bo Jiang, Qishun Wang, Chaoxue Zhang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Object Segmentation-Assisted Inter Prediction for Versatile Video Coding | 适用于多功能视频编码的对象分割辅助帧间预测 | Zhuoyuan Li, Zikun Yuan, Li Li, Dong Liu, Xiaohu Tang, Feng Wu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | MoreStyle: Relax Low-frequency Constraint of Fourier-based Image Reconstruction in Generalizable Medical Image Segmentation | MoreStyle:放宽基于傅里叶的图像重建在广义医学图像分割中的低频约束 | Haoyu Zhao, Wenhui Dong, Rui Yu, Zhou Zhao, Du Bo, Yongchao Xu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Normalized Validity Scores for DNNs in Regression based Eye Feature Extraction | 基于回归的眼睛特征提取中 DNN 的归一化有效性得分 | Wolfgang Fuhl | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything Model | LocalStyleFool:使用分段任意模型的区域视频风格转移攻击 | Yuxin Cao, Jinghao Li, Xi Xiao, Derui Wang, Minhui Xue, Hao Ge, Wei Liu, Guangwu Hu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Gridless 2D Recovery of Lines using the Sliding Frank-Wolfe Algorithm | 使用滑动 Frank-Wolfe 算法对线路进行无网格二维恢复 | Kévin Polisano, Basile Dubois-Bonnaire, Sylvain Meignen | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Compositional Kronecker Context Optimization for Vision-Language Models | 视觉语言模型的组合克罗内克上下文优化 | Kun Ding, Xiaohui Li, Qiang Yu, Ying Wang, Haojian Zhang, Shiming Xiang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Multi-View Video-Based Learning: Leveraging Weak Labels for Frame-Level Perception | 基于多视图视频的学习:利用弱标签进行帧级感知 | Vijay John, Yasutomo Kawanishi | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | OurDB: Ouroboric Domain Bridging for Multi-Target Domain Adaptive Semantic Segmentation | OurDB:用于多目标域自适应语义分割的 Ouroboric 域桥接 | Seungbeom Woo, Geonwoo Baek, Taehoon Kim, Jaemin Na, Joong-won Hwang, Wonjun Hwang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation | MISS:通过视觉归纳先验流传播实现内存高效实例分割框架 | Chih-Chung Hsu, Chia-Ming Lee | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes | 复制粘贴前的增强:面向数据和内存效率的体育场景实例分割框架 | Chih-Chung Hsu, Chia-Ming Lee, Ming-Shyen Wu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection | 学习无监督多类异常检测的统一参考表示 | Liren He, Zhengkai Jiang, Jinlong Peng, Liang Liu, Qiangang Du, Xiaobin Hu, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters | 通过混合专家适配器促进视觉语言模型的持续学习 | Jiazuo Yu, Yunzhi Zhuge, Lu Zhang, Dong Wang, Huchuan Lu, You He | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Out-of-Distribution Detection Should Use Conformal Prediction (and Vice-versa?) | 分布外检测应使用保形预测(反之亦然?) | Paul Novello, Joseba Dalmau, Léo Andeol | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Continual Forgetting for Pre-trained Vision Models | 预训练视觉模型的持续遗忘 | Hongbo Zhao, Bolin Ni, Haochen Wang, Junsong Fan, Fei Zhu, Yuxi Wang, Yuntao Chen, Gaofeng Meng, Zhaoxiang Zhang | arxiv.org/pdf/2403.11… | link |
| 2024-03-18 | Video Object Segmentation with Dynamic Query Modulation | 使用动态查询调制进行视频对象分割 | Hantao Zhou, Runze Hu, Xiu Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Circle Representation for Medical Instance Object Segmentation | 用于医疗实例对象分割的圆形表示 | Juming Xiong, Ethan H. Nguyen, Yilin Liu, Ruining Deng, Regina N Tyree, Hernan Correa, Girish Hiremath, Yaohong Wang, Haichun Yang, Agnes B. Fogo, et al. | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Covid-19 detection from CT scans using EfficientNet and Attention mechanism | 使用 EfficientNet 和注意力机制从 CT 扫描中检测 Covid-19 | Ramy Farag, Parth Upadhyay, Guilhermen DeSouza | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Domain Adaptation Using Pseudo Labels for COVID-19 Detection | 使用伪标签进行域适应进行 COVID-19 检测 | Runtian Yuan, Qingqiu Li, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | CCC++: Optimized Color Classified Colorization with Segment Anything Model (SAM) Empowered Object Selective Color Harmonization | CCC++:使用分段任意模型 (SAM) 增强的对象选择性颜色协调来优化颜色分类着色 | Mrityunjoy Gain, Avi Deb Raha, Rameswar Debnath | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Towards understanding the nature of direct functional connectivity in visual brain network | 理解视觉大脑网络中直接功能连接的本质 | Debanjali Bhattacharya, Neelam Sinha | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Collage Prompting: Budget-Friendly Visual Recognition with GPT-4V | 拼贴提示:使用 GPT-4V 进行经济实惠的视觉识别 | Siyu Xu, Yunke Wang, Daochang Liu, Chang Xu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Zero-shot Compound Expression Recognition with Visual Language Model at the 6th ABAW Challenge | 第六届 ABAW 挑战赛上使用视觉语言模型的零样本复合表达识别 | Jiahe Wang, Jiale Huang, Bingzhao Cai, Yifan Cao, Xin Yun, Shangfei Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Robust Overfitting Does Matter: Test-Time Adversarial Purification With FGSM | 鲁棒的过度拟合确实很重要:使用 FGSM 进行测试时对抗性纯化 | Linyu Tang, Lei Zhang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Boosting Continuous Emotion Recognition with Self-Pretraining using Masked Autoencoders, Temporal Convolutional Networks, and Transformers | 使用 Masked Autoencoders、Temporal Convolutional Network 和 Transformers 进行自我预训练来增强连续情绪识别 | Weiwei Zhou, Jiada Lu, Chenkun Ling, Weifeng Wang, Shaowei Liu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation | ShapeFormer:基于形状先验可见到非模态转换器的非模态实例分割 | Minh Tran, Winston Bounsavy, Khoa Vo, Anh Nguyen, Tri Nguyen, Ngan Le | arxiv.org/pdf/2403.11… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | SSAP: A Shape-Sensitive Adversarial Patch for Comprehensive Disruption of Monocular Depth Estimation in Autonomous Navigation Applications | SSAP:一种形状敏感的对抗补丁,用于全面破坏自主导航应用中的单目深度估计 | Amira Guesmi, Muhammad Abdullah Hanif, Ihsen Alouani, Bassem Ouni, Muhammad Shafique | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Investigating the Benefits of Projection Head for Representation Learning | 研究投影头对于表征学习的好处 | Yihao Xue, Eric Gan, Jiayi Ni, Siddharth Joshi, Baharan Mirzasoleiman | arxiv.org/pdf/2403.11… | null |
LLM
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning | Scene-LLM:扩展 3D 视觉理解和推理的语言模型 | Rao Fu, Jingyu Liu, Xilun Chen, Yixin Nie, Wenhan Xiong | arxiv.org/pdf/2403.11… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | QEAN: Quaternion-Enhanced Attention Network for Visual Dance Generation | QEAN:用于视觉舞蹈生成的四元数增强注意力网络 | Zhizhen Zhou, Yejing Huo, Guoheng Huang, An Zeng, Xuhang Chen, Lian Huang, Zinuo Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation | 用于视觉和语言导航的分层空间邻近推理 | Ming Xu, Zilong Xie | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding | 用于弱监督视频段落接地的联合对齐和回归的连体学习 | Chaolei Tan, Jianhuang Lai, Wei-Shi Zheng, Jian-Fang Hu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Benchmarking the Robustness of UAV Tracking Against Common Corruptions | 针对常见图像损坏对无人机跟踪的鲁棒性进行基准测试 | Xiaoqiong Liu, Yunhe Feng, Shu Hu, Xiaohui Yuan, Heng Fan | arxiv.org/pdf/2403.11… | link |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | Prioritized Semantic Learning for Zero-shot Instance Navigation | 零样本实例导航的优先语义学习 | Xander Sun, Louis Lau, Hoyard Zhi, Ronghe Qiu, Junwei Liang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | An Accurate and Real-time Relative Pose Estimation from Triple Point-line Images by Decoupling Rotation and Translation | 通过解耦旋转和平移从三点线图像进行准确实时的相对位姿估计 | Zewen Xu, Yijia He, Hao Wei, Bo Xu, BinJian Xie, Yihong Wu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Personalized 3D Human Pose and Shape Refinement | 个性化 3D 人体姿势和形状细化 | Tom Wehrbein, Bodo Rosenhahn, Iain Matthews, Carsten Stoll | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | DynoSurf: Neural Deformation-based Temporally Consistent Dynamic Surface Reconstruction | DynoSurf:基于神经变形的时间一致动态表面重建 | Yuxin Yao, Siyu Ren, Junhui Hou, Zhi Deng, Juyong Zhang, Wenping Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | TARN-VIST: Topic Aware Reinforcement Network for Visual Storytelling | TARN-VIST:用于视觉叙事的主题感知强化网络 | Weiran Chen, Xin Li, Jiaqi Su, Guiqian Zhu, Ying Li, Yi Ji, Chunping Liu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects | GenFlow:用于新物体 6D 姿态细化的可推广循环流 | Sungphill Moon, Hyeontae Son, Dongcheol Hur, Sangwook Kim | arxiv.org/pdf/2403.11… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | Towards Generalizing to Unseen Domains with Few Labels | 用很少的标签推广到看不见的领域 | Chamuditha Jayanga Galappaththige, Sanoojan Baliah, Malitha Gunawardhana, Muhammad Haris Khan | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Semantic Prompting with Image-Token for Continual Learning | 使用图像令牌进行语义提示以进行持续学习 | Jisu Han, Jaemin Na, Wonjun Hwang | arxiv.org/pdf/2403.11… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-18 | Implicit Discriminative Knowledge Learning for Visible-Infrared Person Re-Identification | 可见光-红外行人重识别的隐式判别知识学习 | Kaijie Ren, Lei Zhang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising | WIA-LD2ND:基于小波的图像对齐,用于自监督低剂量 CT 去噪 | Haoyu Zhao, Guyu Liang, Zhou Zhao, Bo Du, Yongchao Xu, Rui Yu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | MedMerge: Merging Models for Effective Transfer Learning to Medical Imaging Tasks | MedMerge:将有效迁移学习的模型合并到医学成像任务 | Ibrahim Almakky, Santosh Sanjeev, Anees Ur Rehman Hashmi, Mohammad Areeb Qazi, Mohammad Yaqub | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | HSEmotion Team at the 6th ABAW Competition: Facial Expressions, Valence-Arousal and Emotion Intensity Prediction | HSEmotion团队参加第六届ABAW竞赛:面部表情、效价唤醒和情绪强度预测 | Andrey V. Savchenko | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | LogicalDefender: Discovering, Extracting, and Utilizing Common-Sense Knowledge | LogicalDefender:发现、提取和利用常识知识 | Yuhe Liu, Mengxue Kang, Zengchang Qin, Xiangxiang Chu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | End-To-End Underwater Video Enhancement: Dataset and Model | 端到端水下视频增强:数据集和模型 | Dazhao Du, Enhan Li, Lingyu Si, Fanjiang Xu, Jianwei Niu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | MLVICX: Multi-Level Variance-Covariance Exploration for Chest X-ray Self-Supervised Representation Learning | MLVICX:胸部 X 射线自监督表示学习的多级方差-协方差探索 | Azad Singh, Vandan Gorade, Deepak Mishra | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Do CLIPs Always Generalize Better than ImageNet Models? | CLIP 是否总是比 ImageNet 模型具有更好的泛化能力? | Qizhou Wang, Yong Lin, Yongqiang Chen, Ludwig Schmidt, Bo Han, Tong Zhang | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction | SmartRefine:用于高效运动预测的场景自适应细化框架 | Yang Zhou, Hao Shao, Letian Wang, Steven L. Waslander, Hongsheng Li, Yu Liu | arxiv.org/pdf/2403.11… | null |
| 2024-03-18 | Defense Against Adversarial Attacks on No-Reference Image Quality Models with Gradient Norm Regularization | 利用梯度范数正则化防御对无参考图像质量模型的对抗性攻击 | Yujia Liu, Chenxi Yang, Dingquan Li, Jianhao Ding, Tingting Jiang | arxiv.org/pdf/2403.11… | null |