[UPDATED!] 2024-03-23 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-23 | Feature Manipulation for DDPM based Change Detection | 基于 DDPM 的变化检测的特征操作 | Zhenglin Li, Yangchen Huang, Mengran Zhu, Jingyu Zhang, JingHao Chang, Houze Liu | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention | X-Portrait:具有分层运动注意力的富有表现力的肖像动画 | You Xie, Hongyi Xu, Guoxian Song, Chao Wang, Yichun Shi, Linjie Luo | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | In-Context Matting | 上下文抠图 | He Guo, Zixuan Ye, Zhiguo Cao, Hao Lu | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Graph Image Prior for Unsupervised Dynamic MRI Reconstruction | 用于无监督动态 MRI 重建的图形图像先验 | Zhongsen Li, Wenxuan Chen, Shuai Wang, Chuyu Liu, Rui Li | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | FusionINN: Invertible Image Fusion for Brain Tumor Monitoring | FusionINN:用于脑肿瘤监测的可逆图像融合 | Nishant Kumar, Ziyan Tao, Jaikirat Singh, Yang Li, Peiwen Sun, Binghui Zhao, Stefan Gumhold | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Contact-aware Human Motion Generation from Textual Descriptions | 根据文本描述生成接触感知人体动作 | Sihan Ma, Qiong Cao, Jing Zhang, Dacheng Tao | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | SceneX: Procedural Controllable Large-scale Scene Generation via Large-language Models | SceneX:通过大语言模型生成程序可控的大规模场景 | Mengqi Zhou, Jun Hou, Chuanchen Luo, Yuxi Wang, Zhaoxiang Zhang, Junran Peng | arxiv.org/pdf/2403.15… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-23 | IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models | IllusionVQA:用于视觉语言模型的具有挑战性的视错觉数据集 | Haz Sameen Shahgir, Khondker Salman Sayeed, Abhik Bhattacharjee, Wasi Uddin Ahmad, Yue Dong, Rifat Shahriyar | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search | PNAS-MOT:使用 Pareto 神经架构搜索进行多模态对象跟踪 | Chensheng Peng, Zhaoyu Zeng, Jinling Gao, Jundong Zhou, Masayoshi Tomizuka, Xinbing Wang, Chenghu Zhou, Nanyang Ye | arxiv.org/pdf/2403.15… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-23 | UPNeRF: A Unified Framework for Monocular 3D Object Reconstruction and Pose Estimation | UPNeRF:单目 3D 对象重建和姿态估计的统一框架 | Yuliang Guo, Abhinav Kumar, Cheng Zhao, Ruoyu Wang, Xinyu Huang, Liu Ren | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Gaussian in the Wild: 3D Gaussian Splatting for Unconstrained Image Collections | 野外高斯:用于无约束图像集合的 3D 高斯泼溅 | Dongbin Zhang, Chuming Wang, Weitao Wang, Peihao Li, Minghan Qin, Haoqian Wang | arxiv.org/pdf/2403.15… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-23 | Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression | 一次一举两得:视觉 Transformer 压缩的单阶段重要性和稀疏性搜索 | Hancheng Ye, Chong Yu, Peng Ye, Renqiu Xia, Yansong Tang, Jiwen Lu, Tao Chen, Bo Zhang | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | iDAT: inverse Distillation Adapter-Tuning | iDAT:逆蒸馏适配器调整 | Jiacheng Ruan, Jingsheng Gao, Mingye Xie, Daize Dong, Suncheng Xiang, Ting Liu, Yuzhuo Fu | arxiv.org/pdf/2403.15… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-23 | Finding needles in a haystack: A Black-Box Approach to Invisible Watermark Detection | 大海捞针:隐形水印检测的黑盒方法 | Minzhou Pan, Zhengting Wang, Xin Dong, Vikash Sehwag, Lingjuan Lyu, Xue Lin | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems | 深度域适应:用于改进眼动追踪系统的 Sim2Real 神经方法 | Viet Dung Nguyen, Reynold Bailey, Gabriel J. Diaz, Chengyi Ma, Alexander Fix, Alexander Ororbia | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Adaptive Super Resolution For One-Shot Talking-Head Generation | 自适应超分辨率一次性生成人头说话 | Luchuan Song, Pinxin Liu, Guojun Yin, Chenliang Xu | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | An Embarrassingly Simple Defense Against Backdoor Attacks On SSL | 针对 SSL 后门攻击的极其简单的防御 | Aryan Satpathy, Nilaksh, Dhruva Rajwade | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | MatchSeg: Towards Better Segmentation via Reference Image Matching | MatchSeg:通过参考图像匹配实现更好的分割 | Ruiqiang Xiao, Jiayu Huo, Haotian Zheng, Yang Liu, Sebastien Ourselin, Rachel Sparks | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | An edge detection-based deep learning approach for tear meniscus height measurement | 基于边缘检测的深度学习方法用于泪液半月板高度测量 | Kesheng Wang, Kunhui Xu, Xiaoyu Chen, Chunlei He, Jianfeng Zhang, Dexing Kong, Qi Dai, Shoujun Huang | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Inpainting-Driven Mask Optimization for Object Removal | 用于对象移除的修复驱动蒙版优化 | Kodai Shimosato, Norimichi Ukita | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | VLM-CPL: Consensus Pseudo Labels from Vision-Language Models for Human Annotation-Free Pathological Image Classification | VLM-CPL:来自视觉语言模型的共识伪标签,用于无人类注释的病理图像分类 | Lanfeng Zhong, Xin Liao, Shaoting Zhang, Xiaofan Zhang, Guotai Wang | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Time-series Initialization and Conditioning for Video-agnostic Stabilization of Video Super-Resolution using Recurrent Networks | 使用循环网络实现与视频无关的视频超分辨率稳定性的时间序列初始化和调节 | Hiroshi Mori, Norimichi Ukita | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking | 用于干扰过滤点云单目标跟踪的时空双向跨帧存储器 | Shaoyu Sun, Chunyang Wang, Xuelian Liu, Chunhao Shi, Yueyang Ding, Guan Xi | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Innovative Quantitative Analysis for Disease Progression Assessment in Familial Cerebral Cavernous Malformations | 家族性脑海绵状血管瘤疾病进展评估的创新定量分析 | Ruige Zong, Tao Wang, Chunwang Li, Xinlin Zhang, Yuanbin Chen, Longxuan Zhao, Qixuan Li, Qinquan Gao, Dezhi Kang, Fuxin Lin, et al. | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Adversarial Defense Teacher for Cross-Domain Object Detection under Poor Visibility Conditions | 能见度较差条件下跨域目标检测的对抗性防御老师 | Kaiwen Wang, Yinzhe Shen, Martin Lauer | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | 3D-TransUNet for Brain Metastases Segmentation in the BraTS2023 Challenge | BraTS2023 挑战赛中用于脑转移瘤分割的 3D-TransUNet | Siwei Yang, Xianhang Li, Jieru Mei, Jieneng Chen, Cihang Xie, Yuyin Zhou | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Technical Report: Masked Skeleton Sequence Modeling for Learning Larval Zebrafish Behavior Latent Embeddings | 技术报告:用于学习斑马鱼幼鱼行为潜在嵌入的掩码骨架序列建模 | Lanxin Xu, Shuo Wang | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | What Do You See in Vehicle? Comprehensive Vision Solution for In-Vehicle Gaze Estimation | 您在车辆中看到什么?用于车内视线估计的综合视觉解决方案 | Yihua Cheng, Yaning Zhu, Zongji Wang, Hongquan Hao, Yongwei Liu, Shiqing Cheng, Xi Wang, Hyung Jin Chang | arxiv.org/pdf/2403.15… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-23 | Depth Estimation fusing Image and Radar Measurements with Uncertain Directions | 融合图像和雷达测量与不确定方向的深度估计 | Masaya Kotani, Takeru Oba, Norimichi Ukita | arxiv.org/pdf/2403.15… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-23 | Temporal-Spatial Object Relations Modeling for Vision-and-Language Navigation | 用于视觉和语言导航的时空对象关系建模 | Bowen Huang, Yanwei Zheng, Chuanlin Lan, Xinpeng Zhao, Dongxiao Yu, Yifei Zou | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | DS-NeRV: Implicit Neural Video Representation with Decomposed Static and Dynamic Codes | DS-NeRV:具有分解的静态和动态代码的隐式神经视频表示 | Hao Yan, Zhihui Ke, Xiaobo Zhou, Tie Qiu, Xidong Shi, Dadong Jiang | arxiv.org/pdf/2403.15… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-23 | Explore until Confident: Efficient Exploration for Embodied Question Answering | 探索直至自信:实体问答的高效探索 | Allen Z. Ren, Jaden Clark, Anushri Dixit, Masha Itkina, Anirudha Majumdar, Dorsa Sadigh | arxiv.org/pdf/2403.15… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-23 | Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents | 迈向类人机器理解:视觉丰富的文档中的少量关系学习 | Hao Wang, Tang Li, Chenhui Chu, Nengjun Zhu, Rui Wang, Pinpin Zhu | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | AOCIL: Exemplar-free Analytic Online Class Incremental Learning with Low Time and Resource Consumption | AOCIL:低时间、低资源消耗的无范例分析型在线类增量学习 | Huiping Zhuang, Yuchen Liu, Run He, Kai Tong, Ziqian Zeng, Cen Chen, Yi Wang, Lap-Pui Chau | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | G-ACIL: Analytic Learning for Exemplar-Free Generalized Class Incremental Learning | G-ACIL:无范例广义类增量学习的分析学习 | Huiping Zhuang, Yizhu Chen, Di Fang, Run He, Kai Tong, Hongxin Wei, Ziqian Zeng, Cen Chen | arxiv.org/pdf/2403.15… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-23 | MapTracker: Tracking with Strided Memory Fusion for Consistent Vector HD Mapping | MapTracker:使用跨步内存融合进行跟踪以实现一致的矢量高清地图 | Jiacheng Chen, Yuefan Wu, Jiaqi Tan, Hang Ma, Yasutaka Furukawa | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Towards Low-Energy Adaptive Personalization for Resource-Constrained Devices | 面向资源受限设备的低能耗自适应个性化 | Yushan Huang, Josh Millar, Yuxuan Long, Yuchen Zhao, Hamed Hadaddi | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Human Motion Prediction under Unexpected Perturbation | 意外扰动下的人体运动预测 | Jiangbei Yue, Baiyi Li, Julien Pettré, Armin Seyfried, He Wang | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Diffusion-based Aesthetic QR Code Generation via Scanning-Robust Perceptual Guidance | 通过扫描稳健感知引导生成基于扩散的美观 QR 码 | Jia-Wei Liao, Winston Wang, Tzu-Sian Wang, Li-Xuan Peng, Cheng-Fu Chou, Jun-Cheng Chen | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Cognitive resilience: Unraveling the proficiency of image-captioning models to interpret masked visual content | 认知弹性:揭示图像字幕模型解释屏蔽视觉内容的熟练程度 | Zhicheng Du, Zhaotian Xie, Huazhang Ying, Likun Zhang, Peiwu Qin | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Centered Masking for Language-Image Pre-Training | 用于语言图像预训练的中心掩蔽 | Mingliang Liang, Martha Larson | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | Ev-Edge: Efficient Execution of Event-based Vision Algorithms on Commodity Edge Platforms | Ev-Edge:在商品边缘平台上高效执行基于事件的视觉算法 | Shrihari Sridharan, Surya Selvam, Kaushik Roy, Anand Raghunathan | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | The Limits of Perception: Analyzing Inconsistencies in Saliency Maps in XAI | 感知的局限性:分析 XAI 显著图中的不一致 | Anna Stubbin, Thompson Chyrikov, Jim Zhao, Christina Chajo | arxiv.org/pdf/2403.15… | null |
| 2024-03-23 | An active learning model to classify animal species in Hong Kong | 香港动物物种分类的主动学习模型 | Gareth Lamb, Ching Hei Lo, Jin Wu, Calvin K. F. Lee | arxiv.org/pdf/2403.15… | null |