UPDATED: 2024-01-10
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-10 | URHand: Universal Relightable Hands | URHand:通用可重复照明手 | Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, et.al. | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Score Distillation Sampling with Learned Manifold Corrective | 使用学习流形校正对蒸馏采样进行评分 | Thiemo Alldieck, Nikos Kolotouros, Cristian Sminchisescu | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | CLIP-guided Source-free Object Detection in Aerial Images | CLIP 引导的航空图像中的无源物体检测 | Nanqing Liu, Xun Xu, Yongyi Su, Chengxin Liu, Peiliang Gong, Heng-Chao Li | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Derm-T2IM: Harnessing Synthetic Skin Lesion Data via Stable Diffusion Models for Enhanced Skin Disease Classification using ViT and CNN | Derm-T2IM:通过稳定扩散模型利用合成皮肤病变数据,使用 ViT 和 CNN 增强皮肤疾病分类 | Muhammad Ali Farooq, Wang Yao, Michael Schukat, Mark A Little, Peter Corcoran | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | CrossDiff: Exploring Self-Supervised Representation of Pansharpening via Cross-Predictive Diffusion Model | CrossDiff:通过交叉预测扩散模型探索全色锐化的自监督表示 | Yinghui Xing, Litao Qu, ShiZhou Zhang, Xiuwei Zhang, Yanning Zhang | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | SwiMDiff: Scene-wide Matching Contrastive Learning with Diffusion Constraint for Remote Sensing Image | SwiMDiff:遥感图像具有扩散约束的全场景匹配对比学习 | Jiayuan Tian, Jie Lei, Jiaqing Zhang, Weiying Xie, Yunsong Li | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Less is More : A Closer Look at Multi-Modal Few-Shot Learning | 少即是多:仔细观察多模态少样本学习 | Chunpeng Zhou, Haishuai Wang, Xilu Yuan, Zhi Yu, Jiajun Bu | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | ECC-PolypDet: Enhanced CenterNet with Contrastive Learning for Automatic Polyp Detection | ECC-PolypDet:具有对比学习的增强型 CenterNet,用于自动息肉检测 | Yuncheng Jiang, Zixun Zhang, Yiwen Hu, Guanbin Li, Xiang Wan, Song Wu, Shuguang Cui, Silin Huang, Zhen Li | arxiv.org/pdf/2401.04… | null |
| 2024-01-10 | Latency-aware Road Anomaly Segmentation in Videos: A Photorealistic Dataset and New Metrics | 视频中的延迟感知道路异常分割:真实感数据集和新指标 | Beiwen Tian, Huan-ang Gao, Leiyao Cui, Yupeng Zheng, Lan Luo, Baofeng Wang, Rong Zhi, Guyue Zhou, Hao Zhao | arxiv.org/pdf/2401.04… | null |
| 2024-01-10 | Modality-Aware Representation Learning for Zero-shot Sketch-based Image Retrieval | 用于基于零样本草图的图像检索的模态感知表示学习 | Eunyi Lyou, Doyeon Lee, Jooeun Kim, Joonseok Lee | arxiv.org/pdf/2401.04… | null |
Classification/Detection/Recognition/Segmentation
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-10 | Towards Online Sign Language Recognition and Translation | 走向在线手语识别和翻译 | Ronglai Zuo, Fangyun Wei, Brian Mak | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video | ANIM-400K:用于视频自动端到端配音的大规模数据集 | Kevin Cai, Chonghua Liu, David M. Chan | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Strategic Client Selection to Address Non-IIDness in HAPS-enabled FL Networks | 战略客户选择以解决支持 HAPS 的 FL 网络中的非独立同分布问题 | Amin Farajzadeh, Animesh Yadav, Halim Yanikomeroglu | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Enhanced Muscle and Fat Segmentation for CT-Based Body Composition Analysis: A Comparative Study | 基于 CT 的身体成分分析的增强肌肉和脂肪分割:比较研究 | Benjamin Hou, Tejas Sudharshan Mathai, Jianfei Liu, Christopher Parnell, Ronald M. Summers | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Do Vision and Language Encoders Represent the World Similarly? | 视觉和语言编码器是否同样代表世界? | Mayug Maniparambil, Raiymbek Akshulakov, Yasser Abdelaziz Dahou Djilali, Sanath Narayan, Mohamed El Amine Seddik, Karttikeya Mangalam, Noel E. O'Connor | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Video-based Automatic Lameness Detection of Dairy Cows using Pose Estimation and Multiple Locomotion Traits | 使用姿势估计和多种运动特征进行基于视频的奶牛自动跛行检测 | Helena Russello, Rik van der Tol, Menno Holzhauer, Eldert J. van Henten, Gert Kootstra | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Watermark Text Pattern Spotting in Document Images | 文档图像中的水印文本图案识别 | Mateusz Krubinski, Stefan Matcovici, Diana Grigore, Daniel Voinea, Alin-Ionut Popa | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | REACT 2024: the Second Multiple Appropriate Facial Reaction Generation Challenge | REACT 2024:第二届多重适当面部反应生成挑战赛 | Siyang Song, Micol Spitale, Cheng Luo, Cristina Palmero, German Barquero, Hengde Zhu, Sergio Escalera, Michel Valstar, Tobias Baur, Fabien Ringeval, et.al. | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | MISS: A Generative Pretraining and Finetuning Approach for Med-VQA | MISS:Med-VQA 的生成预训练和微调方法 | Jiawei Chen, Dingkang Yang, Yue Jiang, Yuxuan Lei, Lihua Zhang | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Toward distortion-aware change detection in realistic scenarios | 在现实场景中实现失真感知变化检测 | Yitao Zhao, Heng-Chao Li, Nanqing Liu, Rui Wang | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | DISCOVER: 2-D Multiview Summarization of Optical Coherence Tomography Angiography for Automatic Diabetic Retinopathy Diagnosis | 发现:用于自动糖尿病视网膜病变诊断的光学相干断层扫描血管造影的二维多视图总结 | Mostafa El Habib Daho, Yihao Li, Rachid Zeghlache, Hugo Le Boité, Pierre Deman, Laurent Borderie, Hugang Ren, Niranchana Mannivanan, Capucine Lepicard, Béatrice Cochener, et.al. | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Efficient Fine-Tuning with Domain Adaptation for Privacy-Preserving Vision Transformer | 通过领域适应进行高效微调,以保护隐私的 Vision Transformer | Teru Nagamori, Sayaka Shiota, Hitoshi Kiya | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Dual-Perspective Knowledge Enrichment for Semi-Supervised 3D Object Detection | 半监督 3D 物体检测的双视角知识丰富 | Yucheng Han, Na Zhao, Weiling Chen, Keng Teck Ma, Hanwang Zhang | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Optimising Graph Representation for Hardware Implementation of Graph Convolutional Networks for Event-based Vision | 优化基于事件视觉的图卷积网络硬件实现的图表示 | Kamil Jeziorek, Piotr Wzorek, Krzysztof Blachut, Andrea Pinna, Tomasz Kryjak | arxiv.org/pdf/2401.04… | null |
| 2024-01-10 | HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition | HaltingVT:用于高效视频识别的自适应令牌停止变压器 | Qian Wu, Ruoxuan Cui, Yuke Li, Haoqi Zhu | arxiv.org/pdf/2401.04… | null |
| 2024-01-10 | EmMixformer: Mix transformer for eye movement recognition | EmMixformer:用于眼动识别的混合变压器 | Huafeng Qin, Hongyu Zhu, Xin Jin, Qun Song, Mounim A. El-Yacoubi, Xinbo Gao | arxiv.org/pdf/2401.04… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-10 | InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes | InseRF:神经 3D 场景中文本驱动的生成对象插入 | Mohamad Shahbazi, Liesbeth Claessens, Michael Niemeyer, Edo Collins, Alessio Tonioni, Luc Van Gool, Federico Tombari | arxiv.org/pdf/2401.05… | null |
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-10 | PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models | PIXART-δ:具有潜在一致性模型的快速且可控的图像生成 | Junsong Chen, Yue Wu, Simian Luo, Enze Xie, Sayak Paul, Ping Luo, Hang Zhao, Zhenguo Li | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Application of Deep Learning in Blind Motion Deblurring: Current Status and Future Prospects | 深度学习在盲运动去模糊中的应用:现状与未来展望 | Yawen Xiang, Heng Zhou, Chengyang Li, Fangwei Sun, Zhongbo Li, Yongqiang Xie | arxiv.org/pdf/2401.05… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-10 | SnapCap: Efficient Snapshot Compressive Video Captioning | SnapCap:高效的快照压缩视频字幕 | Jianqiao Sun, Yudi Su, Hao Zhang, Ziheng Cheng, Zequn Zeng, Zhengjue Wang, Bo Chen, Xin Yuan | arxiv.org/pdf/2401.04… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-10 | AdvMT: Adversarial Motion Transformer for Long-term Human Motion Prediction | AdvMT:用于长期人体运动预测的对抗性运动变压器 | Sarmad Idrees, Jongeun Choi, Seokman Sohn | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | MGNet: Learning Correspondences via Multiple Graphs | MGNet:通过多个图学习对应关系 | Luanyuan Dai, Xiaoyu Du, Hanwang Zhang, Jinhui Tang | arxiv.org/pdf/2401.04… | null |
| 2024-01-10 | Diffusion-based Pose Refinement and Muti-hypothesis Generation for 3D Human Pose Estimaiton | 基于扩散的姿势细化和多假设生成,用于 3D 人体姿势估计 | Hongbo Kang, Yong Wang, Mengyuan Liu, Doudou Wu, Peng Liu, Xinlin Yuan, Wenming Yang | arxiv.org/pdf/2401.04… | null |
| 2024-01-10 | Knowledge-aware Graph Transformer for Pedestrian Trajectory Prediction | 用于行人轨迹预测的知识感知图转换器 | Yu Liu, Yuexin Zhang, Kunming Li, Yongliang Qiao, Stewart Worrall, You-Fu Li, He Kong | arxiv.org/pdf/2401.04… | null |
| 2024-01-10 | CTNeRF: Cross-Time Transformer for Dynamic Neural Radiance Field from Monocular Video | CTNeRF:单目视频动态神经辐射场的跨时间变换器 | Xingyu Miao, Yang Bai, Haoran Duan, Yawen Huang, Fan Wan, Yang Long, Yefeng Zheng | arxiv.org/pdf/2401.04… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-10 | Structure from Duplicates: Neural Inverse Graphics from a Pile of Objects | 重复的结构:一堆对象的神经逆向图形 | Tianhang Cheng, Wei-Chiu Ma, Kaiyu Guan, Antonio Torralba, Shenlong Wang | arxiv.org/pdf/2401.05… | null |
Others
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-10 | Measuring Natural Scenes SFR of Automotive Fisheye Cameras | 测量汽车鱼眼相机的自然场景 SFR | Daniel Jakab, Eoin Martino Grua, Brian Micheal Deegan, Anthony Scanlan, Pepijn Van De Ven, Ciarán Eising | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Exploring Vulnerabilities of No-Reference Image Quality Assessment Models: A Query-Based Black-Box Method | 探索无参考图像质量评估模型的漏洞:基于查询的黑盒方法 | Chenxi Yang, Yujia Liu, Dingquan Li, Tingting Jiang | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Content-Aware Depth-Adaptive Image Restoration | 内容感知深度自适应图像恢复 | Tom Richard Vargis, Siavash Ghiasvand | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Source-Free Cross-Modal Knowledge Transfer by Unleashing the Potential of Task-Irrelevant Data | 通过释放任务无关数据的潜力实现无源跨模式知识转移 | Jinjing Zhu, Yucheng Chen, Lin Wang | arxiv.org/pdf/2401.05… | null |
| 2024-01-10 | Large Model based Sequential Keyframe Extraction for Video Summarization | 基于大型模型的视频摘要序列关键帧提取 | Kailong Tan, Yuxiang Zhou, Qianchen Xia, Rui Liu, Yong Chen | arxiv.org/pdf/2401.04… | null |
| 2024-01-10 | Inconsistency-Based Data-Centric Active Open-Set Annotation | 基于不一致性的以数据为中心的主动开放集注释 | Ruiyu Mao, Ouyang Xu, Yunhui Guo | arxiv.org/pdf/2401.04… | null |