[UPDATED!] 2024-02-29 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-29 | DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models | DistriFusion:高分辨率扩散模型的分布式并行推理 | Muyang Li, Tianle Cai, Jiaxin Cao, Qinsheng Zhang, Han Cai, Junjie Bai, Yangqing Jia, Ming-Yu Liu, Kai Li, Song Han | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Towards Generalizable Tumor Synthesis | 迈向可推广的肿瘤合成 | Qi Chen, Xiaoxi Chen, Haorui Song, Zhiwei Xiong, Alan Yuille, Chen Wei, Zongwei Zhou | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Humanoid Locomotion as Next Token Prediction | 人形运动作为下一个令牌预测 | Ilija Radosavovic, Bike Zhang, Baifeng Shi, Jathushan Rajasegaran, Sarthak Kamat, Trevor Darrell, Koushil Sreenath, Jitendra Malik | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Listening to the Noise: Blind Denoising with Gibbs Diffusion | 聆听噪音:使用吉布斯扩散进行盲降噪 | David Heurtel-Depeiges, Charles C. Margossian, Ruben Ohana, Bruno Régaldo-Saint Blancard | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | SeD: Semantic-Aware Discriminator for Image Super-Resolution | SeD:用于图像超分辨率的语义感知鉴别器 | Bingchen Li, Xin Li, Hanxin Zhu, Yeying Jin, Ruoyu Feng, Zhizheng Zhang, Zhibo Chen | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Structure Preserving Diffusion Models | 结构保持扩散模型 | Haoye Lu, Spencer Szabados, Yaoliang Yu | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | A Novel Approach to Industrial Defect Generation through Blended Latent Diffusion Model with Online Adaptation | 通过在线适应的混合潜在扩散模型生成工业缺陷的新方法 | Hanxi Li, Zhengxun Zhang, Hao Chen, Lin Wu, Bo Li, Deyin Liu, Mingwen Wang | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | DiffAssemble: A Unified Graph-Diffusion Model for 2D and 3D Reassembly | DiffAssemble:用于 2D 和 3D 重组的统一图扩散模型 | Gianluca Scarpellini, Stefano Fiorini, Francesco Giuliari, Pietro Morerio, Alessio Del Bue | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Training Generative Image Super-Resolution Models by Wavelet-Domain Losses Enables Better Control of Artifacts | 通过小波域损失训练生成图像超分辨率模型可以更好地控制伪影 | Cansu Korkmaz, A. Murat Tekalp, Zafer Dogan | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Disentangling representations of retinal images with generative models | 用生成模型解开视网膜图像的表示 | Sarah Müller, Lisa M. Koch, Hendrik P. A. Lensch, Philipp Berens | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Graph Convolutional Neural Networks for Automated Echocardiography View Recognition: A Holistic Approach | 用于自动超声心动图视图识别的图卷积神经网络:整体方法 | Sarina Thomas, Cristiana Tiago, Børge Solli Andreassen, Svein-Arne Aase, Jurica Sprem, Erik Steen, Anne Solberg, Guy Ben-Yosef | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | WDM: 3D Wavelet Diffusion Models for High-Resolution Medical Image Synthesis | WDM:用于高分辨率医学图像合成的 3D 小波扩散模型 | Paul Friedrich, Julia Wolleb, Florentin Bieder, Alicia Durrer, Philippe C. Cattin | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Dose Prediction Driven Radiotherapy Paramters Regression via Intra- and Inter-Relation Modeling | 通过内关系和相互关系建模进行剂量预测驱动的放射治疗参数回归 | Jiaqi Cui, Yuanyuan Xu, Jianghong Xiao, Yuchen Fei, Jiliu Zhou, Xingcheng Peng, Yan Wang | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Enhancing Steganographic Text Extraction: Evaluating the Impact of NLP Models on Accuracy and Semantic Coherence | 增强隐写文本提取:评估 NLP 模型对准确性和语义连贯性的影响 | Mingyang Li, Maoqin Yuan, Luyao Li, Han Pengsihua | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | ViewFusion: Towards Multi-View Consistency via Interpolated Denoising | ViewFusion:通过插值去噪实现多视图一致性 | Xianghui Yang, Yan Zuo, Sameera Ramasinghe, Loris Bazzani, Gil Avraham, Anton van den Hengel | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | A Quantitative Evaluation of Score Distillation Sampling Based Text-to-3D | 基于文本转3D的分数蒸馏采样的定量评估 | Xiaohan Fei, Chethan Parameshwara, Jiawei Mo, Xiaolong Li, Ashwin Swaminathan, CJ Taylor, Paolo Favaro, Stefano Soatto | arxiv.org/pdf/2402.18… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-29 | Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Panda-70M:与多个跨模态教师一起为 70M 视频添加字幕 | Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, Byung Eun Jeon, Yuwei Fang, Hsin-Ying Lee, Jian Ren, Ming-Hsuan Yang, et al. | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | The All-Seeing Project V2: Towards General Relation Comprehension of the Open World | 全视计划V2:迈向开放世界的一般关系理解 | Weiyun Wang, Yiming Ren, Haowen Luo, Tiantong Li, Chenxiang Yan, Zhe Chen, Wenhai Wang, Qingyun Li, Lewei Lu, Xizhou Zhu, et al. | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning | TV-TREES:用于神经符号视频推理的多模态蕴涵树 | Kate Sanders, Nathaniel Weir, Benjamin Van Durme | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Navigating Hallucinations for Reasoning of Unintentional Activities | 导航幻觉以推理无意识的活动 | Shresth Grover, Vibhav Vineet, Yogesh S Rawat | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Entity-Aware Multimodal Alignment Framework for News Image Captioning | 用于新闻图像字幕的实体感知多模态对齐框架 | Junzhe Zhang, Huixuan Zhang, Xiaojun Wan | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing | 抑制和重新平衡:迈向广义多模态人脸反欺骗 | Xun Lin, Shuai Wang, Rizhao Cai, Yizhong Liu, Ying Fu, Zitong Yu, Wenzhong Tang, Alex Kot | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Modular Blind Video Quality Assessment | 模块化盲视频质量评估 | Wen Wen, Mu Li, Yabin Zhang, Yiting Liao, Junlin Li, Li Zhang, Kede Ma | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | MaskFi: Unsupervised Learning of WiFi and Vision Representations for Multimodal Human Activity Recognition | MaskFi:用于多模态人类活动识别的 WiFi 和视觉表示的无监督学习 | Jianfei Yang, Shijie Tang, Yuecong Xu, Yunjiao Zhou, Lihua Xie | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Typographic Attacks in Large Multimodal Models Can be Alleviated by More Informative Prompts | 大型多模式模型中的印刷攻击可以通过提供更多信息的提示来缓解 | Hao Cheng, Erjia Xiao, Renjing Xu | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models | 通过大型视觉语言模型中的对比学习增强视觉文档理解 | Xin Li, Yunfei Wu, Xinghua Jiang, Zhihao Guo, Mingming Gong, Haoyu Cao, Yinsong Liu, Deqiang Jiang, Xing Sun | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | GoalNet: Goal Areas Oriented Pedestrian Trajectory Prediction | GoalNet:面向目标区域的行人轨迹预测 | Ching-Lin Lee, Zhi-Xuan Wang, Kuan-Ting Lai, Amar Fadillah | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Percept, Chat, and then Adapt: Multimodal Knowledge Transfer of Foundation Models for Open-World Video Recognition | 感知、聊天,然后适应:开放世界视频识别基础模型的多模态知识转移 | Boyu Chen, Siran Chen, Kunchang Li, Qinglin Xu, Yu Qiao, Yali Wang | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration | 用于可变形多模态医学图像配准的模态不可知结构图像表示学习 | Tony C. W. Mok, Zi Li, Yunhao Bai, Jianpeng Zhang, Wei Liu, Yan-Jie Zhou, Ke Yan, Dakai Jin, Yu Shi, Xiaoli Yin, et al. | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Aligning Knowledge Graph with Visual Perception for Object-goal Navigation | 将知识图与视觉感知相结合以实现对象目标导航 | Nuo Xu, Wen Wang, Rong Yang, Mengjie Qin, Zheyuan Lin, Wei Song, Chunlong Zhang, Jason Gu, Chao Li | arxiv.org/pdf/2402.18… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-29 | T3DNet: Compressing Point Cloud Models for Lightweight 3D Recognition | T3DNet:压缩点云模型以实现轻量级 3D 识别 | Zhiyuan Yang, Yunjiao Zhou, Lihua Xie, Jianfei Yang | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Trajectory Consistency Distillation | 轨迹一致性蒸馏 | Jianbin Zheng, Minghui Hu, Zhongyi Fan, Chaoyue Wang, Changxing Ding, Dacheng Tao, Tat-Jen Cham | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Weakly Supervised Monocular 3D Detection with a Single-View Image | 使用单视图图像的弱监督单目 3D 检测 | Xueying Jiang, Sheng Jin, Lewei Lu, Xiaoqin Zhang, Shijian Lu | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Continuous Sign Language Recognition Based on Motor attention mechanism and frame-level Self-distillation | 基于运动注意机制和帧级自蒸馏的连续手语识别 | Qidan Zhu, Jing Li, Fei Yuan, Quan Gan | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | FlatNAS: optimizing Flatness in Neural Architecture Search for Out-of-Distribution Robustness | FlatNAS:优化神经架构搜索中的平坦度以实现分布外的鲁棒性 | Matteo Gambella, Fabrizio Pittorino, Manuel Roveri | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Variable-Rate Learned Image Compression with Multi-Objective Optimization and Quantization-Reconstruction Offsets | 具有多目标优化和量化重建偏移的可变速率学习图像压缩 | Fatih Kamisli, Fabien Racape, Hyomin Choi | arxiv.org/pdf/2402.18… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-29 | SeMoLi: What Moves Together Belongs Together | SeMoLi:一起移动的就属于一起 | Jenny Seidenschwarz, Aljoša Ošep, Francesco Ferroni, Simon Lucey, Laura Leal-Taixé | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Leveraging AI Predicted and Expert Revised Annotations in Interactive Segmentation: Continual Tuning or Full Training? | 在交互式分割中利用人工智能预测和专家修订注释:持续调整还是全面培训? | Tiezheng Zhang, Xiaoxi Chen, Chongyu Qu, Alan Yuille, Zongwei Zhou | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | PEM: Prototype-based Efficient MaskFormer for Image Segmentation | PEM:用于图像分割的基于原型的 Efficient MaskFormer | Niccolò Cavagnero, Gabriele Rosi, Claudia Ruttano, Francesca Pistilli, Marco Ciccone, Giuseppe Averta, Fabio Cermelli | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance | 评估神经网络相对于人类表现的视觉连续腐败鲁棒性 | Huakun Shen, Boyue Caroline Hu, Krzysztof Czarnecki, Lina Marsso, Marsha Chechik | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | The 6th Affective Behavior Analysis in-the-wild (ABAW) Competition | 第六届野外情感行为分析(ABAW)大赛 | Dimitrios Kollias, Panagiotis Tzirakis, Alan Cowen, Stefanos Zafeiriou, Chunchang Shao, Guanyu Hu | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | One model to use them all: Training a segmentation model with complementary datasets | 一种模型可以使用所有这些:使用互补数据集训练分割模型 | Alexander C. Jenke, Sebastian Bodenstedt, Fiona R. Kolbinger, Marius Distler, Jürgen Weitz, Stefanie Speidel | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Stitching Gaps: Fusing Situated Perceptual Knowledge with Vision Transformers for High-Level Image Classification | 缝合间隙:将情境感知知识与视觉变换器融合以实现高级图像分类 | Delfina Sol Martinez Pandiani, Nicolas Lazzari, Valentina Presutti | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction | 具有细粒度视觉语义交互的可概括的整个幻灯片图像分类 | Hao Li, Ying Chen, Yifei Chen, Wenxian Yang, Bowen Ding, Yuchen Han, Liansheng Wang, Rongshan Yu | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | CAMixerSR: Only Details Need More "Attention" | CAMixerSR:只有细节需要更多“关注” | Yan Wang, Shijie Zhao, Yi Liu, Junlin Li, Li Zhang | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | PrPSeg: Universal Proposition Learning for Panoramic Renal Pathology Segmentation | PrPSeg:全景肾脏病理分割的通用命题学习 | Ruining Deng, Quan Liu, Can Cui, Tianyuan Yao, Jialin Yue, Juming Xiong, Lining Yu, Yifei Wu, Mengmeng Yin, Yu Wang, et al. | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Spinal Osteophyte Detection via Robust Patch Extraction on minimally annotated X-rays | 通过在最少注释的 X 射线上进行稳健的斑块提取来检测脊柱骨赘 | Soumya Snigdha Kundu, Yuanhan Mo, Nicharee Srikijkasemwat, Bartłomiej W. Papiez | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition | CricaVPR:用于视觉位置识别的跨图像相关感知表示学习 | Feng Lu, Xiangyuan Lan, Lijun Zhang, Dongmei Jiang, Yaowei Wang, Chun Yuan | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Effective Message Hiding with Order-Preserving Mechanisms | 通过保序机制有效隐藏消息 | Gao Yu, Qiu Xuchong, Ye Zihan | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | A SAM-guided Two-stream Lightweight Model for Anomaly Detection | SAM引导的两流轻量级异常检测模型 | Chenghao Li, Lei Qi, Xin Geng | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | ProtoP-OD: Explainable Object Detection with Prototypical Parts | ProtoP-OD:使用原型部件进行可解释的对象检测 | Pavlos Rath-Manakidis, Frederik Strothmann, Tobias Glasmachers, Laurenz Wiskott | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | BigGait: Learning Gait Representation You Want by Large Vision Models | BigGait:通过大视觉模型学习您想要的步态表示 | Dingqiang Ye, Chao Fan, Jingzhe Ma, Xiaoming Liu, Shiqi Yu | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection | 利用中间编码器块的表示进行合成图像检测 | Christos Koutlis, Symeon Papadopoulos | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | VideoMAC: Video Masked Autoencoders Meet ConvNets | VideoMAC:视频屏蔽自动编码器遇见卷积网络 | Gensheng Pei, Tao Chen, Xiruo Jiang, Huafeng Liu, Zeren Sun, Yazhou Yao | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | VEnvision3D: A Synthetic Perception Dataset for 3D Multi-Task Model Research | VEnvision3D:用于 3D 多任务模型研究的综合感知数据集 | Jiahao Zhou, Chen Long, Yue Xie, Jialiang Wang, Boheng Li, Haiping Wang, Zhe Chen, Zhen Dong | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments | DOZE:动态环境中开放词汇零样本对象导航的数据集 | Ji Ma, Hongming Dai, Yao Mu, Pengying Wu, Hao Wang, Xiaowei Chi, Yang Fei, Shanghang Zhang, Chang Liu | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | RSAM-Seg: A SAM-based Approach with Prior Knowledge Integration for Remote Sensing Image Semantic Segmentation | RSAM-Seg:基于 SAM 的遥感图像语义分割先验知识集成方法 | Jie Zhang, Xubing Yang, Rui Jiang, Wei Shao, Li Zhang | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Analysis of the Two-Step Heterogeneous Transfer Learning for Laryngeal Blood Vessel Classification: Issue and Improvement | 喉血管分类两步异构迁移学习分析:问题与改进 | Xinyi Fang, Chak Fong Chong, Kei Long Wong, Yapeng Wang, Tiankui Zhang, Sio-Kei Im | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | COFT-AD: COntrastive Fine-Tuning for Few-Shot Anomaly Detection | COFT-AD:用于少样本异常检测的对比微调 | Jingyi Liao, Xun Xu, Manh Cuong Nguyen, Adam Goodge, Chuan Sheng Foo | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Theoretically Achieving Continuous Representation of Oriented Bounding Boxes | 理论上实现定向边界框的连续表示 | Zikai Xiao, Guo-Ye Yang, Xue Yang, Tai-Jiang Mu, Junchi Yan, Shi-min Hu | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Towards Out-of-Distribution Detection for breast cancer classification in Point-of-Care Ultrasound Imaging | 致力于床旁超声成像中乳腺癌分类的分布外检测 | Jennie Karlsson, Marisa Wodrich, Niels Christian Overgaard, Freja Sahlin, Kristina Lång, Anders Heyden, Ida Arvidsson | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Boosting Semi-Supervised Object Detection in Remote Sensing Images With Active Teaching | 通过主动教学促进遥感图像中的半监督目标检测 | Boxuan Zhang, Zengmao Wang, Bo Du | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Navigating Beyond Dropout: An Intriguing Solution Towards Generalizable Image Super Resolution | 超越 Dropout:实现通用图像超分辨率的有趣解决方案 | Hongjun Wang, Jiyuan Chen, Yinqiang Zheng, Tieyong Zeng | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Edge Computing Enabled Real-Time Video Analysis via Adaptive Spatial-Temporal Semantic Filtering | 边缘计算通过自适应时空语义过滤实现实时视频分析 | Xiang Chen, Wenjie Zhu, Jiayuan Chen, Tong Zhang, Changyan Yi, Jun Cai | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection | 基于视觉变压器的简单而有效的网络,用于伪装物体和显着物体检测 | Chao Hao, Zitong Yu, Xin Liu, Jun Xu, Huanjing Yue, Jingyu Yang | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Decompose-and-Compose: A Compositional Approach to Mitigating Spurious Correlation | 分解和组合:减轻杂散相关性的组合方法 | Fahimeh Hosseini Noohdani, Parsa Hosseini, Arian Yazdan Parast, Hamidreza Yaghoubi Araghi, Mahdieh Soleymani Baghshah | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | SNE-RoadSegV2: Advancing Heterogeneous Feature Fusion and Fallibility Awareness for Freespace Detection | SNE-RoadSegV2:推进自由空间检测的异构特征融合和易错意识 | Yi Feng, Yu Ma, Qijun Chen, Ioannis Pitas, Rui Fan | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Rethinking Multi-domain Generalization with A General Learning Objective | 以通用学习目标重新思考多领域泛化 | Zhaorui Tan, Xi Yang, Kaizhu Huang | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Debiased Novel Category Discovering and Localization | 去偏见的小说类别发现和本地化 | Juexiao Feng, Yuhong Yang, Yanchun Xie, Yaqian Li, Yandong Guo, Yuchen Guo, Yuwei He, Liuyu Xiang, Guiguang Ding | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | OpticalDR: A Deep Optical Imaging Model for Privacy-Protective Depression Recognition | OpticalDR:用于隐私保护抑郁症识别的深度光学成像模型 | Yuchen Pan, Junjun Jiang, Kui Jiang, Zhihao Wu, Keyuan Yu, Xianming Liu | arxiv.org/pdf/2402.18… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-29 | RoadRunner -- Learning Traversability Estimation for Autonomous Off-road Driving | RoadRunner——学习自主越野驾驶的可通行性估计 | Jonas Frey, Shehryar Khattak, Manthan Patel, Deegan Atha, Julian Nubert, Curtis Padgett, Marco Hutter, Patrick Spieler | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | PCDepth: Pattern-based Complementary Learning for Monocular Depth Estimation by Best of Both Worlds | PCDepth:基于模式的互补学习,用于单目深度估计,两全其美 | Haotian Liu, Sanqing Qu, Fan Lu, Zongtao Bu, Florian Roehrbein, Alois Knoll, Guang Chen | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | SwitchLight: Co-design of Physics-driven Architecture and Pre-training Framework for Human Portrait Relighting | SwitchLight:物理驱动架构和人像补光预训练框架的协同设计 | Hoon Kim, Minje Jang, Wonjun Yoon, Jisoo Lee, Donghyun Na, Sanghyun Woo | arxiv.org/pdf/2402.18… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-29 | Loss-Free Machine Unlearning | 无损失机器忘却 | Jack Foster, Stefan Schoepf, Alexandra Brintrup | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | HyenaPixel: Global Image Context with Convolutions | HyenaPixel:带有卷积的全局图像上下文 | Julian Spravil, Sebastian Houben, Sven Behnke | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Feature boosting with efficient attention for scene parsing | 通过有效关注场景解析来增强特征 | Vivek Singh, Shailza Sharma, Fabio Cuzzolin | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | MemoNav: Working Memory Model for Visual Navigation | MemoNav:视觉导航的工作记忆模型 | Hongxin Li, Zeyu Wang, Xu Yang, Yuran Yang, Shuqi Mei, Zhaoxiang Zhang | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Progressive Contrastive Learning with Multi-Prototype for Unsupervised Visible-Infrared Person Re-identification | 用于无监督可见光-红外人员重新识别的多原型渐进对比学习 | Jiangming Shi, Xiangbo Yin, Yaoxing Wang, Xiaofeng Liu, Yuan Xie, Yanyun Qu | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | LoLiSRFlow: Joint Single Image Low-light Enhancement and Super-resolution via Cross-scale Transformer-based Conditional Flow | LoLiSRFlow:通过基于跨尺度变压器的条件流联合单图像低光增强和超分辨率 | Ziyu Yue, Jiaxin Gao, Sihan Xie, Yang Liu, Zhixun Su | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Gradient Alignment for Cross-Domain Face Anti-Spoofing | 跨域人脸反欺骗的梯度对齐 | Binh M. Le, Simon S. Woo | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | BFRFormer: Transformer-based generator for Real-World Blind Face Restoration | BFRFormer:基于 Transformer 的现实世界盲脸恢复生成器 | Guojing Ge, Qi Song, Guibo Zhu, Yuting Zhang, Jinglu Chen, Miao Xin, Ming Tang, Jinqiao Wang | arxiv.org/pdf/2402.18… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-29 | Learning a Generalized Physical Face Model From Data | 从数据中学习广义的物理人脸模型 | Lingchen Yang, Gaspard Zoss, Prashanth Chandran, Markus Gross, Barbara Solenthaler, Eftychios Sifakis, Derek Bradley | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Spectral Meets Spatial: Harmonising 3D Shape Matching and Interpolation | 光谱与空间的结合:协调 3D 形状匹配和插值 | Dongliang Cao, Marvin Eisenberger, Nafie El Amrani, Daniel Cremers, Florian Bernard | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Deep Learning for 3D Human Pose Estimation and Mesh Recovery: A Survey | 用于 3D 人体姿势估计和网格恢复的深度学习:一项调查 | Yang Liu, Changzhen Qiu, Zhiyong Zhang | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | NARUTO: Neural Active Reconstruction from Uncertain Target Observations | NARUTO:从不确定目标观察中进行神经主动重建 | Ziyue Feng, Huangying Zhan, Zheng Chen, Qingan Yan, Xiangyu Xu, Changjiang Cai, Bing Li, Qilun Zhu, Yi Xu | arxiv.org/pdf/2402.18… | null |
Learning Methods
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-29 | Unsupervised Learning of High-resolution Light Field Imaging via Beam Splitter-based Hybrid Lenses | 通过基于分束器的混合镜头进行高分辨率光场成像的无监督学习 | Jianxin Lei, Chengcai Xu, Langqing Shi, Junhui Hou, Ping Zhou | arxiv.org/pdf/2402.19… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-29 | Retrieval-Augmented Generation for AI-Generated Content: A Survey | 人工智能生成内容的检索增强生成:一项调查 | Penghao Zhao, Hailin Zhang, Qinhan Yu, Zhengren Wang, Yunteng Geng, Fangcheng Fu, Ling Yang, Wentao Zhang, Bin Cui | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Lifelong Benchmarks: Efficient Model Evaluation in an Era of Rapid Progress | 终身基准:快速进步时代的高效模型评估 | Ameya Prabhu, Vishaal Udandarao, Philip Torr, Matthias Bethge, Adel Bibi, Samuel Albanie | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Towards Safe and Reliable Autonomous Driving: Dynamic Occupancy Set Prediction | 迈向安全可靠的自动驾驶:动态占用集预测 | Wenbo Shao, Jiahui Xu, Wenhao Yu, Jun Li, Hong Wang | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | An AI based Digital Score of Tumour-Immune Microenvironment Predicts Benefit to Maintenance Immunotherapy in Advanced Oesophagogastric Adenocarcinoma | 基于人工智能的肿瘤免疫微环境数字评分可预测晚期食管胃腺癌维持免疫治疗的益处 | Quoc Dang Vu, Caroline Fong, Anderley Gordon, Tom Lund, Tatiany L Silveira, Daniel Rodrigues, Katharina von Loga, Shan E Ahmed Raza, David Cunningham, Nasir Rajpoot | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | SIFT-Aided Rectified 2D-DIC for Displacement and Strain Measurements in Asphalt Concrete Testing | SIFT 辅助修正 2D-DIC 用于沥青混凝土测试中的位移和应变测量 | Zehui Zhu, Imad L. Al-Qadi | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching | 学习立体匹配的视图内和跨视图几何知识 | Rui Gong, Weide Liu, Zaiwang Gu, Xulei Yang, Jun Cheng | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Fine Structure-Aware Sampling: A New Sampling Training Scheme for Pixel-Aligned Implicit Models in Single-View Human Reconstruction | 精细结构感知采样:单视图人体重建中像素对齐隐式模型的新采样训练方案 | Kennard Yanting Chan, Fayao Liu, Guosheng Lin, Chuan Sheng Foo, Weisi Lin | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | VIXEN: Visual Text Comparison Network for Image Difference Captioning | VIXEN:用于图像差异字幕的视觉文本比较网络 | Alexander Black, Jing Shi, Yifei Fai, Tu Bui, John Collomosse | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Deep Network for Image Compressed Sensing Coding Using Local Structural Sampling | 使用局部结构采样进行图像压缩感知编码的深度网络 | Wenxue Cui, Xingtao Wang, Xiaopeng Fan, Shaohui Liu, Xinwei Gao, Debin Zhao | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | DeepEraser: Deep Iterative Context Mining for Generic Text Eraser | DeepEraser:通用文本橡皮擦的深度迭代上下文挖掘 | Hao Feng, Wendi Wang, Shaokai Liu, Jiajun Deng, Wengang Zhou, Houqiang Li | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | Atmospheric Turbulence Removal with Video Sequence Deep Visual Priors | 使用视频序列深度视觉先验去除大气湍流 | P. Hill, N. Anantrasirichai, A. Achim, D. R. Bull | arxiv.org/pdf/2402.19… | null |
| 2024-02-29 | PrivatEyes: Appearance-based Gaze Estimation Using Federated Secure Multi-Party Computation | PrivatEyes:使用联合安全多方计算进行基于外观的注视估计 | Mayar Elfares, Pascal Reisert, Zhiming Hu, Wenwu Tang, Ralf Küsters, Andreas Bulling | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | OHTA: One-shot Hand Avatar via Data-driven Implicit Priors | OHTA:通过数据驱动的隐式先验实现一次性手部头像 | Xiaozheng Zheng, Chao Wen, Zhuo Su, Zeran Xu, Zhaohu Li, Yang Zhao, Zhou Xue | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | WWW: A Unified Framework for Explaining What, Where and Why of Neural Networks by Interpretation of Neuron Concepts | WWW:通过解释神经元概念来解释神经网络的内容、位置和原因的统一框架 | Yong Hyun Ahn, Hyeon Bae Kim, Seong Tae Kim | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Anatomy-guided fiber trajectory distribution estimation for cranial nerves tractography | 脑神经束成像的解剖引导纤维轨迹分布估计 | Lei Xie, Qingrun Zeng, Huajun Zhou, Guoqiang Xie, Mingchu Li, Jiahao Huang, Jianan Cui, Hao Chen, Yuanjing Feng | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | GDCNet: Calibrationless geometric distortion correction of echo planar imaging data using deep learning | GDCNet:使用深度学习对回波平面成像数据进行无校准几何失真校正 | Marina Manso Jimeno, Keren Bachi, George Gardner, Yasmin L. Hurd, John Thomas Vaughan Jr., Sairam Geethanath | arxiv.org/pdf/2402.18… | null |
| 2024-02-29 | Exploration of Learned Lifting-Based Transform Structures for Fully Scalable and Accessible Wavelet-Like Image Compression | 探索基于学习提升的变换结构以实现完全可扩展且可访问的类小波图像压缩 | Xinyue Li, Aous Naman, David Taubman | arxiv.org/pdf/2402.18… | null |