[UPDATED!] 2024-02-02 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-02 | NeuroCine: Decoding Vivid Video Sequences from Human Brain Activties | NeuroCine:解码人脑活动中的生动视频序列 | Jingyuan Sun, Mingxiao Li, Zijiao Chen, Marie-Francine Moens | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Boximator: Generating Rich and Controllable Motions for Video Synthesis | Boximator:为视频合成生成丰富且可控的运动 | Jiawei Wang, Yuchen Zhang, Jiaxin Zou, Yan Zeng, Guoqiang Wei, Liping Yuan, Hang Li | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Cross-view Masked Diffusion Transformers for Person Image Synthesis | 用于人物图像合成的交叉视图掩模扩散变压器 | Trung X. Pham, Zhang Kang, Chang D. Yoo | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Advancing Brain Tumor Inpainting with Generative Models | 利用生成模型推进脑肿瘤修复 | Ruizhi Zhu, Xinru Zhang, Haowen Pang, Chundan Xu, Chuyang Ye | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Synthetic Data for the Mitigation of Demographic Biases in Face Recognition | 用于减轻人脸识别中人口统计偏差的合成数据 | Pietro Melzi, Christian Rathgeb, Ruben Tolosana, Ruben Vera-Rodriguez, Aythami Morales, Dominik Lawatsch, Florian Domin, Maxim Schaubert | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | EmoSpeaker: One-shot Fine-grained Emotion-Controlled Talking Face Generation | EmoSpeaker:一次性细粒度情感控制说话面部生成 | Guanwen Feng, Haoran Cheng, Yunan Li, Zhiyuan Ma, Chaoneng Li, Zhihao Qian, Qiguang Miao, Chi-Man Pun | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Cheating Suffix: Targeted Attack to Text-To-Image Diffusion Models with Multi-Modal Priors | 作弊后缀:对具有多模态先验的文本到图像扩散模型的针对性攻击 | Dingcheng Yang, Yang Bai, Xiaojun Jia, Yang Liu, Xiaochun Cao, Wenjian Yu | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Can Shape-Infused Joint Embeddings Improve Image-Conditioned 3D Diffusion? | 形状注入的关节嵌入可以改善图像条件 3D 扩散吗? | Cristian Sbrolli, Paolo Cudrano, Matteo Matteucci | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | PRIME: Protect Your Videos From Malicious Editing | PRIME:保护您的视频免遭恶意编辑 | Guanlin Li, Shuai Yang, Jie Zhang, Tianwei Zhang | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Structured World Modeling via Semantic Vector Quantization | 通过语义向量量化进行结构化世界建模 | Yi-Fu Wu, Minseung Lee, Sungjin Ahn | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Unsupervised Generation of Pseudo Normal PET from MRI with Diffusion Model for Epileptic Focus Localization | 利用扩散模型从 MRI 中无监督生成伪正常 PET,用于癫痫病灶定位 | Wentao Chen, Jiwei Li, Xichen Xu, Hui Huang, Siyu Yuan, Miao Zhang, Tianming Xu, Jie Luo, Weimin Zhou | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Ambient-Pix2PixGAN for Translating Medical Images from Noisy Data | Ambient-Pix2PixGAN 用于从噪声数据转换医学图像 | Wentao Chen, Xichen Xu, Jie Luo, Weimin Zhou | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | AmbientCycleGAN for Establishing Interpretable Stochastic Object Models Based on Mathematical Phantoms and Medical Imaging Measurements | AmbientCycleGAN 用于基于数学模型和医学成像测量建立可解释的随机对象模型 | Xichen Xu, Wentao Chen, Weimin Zhou | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale | 具有预测基本原理假设巩固的无源无监督域适应 | Yangyang Shu, Xiaofeng Cao, Qi Chen, Bowen Zhang, Ziqin Zhou, Anton van den Hengel, Lingqiao Liu | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | A Single Simple Patch is All You Need for AI-generated Image Detection | 只需一个简单的补丁即可进行 AI 生成的图像检测 | Jiaxuan Chen, Jieteng Yao, Li Niu | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Compositional Generative Modeling: A Single Model is Not All You Need | 组合生成建模:单一模型并不是您所需要的全部 | Yilun Du, Leslie Kaelbling | arxiv.org/pdf/2402.01… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-02 | Skip | Skip | Zongbo Han, Zechen Bai, Haiyang Mei, Qianli Xu, Changqing Zhang, Mike Zheng Shou | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | A general framework for rotation invariant point cloud analysis | 旋转不变点云分析的通用框架 | Shuqing Luo, Wei Gao | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Deep Multimodal Fusion of Data with Heterogeneous Dimensionality via Projective Networks | 通过投影网络实现异构维度数据的深度多模态融合 | José Morano, Guilherme Aresta, Christoph Grechenig, Ursula Schmidt-Erfurth, Hrvoje Bogunović | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | TSJNet: A Multi-modality Target and Semantic Awareness Joint-driven Image Fusion Network | TSJNet:多模态目标和语义感知联合驱动的图像融合网络 | Yuchan Jie, Yushen Xu, Xiaosong Li, Haishu Tan | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | 2AFC Prompting of Large Multimodal Models for Image Quality Assessment | 2AFC提示大型多模态模型用于图像质量评估 | Hanwei Zhu, Xiangjie Sui, Baoliang Chen, Xuelin Liu, Peilin Chen, Yuming Fang, Shiqi Wang | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | A Survey for Foundation Models in Autonomous Driving | 自动驾驶基础模型调查 | Haoxiang Gao, Yaqian Li, Kaiwen Long, Ming Yang, Yiqing Shen | arxiv.org/pdf/2402.01… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-02 | HyperPlanes: Hypernetwork Approach to Rapid NeRF Adaptation | HyperPlanes:快速适应 NeRF 的超网络方法 | Paweł Batorski, Dawid Malarz, Marcin Przewięźlikowski, Marcin Mazur, Sławomir Tadeja, Przemysław Spurek | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | GaMeS: Mesh-Based Adapting and Modification of Gaussian Splatting | GaMeS:基于网格的高斯分布调整和修改 | Joanna Waczyńska, Piotr Borycki, Sławomir Tadeja, Jacek Tabor, Przemysław Spurek | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Efficient Dynamic-NeRF Based Volumetric Video Coding with Rate Distortion Optimization | 基于速率失真优化的高效动态 NeRF 体积视频编码 | Zhiyu Zhang, Guo Lu, Huanxiong Liang, Anni Tang, Qiang Hu, Li Song | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Taming Uncertainty in Sparse-view Generalizable NeRF via Indirect Diffusion Guidance | 通过间接扩散指导克服稀疏视图可推广 NeRF 中的不确定性 | Yaokun Li, Chao Gou, Guang Tan | arxiv.org/pdf/2402.01… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-02 | AutoGCN -- Towards Generic Human Activity Recognition with Neural Architecture Search | AutoGCN——通过神经架构搜索实现通用人类活动识别 | Felix Tempel, Inga Strümke, Espen Alexander F. Ihlen | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Bi-CryptoNets: Leveraging Different-Level Privacy for Encrypted Inference | Bi-CryptoNets:利用不同级别的隐私进行加密推理 | Man-Jie Yuan, Zheng Zou, Wei Gao | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Spiking CenterNet: A Distillation-boosted Spiking Neural Network for Object Detection | Spiking CenterNet:用于物体检测的蒸馏增强尖峰神经网络 | Lennard Bodden, Franziska Schwaiger, Duc Bach Ha, Lars Kreuzberg, Sven Behnke | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Cascaded Scaling Classifier: class incremental learning with probability scaling | Cascaded Scaling Classifier:具有概率缩放的类增量学习 | Jary Pomponi, Alessio Devoto, Simone Scardapane | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Faster Inference of Integer SWIN Transformer by Removing the GELU Activation | 通过删除 GELU 激活来加快整数 SWIN Transformer 的推理 | Mohammadreza Tayaranian, Seyyed Hasan Mozafari, James J. Clark, Brett Meyer, Warren Gross | arxiv.org/pdf/2402.01… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-02 | Deep Continuous Networks | 深度连续网络 | Nergis Tomen, Silvia L. Pintea, Jan C. van Gemert | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Closing the Gap in Human Behavior Analysis: A Pipeline for Synthesizing Trimodal Data | 缩小人类行为分析的差距:合成三峰数据的管道 | Christian Stippel, Thomas Heitzinger, Rafael Sterzinger, Martin Kampel | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Convolution kernel adaptation to calibrated fisheye | 卷积核适应校准鱼眼 | Bruno Berenguel-Baeta, Maria Santos-Villafranca, Jesus Bermudez-Cameo, Alejandro Perez-Yus, Jose J. Guerrero | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | XAI for Skin Cancer Detection with Prototypes and Non-Expert Supervision | XAI 通过原型和非专家监督进行皮肤癌检测 | Miguel Correia, Alceu Bissoto, Carlos Santiago, Catarina Barata | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | ALERT-Transformer: Bridging Asynchronous and Synchronous Machine Learning for Real-Time Event-based Spatio-Temporal Data | ALERT-Transformer:桥接异步和同步机器学习,以实现基于事件的实时时空数据 | Carmen Martin-Turrero, Maxence Bouvier, Manuel Breitenstein, Pietro Zanuttigh, Vincent Parret | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | FindingEmo: An Image Dataset for Emotion Recognition in the Wild | FindingEmo:用于野外情绪识别的图像数据集 | Laurent Mertens, Elahe' Yargholi, Hans Op de Beeck, Jan Van den Stock, Joost Vennekens | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection | 用于单域广义目标检测的基于短语基础的风格迁移 | Hao Li, Wei Wang, Cong Wang, Zhigang Luo, Xinwang Liu, Kenli Li, Xiaochun Cao | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | AGILE: Approach-based Grasp Inference Learned from Element Decomposition | AGILE:从元素分解中学习的基于方法的抓取推理 | MohammadHossein Koosheshi, Hamed Hosseini, Mehdi Tale Masouleh, Ahmad Kalhor, Mohammad Reza Hairi Yazdi | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Delving into Decision-based Black-box Attacks on Semantic Segmentation | 深入研究语义分割的基于决策的黑盒攻击 | Zhaoyu Chen, Zhengyang Shan, Jingwen Chang, Kaixun Jiang, Dingkang Yang, Yiting Cheng, Wenqiang Zhang | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Segment Any Change | 分段任何更改 | Zhuo Zheng, Yanfei Zhong, Liangpei Zhang, Stefano Ermon | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | DeepBranchTracer: A Generally-Applicable Approach to Curvilinear Structure Reconstruction Using Multi-Feature Learning | DeepBranchTracer:一种使用多特征学习进行曲线结构重建的通用方法 | Chao Liu, Ting Zhao, Nenggan Zheng | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Scale Equalization for Multi-Level Feature Fusion | 多级特征融合的尺度均衡 | Bum Jun Kim, Sang Woo Kim | arxiv.org/pdf/2402.01… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-02 | DeepAAT: Deep Automated Aerial Triangulation for Fast UAV-based Mapping | DeepAAT:深度自动化空中三角测量,用于基于无人机的快速测绘 | Zequan Chen, Jianping Li, Qusheng Li, Bisheng Yang, Zhen Dong | arxiv.org/pdf/2402.01… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-02 | Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes | 通过内核-特征对稀疏变分高斯过程进行自注意力 | Yingyi Chen, Qinghua Tao, Francesco Tonin, Johan A. K. Suykens | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | LIR: Efficient Degradation Removal for Lightweight Image Restoration | LIR:高效退化去除以实现轻量级图像恢复 | Dongqi Fan, Ting Yue, Xin Zhao, Liang Chang | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Spectrum-guided Feature Enhancement Network for Event Person Re-Identification | 用于事件人员重新识别的频谱引导特征增强网络 | Hongchen Tan, Yi Zhang, Xiuping Liu, Baocai Yin, Nan Ma, Xin Li, Huchuan Lu | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Enhanced Urban Region Profiling with Adversarial Self-Supervised Learning | 通过对抗性自我监督学习增强城市区域分析 | Weiliang Chan, Qianqian Ren, Jinbao Li | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Seeing Objects in a Cluttered World: Computational Objectness from Motion in Video | 在杂乱的世界中看到对象:视频中运动的计算对象性 | Douglas Poland, Amar Saini | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | How many views does your deep neural network use for prediction? | 您的深度神经网络使用多少个视图进行预测? | Keisuke Kawano, Takuro Kutsuna, Keisuke Sano | arxiv.org/pdf/2402.01… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-02 | Scaled 360 layouts: Revisiting non-central panoramas | 缩放 360 度布局:重新审视非中心全景图 | Bruno Berenguel-Baeta, Jesus Bermudez-Cameo, Jose J. Guerrero | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | 3D Vertebrae Measurements: Assessing Vertebral Dimensions in Human Spine Mesh Models Using Local Anatomical Vertebral Axes | 3D 椎骨测量:使用局部解剖椎轴评估人体脊柱网格模型中的椎骨尺寸 | Ivanna Kramer, Vinzent Rittel, Lara Blomenkamp, Sabine Bauer, Dietrich Paulus | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | SiMA-Hand: Boosting 3D Hand-Mesh Reconstruction by Single-to-Multi-View Adaptation | SiMA-Hand:通过单视图到多视图适应促进 3D 手网格重建 | Yinqiao Wang, Hao Xu, Pheng-Ann Heng, Chi-Wing Fu | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | A Comprehensive Survey on 3D Content Generation | 3D 内容生成的综合调查 | Jian Liu, Xiaoshui Huang, Tianyu Huang, Lu Chen, Yuenan Hou, Shixiang Tang, Ziwei Liu, Wanli Ouyang, Wangmeng Zuo, Junjun Jiang, et al. | arxiv.org/pdf/2402.01… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-02 | Simulator-Free Visual Domain Randomization via Video Games | 通过视频游戏实现无模拟器视觉域随机化 | Chintan Trivedi, Nemanja Rašajski, Konstantinos Makantasis, Antonios Liapis, Georgios N. Yannakakis | arxiv.org/pdf/2402.01… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-02 | Immersive Video Compression using Implicit Neural Representations | 使用隐式神经表示的沉浸式视频压缩 | Ho Man Kwan, Fan Zhang, Andrew Gower, David Bull | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | SLYKLatent, a Learning Framework for Facial Features Estimation | SLYKLatent,面部特征估计的学习框架 | Samuel Adebayo, Joost C. Dessing, Seán McLoone | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Visual Gyroscope: Combination of Deep Learning Features and Direct Alignment for Panoramic Stabilization | 视觉陀螺仪:结合深度学习功能和直接对准实现全景稳定 | Bruno Berenguel-Baeta, Antoine N. Andre, Guillaume Caron, Jesus Bermudez-Cameo, Jose J. Guerrero | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Mission Critical -- Satellite Data is a Distinct Modality in Machine Learning | 关键任务——卫星数据是机器学习的一种独特模式 | Esther Rolf, Konstantin Klemmer, Caleb Robinson, Hannah Kerner | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | Describing Images | 描述图像 | Ece Takmaz, Sandro Pezzelle, Raquel Fernández | arxiv.org/pdf/2402.01… | null |
| 2024-02-02 | UCVC: A Unified Contextual Video Compression Framework with Joint P-frame and B-frame Coding | UCVC:具有联合 P 帧和 B 帧编码的统一上下文视频压缩框架 | Jiayu Yang, Wei Jiang, Yongqi Zhai, Chunhui Yang, Ronggang Wang | arxiv.org/pdf/2402.01… | null |