[UPDATED!] 2024-03-24 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-24 | latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction | latentSplat:自动编码为快速概括3D重建的变异性高斯人 | Christopher Wewer, Kevin Raj, Eddy Ilg, Bernt Schiele, Jan Eric Lenssen | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Laplacian-guided Entropy Model in Neural Codec with Blur-dissipated Synthesis | 具有模糊耗散合成的神经编解码器中拉普拉斯引导的熵模型 | Atefeh Khoshkhahtinat, Ali Zafari, Piyush M. Mehta, Nasser M. Nasrabadi | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Skull-to-Face: Anatomy-Guided 3D Facial Reconstruction and Editing | Skull-to-Face:解剖学引导的 3D 面部重建和编辑 | Yongqing Liang, Congyi Zhang, Junli Zhao, Wenping Wang, Xin Li | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Diffusion Model is a Good Pose Estimator from 3D RF-Vision | 扩散模型是 3D RF-Vision 的良好姿势估计器 | Junqiao Fan, Jianfei Yang, Yuecong Xu, Lihua Xie | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Pose-Guided Self-Training with Two-Stage Clustering for Unsupervised Landmark Discovery | 姿势引导的自我训练以及两阶段聚类,无监督的地标发现 | Siddharth Tourani, Ahmed Alwheibi, Arif Mahmood, Muhammad Haris Khan | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Gaze-guided Hand-Object Interaction Synthesis: Benchmark and Method | 凝视引导的手动相互作用综合:基准和方法 | Jie Tian, Lingxiao Yang, Ran Ji, Yuexin Ma, Lan Xu, Jingyi Yu, Ye Shi, Jingya Wang | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Robust Diffusion Models for Adversarial Purification | 用于对抗性净化的鲁棒扩散模型 | Guang Lin, Zerui Tao, Jianhai Zhang, Toshihisa Tanaka, Qibin Zhao | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | A Unified Module for Accelerating STABLE-DIFFUSION: LCM-LORA | 加速稳定扩散的统一模块:LCM-LORA | Ayush Thakur, Rashmi Vashisth | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | SM2C: Boost the Semi-supervised Segmentation for Medical Image by using Meta Pseudo Labels and Mixed Images | SM2C:使用元伪标签和混合图像增强医学图像的半监督分割 | Yifei Wang, Chuhong Zhu | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | CBGT-Net: A Neuromimetic Architecture for Robust Classification of Streaming Data | CBGT-Net:一种用于鲁棒分类流数据的神经模拟体系结构 | Shreya Sharma, Dana Hughes, Katia Sycara | arxiv.org/pdf/2403.15… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-24 | AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans | AutoInst:基于实例的 LiDAR 3D 扫描自动分割 | Cedric Perauer, Laurenz Adrian Heidrich, Haifan Zhang, Matthias Nießner, Anastasiia Kornilova, Alexey Artemov | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | AVicuna: Audio-Visual LLM with Interleaver and Context-Boundary Alignment for Temporal Referential Dialogue | AVicuna:带有交叉裂缝和上下文与临时对话的视听LLM | Yunlong Tang, Daiki Shimada, Jing Bi, Chenliang Xu | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning | 学习后门威胁:通过本地令牌学习在多模式对比学习中增强后门防御 | Siyuan Liang, Kuanrong Liu, Jiajun Gong, Jiawei Liang, Yuan Xun, Ee-Chien Chang, Xiaochun Cao | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Cross-domain Multi-modal Few-shot Object Detection via Rich Text | 跨域多模式通过丰富的文本检测几射击对象检测 | Zeyu Shangguan, Daniel Seita, Mohammad Rostami | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World | EgoExoLearn:用于桥接异步的自我和以外的过程的数据集,以现实世界中的程序活动为中心 | Yifei Huang, Guo Chen, Jilan Xu, Mingfang Zhang, Lijin Yang, Baoqi Pei, Hongjie Zhang, Lu Dong, Yali Wang, Limin Wang, et al. | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Opportunities and challenges in the application of large artificial intelligence models in radiology | 大型人工智能模型在放射科应用的机遇与挑战 | Liangrui Pan, Zhenyu Zhao, Ying Lu, Kewei Tang, Liyong Fu, Qingchun Liang, Shaoliang Peng | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | V2X-Real: a Large-Scale Dataset for Vehicle-to-Everything Cooperative Perception | V2X-Real:用于车对万物协作感知的大规模数据集 | Hao Xiang, Zhaoliang Zheng, Xin Xia, Runsheng Xu, Letian Gao, Zewei Zhou, Xu Han, Xinkai Ji, Mingxi Li, Zonglin Meng, et al. | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking | SDSTrack:用于多模态视觉对象跟踪的自蒸馏对称适配器学习 | Xiaojun Hou, Jiazheng Xing, Yijie Qian, Yaowei Guo, Shuo Xin, Junhao Chen, Kai Tang, Mengmeng Wang, Zhengkai Jiang, Liang Liu, et al. | arxiv.org/pdf/2403.16… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-24 | Inverse Rendering of Glossy Objects via the Neural Plenoptic Function and Radiance Fields | 通过神经全光功能和辐射场逆向渲染有光泽的物体 | Haoyuan Wang, Wenbo Hu, Lei Zhu, Rynson W. H. Lau | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Entity-NeRF: Detecting and Removing Moving Entities in Urban Scenes | Entity-NeRF:检测和删除城市场景中的移动实体 | Takashi Otonari, Satoshi Ikehata, Kiyoharu Aizawa | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-aware 3D Gaussian Field | CG-SLAM:一致的不确定性感知 3D 高斯场中的高效密集 RGB-D SLAM | Jiarui Hu, Xianhao Chen, Boyin Feng, Guanglin Li, Liangjing Yang, Hujun Bao, Guofeng Zhang, Zhaopeng Cui | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Are NeRFs ready for autonomous driving? Towards closing the real-to-simulation gap | NeRF 准备好自动驾驶了吗?缩小真实与模拟之间的差距 | Carl Lindström, Georg Hess, Adam Lilja, Maryam Fatemi, Lars Hammarstrand, Christoffer Petersson, Lennart Svensson | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling | PKU-DyMVHumans:高保真动态人体建模的多视图视频基准 | Xiaoyun Zheng, Liwei Liao, Xufeng Li, Jianbo Jiao, Rongjie Wang, Feng Gao, Shiqi Wang, Ronggang Wang | arxiv.org/pdf/2403.16… | link |
| 2024-03-24 | Semantic Is Enough: Only Semantic Information For NeRF Reconstruction | 语义就足够了:只有语义信息才能进行 NeRF 重建 | Ruibo Wang, Song Zhang, Ping Huang, Donghai Zhang, Wei Yan | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Exploring Accurate 3D Phenotyping in Greenhouse through Neural Radiance Fields | 通过神经辐射场探索温室中准确的 3D 表型分析 | Junhong Zhao, Wei Ying, Yaoqiang Pan, Zhenfeng Yi, Chao Chen, Kewei Hu, Hanwen Kang | arxiv.org/pdf/2403.15… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-24 | Exploring the Impact of Dataset Bias on Dataset Distillation | 探索数据集偏差对数据集蒸馏的影响 | Yao Lu, Jianyang Gu, Xuguang Chen, Saeed Vahidian, Qi Xuan | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference | PaPr:使用轻量级卷积网络进行免训练一步式补丁修剪,以实现更快的推理 | Tanvir Mahmud, Burhaneddin Yaman, Chun-Hao Liu, Diana Marculescu | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Mars Spectrometry 2: Gas Chromatography -- Second place solution | 火星光谱法2:气相色谱法——第二名解决方案 | Dmitry A. Konovalov | arxiv.org/pdf/2403.15… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-24 | HemoSet: The First Blood Segmentation Dataset for Automation of Hemostasis Management | HemoSet:第一个用于止血管理自动化的血液分割数据集 | Albert J. Miao, Shan Lin, Jingpei Lu, Florian Richter, Benjamin Ostrander, Emily K. Funk, Ryan K. Orosco, Michael C. Yip | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | L-MAE: Longitudinal masked auto-encoder with time and severity-aware encoding for diabetic retinopathy progression prediction | L-MAE:纵向屏蔽自动编码器,具有时间和严重程度感知编码,用于糖尿病视网膜病变进展预测 | Rachid Zeghlache, Pierre-Henri Conze, Mostafa El Habib Daho, Yihao Li, Alireza Rezaei, Hugo Le Boité, Ramin Tadayoni, Pascal Massin, Béatrice Cochener, Ikram Brahim, et al. | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Object Detectors in the Open Environment: Challenges, Solutions, and Outlook | 开放环境中的物体检测器:挑战、解决方案和展望 | Siyuan Liang, Wei Wang, Ruoyu Chen, Aishan Liu, Boxi Wu, Ee-Chien Chang, Xiaochun Cao, Dacheng Tao | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Constricting Normal Latent Space for Anomaly Detection with Normal-only Training Data | 使用纯正态训练数据限制异常检测的正态潜在空间 | Marcella Astrid, Muhammad Zaigham Zaheer, Seung-Ik Lee | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Emotion Recognition from the perspective of Activity Recognition | 从活动识别的角度进行情绪识别 | Savinay Nagendra, Prapti Panigrahi | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Out-of-Distribution Detection via Deep Multi-Comprehension Ensemble | 通过深度多重理解集成进行分布外检测 | Chenhui Xu, Fuxun Yu, Zirui Xu, Nathan Inkawhich, Xiang Chen | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Partially Blinded Unlearning: Class Unlearning for Deep Networks a Bayesian Perspective | 部分盲解学习:贝叶斯视角下深度网络的类解学习 | Subhodip Panda, Shashwat Sourav, Prathosh A. P. | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Dual-modal Prior Semantic Guided Infrared and Visible Image Fusion for Intelligent Transportation System | 用于智能交通系统的双模先验语义引导红外和可见光图像融合 | Jing Li, Lu Bai, Bin Yang, Chang Li, Lingfei Ma, Lixin Cui, Edwin R. Hancock | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Leveraging Deep Learning and Xception Architecture for High-Accuracy MRI Classification in Alzheimer Diagnosis | 利用深度学习和 Xception 架构在阿尔茨海默病诊断中进行高精度 MRI 分类 | Shaojie Li, Haichen Qu, Xinqi Dong, Bo Dang, Hengyi Zang, Yulu Gong | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Enhancing MRI-Based Classification of Alzheimer's Disease with Explainable 3D Hybrid Compact Convolutional Transformers | 利用可解释的 3D 混合紧凑卷积变压器增强基于 MRI 的阿尔茨海默病分类 | Arindam Majee, Avisek Gupta, Sourav Raha, Swagatam Das | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Fusion of Minutia Cylinder Codes and Minutia Patch Embeddings for Latent Fingerprint Recognition | 用于潜在指纹识别的细节柱面代码和细节补丁嵌入的融合 | Yusuf Artan, Bensu Alkan Semiz | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement | Salience DETR:通过分层显着性过滤细化增强检测变压器 | Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Segment Anything Model for Road Network Graph Extraction | 用于道路网络图提取的分段任意模型 | Congrui Hetang, Haoru Xue, Cindy Le, Tianwei Yue, Wenping Wang, Yihui He | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Edit3K: Universal Representation Learning for Video Editing Components | Edit3K:视频编辑组件的通用表示学习 | Xin Gu, Libo Zhang, Fan Chen, Longyin Wen, Yufei Wang, Tiejian Luo, Sijie Zhu | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | RPMArt: Towards Robust Perception and Manipulation for Articulated Objects | RPMArt:实现铰接物体的鲁棒感知和操纵 | Junbo Wang, Wenhai Liu, Qiaojun Yu, Yang You, Liu Liu, Weiming Wang, Cewu Lu | arxiv.org/pdf/2403.16… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-24 | Adversarially Masked Video Consistency for Unsupervised Domain Adaptation | 用于无监督域适应的对抗性屏蔽视频一致性 | Xiaoyu Zhu, Junwei Liang, Po-Yao Huang, Alex Hauptmann | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Towards Online Real-Time Memory-based Video Inpainting Transformers | 迈向基于内存的在线实时视频修复变形金刚 | Guillaume Thiry, Hao Tang, Radu Timofte, Luc Van Gool | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | CFAT: Unleashing Triangular Windows for Image Super-resolution | CFAT:释放 TriangleWindows 实现图像超分辨率 | Abhisek Ray, Gaurav Kumar, Maheshkumar H. Kolekar | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Enhancing Video Transformers for Action Understanding with VLM-aided Training | 通过 VLM 辅助训练增强视频转换器的动作理解 | Hui Lu, Hu Jian, Ronald Poppe, Albert Ali Salah | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing | EVA:零样本精确属性和多对象视频编辑 | Xiangpeng Yang, Linchao Zhu, Hehe Fan, Yi Yang | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Landmark-Guided Cross-Speaker Lip Reading with Mutual Information Regularization | 具有互信息正则化的地标引导跨说话者唇读 | Linzhi Wu, Xingyu Zhang, Yakun Zhang, Changyan Zheng, Tiejun Liu, Liang Xie, Ye Yan, Erwei Yin | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | A General and Efficient Federated Split Learning with Pre-trained Image Transformers for Heterogeneous Data | 针对异构数据的具有预训练图像变换器的通用且高效的联合分割学习 | Yifan Shi, Yuhui Zhang, Ziyue Huang, Xiaofeng Yang, Li Shen, Wei Chen, Xueqian Wang | arxiv.org/pdf/2403.16… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-24 | Frankenstein: Generating Semantic-Compositional 3D Scenes in One Tri-Plane | Frankenstein:在一个三平面中生成语义组合 3D 场景 | Han Yan, Yang Li, Zhennan Wu, Shenzhou Chen, Weixuan Sun, Taizhang Shang, Weizhe Liu, Tian Chen, Xiaqiang Dai, Chao Ma, et al. | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | FH-SSTNet: Forehead Creases based User Verification using Spatio-Spatial Temporal Network | FH-SSTNet:使用时空网络进行基于额头皱纹的用户验证 | Geetanjali Sharma, Gaurav Jaswal, Aditya Nigam, Raghavendra Ramachandra | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval | BIMCV-R:3D CT 文本图像检索的里程碑数据集 | Yinda Chen, Che Liu, Xiaoyu Liu, Rossella Arcucci, Zhiwei Xiong | arxiv.org/pdf/2403.15… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-24 | Exemplar-Free Class Incremental Learning via Incremental Representation | 通过增量表示的无范例类增量学习 | Libo Huang, Zhulin An, Yan Zeng, Chuanguang Yang, Xinqiang Yu, Yongjun Xu | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Blur2Blur: Blur Conversion for Unsupervised Image Deblurring on Unknown Domains | Blur2Blur:未知域上无监督图像去模糊的模糊转换 | Bang-Dang Pham, Phong Tran, Anh Tran, Cuong Pham, Rang Nguyen, Minh Hoai | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models | 利用语义重建来减轻视觉语言模型中的幻觉 | Minchan Kim, Minyeong Kim, Junik Bae, Suhwan Choi, Sungkyung Kim, Buru Chang | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Enhancing Visual Continual Learning with Language-Guided Supervision | 通过语言引导的监督增强视觉持续学习 | Bolin Ni, Hongbo Zhao, Chenghao Zhang, Ke Hu, Gaofeng Meng, Zhaoxiang Zhang, Shiming Xiang | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval | 知识增强双流零样本合成图像检索 | Yucheng Suo, Fan Ma, Linchao Zhu, Yi Yang | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Multi-Scale Spatio-Temporal Graph Convolutional Network for Facial Expression Spotting | 用于面部表情识别的多尺度时空图卷积网络 | Yicheng Deng, Hideaki Hayashi, Hajime Nagahara | arxiv.org/pdf/2403.15… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-24 | On the Equivalency, Substitutability, and Flexibility of Synthetic Data | 论合成数据的等价性、可替代性和灵活性 | Che-Jui Chang, Danrui Li, Seonghyeon Moon, Mubbasir Kapadia | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Low Rank Groupwise Deformations for Motion Tracking in Cardiac Cine MRI | 心脏电影 MRI 中运动跟踪的低阶分组变形 | Sean Rendell, Jinming Duan | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Image Captioning in news report scenario | 新闻报道场景中的图像字幕 | Tianrui Liu, Qi Cai, Changxin Xu, Zhanxin Zhou, Jize Xiong, Yuxin Qiao, Tsungwei Yang | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | From Discrete to Continuous: Deep Fair Clustering With Transferable Representations | 从离散到连续:具有可转移表示的深度公平聚类 | Xiang Zhang | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Improving Scene Graph Generation with Relation Words' Debiasing in Vision-Language Models | 通过视觉语言模型中的关系词去偏改进场景图生成 | Yuxuan Wang, Xiaoyuan Liu | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Realtime Robust Shape Estimation of Deformable Linear Object | 可变形线性物体的实时鲁棒形状估计 | Jiaming Zhang, Zhaomeng Zhang, Yihao Liu, Yaqian Chen, Amir Kheradmand, Mehran Armand | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Self-Supervised Multi-Frame Neural Scene Flow | 自监督多帧神经场景流 | Dongrui Liu, Daqi Liu, Xueqian Li, Sihao Lin, Hongwei Xie, Bing Wang, Xiaojun Chang, Lei Chu | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Fill in the ____ (a Diffusion-based Image Inpainting Pipeline) | 填写____(基于扩散的图像修复管道) | Eyoel Gebre, Krishna Saxena, Timothy Tran | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Diverse Representation Embedding for Lifelong Person Re-Identification | 用于终身人员重新识别的多样化表示嵌入 | Shiben Liu, Huijie Fan, Qiang Wang, Xiai Chen, Zhi Han, Yandong Tang | arxiv.org/pdf/2403.16… | null |
| 2024-03-24 | Towards Two-Stream Foveation-based Active Vision Learning | 迈向基于注视点的双流主动视觉学习 | Timur Ibrayev, Amitangshu Mukherjee, Sai Aparna Aketi, Kaushik Roy | arxiv.org/pdf/2403.15… | null |