[Share][Daily Update][2024.03.28][CV_arxiv_papers]


[UPDATED!] 2024-03-28 (Publish Time)

Generative Models

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling | Bowen Zhang, Yiji Cheng, Jiaolong Yang, Chunyu Wang, Feng Zhao, Yansong Tang, Dong Chen, Baining Guo | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond | Katherine Xu, Lingzhi Zhang, Jianbo Shi | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction | Sirui Xu, Ziyin Wang, Yu-Xiong Wang, Liang-Yan Gui | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models | Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Collaborative Interactive Evolution of Art in the Latent Space of Deep Generative Models | Ole Hall, Anil Yaman | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model | Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Frame by Familiar Frame: Understanding Replication in Video Diffusion Models | Aimon Rahman, Malsha V. Perera, Vishal M. Patel | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | XScale-NVS: Cross-Scale Novel View Synthesis with Hash Featurized Manifold | Guangyu Wang, Jinzhi Zhang, Fan Wang, Ruqi Huang, Lu Fang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Debiasing Cardiac Imaging with Controlled Latent Diffusion Models | Grzegorz Skorupko, Richard Osuala, Zuzanna Szafranowska, Kaisar Kushibar, Nay Aung, Steffen E Petersen, Karim Lekadir, Polyxeni Gkontra | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication | Mingze Sun, Chao Xu, Xinyu Jiang, Yang Liu, Baigui Sun, Ruqi Huang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control | Binyuan Huang, Yuqing Wen, Yucheng Zhao, Yaosi Hu, Yingfei Liu, Fan Jia, Weixin Mao, Tiancai Wang, Chi Zhang, Chang Wen Chen, et al. | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Burst Super-Resolution with Diffusion Models for Improving Perceptual Quality | Kyotaro Tokoro, Kazutoshi Akita, Norimichi Ukita | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion | Xinyu Zhan, Lixin Yang, Yifei Zhao, Kangrui Mao, Hanlin Xu, Zenan Lin, Kailin Li, Cewu Lu | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Imperceptible Protection against Style Imitation from Diffusion Models | Namhyuk Ahn, Wonhyuk Ahn, KiYoon Yoo, Daesik Kim, Seung-Hun Nam | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | RecDiffusion: Rectangling for Image Stitching with Diffusion Models | Tianhao Zhou, Haipeng Li, Ziyi Wang, Ao Luo, Chen-Lin Zhang, Jiajun Li, Bing Zeng, Shuaicheng Liu | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation | Seyeon Kim, Siyoon Jin, Jihye Park, Kihong Kim, Jiyoung Kim, Jisu Nam, Seungryong Kim | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | QNCD: Quantization Noise Correction for Diffusion Models | Huanpeng Chu, Wei Wu, Chengjie Zang, Kun Yuan | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Synthetic Medical Imaging Generation with Generative Adversarial Networks For Plain Radiographs | John R. McNulty, Lee Kho, Alexandria L. Case, Charlie Fornaca, Drew Johnston, David Slater, Joshua M. Abzug, Sybil A. Russell | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation | Yutong He, Alexander Robey, Naoki Murata, Yiding Jiang, Joshua Williams, George J. Pappas, Hamed Hassani, Yuki Mitsufuji, Ruslan Salakhutdinov, J. Zico Kolter | arxiv.org/pdf/2403.19… | null |

Multimodal

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | Kai Zhang, Yi Luan, Hexiang Hu, Kenton Lee, Siyuan Qiao, Wenhu Chen, Yu Su, Ming-Wei Chang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation | Zhongliang Zhou, Jielu Zhang, Zihan Guan, Mengxuan Hu, Ni Lao, Lan Mu, Sheng Li, Gengchen Mai | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation | Zhenyu Wang, Yali Li, Taichi Liu, Hengshuang Zhao, Shengjin Wang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Locate, Assign, Refine: Taming Customized Image Inpainting with Text-Subject Guidance | Yulin Pan, Chaojie Mao, Zeyinzi Jiang, Zhen Han, Jingfeng Zhang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | RELI11D: A Comprehensive Multimodal Human Motion Dataset and Method | Ming Yan, Yan Zhang, Shuqiang Cai, Shuqi Fan, Xincheng Lin, Yudi Dai, Siqi Shen, Chenglu Wen, Lan Xu, Yuexin Ma, et al. | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Plug-and-Play Grounding of Reasoning in Multimodal Large Language Models | Jiaxing Chen, Yuxuan Liu, Dehu Li, Xiang An, Ziyong Feng, Yongle Zhao, Yin Xie | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Towards Multimodal Video Paragraph Captioning Models Robust to Missing Modality | Sishuo Chen, Lei Li, Shuhuai Ren, Rundong Gao, Yuanxin Liu, Xiaohan Bi, Xu Sun, Lu Hou | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Single-Shared Network with Prior-Inspired Loss for Parameter-Efficient Multi-Modal Imaging Skin Lesion Classification | Peng Tang, Tobias Lasser | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models | Yanting Wang, Hongye Fu, Wei Zou, Jinyuan Jia | arxiv.org/pdf/2403.19… | null |

NeRF

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects | Avinash Ummadisingu, Jongkeum Choi, Koki Yamane, Shimpei Masuda, Naoki Fukaya, Kuniyuki Takahashi | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians | Avinash Paliwal, Wei Ye, Jinhui Xiong, Dmytro Kotovenko, Rakesh Ranjan, Vikas Chandra, Nima Khademi Kalantari | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation | Yujin Chen, Yinyu Nie, Benjamin Ummenhofer, Reiner Birkl, Michael Paulitsch, Matthias Müller, Matthias Nießner | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Sine Activated Low-Rank Matrices for Parameter Efficient Learning | Yiping Ji, Hemanth Saratchandran, Cameron Gordon, Zeyu Zhang, Simon Lucey | arxiv.org/pdf/2403.19… | null |

3DGS

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | GauStudio: A Modular Framework for 3D Gaussian Splatting and Beyond | Chongjie Ye, Yinyu Nie, Jiahao Chang, Yuantao Chen, Yihao Zhi, Xiaoguang Han | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | SA-GS: Scale-Adaptive Gaussian Splatting for Training-Free Anti-Aliasing | Xiaowei Song, Jv Zheng, Shiran Yuan, Huan-ang Gao, Jingwei Zhao, Xiang He, Weihao Gu, Hao Zhao | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | TOGS: Gaussian Splatting with Temporal Opacity Offset for Real-Time 4D DSA Rendering | Shuai Zhang, Huangxuan Zhao, Zhenghong Zhou, Guanjun Wu, Chuansheng Zheng, Xinggang Wang, Wenyu Liu | arxiv.org/pdf/2403.19… | null |

Model Compression/Optimization

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts | Yuzheng Wang, Dingkang Yang, Zhaoyu Chen, Yang Liu, Siao Liu, Wenqiang Zhang, Lihua Zhang, Lizhe Qi | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment | Alireza Ganjdanesh, Shangqian Gao, Heng Huang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | AZ-NAS: Assembling Zero-Cost Proxies for Network Architecture Search | Junghyup Lee, Bumsub Ham | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Within the Dynamic Context: Inertia-aware 3D Human Modeling with Pose Sequence | Yutong Chen, Yifan Zhan, Zhihang Zhong, Wei Wang, Xiao Sun, Yu Qiao, Yinqiang Zheng | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Uncertainty-Aware Deep Video Compression with Ensembles | Wufei Ma, Jiahao Li, Bin Li, Yan Lu | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation | Lingjun Zhao, Jingyu Song, Katherine A. Skinner | arxiv.org/pdf/2403.19… | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | RSMamba: Remote Sensing Image Classification with State Space Model | Keyan Chen, Bowen Chen, Chenyang Liu, Wenyuan Li, Zhengxia Zou, Zhenwei Shi | arxiv.org/pdf/2403.19… | link |
| 2024-03-28 | Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning | Chenyang Liu, Keyan Chen, Haotian Zhang, Zipeng Qi, Zhengxia Zou, Zhenwei Shi | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Siamese Vision Transformers are Scalable Audio-visual Learners | Yan-Bo Lin, Gedas Bertasius | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | ILPO-NET: Network for the invariant recognition of arbitrary volumetric patterns in 3D | Dmitrii Zhemchuzhnikov, Sergei Grudinin | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | Donghyun Kim, Byeongho Heo, Dongyoon Han | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | The Bad Batches: Enhancing Self-Supervised Learning in Image Classification Through Representative Batch Curation | Ozgu Goksu, Nicolas Pugeault | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Cross-Attention is Not Always Needed: Dynamic Cross-Attention for Audio-Visual Dimensional Emotion Recognition | R. Gnana Praveen, Jahangir Alam | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation | Xiao Lin, Wenfei Yang, Yuan Gao, Tianzhu Zhang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Surface-based parcellation and vertex-wise analysis of ultra high-resolution ex vivo 7 tesla MRI in neurodegenerative diseases | Pulkit Khandelwal, Michael Tran Duong, Constanza Fuentes, Amanda Denning, Winifred Trotman, Ranjit Ittyerah, Alejandra Bahena, Theresa Schuck, Marianna Gabrielyan, Karthik Prabhakaran, et al. | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Segmentation tool for images of cracks | Andrii Kompanets, Remco Duits, Davide Leonetti, Nicky van den Berg, H. H. Snijder | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Transparent and Clinically Interpretable AI for Lung Cancer Detection in Chest X-Rays | Amy Rafferty, Rishi Ramaesh, Ajitha Rajan | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | A Robust Ensemble Algorithm for Ischemic Stroke Lesion Segmentation: Generalizability and Clinical Utility Beyond the ISLES Challenge | Ezequiel de la Rosa, Mauricio Reyes, Sook-Lei Liew, Alexandre Hutton, Roland Wiest, Johannes Kaesmacher, Uta Hanning, Arsany Hakim, Richard Zubal, Waldo Valenzuela, et al. | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Towards Temporally Consistent Referring Video Object Segmentation | Bo Miao, Mohammed Bennamoun, Yongsheng Gao, Mubarak Shah, Ajmal Mian | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Infrared Small Target Detection with Scale and Location Sensitivity | Qiankun Liu, Rui Liu, Bolun Zheng, Hongkui Wang, Ying Fu | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Test-Time Domain Generalization for Face Anti-Spoofing | Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Xuequan Lu, Shouhong Ding, Lizhuang Ma | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Hypergraph-based Multi-View Action Recognition using Event Cameras | Yue Gao, Jiaxuan Lu, Siqi Li, Yipeng Li, Shaoyi Du | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction | Xiaoyang Lyu, Chirui Chang, Peng Dai, Yang-tian Sun, Xiaojuang Qi | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Sparse Generation: Making Pseudo Labels Sparse for weakly supervision with points | Tian Ma, Chuyang Shang, Wanzhu Ren, Yuancheng Li, Jiiayi Yang, Jiali Qian | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection | Mikhail Kennerley, Jian-Gang Wang, Bharadwaj Veeravalli, Robby T. Tan | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Efficient and Effective Weakly-Supervised Action Segmentation via Action-Transition-Aware Boundary Alignment | Angchi Xu, Wei-Shi Zheng | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Learning Multiple Representations with Inconsistency-Guided Detail Regularization for Mask-Guided Matting | Weihao Jiang, Zhaozhi Xie, Yuxiang Lu, Longjie Qi, Jingyong Cai, Hiroyuki Uchiyama, Bin Chen, Yue Ding, Hongtao Lu | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Rethinking Information Loss in Medical Image Segmentation with Various-sized Targets | Tianyi Liu, Zhaorui Tan, Kaizhu Huang, Haochuan Jiang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Algorithmic Ways of Seeing: Using Object Detection to Facilitate Art Exploration | Louie Søs Meyer, Johanne Engel Aaen, Anitamalina Regitse Tranberg, Peter Kun, Matthias Freiberger, Sebastian Risi, Anders Sundnes Løvlie | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | CLAP4CLIP: Continual Learning with Probabilistic Finetuning for Vision-Language Models | Saurav Jha, Dong Gong, Lina Yao | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition | Jianqiang Wan, Sibo Song, Wenwen Yu, Yuliang Liu, Wenqing Cheng, Fei Huang, Xiang Bai, Cong Yao, Zhibo Yang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | PoCo: A Self-Supervised Approach via Polar Transformation Based Progressive Contrastive Learning for Ophthalmic Disease Diagnosis | Jinhong Wang, Tingting Chen, Jintai Chen, Yixuan Wu, Yuyang Xu, Danny Chen, Haochao Ying, Jian Wu | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Patch Spatio-Temporal Relation Prediction for Video Anomaly Detection | Hao Shen, Lu Shi, Wanru Xu, Yigang Cen, Linna Zhang, Gaoyun An | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | A Real-Time Framework for Domain-Adaptive Underwater Object Detection with Image Enhancement | Junjie Wen, Jinqiang Cui, Benyun Zhao, Bingxin Han, Xuchen Liu, Zhi Gao, Ben M. Chen | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach | Wei Dong, Xing Zhang, Bihui Chen, Dawei Yan, Zhijun Lin, Qingsen Yan, Peng Wang, Yang Yang | arxiv.org/pdf/2403.19… | null |

GNN

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks | Yaxu Xie, Alain Pagani, Didier Stricker | arxiv.org/pdf/2403.19… | null |

Image Understanding

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | Model Stock: All we need is just a few fine-tuned models | Dong-Hwan Jang, Sangdoo Yun, Dongyoon Han | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | A Simple and Effective Point-based Network for Event Camera 6-DOFs Pose Relocalization | Hongwei Ren, Jiadong Zhu, Yue Zhou, Haotian FU, Yulong Huang, Bojun Cheng | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | FlowDepth: Decoupling Optical Flow for Self-Supervised Monocular Depth Estimation | Yiyang Sun, Zhiyuan Xu, Xiaonian Wang, Jing Yao | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | AAPMT: AGI Assessment Through Prompt and Metric Transformer | Benhao Huang | arxiv.org/pdf/2403.19… | null |

Transformer

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | BAMM: Bidirectional Autoregressive Motion Model | Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang, Minwoo Lee, Srijan Das, Chen Chen | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | PointCloud-Text Matching: Benchmark Datasets and a Baseline | Yanglin Feng, Yang Qin, Dezhong Peng, Hongyuan Zhu, Xi Peng, Peng Hu | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | MedBN: Robust Test-Time Adaptation against Malicious Test Samples | Hyejin Park, Jeongyeon Hwang, Sunung Mun, Sangdon Park, Jungseul Ok | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | RTracker: Recoverable Tracking via PN Tree Structured Memory | Yuqing Huang, Xin Li, Zikun Zhou, Yaowei Wang, Zhenyu He, Ming-Hsuan Yang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | GraphAD: Interaction Scene Graph for End-to-end Autonomous Driving | Yunpeng Zhang, Deheng Qian, Ding Li, Yifeng Pan, Yong Chen, Zhenbao Liang, Zhiyao Zhang, Shurui Zhang, Hongxu Li, Maolei Fu, et al. | arxiv.org/pdf/2403.19… | null |

3D/CG

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | GraspXL: Generating Grasping Motions for Diverse Objects at Scale | Hui Zhang, Sammy Christen, Zicong Fan, Otmar Hilliges, Jie Song | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Situation Awareness for Driver-Centric Driving Style Adaptation | Johann Haselberger, Bonifaz Stuhr, Bernhard Schick, Steffen Müller | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes | Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, et al. | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | GlORIE-SLAM: Globally Optimized RGB-only Implicit Encoding Point Cloud SLAM | Ganlin Zhang, Erik Sandström, Youmin Zhang, Manthan Patel, Luc Van Gool, Martin R. Oswald | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM | Tongyan Hua, Lin Wang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Brain-Shift: Unsupervised Pseudo-Healthy Brain Synthesis for Novel Biomarker Extraction in Chronic Subdural Hematoma | Baris Imre, Elina Thibeau-Sutre, Jorieke Reimer, Kuan Kho, Jelmer M. Wolterink | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | NIGHT -- Non-Line-of-Sight Imaging from Indirect Time of Flight Data | Matteo Caligiuri, Adriano Simonetto, Gianluca Agresti, Pietro Zanuttigh | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Neural Fields for 3D Tracking of Anatomy and Surgical Instruments in Monocular Laparoscopic Video Clips | Beerend G. A. Gerats, Jelmer M. Wolterink, Seb P. Mol, Ivo A. M. J. Broeders | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | GeoAuxNet: Towards Universal 3D Representation Learning for Multi-sensor Point Clouds | Shengjun Zhang, Xin Fei, Yueqi Duan | arxiv.org/pdf/2403.19… | null |

Learning Paradigms

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation | Jiacui Huang, Hongtao Zhang, Mingbo Zhao, Zhou Wu | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Text Data-Centric Image Captioning with Interactive Prompts | Yiyu Wang, Hao Luo, Jungang Xu, Yingfei Sun, Fan Wang | arxiv.org/pdf/2403.19… | null |

Others

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2024-03-28 | RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents | Zeren Chen, Zhelun Shi, Xiaoya Lu, Lehan He, Sucheng Qian, Hao Shu Fang, Zhenfei Yin, Wanli Ouyang, Jing Shao, Yu Qiao, et al. | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Nearest Neighbor Classication for Classical Image Upsampling | Evan Matthews, Nicolas Prate | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Semantic Map-based Generation of Navigation Instructions | Chengzu Li, Chao Zhang, Simone Teufel, Rama Sanand Doddipatla, Svetlana Stoyanchev | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | LocCa: Visual Pretraining with Location-aware Captioners | Bo Wan, Michael Tschannen, Yongqin Xian, Filip Pavetic, Ibrahim Alabdulmohsin, Xiao Wang, André Susano Pinto, Andreas Steiner, Lucas Beyer, Xiaohua Zhai | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | CDIMC-net: Cognitive Deep Incomplete Multi-view Clustering Network | Jie Wen, Zheng Zhang, Yong Xu, Bob Zhang, Lunke Fei, Guo-Sen Xie | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Break-for-Make: Modular Low-Rank Adaptations for Composable Content-Style Customization | Yu Xu, Fan Tang, Juan Cao, Yuxin Zhang, Oliver Deussen, Weiming Dong, Jintao Li, Tong-Yee Lee | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Taming Lookup Tables for Efficient Image Retouching | Sidi Yang, Binxiao Huang, Mingdeng Cao, Yatai Ji, Hanzhong Guo, Ngai Wong, Yujiu Yang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation | Haonan Lin, Mengmeng Wang, Yan Chen, Wenbin An, Yuzhe Yao, Guang Dai, Qianying Wang, Yong Liu, Jingdong Wang | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | From Activation to Initialization: Scaling Insights for Optimizing Neural Fields | Hemanth Saratchandran, Sameera Ramasinghe, Simon Lucey | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | D'OH: Decoder-Only random Hypernetworks for Implicit Neural Representations | Cameron Gordon, Lachlan Ewen MacDonald, Hemanth Saratchandran, Simon Lucey | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Towards Understanding Dual BN In Hybrid Adversarial Training | Chenshuang Zhang, Chaoning Zhang, Kang Zhang, Axi Niu, Junmo Kim, In So Kweon | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | MVEB: Self-Supervised Learning with Multi-View Entropy Bottleneck | Liangjian Wen, Xiasi Wang, Jianzhuang Liu, Zenglin Xu | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Tiny Machine Learning: Progress and Futures | Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, Song Han | arxiv.org/pdf/2403.19… | null |
| 2024-03-28 | Generative Quanta Color Imaging | Vishal Purohit, Junjie Luo, Yiheng Chi, Qi Guo, Stanley H. Chan, Qiang Qiu | arxiv.org/pdf/2403.19… | null |