[Share][Daily Update][2024.03.21][CV_arxiv_papers]


[UPDATED!] 2024-03-21 (Publish Time)

Generative Models

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | Simplified Diffusion Schrödinger Bridge | 简化扩散薛定谔桥 | Zhicong Tang, Tiankai Hang, Shuyang Gu, Dong Chen, Baining Guo | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation | GRM:用于高效 3D 重建和生成的大型高斯重建模型 | Yinghao Xu, Zifan Shi, Wang Yifan, Hansheng Chen, Ceyuan Yang, Sida Peng, Yujun Shen, Gordon Wetzstein | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | ClusteringSDF: Self-Organized Neural Implicit Surfaces for 3D Decomposition | ClusteringSDF:用于 3D 分解的自组织神经隐式曲面 | Tianhao Wu, Chuanxia Zheng, Tat-Jen Cham, Qianyi Wu | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | DreamReward: Text-to-3D Generation with Human Preference | DreamReward:根据人类偏好生成文本到 3D | Junliang Ye, Fangfu Liu, Qixiu Li, Zhengyi Wang, Yikai Wang, Xinzhou Wang, Yueqi Duan, Jun Zhu | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | ReNoise: Real Image Inversion Through Iterative Noising | ReNoise:通过迭代噪声进行真实图像反转 | Daniel Garibi, Or Patashnik, Andrey Voynov, Hadar Averbuch-Elor, Daniel Cohen-Or | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Object-Centric Domain Randomization for 3D Shape Reconstruction in the Wild | 用于野外 3D 形状重建的以对象为中心的域随机化 | Junhyeong Cho, Kim Youwang, Hunmin Yang, Tae-Hyun Oh | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression | HAC:用于 3D 高斯泼溅压缩的哈希网格辅助上下文 | Yihang Chen, Qianyi Wu, Jianfei Cai, Mehrtash Harandi, Weiyao Lin | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Click to Grasp: Zero-Shot Precise Manipulation via Visual Diffusion Descriptors | 点击掌握:通过视觉扩散描述符进行零射击精确操作 | Nikolaos Tsagkas, Jack Rome, Subramanian Ramamoorthy, Oisin Mac Aodha, Chris Xiaoxuan Lu | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Denoising Diffusion Models for 3D Healthy Brain Tissue Inpainting | 用于 3D 健康脑组织修复的去噪扩散模型 | Alicia Durrer, Julia Wolleb, Florentin Bieder, Paul Friedrich, Lester Melie-Garcia, Mario Ocampo-Pineda, Cosmin I. Bercea, Ibrahim E. Hamamci, Benedikt Wiestler, Marie Piraud, et al. | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation | 用于半监督组织病理学分割的风格提取扩散模型 | Mathias Öttl, Frauke Wilm, Jana Steenpass, Jingna Qiu, Matthias Rübner, Arndt Hartmann, Matthias Beckmann, Peter Fasching, Andreas Maier, Ramona Erber, et al. | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | DP-RDM: Adapting Diffusion Models to Private Domains Without Fine-Tuning | DP-RDM:无需微调即可使扩散模型适应私有域 | Jonathan Lebensold, Maziar Sanjabi, Pietro Astolfi, Adriana Romero-Soriano, Kamalika Chaudhuri, Mike Rabbat, Chuan Guo | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation | OA-CNN:用于 3D 语义分割的全自适应稀疏 CNN | Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | A Bag of Tricks for Few-Shot Class-Incremental Learning | 少样本类增量学习的一大堆技巧 | Shuvendu Roy, Chunjong Park, Aldi Fahrezi, Ali Etemad | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | InfNeRF: Towards Infinite Scale NeRF Rendering with O(log n) Space Complexity | InfNeRF:以 O(log n) 空间复杂度实现无限规模 NeRF 渲染 | Jiabin Liang, Lanqing Zhang, Zhuoran Zhao, Xiangyu Xu | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models | 具有令牌优化的开放词汇注意力图用于扩散模型中的语义分割 | Pablo Marcos-Manchón, Roberto Alcover-Couso, Juan C. SanMiguel, Jose M. Martínez | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Zero123-6D: Zero-shot Novel View Synthesis for RGB Category-level 6D Pose Estimation | Zero123-6D:用于 RGB 类别级 6D 姿势估计的零样本新颖视图合成 | Francesco Di Felice, Alberto Remus, Stefano Gasperini, Benjamin Busam, Lionel Ott, Federico Tombari, Roland Siegwart, Carlo Alberto Avizzano | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Diffusion Models with Ensembled Structure-Based Anomaly Scoring for Unsupervised Anomaly Detection | 用于无监督异常检测的具有基于集成结构的异常评分的扩散模型 | Finn Behrendt, Debayan Bhattacharya, Lennart Maack, Julia Krüger, Roland Opfer, Robin Mieling, Alexander Schlaefer | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN | StyleCineGAN:使用预先训练的 StyleGAN 生成景观电影图片 | Jongwoo Choi, Kwanggyoon Seo, Amirsaman Ashtari, Junyong Noh | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Mini-Splatting: Representing Scenes with a Constrained Number of Gaussians | Mini-Splatting:用有限数量的高斯表示场景 | Guangchi Fang, Bing Wang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition | 通过内容帧运动潜在分解的高效视频扩散模型 | Sihyun Yu, Weili Nie, De-An Huang, Boyi Li, Jinwoo Shin, Anima Anandkumar | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Powerful Lossy Compression for Noisy Images | 针对噪声图像的强大有损压缩 | Shilv Cai, Xiaoguo Liang, Shuning Cao, Luxin Yan, Sheng Zhong, Liqun Chen, Xu Zou | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | QSMDiff: Unsupervised 3D Diffusion Models for Quantitative Susceptibility Mapping | QSMDiff:用于定量磁化率绘图的无监督 3D 扩散模型 | Zhuang Xiong, Wei Jiang, Yang Gao, Feng Liu, Hongfu Sun | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | LeFusion: Synthesizing Myocardial Pathology on Cardiac MRI via Lesion-Focus Diffusion Models | LeFusion:通过病变焦点扩散模型在心脏 MRI 上综合心肌病理学 | Hantao Zhang, Jiancheng Yang, Shouhong Wan, Pascal Fua | arxiv.org/pdf/2403.14… | null |

Multimodal

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? | MathVerse:您的多模式法学硕士能否真正看到视觉数学问题中的图表? | Renrui Zhang, Dongzhi Jiang, Yichi Zhang, Haokun Lin, Ziyu Guo, Pengshuo Qiu, Aojun Zhou, Pan Lu, Kai-Wei Chang, Peng Gao, et al. | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Language Repository for Long Video Understanding | 长视频理解语言库 | Kumara Kahatapitiya, Kanchana Ranasinghe, Jongwoo Park, Michael S. Ryoo | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model | PSALM:具有大型多模态模型的像素分割 | Zheng Zhang, Yeyao Ma, Enming Zhang, Xiang Bai | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference | Cobra:将 Mamba 扩展到多模态大型语言模型以实现高效推理 | Han Zhao, Min Zhang, Wei Zhao, Pengxiang Ding, Siteng Huang, Donglin Wang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination | Pensieve:回顾然后比较可以减轻幻视 | Dingchen Yang, Bowen Cao, Guang Chen, Changjun Jiang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding | LayoutLLM:大型语言模型指令调整,以实现视觉丰富的文档理解 | Masato Fujitake | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Dermacen Analytica: A Novel Methodology Integrating Multi-Modal Large Language Models with Machine Learning in tele-dermatology | Dermacen Analytica:一种将多模态大型语言模型与远程皮肤病学机器学习相结合的新方法 | Dimitrios P. Panagoulias, Evridiki Tsoureli-Nikita, Maria Virvou, George A. Tsihrintzis | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Unsupervised Audio-Visual Segmentation with Modality Alignment | 具有模态对齐的无监督视听分割 | Swapnil Bhosale, Haosen Yang, Diptesh Kanojia, Jiangkang Deng, Xiatian Zhu | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | OTSeg: Multi-prompt Sinkhorn Attention for Zero-Shot Semantic Segmentation | OTSeg:零样本语义分割的多提示 Sinkhorn 注意力 | Kwanyoung Kim, Yujin Oh, Jong Chul Ye | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Leveraging Large Language Model-based Room-Object Relationships Knowledge for Enhancing Multimodal-Input Object Goal Navigation | 利用基于大语言模型的房间-对象关系知识来增强多模式输入对象目标导航 | Leyuan Sun, Asako Kanezaki, Guillaume Caron, Yusuke Yoshiyasu | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Empowering Segmentation Ability to Multi-modal Large Language Models | 增强多模态大型语言模型的细分能力 | Yuqi Yang, Peng-Tao Jiang, Jing Wang, Hao Zhang, Kai Zhao, Jinwei Chen, Bo Li | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Leveraging Thermal Modality to Enhance Reconstruction in Low-Light Conditions | 利用热模态增强弱光条件下的重建 | Jiacong Xu, Mingqian Liao, K Ram Prabhakar, Vishal M. Patel | arxiv.org/pdf/2403.14… | null |

NeRF

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | CombiNeRF: A Combination of Regularization Techniques for Few-Shot Neural Radiance Field View Synthesis | CombiNeRF:用于少样本神经辐射场视图合成的正则化技术组合 | Matteo Bonotto, Luigi Sarrocco, Daniele Evangelista, Marco Imperoli, Alberto Pretto | arxiv.org/pdf/2403.14… | null |

3DGS

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images | MVSplat:稀疏多视图图像的高效 3D 高斯分布 | Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, Jianfei Cai | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering | 高斯磨砂:具有实时渲染的可编辑复杂辐射场 | Antoine Guédon, Vincent Lepetit | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Isotropic Gaussian Splatting for Real-Time Radiance Field Rendering | 用于实时辐射场渲染的各向同性高斯喷射 | Yuanhao Gong, Lantao Yu, Guanghui Yue | arxiv.org/pdf/2403.14… | null |

Model Compression/Optimization

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | Learning to Project for Cross-Task Knowledge Distillation | 学习项目以进行跨任务知识蒸馏 | Dylan Auty, Roy Miles, Benedikt Kolbeinsson, Krystian Mikolajczyk | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels | 标签不足的开放式视频问答的排名蒸馏 | Tianming Liang, Chaolei Tan, Beihao Xia, Wei-Shi Zheng, Jian-Fang Hu | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Accelerating ViT Inference on FPGA through Static and Dynamic Pruning | 通过静态和动态修剪加速 FPGA 上的 ViT 推理 | Dhruv Parikh, Shouyi Li, Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna | arxiv.org/pdf/2403.14… | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | ODTFormer: Efficient Obstacle Detection and Tracking with Stereo Cameras Based on Transformer | ODTFormer:基于 Transformer 的立体相机高效障碍物检测与跟踪 | Tianye Ding, Hongyu Li, Huaizu Jiang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors | LiFT:密集 ViT 描述符的极其简单的轻量级特征转换 | Saksham Suri, Matthew Walmer, Kamal Gupta, Abhinav Shrivastava | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy | T-Rex2:通过文本-视觉提示协同实现通用对象检测 | Qing Jiang, Feng Li, Zhaoyang Zeng, Tianhe Ren, Shilong Liu, Lei Zhang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | VXP: Voxel-Cross-Pixel Large-scale Image-LiDAR Place Recognition | VXP:体素-跨像素大尺寸图像-LiDAR地点识别 | Yun-Jin Li, Mariia Gladkova, Yan Xia, Rui Wang, Daniel Cremers | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Token Transformation Matters: Towards Faithful Post-hoc Explanation for Vision Transformer | 代币转换很重要:对 Vision Transformer 进行忠实的事后解释 | Junyi Wu, Bin Duan, Weitai Kang, Hao Tang, Yan Yan | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Estimating Physical Information Consistency of Channel Data Augmentation for Remote Sensing Images | 估计遥感图像通道数据增强的物理信息一致性 | Tom Burgert, Begüm Demir | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Transfer Learning for Cross-dataset Isolated Sign Language Recognition in Under-Resourced Datasets | 资源贫乏数据集中跨数据集隔离手语识别的迁移学习 | Ahmet Alp Kindiroglu, Ozgur Kara, Ogulcan Ozdemir, Lale Akarun | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Invisible Needle Detection in Ultrasound: Leveraging Mechanism-Induced Vibration | 超声波中的隐形针检测:利用机制引起的振动 | Chenyang Li, Dianye Huang, Angelos Karlas, Nassir Navab, Zhongliang Jiang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | MULDE: Multiscale Log-Density Estimation via Denoising Score Matching for Video Anomaly Detection | MULDE:通过去噪分数匹配进行多尺度对数密度估计,用于视频异常检测 | Jakub Micorek, Horst Possegger, Dominik Narnhofer, Horst Bischof, Mateusz Kozinski | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Adversary-Robust Graph-Based Learning of WSIs | 基于对抗鲁棒图的 WSI 学习 | Saba Heidari Gheshlaghi, Milan Aryal, Nasim Yahyasoltani, Masoud Ganji | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing | DesignEdit:多层潜在分解和融合,实现统一准确的图像编辑 | Yueru Jia, Yuhui Yuan, Aosong Cheng, Chuke Wang, Ji Li, Huizhu Jia, Shanghang Zhang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | HyperGALE: ASD Classification via Hypergraph Gated Attention with Learnable Hyperedges | HyperGALE:通过具有可学习超边的超图门控注意力进行 ASD 分类 | Mehul Arora, Chirag Shantilal Jain, Lalith Bharadwaj Baru, Kamalaker Dadi, Bapi Raju Surampudi | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers | CathFlow:使用光流和变压器对介入超声中的导管进行自监督分割 | Alex Ranne, Liming Kuang, Yordanka Velikova, Nassir Navab, Ferdinando Rodriguez y Baena | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Analysing Diffusion Segmentation for Medical Images | 分析医学图像的扩散分割 | Mathias Öttl, Siyuan Mei, Frauke Wilm, Jana Steenpass, Matthias Rübner, Arndt Hartmann, Matthias Beckmann, Peter Fasching, Andreas Maier, Ramona Erber, et al. | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Raw Instinct: Trust Your Classifiers and Skip the Conversion | 原始本能:相信您的分类器并跳过转换 | Christos Kantas, Bjørk Antoniussen, Mathias V. Andersen, Rasmus Munksø, Shobhit Kotnala, Simon B. Jensen, Andreas Møgelmose, Lau Nørgaard, Thomas B. Moeslund | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Biased Binary Attribute Classifiers Ignore the Majority Classes | 有偏差的二元属性分类器忽略大多数类 | Xinyi Zhang, Johanna Sophie Bieri, Manuel Günther | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Tensor network compressibility of convolutional models | 卷积模型的张量网络可压缩性 | Sukhbinder Singh, Saeed S. Jahromi, Roman Orus | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Varroa destructor detection on honey bees using hyperspectral imagery | 使用高光谱图像检测蜜蜂瓦螨破坏者 | Zina-Sabrina Duma, Tomas Zemcik, Simon Bilik, Tuomas Sihvonen, Peter Honec, Satu-Pia Reinikainen, Karel Horak | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | LDTR: Transformer-based Lane Detection with Anchor-chain Representation | LDTR:具有锚链表示的基于变压器的车道检测 | Zhongyu Yang, Chen Shen, Wei Shao, Tengfei Xing, Runbo Hu, Pengfei Xu, Hua Chai, Ruini Xue | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Annotation-Efficient Polyp Segmentation via Active Learning | 通过主动学习进行注释高效的息肉分割 | Duojun Huang, Xinyu Xiong, De-Jun Fan, Feng Gao, Xiao-Jian Wu, Guanbin Li | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Towards Efficient Information Fusion: Concentric Dual Fusion Attention Based Multiple Instance Learning for Whole Slide Images | 迈向高效信息融合:基于同心双融合注意力的整个幻灯片图像的多实例学习 | Yujian Liu, Ruoxuan Wu, Xinjie Shen, Zihuang Lu, Lingyu Liang, Haiyu Zhou, Shipu Xu, Shaoai Cai, Shidang Xu | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | FFT-based Selection and Optimization of Statistics for Robust Recognition of Severely Corrupted Images | 基于 FFT 的统计选择和优化,用于严重损坏图像的鲁棒识别 | Elena Camuffo, Umberto Michieli, Jijoong Moon, Daehyun Kim, Mete Ozay | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Exosense: A Vision-Centric Scene Understanding System For Safe Exoskeleton Navigation | Exosense:用于安全外骨骼导航的以视觉为中心的场景理解系统 | Jianeng Wang, Matias Mattamala, Christina Kassab, Lintong Zhang, Maurice Fallon | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | A Lightweight Attention-based Deep Network via Multi-Scale Feature Fusion for Multi-View Facial Expression Recognition | 通过多尺度特征融合的轻量级基于注意力的深度网络用于多视图面部表情识别 | Ali Ezati, Mohammadreza Dezyani, Rajib Rana, Roozbeh Rajabi, Ahmad Ayatollahi | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Impact Assessment of Missing Data in Model Predictions for Earth Observation Applications | 地球观测应用模型预测中缺失数据的影响评估 | Francisco Mena, Diego Arenas, Marcela Charfuelan, Marlon Nuske, Andreas Dengel | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Exploring Green AI for Audio Deepfake Detection | 探索用于音频 Deepfake 检测的绿色 AI | Subhajit Saha, Md Sahidullah, Swagatam Das | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection | 场景图 ViT:端到端开放词汇视觉关系检测 | Tim Salzmann, Markus Ryll, Alex Bewley, Matthias Minderer | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Safeguarding Medical Image Segmentation Datasets against Unauthorized Training via Contour- and Texture-Aware Perturbations | 通过轮廓和纹理感知扰动保护医学图像分割数据集免受未经授权的训练 | Xun Lin, Yi Yu, Song Xia, Jue Jiang, Haoran Wang, Zitong Yu, Yizhong Liu, Ying Fu, Shuai Wang, Wenzhong Tang, et al. | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | ResNet101 and DAE for Enhance Quality and Classification Accuracy in Skin Cancer Imaging | ResNet101 和 DAE 用于提高皮肤癌成像的质量和分类准确性 | Sibasish Dhibar | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | RG-CAT: Detection Pipeline and Catalogue of Radio Galaxies in the EMU Pilot Survey | RG-CAT:EMU 试点巡天中射电星系的探测管道和目录 | Nikhel Gupta, Ray P. Norris, Zeeshan Hayder, Minh Huynh, Lars Petersson, X. Rosalind Wang, Andrew M. Hopkins, Heinz Andernach, Yjan Gordon, Simone Riggi, et al. | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | SoftPatch: Unsupervised Anomaly Detection with Noisy Data | SoftPatch:使用噪声数据进行无监督异常检测 | Xi Jiang, Ying Chen, Qiang Nie, Yong Liu, Jianlin Liu, Bin-Bin Gao, Jun Liu, Chengjie Wang, Feng Zheng | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Toward Multi-class Anomaly Detection: Exploring Class-aware Unified Model against Inter-class Interference | 面向多类异常检测:探索针对类间干扰的类感知统一模型 | Xi Jiang, Ying Chen, Qiang Nie, Jianlin Liu, Yong Liu, Chengjie Wang, Feng Zheng | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | PECI-Net: Bolus segmentation from video fluoroscopic swallowing study images using preprocessing ensemble and cascaded inference | PECI-Net:使用预处理集成和级联推理对视频透视吞咽研究图像进行团注分割 | Dougho Park, Younghun Kim, Harim Kang, Junmyeoung Lee, Jinyoung Choi, Taeyeon Kim, Sangeok Lee, Seokil Son, Minsol Kim, Injung Kim | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding | 静动态统一网络:视频接地的高效时域过滤 | Jingjing Hu, Dan Guo, Kun Li, Zhan Si, Xun Yang, Xiaojun Chang, Meng Wang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Learning Decomposable and Debiased Representations via Attribute-Centric Information Bottlenecks | 通过以属性为中心的信息瓶颈学习可分解和有偏差的表示 | Jinyung Hong, Eun Som Jeon, Changhoon Kim, Keun Hee Park, Utkarsh Nath, Yezhou Yang, Pavan Turaga, Theodore P. Pavlic | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Evidential Semantic Mapping in Off-road Environments with Uncertainty-aware Bayesian Kernel Inference | 使用不确定性感知贝叶斯核推理在越野环境中进行证据语义映射 | Junyoung Kim, Junwon Seo, Jihong Min | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Improving Image Classification Accuracy through Complementary Intra-Class and Inter-Class Mixup | 通过互补的类内和类间混合提高图像分类精度 | Ye Xu, Ya Gao, Xiaorong Qiu, Yang Chen, Ying Ji | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | 3D Object Detection from Point Cloud via Voting Step Diffusion | 通过投票步骤扩散从点云检测 3D 对象 | Haoran Hou, Mingtao Feng, Zijie Wu, Weisheng Dong, Qing Zhu, Yaonan Wang, Ajmal Mian | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Soft Masked Transformer for Point Cloud Processing with Skip Attention-Based Upsampling | 用于点云处理的软掩模变压器,具有基于跳过注意力的上采样 | Yong He, Hongshan Yu, Muhammad Ibrahim, Xiaoyan Liu, Tongjia Chen, Anwaar Ulhaq, Ajmal Mian | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Training point-based deep learning networks for forest segmentation with synthetic data | 使用合成数据训练基于点的深度学习网络进行森林分割 | Francisco Raverta Capua, Juan Schandin, Pablo De Cristóforis | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Test-time Similarity Modification for Person Re-identification toward Temporal Distribution Shift | 针对时间分布转移的人员重新识别的测试时相似性修改 | Kazuki Adachi, Shohei Enomoto, Taku Sasaki, Shin'ya Yamaguchi | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition | 用于全景活动识别的时空接近感知双路径模型 | Sumin Lee, Yooseung Wang, Sangmin Woo, Changick Kim | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | MaskSAM: Towards Auto-prompt SAM with Mask Classification for Medical Image Segmentation | MaskSAM:针对医学图像分割具有掩模分类的自动提示 SAM | Bin Xie, Hao Tang, Bin Duan, Dawen Cai, Yan Yan | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Unsupervised Intrinsic Image Decomposition with LiDAR Intensity Enhanced Training | 利用 LiDAR 强度增强训练进行无监督本征图像分解 | Shogo Sato, Takuhiro Kaneko, Kazuhiko Murasaki, Taiga Yoshida, Ryuichi Tanida, Akisato Kimura | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Surface Reconstruction from Point Clouds via Grid-based Intersection Prediction | 通过基于网格的交叉点预测从点云重建表面 | Hui Tian, Kai Xu | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | EventDance: Unsupervised Source-free Cross-modal Adaptation for Event-based Object Recognition | EventDance:用于基于事件的对象识别的无监督无源跨模式适应 | Xu Zheng, Lin Wang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Semantics from Space: Satellite-Guided Thermal Semantic Segmentation Annotation for Aerial Field Robots | 来自太空的语义:航空领域机器人的卫星引导热语义分割注释 | Connor Lee, Saraswati Soedarmadji, Matthew Anderson, Anthony J. Clark, Soon-Jo Chung | arxiv.org/pdf/2403.14… | null |

Image Understanding

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | Enhancing Historical Image Retrieval with Compositional Cues | 通过构图线索增强历史图像检索 | Tingyu Lin, Robert Sablatnig | arxiv.org/pdf/2403.14… | null |

LLM

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | Detoxifying Large Language Models via Knowledge Editing | 通过知识编辑消除大型语言模型的毒害 | Mengru Wang, Ningyu Zhang, Ziwen Xu, Zekun Xi, Shumin Deng, Yunzhi Yao, Qishen Zhang, Linyi Yang, Jindong Wang, Huajun Chen | arxiv.org/pdf/2403.14… | link |
| 2024-03-21 | Less but Better: Enabling Generalized Zero-shot Learning Towards Unseen Domains by Intrinsic Learning from Redundant LLM Semantics | 更少但更好:通过冗余 LLM 语义的内在学习实现对未见领域的广义零样本学习 | Jiaqi Yue, Jiancheng Zhao, Chunhui Zhao | arxiv.org/pdf/2403.14… | null |

Transformer

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation | AdaIR:通过频率挖掘和调制进行自适应一体化图像恢复 | Yuning Cui, Syed Waqas Zamir, Salman Khan, Alois Knoll, Mubarak Shah, Fahad Shahbaz Khan | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | View-decoupled Transformer for Person Re-identification under Aerial-ground Camera Network | 用于空地摄像机网络下人员重识别的视图解耦变压器 | Quan Zhang, Lei Wang, Vishal M. Patel, Xiaohua Xie, Jianhuang Lai | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | RoDLA: Benchmarking the Robustness of Document Layout Analysis Models | RoDLA:文档布局分析模型的稳健性基准测试 | Yufan Chen, Jiaming Zhang, Kunyu Peng, Junwei Zheng, Ruiping Liu, Philip Torr, Rainer Stiefelhagen | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field | SurroundSDF:基于有符号距离场的隐式 3D 场景理解 | Lizhe Liu, Bohua Wang, Hongwei Xie, Daqi Liu, Li Liu, Zhiqiang Tian, Kuiyuan Yang, Bing Wang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | On the Concept Trustworthiness in Concept Bottleneck Models | 概念瓶颈模型中的概念可信度研究 | Qihan Huang, Jie Song, Jingwen Hu, Haofei Zhang, Yong Wang, Mingli Song | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | ∇τ: Gradient-based and Task-Agnostic machine Unlearning | ∇τ:基于梯度和任务无关的机器取消学习 | Daniel Trippa, Cesare Campagnano, Maria Sofia Bucarelli, Gabriele Tolomei, Fabrizio Silvestri | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | CFPL-FAS: Class Free Prompt Learning for Generalizable Face Anti-spoofing | CFPL-FAS:通用人脸反欺骗的免费即时学习 | Ajian Liu, Shuai Xue, Jianwen Gan, Jun Wan, Yanyan Liang, Jiankang Deng, Sergio Escalera, Zhen Lei | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks | SpikingResformer:在尖峰神经网络中桥接 ResNet 和 Vision Transformer | Xinyu Shi, Zecheng Hao, Zhaofei Yu | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Weak Supervision with Arbitrary Single Frame for Micro- and Macro-expression Spotting | 任意单帧微表情和宏观表情识别的弱监督 | Wang-Wang Yu, Xian-Shi Zhang, Fu-Ya Luo, Yijun Cao, Kai-Fu Yang, Hong-Mei Yan, Yong-Jie Li | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization | 协调视觉和文本嵌入以实现零样本文本到图像的定制 | Yeji Song, Jimyeong Kim, Wonhark Park, Wonsik Shin, Wonjong Rhee, Nojun Kwak | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | External Knowledge Enhanced 3D Scene Generation from Sketch | 外部知识增强了从草图生成 3D 场景的能力 | Zijie Wu, Mingtao Feng, Yaonan Wang, He Xie, Weisheng Dong, Bo Miao, Ajmal Mian | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion | C-TPT:通过文本特征分散对视觉语言模型进行校准测试时提示调整 | Hee Suk Yoon, Eunseop Yoon, Joshua Tian Jin Tee, Mark Hasegawa-Johnson, Yingzhen Li, Chang D. Yoo | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Existence Is Chaos: Enhancing 3D Human Motion Prediction with Uncertainty Consideration | 存在就是混沌:考虑不确定性增强 3D 人体运动预测 | Zhihao Wang, Yulin Zhou, Ningyu Zhang, Xiaosong Yang, Jun Xiao, Zhao Wang | arxiv.org/pdf/2403.14… | null |

3D/CG

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | Zero-Shot Multi-Object Shape Completion | 零样本多对象形状完成 | Shun Iwase, Katherine Liu, Vitor Guizilini, Adrien Gaidon, Kris Kitani, Rares Ambrus, Sergey Zakharov | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Explorative Inbetweening of Time and Space | 时间与空间的探索 | Haiwen Feng, Zheng Ding, Zhihao Xia, Simon Niklaus, Victoria Abrevaya, Michael J. Black, Xuaner Zhang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Visibility-Aware Keypoint Localization for 6DoF Object Pose Estimation | 用于 6DoF 物体姿态估计的可见性感知关键点定位 | Ruyi Lian, Haibin Ling | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Exploring 3D Human Pose Estimation and Forecasting from the Robot's Perspective: The HARPER Dataset | 从机器人的角度探索 3D 人体姿势估计和预测:HARPER 数据集 | Andrea Avogaro, Andrea Toaiari, Federico Cunico, Xiangmin Xu, Haralambos Dafas, Alessandro Vinciarelli, Emma Li, Marco Cristani | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Enabling Visual Composition and Animation in Unsupervised Video Generation | 在无监督视频生成中启用视觉合成和动画 | Aram Davtyan, Sepehr Sameni, Björn Ommer, Paolo Favaro | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Volumetric Environment Representation for Vision-Language Navigation | 视觉语言导航的体积环境表示 | Rui Liu, Wenguan Wang, Yi Yang | arxiv.org/pdf/2403.14… | null |

Learning Paradigms

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | GLC++: Source-Free Universal Domain Adaptation through Global-Local Clustering and Contrastive Affinity Learning | GLC++:通过全局局部聚类和对比亲和学习进行无源通用域适应 | Sanqing Qu, Tianpei Zou, Florian Röhrbein, Cewu Lu, Guang Chen, Dacheng Tao, Changjun Jiang | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization | 释放未标记的数据:跨视图地理定位的范例 | Guopeng Li, Ming Qian, Gui-Song Xia | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Text-Enhanced Data-free Approach for Federated Class-Incremental Learning | 用于联邦类增量学习的文本增强无数据方法 | Minh-Tuan Tran, Trung Le, Xuan-May Le, Mehrtash Harandi, Dinh Phung | arxiv.org/pdf/2403.14… | null |

Other

| Publish Date | Title | Title (Chinese) | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-21 | Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion | Videoshop:使用噪声外推扩散反转进行本地化语义视频编辑 | Xiang Fan, Anand Bhattad, Ranjay Krishna | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Hierarchical Text-to-Vision Self Supervised Alignment for Improved Histopathology Representation Learning | 分层文本到视觉自我监督对齐以改进组织病理学表示学习 | Hasindri Watawana, Kanchana Ranasinghe, Tariq Mahmood, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | MyVLM: Personalizing VLMs for User-Specific Queries | MyVLM:针对特定于用户的查询个性化 VLM | Yuval Alaluf, Elad Richardson, Sergey Tulyakov, Kfir Aberman, Daniel Cohen-Or | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Implicit Style-Content Separation using B-LoRA | 使用 B-LoRA 隐式风格内容分离 | Yarden Frenkel, Yael Vinker, Ariel Shamir, Daniel Cohen-Or | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video | DINO-Tracker:驯服 DINO,在单个视频中进行自我监督点跟踪 | Narek Tumanyan, Assaf Singer, Shai Bagon, Tali Dekel | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks | AnyV2V:适用于任何视频到视频编辑任务的即插即用框架 | Max Ku, Cong Wei, Weiming Ren, Huan Yang, Wenhu Chen | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | SyncTweedies: A General Generative Framework Based on Synchronized Diffusions | SyncTweedies:基于同步扩散的通用生成框架 | Jaihoon Kim, Juil Koo, Kyeongmin Yeo, Minhyuk Sung | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Neural Network-Based Processing and Reconstruction of Compromised Biophotonic Image Data | 基于神经网络的受损生物光子图像数据处理和重建 | Michael John Fanous, Paloma Casteleiro Costa, Cagatay Isil, Luzhe Huang, Aydogan Ozcan | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | HySim: An Efficient Hybrid Similarity Measure for Patch Matching in Image Inpainting | HySim:图像修复中补丁匹配的高效混合相似度测量 | Saad Noufel, Nadir Maaroufi, Mehdi Najib, Mohamed Bakhouya | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Assessing the Robustness of Spectral Clustering for Deep Speaker Diarization | 评估深度说话人二值化的谱聚类的鲁棒性 | Nikhil Raghav, Md Sahidullah | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | A Framework for Portrait Stylization with Skin-Tone Awareness and Nudity Identification | 具有肤色感知和裸体识别的肖像风格化框架 | Seungkwon Kim, Sangyeon Kim, Seung-Hun Nam | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Debiasing surgeon: fantastic weights and how to find them | 去偏外科医生:奇妙的权重以及如何找到它们 | Rémi Nahon, Ivan Luiz De Moura Matos, Van-Tam Nguyen, Enzo Tartaglione | arxiv.org/pdf/2403.14… | null |
| 2024-03-21 | Science based AI model certification for untrained operational environments with application in traffic state estimation | 基于科学的人工智能模型认证,适用于未经训练的操作环境,并应用于交通状态估计 | Daryl Mupupuni, Anupama Guntu, Liang Hong, Kamrul Hasan, Leehyun Keel | arxiv.org/pdf/2403.14… | null |