[Share][Daily Update][2024.03.20][CV_arxiv_papers]


[UPDATED!] 2024-03-20 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | Editing Massive Concepts in Text-to-Image Diffusion Models | 编辑文本到图像扩散模型中的大量概念 | Tianwei Xiong, Yue Wu, Enze Xie, Yue Wu, Zhenguo Li, Xihui Liu | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | ZigMa: Zigzag Mamba Diffusion Model | ZigMa:之字形曼巴扩散模型 | Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes Fischer, Bjorn Ommer | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | TimeRewind: Rewinding Time with Image-and-Events Video Diffusion | TimeRewind:通过图像和事件视频扩散来倒带时间 | Jingxi Chen, Brandon Y. Feng, Haoming Cai, Mingyang Xie, Christopher Metzler, Cornelia Fermuller, Yiannis Aloimonos | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | DepthFM: Fast Monocular Depth Estimation with Flow Matching | DepthFM:利用流量匹配进行快速单目深度估计 | Ming Gui, Johannes S. Fischer, Ulrich Prestel, Pingchuan Ma, Dmytro Kotovenko, Olga Grebenkova, Stefan Andreas Baumann, Vincent Tao Hu, Björn Ommer | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation | 成为你的外画师:通过特定于输入的适应掌握视频外画 | Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang, Xiaoyu Shi, Dazhong Shen, Guanglu Song, Yu Liu, Hongsheng Li | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance | DanceCamera3D:音乐和舞蹈的 3D 摄像机运动合成 | Zixuan Wang, Jia Jia, Shikun Sun, Haozhe Wu, Rong Han, Zhenyu Li, Di Tang, Jiaqing Zhou, Jiebo Luo | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Multimodal Variational Autoencoder for Low-cost Cardiac Hemodynamics Instability Detection | 用于低成本心脏血流动力学不稳定性检测的多模态变分自动编码器 | Mohammod N. I. Suvon, Prasun C. Tripathi, Wenrui Fan, Shuo Zhou, Xianyuan Liu, Samer Alabed, Venet Osmani, Andrew J. Swift, Chen Chen, Haiping Lu | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | ZoDi: Zero-Shot Domain Adaptation with Diffusion-Based Image Transfer | ZoDi:基于扩散的图像传输的零射击域适应 | Hiroki Azuma, Yusuke Matsui, Atsuto Maki | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | ReGround: Improving Textual and Spatial Grounding at No Cost | ReGround:免费改善文本和空间基础 | Yuseung Lee, Minhyuk Sung | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Ground-A-Score: Scaling Up the Score Distillation for Multi-Attribute Editing | Ground-A-Score:扩大多属性编辑的乐谱蒸馏 | Hangeol Chang, Jinho Chang, Jong Chul Ye | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Diversity-aware Channel Pruning for StyleGAN Compression | StyleGAN 压缩的多样性感知通道修剪 | Jiwoo Chung, Sangeek Hyun, Sang-Heon Shim, Jae-Pil Heo | arxiv.org/pdf/2403.13… | link |
| 2024-03-20 | Compress3D: a Compressed Latent Space for 3D Generation from a Single Image | Compress3D:用于从单个图像生成 3D 的压缩潜在空间 | Bowen Zhang, Tianyu Yang, Yu Li, Lei Zhang, Xi Zhao | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis | VSTAR:用于更长动态视频合成的生成时间护理 | Yumeng Li, William Beluch, Margret Keuper, Dan Zhang, Anna Khoreva | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion | 将扩散模型扩展到真实世界的 3D LiDAR 场景完成 | Lucas Nunes, Rodrigo Marcuzzi, Benedikt Mersch, Jens Behley, Cyrill Stachniss | arxiv.org/pdf/2403.13… | link |
| 2024-03-20 | Cell Tracking in C. elegans with Cell Position Heatmap-Based Alignment and Pairwise Detection | 使用基于细胞位置热图的对齐和成对检测进行线虫细胞跟踪 | Kaito Shiku, Hiromitsu Shirai, Takeshi Ishihara, Ryoma Bise | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | S2DM: Sector-Shaped Diffusion Models for Video Generation | S2DM:用于视频生成的扇形扩散模型 | Haoran Lang, Yuxuan Ge, Zheng Tian | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | IIDM: Image-to-Image Diffusion Model for Semantic Image Synthesis | IIDM:用于语义图像合成的图像到图像扩散模型 | Feng Liu, Xiaobin-Chang | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Correlation Clustering of Organoid Images | 类器官图像的相关聚类 | Jannik Presberger, Rashmiparvathi Keshara, David Stein, Yung Hae Kim, Anne Grapin-Botton, Bjoern Andres | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation | AGFSync:利用人工智能生成的反馈来优化文本到图像生成中的偏好 | Jingkun An, Yinghao Zhu, Zongjian Li, Haoran Feng, Bohua Chen, Yemin Shi, Chengwei Pan | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment | LaserHuman:自由环境中语言引导的场景感知人体运动生成 | Peishan Cong, Ziyi Wang, Zhiyang Dou, Yiming Ren, Wei Yin, Kai Cheng, Yujing Sun, Xiaoxiao Long, Xinge Zhu, Yuexin Ma | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception | DetDiffusion:协同生成和感知模型以增强数据生成和感知 | Yibo Wang, Ruiyuan Gao, Kai Chen, Kaiqiang Zhou, Yingjie Cai, Lanqing Hong, Zhenguo Li, Lihui Jiang, Dit-Yan Yeung, Qiang Xu, et al. | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Building Optimal Neural Architectures using Interpretable Knowledge | 使用可解释的知识构建最佳神经架构 | Keith G. Mills, Fred X. Han, Mohammad Salameh, Shengyao Lu, Chunhua Zhou, Jiao He, Fengyu Sun, Di Niu | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Beyond Skeletons: Integrative Latent Mapping for Coherent 4D Sequence Generation | 超越骨骼:用于连贯 4D 序列生成的集成潜在映射 | Qitong Yang, Mingtao Feng, Zijie Wu, Shijie Sun, Weisheng Dong, Yaonan Wang, Ajmal Mian | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Nellie: Automated organelle segmentation, tracking, and hierarchical feature extraction in 2D/3D live-cell microscopy | Nellie:2D/3D 活细胞显微镜中的自动细胞器分割、跟踪和分层特征提取 | Austin E. Y. T. Lefebvre, Gabriel Sturm, Ting-Yu Lin, Emily Stoops, Magdalena Preciado Lopez, Benjamin Kaufmann-Malaga, Kayley Hake | arxiv.org/pdf/2403.13… | link |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition | RAR:检索和排序用于视觉识别的增强 MLLM | Ziyu Liu, Zeyi Sun, Yuhang Zang, Wei Li, Pan Zhang, Xiaoyi Dong, Yuanjun Xiong, Dahua Lin, Jiaqi Wang | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Describe-and-Dissect: Interpreting Neurons in Vision Networks with Language Models | 描述和剖析:用语言模型解释视觉网络中的神经元 | Nicholas Bai, Rahul A. Iyer, Tuomas Oikarinen, Tsui-Wei Weng | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | AUD-TGN: Advancing Action Unit Detection with Temporal Convolution and GPT-2 in Wild Audiovisual Contexts | AUD-TGN:在野外视听环境中使用时间卷积和 GPT-2 推进动作单元检测 | Jun Yu, Zerui Zhang, Zhihong Wei, Gongpeng Zhao, Zhongpeng Cai, Yongqi Wang, Guochen Xie, Jichao Zhu, Wangyuan Zhu | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Recursive Cross-Modal Attention for Multimodal Fusion in Dimensional Emotion Recognition | 维度情感识别中多模态融合的递归跨模态注意 | R. Gnana Praveen, Jahangir Alam | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | VL-Mamba: Exploring State Space Models for Multimodal Learning | VL-Mamba:探索多模态学习的状态空间模型 | Yanyuan Qiao, Zheng Yu, Longteng Guo, Sihan Chen, Zijia Zhao, Mingzhen Sun, Qi Wu, Jing Liu | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | What if...?: Counterfactual Inception to Mitigate Hallucination Effects in Large Multimodal Models | 如果......会怎样?:减轻大型多模态模型中幻觉效应的反事实起始 | Junho Kim, Yeon Ju Kim, Yong Man Ro | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | FMM-Attack: A Flow-based Multi-modal Adversarial Attack on Video-based LLMs | FMM-Attack:对基于视频的 LLM 的基于流的多模态对抗攻击 | Jinmin Li, Kuofeng Gao, Yang Bai, Jingyun Zhang, Shu-tao Xia, Yisen Wang | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | A Unified Optimal Transport Framework for Cross-Modal Retrieval with Noisy Labels | 用于带噪声标签的跨模式检索的统一最优传输框架 | Haochen Han, Minnan Luo, Huan Liu, Fang Nan | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models | HyperLLaVA:针对多模态大型语言模型的动态视觉和语言专家调整 | Wenqiao Zhang, Tianwei Lin, Jiang Liu, Fangxun Shu, Haoyuan Li, Lei Zhang, He Wanggui, Hao Zhou, Zheqi Lv, Hao Jiang, et al. | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Unifying Local and Global Multimodal Features for Place Recognition in Aliased and Low-Texture Environments | 统一局部和全局多模态特征,以在别名和低纹理环境中进行地点识别 | Alberto García-Hernández, Riccardo Giubilato, Klaus H. Strobl, Javier Civera, Rudolph Triebel | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | HyperFusion: A Hypernetwork Approach to Multimodal Integration of Tabular and Medical Imaging Data for Predictive Modeling | HyperFusion:用于预测建模的表格和医学成像数据多模态集成的超网络方法 | Daniel Duenias, Brennan Nichyporuk, Tal Arbel, Tammy Riklin Raviv | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns | PuzzleVQA:用抽象视觉模式诊断语言模型的多模态推理挑战 | Yew Ken Chia, Vernon Toh Yan Han, Deepanway Ghosal, Lidong Bing, Soujanya Poria | arxiv.org/pdf/2403.13… | link |
| 2024-03-20 | Self-Supervised Class-Agnostic Motion Prediction with Spatial and Temporal Consistency Regularizations | 具有空间和时间一致性正则化的自监督类无关运动预测 | Kewei Wang, Yizheng Wu, Jun Cen, Zhiyu Pan, Xingyi Li, Zhe Wang, Zhiguo Cao, Guosheng Lin | arxiv.org/pdf/2403.13… | link |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | RadSplat: Radiance Field-Informed Gaussian Splatting for Robust Real-Time Rendering with 900+ FPS | RadSplat:基于辐射场的高斯喷射,可实现 900+ FPS 的鲁棒实时渲染 | Michael Niemeyer, Fabian Manhardt, Marie-Julie Rakotosaona, Michael Oechsle, Daniel Duckworth, Rama Gosula, Keisuke Tateno, John Bates, Dominik Kaeser, Federico Tombari | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion | 移动中的高斯泼溅:自然相机运动的模糊和滚动快门补偿 | Otto Seiskari, Jerry Ylilammi, Valtteri Kaatrasalo, Pekka Rantalankila, Matias Turkulainen, Juho Kannala, Esa Rahtu, Arno Solin | arxiv.org/pdf/2403.13… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | REAL: Representation Enhanced Analytic Learning for Exemplar-free Class-incremental Learning | REAL:用于无范例类增量学习的表示增强分析学习 | Run He, Huiping Zhuang, Di Fang, Yizhu Chen, Kai Tong, Cen Chen | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Scale Decoupled Distillation | 规模解耦蒸馏 | Shicai Wei, Chunbo Luo, Yang Luo | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Progressive trajectory matching for medical dataset distillation | 用于医疗数据集蒸馏的渐进轨迹匹配 | Zhen Yu, Yang Liu, Qingchao Chen | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Diversified and Personalized Multi-rater Medical Image Segmentation | 多样化、个性化的多评估者医学图像分割 | Yicheng Wu, Xiangde Luo, Zhe Xu, Xiaoqing Guo, Lie Ju, Zongyuan Ge, Wenjun Liao, Jianfei Cai | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning | OrthCaps:具有稀疏注意力路由和剪枝的正交 CapsNet | Xinyu Geng, Jiaming Wang, Jiawei Gong, Yuerong Xue, Jun Xu, Fanglin Chen, Xiaolin Huang | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | DD-RobustBench: An Adversarial Robustness Benchmark for Dataset Distillation | DD-RobustBench:数据集蒸馏的对抗性鲁棒性基准 | Yifan Wu, Jiawei Du, Ping Liu, Yuewei Lin, Wenqing Cheng, Wei Xu | arxiv.org/pdf/2403.13… | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments | 边界框针对特征丢失的稳定性反映了跨环境的检测器泛化 | Yang Yang, Wenhai Wang, Zhe Chen, Jifeng Dai, Liang Zheng | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Hierarchical NeuroSymbolic Approach for Action Quality Assessment | 行动质量评估的分层神经符号方法 | Lauren Okamoto, Paritosh Parmar | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Bridge the Modality and Capacity Gaps in Vision-Language Model Selection | 弥合视觉语言模型选择中的模态和能力差距 | Chao Yi, De-Chuan Zhan, Han-Jia Ye | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Practical End-to-End Optical Music Recognition for Pianoform Music | 实用的钢琴音乐端到端光学音乐识别 | Jiří Mayer, Milan Straka, Jan Hajič jr., Pavel Pecina | arxiv.org/pdf/2403.13… | link |
| 2024-03-20 | When Cars meet Drones: Hyperbolic Federated Learning for Source-Free Domain Adaptation in Adverse Weather | 当汽车遇到无人机:恶劣天气下无源域适应的双曲联合学习 | Giulia Rizzoli, Matteo Caligiuri, Donald Shenaj, Francesco Barbato, Pietro Zanuttigh | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | HierCode: A Lightweight Hierarchical Codebook for Zero-shot Chinese Text Recognition | HierCode:用于零样本中文文本识别的轻量级分层密码本 | Yuyi Zhang, Yuanzhi Zhu, Dezhi Peng, Peirong Zhang, Zhenhua Yang, Zhibo Yang, Cong Yao, Lianwen Jin | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model | 通过视觉语言模型中的知识增强增强神经退行性疾病的步态视频分析 | Diwei Wang, Kun Yuan, Candice Muller, Frédéric Blanc, Nicolas Padoy, Hyewon Seo | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Fostc3net:A Lightweight YOLOv5 Based On the Network Structure Optimization | Fostc3net:基于网络结构优化的轻量级YOLOv5 | Danqing Ma, Shaojie Li, Bo Dang, Hengyi Zang, Xinqi Dong | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Insight Into the Collocation of Multi-Source Satellite Imagery for Multi-Scale Vessel Detection | 深入探讨多源卫星图像搭配用于多尺度船舶检测 | Tran-Vu La, Minh-Tan Pham, Marco Chini | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | MotorEase: Automated Detection of Motor Impairment Accessibility Issues in Mobile App UIs | MotorEase:自动检测移动应用程序 UI 中的运动障碍辅助功能问题 | Arun Krishnavajjala, SM Hasan Mansur, Justin Jose, Kevin Moran | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Step-Calibrated Diffusion for Biomedical Optical Image Restoration | 用于生物医学光学图像恢复的步进校准扩散 | Yiwei Lyu, Sung Jik Cha, Cheng Jiang, Asadur Chowdury, Xinhai Hou, Edward Harake, Akhil Kondepudi, Christian Freudiger, Honglak Lee, Todd C. Hollon | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | ProMamba: Prompt-Mamba for polyp segmentation | ProMamba:用于息肉分割的 Prompt-Mamba | Jianhao Xie, Ruofan Liao, Ziang Zhang, Sida Yi, Yuesheng Zhu, Guibo Luo | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation | H-vmunet:用于医学图像分割的高阶视觉 Mamba UNet | Renkai Wu, Yinghao Liu, Pengchen Liang, Qing Chang | arxiv.org/pdf/2403.13… | link |
| 2024-03-20 | Leveraging feature communication in federated learning for remote sensing image classification | 利用联邦学习中的特征通信进行遥感图像分类 | Anh-Kiet Duong, Hoàng-Ân Lê, Minh-Tan Pham | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments | Find n' Propagate:城市环境中的开放词汇 3D 对象检测 | Djamahl Etchegaray, Zi Huang, Tatsuya Harada, Yadan Luo | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Next day fire prediction via semantic segmentation | 通过语义分割预测第二天火灾 | Konstantinos Alexis, Stella Girtsou, Alexis Apostolakis, Giorgos Giannopoulos, Charalampos Kontoes | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | High-confidence pseudo-labels for domain adaptation in COVID-19 detection | 用于 COVID-19 检测中域适应的高置信度伪标签 | Robert Turnbull, Simon Mutch | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Deepfake Detection without Deepfakes: Generalization via Synthetic Frequency Patterns Injection | 没有 Deepfakes 的 Deepfake 检测:通过合成频率模式注入进行泛化 | Davide Alessandro Coccomini, Roberto Caldelli, Claudio Gennaro, Giuseppe Fiameni, Giuseppe Amato, Fabrizio Falchi | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Fast-Poly: A Fast Polyhedral Framework For 3D Multi-Object Tracking | Fast-Poly:用于 3D 多对象跟踪的快速多面体框架 | Xiaoyu Li, Dedong Liu, Lijun Zhao, Yitao Wu, Xian Wu, Jinghan Gao | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Stochastic Geometry Models for Texture Synthesis of Machined Metallic Surfaces: Sandblasting and Milling | 用于机加工金属表面纹理合成的随机几何模型:喷砂和铣削 | Natascha Jeziorski, Claudia Redenbach | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | MTP:通过多任务预训练推进遥感基础模型 | Di Wang, Jing Zhang, Minqiang Xu, Lin Liu, Dongsheng Wang, Erzhong Gao, Chengxi Han, Haonan Guo, Bo Du, Dacheng Tao, et al. | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | DOR3D-Net: Dense Ordinal Regression Network for 3D Hand Pose Estimation | DOR3D-Net:用于 3D 手势估计的密集序数回归网络 | Yamin Mao, Zhihua Liu, Weiming Li, SoonYong Cho, Qiang Wang, Xiaoshuai Hao | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Robust image segmentation model based on binary level set | 基于二值水平集的鲁棒图像分割模型 | Wenqi Zhao | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Few-shot Oriented Object Detection with Memorable Contrastive Learning in Remote Sensing Images | 遥感图像中具有令人难忘的对比学习的面向少镜头的目标检测 | Jiawei Zhou, Wuzhou Li, Yi Cao, Hongtao Cai, Xiang Li | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Counting Network for Learning from Majority Label | 用于从多数标签学习的计数网络 | Kaito Shiku, Shinnosuke Matsuo, Daiki Suehiro, Ryoma Bise | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Hierarchical Gaussian Mixture Normalizing Flow Modeling for Unified Anomaly Detection | 用于统一异常检测的分层高斯混合归一化流建模 | Xincheng Yao, Ruoqi Li, Zefeng Qian, Lu Wang, Chongyang Zhang | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Efficient scene text image super-resolution with semantic guidance | 具有语义指导的高效场景文本图像超分辨率 | LeoWu TomyEnrique, Xiangcheng Du, Kangliang Liu, Han Yuan, Zhao Zhou, Cheng Jin | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Out-of-Distribution Detection Using Peer-Class Generated by Large Language Model | 使用大型语言模型生成的对等类进行分布外检测 | K Huang, G Song, Hanwen Su, Jiyan Wang | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Rotary Position Embedding for Vision Transformer | 视觉变压器的旋转位置嵌入 | Byeongho Heo, Song Park, Dongyoon Han, Sangdoo Yun | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | SAMCT: Segment Any CT Allowing Labor-Free Task-Indicator Prompts | SAMCT:分段任何允许无人工任务指示器提示的 CT | Xian Lin, Yangyang Xiang, Zhehao Wang, Kwang-Ting Cheng, Zengqiang Yan, Li Yu | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Self-Attention Based Semantic Decomposition in Vector Symbolic Architectures | 向量符号架构中基于自注意力的语义分解 | Calvin Yeung, Prathyush Poduval, Mohsen Imani | arxiv.org/pdf/2403.13… | null |

GNN

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | Adaptive Critical Subgraph Mining for Cognitive Impairment Conversion Prediction with T1-MRI-based Brain Network | 基于 T1-MRI 的脑网络进行认知障碍转换预测的自适应关键子图挖掘 | Yilin Leng, Wenju Cui, Bai Chen, Xi Jiang, Shuangqing Chen, Jian Zheng | arxiv.org/pdf/2403.13… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | Learning Novel View Synthesis from Heterogeneous Low-light Captures | 从异构低光捕获中学习新颖的视图合成 | Quan Zheng, Hao Sun, Huiyao Xu, Fanjiang Xu | arxiv.org/pdf/2403.13… | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | Improved Baselines for Data-efficient Perceptual Augmentation of LLMs | 改进法学硕士数据高效感知增强的基线 | Théophane Vallaeys, Mustafa Shukor, Matthieu Cord, Jakob Verbeek | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | ManiPose: A Comprehensive Benchmark for Pose-aware Object Manipulation in Robotics | ManiPose:机器人中姿势感知对象操纵的综合基准 | Qiaojun Yu, Ce Hao, Junbo Wang, Wenhai Liu, Liu Liu, Yao Mu, Yang You, Hengxu Yan, Cewu Lu | arxiv.org/pdf/2403.13… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | Learning from Models and Data for Visual Grounding | 从模型和数据中学习视觉基础 | Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang, Alexander C. Berg, Vicente Ordonez | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Retina Vision Transformer (RetinaViT): Introducing Scaled Patches into Vision Transformers | 视网膜视觉变压器 (RetinaViT):将缩放补丁引入视觉变压器 | Yuyang Shu, Michael E. Bain | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single Image | T-Pixel2Mesh:结合全局和局部 Transformer 从单个图像生成 3D 网格 | Shijie Zhang, Boyan Jiang, Keke He, Junwei Zhu, Ying Tai, Chengjie Wang, Yinda Zhang, Yanwei Fu | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer | Portrait4D-v2:伪多视图数据创建更好的4D头部合成器 | Yu Deng, Duomin Wang, Baoyuan Wang | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | What explains the success of cross-modal fine-tuning with ORCA? | 如何解释 ORCA 跨模式微调的成功? | Paloma García-de-Herreros, Vagrant Gautam, Philipp Slusallek, Dietrich Klakow, Marius Mosbach | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | vid-TLDR: Training Free Token merging for Light-weight Video Transformer | vid-TLDR:轻量级视频变压器的免费训练令牌合并 | Joonmyung Choi, Sanghyeok Lee, Jaewon Chu, Minhyuk Choi, Hyunwoo J. Kim | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | AMP: Autoregressive Motion Prediction Revisited with Next Token Prediction for Autonomous Driving | AMP:通过自动驾驶的下一个令牌预测重新审视自回归运动预测 | Xiaosong Jia, Shaoshuai Shi, Zijun Chen, Li Jiang, Wenlong Liao, Tao He, Junchi Yan | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Mora: Enabling Generalist Video Generation via A Multi-Agent Framework | Mora:通过多代理框架实现通用视频生成 | Zhengqing Yuan, Ruoxi Chen, Zhaoxu Li, Haolong Jia, Lifang He, Chi Wang, Lichao Sun | arxiv.org/pdf/2403.13… | link |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | Towards Principled Representation Learning from Videos for Reinforcement Learning | 从强化学习视频中进行有原则的表示学习 | Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | DVMNet: Computing Relative Pose for Unseen Objects Beyond Hypotheses | DVMNet:超越假设计算看不见的物体的相对姿态 | Chen Zhao, Tong Zhang, Zheng Dang, Mathieu Salzmann | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Motion Generation from Fine-grained Textual Descriptions | 根据细粒度文本描述生成运动 | Kunhang Li, Yansong Feng | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | CLIPSwarm: Generating Drone Shows from Text Prompts with Vision-Language Models | CLIPSwarm:使用视觉语言模型根据文本提示生成无人机表演 | Pablo Pueyo, Eduardo Montijano, Ana C. Murillo, Mac Schwager | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Advancing 6D Pose Estimation in Augmented Reality -- Overcoming Projection Ambiguity with Uncontrolled Imagery | 推进增强现实中的 6D 姿态估计——克服不受控制的图像的投影模糊性 | Mayura Manawadu, Sieun Park, Soon-Yong Park | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Text-to-3D Shape Generation | 文本到 3D 形状生成 | Han-Hung Lee, Manolis Savva, Angel X. Chang | arxiv.org/pdf/2403.13… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | A Unified and General Framework for Continual Learning | 持续学习的统一通用框架 | Zhenyi Wang, Yan Li, Li Shen, Heng Huang | arxiv.org/pdf/2403.13… | null |

Others

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-20 | On Pretraining Data Diversity for Self-Supervised Learning | 关于自监督学习的预训练数据多样性 | Hasan Abed Al Kader Hammoud, Tuhin Das, Fabio Pizzati, Philip Torr, Adel Bibi, Bernard Ghanem | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Certified Human Trajectory Prediction | 经过认证的人体轨迹预测 | Mohammadhossein Bahari, Saeed Saadatnejad, Amirhossein Asgari Farsangi, Seyed-Mohsen Moosavi-Dezfooli, Alexandre Alahi | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Leveraging High-Resolution Features for Improved Deep Hashing-based Image Retrieval | 利用高分辨率功能改进基于深度哈希的图像检索 | Aymene Berriche, Mehdi Adjal Zakaria, Riyadh Baghdadi | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | DBA-Fusion: Tightly Integrating Deep Dense Visual Bundle Adjustment with Multiple Sensors for Large-Scale Localization and Mapping | DBA-Fusion:将深度密集视觉束调整与多个传感器紧密集成,以实现大规模定位和绘图 | Yuxuan Zhou, Xingxing Li, Shengyu Li, Xuanbin Wang, Shaoquan Feng, Yuxuan Tan | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning | SPTNet:具有空间提示调整的广义类别发现的有效替代框架 | Hongjun Wang, Sagar Vaze, Kai Han | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Learning User Embeddings from Human Gaze for Personalised Saliency Prediction | 从人类注视中学习用户嵌入以进行个性化显着性预测 | Florian Strohm, Mihai Bâce, Andreas Bulling | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | Meta-Point Learning and Refining for Category-Agnostic Pose Estimation | 用于类别无关姿势估计的元点学习和细化 | Junjie Chen, Jiebin Yan, Yuming Fang, Li Niu | arxiv.org/pdf/2403.13… | link |
| 2024-03-20 | IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models | IDAdapter:学习混合特征以实现文本到图像模型的免调整个性化 | Siying Cui, Jiankang Deng, Jia Guo, Xiang An, Yongle Zhao, Xinyu Wei, Ziyong Feng | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | An AI-Assisted Skincare Routine Recommendation System in XR | XR 中人工智能辅助的日常护肤推荐系统 | Gowravi Malalur Rajegowda, Yannis Spyridis, Barbara Villarini, Vasileios Argyriou | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | MedCycle: Unpaired Medical Report Generation via Cycle-Consistency | MedCycle:通过周期一致性生成不成对的医疗报告 | Elad Hirsch, Gefen Dawidowicz, Ayellet Tal | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | TiBiX: Leveraging Temporal Information for Bidirectional X-ray and Report Generation | TiBiX:利用时间信息进行双向 X 射线和报告生成 | Santosh Sanjeev, Fadillah Adamsyah Maani, Arsen Abzhanov, Vijay Ram Papineni, Ibrahim Almakky, Bartłomiej W. Papież, Mohammad Yaqub | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis | FissionFusion:用于医学图像分析的快速几何生成和分层汤 | Santosh Sanjeev, Nuren Zhaksylyk, Ibrahim Almakky, Anees Ur Rehman Hashmi, Mohammad Areeb Qazi, Mohammad Yaqub | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | AdaViPro: Region-based Adaptive Visual Prompt for Large-Scale Models Adapting | AdaViPro:用于大规模模型自适应的基于区域的自适应视觉提示 | Mengyu Yang, Ye Tian, Lanshan Zhang, Xiao Liang, Xuming Ran, Wendong Wang | arxiv.org/pdf/2403.13… | null |
| 2024-03-20 | SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models | SC-Tune:在大视觉语言模型中释放自洽的指涉理解 | Tongtian Yue, Jie Cheng, Longteng Guo, Xingyuan Dai, Zijia Zhao, Xingjian He, Gang Xiong, Yisheng Lv, Jing Liu | arxiv.org/pdf/2403.13… | null |