[UPDATED!] 2024-03-22 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-22 | DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data | DiffusionMTL:从部分注释的数据中学习多任务去噪扩散模型 | Hanrong Ye, Dan Xu | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Ultrasound Imaging based on the Variance of a Diffusion Restoration Model | 基于扩散恢复模型方差的超声成像 | Yuxin Zhang, Clément Huneau, Jérôme Idier, Diana Mateus | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Controlled Training Data Generation with Diffusion Models | 使用扩散模型控制训练数据生成 | Teresa Yeo, Andrei Atanov, Harold Benoit, Aleksandr Alekseev, Ruchira Ray, Pooya Esmaeil Akhoondi, Amir Zamir | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Spectral Motion Alignment for Video Motion Transfer using Diffusion Models | 使用扩散模型进行视频运动传输的频谱运动对齐 | Geon Yeong Park, Hyeonho Jeong, Sang Wan Lee, Jong Chul Ye | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Shadow Generation for Composite Image Using Diffusion model | 使用扩散模型生成复合图像的阴影 | Qingyang Liu, Junqi You, Jianting Wang, Xinhao Tao, Bo Zhang, Li Niu | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | A Multimodal Approach for Cross-Domain Image Retrieval | 跨域图像检索的多模态方法 | Lucas Iijima, Tania Stathaki | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Deep Generative Model based Rate-Distortion for Image Downscaling Assessment | 用于图像缩小评估的基于率失真的深度生成模型 | Yuanbang Liang, Bhavesh Garg, Paul L Rosin, Yipeng Qin | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Recent Trends in 3D Reconstruction of General Non-Rigid Scenes | 一般非刚性场景 3D 重建的最新趋势 | Raza Yunus, Jan Eric Lenssen, Michael Niemeyer, Yiyi Liao, Christian Rupprecht, Christian Theobalt, Gerard Pons-Moll, Jia-Bin Huang, Vladislav Golyanik, Eddy Ilg | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model using 3D Whole-body CT Scans | 使用 3D 全身 CT 扫描建立全面、高效、快速的解剖结构分割模型 | Heng Guo, Jianfeng Zhang, Jiaxing Huang, Tony C. W. Mok, Dazhou Guo, Ke Yan, Le Lu, Dakai Jin, Minfeng Xu | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | MM-Diff: High-Fidelity Image Personalization via Multi-Modal Condition Integration | MM-Diff:通过多模态条件集成实现高保真图像个性化 | Zhichao Wei, Qingkun Su, Long Qin, Weizhi Wang | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Toward Tiny and High-quality Facial Makeup with Data Amplify Learning | 通过数据放大学习实现微小且高品质的面部化妆 | Qiaoqiao Jin, Xuanhong Chen, Meiguang Jin, Ying Cheng, Rui Shi, Yucheng Zheng, Yupeng Zhu, Bingbing Ni | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Generative Active Learning for Image Synthesis Personalization | 用于图像合成个性化的生成主动学习 | Xulu Zhang, Wengyu Zhang, Xiao-Yong Wei, Jinlin Wu, Zhaoxiang Zhang, Zhen Lei, Qing Li | arxiv.org/pdf/2403.14… | null |
| 2024-03-22 | DreamFlow: High-Quality Text-to-3D Generation by Approximating Probability Flow | DreamFlow:通过近似概率流生成高质量文本到 3D | Kyungmin Lee, Kihyuk Sohn, Jinwoo Shin | arxiv.org/pdf/2403.14… | null |
| 2024-03-22 | CLIP-VQDiffusion : Langauge Free Training of Text To Image generation using CLIP and vector quantized diffusion model | CLIP-VQDiffusion :使用 CLIP 和矢量量化扩散模型进行文本到图像生成的语言免费训练 | Seungdae Han, Joohee Kim | arxiv.org/pdf/2403.14… | null |
| 2024-03-22 | STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians | STAG4D:时空锚定生成 4D 高斯 | Yifei Zeng, Yanqin Jiang, Siyu Zhu, Yuanxun Lu, Youtian Lin, Hao Zhu, Weiming Hu, Xun Cao, Yao Yao | arxiv.org/pdf/2403.14… | null |
| 2024-03-22 | Geometric Generative Models based on Morphological Equivariant PDEs and GANs | 基于形态等变偏微分方程和生成对抗网络的几何生成模型 | El Hadji S. Diop, Thierno Fall, Alioune Mbengue, Mohamed Daoudi | arxiv.org/pdf/2403.14… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-22 | LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models | LLaVA-PruMerge:高效大型多模态模型的自适应令牌缩减 | Yuzhang Shang, Mu Cai, Bingxin Xu, Yong Jae Lee, Yan Yan | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding | InternVideo2:扩展视频基础模型以实现多模态视频理解 | Yi Wang, Kunchang Li, Xinhao Li, Jiashuo Yu, Yinan He, Guo Chen, Baoqi Pei, Rongkun Zheng, Jilan Xu, Zun Wang, et al. | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Neural Plasticity-Inspired Foundation Model for Observing the Earth Crossing Modalities | 用于观察地球穿越模式的神经可塑性基础模型 | Zhitong Xiong, Yi Wang, Fahong Zhang, Adam J. Stewart, Joëlle Hanna, Damian Borth, Ioannis Papoutsis, Bertrand Le Saux, Gustau Camps-Valls, Xiao Xiang Zhu | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization | 选择性信息描述可以减少文本到图像个性化中不需要的嵌入纠缠 | Jimyeong Kim, Jungwon Park, Wonjong Rhee | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection | IS-Fusion:用于多模态 3D 对象检测的实例场景协作融合 | Junbo Yin, Jianbing Shen, Runnan Chen, Wei Li, Ruigang Yang, Pascal Frossard, Wenguan Wang | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection | MSCoTDet:语言驱动的多模态融合,用于改进多光谱行人检测 | Taeheon Kim, Sangyun Chung, Damin Yeom, Youngjoon Yu, Hak Gu Kim, Yong Man Ro | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Multimodal Fusion with Pre-Trained Model Features in Affective Behaviour Analysis In-the-wild | 野外情感行为分析中与预训练模型特征的多模态融合 | Zhuofan Wen, Fengyu Zhang, Siyuan Zhang, Haiyang Sun, Mingyu Xu, Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | AVT2-DWF: Improving Deepfake Detection with Audio-Visual Fusion and Dynamic Weighting Strategies | AVT2-DWF:通过视听融合和动态加权策略改进 Deepfake 检测 | Rui Wang, Dengpan Ye, Long Tang, Yunming Zhang, Jiacheng Deng | arxiv.org/pdf/2403.14… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-22 | WSCLoc: Weakly-Supervised Sparse-View Camera Relocalization | WSCLoc:弱监督稀疏视图相机重定位 | Jialu Wang, Kaichen Zhou, Andrew Markham, Niki Trigoni | arxiv.org/pdf/2403.15… | null |
3DGS
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-22 | EndoGSLAM: Real-Time Dense Reconstruction and Tracking in Endoscopic Surgeries using Gaussian Splatting | EndoGSLAM:使用高斯溅射在内窥镜手术中进行实时密集重建和跟踪 | Kailing Wang, Chen Yang, Yuehao Wang, Sikuang Li, Yan Wang, Qi Dou, Xiaokang Yang, Wei Shen | arxiv.org/pdf/2403.15… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-22 | ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars | ThemeStation:从少数示例中生成主题感知的 3D 资源 | Zhenwei Wang, Tengfei Wang, Gerhard Hancke, Ziwei Liu, Rynson W. H. Lau | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels | LSK3DNet:利用大型稀疏内核实现有效且高效的 3D 感知 | Tuo Feng, Wenguan Wang, Fan Ma, Yi Yang | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Infrastructure-Assisted Collaborative Perception in Automated Valet Parking: A Safety Perspective | 自动代客泊车中基础设施辅助的协作感知:安全视角 | Yukuan Jia, Jiawen Zhang, Shimeng Lu, Baokang Fan, Ruiqing Mao, Sheng Zhou, Zhisheng Niu | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Magic for the Age of Quantized DNNs | 量化 DNN 时代的魔力 | Yoshihide Sawada, Ryuji Saiin, Kazuma Suetake | arxiv.org/pdf/2403.14… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-22 | Long-CLIP: Unlocking the Long-Text Capability of CLIP | Long-CLIP:解锁 CLIP 的长文本功能 | Beichen Zhang, Pan Zhang, Xiaoyi Dong, Yuhang Zang, Jiaqi Wang | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Learning Topological Representations for Deep Image Understanding | 学习拓扑表示以进行深度图像理解 | Xiaoling Hu | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Fully automated workflow for the design of patient-specific orthopaedic implants: application to total knee arthroplasty | 用于设计患者特定骨科植入物的全自动工作流程:在全膝关节置换术中的应用 | Aziliz Guezou-Philippe, Arnaud Clavé, Ehouarn Maguet, Ludivine Maintier, Charles Garraud, Jean-Rassaire Fouefack, Valérie Burdin, Eric Stindel, Guillaume Dardenne | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection | Point-DETR3D:利用图像数据和空间点先验进行弱半监督 3D 物体检测 | Hongzhi Gao, Zheng Chen, Zehui Chen, Lin Chen, Jiaming Liu, Shanghang Zhang, Feng Zhao | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Global Control for Local SO(3)-Equivariant Scale-Invariant Vessel Segmentation | 局部 SO(3) 等变尺度不变血管分割的全局控制 | Patryk Rygiel, Dieuwertje Alblas, Christoph Brune, Kak Khee Yeung, Jelmer M. Wolterink | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | CR3DT: Camera-RADAR Fusion for 3D Detection and Tracking | CR3DT:用于 3D 检测和跟踪的相机-雷达融合 | Nicolas Baumann, Michael Baumgartner, Edoardo Ghignone, Jonas Kühne, Tobias Fischer, Yung-Hsu Yang, Marc Pollefeys, Michele Magno | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Hyperbolic Metric Learning for Visual Outlier Detection | 用于视觉异常值检测的双曲度量学习 | Alvaro Gonzalez-Jimenez, Simone Lionetti, Dena Bazazian, Philippe Gottfrois, Fabian Gröger, Marc Pouly, Alexander Navarini | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Reasoning-Enhanced Object-Centric Learning for Videos | 视频的推理增强型以对象为中心的学习 | Jian Li, Pu Ren, Yang Liu, Hao Sun | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | WEEP: A method for spatial interpretation of weakly supervised CNN models in computational pathology | WEEP:计算病理学中弱监督 CNN 模型的空间解释方法 | Abhinav Sharma, Bojing Liu, Mattias Rantalainen | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Anytime, Anywhere, Anyone: Investigating the Feasibility of Segment Anything Model for Crowd-Sourcing Medical Image Annotations | 随时随地、任何人:研究众包医学图像注释的分段任意模型的可行性 | Pranav Kulkarni, Adway Kanhere, Dharmam Savani, Andrew Chan, Devina Chatterjee, Paul H. Yi, Vishwa S. Parekh | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | GCN-DevLSTM: Path Development for Skeleton-Based Action Recognition | GCN-DevLSTM:基于骨架的动作识别的路径开发 | Lei Jiang, Weixin Yang, Xin Zhang, Hao Ni | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | DITTO: Demonstration Imitation by Trajectory Transformation | DITTO:通过轨迹变换进行演示模仿 | Nick Heppert, Max Argus, Tim Welschehold, Thomas Brox, Abhinav Valada | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Your Image is My Video: Reshaping the Receptive Field via Image-To-Video Differentiable AutoAugmentation and Fusion | 你的图像就是我的视频:通过图像到视频可微分自动增强和融合重塑感受野 | Sofia Casarin, Cynthia I. Ugwu, Sergio Escalera, Oswald Lanz | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | SFOD: Spiking Fusion Object Detector | SFOD:尖峰融合物体探测器 | Yimeng Fan, Wei Zhang, Changsong Liu, Mingyang Li, Wenrui Lu | arxiv.org/pdf/2403.15… | link |
| 2024-03-22 | An In-Depth Analysis of Data Reduction Methods for Sustainable Deep Learning | 深入分析可持续深度学习的数据缩减方法 | Víctor Toscano-Durán, Javier Perera-Lago, Eduardo Paluzo-Hidalgo, Rocío Gonzalez-Diaz, Miguel Ángel Gutierrez-Naranjo, Matteo Rucco | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Modular Deep Active Learning Framework for Image Annotation: A Technical Report for the Ophthalmo-AI Project | 用于图像注释的模块化深度主动学习框架:Ophthalmo-AI 项目的技术报告 | Md Abdul Kadir, Hasan Md Tusfiqur Alam, Pascale Maul, Hans-Jürgen Profitlich, Moritz Wolf, Daniel Sonntag | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Transfer CLIP for Generalizable Image Denoising | 用于通用图像去噪的传输 CLIP | Jun Cheng, Dong Liang, Shan Tan | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection | 用于类不平衡半监督目标检测的基于梯度的采样 | Jiaming Li, Xiangru Lin, Wei Zhang, Xiao Tan, Yingying Li, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li | arxiv.org/pdf/2403.15… | link |
| 2024-03-22 | SYNCS: Synthetic Data and Contrastive Self-Supervised Training for Central Sulcus Segmentation | SYNCS:中央沟分割的综合数据和对比自我监督训练 | Vladyslav Zalevskyi, Kristoffer Hougaard Madsen | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | PseudoTouch: Efficiently Imaging the Surface Feel of Objects for Robotic Manipulation | PseudoTouch:有效成像用于机器人操作的物体表面感觉 | Adrian Röfer, Nick Heppert, Abdallah Ayman, Eugenio Chisari, Abhinav Valada | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Improving cross-domain brain tissue segmentation in fetal MRI with synthetic data | 利用合成数据改进胎儿 MRI 中的跨域脑组织分割 | Vladyslav Zalevskyi, Thomas Sanchez, Margaux Roulet, Jordina Aviles Verddera, Jana Hutter, Hamza Kebiri, Meritxell Bach Cuadra | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | IFSENet : Harnessing Sparse Iterations for Interactive Few-shot Segmentation Excellence | IFSENet:利用稀疏迭代实现卓越的交互式少样本分割 | Shreyas Chandgothia, Ardhendu Sekhar, Amit Sethi | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Cell Variational Information Bottleneck Network | 细胞变异信息瓶颈网络 | Zhonghua Zhai, Chen Ju, Jinsong Lan, Shuai Xiao | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Cartoon Hallucinations Detection: Pose-aware In Context Visual Learning | 卡通幻觉检测:情境视觉学习中的姿势感知 | Bumsoo Kim, Wonseop Shin, Kyuchul Lee, Sanghyun Seo | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | An Integrated Neighborhood and Scale Information Network for Open-Pit Mine Change Detection in High-Resolution Remote Sensing Images | 用于高分辨率遥感图像露天矿变化检测的综合邻域和规模信息网络 | Zilin Xie, Kangning Li, Jinbao Jiang, Jinzhong Yang, Xiaojun Qiao, Deshuai Yuan, Cheng Nie | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Image Classification with Rotation-Invariant Variational Quantum Circuits | 使用旋转不变变分量子电路进行图像分类 | Paul San Sebastian, Mikel Cañizo, Román Orús | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | VRSO: Visual-Centric Reconstruction for Static Object Annotation | VRSO:静态对象注释的以视觉为中心的重建 | Chenyao Yu, Yingfeng Cai, Jiaxin Zhang, Hui Kong, Wei Sui, Cong Yang | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation | BSNet:用于 3D 实例分割的框监督模拟辅助 Mean Teacher | Jiahao Lu, Jiacheng Deng, Tianzhu Zhang | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Vehicle Detection Performance in Nordic Region | 北欧地区车辆检测性能 | Hamam Mokayed, Rajkumar Saini, Oluwatosin Adewumi, Lama Alkhaled, Bjorn Backe, Palaiahnakote Shivakumara, Olle Hagner, Yan Chai Hum | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Extracting Human Attention through Crowdsourced Patch Labeling | 通过众包补丁标签吸引人们的注意力 | Minsuk Chang, Seokhyeon Park, Hyeon Jeon, Aeri Cho, Soohyun Lee, Jinwook Seo | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Cell Tracking according to Biological Needs -- Strong Mitosis-aware Random-finite Sets Tracker with Aleatoric Uncertainty | 根据生物需求进行细胞追踪——具有任意不确定性的强有丝分裂感知随机有限集追踪器 | Timo Kaiser, Maximilian Schier, Bodo Rosenhahn | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Clean-image Backdoor Attacks | 干净图像后门攻击 | Dazhong Rong, Shuheng Shen, Xinyi Fu, Peng Qian, Jianhai Chen, Qinming He, Xing Fu, Weiqiang Wang | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | ParFormer: Vision Transformer Baseline with Parallel Local Global Token Mixer and Convolution Attention Patch Embedding | ParFormer:具有并行局部全局令牌混合器和卷积注意补丁嵌入的视觉变换器基线 | Novendra Setyawan, Ghufron Wahyu Kurniawan, Chi-Chia Sun, Jun-Wei Hsieh, Hui-Kai Su, Wen-Kai Kuo | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Improve Cross-domain Mixed Sampling with Guidance Training for Adaptive Segmentation | 通过自适应分割的指导训练改进跨域混合采样 | Wenlve Zhou, Zhiheng Zhou, Tianlei Wang, Delu Zeng | arxiv.org/pdf/2403.14… | null |
| 2024-03-22 | Trajectory Regularization Enhances Self-Supervised Geometric Representation | 轨迹正则化增强自监督几何表示 | Jiayun Wang, Stella X. Yu, Yubei Chen | arxiv.org/pdf/2403.14… | null |
| 2024-03-22 | Web-based Melanoma Detection | 基于网络的黑色素瘤检测 | SangHyuk Kim, Edward Gaibor, Daniel Haehn | arxiv.org/pdf/2403.14… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-22 | An Open-World, Diverse, Cross-Spatial-Temporal Benchmark for Dynamic Wild Person Re-Identification | 开放世界、多样化、跨时空的动态野人重识别基准 | Lei Zhang, Xiaowei Fu, Fuxiang Huang, Yi Yang, Xinbo Gao | arxiv.org/pdf/2403.15… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-22 | SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series | SiMBA:基于 Mamba 的简化视觉和多元时间序列架构 | Badri N. Patro, Vijay S. Agneeswaran | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | GPT-Connect: Interaction between Text-Driven Human Motion Generator and 3D Scenes in a Training-free Manner | GPT-Connect:文本驱动的人体运动生成器和 3D 场景之间以免训练的方式进行交互 | Haoxuan Qu, Ziyan Guo, Jun Liu | arxiv.org/pdf/2403.14… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-22 | LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis | LATTE3D:大规模摊销文本到增强型 3D 合成 | Kevin Xie, Jonathan Lorraine, Tianshi Cao, Jun Gao, James Lucas, Antonio Torralba, Sanja Fidler, Xiaohui Zeng | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Augmented Reality based Simulated Data (ARSim) with multi-view consistency for AV perception networks | 基于增强现实的模拟数据 (ARSim),具有适用于 AV 感知网络的多视图一致性 | Aqeel Anwar, Tae Eun Choe, Zian Wang, Sanja Fidler, Minwoo Park | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example | LeGO:利用表面变形网络,通过一个示例生成可动画化的风格化脸部 | Soyeon Yoon, Kwan Yun, Kwanggyoon Seo, Sihun Cha, Jung Eun Yoo, Junyong Noh | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | FastCAD: Real-Time CAD Retrieval and Alignment from Scans and Videos | FastCAD:从扫描和视频中实时检索和对齐 CAD | Florian Langer, Jihong Ju, Georgi Dikov, Gerhard Reitmayr, Mohsen Ghafoorian | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Integrating multiscale topology in digital pathology with pyramidal graph convolutional networks | 将数字病理学中的多尺度拓扑与金字塔图卷积网络相集成 | Victor Ibañez, Przemyslaw Szostak, Quincy Wong, Konstanty Korski, Samaneh Abbasi-Sureshjani, Alvaro Gomariz | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | TexRO: Generating Delicate Textures of 3D Models by Recursive Optimization | TexRO:通过递归优化生成 3D 模型的精致纹理 | Jinbo Wu, Xing Liu, Chenming Wu, Xiaobo Gao, Jialun Liu, Xinqi Liu, Chen Zhao, Haocheng Feng, Errui Ding, Jingdong Wang | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Tri-Perspective View Decomposition for Geometry-Aware Depth Completion | 用于几何感知深度补全的三透视视图分解 | Zhiqiang Yan, Yuankai Lin, Kun Wang, Yupeng Zheng, Yufei Wang, Zhenyu Zhang, Jun Li, Jian Yang | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Survey on Modeling of Articulated Objects | 铰接物体建模调查 | Jiayi Liu, Manolis Savva, Ali Mahdavi-Amiri | arxiv.org/pdf/2403.14… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-22 | Self-Supervised Backbone Framework for Diverse Agricultural Vision Tasks | 用于多种农业视觉任务的自监督骨干框架 | Sudhir Sornapudi, Rajhans Singh | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Continual Vision-and-Language Navigation | 持续视觉和语言导航 | Seongjun Jeong, Gi-Cheon Kang, Seongho Choi, Joochan Kim, Byoung-Tak Zhang | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Piecewise-Linear Manifolds for Deep Metric Learning | 用于深度度量学习的分段线性流形 | Shubhang Bhatnagar, Narendra Ahuja | arxiv.org/pdf/2403.14… | null |
| 2024-03-22 | Defying Imbalanced Forgetting in Class Incremental Learning | 克服类增量学习中的不平衡遗忘 | Shixiong Xu, Gaofeng Meng, Xing Nie, Bolin Ni, Bin Fan, Shiming Xiang | arxiv.org/pdf/2403.14… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-22 | DragAPart: Learning a Part-Level Motion Prior for Articulated Objects | DragAPart:学习铰接对象的零件级运动先验 | Ruining Li, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | PDE-CNNs: Axiomatic Derivations and Applications | PDE-CNN:公理推导和应用 | Gijs Bellaard, Sei Sakata, Bart M. N. Smets, Remco Duits | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction | UniTraj:可扩展车辆轨迹预测的统一框架 | Lan Feng, Mohammadhossein Bahari, Kaouther Messaoud Ben Amor, Éloi Zablocki, Matthieu Cord, Alexandre Alahi | arxiv.org/pdf/2403.15… | null |
| 2024-03-22 | Subjective Quality Assessment of Compressed Tone-Mapped High Dynamic Range Videos | 压缩色调映射高动态范围视频的主观质量评估 | Abhinau K. Venkataramanan, Alan C. Bovik | arxiv.org/pdf/2403.15… | null |