[UPDATED!] 2024-01-17 (Publish Time)
Classification/Detection/Recognition/Segmentation
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-17 | GARField: Group Anything with Radiance Fields | GARField:用辐射场对任何东西进行分组 | Chung Min Kim, Mingxuan Wu, Justin Kerr, Ken Goldberg, Matthew Tancik, Angjoo Kanazawa | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | Vision Mamba:利用双向状态空间模型进行高效视觉表示学习 | Lianghui Zhu, Bencheng Liao, Qian Zhang, Xinlong Wang, Wenyu Liu, Xinggang Wang | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images | POP-3D:根据图像进行开放词汇 3D 占用预测 | Antonin Vobecky, Oriane Siméoni, David Hurych, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Josef Sivic | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | To deform or not: treatment-aware longitudinal registration for breast DCE-MRI during neoadjuvant chemotherapy via unsupervised keypoints detection | 变形与否:新辅助化疗期间通过无监督关键点检测对乳腺 DCE-MRI 进行治疗感知纵向配准 | Luyi Han, Tao Tan, Tianyu Zhang, Yuan Gao, Xin Wang, Valentina Longo, Sofía Ventura-Díaz, Anna D'Angelo, Jonas Teuwen, Ritse Mann | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | Siamese Meets Diffusion Network: SMDNet for Enhanced Change Detection in High-Resolution RS Imagery | Siamese 遇上扩散网络:SMDNet 用于增强高分辨率 RS 图像中的变化检测 | Jia Jia, Geunho Lee, Zhibo Wang, Lyu Zhi, Yuchu He | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | PixelDINO: Semi-Supervised Semantic Segmentation for Detecting Permafrost Disturbances | PixelDINO:用于检测永久冻土扰动的半监督语义分割 | Konrad Heidler, Ingmar Nitze, Guido Grosse, Xiao Xiang Zhu | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Uncertainty estimates for semantic segmentation: providing enhanced reliability for automated motor claims handling | 语义分割的不确定性估计:为自动汽车索赔处理提供增强的可靠性 | Jan Küchler, Daniel Kröll, Sebastian Schoenen, Andreas Witte | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Dynamic Relation Transformer for Contextual Text Block Detection | 用于上下文文本块检测的动态关系转换器 | Jiawei Wang, Shunchi Zhang, Kai Hu, Chixiang Ma, Zhuoyao Zhong, Lei Sun, Qiang Huo | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Exploring the Role of Convolutional Neural Networks (CNN) in Dental Radiography Segmentation: A Comprehensive Systematic Literature Review | 探索卷积神经网络 (CNN) 在牙科放射线摄影分割中的作用:全面系统的文献综述 | Walid Brahmi, Imen Jdey, Fadoua Drira | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | DK-SLAM: Monocular Visual SLAM with Deep Keypoints Adaptive Learning, Tracking and Loop-Closing | DK-SLAM:具有深度关键点自适应学习、跟踪和闭环的单目视觉 SLAM | Hao Qu, Lilian Zhang, Jun Mao, Junbo Tie, Xiaofeng He, Xiaoping Hu, Yifei Shi, Changhao Chen | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Trapped in texture bias? A large scale comparison of deep instance segmentation | 陷入纹理偏差?深度实例分割的大规模比较 | Johannes Theodoridis, Jessica Hofmann, Johannes Maucher, Andreas Schilling | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | Enhancing Lidar-based Object Detection in Adverse Weather using Offset Sequences in Time | 使用时间偏移序列增强恶劣天气下基于激光雷达的物体检测 | Raphael van Kempen, Tim Rehbronn, Abin Jose, Johannes Stegmaier, Bastian Lampe, Timo Woopen, Lutz Eckstein | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Change Detection Between Optical Remote Sensing Imagery and Map Data via Segment Anything Model (SAM) | 通过分段任意模型 (SAM) 检测光学遥感图像和地图数据之间的变化 | Hongruixuan Chen, Jian Song, Naoto Yokoya | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Generalized Face Liveness Detection via De-spoofing Face Generator | 通过反欺骗人脸生成器进行广义人脸活体检测 | Xingming Long, Shiguang Shan, Jie Zhang | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Hearing Loss Detection from Facial Expressions in One-on-one Conversations | 从一对一对话中的面部表情检测听力损失 | Yufeng Yin, Ishwarya Ananthabhotla, Vamsi Krishna Ithapu, Stavros Petridis, Yu-Hsiang Wu, Christi Miller | arxiv.org/pdf/2401.08… | null |
| 2024-01-17 | Learning to detect cloud and snow in remote sensing images from noisy labels | 学习从噪声标签中检测遥感图像中的云和雪 | Zili Liu, Hao Chen, Wenyuan Li, Keyan Chen, Zipeng Qi, Chenyang Liu, Zhengxia Zou, Zhenwei Shi | arxiv.org/pdf/2401.08… | null |
| 2024-01-17 | PPR: Enhancing Dodging Attacks while Maintaining Impersonation Attacks on Face Recognition Systems | PPR:增强躲避攻击的同时维持对人脸识别系统的模拟攻击 | Fengfan Zhou, Heifei Ling | arxiv.org/pdf/2401.08… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-17 | TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion | TextureDreamer:通过几何感知扩散进行图像引导纹理合成 | Yu-Ying Yeh, Jia-Bin Huang, Changil Kim, Lei Xiao, Thu Nguyen-Phuoc, Numair Khan, Cheng Zhang, Manmohan Chandraker, Carl S Marshall, Zhao Dong, et al. | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | An Efficient Generalizable Framework for Visuomotor Policies via Control-aware Augmentation and Privilege-guided Distillation | 通过控制感知增强和特权引导蒸馏的有效通用视觉运动策略框架 | Yinuo Zhao, Kun Wu, Tianjiao Yi, Zhiyuan Xu, Xiaozhu Ju, Zhengping Che, Qinru Qiu, Chi Harold Liu, Jian Tang | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior | Consistent3D:通过确定性采样先验实现一致的高保真文本到 3D 生成 | Zike Wu, Pan Zhou, Xuanyu Yi, Xiaoding Yuan, Hanwang Zhang | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Hybrid of DiffStride and Spectral Pooling in Convolutional Neural Networks | 卷积神经网络中 DiffStride 和谱池的混合 | Sulthan Rafif, Mochamad Arfan Ravy Wahyu Pratama, Mohammad Faris Azhar, Ahmad Mustafidul Ibad, Lailil Muflikhah, Novanto Yudistira | arxiv.org/pdf/2401.09… | null |
OCR
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-17 | VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models | VideoCrafter2:克服高质量视频扩散模型的数据限制 | Haoxin Chen, Yong Zhang, Xiaodong Cun, Menghan Xia, Xintao Wang, Chao Weng, Ying Shan | arxiv.org/pdf/2401.09… | null |
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-17 | Vlogger: Make Your Dream A Vlog | 视频博主:让你的梦想成为视频博客 | Shaobin Zhuang, Kunchang Li, Xinyuan Chen, Yaohui Wang, Ziwei Liu, Yu Qiao, Yali Wang | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | Diverse Part Synthesis for 3D Shape Creation | 用于创建 3D 形状的多种零件合成 | Yanran Guan, Oliver van Kaick | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Training-Free Semantic Video Composition via Pre-trained Diffusion Model | 通过预训练扩散模型进行免训练语义视频合成 | Jiaqi Guo, Sitong Su, Junchen Zhu, Lianli Gao, Jingkuan Song | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Unsupervised Multiple Domain Translation through Controlled Disentanglement in Variational Autoencoder | 通过变分自动编码器中的受控解缠实现无监督多域翻译 | Almudévar Antonio, Mariotte Théo, Ortega Alfonso, Tahon Marie | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis | 组合与征服:基于扩散的 3D 深度感知可组合图像合成 | Jonghyun Lee, Hansam Cho, Youngjoon Yoo, Seoung Bum Kim, Yonghyun Jeong | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | 3D Human Pose Analysis via Diffusion Synthesis | 通过扩散合成进行 3D 人体姿势分析 | Haorui Ji, Hongdong Li | arxiv.org/pdf/2401.08… | null |
| 2024-01-17 | Uncertainty-aware No-Reference Point Cloud Quality Assessment | 不确定性感知无参考点云质量评估 | Songlin Fan, Zixuan Guo, Wei Gao, Ge Li | arxiv.org/pdf/2401.08… | null |
| 2024-01-17 | Idempotence and Perceptual Image Compression | 幂等性和感知图像压缩 | Tongda Xu, Ziran Zhu, Dailan He, Yanghao Li, Lina Guo, Yuanyuan Wang, Zhe Wang, Hongwei Qin, Yan Wang, Jingjing Liu, et al. | arxiv.org/pdf/2401.08… | link |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-17 | SM$^3$: Self-Supervised Multi-task Modeling with Multi-view 2D Images for Articulated Objects | SM$^3$:针对铰接物体的多视图 2D 图像的自监督多任务建模 | Haowen Wang, Zhen Zhao, Zhao Jin, Zhengping Che, Liang Qiao, Yakun Huang, Zhipeng Fan, Xiuquan Qiao, Jian Tang | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Autonomous Catheterization with Open-source Simulator and Expert Trajectory | 使用开源模拟器和专家轨迹进行自主导管插入 | Tudor Jianu, Baoru Huang, Tuan Vo, Minh Nhat Vu, Jingxuan Kang, Hoan Nguyen, Olatunji Omisore, Pierre Berthet-Rayne, Sebastiano Fichera, Anh Nguyen | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | Cross-modality Guidance-aided Multi-modal Learning with Dual Attention for MRI Brain Tumor Grading | 具有双重关注的跨模态指导辅助多模态学习用于 MRI 脑肿瘤分级 | Dunyuan Xu, Xi Wang, Jinyue Cai, Pheng-Ann Heng | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | COCO is "ALL" You Need for Visual Instruction Fine-tuning | COCO 是视觉指令微调所需的“全部” | Xiaotian Han, Yiqi Wang, Bohan Zhai, Quanzeng You, Hongxia Yang | arxiv.org/pdf/2401.08… | null |
LLM
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-17 | Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models | 遥感 ChatGPT:使用 ChatGPT 和视觉模型解决遥感任务 | Haonan Guo, Xin Su, Chen Wu, Bo Du, Liangpei Zhang, Deren Li | arxiv.org/pdf/2401.09… | link |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-17 | DaFoEs: Mixing Datasets towards the generalization of vision-state deep-learning Force Estimation in Minimally Invasive Robotic Surgery | DaFoEs:混合数据集以推广微创机器人手术中的视觉状态深度学习力估计 | Mikel De Iturrate Reyzabal, Mingcong Chen, Wei Huang, Sebastien Ourselin, Hongbin Liu | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | UniVG: Towards UNIfied-modal Video Generation | UniVG:迈向统一模态视频生成 | Ludan Ruan, Lei Tian, Chuanwei Huang, Xu Zhang, Xinyan Xiao | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Efficient Image Super-Resolution via Symmetric Visual Attention Network | 通过对称视觉注意网络实现高效图像超分辨率 | Chengxu Wu, Qinrui Fan, Shu Hu, Xi Wu, Xin Wang, Jing Hu | arxiv.org/pdf/2401.08… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-17 | ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization | ICON:联合姿势和辐射场优化的增量置信度 | Weiyao Wang, Pierre Gleize, Hao Tang, Xingyu Chen, Kevin J Liang, Matt Feiszli | arxiv.org/pdf/2401.08… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-17 | Tri$^{2}$-plane: Volumetric Avatar Reconstruction with Feature Pyramid | Tri$^{2}$-plane:利用特征金字塔重建体积头像 | Luchuan Song, Pinxin Liu, Lele Chen, Celong Liu, Chenliang Xu | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding | SceneVerse:扩展 3D 视觉语言学习以实现基础场景理解 | Baoxiong Jia, Yixin Chen, Huangyue Yu, Yan Wang, Xuesong Niu, Tengyu Liu, Qing Li, Siyuan Huang | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | 3D Scene Geometry Estimation from 360$^\circ$ Imagery: A Survey | 根据 360$^\circ$ 图像进行 3D 场景几何估计:一项调查 | Thiago Lopes Trugillo da Silveira, Paulo Gamarra Lessa Pinto, Jeffri Erwin Murrugarra Llerena, Claudio Rosito Jung | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Continuous Piecewise-Affine Based Motion Model for Image Animation | 用于图像动画的连续分段仿射运动模型 | Hexiang Wang, Fengqi Liu, Qianyu Zhou, Ran Yi, Xin Tan, Lizhuang Ma | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | Objects With Lighting: A Real-World Dataset for Evaluating Reconstruction and Rendering for Object Relighting | 具有照明的对象:用于评估对象重新照明的重建和渲染的真实数据集 | Benjamin Ummenhofer, Sanskar Agrawal, Rene Sepulveda, Yixing Lao, Kai Zhang, Tianhang Cheng, Stephan Richter, Shenlong Wang, German Ros | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | Stream Query Denoising for Vectorized HD Map Construction | 用于矢量化高精地图构建的流查询去噪 | Shuo Wang, Fan Jia, Yingfei Liu, Yucheng Zhao, Zehui Chen, Tiancai Wang, Chi Zhang, Xiangyu Zhang, Feng Zhao | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Attack and Reset for Unlearning: Exploiting Adversarial Noise toward Machine Unlearning through Parameter Re-initialization | 攻击和重置以实现遗忘:通过参数重新初始化利用对抗性噪声来实现机器遗忘 | Yoonhwa Jung, Ikhyun Cho, Shun-Hsiang Hsu, Julia Hockenmaier | arxiv.org/pdf/2401.08… | null |
Learning Methods
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-17 | PIN-SLAM: LiDAR SLAM Using a Point-Based Implicit Neural Representation for Achieving Global Map Consistency | PIN-SLAM:使用基于点的隐式神经表示实现全球地图一致性的 LiDAR SLAM | Yue Pan, Xingguang Zhong, Louis Wiesmann, Thorbjörn Posewsky, Jens Behley, Cyrill Stachniss | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding | 通过 HSIC 瓶颈正交化和等角嵌入实现持续学习需求 | Depeng Li, Tianqi Wang, Junwei Chen, Qining Ren, Kenji Kawaguchi, Zhigang Zeng | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding | CrossVideo:用于点云视频理解的自监督跨模态对比学习 | Yunze Liu, Changxi Chen, Zifan Wang, Li Yi | arxiv.org/pdf/2401.09… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-17 | Event-Based Visual Odometry on Non-Holonomic Ground Vehicles | 非完整地面车辆上基于事件的视觉里程计 | Wanting Xu, Si'ao Zhang, Li Cui, Xin Peng, Laurent Kneip | arxiv.org/pdf/2401.09… | link |
| 2024-01-17 | Online Stability Improvement of Groebner Basis Solvers using Deep Learning | 使用深度学习提高 Groebner 基解算器的在线稳定性 | Wanting Xu, Lan Hu, Manolis C. Tsakiris, Laurent Kneip | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Tight Fusion of Events and Inertial Measurements for Direct Velocity Estimation | 事件和惯性测量的紧密融合用于直接速度估计 | Wanting Xu, Xin Peng, Laurent Kneip | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | A gradient-based approach to fast and accurate head motion compensation in cone-beam CT | 锥束 CT 中基于梯度的快速、准确头部运动补偿方法 | Mareike Thies, Fabian Wagner, Noah Maul, Haijun Yu, Manuela Meier, Linda-Sophie Schneider, Mingxuan Gu, Siyuan Mei, Lukas Folle, Andreas Maier | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | P$^2$OT: Progressive Partial Optimal Transport for Deep Imbalanced Clustering | P$^2$OT:深度不平衡聚类的渐进部分最优传输 | Chuyu Zhang, Hui Ren, Xuming He | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | Relative Pose for Nonrigid Multi-Perspective Cameras: The Static Case | 非刚性多视角相机的相对姿势:静态情况 | Min Li, Jiaqi Yang, Laurent Kneip | arxiv.org/pdf/2401.09… | null |
| 2024-01-17 | OCTO+: A Suite for Automatic Open-Vocabulary Object Placement in Mixed Reality | OCTO+:混合现实中自动开放词汇对象放置套件 | Aditya Sharma, Luke Yoffe, Tobias Höllerer | arxiv.org/pdf/2401.08… | null |
| 2024-01-17 | Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices | 动态 DNN 和运行时管理可在移动/嵌入式设备上进行高效推理 | Lei Xun, Jonathon Hare, Geoff V. Merrett | arxiv.org/pdf/2401.08… | null |
| 2024-01-17 | Fluid Dynamic DNNs for Reliable and Adaptive Distributed Inference on Edge Devices | 用于边缘设备上可靠、自适应分布式推理的流体动态 DNN | Lei Xun, Mingyu Hu, Hengrui Zhao, Amit Kumar Singh, Jonathon Hare, Geoff V. Merrett | arxiv.org/pdf/2401.08… | null |
| 2024-01-17 | Subwavelength Imaging using a Solid-Immersion Diffractive Optical Processor | 使用固体浸没衍射光学处理器进行亚波长成像 | Jingtian Hu, Kun Liao, Niyazi Ulas Dinc, Carlo Gigli, Bijie Bai, Tianyi Gan, Xurong Li, Hanlong Chen, Xilin Yang, Yuhang Li, et al. | arxiv.org/pdf/2401.08… | null |