[UPDATED!] 2024-02-28 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-28 | MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation | MambaMIR:用于联合医学图像重建和不确定性估计的任意掩码 Mamba | Jiahao Huang, Liutao Yang, Fanwen Wang, Yinzhe Wu, Yang Nan, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Daoqiang Zhang, Guang Yang | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Objective and Interpretable Breast Cosmesis Evaluation with Attention Guided Denoising Diffusion Anomaly Detection Model | 使用注意力引导的去噪扩散异常检测模型进行客观且可解释的乳房美容评估 | Sangjoon Park, Yong Bae Kim, Jee Suk Chang, Seo Hee Choi, Hyungjin Chung, Ik Jae Lee, Hwa Kyung Byun | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | LatentSwap: An Efficient Latent Code Mapping Framework for Face Swapping | LatentSwap:一种高效的人脸交换潜在代码映射框架 | Changho Choi, Minho Kim, Junhyeok Lee, Hyoung-Kyu Song, Younggeun Kim, Seungryong Kim | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | FineDiffusion: Scaling up Diffusion Models for Fine-grained Image Generation with 10,000 Classes | FineDiffusion:扩展扩散模型以生成具有 10,000 个类别的细粒度图像 | Ziying Pan, Kun Wang, Gang Li, Feihong He, Xiwang Li, Yongxuan Lai | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Balancing Act: Distribution-Guided Debiasing in Diffusion Models | 平衡法:扩散模型中分布引导的去偏 | Rishubh Parihar, Abhijnya Bhat, Saswat Mallick, Abhipsa Basu, Jogendra Nath Kundu, R. Venkatesh Babu | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning | DecisionNCE:通过隐式偏好学习体现多模态表示 | Jianxiong Li, Jinliang Zheng, Yinan Zheng, Liyuan Mao, Xiao Hu, Sijie Cheng, Haoyi Niu, Jihao Liu, Yu Liu, Jingjing Liu, et al. | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Context-aware Talking Face Video Generation | 上下文感知说话人脸视频生成 | Meidai Xuanyuan, Yuwang Wang, Honglei Guo, Qionghai Dai | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis | 用于姿势引导人体图像合成的从粗到细的潜在扩散 | Yanzuo Lu, Manlin Zhang, Andy J Ma, Xiaohua Xie, Jian-Huang Lai | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | SynArtifact: Classifying and Alleviating Artifacts in Synthetic Images via Vision-Language Model | SynArtifact:通过视觉语言模型对合成图像中的伪影进行分类和消除 | Bin Cao, Jianhao Yuan, Yexin Liu, Jian Li, Shuyang Sun, Jing Liu, Bo Zhao | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models in Medicine | OpenMEDLab:医学多模态基础模型的开源平台 | Xiaosong Wang, Xiaofan Zhang, Guotai Wang, Junjun He, Zhongyu Li, Wentao Zhu, Yi Guo, Qi Dou, Xiaoxiao Li, Dequan Wang, et al. | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Breaking the Black-Box: Confidence-Guided Model Inversion Attack for Distribution Shift | 打破黑匣子:针对分布偏移的置信引导模型反转攻击 | Xinhao Liu, Yingzhao Jiang, Zetao Lin | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | PolyOculus: Simultaneous Multi-view Image-based Novel View Synthesis | PolyOculus:基于图像的同时多视图新颖视图合成 | Jason J. Yu, Tristan Aumentado-Armstrong, Fereshteh Forghani, Konstantinos G. Derpanis, Marcus A. Brubaker | arxiv.org/pdf/2402.17… | null |
| 2024-02-28 | Vision Language Model-based Caption Evaluation Method Leveraging Visual Context Extraction | 基于视觉语言模型的利用视觉上下文提取的字幕评估方法 | Koki Maeda, Shuhei Kurita, Taiki Miyanishi, Naoaki Okazaki | arxiv.org/pdf/2402.17… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-28 | Multimodal Learning To Improve Cardiac Late Mechanical Activation Detection From Cine MR Images | 多模态学习改善电影 MR 图像中的心脏晚期机械激活检测 | Jiarui Xing, Nian Wu, Kenneth Bilchick, Frederick Epstein, Miaomiao Zhang | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | TAMM: TriAdapter Multi-Modal Learning for 3D Shape Understanding | TAMM:用于 3D 形状理解的 TriAdapter 多模态学习 | Zhihao Zhang, Shengcao Cao, Yu-Xiong Wang | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Prediction of recurrence free survival of head and neck cancer using PET/CT radiomics and clinical information | 使用 PET/CT 放射组学和临床信息预测头颈癌的无复发生存期 | Mona Furukawa, Daniel R. McGowan, Bartłomiej W. Papież | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | A Multimodal Handover Failure Detection Dataset and Baselines | 多模式切换失败检测数据集和基线 | Santosh Thoduka, Nico Hochgeschwender, Juergen Gall, Paul G. Plöger | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding | 用于视觉丰富网页理解的分层多模态预训练 | Hongshen Xu, Lu Chen, Zihan Zhao, Da Ma, Ruisheng Cao, Zichen Zhu, Kai Yu | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Polos: Multimodal Metric Learning from Human Feedback for Image Captioning | Polos:根据图像字幕的人类反馈进行多模态度量学习 | Yuiga Wada, Kanta Kaneda, Daichi Saito, Komei Sugiura | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | M3-VRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding | M3-VRD:多模态多任务多教师视觉丰富形式文档理解 | Yihao Ding, Lorenzo Vaiani, Caren Han, Jean Lee, Paolo Garza, Josiah Poon, Luca Cagliero | arxiv.org/pdf/2402.17… | null |
| 2024-02-28 | All in a Single Image: Large Multimodal Models are In-Image Learners | 一切都在一个图像中:大型多模态模型是图像内学习器 | Lei Wang, Wanyu Xu, Zhiqiang Hu, Yihuai Lan, Shan Dong, Hao Wang, Roy Ka-Wei Lee, Ee-Peng Lim | arxiv.org/pdf/2402.17… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-28 | NToP: NeRF-Powered Large-scale Dataset Generation for 2D and 3D Human Pose Estimation in Top-View Fisheye Images | NToP:NeRF 支持的大规模数据集生成,用于顶视图鱼眼图像中的 2D 和 3D 人体姿势估计 | Jingrui Yu, Dipankar Nandi, Roman Seidel, Gangolf Hirtz | arxiv.org/pdf/2402.18… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-28 | Gradient Reweighting: Towards Imbalanced Class-Incremental Learning | 梯度重新加权:面向不平衡的类增量学习 | Jiangpeng He, Fengqing Zhu | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Sunshine to Rainstorm: Cross-Weather Knowledge Distillation for Robust 3D Object Detection | 阳光明媚到暴雨:跨天气知识蒸馏,实现稳健的 3D 物体检测 | Xun Huang, Hai Wu, Xin Li, Xiaoliang Fan, Chenglu Wen, Cheng Wang | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Multi-objective Differentiable Neural Architecture Search | 多目标可微神经架构搜索 | Rhea Sanjay Sukthanker, Arber Zela, Benedikt Staffler, Samuel Dooley, Josif Grabocka, Frank Hutter | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | CFDNet: A Generalizable Foggy Stereo Matching Network with Contrastive Feature Distillation | CFDNet:具有对比特征蒸馏的可推广雾立体匹配网络 | Zihua Liu, Yizhou Li, Masatoshi Okutomi | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Ef-QuantFace: Streamlined Face Recognition with Small Data and Low-Bit Precision | Ef-QuantFace:具有小数据和低位精度的简化人脸识别 | William Gazali, Jocelyn Michelle Kho, Joshua Santoso, Williem | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | A Lightweight Low-Light Image Enhancement Network via Channel Prior and Gamma Correction | 通过通道先验和伽玛校正的轻量级低光图像增强网络 | Shyang-En Weng, Shaou-Gang Miaou, Ricky Christanto | arxiv.org/pdf/2402.18… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-28 | UniMODE: Unified Monocular 3D Object Detection | UniMODE:统一单目 3D 物体检测 | Zhuoling Li, Xiaogang Xu, SerNam Lim, Hengshuang Zhao | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Defect Detection in Tire X-Ray Images: Conventional Methods Meet Deep Structures | 轮胎 X 射线图像中的缺陷检测:传统方法与深层结构的结合 | Andrei Cozma, Landon Harris, Hairong Qi, Ping Ji, Wenpeng Guo, Song Yuan | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Detection of Micromobility Vehicles in Urban Traffic Videos | 城市交通视频中微型车辆的检测 | Khalil Sabri, Célia Djilali, Guillaume-Alexandre Bilodeau, Nicolas Saunier, Wassim Bouachir | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation | 分离与征服:通过弱监督语义分割的分解和表示来解耦共现 | Zhiwei Yang, Kexue Fu, Minghong Duan, Linhao Qu, Shuo Wang, Zhijian Song | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization | 用于单域泛化的提示驱动的动态以对象为中心的学习 | Deng Li, Aming Wu, Yaowei Wang, Yahong Han | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | A Modular System for Enhanced Robustness of Multimedia Understanding Networks via Deep Parametric Estimation | 通过深度参数估计增强多媒体理解网络鲁棒性的模块化系统 | Francesco Barbato, Umberto Michieli, Mehmet Karim Yucel, Pietro Zanuttigh, Mete Ozay | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Robust Quantification of Percent Emphysema on CT via Domain Attention: the Multi-Ethnic Study of Atherosclerosis (MESA) Lung Study | 通过领域注意力对 CT 上的肺气肿百分比进行稳健量化:动脉粥样硬化 (MESA) 肺研究的多种族研究 | Xuzhe Zhang, Elsa D. Angelini, Eric A. Hoffman, Karol E. Watson, Benjamin M. Smith, R. Graham Barr, Andrew F. Laine | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Enhancing Roadway Safety: LiDAR-based Tree Clearance Analysis | 增强道路安全:基于激光雷达的树木间隙分析 | Miriam Louise Carnot, Eric Peukert, Bogdan Franczyk | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Feature Denoising For Low-Light Instance Segmentation Using Weighted Non-Local Blocks | 使用加权非局部块进行低光实例分割的特征去噪 | Joanne Lin, Nantheera Anantrasirichai, David Bull | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving | EchoTrack:用于自动驾驶的听觉参考多目标跟踪 | Jiacheng Lin, Jiajun Chen, Kunyu Peng, Xuan He, Zhiyong Li, Rainer Stiefelhagen, Kailun Yang | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Grid-Based Continuous Normal Representation for Anomaly Detection | 用于异常检测的基于网格的连续正态表示 | Joo Chan Lee, Taejune Kim, Eunbyung Park, Simon S. Woo, Jong Hwan Ko | arxiv.org/pdf/2402.18… | link |
| 2024-02-28 | FSL Model can Score Higher as It Is | FSL 模型可以得分更高 | Yunwei Bai, Ying Kiat Tan, Tsuhan Chen | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Self-Supervised Learning in Electron Microscopy: Towards a Foundation Model for Advanced Image Analysis | 电子显微镜中的自我监督学习:建立高级图像分析的基础模型 | Bashir Kazimi, Karina Ruzaeva, Stefan Sandfeld | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | EAN-MapNet: Efficient Vectorized HD Map Construction with Anchor Neighborhoods | EAN-MapNet:利用锚点邻域构建高效的矢量化高清地图 | Huiyuan Xiong, Jun Shen, Taohong Zhu, Yuelong Pan | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | On the Accuracy of Edge Detectors in Number Plate Extraction | 边缘检测器在车牌提取中的准确性研究 | Bashir Olaniyi Sadiq | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Image2Flow: A hybrid image and graph convolutional neural network for rapid patient-specific pulmonary artery segmentation and CFD flow field calculation from 3D cardiac MRI data | Image2Flow:混合图像和图形卷积神经网络,用于根据 3D 心脏 MRI 数据快速进行患者特定肺动脉分割和 CFD 流场计算 | Tina Yao, Endrit Pajaziti, Michael Quail, Silvia Schievano, Jennifer A Steeden, Vivek Muthurangu | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Zero-Shot Aerial Object Detection with Visual Description Regularization | 具有视觉描述正则化的零样本空中物体检测 | Zhengqing Zang, Chenyu Lin, Chenwei Tang, Tao Wang, Jiancheng Lv | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Oil Spill Drone: A Dataset of Drone-Captured, Segmented RGB Images for Oil Spill Detection in Port Environments | 溢油无人机:无人机捕获的分段 RGB 图像数据集,用于港口环境中的溢油检测 | T. De Kerf, S. Sels, S. Samsonova, S. Vanlanduit | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Out-of-Distribution Detection using Neural Activation Prior | 使用神经激活先验进行分布外检测 | Weilin Wan, Weizhong Zhang, Cheng Jin | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | OccTransformer: Improving BEVFormer for 3D camera-only occupancy prediction | OccTransformer:改进 BEVFormer,以实现仅 3D 相机的占用预测 | Jian Liu, Sipeng Zhang, Chuixin Kong, Wenyuan Zhang, Yuhang Wu, Yikang Ding, Borun Xu, Ruibo Ming, Donglai Wei, Xianming Liu | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Classes Are Not Equal: An Empirical Study on Image Recognition Fairness | 类不平等:图像识别公平性的实证研究 | Jiequan Cui, Beier Zhu, Xin Wen, Xiaojuan Qi, Bei Yu, Hanwang Zhang | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Understanding the Role of Pathways in a Deep Neural Network | 了解深度神经网络中路径的作用 | Lei Lyu, Chen Pang, Jihua Wang | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | PRCL: Probabilistic Representation Contrastive Learning for Semi-Supervised Semantic Segmentation | PRCL:半监督语义分割的概率表示对比学习 | Haoyu Xie, Changqi Wang, Jian Zhao, Yang Liu, Jun Dan, Chong Fu, Baigui Sun | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | UniVS: Unified and Universal Video Segmentation with Prompts as Queries | UniVS:以提示作为查询的统一通用视频分割 | Minghan Li, Shuai Li, Xindong Zhang, Lei Zhang | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Dual-Context Aggregation for Universal Image Matting | 用于通用图像抠图的双上下文聚合 | Qinglin Liu, Xiaoqian Lv, Wei Yu, Changyong Guo, Shengping Zhang | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Spannotation: Enhancing Semantic Segmentation for Autonomous Navigation with Efficient Image Annotation | Spannotation:通过高效的图像注释增强自主导航的语义分割 | Samuel O. Folorunsho, William R. Norris | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Human Shape and Clothing Estimation | 人体形状和服装估计 | Aayush Gupta, Aditya Gulati, Himanshu, Lakshya LNU | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Multistatic-Radar RCS-Signature Recognition of Aerial Vehicles: A Bayesian Fusion Approach | 飞行器多基地雷达 RCS 签名识别:贝叶斯融合方法 | Michael Potter, Murat Akcakaya, Marius Necsoiu, Gunar Schirner, Deniz Erdogmus, Tales Imbiriba | arxiv.org/pdf/2402.17… | null |
| 2024-02-28 | Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks | 通过辅助对抗性防御网络增强跟踪鲁棒性 | Zhewei Wu, Ruilong Yu, Qihe Liu, Shuying Cheng, Shilin Qiu, Shijie Zhou | arxiv.org/pdf/2402.17… | null |
| 2024-02-28 | From Generalization to Precision: Exploring SAM for Tool Segmentation in Surgical Environments | 从泛化到精确:探索 SAM 在手术环境中的工具分割 | Kanyifeechukwu J. Oguine, Roger D. Soberanis-Mukul, Nathan Drenkow, Mathias Unberath | arxiv.org/pdf/2402.17… | null |
| 2024-02-28 | Rapid hyperspectral photothermal mid-infrared spectroscopic imaging from sparse data for gynecologic cancer tissue subtyping | 利用稀疏数据进行快速高光谱光热中红外光谱成像,用于妇科癌症组织亚型分析 | Reza Reihanisaransari, Chalapathi Charan Gajjela, Xinyu Wu, Ragib Ishrak, Sara Corvigno, Yanping Zhong, Jinsong Liu, Anil K. Sood, David Mayerich, Sebastian Berisha, et al. | arxiv.org/pdf/2402.17… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-28 | Attentive Illumination Decomposition Model for Multi-Illuminant White Balancing | 用于多光源白平衡的关注照明分解模型 | Dongyoung Kim, Jinwoo Kim, Junsang Yu, Seon Joo Kim | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Self-Supervised Spatially Variant PSF Estimation for Aberration-Aware Depth-from-Defocus | 用于像差感知散焦深度的自监督空间变异 PSF 估计 | Zhuofeng Wu, Yusuke Monno, Masatoshi Okutomi | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | NiteDR: Nighttime Image De-Raining with Cross-View Sensor Cooperative Learning for Dynamic Driving Scenes | NiteDR:夜间图像除雨与交叉视角传感器协作学习动态驾驶场景 | Cidan Shi, Lihuang Fang, Han Wu, Xiaoyu Xian, Yukai Shi, Liang Lin | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging | 被动快照编码孔径双像素 RGB-D 成像 | Bhargav Ghanekar, Salman Siddique Khan, Vivek Boominathan, Pranav Sharma, Shreyas Singh, Kaushik Mitra, Ashok Veeraraghavan | arxiv.org/pdf/2402.18… | null |
LLM
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-28 | From Summary to Action: Enhancing Large Language Models for Complex Tasks with Open World APIs | 从总结到行动:使用开放世界 API 增强复杂任务的大型语言模型 | Yulong Liu, Yunlong Yuan, Chunwei Wang, Jianhua Han, Yongqiang Ma, Li Zhang, Nanning Zheng, Hang Xu | arxiv.org/pdf/2402.18… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-28 | Attention-Propagation Network for Egocentric Heatmap to 3D Pose Lifting | 用于以自我为中心的热图到 3D 姿势提升的注意力传播网络 | Taeho Kang, Youngki Lee | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | SFTformer: A Spatial-Frequency-Temporal Correlation-Decoupling Transformer for Radar Echo Extrapolation | SFTformer:用于雷达回波外推的空间-频率-时间相关解耦 Transformer | Liangyu Xu, Wanxuan Lu, Hongfeng Yu, Fanglong Yao, Xian Sun, Kun Fu | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Representing 3D sparse map points and lines for camera relocalization | 表示 3D 稀疏地图点和线以进行相机重新定位 | Bach-Thuan Bui, Huy-Hoang Bui, Dinh-Tuan Tran, Joo-Ho Lee | arxiv.org/pdf/2402.18… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-28 | Selection of appropriate multispectral camera exposure settings and radiometric calibration methods for applications in phenotyping and precision agriculture | 选择适当的多光谱相机曝光设置和辐射校准方法,用于表型分析和精准农业 | Vaishali Swaminathan, J. Alex Thomasson, Robert G. Hardin, Nithya Rajan | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Windowed-FourierMixer: Enhancing Clutter-Free Room Modeling with Fourier Transform | Windowed-FourierMixer:通过傅里叶变换增强整洁的房间建模 | Bruno Henriques, Benjamin Allaert, Jean-Philippe Vandeborre | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | 3DSFLabelling: Boosting 3D Scene Flow Estimation by Pseudo Auto-labelling | 3DSFLabelling:通过伪自动标记增强 3D 场景流估计 | Chaokang Jiang, Guangming Wang, Jiuming Liu, Hesheng Wang, Zhuang Ma, Zhenqiang Liu, Zhujin Liang, Yi Shan, Dalong Du | arxiv.org/pdf/2402.18… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-28 | Unsupervised Cross-Domain Image Retrieval via Prototypical Optimal Transport | 通过原型最优传输进行无监督跨域图像检索 | Bin Li, Ye Shi, Qian Yu, Jingya Wang | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Generalizable Two-Branch Framework for Image Class-Incremental Learning | 图像类增量学习的可推广二分支框架 | Chao Wu, Xiaobin Chang, Ruixuan Wang | arxiv.org/pdf/2402.18… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-28 | IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding | IBD:通过图像偏向解码减轻大型视觉语言模型中的幻觉 | Lanyun Zhu, Deyi Ji, Tianrun Chen, Peng Xu, Jieping Ye, Jun Liu | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | A Cognitive Evaluation Benchmark of Image Reasoning and Description for Large Vision Language Models | 大视觉语言模型图像推理与描述的认知评估基准 | Xiujie Song, Mengyue Wu, Kenny Q. Zhu, Chunhao Zhang, Yanyi Chen | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Probabilistic Bayesian optimal experimental design using conditional normalizing flows | 使用条件归一化流的概率贝叶斯最优实验设计 | Rafael Orozco, Felix J. Herrmann, Peng Chen | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Location-guided Head Pose Estimation for Fisheye Image | 鱼眼图像的位置引导头部姿势估计 | Bing Li, Dong Zhang, Cheng Huang, Yun Xian, Ming Li, Dah-Jye Lee | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | NERV++: An Enhanced Implicit Neural Video Representation | NERV++:增强的隐式神经视频表示 | Ahmed Ghorbel, Wassim Hamidouche, Luce Morin | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Development of Context-Sensitive Formulas to Obtain Constant Luminance Perception for a Foreground Object in Front of Backgrounds of Varying Luminance | 开发上下文相关公式以获得变化亮度背景下前景物体的恒定亮度感知 | Ergun Akleman, Bekir Tevfik Akgun, Adil Alpkocak | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Region-Aware Exposure Consistency Network for Mixed Exposure Correction | 用于混合曝光校正的区域感知曝光一致性网络 | Jin Liu, Huiyuan Fu, Chuanming Wang, Huadong Ma | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Learning Invariant Inter-pixel Correlations for Superpixel Generation | 学习超像素生成的不变像素间相关性 | Sen Xu, Shikui Wei, Tao Ruan, Lixin Liao | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Misalignment-Robust Frequency Distribution Loss for Image Transformation | 图像变换的失准鲁棒频率分布损失 | Zhangkai Ni, Juncheng Wu, Zian Wang, Wenhan Yang, Hanli Wang, Lin Ma | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Reflection Removal Using Recurrent Polarization-to-Polarization Network | 使用循环偏振到偏振网络去除反射 | Wenjiao Bian, Yusuke Monno, Masatoshi Okutomi | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Digging Into Normal Incorporated Stereo Matching | 深入研究融合法线的立体匹配 | Zihua Liu, Songyan Zhang, Zhicheng Wang, Masatoshi Okutomi | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Boosting Neural Representations for Videos with a Conditional Decoder | 使用条件解码器增强视频的神经表示 | Xinjie Zhang, Ren Yang, Dailan He, Xingtong Ge, Tongda Xu, Yan Wang, Hongwei Qin, Jun Zhang | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Learning to Deblur Polarized Images | 学习去模糊偏振图像 | Chu Zhou, Minggui Teng, Xinyu Zhou, Chao Xu, Boxin Sh | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization | 使用多级优化的掩蔽自动编码器中的下游任务引导掩蔽学习 | Han Guo, Ramtin Hosseini, Ruiyi Zhang, Sai Ashish Somayajula, Ranak Roy Chowdhury, Rajesh K. Gupta, Pengtao Xie | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | G4G: A Generic Framework for High Fidelity Talking Face Generation with Fine-grained Intra-modal Alignment | G4G:具有细粒度模内对齐的高保真说话人脸生成的通用框架 | Juan Zhang, Jiahao Chen, Cheng Wang, Zhiwang Yu, Tangquan Qi, Di Wu | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Block and Detail: Scaffolding Sketch-to-Image Generation | 块和细节:脚手架草图到图像的生成 | Vishnu Sarukkai, Lu Yuan, Mia Tang, Maneesh Agrawala, Kayvon Fatahalian | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Six-Point Method for Multi-Camera Systems with Reduced Solution Space | 具有减少解空间的多摄像机系统的六点法 | Banglei Guan, Ji Zhao, Laurent Kneip | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | Fast and Interpretable 2D Homography Decomposition: Similarity-Kernel-Similarity and Affine-Core-Affine Transformations | 快速且可解释的 2D 单应性分解:相似性-核-相似性和仿射-核心-仿射变换 | Shen Cai, Zhanhao Wu, Lingxi Guo, Jiachun Wang, Siyu Zhang, Junchi Yan, Shuhan Shen | arxiv.org/pdf/2402.18… | null |
| 2024-02-28 | QN-Mixer: A Quasi-Newton MLP-Mixer Model for Sparse-View CT Reconstruction | QN-Mixer:用于稀疏视图 CT 重建的拟牛顿 MLP 混合器模型 | Ishak Ayad, Nicolas Larue, Maï K. Nguyen | arxiv.org/pdf/2402.17… | null |