[UPDATED!] 2024-02-27 (Publish Time)
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-27 | ShapeLLM: Universal 3D Object Understanding for Embodied Interaction | ShapeLLM:用于实体交互的通用 3D 对象理解 | Zekun Qi, Runpei Dong, Shaochen Zhang, Haoran Geng, Chunrui Han, Zheng Ge, Li Yi, Kaisheng Ma | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Adaptive quantization with mixed-precision based on low-cost proxy | 基于低成本代理的混合精度自适应量化 | Junzhe Chen, Qiao Yang, Senmao Tian, Shunli Zhang | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning | MCF-VC:减轻多模态视频字幕的类增量学习中的灾难性遗忘 | Huiyu Xiong, Lanxiao Wang, Heqian Qiu, Taijin Zhao, Benliu Qiu, Hongliang Li | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Neural Video Compression with Feature Modulation | 具有特征调制的神经视频压缩 | Jiahao Li, Bin Li, Yan Lu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Towards Robust and Efficient Cloud-Edge Elastic Model Adaptation via Selective Entropy Distillation | 通过选择性熵蒸馏实现稳健高效的云边缘弹性模型适应 | Yaofo Chen, Shuaicheng Niu, Shoukai Xu, Hengjie Song, Yaowei Wang, Mingkui Tan | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment | LiveHPS:自由环境下基于激光雷达的场景级人体姿态和形状估计 | Yiming Ren, Xiao Han, Chengfeng Zhao, Jingya Wang, Lan Xu, Jingyi Yu, Yuexin Ma | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Structural Teacher-Student Normality Learning for Multi-Class Anomaly Detection and Localization | 用于多类异常检测和定位的结构性师生正态学习 | Hanqiu Deng, Xingyu Li | arxiv.org/pdf/2402.17… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-27 | ADL4D: Towards A Contextually Rich Dataset for 4D Activities of Daily Living | ADL4D:为日常生活的 4D 活动建立上下文丰富的数据集 | Marsil Zakour, Partha Pratim Nath, Ludwig Lohmer, Emre Faik Gökçe, Martin Piccolrovazzi, Constantin Patsch, Yuankai Wu, Rahul Chaudhari, Eckehard Steinbach | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | VRP-SAM: SAM with Visual Reference Prompt | VRP-SAM:带有视觉参考提示的 SAM | Yanpeng Sun, Jiahui Chen, Shan Zhang, Xinyu Zhang, Qiang Chen, Gang Zhang, Errui Ding, Jingdong Wang, Zechao Li | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation | MedContext:学习上下文线索以实现高效的体积医学分割 | Hanan Gani, Muzammal Naseer, Fahad Khan, Salman Khan | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | SDF2Net: Shallow to Deep Feature Fusion Network for PolSAR Image Classification | SDF2Net:用于 PolSAR 图像分类的浅层到深层特征融合网络 | Mohammed Q. Alkhatib, M. Sami Zitouni, Mina Al-Saad, Nour Aburaed, Hussain Al-Ahmad | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Mitigating Distributional Shift in Semantic Segmentation via Uncertainty Estimation from Unlabelled Data | 通过未标记数据的不确定性估计来减轻语义分割中的分布变化 | David S. W. Williams, Daniele De Martini, Matthew Gadd, Paul Newman | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling | Masked Gamma-SSL:通过掩模图像建模学习不确定性估计 | David S. W. Williams, Matthew Gadd, Paul Newman, Daniele De Martini | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Adapt Before Comparison: A New Perspective on Cross-Domain Few-Shot Segmentation | 先适应后比较:跨域少样本分割的新视角 | Jonas Herzog | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | A Large-scale Evaluation of Pretraining Paradigms for the Detection of Defects in Electroluminescence Solar Cell Images | 用于检测电致发光太阳能电池图像缺陷的预训练范式的大规模评估 | David Torpey, Lawrence Pratt, Richard Klein | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation with Its Class Label | Scribble 隐藏类:利用其类标签促进基于 Scribble 的弱监督语义分割 | Xinliang Zhang, Lei Zhu, Hangzhou He, Lujia Jin, Yanye Lu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Robust Unsupervised Crowd Counting and Localization with Adaptive Resolution SAM | 具有自适应分辨率 SAM 的稳健无监督人群计数和定位 | Jia Wan, Qiangqiang Wu, Wei Lin, Antoni B. Chan | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | FedLPPA: Learning Personalized Prompt and Aggregation for Federated Weakly-supervised Medical Image Segmentation | FedLPPA:学习联合弱监督医学图像分割的个性化提示和聚合 | Li Lin, Yixiang Liu, Jiewei Wu, Pujin Cheng, Zhiyuan Cai, Kenneth K. Y. Wong, Xiaoying Tang | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Automated Classification of Phonetic Segments in Child Speech Using Raw Ultrasound Imaging | 使用原始超声成像对儿童语音中的语音片段进行自动分类 | Saja Al Ani, Joanne Cleland, Ahmed Zoha | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Model X-ray:Detect Backdoored Models via Decision Boundary | 模型 X 射线:通过决策边界检测后门模型 | Yanghao Su, Jie Zhang, Ting Xu, Tianwei Zhang, Weiming Zhang, Nenghai Yu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Segment anything model for head and neck tumor segmentation with CT, PET and MRI multi-modality images | 使用 CT、PET 和 MRI 多模态图像分割任何头颈部肿瘤模型 | Jintao Ren, Mathis Rasmussen, Jasper Nijkamp, Jesper Grau Eriksen, Stine Korreman | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | ViTaL: An Advanced Framework for Automated Plant Disease Identification in Leaf Images Using Vision Transformers and Linear Projection For Feature Reduction | ViTaL:使用视觉变换器和线性投影进行特征缩减的叶子图像中自动植物病害识别的高级框架 | Abhishek Sebastian, Annis Fathima A, Pragna R, Madhan Kumar S, Yaswanth Kannan G, Vinay Murali | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | PANDAS: Prototype-based Novel Class Discovery and Detection | PANDAS:基于原型的新类发现和检测 | Tyler L. Hayes, César R. de Souza, Namil Kim, Jiwon Kim, Riccardo Volpi, Diane Larlus | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification | CARZero:放射学零样本分类的交叉注意力对齐 | Haoran Lai, Qingsong Yao, Zihang Jiang, Rongsheng Wang, Zhiyang He, Xiaodong Tao, S. Kevin Zhou | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | An Efficient MLP-based Point-guided Segmentation Network for Ore Images with Ambiguous Boundary | 一种基于MLP的高效的边界模糊矿石图像点引导分割网络 | Guodong Sun, Yuting Peng, Le Cheng, Mengya Xu, An Wang, Bo Wu, Hongliang Ren, Yang Zhang | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection | SDDGR:用于类增量对象检测的稳定的基于扩散的深度生成重放 | Junsu Kim, Hoseong Cho, Jihyeon Kim, Yihalem Yimolal Tiruneh, Seungryul Baek | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | A Vanilla Multi-Task Framework for Dense Visual Prediction Solution to 1st VCL Challenge -- Multi-Task Robustness Track | 用于第一届 VCL 挑战赛密集视觉预测解决方案的普通多任务框架——多任务鲁棒性赛道 | Zehui Chen, Qiuchen Wang, Zhenyu Li, Jiaming Liu, Shanghang Zhang, Feng Zhao | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Scaling Supervised Local Learning with Augmented Auxiliary Networks | 使用增强辅助网络扩展监督本地学习 | Chenxiang Ma, Jibin Wu, Chenyang Si, Kay Chen Tan | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | How we won BraTS 2023 Adult Glioma challenge? Just faking it! Enhanced Synthetic Data Augmentation and Model Ensemble for brain tumour segmentation | 我们如何赢得 BraTS 2023 成人胶质瘤挑战赛?只是假装而已!用于脑肿瘤分割的增强型合成数据增强和模型集成 | André Ferreira, Naida Solak, Jianning Li, Philipp Dammann, Jens Kleesiek, Victor Alves, Jan Egger | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Explicit Interaction for Fusion-Based Place Recognition | 基于融合的地点识别的显式交互 | Jingyi Xu, Junyi Ma, Qi Wu, Zijie Zhou, Yue Wang, Xieyuanli Chen, Ling Pei | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Deep Learning-Based Speech and Vision Synthesis to Improve Phishing Attack Detection through a Multi-layer Adaptive Framework | 基于深度学习的语音和视觉合成通过多层自适应框架改进网络钓鱼攻击检测 | Tosin Ige, Christopher Kiekintveld, Aritran Piplai | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging | SDR-Former:使用 3D 多相成像进行肝脏病变分类的连体双分辨率变压器 | Meng Lou, Hanning Ying, Xiaoqing Liu, Hong-Yu Zhou, Yuqing Zhang, Yizhou Yu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Preserving Fairness Generalization in Deepfake Detection | 在 Deepfake 检测中保持公平泛化 | Li Lin, Xinan He, Yan Ju, Xin Wang, Feng Ding, Shu Hu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Deployment Prior Injection for Run-time Calibratable Object Detection | 用于运行时可校准对象检测的部署预注入 | Mo Zhou, Yiding Yang, Haoxiang Li, Vishal M. Patel, Gang Hua | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | PE-MVCNet: Multi-view and Cross-modal Fusion Network for Pulmonary Embolism Prediction | PE-MVCNet:用于肺栓塞预测的多视图和跨模态融合网络 | Zhaoxin Guo, Zhipeng Wang, Ruiquan Ge, Jianxun Yu, Feiwei Qin, Yuan Tian, Yuqing Peng, Yonghong Li, Changmiao Wang | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Lane2Seq: Towards Unified Lane Detection via Sequence Generation | Lane2Seq:通过序列生成实现统一车道检测 | Kunyang Zhou | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Few-shot adaptation for morphology-independent cell instance segmentation | 与形态无关的细胞实例分割的少样本适应 | Ram J. Zaveri, Voke Brume, Gianfranco Doretto | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | NocPlace: Nocturnal Visual Place Recognition Using Generative and Inherited Knowledge Transfer | NocPlace:使用生成和遗传知识转移进行夜间视觉地点识别 | Bingxi Liu, Yiqun Wang, Huaqi Tao, Tingjun Huang, Fulin Tang, Yihong Wu, Jinqiang Cui, Hong Zhang | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Efficiently Leveraging Linguistic Priors for Scene Text Spotting | 有效利用语言先验进行场景文本识别 | Nguyen Nguyen, Yapeng Tian, Chenliang Xu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | SAM-DiffSR: Structure-Modulated Diffusion Model for Image Super-Resolution | SAM-DiffSR:用于图像超分辨率的结构调制扩散模型 | Chengcheng Wang, Zhiwei Hao, Yehui Tang, Jianyuan Guo, Yujie Yang, Kai Han, Yunhe Wang | arxiv.org/pdf/2402.17… | link |
| 2024-02-27 | OSCaR: Object State Captioning and State Change Representation | OSCaR:对象状态描述和状态变化表示 | Nguyen Nguyen, Jing Bi, Ali Vosoughi, Yapeng Tian, Pooyan Fazli, Chenliang Xu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | In Defense and Revival of Bayesian Filtering for Thermal Infrared Object Tracking | 热红外物体跟踪贝叶斯过滤的防御和复兴 | Peng Gao, Shi-Min Li, Feng Gao, Fei Wang, Ru-Yue Yuan, Hamido Fujita | arxiv.org/pdf/2402.17… | null |
OCR
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-27 | Advancing Generative Model Evaluation: A Novel Algorithm for Realistic Image Synthesis and Comparison in OCR System | 推进生成模型评估:一种用于 OCR 系统中真实图像合成和比较的新算法 | Majid Memari, Khaled R. Ahmed, Shahram Rahimi, Noorbakhsh Amiri Golilarz | arxiv.org/pdf/2402.17… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-27 | CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention | CAD-SIGNet:使用分层草图实例引导注意力从点云进行 CAD 语言推理 | Mohammad Sadil Khan, Elona Dupont, Sk Aziz Ali, Kseniya Cherenkova, Anis Kacem, Djamila Aouada | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Structure-Guided Adversarial Training of Diffusion Models | 结构引导的扩散模型对抗训练 | Ling Yang, Haotian Qian, Zhilong Zhang, Jingwei Liu, Bin Cui | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | An Empirical Study of the Generalization Ability of Lidar 3D Object Detectors to Unseen Domains | 激光雷达 3D 物体探测器对不可见领域的泛化能力的实证研究 | George Eskandar, Chongzhe Zhang, Abhishek Kaushik, Karim Guirguis, Mohamed Sayed, Bin Yang | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Diffusion Model-Based Image Editing: A Survey | 基于扩散模型的图像编辑:调查 | Yi Huang, Jiancheng Huang, Yifan Liu, Mingfu Yan, Jiaxi Lv, Jianzhuang Liu, Wei Xiong, He Zhang, Shifeng Chen, Liangliang Cao | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Interactive Multi-Head Self-Attention with Linear Complexity | 具有线性复杂度的交互式多头自注意力 | Hankyul Kang, Ming-Hsuan Yang, Jongbin Ryu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Bit Rate Matching Algorithm Optimization in JPEG-AI Verification Model | JPEG-AI验证模型中的比特率匹配算法优化 | Panqi Jia, A. Burakhan Koyuncu, Jue Mao, Ze Cui, Yi Ma, Tiansheng Guo, Timofey Solovyev, Alexander Karabutov, Yin Zhao, Jing Wang, et al. | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Leveraging Enhanced Queries of Point Sets for Vectorized Map Construction | 利用增强的点集查询进行矢量化地图构建 | Zihao Liu, Xiaoyu Zhang, Guangwei Liu, Ji Zhao, Ningyi Xu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | LSPT: Long-term Spatial Prompt Tuning for Visual Representation Learning | LSPT:视觉表示学习的长期空间提示调整 | Shentong Mo, Yansen Wang, Xufang Luo, Dongsheng Li | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | CAPT: Category-level Articulation Estimation from a Single Point Cloud Using Transformer | CAPT:使用 Transformer 从单点云进行类别级清晰度估计 | Lian Fu, Ryoichi Ishikawa, Yoshihiro Sato, Takeshi Oishi | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Image-Text Matching with Multi-View Attention | 具有多视图注意力的图像文本匹配 | Rui Cheng, Wanqing Cui | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology | 特征重新嵌入:迈向计算病理学的基础模型级性能 | Wenhao Tang, Fengtao Zhou, Sheng Huang, Xiang Zhu, Yi Zhang, Bo Liu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization | CharacterGen:通过多视图姿势规范化从单张图像高效生成 3D 角色 | Hao-Yang Peng, Jia-Peng Zhang, Meng-Hao Guo, Yan-Pei Cao, Shi-Min Hu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | CharNeRF: 3D Character Generation from Concept Art | CharNeRF:从概念艺术生成 3D 角色 | Eddy Chu, Yiyang Chen, Chedy Raissi, Anand Bhojan | arxiv.org/pdf/2402.17… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-27 | LoDIP: Low light phase retrieval with deep image prior | LoDIP:具有深度图像先验的低光相位检索 | Raunak Manekar, Elisa Negrini, Minh Pham, Daniel Jacobs, Jaideep Srivastava | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Analyzing Regional Organization of the Human Hippocampus in 3D-PLI Using Contrastive Learning and Geometric Unfolding | 使用对比学习和几何展开在 3D-PLI 中分析人类海马的区域组织 | Alexander Oberstrass, Jordan DeKraker, Nicola Palomero-Gallagher, Sascha E. A. Muenzing, Alan C. Evans, Markus Axer, Katrin Amunts, Timo Dickscheid | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | PHNet: Patch-based Normalization for Portrait Harmonization | PHNet:基于补丁的标准化肖像协调 | Karen Efremyan, Elizaveta Petrova, Evgeny Kaskov, Alexander Kapitanov | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Point Cloud Analysis | AVS-Net:采用自适应体素大小的点采样进行 3D 点云分析 | Hongcheng Yang, Dingkang Liang, Dingyuan Zhang, Xingyu Jiang, Zhe Liu, Zhikang Zou, Yingying Zhu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions | EMO:Emote Portrait Alive - 在弱条件下使用音视频扩散模型生成富有表现力的肖像视频 | Linrui Tian, Qi Wang, Bang Zhang, Liefeng Bo | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Generative 3D Part Assembly via Part-Whole-Hierarchy Message Passing | 通过部分-整体-层次结构消息传递生成 3D 零件组装 | Bi'an Du, Xiang Gao, Wei Hu, Renjie Liao | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction | VastGaussian:用于大型场景重建的 Vast 3D 高斯 | Jiaqi Lin, Zhihao Li, Xiao Tang, Jianzhuang Liu, Shiyong Liu, Jiayue Liu, Yangdi Lu, Xiaofei Wu, Songcen Xu, Youliang Yan, et al. | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Model | DiffuseKronA:个性化扩散模型的参数高效微调方法 | Shyam Marjit, Harshit Singh, Nityanand Mathur, Sayak Paul, Chia-Mu Yu, Pin-Yu Chen | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Sora Generates Videos with Stunning Geometrical Consistency | Sora 生成具有令人惊叹的几何一致性的视频 | Xuanyi Li, Daquan Zhou, Chenxu Zhang, Shaodong Wei, Qibin Hou, Ming-Ming Cheng | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Coupled Laplacian Eigenmaps for Locally-Aware 3D Rigid Point Cloud Matching | 用于局部感知 3D 刚性点云匹配的耦合拉普拉斯特征图 | Matteo Bastico, Etienne Decencière, Laurent Corté, Yannick Tillier, David Ryckelynck | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis | 学习动态四面体以实现高质量的头部说话合成 | Zicheng Zhang, Ruobing Zheng, Ziwen Liu, Congying Han, Tianqi Li, Meng Wang, Tiande Guo, Jingdong Chen, Bonan Li, Ming Yang | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | ICP-Flow: LiDAR Scene Flow Estimation with ICP | ICP-Flow:利用 ICP 进行 LiDAR 场景流量估计 | Yancong Lin, Holger Caesar | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Denoising Diffusion Models for Inpainting of Healthy Brain Tissue | 用于修复健康脑组织的去噪扩散模型 | Alicia Durrer, Philippe C. Cattin, Julia Wolleb | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | DivAvatar: Diverse 3D Avatar Generation with a Single Prompt | DivAvatar:通过单一提示生成多样化的 3D 头像 | Weijing Tao, Biwen Lei, Kunhao Liu, Shijian Lu, Miaomiao Cui, Xuansong Xie, Chunyan Miao | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Enhancing Hyperspectral Images via Diffusion Model and Group-Autoencoder Super-resolution Network | 通过扩散模型和组自动编码器超分辨率网络增强高光谱图像 | Zhaoyang Wang, Dongyang Li, Mingyang Zhang, Hao Luo, Maoguo Gong | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Differentiable Biomechanics Unlocks Opportunities for Markerless Motion Capture | 可微分生物力学为无标记运动捕捉带来机遇 | R. James Cotton | arxiv.org/pdf/2402.17… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-27 | Opening Cabinets and Drawers in the Real World using a Commodity Mobile Manipulator | 使用商品移动机械手打开现实世界中的橱柜和抽屉 | Arjun Gupta, Michelle Zhang, Rishik Sathua, Saurabh Gupta | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | PLReMix: Combating Noisy Labels with Pseudo-Label Relaxed Contrastive Representation Learning | PLReMix:用伪标签松弛对比表示学习对抗噪声标签 | Xiaoyu Liu, Beitong Zhou, Cheng Cheng | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Demonstrating and Reducing Shortcuts in Vision-Language Representation Learning | 展示和减少视觉语言表征学习的捷径 | Maurits Bleeker, Mariya Hendriksen, Andrew Yates, Maarten de Rijke | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | MGE: A Training-Free and Efficient Model Generation and Enhancement Scheme | MGE:免训练的高效模型生成和增强方案 | Xuan Wang, Zeshan Pang, Yuliang Lu, Xuehu Yan | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | ArcSin: Adaptive ranged cosine Similarity injected noise for Language-Driven Visual Tasks | ArcSin:用于语言驱动视觉任务的自适应范围余弦相似度注入噪声 | Yang Liu, Xiaomin Yu, Gongyu Zhang, Christos Bergeles, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Context-based and Diversity-driven Specificity in Compositional Zero-Shot Learning | 组合零样本学习中基于上下文和多样性驱动的特异性 | Yun Li, Zhe Liu, Hang Chen, Lina Yao | arxiv.org/pdf/2402.17… | null |
Others
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-27 | Diffusion Meets DAgger: Supercharging Eye-in-hand Imitation Learning | Diffusion 遇上 DAgger:增强手眼模仿学习 | Xiaoyu Zhang, Matthew Chang, Pranav Kumar, Saurabh Gupta | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Towards Fairness-Aware Adversarial Learning | 迈向公平意识的对抗性学习 | Yanghao Zhang, Tianle Zhang, Ronghui Mu, Xiaowei Huang, Wenjie Ruan | arxiv.org/pdf/2402.17… | link |
| 2024-02-27 | Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners | 视觉和听觉:具有扩散潜在对准器的开放域视音频生成 | Yazhou Xing, Yingqing He, Zeyue Tian, Xintao Wang, Qifeng Chen | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Bayesian Differentiable Physics for Cloth Digitalization | 用于布料数字化的贝叶斯微分物理 | Deshan Gong, Ningtao Mao, He Wang | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | CustomSketching: Sketch Concept Extraction for Sketch-based Image Synthesis and Editing | CustomSketching:基于草图的图像合成和编辑的草图概念提取 | Chufeng Xiao, Hongbo Fu | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web | OmniACT:为桌面和 Web 启用多模式通才自治代理的数据集和基准 | Raghav Kapoor, Yash Parag Butala, Melisa Russak, Jing Yu Koh, Kiran Kamble, Waseem Alshikh, Ruslan Salakhutdinov | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Adapting Learned Image Codecs to Screen Content via Adjustable Transformations | 通过可调整的转换使学习的图像编解码器适应屏幕内容 | H. Burak Dogaroglu, A. Burakhan Koyuncu, Atanas Boev, Elena Alshina, Eckehard Steinbach | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control | 具有概率扩展控制的多模态学习稀疏检索 | Thong Nguyen, Mariya Hendriksen, Andrew Yates, Maarten de Rijke | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Black-box Adversarial Attacks Against Image Quality Assessment Models | 针对图像质量评估模型的黑盒对抗攻击 | Yu Ran, Ao-Xiang Zhang, Mingjie Li, Weixuan Tang, Yuan-Gen Wang | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis | AlignMiF:用于 LiDAR-相机联合合成的几何对齐多模态隐式场 | Tao Tang, Guangrun Wang, Yixing Lao, Peng Chen, Jie Liu, Liang Lin, Kaicheng Yu, Xiaodan Liang | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Bit Distribution Study and Implementation of Spatial Quality Map in the JPEG-AI Standardization | JPEG-AI标准化中空间质量图的比特分布研究与实现 | Panqi Jia, Jue Mao, Esin Koyuncu, A. Burakhan Koyuncu, Timofey Solovyev, Alexander Karabutov, Yin Zhao, Elena Alshina, Andre Kaup | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | V2C-Long: Longitudinal Cortex Reconstruction with Spatiotemporal Correspondence | V2C-Long:具有时空对应性的纵向皮层重建 | Fabian Bongratz, Jan Fecht, Anne-Marie Rickmann, Christian Wachinger | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | A novel image space formalism of Fourier domain interpolation neural networks for noise propagation analysis | 用于噪声传播分析的傅里叶域插值神经网络的新颖图像空间形式 | Peter Dawood, Felix Breuer, Istvan Homolya, Jannik Stebani, Maximilian Gram, Peter M. Jakob, Moritz Zaiss, Martin Blaimer | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Accelerating Diffusion Sampling with Optimized Time Steps | 通过优化时间步长加速扩散采样 | Shuchen Xue, Zhaoqiang Liu, Fei Chen, Shifeng Zhang, Tianyang Hu, Enze Xie, Zhenguo Li | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | SocialCVAE: Predicting Pedestrian Trajectory via Interaction Conditioned Latents | SocialCVAE:通过交互条件潜伏预测行人轨迹 | Wei Xiang, Haoteng Yin, He Wang, Xiaogang Jin | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Method of Tracking and Analysis of Fluorescent-Labeled Cells Using Automatic Thresholding and Labeling | 使用自动阈值和标记跟踪和分析荧光标记细胞的方法 | Mizuki Fukasawa, Tomokazu Fukuda, Takuya Akashi | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Learning Exposure Correction in Dynamic Scenes | 学习动态场景中的曝光校正 | Jin Liu, Bo Wang, Chuanming Wang, Huiyuan Fu, Huadong Ma | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | An Interpretable Evaluation of Entropy-based Novelty of Generative Models | 基于熵的生成模型新颖性的可解释评估 | Jingwei Zhang, Cheuk Ting Li, Farzan Farnia | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | One-Shot Structure-Aware Stylized Image Synthesis | 一次性结构感知风格化图像合成 | Hansam Cho, Jonghyun Lee, Seunggyu Chang, Yonghyun Jeong | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation | Playground v2.5:增强文本到图像生成美学质量的三个见解 | Daiqing Li, Aleks Kamko, Ehsan Akhgari, Ali Sabet, Linmiao Xu, Suhail Doshi | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | VCD: Knowledge Base Guided Visual Commonsense Discovery in Images | VCD:知识库引导图像中的视觉常识发现 | Xiangqing Shen, Yurun Song, Siwei Wu, Rui Xia | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Purified and Unified Steganographic Network | 纯净统一的隐写网络 | Guobiao Li, Sheng Li, Zicong Luo, Zhenxing Qian, Xinpeng Zhang | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Enhancing Quality of Compressed Images by Mitigating Enhancement Bias Towards Compression Domain | 通过减轻压缩域的增强偏差来增强压缩图像的质量 | Qunliang Xing, Mai Xu, Shengxi Li, Xin Deng, Meisong Zheng, Huaida Liu, Ying Chen | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models | Sora:大视觉模型的背景、技术、局限性和机遇回顾 | Yixin Liu, Kai Zhang, Yuan Li, Zhiling Yan, Chujie Gao, Ruoxi Chen, Zhengqing Yuan, Yue Huang, Hanchi Sun, Jianfeng Gao, et al. | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Deep Umbra: A Generative Approach for Sunlight Access Computation in Urban Spaces | Deep Umbra:城市空间中阳光照射计算的生成方法 | Kazi Shahrukh Omar, Gustavo Moreira, Daniel Hodczak, Maryam Hosseini, Nicola Colaninno, Marcos Lage, Fabio Miranda | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Video as the New Language for Real-World Decision Making | 视频作为现实世界决策的新语言 | Sherry Yang, Jacob Walker, Jack Parker-Holder, Yilun Du, Jake Bruce, Andre Barreto, Pieter Abbeel, Dale Schuurmans | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | Transparent Image Layer Diffusion using Latent Transparency | 使用潜在透明度的透明图像层扩散 | Lvmin Zhang, Maneesh Agrawala | arxiv.org/pdf/2402.17… | null |
| 2024-02-27 | T-HITL Effectively Addresses Problematic Associations in Image Generation and Maintains Overall Visual Quality | T-HITL 有效解决图像生成中的问题关联并保持整体视觉质量 | Susan Epstein, Li Chen, Alessandro Vecchiato, Ankit Jain | arxiv.org/pdf/2402.17… | null |