| 2024-01-22 | Exploring Simple Open-Vocabulary Semantic Segmentation | 探索简单的开放词汇语义分割 | Zihang Lai | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | Connecting the Dots: Leveraging Spatio-Temporal Graph Neural Networks for Accurate Bangla Sign Language Recognition | 连接点:利用时空图神经网络进行准确的孟加拉手语识别 | Haz Sameen Shahgir, Khondker Salman Sayeed, Md Toki Tahmid, Tanjeem Azwad Zaman, Md. Zarif Ul Alam | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics | OK-Robot:集成机器人开放知识模型真正重要的是什么 | Peiqi Liu, Yaswanth Orru, Chris Paxton, Nur Muhammad Mahi Shafiullah, Lerrel Pinto | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | Broiler-Net: A Deep Convolutional Framework for Broiler Behavior Analysis in Poultry Houses | Broiler-Net:用于家禽舍中肉鸡行为分析的深度卷积框架 | Tahereh Zarrat Ehsan, Seyed Mehdi Mohtavipour | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | Semi-supervised segmentation of land cover images using nonlinear canonical correlation analysis with multiple features and t-SNE | 使用多特征非线性典型相关分析和 t-SNE 对土地覆盖图像进行半监督分割 | Hong Wei, James Xiao, Yichao Zhang, Xia Hong | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | Automated facial recognition system using deep learning for pain assessment in adults with cerebral palsy | 使用深度学习的自动面部识别系统对脑瘫成人患者进行疼痛评估 | Álvaro Sabater-Gárriz, F. Xavier Gaya-Morey, José María Buades-Rubio, Cristina Manresa Yee, Pedro Montoya, Inmaculada Riquelme | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | VRMN-bD: A Multi-modal Natural Behavior Dataset of Immersive Human Fear Responses in VR Stand-up Interactive Games | VRMN-bD:VR 单口互动游戏中沉浸式人类恐惧反应的多模态自然行为数据集 | He Zhang, Xinyang Li, Yuanxi Sun, Xinyi Fu, Christine Qiu, John M. Carroll | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | Out-of-Distribution Detection & Applications With Ablated Learned Temperature Energy | 具有消融学习温度能量的分布外检测和应用 | Will LeVine, Benjamin Pikus, Jacob Phillips, Berk Norman, Fernando Amat Gil, Sean Hendryx | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | DeepCERES: A Deep learning method for cerebellar lobule segmentation using ultra-high resolution multimodal MRI | DeepCERES:使用超高分辨率多模态 MRI 进行小脑小叶分割的深度学习方法 | Sergio Morell-Ortega, Marina Ruiz-Perez, Marien Gadea, Roberto Vivo-Hernando, Gregorio Rubio, Fernando Aparici, Mariam de la Iglesia-Vaya, Gwenaelle Catheline, Pierrick Coupé, José V. Manjón | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | CloSe: A 3D Clothing Segmentation Dataset and Model | CloSe:3D 服装分割数据集和模型 | Dimitrije Antić, Garvita Tiwari, Batuhan Ozcomlekci, Riccardo Marin, Gerard Pons-Moll | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | HomeRobot Open Vocabulary Mobile Manipulation Challenge 2023 Participant Report (Team KuzHum) | HomeRobot 开放词汇移动操作挑战赛 2023 参赛者报告(KuzHum 团队) | Volodymyr Kuzma, Vladyslav Humennyy, Ruslan Partsey | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling | 看、听、认:角色感知视听字幕 | Bruno Korbar, Jaesung Huh, Andrew Zisserman | arxiv.org/pdf/2401.12… | null |
| 2024-01-22 | A Saliency Enhanced Feature Fusion based multiscale RGB-D Salient Object Detection Network | 基于显着性增强特征融合的多尺度 RGB-D 显着目标检测网络 | Rui Huang, Qingyi Zhao, Yan Xing, Sihua Gao, Weifeng Xu, Yuxiang Zhang, Wei Fan | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Large receptive field strategy and important feature extraction strategy in 3D object detection | 3D物体检测中的大感受野策略和重要特征提取策略 | Leichao Cui, Xiuxian Li, Min Meng | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Evaluating the Feasibility of Standard Facial Expression Recognition in Individuals with Moderate to Severe Intellectual Disabilities | 评估标准面部表情识别对中度至重度智力障碍个体的可行性 | F. Xavier Gaya-Morey, Silvia Ramis, Jose M. Buades-Rubio, Cristina Manresa-Yee | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis | 检测-顺序-构造:一种基于树构造的分层文档结构分析方法 | Jiawei Wang, Kai Hu, Zhuoyao Zhong, Lei Sun, Qiang Huo | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | MOSformer: Momentum encoder-based inter-slice fusion transformer for medical image segmentation | MOSformer:用于医学图像分割的基于动量编码器的层间融合变压器 | De-Xing Huang, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Zhen-Qiu Feng, Mei-Jiang Gui, Hao Li, Tian-Yu Xiang, Xiu-Ling Liu, Zeng-Guang Hou | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | SignVTCL: Multi-Modal Continuous Sign Language Recognition Enhanced by Visual-Textual Contrastive Learning | SignVTCL:通过视觉文本对比学习增强多模式连续手语识别 | Hao Chen, Jiaze Wang, Ziyu Guo, Jinpeng Li, Donghao Zhou, Bian Wu, Chenyong Guan, Guangyong Chen, Pheng-Ann Heng | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Unveiling the Human-like Similarities of Automatic Facial Expression Recognition: An Empirical Exploration through Explainable AI | 揭示自动面部表情识别的类人相似性:通过可解释的人工智能进行实证探索 | F. Xavier Gaya-Morey, Silvia Ramis-Guarinos, Cristina Manresa-Yee, Jose M. Buades-Rubio | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Rethinking Centered Kernel Alignment in Knowledge Distillation | 重新思考知识蒸馏中的中心内核对齐 | Zikai Zhou, Yunhang Shen, Shitong Shao, Huanran Chen, Linrui Gong, Shaohui Lin | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Symbrain: A large-scale dataset of MRI images for neonatal brain symmetry analysis | Symbrain:用于新生儿大脑对称性分析的大规模 MRI 图像数据集 | Arnaud Gucciardi, Safouane El Ghazouali, Francesca Venturini, Vida Groznik, Umberto Michelucci | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | SemPLeS: Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation | SemPLeS:弱监督语义分割的语义提示学习 | Ci-Siang Lin, Chien-Yi Wang, Yu-Chiang Frank Wang, Min-Hung Chen | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Deep Learning for Computer Vision based Activity Recognition and Fall Detection of the Elderly: a Systematic Review | 基于计算机视觉的深度学习老年人活动识别和跌倒检测:系统综述 | F. Xavier Gaya-Morey, Cristina Manresa-Yee, Jose M. Buades-Rubio | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Collaborative Position Reasoning Network for Referring Image Segmentation | 用于参考图像分割的协作位置推理网络 | Jianjian Cao, Beiya Dai, Yulin Li, Xiameng Qin, Jingdong Wang | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Concealed Object Segmentation with Hierarchical Coherence Modeling | 使用分层一致性建模的隐藏对象分割 | Fengyang Xiao, Pan Zhang, Chunming He, Runze Hu, Yutao Liu | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models | EmerDiff:扩散模型中新兴的像素级语义知识 | Koichi Namekata, Amirmojtaba Sabour, Sanja Fidler, Seung Wook Kim | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | MetaSeg: Content-Aware Meta-Net for Omni-Supervised Semantic Segmentation | MetaSeg:用于全监督语义分割的内容感知元网络 | Shenwang Jiang, Jianan Li, Ying Wang, Wenxuan Wu, Jizhou Zhang, Bo Huang, Tingfa Xu | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Colorectal Polyp Segmentation in the Deep Learning Era: A Comprehensive Survey | 深度学习时代的结直肠息肉分割:综合调查 | Zhenyu Wu, Fengmao Lv, Chenglizhao Chen, Aimin Hao, Shuo Li | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Detecting Out-of-Distribution Samples via Conditional Distribution Entropy with Optimal Transport | 通过具有最佳传输的条件分布熵检测分布外样本 | Chuanwen Feng, Wenlong Chen, Ao Ke, Yilong Ren, Xike Xie, S. Kevin Zhou | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Augmenting Prototype Network with TransMix for Few-shot Hyperspectral Image Classification | 使用 TransMix 增强原型网络以实现少样本高光谱图像分类 | Chun Liu, Longwei Yang, Dongmei Dong, Zheng Li, Wei Yang, Zhigang Han, Jiayao Wang | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | SFC: Shared Feature Calibration in Weakly Supervised Semantic Segmentation | SFC:弱监督语义分割中的共享特征校准 | Xinqiao Zhao, Feilong Tang, Xiaoyang Wang, Jimin Xiao | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | MsSVT++: Mixed-scale Sparse Voxel Transformer with Center Voting for 3D Object Detection | MsSVT++:用于 3D 对象检测的具有中心投票功能的混合尺度稀疏体素变换器 | Jianan Li, Shaocong Dong, Lihe Ding, Tingfa Xu | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Medical Image Debiasing by Learning Adaptive Agreement from a Biased Council | 通过从有偏见的委员会学习自适应协议来消除医学图像偏见 | Luyang Luo, Xin Huang, Minghao Wang, Zhuoyue Wan, Hao Chen | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | EK-Net: Real-time Scene Text Detection with Expand Kernel Distance | EK-Net:扩展核距离的实时场景文本检测 | Boyuan Zhu, Fagui Liu, Xi Chen, Quan Tang | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Memory-Efficient Prompt Tuning for Incremental Histopathology Classification | 用于增量组织病理学分类的内存高效提示调整 | Yu Zhu, Kang Li, Lequan Yu, Pheng-Ann Heng | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | RTA-Former: Reverse Transformer Attention for Polyp Segmentation | RTA-Former:用于息肉分割的反向变压器注意力 | Zhikai Li, Murong Yi, Ali Uneri, Sihan Niu, Craig Jones | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition | ActionHub:用于零镜头动作识别的大规模动作视频描述数据集 | Jiaming Zhou, Junwei Liang, Kun-Yu Lin, Jinrui Yang, Wei-Shi Zheng | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition | M2-CLIP:视频动作识别的多模态、多任务适应框架 | Mengmeng Wang, Jiazheng Xing, Boyuan Jiang, Jun Chen, Jianbiao Mei, Xingxing Zuo, Guang Dai, Jingdong Wang, Yong Liu | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Friends Across Time: Multi-Scale Action Segmentation Transformer for Surgical Phase Recognition | 跨越时间的朋友:用于手术阶段识别的多尺度动作分段变压器 | Bokai Zhang, Jiayuan Meng, Bin Cheng, Dean Biskup, Svetlana Petculescu, Angela Chapman | arxiv.org/pdf/2401.11… | null |
| 2024-01-22 | Zoom-shot: Fast and Efficient Unsupervised Zero-Shot Transfer of CLIP to Vision Encoders with Multimodal Loss | Zoom-shot:快速高效的无监督零样本将 CLIP 传输到具有多模态损失的视觉编码器 | Jordan Shipard, Arnold Wiliem, Kien Nguyen Thanh, Wei Xiang, Clinton Fookes | arxiv.org/pdf/2401.11… | null |