[UPDATED!] 2024-02-14 (Publish Time)
Generative Models
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-14 | Magic-Me: Identity-Specific Video Customized Diffusion | Magic-Me:针对特定身份的视频定制扩散 | Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, Xue-She Wang, Xiuyu Li, Huanrui Yang, Zhen Dong, Kurt Keutzer, Jiashi Feng | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Synthesizing Knowledge-enhanced Features for Real-world Zero-shot Food Detection | 综合知识增强特征以实现现实世界的零样本食品检测 | Pengfei Zhou, Weiqing Min, Jiajun Song, Yang Zhang, Shuqiang Jiang | arxiv.org/pdf/2402.09… | link |
2024-02-14 | Semi-Supervised Diffusion Model for Brain Age Prediction | 用于脑年龄预测的半监督扩散模型 | Ayodeji Ijishakin, Sophie Martin, Florence Townend, Federica Agosta, Edoardo Gioele Spinelli, Silvia Basaia, Paride Schito, Yuri Falzone, Massimo Filippi, James Cole, et al. | arxiv.org/pdf/2402.09… | null |
2024-02-14 | DestripeCycleGAN: Stripe Simulation CycleGAN for Unsupervised Infrared Image Destriping | DestripeCycleGAN:用于无监督红外图像去条纹的条纹模拟 CycleGAN | Shiqi Yang, Hanlin Qin, Shuai Yuan, Xiang Yan, Hossein Rahmani | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Towards Realistic Landmark-Guided Facial Video Inpainting Based on GANs | 基于 GAN 的逼真地标引导面部视频修复 | Fatemeh Ghorbani Lohesara, Karen Egiazarian, Sebastian Knorr | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Extreme Video Compression with Pre-trained Diffusion Models | 使用预先训练的扩散模型进行极限视频压缩 | Bohan Li, Yiming Liu, Xueyan Niu, Bo Bai, Lei Deng, Deniz Gündüz | arxiv.org/pdf/2402.08… | link |
Multimodal
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-14 | MultiMedEval: A Benchmark and a Toolkit for Evaluating Medical Vision-Language Models | MultiMedEval:评估医学视觉语言模型的基准和工具包 | Corentin Royer, Bjoern Menze, Anjany Sekuboyina | arxiv.org/pdf/2402.09… | null |
2024-02-14 | OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM | OmniMedVQA:医疗 LVLM 的新型大规模综合评估基准 | Yutao Hu, Tianbin Li, Quanfeng Lu, Wenqi Shao, Junjun He, Yu Qiao, Ping Luo | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Headset: Human emotion awareness under partial occlusions multimodal dataset | 耳机:部分遮挡多模态数据集下的人类情感意识 | Fatemeh Ghorbani Lohesara, Davi Rabbouni Freitas, Christine Guillemot, Karen Eguiazarian, Sebastian Knorr | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Comment-aided Video-Language Alignment via Contrastive Pre-training for Short-form Video Humor Detection | 通过对比预训练进行短视频幽默检测的评论辅助视频语言对齐 | Yang Liu, Tongfei Shen, Dong Zhang, Qingying Sun, Shoushan Li, Guodong Zhou | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing? | 文本到图像模型能否辅助视觉模态缺失的视觉识别的多模态学习? | Tiantian Feng, Daniel Yang, Digbalay Bose, Shrikanth Narayanan | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Multi-modality transrectal ultrasound video classification for identification of clinically significant prostate cancer | 多模态经直肠超声视频分类用于识别有临床意义的前列腺癌 | Hong Wu, Juan Fu, Hongsheng Ye, Yuming Zhong, Xuebin Zhou, Jianhua Zhou, Yi Wang | arxiv.org/pdf/2402.08… | link |
2024-02-14 | Pretraining Vision-Language Model for Difference Visual Question Answering in Longitudinal Chest X-rays | 纵向胸部 X 射线差异视觉问答的预训练视觉语言模型 | Yeongjae Cho, Taehee Kim, Heejun Shin, Sungzoon Cho, Dongmyung Shin | arxiv.org/pdf/2402.08… | null |
2024-02-14 | Interpretable Measures of Conceptual Similarity by Complexity-Constrained Descriptive Auto-Encoding | 通过复杂性约束的描述性自动编码来测量概念相似性的可解释性 | Alessandro Achille, Greg Ver Steeg, Tian Yu Liu, Matthew Trager, Carson Klingenberg, Stefano Soatto | arxiv.org/pdf/2402.08… | null |
NeRF
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-14 | PC-NeRF: Parent-Child Neural Radiance Fields Using Sparse LiDAR Frames in Autonomous Driving Environments | PC-NeRF:在自动驾驶环境中使用稀疏 LiDAR 帧的亲子神经辐射场 | Xiuzhong Hu, Guangming Xiong, Zheng Zang, Peng Jia, Yuxuan Han, Junyi Ma | arxiv.org/pdf/2402.09… | link |
Model Compression/Optimization
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-14 | Pruning Sparse Tensor Neural Networks Enables Deep Learning for 3D Ultrasound Localization Microscopy | 修剪稀疏张量神经网络支持 3D 超声定位显微镜的深度学习 | Brice Rauby, Paul Xing, Jonathan Porée, Maxime Gasse, Jean Provost | arxiv.org/pdf/2402.09… | null |
Classification/Detection/Recognition/Segmentation/...
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-14 | Deep Rib Fracture Instance Segmentation and Classification from CT on the RibFrac Challenge | RibFrac 挑战赛中 CT 的深部肋骨骨折实例分割和分类 | Jiancheng Yang, Rui Shi, Liang Jin, Xiaoyang Huang, Kaiming Kuang, Donglai Wei, Shixuan Gu, Jianying Liu, Pengfei Liu, Zhizhong Chai, et al. | arxiv.org/pdf/2402.09… | null |
2024-02-14 | YOLOv8-AM: YOLOv8 with Attention Mechanisms for Pediatric Wrist Fracture Detection | YOLOv8-AM:带有注意力机制的 YOLOv8,用于儿童手腕骨折检测 | Chun-Tse Chien, Rui-Yang Ju, Kuang-Yi Chou, Chien-Sheng Lin, Jen-Shiun Chiang | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Only My Model On My Data: A Privacy Preserving Approach Protecting one Model and Deceiving Unauthorized Black-Box Models | 只有我的模型在我的数据上:一种保护一个模型并欺骗未经授权的黑盒模型的隐私保护方法 | Weiheng Chai, Brian Testa, Huantao Ren, Asif Salekin, Senem Velipasalar | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Few-Shot Object Detection with Sparse Context Transformers | 使用稀疏上下文转换器进行少样本目标检测 | Jie Mei, Mingyuan Jiu, Hichem Sahbi, Xiaoheng Jiang, Mingliang Xu | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Immediate generalisation in humans but a generalisation lag in deep neural networks | 在人类中可以立即泛化,但在深度神经网络中泛化滞后 | Lukas S. Huber, Fred W. Mast, Felix A. Wichmann | arxiv.org/pdf/2402.09… | null |
2024-02-14 | TDViT: Temporal Dilated Video Transformer for Dense Video Tasks | TDViT:用于密集视频任务的时间扩张视频转换器 | Guanxiong Sun, Yang Hua, Guosheng Hu, Neil Robertson | arxiv.org/pdf/2402.09… | link |
2024-02-14 | Efficient One-stage Video Object Detection by Exploiting Temporal Consistency | 利用时间一致性的高效单阶段视频目标检测 | Guanxiong Sun, Yang Hua, Guosheng Hu, Neil Robertson | arxiv.org/pdf/2402.09… | link |
2024-02-14 | Switch EMA: A Free Lunch for Better Flatness and Sharpness | Switch EMA:更好的平坦度和清晰度的免费午餐 | Siyuan Li, Zicheng Liu, Juanxi Tian, Ge Wang, Zedong Wang, Weiyang Jin, Di Wu, Cheng Tan, Tao Lin, Yang Liu, et al. | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Is my Data in your AI Model? Membership Inference Test with Application to Face Images | 我的数据在你们的人工智能模型中吗?应用于人脸图像的隶属推理测试 | Daniel DeAlcala, Aythami Morales, Gonzalo Mancera, Julian Fierrez, Ruben Tolosana, Javier Ortega-Garcia | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Domain-adaptive and Subgroup-specific Cascaded Temperature Regression for Out-of-distribution Calibration | 用于分布外校准的域自适应和子组特定级联温度回归 | Jiexin Wang, Jiahao Chen, Bing Su | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Crop and Couple: cardiac image segmentation using interlinked specialist networks | Crop and Couple:使用互连的专家网络进行心脏图像分割 | Abbas Khan, Muhammad Asad, Martin Benning, Caroline Roney, Gregory Slabaugh | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Solid Waste Detection in Remote Sensing Images: A Survey | 遥感图像中的固体废物检测:调查 | Piero Fraternali, Luca Morandini, Sergio Luis Herrera González | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Open-Vocabulary Segmentation with Unpaired Mask-Text Supervision | 具有不配对掩码文本监督的开放词汇分割 | Zhaoqing Wang, Xiaobo Xia, Ziye Chen, Xiao He, Yandong Guo, Mingming Gong, Tongliang Liu | arxiv.org/pdf/2402.08… | link |
2024-02-14 | Learning-based Bone Quality Classification Method for Spinal Metastasis | 基于学习的脊柱转移骨质量分类方法 | Shiqi Peng, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya Zhang, Yan-Feng Wang, Hui Zhao | arxiv.org/pdf/2402.08… | null |
2024-02-14 | Weakly Supervised Segmentation of Vertebral Bodies with Iterative Slice-propagation | 迭代切片传播的椎体弱监督分割 | Shiqi Peng, Bolin Lai, Guangyu Yao, Xiaoyun Zhang, Ya Zhang, Yan-Feng Wang, Hui Zhao | arxiv.org/pdf/2402.08… | null |
2024-02-14 | Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation | 用于视频对象分割的深度学习光流的移动对象建议 | Ge Shi, Zhili Yang | arxiv.org/pdf/2402.08… | null |
2024-02-14 | TikTokActions: A TikTok-Derived Video Dataset for Human Action Recognition | TikTokActions:用于人类动作识别的 TikTok 衍生视频数据集 | Yang Qian, Yinan Sun, Ali Kargarandehkordi, Onur Cezmi Mutlu, Saimourya Surabhi, Pingyi Chen, Zain Jabbar, Dennis Paul Wall, Peter Washington | arxiv.org/pdf/2402.08… | null |
Image Understanding
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-14 | Depth-aware Volume Attention for Texture-less Stereo Matching | 用于无纹理立体匹配的深度感知体积注意 | Tong Zhao, Mingyu Ding, Wei Zhan, Masayoshi Tomizuka, Yintao Wei | arxiv.org/pdf/2402.08… | link |
Transformer
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-14 | Less is More: Fewer Interpretable Region via Submodular Subset Selection | 少即是多:通过子模子集选择减少可解释区域 | Ruoyu Chen, Hua Zhang, Siyuan Liang, Jingzhi Li, Xiaochun Cao | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Pyramid Attention Network for Medical Image Registration | 用于医学图像配准的金字塔注意力网络 | Zhuoyuan Wang, Haiqiao Wang, Yi Wang | arxiv.org/pdf/2402.09… | null |
2024-02-14 | CLIP-MUSED: CLIP-Guided Multi-Subject Visual Neural Information Semantic Decoding | CLIP-MUSED:CLIP引导的多主题视觉神经信息语义解码 | Qiongyi Zhou, Changde Du, Shengpei Wang, Huiguang He | arxiv.org/pdf/2402.08… | null |
2024-02-14 | Predictive Temporal Attention on Event-based Video Stream for Energy-efficient Situation Awareness | 基于事件的视频流的预测时间注意力以实现节能态势感知 | Yiming Bu, Jiayang Liu, Qinru Qiu | arxiv.org/pdf/2402.08… | null |
3D/CG
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-14 | Registration of Longitudinal Spine CTs for Monitoring Lesion Growth | 用于监测病变生长的纵向脊柱 CT 配准 | Malika Sanhinova, Nazim Haouchine, Steve D. Pieper, William M. Wells III, Tracy A. Balboni, Alexander Spektor, Mai Anh Huynh, Jeffrey P. Guenette, Bryan Czajkowski, Sarah Caplan, et al. | arxiv.org/pdf/2402.09… | null |
2024-02-14 | DUDF: Differentiable Unsigned Distance Fields with Hyperbolic Scaling | DUDF:具有双曲标度的可微分无符号距离场 | Miguel Fainstein, Viviana Siless, Emmanuel Iarussi | arxiv.org/pdf/2402.08… | null |
Other
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-02-14 | Weatherproofing Retrieval for Localization with Generative AI and Geometric Consistency | 利用生成式人工智能和几何一致性进行本地化防风雨检索 | Yannis Kalantidis, Mert Bülent Sarıyıldız, Rafael S. Rezende, Philippe Weinzaepfel, Diane Larlus, Gabriela Csurka | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Traj-LIO: A Resilient Multi-LiDAR Multi-IMU State Estimator Through Sparse Gaussian Process | Traj-LIO:通过稀疏高斯过程的弹性多 LiDAR 多 IMU 状态估计器 | Xin Zheng, Jianke Zhu | arxiv.org/pdf/2402.09… | null |
2024-02-14 | Generalized Portrait Quality Assessment | 广义肖像质量评估 | Nicolas Chahine, Sira Ferradans, Javier Vazquez-Corral, Jean Ponce | arxiv.org/pdf/2402.09… | link |