[UPDATED!] 2024-01-16 (Publish Time)
Classification/Detection/Recognition/Segmentation
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-16 | SAMF: Small-Area-Aware Multi-focus Image Fusion for Object Detection | SAMF:用于物体检测的小区域感知多焦点图像融合 | Xilai Li, Xiaosong Li, Haishu Tan, Jinyang Li | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Multi-view Distillation based on Multi-modal Fusion for Few-shot Action Recognition(CLIP- | 基于多模态融合的多视图蒸馏进行小样本动作识别(CLIP- | Fei Guo, YiKang Wang, Han Qi, WenPing Jin, Li Zhu | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Generative Denoise Distillation: Simple Stochastic Noises Induce Efficient Knowledge Transfer for Dense Prediction | 生成去噪蒸馏:简单的随机噪声诱导高效的知识转移以实现密集预测 | Zhaoge Liu, Xiaohao Xu, Yunkang Cao, Weiming Shen | arxiv.org/pdf/2401.08… | link |
| 2024-01-16 | Modeling Spoof Noise by De-spoofing Diffusion and its Application in Face Anti-spoofing | 反欺骗扩散模拟欺骗噪声及其在人脸反欺骗中的应用 | Bin Zhang, Xiangyu Zhu, Xiaoyu Zhang, Zhen Lei | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Multi-Technique Sequential Information Consistency For Dynamic Visual Place Recognition In Changing Environments | 变化环境中动态视觉位置识别的多技术序列信息一致性 | Bruno Arcanjo, Bruno Ferrarini, Michael Milford, Klaus D. McDonald-Maier, Shoaib Ehsan | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization | 用于自然语言视频定位的多尺度 2D 时间图扩散模型 | Chongzhi Zhang, Mingyuan Zhang, Zhiyang Teng, Jiayi Li, Xizhou Zhu, Lewei Lu, Ziwei Liu, Aixin Sun | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | ModelNet-O: A Large-Scale Synthetic Dataset for Occlusion-Aware Point Cloud Classification | ModelNet-O:用于遮挡感知点云分类的大规模综合数据集 | Zhongbin Fang, Xia Li, Xiangtai Li, Shen Zhao, Mengyuan Liu | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | End-to-End Optimized Image Compression with the Frequency-Oriented Transform | 使用面向频率的变换进行端到端优化的图像压缩 | Yuefeng Zhang, Kai Lin | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Completely Occluded and Dense Object Instance Segmentation Using Box Prompt-Based Segmentation Foundation Models | 使用基于框提示的分割基础模型进行完全遮挡和密集的对象实例分割 | Zhen Zhou, Junfeng Fan, Yunkai Ma, Sihan Zhao, Fengshui Jing, Min Tan | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Mobile Contactless Palmprint Recognition: Use of Multiscale, Multimodel Embeddings | 移动非接触式掌纹识别:使用多尺度、多模型嵌入 | Steven A. Grosz, Akash Godbole, Anil K. Jain | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Hardware Acceleration for Real-Time Wildfire Detection Onboard Drone Networks | 用于实时野火检测机载无人机网络的硬件加速 | Austin Briley, Fatemeh Afghah | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | UV-SAM: Adapting Segment Anything Model for Urban Village Identification | UV-SAM:采用分段任意模型进行城中村识别 | Xin Zhang, Yu Liu, Yuming Lin, Qingming Liao, Yong Li | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Adversarial Masking Contrastive Learning for vein recognition | 用于静脉识别的对抗性掩蔽对比学习 | Huafeng Qin, Yiquan Wu, Mounim A. El-Yacoubi, Jun Wang, Guangxiang Yang | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Achieve Fairness without Demographics for Dermatological Disease Diagnosis | 无需人口统计即可实现皮肤病诊断的公平性 | Ching-Hao Chiu, Yu-Jen Chen, Yawen Wu, Yiyu Shi, Tsung-Yi Ho | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Toward Clinically Trustworthy Deep Learning: Applying Conformal Prediction to Intracranial Hemorrhage Detection | 迈向临床值得信赖的深度学习:将适形预测应用于颅内出血检测 | Cooper Gamble, Shahriar Faghani, Bradley J. Erickson | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Robust Tiny Object Detection in Aerial Images amidst Label Noise | 标签噪声中航空图像中稳健的微小物体检测 | Haoran Zhu, Chang Xu, Wen Yang, Ruixiang Zhang, Yan Zhang, Gui-Song Xia | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | 3D Lane Detection from Front or Surround-View using Joint-Modeling & Matching | 使用联合建模和匹配从前视或环视进行 3D 车道检测 | Haibin Zhou, Jun Chang, Tao Lu, Huabing Zhou | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | BanglaNet: Bangla Handwritten Character Recognition using Ensembling of Convolutional Neural Network | BanglaNet:使用卷积神经网络集成进行孟加拉语手写字符识别 | Chandrika Saha, Md. Mostafijur Rahman | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Small Object Detection by DETR via Information Augmentation and Adaptive Feature Fusion | DETR 通过信息增强和自适应特征融合进行小物体检测 | Ji Huang, Hui Wang | arxiv.org/pdf/2401.08… | null |
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-16 | Key-point Guided Deformable Image Manipulation Using Diffusion Model | 使用扩散模型的关键点引导可变形图像处理 | Seok-Hwan Oh, Guil Jung, Myeong-Gee Kim, Sang-Yun Kim, Young-Min Kim, Hyeon-Jik Lee, Hyuk-Sool Kwon, Hyeon-Min Bae | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Inpainting Normal Maps for Lightstage data | 修复 Lightstage 数据的法线贴图 | Hancheng Zuo, Bernard Tiddeman | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model | EmoTalker:通过扩散模型生成情感可编辑的说话面孔 | Bingyuan Zhang, Xulong Zhang, Ning Cheng, Jun Yu, Jing Xiao, Jianzong Wang | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities | 打造自动驾驶视觉基础模型:挑战、方法和机遇 | Xu Yan, Haiming Zhang, Yingjie Cai, Jingming Guo, Weichao Qiu, Bin Gao, Kaiqiang Zhou, Yue Zhao, Huan Jin, Jiantao Gao, et.al. | arxiv.org/pdf/2401.08… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-16 | AesBench: An Expert Benchmark for Multimodal Large Language Models on Image Aesthetics Perception | AesBench:图像美学感知多模态大语言模型的专家基准 | Yipo Huang, Quan Yuan, Xiangfei Sheng, Zhichao Yang, Haoning Wu, Pengfei Chen, Yuzhe Yang, Leida Li, Weisi Lin | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Human vs. LMMs: Exploring the Discrepancy in Emoji Interpretation and Usage in Digital Communication | 人类与 LMM:探索数字通信中表情符号解释和使用的差异 | Hanjia Lyu, Weihong Qi, Zhongyu Wei, Jiebo Luo | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | The Devil is in the Details: Boosting Guided Depth Super-Resolution via Rethinking Cross-Modal Alignment and Aggregation | 细节决定成败:通过重新思考跨模式对齐和聚合来提高引导深度超分辨率 | Xinni Jiang, Zengsheng Kuang, Chunle Guo, Ruixun Zhang, Lei Cai, Xiao Fan, Chongyi Li | arxiv.org/pdf/2401.08… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-16 | Transcending the Limit of Local Window: Advanced Super-Resolution Transformer with Adaptive Token Dictionary | 超越本地窗口的限制:具有自适应令牌字典的高级超分辨率变压器 | Leheng Zhang, Yawei Li, Xingyu Zhou, Xiaorui Zhao, Shuhang Gu | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | DPAFNet: Dual Path Attention Fusion Network for Single Image Deraining | DPAFNet:用于单图像去雨的双路径注意力融合网络 | Bingcai Wei | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Deep Linear Array Pushbroom Image Restoration: A Degradation Pipeline and Jitter-Aware Restoration Network | 深度线性阵列推扫式图像恢复:退化管道和抖动感知恢复网络 | Zida Chen, Ziran Zhang, Haoying Li, Menghao Li, Yueting Chen, Qi Li, Huajun Feng, Zhihai Xu, Shiqi Chen | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Spatial-Semantic Collaborative Cropping for User Generated Content | 用户生成内容的空间语义协作裁剪 | Yukun Su, Yiwen Cao, Jingliang Deng, Fengyun Rao, Qingyao Wu | arxiv.org/pdf/2401.08… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-16 | ProvNeRF: Modeling per Point Provenance in NeRFs as a Stochastic Process | ProvNeRF:NeRF 中每点来源的建模作为随机过程 | Kiyohiro Nakayama, Mikaela Angelina Uy, Yang You, Ke Li, Leonidas Guibas | arxiv.org/pdf/2401.08… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-16 | No-Clean-Reference Image Super-Resolution: Application to Electron Microscopy | 免清洁参考图像超分辨率:在电子显微镜中的应用 | Mohammad Khateri, Morteza Ghahremani, Alejandra Sierra, Jussi Tohka | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Augmenting Ground-Level PM2.5 Prediction via Kriging-Based Pseudo-Label Generation | 通过基于克里金法的伪标签生成增强地面 PM2.5 预测 | Lei Duan, Ziyang Jiang, David Carlson | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Cross-Modal Semi-Dense 6-DoF Tracking of an Event Camera in Challenging Conditions | 具有挑战性的条件下事件相机的跨模态半密集 6 自由度跟踪 | Yi-Fan Zuo, Wanting Xu, Xia Wang, Yifu Wang, Laurent Kneip | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Spatial Channel State Information Prediction with Generative AI: Towards Holographic Communication and Digital Radio Twin | 利用生成式人工智能进行空间信道状态信息预测:迈向全息通信和数字无线电孪生 | Lihao Zhang, Haijian Sun, Yong Zeng, Rose Qingyang Hu | arxiv.org/pdf/2401.08… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-16 | Multitask Learning in Minimally Invasive Surgical Vision: A Review | 微创手术视觉中的多任务学习:综述 | Oluwatosin Alabi, Tom Vercauteren, Miaojing Shi | arxiv.org/pdf/2401.08… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-16 | Un-Mixing Test-Time Normalization Statistics: Combatting Label Temporal Correlation | 取消混合测试时间归一化统计:对抗标签时间相关性 | Devavrat Tomar, Guillaume Vray, Jean-Philippe Thiran, Behzad Bozorgtabar | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | The Faiss library | Faiss 库 | Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, Hervé Jégou | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Siamese Content-based Search Engine for a More Transparent Skin and Breast Cancer Diagnosis through Histological Imaging | 基于连体内容的搜索引擎,通过组织学成像实现更透明的皮肤和乳腺癌诊断 | Zahra Tabatabaei, Adrián Colomer, Javier Oliver Moll, Valery Naranjo | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Learned Image Compression with ROI-Weighted Distortion and Bit Allocation | 通过 ROI 加权失真和位分配学习图像压缩 | Wei Jiang, Yongqi Zhai, Hangyu Li, Ronggang Wang | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning | E2HQV:通过理论启发的模型辅助深度学习从事件摄像机生成高质量视频 | Qiang Qu, Yiran Shen, Xiaoming Chen, Yuk Ying Chung, Tongliang Liu | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Deep Shape-Texture Statistics for Completely Blind Image Quality Evaluation | 用于完全盲图像质量评估的深层形状纹理统计 | Yixuan Li, Peilin Chen, Hanwei Zhu, Keyan Ding, Leida Li, Shiqi Wang | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | KTVIC: A Vietnamese Image Captioning Dataset on the Life Domain | KTVIC:生命领域的越南图像字幕数据集 | Anh-Cuong Pham, Van-Quang Nguyen, Thi-Hong Vuong, Quang-Thuy Ha | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | Representation Learning on Event Stream via an Elastic Net-incorporated Tensor Network | 通过弹性网络结合的张量网络对事件流进行表示学习 | Beibei Yang, Weiling Li, Yan Fang | arxiv.org/pdf/2401.08… | null |
| 2024-01-16 | SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation | SCoFT:自我对比微调以实现公平的图像生成 | Zhixuan Liu, Peter Schaldenbrand, Beverley-Claire Okogwu, Wenxuan Peng, Youngsik Yun, Andrew Hundt, Jihie Kim, Jean Oh | arxiv.org/pdf/2401.08… | null |