[UPDATED!] 2024-01-19 (Publish Time)
Classification/Detection/Recognition/Segmentation
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-19 | Event detection from novel data sources: Leveraging satellite imagery alongside GPS traces | 来自新颖数据源的事件检测:利用卫星图像和 GPS 轨迹 | Ekin Ugurel, Steffen Coenen, Minda Zhou Chen, Cynthia Chen | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | ActAnywhere: Subject-Aware Video Background Generation | ActAnywhere:主题感知视频背景生成 | Boxiao Pan, Zhan Xu, Chun-Hao Paul Huang, Krishna Kumar Singh, Yang Zhou, Leonidas J. Guibas, Jimei Yang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision | RAD-DINO:探索文本监督之外的可扩展医学图像编码器 | Fernando Pérez-García, Harshita Sharma, Sam Bond-Taylor, Kenza Bouzid, Valentina Salvatelli, Maximilian Ilse, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Matthew P. Lungren, et al. | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Measuring the Impact of Scene Level Objects on Object Detection: Towards Quantitative Explanations of Detection Decisions | 测量场景级物体对物体检测的影响:对检测决策的定量解释 | Lynn Vonder Haar, Timothy Elvira, Luke Newcomb, Omar Ochoa | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | HiCD: Change Detection in Quality-Varied Images via Hierarchical Correlation Distillation | HiCD:通过分层相关蒸馏对质量变化的图像进行变化检测 | Chao Pang, Xingxing Weng, Jiang Wu, Qiang Wang, Gui-Song Xia | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Character Recognition in Byzantine Seals with Deep Neural Networks | 使用深度神经网络进行拜占庭印章中的字符识别 | Théophile Rageau, Laurence Likforman-Sulem, Attilio Fiandrotti, Victoria Eyharabide, Béatrice Caseau, Jean-Claude Cheynet | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Removal and Selection: Improving RGB-Infrared Object Detection via Coarse-to-Fine Fusion | 移除和选择:通过粗到精融合改进 RGB 红外物体检测 | Tianyi Zhao, Maoxun Yuan, Xingxing Wei | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | BadODD: Bangladeshi Autonomous Driving Object Detection Dataset | BadODD:孟加拉国自动驾驶物体检测数据集 | Mirza Nihal Baig, Rony Hajong, Mahdi Murshed Patwary, Mohammad Shahidur Rahman, Husne Ara Chowdhury | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Towards Universal Unsupervised Anomaly Detection in Medical Imaging | 迈向医学成像中普遍的无监督异常检测 | Cosmin I. Bercea, Benedikt Wiestler, Daniel Rueckert, Julia A. Schnabel | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | MAEDiff: Masked Autoencoder-enhanced Diffusion Models for Unsupervised Anomaly Detection in Brain Images | MAEDiff:用于脑图像中无监督异常检测的掩模自动编码器增强扩散模型 | Rui Xu, Yunke Wang, Bo Du | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Symbol as Points: Panoptic Symbol Spotting via Point-based Representation | 符号作为点:通过基于点的表示进行全景符号识别 | Wenlong Liu, Tianyu Yang, Yuhan Wang, Qizhi Yu, Lei Zhang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | I-SplitEE: Image classification in Split Computing DNNs with Early Exits | I-SplitEE:早期退出的分割计算 DNN 中的图像分类 | Divya Jyoti Bajpai, Aastha Jaiswal, Manjesh Kumar Hanawal | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Focaler-IoU: More Focused Intersection over Union Loss | Focaler-IoU:更集中于联合损失的交叉点 | Hao Zhang, Shuaijie Zhang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Exploring Color Invariance through Image-Level Ensemble Learning | 通过图像级集成学习探索颜色不变性 | Yunpeng Gong, Jiaquan Li, Lifei Chen, Min Jiang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Enhancing medical vision-language contrastive learning via inter-matching relation modelling | 通过相互匹配关系建模增强医学视觉语言对比学习 | Mingjian Li, Mingyuan Meng, Michael Fulham, David Dagan Feng, Lei Bi, Jinman Kim | arxiv.org/pdf/2401.10… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-19 | On mitigating stability-plasticity dilemma in CLIP-guided image morphing via geodesic distillation loss | 通过测地线蒸馏损失缓解 CLIP 引导图像变形中的稳定性-可塑性困境 | Yeongtak Oh, Saehyung Lee, Uiwon Hwang, Sungroh Yoon | arxiv.org/pdf/2401.10… | null |
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-19 | Synthesizing Moving People with 3D Control | 通过 3D 控制合成移动人物 | Boyi Li, Jathushan Rajasegaran, Yossi Gandelsman, Alexei A. Efros, Jitendra Malik | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Source-Free and Image-Only Unsupervised Domain Adaptation for Category Level Object Pose Estimation | 用于类别级物体姿态估计的无源和仅图像无监督域适应 | Prakhar Kaushik, Aayush Mishra, Adam Kortylewski, Alan Yuille | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion | Sat2Scene:通过扩散卫星图像生成 3D 城市场景 | Zuoyue Li, Zhenqiang Li, Zhaopeng Cui, Marc Pollefeys, Martin R. Oswald | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Dream360: Diverse and Immersive Outdoor Virtual Scene Creation via Transformer-Based 360 Image Outpainting | Dream360:通过基于 Transformer 的 360 度图像绘制创建多样化、身临其境的户外虚拟场景 | Hao Ai, Zidong Cao, Haonan Lu, Chen Chen, Jian Ma, Pengyuan Zhou, Tae-Kyun Kim, Pan Hui, Lin Wang | arxiv.org/pdf/2401.10… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-19 | Tool-LMM: A Large Multi-Modal Model for Tool Agent Learning | Tool-LMM:用于工具代理学习的大型多模态模型 | Chenyu Wang, Weixin Luo, Qianyu Chen, Haonan Mai, Jindi Guo, Sixun Dong, Xiaohua Xuan, Zhengxin Li, Lin Ma, et al. | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge | 问答提示:通过挖掘问答提示发现丰富的视觉线索,VQA需要多元化的世界知识 | Haibi Wang, Weifeng Ge | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering | 用于视频问答的大型多模态模型的弱监督高斯对比基础 | Haibo Wang, Chenghang Lai, Yixuan Sun, Weifeng Ge | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval | DGL:用于文本视频检索的动态全局局部提示调整 | Xiangpeng Yang, Linchao Zhu, Xiaohan Wang, Yi Yang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences | Mementos:图像序列多模态大语言模型推理的综合基准 | Xiyao Wang, Yuhang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Gedas Bertasius, Mohit Bansal, et al. | arxiv.org/pdf/2401.10… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-19 | The Cadaver in the Machine: The Social Practices of Measurement and Validation in Motion Capture Technology | 机器中的尸体:运动捕捉技术测量和验证的社会实践 | Emma Harvey, Hauke Sandhaus, Abigail Z. Jacobs, Emanuel Moss, Mona Sloane | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Motion Consistency Loss for Monocular Visual Odometry with Attention-Based Deep Learning | 基于注意力的深度学习的单目视觉里程计的运动一致性损失 | André O. Françani, Marcos R. O. A. Maximo | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Understanding Video Transformers via Universal Concept Discovery | 通过通用概念发现了解视频转换器 | Matthew Kowal, Achal Dave, Rares Ambrus, Adrien Gaidon, Konstantinos G. Derpanis, Pavel Tokmakov | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | M2ORT: Many-To-One Regression Transformer for Spatial Transcriptomics Prediction from Histopathology Images | M2ORT:用于根据组织病理学图像进行空间转录组预测的多对一回归变压器 | Hongyi Wang, Xiuju Du, Jing Liu, Shuyi Ouyang, Yen-Wei Chen, Lanfen Lin | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Learning Position-Aware Implicit Neural Network for Real-World Face Inpainting | 学习用于现实世界面部修复的位置感知隐式神经网络 | Bo Zhao, Huan Yang, Jianlong Fu | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | NWPU-MOC: A Benchmark for Fine-grained Multi-category Object Counting in Aerial Images | NWPU-MOC:航空图像中细粒度多类别物体计数的基准 | Junyu Gao, Liangliang Zhao, Xuelong Li | arxiv.org/pdf/2401.10… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-19 | SCENES: Subpixel Correspondence Estimation With Epipolar Supervision | 场景:具有极线监督的子像素对应估计 | Dominik A. Kloepfer, João F. Henriques, Dylan Campbell | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Dense 3D Reconstruction Through Lidar: A Comparative Study on Ex-vivo Porcine Tissue | 通过激光雷达进行密集 3D 重建:离体猪组织的比较研究 | Guido Caccianiga, Julian Nubert, Marco Hutter, Katherine J. Kuchenbecker | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | MixNet: Towards Effective and Efficient UHD Low-Light Image Enhancement | MixNet:迈向有效且高效的超高清低光图像增强 | Chen Wu, Zhuoran Zheng, Xiuyi Jia, Wenqi Ren | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | 3D Shape Completion on Unseen Categories: A Weakly-supervised Approach | 看不见的类别的 3D 形状补全:弱监督方法 | Lintai Wu, Junhui Hou, Linqi Song, Yong Xu | arxiv.org/pdf/2401.10… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-19 | Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | 深入一切:释放大规模未标记数据的力量 | Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao | arxiv.org/pdf/2401.10… | link |
| 2024-01-19 | A Comprehensive Survey on Deep-Learning-based Vehicle Re-Identification: Models, Data Sets and Challenges | 基于深度学习的车辆重新识别的全面调查:模型、数据集和挑战 | Ali Amiri, Aydin Kaya, Ali Seydi Keceli | arxiv.org/pdf/2401.10… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-19 | Learning to Visually Connect Actions and their Effects | 学习以视觉方式连接动作及其效果 | Eric Peh, Paritosh Parmar, Basura Fernando | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Determination of efficiency indicators of the stand for intelligent control of manual operations in industrial production | 工业生产中手动操作智能控制工位效率指标的确定 | Anton Sergeev, Victor Minchenkov, Aleksei Soldatov | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | NN-VVC: Versatile Video Coding boosted by self-supervisedly learned image coding for machines | NN-VVC:通过机器自监督学习图像编码推动多功能视频编码 | Jukka I. Ahonen, Nam Le, Honglei Zhang, Antti Hallapuro, Francesco Cricri, Hamed Rezazadegan Tavakoli, Miska M. Hannuksela, Esa Rahtu | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Bridging the gap between image coding for machines and humans | 弥合机器和人类图像编码之间的差距 | Nam Le, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Hamed Rezazadegan Tavakoli, Emre Aksu, Miska M. Hannuksela, Esa Rahtu | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | A comprehensive study on fidelity metrics for XAI | XAI 保真度指标的综合研究 | Miquel Miró-Nicolau, Antoni Jaume-i-Capó, Gabriel Moyà-Alcover | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Polytopic Autoencoders with Smooth Clustering for Reduced-order Modelling of Flows | 用于流降阶建模的具有平滑聚类的多面自编码器 | Jan Heiland, Yongho Kim | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | 360ORB-SLAM: A Visual SLAM System for Panoramic Images with Depth Completion Network | 360ORB-SLAM:具有深度补全网络的全景图像视觉SLAM系统 | Yichen Chen, Yiqi Pan, Ruyu Liu, Haoyu Zhang, Guodao Zhang, Bo Sun, Jianhua Zhang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | GMC-IQA: Exploiting Global-correlation and Mean-opinion Consistency for No-reference Image Quality Assessment | GMC-IQA:利用全局相关性和平均意见一致性进行无参考图像质量评估 | Zewen Chen, Juan Wang, Bing Li, Chunfeng Yuan, Weiming Hu, Junxian Liu, Peng Li, Yan Wang, Youqun Zhang, Congxuan Zhang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | CBVS: A Large-Scale Chinese Image-Text Benchmark for Real-World Short Video Search Scenarios | CBVS:真实世界短视频搜索场景的大规模中文图文基准 | Xiangshuo Qiao, Xianxin Li, Xiaozhe Qu, Jie Zhang, Yang Liu, Yu Luo, Cihang Jin, Jin Ma | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | LDReg: Local Dimensionality Regularized Self-Supervised Learning | LDReg:局部维度正则化自监督学习 | Hanxun Huang, Ricardo J. G. B. Campello, Sarah Monazam Erfani, Xingjun Ma, Michael E. Houle, James Bailey | arxiv.org/pdf/2401.10… | link |
| 2024-01-19 | Learning to Robustly Reconstruct Low-light Dynamic Scenes from Spike Streams | 学习从尖峰流中稳健地重建低光动态场景 | Liwen Hu, Ziluo Ding, Mianzhi Liu, Lei Ma, Tiejun Huang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Path Choice Matters for Clear Attribution in Path Methods | 路径选择对于路径方法中的清晰归因至关重要 | Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu | arxiv.org/pdf/2401.10… | null |