[Share][Daily Update][2024.01.19][CV_arxiv_papers]


[UPDATED!] 2024-01-19 (Publish Time)

Classification/Detection/Recognition/Segmentation

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-19 | Event detection from novel data sources: Leveraging satellite imagery alongside GPS traces | 来自新颖数据源的事件检测:利用卫星图像和 GPS 轨迹 | Ekin Ugurel, Steffen Coenen, Minda Zhou Chen, Cynthia Chen | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | ActAnywhere: Subject-Aware Video Background Generation | ActAnywhere:主题感知视频背景生成 | Boxiao Pan, Zhan Xu, Chun-Hao Paul Huang, Krishna Kumar Singh, Yang Zhou, Leonidas J. Guibas, Jimei Yang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision | RAD-DINO:探索文本监督之外的可扩展医学图像编码器 | Fernando Pérez-García, Harshita Sharma, Sam Bond-Taylor, Kenza Bouzid, Valentina Salvatelli, Maximilian Ilse, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Matthew P. Lungren, et al. | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Measuring the Impact of Scene Level Objects on Object Detection: Towards Quantitative Explanations of Detection Decisions | 测量场景级物体对物体检测的影响:对检测决策的定量解释 | Lynn Vonder Haar, Timothy Elvira, Luke Newcomb, Omar Ochoa | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | HiCD: Change Detection in Quality-Varied Images via Hierarchical Correlation Distillation | HiCD:通过分层相关蒸馏对质量变化的图像进行变化检测 | Chao Pang, Xingxing Weng, Jiang Wu, Qiang Wang, Gui-Song Xia | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Character Recognition in Byzantine Seals with Deep Neural Networks | 使用深度神经网络进行拜占庭印章中的字符识别 | Théophile Rageau, Laurence Likforman-Sulem, Attilio Fiandrotti, Victoria Eyharabide, Béatrice Caseau, Jean-Claude Cheynet | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Removal and Selection: Improving RGB-Infrared Object Detection via Coarse-to-Fine Fusion | 移除和选择:通过粗到精融合改进 RGB 红外物体检测 | Tianyi Zhao, Maoxun Yuan, Xingxing Wei | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | BadODD: Bangladeshi Autonomous Driving Object Detection Dataset | BadODD:孟加拉国自动驾驶物体检测数据集 | Mirza Nihal Baig, Rony Hajong, Mahdi Murshed Patwary, Mohammad Shahidur Rahman, Husne Ara Chowdhury | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Towards Universal Unsupervised Anomaly Detection in Medical Imaging | 迈向医学成像中普遍的无监督异常检测 | Cosmin I. Bercea, Benedikt Wiestler, Daniel Rueckert, Julia A. Schnabel | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | MAEDiff: Masked Autoencoder-enhanced Diffusion Models for Unsupervised Anomaly Detection in Brain Images | MAEDiff:用于脑图像中无监督异常检测的掩模自动编码器增强扩散模型 | Rui Xu, Yunke Wang, Bo Du | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Symbol as Points: Panoptic Symbol Spotting via Point-based Representation | 符号作为点:通过基于点的表示进行全景符号识别 | Wenlong Liu, Tianyu Yang, Yuhan Wang, Qizhi Yu, Lei Zhang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | I-SplitEE: Image classification in Split Computing DNNs with Early Exits | I-SplitEE:早期退出的分割计算 DNN 中的图像分类 | Divya Jyoti Bajpai, Aastha Jaiswal, Manjesh Kumar Hanawal | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Focaler-IoU: More Focused Intersection over Union Loss | Focaler-IoU:更集中于联合损失的交叉点 | Hao Zhang, Shuaijie Zhang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Exploring Color Invariance through Image-Level Ensemble Learning | 通过图像级集成学习探索颜色不变性 | Yunpeng Gong, Jiaquan Li, Lifei Chen, Min Jiang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Enhancing medical vision-language contrastive learning via inter-matching relation modelling | 通过相互匹配关系建模增强医学视觉语言对比学习 | Mingjian Li, Mingyuan Meng, Michael Fulham, David Dagan Feng, Lei Bi, Jinman Kim | arxiv.org/pdf/2401.10… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-19 | On mitigating stability-plasticity dilemma in CLIP-guided image morphing via geodesic distillation loss | 通过测地线蒸馏损失缓解 CLIP 引导图像变形中的稳定性-可塑性困境 | Yeongtak Oh, Saehyung Lee, Uiwon Hwang, Sungroh Yoon | arxiv.org/pdf/2401.10… | null |

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-19 | Synthesizing Moving People with 3D Control | 通过 3D 控制合成移动人物 | Boyi Li, Jathushan Rajasegaran, Yossi Gandelsman, Alexei A. Efros, Jitendra Malik | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Source-Free and Image-Only Unsupervised Domain Adaptation for Category Level Object Pose Estimation | 用于类别级物体姿态估计的无源和仅图像无监督域适应 | Prakhar Kaushik, Aayush Mishra, Adam Kortylewski, Alan Yuille | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion | Sat2Scene:通过扩散卫星图像生成 3D 城市场景 | Zuoyue Li, Zhenqiang Li, Zhaopeng Cui, Marc Pollefeys, Martin R. Oswald | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Dream360: Diverse and Immersive Outdoor Virtual Scene Creation via Transformer-Based 360 Image Outpainting | Dream360:通过基于 Transformer 的 360 度图像绘制创建多样化、身临其境的户外虚拟场景 | Hao Ai, Zidong Cao, Haonan Lu, Chen Chen, Jian Ma, Pengyuan Zhou, Tae-Kyun Kim, Pan Hui, Lin Wang | arxiv.org/pdf/2401.10… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-19 | Tool-LMM: A Large Multi-Modal Model for Tool Agent Learning | Tool-LMM:用于工具代理学习的大型多模态模型 | Chenyu Wang, Weixin Luo, Qianyu Chen, Haonan Mai, Jindi Guo, Sixun Dong, Xiaohua, Xuan, Zhengxin Li, Lin Ma, et al. | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge | 问答提示:通过挖掘问答提示发现丰富的视觉线索,VQA需要多元化的世界知识 | Haibi Wang, Weifeng Ge | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering | 用于视频问答的大型多模态模型的弱监督高斯对比基础 | Haibo Wang, Chenghang Lai, Yixuan Sun, Weifeng Ge | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval | DGL:用于文本视频检索的动态全局局部提示调整 | Xiangpeng Yang, Linchao Zhu, Xiaohan Wang, Yi Yang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences | Mementos:图像序列多模态大语言模型推理的综合基准 | Xiyao Wang, Yuhang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Gedas Bertasius, Mohit Bansal, et al. | arxiv.org/pdf/2401.10… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-19 | The Cadaver in the Machine: The Social Practices of Measurement and Validation in Motion Capture Technology | 机器中的尸体:运动捕捉技术测量和验证的社会实践 | Emma Harvey, Hauke Sandhaus, Abigail Z. Jacobs, Emanuel Moss, Mona Sloane | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Motion Consistency Loss for Monocular Visual Odometry with Attention-Based Deep Learning | 基于注意力的深度学习的单目视觉里程计的运动一致性损失 | André O. Françani, Marcos R. O. A. Maximo | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Understanding Video Transformers via Universal Concept Discovery | 通过通用概念发现了解视频转换器 | Matthew Kowal, Achal Dave, Rares Ambrus, Adrien Gaidon, Konstantinos G. Derpanis, Pavel Tokmakov | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | M2ORT: Many-To-One Regression Transformer for Spatial Transcriptomics Prediction from Histopathology Images | M2ORT:用于根据组织病理学图像进行空间转录组预测的多对一回归变压器 | Hongyi Wang, Xiuju Du, Jing Liu, Shuyi Ouyang, Yen-Wei Chen, Lanfen Lin | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Learning Position-Aware Implicit Neural Network for Real-World Face Inpainting | 学习用于现实世界面部修复的位置感知隐式神经网络 | Bo Zhao, Huan Yang, Jianlong Fu | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | NWPU-MOC: A Benchmark for Fine-grained Multi-category Object Counting in Aerial Images | NWPU-MOC:航空图像中细粒度多类别物体计数的基准 | Junyu Gao, Liangliang Zhao, Xuelong Li | arxiv.org/pdf/2401.10… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-19 | SCENES: Subpixel Correspondence Estimation With Epipolar Supervision | 场景:具有极线监督的子像素对应估计 | Dominik A. Kloepfer, João F. Henriques, Dylan Campbell | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Dense 3D Reconstruction Through Lidar: A Comparative Study on Ex-vivo Porcine Tissue | 通过激光雷达进行密集 3D 重建:离体猪组织的比较研究 | Guido Caccianiga, Julian Nubert, Marco Hutter, Katherine J. Kuchenbecker | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | MixNet: Towards Effective and Efficient UHD Low-Light Image Enhancement | MixNet:迈向有效且高效的超高清低光图像增强 | Chen Wu, Zhuoran Zheng, Xiuyi Jia, Wenqi Ren | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | 3D Shape Completion on Unseen Categories: A Weakly-supervised Approach | 看不见的类别的 3D 形状补全:弱监督方法 | Lintai Wu, Junhui Hou, Linqi Song, Yong Xu | arxiv.org/pdf/2401.10… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-19 | Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | 深入一切:释放大规模未标记数据的力量 | Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao | arxiv.org/pdf/2401.10… | link |
| 2024-01-19 | A Comprehensive Survey on Deep-Learning-based Vehicle Re-Identification: Models, Data Sets and Challenges | 基于深度学习的车辆重新识别的全面调查:模型、数据集和挑战 | Ali Amiri, Aydin Kaya, Ali Seydi Keceli | arxiv.org/pdf/2401.10… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-19 | Learning to Visually Connect Actions and their Effects | 学习以视觉方式连接动作及其效果 | Eric Peh, Paritosh Parmar, Basura Fernando | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Determination of efficiency indicators of the stand for intelligent control of manual operations in industrial production | 工业生产中手动操作智能控制工位效率指标的确定 | Anton Sergeev, Victor Minchenkov, Aleksei Soldatov | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | NN-VVC: Versatile Video Coding boosted by self-supervisedly learned image coding for machines | NN-VVC:通过机器自监督学习图像编码推动多功能视频编码 | Jukka I. Ahonen, Nam Le, Honglei Zhang, Antti Hallapuro, Francesco Cricri, Hamed Rezazadegan Tavakoli, Miska M. Hannuksela, Esa Rahtu | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Bridging the gap between image coding for machines and humans | 弥合机器和人类图像编码之间的差距 | Nam Le, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Hamed Rezazadegan Tavakoli, Emre Aksu, Miska M. Hannuksela, Esa Rahtu | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | A comprehensive study on fidelity metrics for XAI | XAI 保真度指标的综合研究 | Miquel Miró-Nicolau, Antoni Jaume-i-Capó, Gabriel Moyà-Alcover | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Polytopic Autoencoders with Smooth Clustering for Reduced-order Modelling of Flows | 用于流降阶建模的具有平滑聚类的多面自编码器 | Jan Heiland, Yongho Kim | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | 360ORB-SLAM: A Visual SLAM System for Panoramic Images with Depth Completion Network | 360ORB-SLAM:具有深度补全网络的全景图像视觉SLAM系统 | Yichen Chen, Yiqi Pan, Ruyu Liu, Haoyu Zhang, Guodao Zhang, Bo Sun, Jianhua Zhang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | GMC-IQA: Exploiting Global-correlation and Mean-opinion Consistency for No-reference Image Quality Assessment | GMC-IQA:利用全局相关性和平均意见一致性进行无参考图像质量评估 | Zewen Chen, Juan Wang, Bing Li, Chunfeng Yuan, Weiming Hu, Junxian Liu, Peng Li, Yan Wang, Youqun Zhang, Congxuan Zhang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | CBVS: A Large-Scale Chinese Image-Text Benchmark for Real-World Short Video Search Scenarios | CBVS:真实世界短视频搜索场景的大规模中文图文基准 | Xiangshuo Qiao, Xianxin Li, Xiaozhe Qu, Jie Zhang, Yang Liu, Yu Luo, Cihang Jin, Jin Ma | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | LDReg: Local Dimensionality Regularized Self-Supervised Learning | LDReg:局部维度正则化自监督学习 | Hanxun Huang, Ricardo J. G. B. Campello, Sarah Monazam Erfani, Xingjun Ma, Michael E. Houle, James Bailey | arxiv.org/pdf/2401.10… | link |
| 2024-01-19 | Learning to Robustly Reconstruct Low-light Dynamic Scenes from Spike Streams | 学习从尖峰流中稳健地重建低光动态场景 | Liwen Hu, Ziluo Ding, Mianzhi Liu, Lei Ma, Tiejun Huang | arxiv.org/pdf/2401.10… | null |
| 2024-01-19 | Path Choice Matters for Clear Attribution in Path Methods | 路径选择对于路径方法中的清晰归因至关重要 | Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu | arxiv.org/pdf/2401.10… | null |