[Share][Daily Update][2024.01.11][CV_arxiv_papers]


!UPDATED -- 2024-01-11

Classification / Detection / Recognition / Segmentation

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-11 | Automatic UAV-based Airport Pavement Inspection Using Mixed Real and Virtual Scenarios | 使用混合真实和虚拟场景的基于无人机的自动机场路面检查 | Pablo Alonso, Jon Ander Iñiguez de Gordoa, Juan Diego Ortega, Sara García, Francisco Javier Iriarte, Marcos Nieto | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | Attention to detail: inter-resolution knowledge distillation | 关注细节:分辨率间知识蒸馏 | Rocío del Amor, Julio Silva-Rodríguez, Adrián Colomer, Valery Naranjo | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | Sea ice detection using concurrent multispectral and synthetic aperture radar imagery | 使用并发多光谱和合成孔径雷达图像进行海冰探测 | Martin S J Rogers, Maria Fox, Andrew Fleming, Louisa van Zeeland, Jeremy Wilkinson, J. Scott Hosking | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | Body-Area Capacitive or Electric Field Sensing for Human Activity Recognition and Human-Computer Interaction: A Comprehensive Survey | 用于人类活动识别和人机交互的身体区域电容或电场感应:综合调查 | Sizhen Bian, Mengxi Liu, Bo Zhou, Paul Lukowicz, Michele Magno | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | CoSSegGaussians: Compact and Swift Scene Segmenting 3D Gaussians | CoSSegGaussians:紧凑且快速的场景分割 3D 高斯 | Bin Dou, Tianyu Zhang, Yongjia Ma, Zhaohui Wang, Zejian Yuan | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | PartSTAD: 2D-to-3D Part Segmentation Task Adaptation | PartSTAD:2D 到 3D 零件分割任务适配 | Hyunjin Kim, Minhyuk Sung | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Implications of Noise in Resistive Memory on Deep Neural Networks for Image Classification | 电阻存储器中的噪声对图像分类深度神经网络的影响 | Yannick Emonds, Kai Xi, Holger Fröning | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Learn From Zoom: Decoupled Supervised Contrastive Learning For WCE Image Classification | 向 Zoom 学习:WCE 图像分类的解耦监督对比学习 | Kunpeng Qiu, Zhiying Zhou, Yongxin Guo | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Evaluating Data Augmentation Techniques for Coffee Leaf Disease Classification | 评估咖啡叶病分类的数据增强技术 | Adrian Gheorghiu, Iulian-Marius Tăiatu, Dumitru-Clementin Cercel, Iuliana Marin, Florin Pop | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | LKCA: Large Kernel Convolutional Attention | LKCA:大核卷积注意力 | Chenghao Li, Boheng Zeng, Yi Lu, Pengbo Shi, Qingzi Chen, Jirui Liu, Lingyun Zhu | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Video Anomaly Detection and Explanation via Large Language Models | 通过大型语言模型进行视频异常检测和解释 | Hui Lv, Qianru Sun | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition | HiCMAE:用于自监督视听情感识别的分层对比屏蔽自动编码器 | Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Exploring Self- and Cross-Triplet Correlations for Human-Object Interaction Detection | 探索人机交互检测的自三元组相关性和互三元组相关性 | Weibo Jiang, Weihong Ren, Jiandong Tian, Liangqiong Qu, Zhiyong Wang, Honghai Liu | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Masked Attribute Description Embedding for Cloth-Changing Person Re-identification | 用于换衣人员重新识别的屏蔽属性描述嵌入 | Chunlei Peng, Boyu Wang, Decheng Liu, Nannan Wang, Ruimin Hu, Xinbo Gao | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | MatSAM: Efficient Materials Microstructure Extraction via Visual Large Model | MatSAM:通过视觉大模型高效提取材料微观结构 | Changtai Li, Xu Han, Chao Yao, Xiaojuan Ban | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | REBUS:理解符号的稳健评估基准 | Andrew Gritsevskiy, Arjun Panickssery, Aaron Kirtland, Derik Kauffman, Hans Gundlach, Irina Gritsevskaya, Joe Cavanagh, Jonathan Chiang, Lydia La Roux, Michelle Hung | arxiv.org/pdf/2401.05… | link |
| 2024-01-11 | Nucleus subtype classification using inter-modality learning | 使用跨模态学习进行细胞核亚型分类 | Lucas W. Remedios, Shunxing Bao, Samuel W. Remedios, Ho Hin Lee, Leon Y. Cai, Thomas Li, Ruining Deng, Can Cui, Jia Li, Qi Liu, et al. | arxiv.org/pdf/2401.05… | null |

Model Compression / Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-11 | E$^{2}$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation | E$^{2}$GAN:用于图像到图像翻译的高效 GAN 的高效训练 | Yifan Gong, Zheng Zhan, Qing Jin, Yanyu Li, Yerlan Idelbayev, Xian Liu, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, et al. | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | PALP: Prompt Aligned Personalization of Text-to-Image Models | PALP:文本到图像模型的快速对齐个性化 | Moab Arar, Andrey Voynov, Amir Hertz, Omri Avrahami, Shlomi Fruchter, Yael Pritch, Daniel Cohen-Or, Ariel Shamir | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | TRIPS: Trilinear Point Splatting for Real-Time Radiance Field Rendering | TRIPS:用于实时辐射场渲染的三线性点溅射 | Linus Franke, Darius Rückert, Laura Fink, Marc Stamminger | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | A Lightweight Feature Fusion Architecture For Resource-Constrained Crowd Counting | 用于资源受限人群计数的轻量级特征融合架构 | Yashwardhan Chaudhuri, Ankit Kumar, Orchid Chetia Phukan, Arun Balaji Buduru | arxiv.org/pdf/2401.05… | null |

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-11 | RAVEN: Rethinking Adversarial Video Generation with Efficient Tri-plane Networks | RAVEN:重新思考使用高效三平面网络的对抗性视频生成 | Partha Ghosh, Soubhik Sanyal, Cordelia Schmid, Bernhard Schölkopf | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | GE-AdvGAN: Improving the transferability of adversarial samples by gradient editing-based adversarial generative model | GE-AdvGAN:通过基于梯度编辑的对抗生成模型提高对抗样本的可转移性 | Zhiyu Zhu, Huaming Chen, Xinyi Wang, Jiayu Zhang, Zhibo Jin, Kim-Kwang Raymond Choo | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | How does the primate brain combine generative and discriminative computations in vision? | 灵长类动物的大脑如何将视觉中的生成计算和判别计算结合起来? | Benjamin Peters, James J. DiCarlo, Todd Gureckis, Ralf Haefner, Leyla Isik, Joshua Tenenbaum, Talia Konkle, Thomas Naselaris, Kimberly Stachenfeld, Zenna Tavares, et al. | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | An attempt to generate new bridge types from latent space of PixelCNN | 尝试从 PixelCNN 的潜在空间生成新的桥类型 | Hongjun Zhang | arxiv.org/pdf/2401.05… | link |
| 2024-01-11 | Efficient Image Deblurring Networks based on Diffusion Models | 基于扩散模型的高效图像去模糊网络 | Kang Chen, Yuanjie Liu | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models | HiCAST:高度定制的任意风格转移,带有适配器增强扩散模型 | Hanzhang Wang, Haoran Wang, Jinze Yang, Zhongrui Yu, Zeke Xie, Lei Tian, Xinyan Xiao, Junjun Jiang, Xianming Liu, Mingming Sun | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | EraseDiff: Erasing Data Influence in Diffusion Models | EraseDiff:消除扩散模型中的数据影响 | Jing Wu, Trung Le, Munawar Hayat, Mehrtash Harandi | arxiv.org/pdf/2401.05… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-11 | LEGO:Language Enhanced Multi-modal Grounding Model | 乐高:语言增强多模式接地模型 | Zhaowei Li, Qi Xu, Dong Zhang, Hang Song, Yiqing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Van Tu Vu, et al. | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | Hallucination Benchmark in Medical Visual Question Answering | 医学视觉问答中的幻觉基准 | Jinge Wu, Yunsoo Kim, Honghan Wu | arxiv.org/pdf/2401.05… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-11 | Surface Normal Estimation with Transformers | 使用 Transformer 进行表面法线估计 | Barry Shichen Hu, Siyun Liang, Johannes Paetzold, Huy H. Nguyen, Isao Echizen, Jiapeng Tang | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Object-Centric Diffusion for Efficient Video Editing | 以对象为中心的扩散,实现高效视频编辑 | Kumara Kahatapitiya, Adil Karjauv, Davide Abati, Fatih Porikli, Yuki M. Asano, Amirhossein Habibian | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach | 变换图像超分辨率:一种基于 ConvFormer 的高效方法 | Gang Wu, Junjun Jiang, Junpeng Jiang, Xianming Liu | arxiv.org/pdf/2401.05… | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-11 | Fast High Dynamic Range Radiance Fields for Dynamic Scenes | 适用于动态场景的快速高动态范围辐射场 | Guanjun Wu, Taoran Yi, Jiemin Fang, Wenyu Liu, Xinggang Wang | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | GO-NeRF: Generating Virtual Objects in Neural Radiance Fields | GO-NeRF:在神经辐射场中生成虚拟对象 | Peng Dai, Feitong Tan, Xin Yu, Yinda Zhang, Xiaojuan Qi | arxiv.org/pdf/2401.05… | null |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-11 | Gaussian Shadow Casting for Neural Characters | 神经角色的高斯阴影投射 | Luis Bolanos, Shih-Yang Su, Helge Rhodin | arxiv.org/pdf/2401.06… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-11 | Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors | 适合所有人的配音:使用神经渲染先验进行数据高效的视觉配音 | Jack Saunders, Vinay Namboodiri | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | MatSynth: A Modern PBR Materials Dataset | MatSynth:现代 PBR 材料数据集 | Giuseppe Vecchio, Valentin Deschaintre | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | Surgical-DINO: Adapter Learning of Foundation Model for Depth Estimation in Endoscopic Surgery | Surgical-DINO:内窥镜手术深度估计基础模型的适配器学习 | Cui Beilei, Islam Mobarakol, Bai Long, Ren Hongliang | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | UAVD4L: A Large-Scale Dataset for UAV 6-DoF Localization | UAVD4L:无人机 6 自由度定位的大规模数据集 | Rouwan Wu, Xiaoya Cheng, Juelin Zhu, Xuxiang Liu, Maojun Zhang, Shen Yan | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | LiDAR data acquisition and processing for ecology applications | 用于生态应用的激光雷达数据采集和处理 | Ion Ciobotari, Adriana Príncipe, Maria Alexandra Oliveira, João Nuno Silva | arxiv.org/pdf/2401.05… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-11 | Distilling Vision-Language Models on Millions of Videos | 从数百万个视频中提取视觉语言模型 | Yue Zhao, Long Zhao, Xingyi Zhou, Jialin Wu, Chun-Te Chu, Hui Miao, Florian Schroff, Hartwig Adam, Ting Liu, Boqing Gong, et al. | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | ConKeD: Multiview contrastive descriptor learning for keypoint-based retinal image registration | ConKeD:基于关键点的视网膜图像配准的多视图对比描述符学习 | David Rivas-Villar, Álvaro S. Hervella, José Rouco, Jorge Novo | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Enhancing Contrastive Learning with Efficient Combinatorial Positive Pairing | 通过有效的组合正配对增强对比学习 | Jaeill Kim, Duhun Hwang, Eunjung Lee, Jangwon Suh, Jimyeong Kim, Wonjong Rhee | arxiv.org/pdf/2401.05… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-11 | Manipulating Feature Visualizations with Gradient Slingshots | 使用渐变弹弓操作特征可视化 | Dilyara Bareeva, Marina M.-C. Höhne, Alexander Warnecke, Lukas Pirch, Klaus-Robert Müller, Konrad Rieck, Kirill Bykov | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | MGARD: A multigrid framework for high-performance, error-controlled data compression and refactoring | MGARD:用于高性能、错误控制数据压缩和重构的多重网格框架 | Qian Gong, Jieyang Chen, Ben Whitney, Xin Liang, Viktor Reshniak, Tania Banerjee, Jaemoon Lee, Anand Rangarajan, Lipeng Wan, Nicolas Vidal, et al. | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | YOIO: You Only Iterate Once by mining and fusing multiple necessary global information in the optical flow estimation | YOIO:通过在光流估计中挖掘和融合多个必要的全局信息,您只需迭代一次 | Yu Jing, Tan Yujuan, Ren Ao, Liu Duo | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | On the representation and methodology for wide and short range head pose estimation | 宽短程头部姿态估计的表示和方法 | Alejandro Cobo, Roberto Valle, José M. Buenaposada, Luis Baumela | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | CLIP-Driven Semantic Discovery Network for Visible-Infrared Person Re-Identification | CLIP 驱动的语义发现网络,用于可见红外人员重新识别 | Xiaoyan Yu, Neng Dong, Liehuang Zhu, Hao Peng, Dapeng Tao | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Knowledge Translation: A New Pathway for Model Compression | 知识翻译:模型压缩的新途径 | Wujie Sun, Defang Chen, Jiawei Chen, Yan Feng, Chun Chen, Can Wang | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Learning Generalizable Models via Disentangling Spurious and Enhancing Potential Correlations | 通过消除虚假和增强潜在相关性来学习可推广模型 | Na Wang, Lei Qi, Jintao Guo, Yinghuan Shi, Yang Gao | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Self Expanding Convolutional Neural Networks | 自扩展卷积神经网络 | Blaise Appolinary, Alex Deaconu, Sophia Yang | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation | Parrot:用于文本到图像生成的帕累托最优多奖励强化学习框架 | Seung Hyun Lee, Yinxiao Li, Junjie Ke, Innfarn Yoo, Han Zhang, Jiahui Yu, Qifei Wang, Fei Deng, Glenn Entis, Junfeng He, et al. | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Face-GPS: A Comprehensive Technique for Quantifying Facial Muscle Dynamics in Videos | Face-GPS:量化视频中面部肌肉动态的综合技术 | Juni Kim, Zhikang Dong, Pawel Polak | arxiv.org/pdf/2401.05… | null |