[Share][Daily Update][2024.01.31][CV_arxiv_papers]


[UPDATED!] 2024-01-31 (Publish Time)

Classification / Detection / Recognition / Segmentation

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-31 | Improved Scene Landmark Detection for Camera Localization | 改进了相机定位的场景地标检测 | Tien Do, Sudipta N. Sinha | arxiv.org/pdf/2401.18… | null |
| 2024-01-31 | Benchmarking Sensitivity of Continual Graph Learning for Skeleton-Based Action Recognition | 基于骨架的动作识别的连续图学习的基准敏感性 | Wei Wei, Tom De Schepper, Kevin Mets | arxiv.org/pdf/2401.18… | null |
| 2024-01-31 | Multilinear Operator Networks | 多线性算子网络 | Yixin Cheng, Grigorios G. Chrysos, Markos Georgopoulos, Volkan Cevher | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Shrub of a thousand faces: an individual segmentation from satellite images using deep learning | 千面灌木:使用深度学习对卫星图像进行单独分割 | Rohaifa Khaldi, Siham Tabik, Sergio Puertas-Ruiz, Julio Peñas de Giles, José Antonio Hódar Correa, Regino Zamora, Domingo Alcaraz Segura | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study | 使用视觉检测模型增强多模态大语言模型:实证研究 | Qirui Jiao, Daoyuan Chen, Yilun Huang, Yaliang Li, Ying Shen | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | MelNet: A Real-Time Deep Learning Algorithm for Object Detection | MelNet:一种用于目标检测的实时深度学习算法 | Yashar Azadvatan, Murat Kurt | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | HyperZ·Z·W Operator Connects Slow-Fast Networks for Full Context Interaction | HyperZ·Z·W 运算符连接慢速网络以实现全上下文交互 | Harvie Zhang | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Source-free Domain Adaptive Object Detection in Remote Sensing Images | 遥感图像中的无源域自适应目标检测 | Weixing Liu, Jun Liu, Xin Su, Han Nie, Bin Luo | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation | Hi-SAM:结合 Segment Anything 模型进行分层文本分割 | Maoyuan Ye, Jing Zhang, Juhua Liu, Chenyu Liu, Baocai Yin, Cong Liu, Bo Du, Dacheng Tao | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | PVLR: Prompt-driven Visual-Linguistic Representation Learning for Multi-Label Image Recognition | PVLR:用于多标签图像识别的提示驱动的视觉语言表示学习 | Hao Tan, Zichang Tan, Jun Li, Jun Wan, Zhen Lei | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error | AEROBLADE:使用自动编码器重建误差对潜在扩散图像进行免训练检测 | Jonas Ricker, Denis Lukovnikov, Asja Fischer | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | VR-based generation of photorealistic synthetic data for training hand-object tracking models | 基于 VR 生成逼真的合成数据,用于训练手部物体跟踪模型 | Chengyan Zhang, Rahul Chaudhari | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model | 卷积遇见 LoRA:分段任意模型的参数高效微调 | Zihan Zhong, Zhiqiang Tang, Tong He, Haoyang Fang, Chun Yuan | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Semantic Anything in 3D Gaussians | 3D 高斯中的任何语义 | Xu Hu, Yuxi Wang, Lue Fan, Junsong Fan, Junran Peng, Zhen Lei, Qing Li, Zhaoxiang Zhang | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Instruction-Guided Scene Text Recognition | 指令引导的场景文本识别 | Yongkun Du, Zhineng Chen, Yuchen Su, Caiyan Jia, Yu-Gang Jiang | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Leveraging Swin Transformer for Local-to-Global Weakly Supervised Semantic Segmentation | 利用 Swin Transformer 进行本地到全局弱监督语义分割 | Rozhan Ahmadi, Shohreh Kasaei | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Do Object Detection Localization Errors Affect Human Performance and Trust? | 对象检测定位错误会影响人类表现和信任吗? | Sven de Witte, Ombretta Strafforello, Jan van Gemert | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | SimAda: A Simple Unified Framework for Adapting Segment Anything Model in Underperformed Scenes | SimAda:一个简单的统一框架,用于在表现不佳的场景中调整分段任意模型 | Yiran Song, Qianyu Zhou, Xuequan Lu, Zhiwen Shao, Lizhuang Ma | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Tiered approach for rapid damage characterisation of infrastructure enabled by remote sensing and deep learning technologies | 利用遥感和深度学习技术快速表征基础设施损坏的分层方法 | Nadiia Kopiika, Andreas Karavias, Pavlos Krassakis, Zehao Ye, Jelena Ninic, Nataliya Shakhovska, Nikolaos Koukouzas, Sotirios Argyroudis, Stergios-Aristoteles Mitoulis | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Leveraging Human-Machine Interactions for Computer Vision Dataset Quality Enhancement | 利用人机交互提高计算机视觉数据集质量 | Esla Timothy Anzaku, Hyesoo Hong, Jin-Woo Park, Wonjun Yang, Kangmin Kim, JongBum Won, Deshika Vinoshani Kumari Herath, Arnout Van Messem, Wesley De Neve | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Unified Physical-Digital Face Attack Detection | 统一的物理-数字人脸攻击检测 | Hao Fang, Ajian Liu, Haocheng Yuan, Junze Zheng, Dingheng Zeng, Yanhong Liu, Jiankang Deng, Sergio Escalera, Xiaoming Liu, Jun Wan, et al. | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Datacube segmentation via Deep Spectral Clustering | 通过深度谱聚类进行数据立方分割 | Alessandro Bombini, Fernando García-Avello Bofías, Caterina Bracci, Michele Ginolfi, Chiara Ruberto | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | All Beings Are Equal in Open Set Recognition | 开集认识中众生平等 | Chaohua Li, Enhao Zhang, Chuanxing Geng, SongCan Chen | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition | 用于提示语音识别的计算和参数高效的多模态融合变压器 | Lei Liu, Li Liu, Haizhou Li | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data | 擅长字幕,不擅长计数:在地球观测数据上对 GPT-4V 进行基准测试 | Chenhui Zhang, Sherrie Wang | arxiv.org/pdf/2401.17… | link |
| 2024-01-31 | Head and Neck Tumor Segmentation from [18F]F-FDG PET/CT Images Based on 3D Diffusion Model | 基于 3D 扩散模型的 [18F]F-FDG PET/CT 图像的头颈肿瘤分割 | Yafei Dong, Kuang Gong | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Local Feature Matching Using Deep Learning: A Survey | 使用深度学习进行局部特征匹配:调查 | Shibiao Xu, Shunpeng Chen, Rongtao Xu, Changwei Wang, Peng Lu, Li Guo | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Towards Image Semantics and Syntax Sequence Learning | 迈向图像语义和句法序列学习 | Chun Tao, Timur Ibrayev, Kaushik Roy | arxiv.org/pdf/2401.17… | null |

Model Compression / Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-31 | Trainable Fixed-Point Quantization for Deep Learning Acceleration on FPGAs | 用于 FPGA 上深度学习加速的可训练定点量化 | Dingyi Dai, Yichi Zhang, Jiahao Zhang, Zhanqiu Hu, Yaohui Cai, Qi Sun, Zhiru Zhang | arxiv.org/pdf/2401.17… | null |

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-31 | Motion Guidance: Diffusion-Based Image Editing with Differentiable Motion Estimators | 运动引导:使用可微运动估计器进行基于扩散的图像编辑 | Daniel Geng, Andrew Owens | arxiv.org/pdf/2401.18… | null |
| 2024-01-31 | CARFF: Conditional Auto-encoded Radiance Field for 3D Scene Forecasting | CARFF:用于 3D 场景预测的条件自动编码辐射场 | Jiezhi Yang, Khushi Desai, Charles Packer, Harshil Bhatia, Nicholas Rhinehart, Rowan McAllister, Joseph Gonzalez | arxiv.org/pdf/2401.18… | null |
| 2024-01-31 | Advances in 3D Generation: A Survey | 3D 生成的进展:调查 | Xiaoyu Li, Qi Zhang, Di Kang, Weihao Cheng, Yiming Gao, Jingbo Zhang, Zhihao Liang, Jing Liao, Yan-Pei Cao, Ying Shan | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Double InfoGAN for Contrastive Analysis | 双InfoGAN进行对比分析 | Florence Carton, Robin Louiset, Pietro Gori | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | 3D-Plotting Algorithm for Insects using YOLOv5 | 使用 YOLOv5 的昆虫 3D 绘图算法 | Daisuke Mori, Hiroki Hayami, Yasufumi Fujimoto, Isao Goto | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Image Anything: Towards Reasoning-coherent and Training-free Multi-modal Image Generation | Image Anything:迈向推理连贯且免训练的多模态图像生成 | Yuanhuiyi Lyu, Xu Zheng, Lin Wang | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Spatial-and-Frequency-aware Restoration method for Images based on Diffusion Models | 基于扩散模型的图像空间和频率感知恢复方法 | Kyungsung Lee, Donggyu Lee, Myungjoo Kang | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Topology-Aware Latent Diffusion for 3D Shape Generation | 用于生成 3D 形状的拓扑感知潜在扩散 | Jiangbei Hu, Ben Fei, Baixin Xu, Fei Hou, Weidong Yang, Shengfa Wang, Na Lei, Chen Qian, Ying He | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Task-Oriented Diffusion Model Compression | 面向任务的扩散模型压缩 | Geonung Kim, Beomsu Kim, Eunhyeok Park, Sunghyun Cho | arxiv.org/pdf/2401.17… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-31 | Binding Touch to Everything: Learning Unified Multimodal Tactile Representations | 将触摸与一切结合起来:学习统一的多模态触觉表征 | Fengyu Yang, Chao Feng, Ziyang Chen, Hyoungseob Park, Daniel Wang, Yiming Dou, Ziyao Zeng, Xien Chen, Rit Gangopadhyay, Andrew Owens, et al. | arxiv.org/pdf/2401.18… | null |
| 2024-01-31 | Controllable Dense Captioner with Multimodal Embedding Bridging | 具有多模式嵌入桥接的可控密集字幕器 | Yuzhong Zhao, Yue Liu, Zonghao Guo, Weijia Wu, Chen Gong, Qixiang Ye, Fang Wan | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Proximity QA: Unleashing the Power of Multi-Modal Large Language Models for Spatial Proximity Analysis | 邻近 QA:释放多模态大型语言模型的力量进行空间邻近分析 | Jianing Li, Xi Nan, Ming Lu, Li Du, Shanghang Zhang | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | M2-RAAP: A Multi-Modal Recipe for Advancing Adaptation-based Pre-training towards Effective and Efficient Zero-shot Video-text Retrieval | M2-RAAP:一种多模式方法,用于推进基于适应的预训练,实现有效且高效的零样本视频文本检索 | Xingning Dong, Zipeng Feng, Chunluan Zhou, Xuzheng Yu, Ming Yang, Qingpei Guo | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks | SNP-S3:各种视频文本任务的共享网络预训练和显着语义强化 | Xingning Dong, Qingpei Guo, Tian Gan, Qing Wang, Jianlong Wu, Xiangyuan Ren, Yuan Cheng, Wei Chu | arxiv.org/pdf/2401.17… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-31 | DROP: Decouple Re-Identification and Human Parsing with Task-specific Features for Occluded Person Re-identification | DROP:将重新识别和人体解析与特定于任务的特征分离以进行被遮挡人员重新识别 | Shuguang Dou, Xiangyang Jiang, Yuanpeng Tu, Junyao Gao, Zefan Qu, Qingsong Zhao, Cairong Zhao | arxiv.org/pdf/2401.18… | null |
| 2024-01-31 | LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement | LaneGraph2Seq:通过点边编码和连接增强使用语言模型提取车道拓扑 | Renyuan Peng, Xinyue Cai, Hang Xu, Jiachen Lu, Feng Wen, Wei Zhang, Li Zhang | arxiv.org/pdf/2401.17… | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-31 | ReplaceAnything3D:Text-Guided 3D Scene Editing with Compositional Neural Radiance Fields | ReplaceAnything3D:使用组合神经辐射场进行文本引导的 3D 场景编辑 | Edward Bartrum, Thu Nguyen-Phuoc, Chris Xie, Zhengqin Li, Numair Khan, Armen Avetisyan, Douglas Lanman, Lei Xiao | arxiv.org/pdf/2401.17… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-31 | Fine-Grained Zero-Shot Learning: Advances, Challenges, and Prospects | 细粒度零样本学习:进展、挑战和前景 | Jingcai Guo, Zhijie Rao, Song Guo, Jingren Zhou, Dacheng Tao | arxiv.org/pdf/2401.17… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-31 | Exploring the Common Appearance-Boundary Adaptation for Nighttime Optical Flow | 探索夜间光流的常见外观边界适应 | Hanyu Zhou, Yi Chang, Haoyue Liu, Wending Yan, Yuxing Duan, Zhiwei Shi, Luxin Yan | arxiv.org/pdf/2401.17… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-31 | Reimagining Reality: A Comprehensive Survey of Video Inpainting Techniques | 重新想象现实:视频修复技术的综合调查 | Shreyank N Gowda, Yash Thakre, Shashank Narayana Gowda, Xiaobo Jin | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | RADIN: Souping on a Budget | RADIN:预算中的汤 | Thibaut Menes, Olivier Risser-Maroix | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Robustly overfitting latents for flexible neural image compression | 鲁棒地过度拟合潜在的灵活神经图像压缩 | Yura Perugachi-Diaz, Arwin Gansekoele, Sandjai Bhulai | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | COMET: Contrastive Mean Teacher for Online Source-Free Universal Domain Adaptation | COMET:在线无源通用域适应的对比平均老师 | Pascal Schlachter, Bin Yang | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Unveiling the Power of Self-supervision for Multi-view Multi-human Association and Tracking | 揭示多视角多人关联和跟踪的自我监督力量 | Wei Feng, Feifan Wang, Ruize Han, Zekun Qian, Song Wang | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion | 敏捷但安全:学习无碰撞高速腿式运动 | Tairan He, Chong Zhang, Wenli Xiao, Guanqi He, Changliu Liu, Guanya Shi | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Is Registering Raw Tagged-MR Enough for Strain Estimation in the Era of Deep Learning? | 注册Raw Tagged-MR足以用于深度学习时代的应变估计吗? | Zhangxing Bian, Ahmed Alshareef, Shuwen Wei, Junyu Chen, Yuli Wang, Jonghye Woo, Dzung L. Pham, Jiachen Zhuo, Aaron Carass, Jerry L. Prince | arxiv.org/pdf/2401.17… | null |
| 2024-01-31 | Data-Effective Learning: A Comprehensive Medical Benchmark | 数据有效学习:综合医学基准 | Wenxuan Yang, Weimin Tan, Yuqi Sun, Bo Yan | arxiv.org/pdf/2401.17… | null |