[Share][Daily Update][2024.01.15][CV_arxiv_papers]


[UPDATED!] 2024-01-15 (Publish Time)

Classification/Detection/Recognition/Segmentation

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-15 | Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning | 通过动态参数秩剪枝的卷积神经网络压缩 | Manish Sharma, Jamison Heard, Eli Saber, Panos P. Markopoulos | arxiv.org/pdf/2401.08… | null |
| 2024-01-15 | Jewelry Recognition via Encoder-Decoder Models | 通过编码器-解码器模型进行珠宝识别 | José M. Alcalde-Llergo, Enrique Yeguas-Bolívar, Andrea Zingoni, Alejandro Fuerte-Jurado | arxiv.org/pdf/2401.08… | null |
| 2024-01-15 | How does self-supervised pretraining improve robustness against noisy labels across various medical image classification datasets? | 自监督预训练如何提高各种医学图像分类数据集中针对噪声标签的鲁棒性? | Bidur Khanal, Binod Bhattarai, Bishesh Khanal, Cristian Linte | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models | 机器感知质量:评估严重有损压缩对音频和图像模型的影响 | Dan Jacobellis, Daniel Cummings, Neeraja J. Yadwadkar | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Vertical Federated Image Segmentation | 垂直联合图像分割 | Paul K. Mandal, Cole Leo | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Machine Learning Based Object Tracking | 基于机器学习的对象跟踪 | Md Rakibul Karim Akanda, Joshua Reynolds, Treylin Jackson, Milijah Gray | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | VeCAF: VLM-empowered Collaborative Active Finetuning with Training Objective Awareness | VeCAF:VLM 赋能的具有训练目标意识的协作主动微调 | Rongyu Zhang, Zefan Cai, Huanrui Yang, Zidong Liu, Denis Gudovskiy, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Baobao Chang, Yuan Du, et al. | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Phenotyping calcification in vascular tissues using artificial intelligence | 使用人工智能对血管组织中的钙化进行表型分析 | Mehdi Ramezanpour, Anne M. Robertson, Yasutaka Tobe, Xiaowei Jia, Juan R. Cebral | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Pedestrian Detection in Low-Light Conditions: A Comprehensive Survey | 弱光条件下的行人检测:综合调查 | Bahareh Ghari, Ali Tourani, Asadollah Shahbahrami, Georgi Gaydadjiev | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Fusing Echocardiography Images and Medical Records for Continuous Patient Stratification | 融合超声心动图图像和医疗记录以进行连续患者分层 | Nathan Painchaud, Pierre-Yves Courand, Pierre-Marc Jodoin, Nicolas Duchateau, Olivier Bernard | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Improving OCR Quality in 19th Century Historical Documents Using a Combined Machine Learning Based Approach | 使用基于机器学习的组合方法提高 19 世纪历史文档的 OCR 质量 | David Fleischhacker, Wolfgang Goederle, Roman Kern | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Seeing the Unseen: Visual Common Sense for Semantic Placement | 看到看不见的东西:语义放置的视觉常识 | Ram Ramrakhya, Aniruddha Kembhavi, Dhruv Batra, Zsolt Kira, Kuo-Hao Zeng, Luca Weihs | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | DeepThalamus: A novel deep learning method for automatic segmentation of brain thalamic nuclei from multimodal ultra-high resolution MRI | DeepThalamus:一种新颖的深度学习方法,用于从多模态超高分辨率 MRI 中自动分割大脑丘脑核 | Marina Ruiz-Perez, Sergio Morell-Ortega, Marien Gadea, Roberto Vivo-Hernando, Gregorio Rubio, Fernando Aparici, Mariam de la Iglesia-Vaya, Thomas Tourdias, Pierrick Coupé, José V. Manjón | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | MaskClustering: View Consensus based Mask Graph Clustering for Open-Vocabulary 3D Instance Segmentation | MaskClustering:用于开放词汇 3D 实例分割的基于视图共识的掩模图聚类 | Mi Yan, Jiazhao Zhang, Yan Zhu, He Wang | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Graph Transformer GANs with Graph Masked Modeling for Architectural Layout Generation | 用于生成架构布局的具有图形屏蔽建模的图形转换器 GAN | Hao Tang, Ling Shao, Nicu Sebe, Luc Van Gool | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | FiGCLIP: Fine-Grained CLIP Adaptation via Densely Annotated Videos | FiGCLIP:通过密集注释视频进行细粒度 CLIP 适应 | Darshan Singh S, Zeeshan Khan, Makarand Tapaswi | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Foundation Models for Biomedical Image Segmentation: A Survey | 生物医学图像分割的基础模型:调查 | Ho Hin Lee, Yu Gu, Theodore Zhao, Yanbo Xu, Jianwei Yang, Naoto Usuyama, Cliff Wong, Mu Wei, Bennett A. Landman, Yuankai Huo, et al. | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting | SwinTextSpotter v2:实现更好的场景文本识别协同作用 | Mingxin Huang, Dezhi Peng, Hongliang Li, Zhenghao Peng, Chongyu Liu, Dahua Lin, Yuliang Liu, Xiang Bai, Lianwen Jin | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Fine-Grained Prototypes Distillation for Few-Shot Object Detection | 用于少样本目标检测的细粒度原型蒸馏 | Zichen Wang, Bo Yang, Haonan Yue, Zhenghao Ma | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Collaboratively Self-supervised Video Representation Learning for Action Recognition | 用于动作识别的协作自监督视频表示学习 | Jie Zhang, Zhifan Wan, Lanqing Hu, Stephen Lin, Shuzhe Wu, Shiguang Shan | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Geo-locating Road Objects using Inverse Haversine Formula with NVIDIA Driveworks | 使用 NVIDIA Driveworks 的反半正弦公式对道路对象进行地理定位 | Mamoona Birkhez Shami, Gabriel Kiss, Trond Arve Haakonsen, Frank Lindseth | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | PMFSNet: Polarized Multi-scale Feature Self-attention Network For Lightweight Medical Image Segmentation | PMFSNet:用于轻量级医学图像分割的偏振多尺度特征自注意力网络 | Jiahui Zhong, Wenhong Tian, Yuanlun Xie, Zhijia Liu, Jie Ou, Taoran Tian, Lei Zhang | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Exploiting GPT-4 Vision for Zero-shot Point Cloud Understanding | 利用 GPT-4 视觉实现零样本点云理解 | Qi Sun, Xiao Cui, Wengang Zhou, Houqiang Li | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Combining Image- and Geometric-based Deep Learning for Shape Regression: A Comparison to Pixel-level Methods for Segmentation in Chest X-Ray | 结合基于图像和几何的深度学习进行形状回归:胸部 X 射线分割的像素级方法的比较 | Ron Keuth, Mattias Heinrich | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception | MM-SAP:评估感知中多模态大语言模型自我意识的综合基准 | Yuhao Wang, Yusheng Liao, Heyang Liu, Hongcheng Liu, Yu Wang, Yanfeng Wang | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Compositional Oil Spill Detection Based on Object Detector and Adapted Segment Anything Model from SAR Images | 基于目标检测器和 SAR 图像自适应分段任意模型的合成溢油检测 | Wenhui Wu, Man Sing Wong, Xinyu Yu, Guoqiang Shi, Coco Yin Tung Kwok, Kang Zou | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Harnessing Deep Learning and Satellite Imagery for Post-Buyout Land Cover Mapping | 利用深度学习和卫星图像进行收购后土地覆盖测绘 | Hakan T. Otal, Elyse Zavar, Sherri B. Binder, Alex Greer, M. Abdullah Canbaz | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation | Robo-ABC:通过机器人操作的语义对应进行超越类别的可供性概括 | Yuanchen Ju, Kaizhe Hu, Guowei Zhang, Gu Zhang, Mingrun Jiang, Huazhe Xu | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | CascadeV-Det: Cascade Point Voting for 3D Object Detection | CascadeV-Det:用于 3D 对象检测的级联点投票 | Yingping Liang, Ying Fu | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Semantic Segmentation in Multiple Adverse Weather Conditions with Domain Knowledge Retention | 具有领域知识保留的多种恶劣天气条件下的语义分割 | Xin Yang, Wending Yan, Yuan Yuan, Michael Bi Mi, Robby T. Tan | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | BoNuS: Boundary Mining for Nuclei Segmentation with Partial Point Labels | BoNuS:使用部分点标签进行核分割的边界挖掘 | Yi Lin, Zeyu Wang, Dong Zhang, Kwang-Ting Cheng, Hao Chen | arxiv.org/pdf/2401.07… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-15 | A Deep Hierarchical Feature Sparse Framework for Occluded Person Re-Identification | 用于被遮挡人员重新识别的深层层次特征稀疏框架 | Yihu Song, Shuaishi Liu | arxiv.org/pdf/2401.07… | null |

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-15 | Towards A Better Metric for Text-to-Video Generation | 寻求更好的文本到视频生成指标 | Jay Zhangjie Wu, Guian Fang, Haoning Wu, Xintao Wang, Yixiao Ge, Xiaodong Cun, David Junhao Zhang, Jia-Wei Liu, Yuchao Gu, Rui Zhao, et al. | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation | HexaGen3D:StableDiffusion 距离快速、多样化的文本到 3D 生成仅一步之遥 | Antoine Mercier, Ramin Nakhli, Mahesh Reddy, Rajeev Yasarla, Hong Cai, Fatih Porikli, Guillaume Berger | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks | 使用即时注意力蒙版实现高效的基于扩散的图像编辑 | Siyu Zou, Jiji Tang, Yiyi Zhou, Jing He, Chaoyi Zhao, Rongsheng Zhang, Zhipeng Hu, Xiaoshuai Sun | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Multimodal Crowd Counting with Pix2Pix GANs | 使用 Pix2Pix GAN 进行多模式人群计数 | Muhammad Asif Khan, Hamid Menouar, Ridha Hamila | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | InstantID: Zero-shot Identity-Preserving Generation in Seconds | InstantID:几秒钟内零次身份保存生成 | Qixun Wang, Xu Bai, Haofan Wang, Zekui Qin, Anthony Chen | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Hierarchical Fashion Design with Multi-stage Diffusion Models | 多级扩散模型的分层时装设计 | Zhifeng Xie, Hao li, Huiming Ding, Mengtian Li, Ying Cao | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Cross Domain Early Crop Mapping using CropGAN and CNN Classifier | 使用 CropGAN 和 CNN 分类器进行跨域早期作物绘图 | Yiqun Wang, Hui Huang, Radu State | arxiv.org/pdf/2401.07… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-15 | M^{2}Fusion: Bayesian-based Multimodal Multi-level Fusion on Colorectal Cancer Microsatellite Instability Prediction | M^{2}Fusion:基于贝叶斯的结直肠癌微卫星不稳定性预测多模态多级融合 | Quan Liu, Jiawen Yao, Lisha Yao, Xin Chen, Jingren Zhou, Le Lu, Ling Zhang, Zaiyi Liu, Yuankai Huo | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Uncovering the Full Potential of Visual Grounding Methods in VQA | 发掘 VQA 中视觉接地方法的全部潜力 | Daniel Reich, Tanja Schultz | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | A Bi-Pyramid Multimodal Fusion Method for the Diagnosis of Bipolar Disorders | 用于诊断双相情感障碍的双金字塔多模态融合方法 | Guoxin Wang, Sheng Shi, Shan An, Fengmei Fan, Wenshu Ge, Qi Wang, Feng Yu, Zhiren Wang | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video | 视频中时间句子扎根的偏差冲突样本合成和对抗性消除去偏差策略 | Zhaobo Qi, Yibo Yuan, Xiaowen Ruan, Shuhui Wang, Weigang Zhang, Qingming Huang | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | One for All: Toward Unified Foundation Models for Earth Vision | 为所有人服务:迈向地球愿景的统一基础模型 | Zhitong Xiong, Yi Wang, Fahong Zhang, Xiao Xiang Zhu | arxiv.org/pdf/2401.07… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-15 | GD-CAF: Graph Dual-stream Convolutional Attention Fusion for Precipitation Nowcasting | GD-CAF:用于降水临近预报的图双流卷积注意力融合 | Lorand Vatamany, Siamak Mehrkanoon | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Transformer-based Video Saliency Prediction with High Temporal Dimension Decoding | 基于变压器的视频显着性预测与高时间维度解码 | Morteza Moradi, Simone Palazzo, Concetto Spampinato | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Information hiding cameras: optical concealment of object information into ordinary images | 信息隐藏相机:将物体信息光学隐藏到普通图像中 | Bijie Bai, Ryan Lee, Yuhang Li, Tianyi Gan, Yuntian Wang, Mona Jarrahi, Aydogan Ozcan | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Exploring Masked Autoencoders for Sensor-Agnostic Image Retrieval in Remote Sensing | 探索用于遥感中与传感器无关的图像检索的掩模自动编码器 | Jakob Hackstein, Gencer Sumbul, Kai Norman Clasen, Begüm Demir | arxiv.org/pdf/2401.07… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-15 | SSL-Interactions: Pretext Tasks for Interactive Trajectory Prediction | SSL-Interactions:交互式轨迹预测的借口任务 | Prarthana Bhattacharyya, Chengjie Huang, Krzysztof Czarnecki | arxiv.org/pdf/2401.07… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-15 | Sparsity-based background removal for STORM super-resolution images | 基于稀疏性的 STORM 超分辨率图像背景去除 | Patris Valera, Josué Page Vizcaíno, Tobias Lasser | arxiv.org/pdf/2401.07… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-15 | Cesium Tiles for High-realism Simulation and Comparing SLAM Results in Corresponding Virtual and Real-world Environments | 用于高真实度模拟并比较相应虚拟和现实环境中 SLAM 结果的 Cesium Tiles | Chris Beam, Jincheng Zhang, Nicholas Kakavitsas, Collin Hague, Artur Wolek, Andrew Willis | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Image Similarity using An Ensemble of Context-Sensitive Models | 使用上下文敏感模型集合进行图像相似度 | Zukang Liao, Min Chen | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Low-light Stereo Image Enhancement and De-noising in the Low-frequency Information Enhanced Image Space | 低频信息增强图像空间中的微光立体图像增强与去噪 | Minghua Zhao, Xiangdong Qin, Shuangli Du, Xuefei Bai, Jiahao Lyu, Yiguang Liu | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Curriculum for Crowd Counting -- Is it Worthy? | 人群计数课程——值得吗? | Muhammad Asif Khan, Hamid Menouar, Ridha Hamila | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | PolMERLIN: Self-Supervised Polarimetric Complex SAR Image Despeckling with Masked Networks | PolMERLIN:使用掩模网络进行自监督偏振复合 SAR 图像去斑 | Shunya Kato, Masaki Saito, Katsuhiko Ishiguro, Sol Cummings | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Concept-Guided Prompt Learning for Generalization in Vision-Language Models | 用于视觉语言模型泛化的概念引导即时学习 | Yi Zhang, Ce Zhang, Ke Yu, Yushun Tang, Zhihai He | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Mask-adaptive Gated Convolution and Bi-directional Progressive Fusion Network for Depth Completion | 用于深度完成的掩模自适应门控卷积和双向渐进融合网络 | Tingxuan Huang, Jiacheng Miao, Shizhuo Deng, Tong, Dongyue Chen | arxiv.org/pdf/2401.07… | null |
| 2024-01-15 | Improved Implicity Neural Representation with Fourier Bases Reparameterized Training | 通过傅里叶基重新参数化训练改进隐式神经表示 | Kexuan Shi, Xingyu Zhou, Shuhang Gu | arxiv.org/pdf/2401.07… | null |