[Share][Daily Update][2024.01.12][CV_arxiv_papers]


[UPDATED!] 2024-01-12 (Publish Time)

Classification / Detection / Recognition / Segmentation

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-12 | Seeing the roads through the trees: A benchmark for modeling spatial dependencies with aerial imagery | 透过树林看路:利用航空图像对空间依赖性进行建模的基准 | Caleb Robinson, Isaac Corley, Anthony Ortiz, Rahul Dodhia, Juan M. Lavista Ferres, Peyman Najafirad | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Scalable 3D Panoptic Segmentation With Superpoint Graph Clustering | 具有超点图聚类的可扩展 3D 全景分割 | Damien Robert, Hugo Raguet, Loic Landrieu | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Embedded Planogram Compliance Control System | 嵌入式货架合规控制系统 | M. Erkin Yücel, Serkan Topaloğlu, Cem Ünsalan | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Adversarial Examples are Misaligned in Diffusion Model Manifolds | 扩散模型流形中的对抗性示例未对齐 | Peter Lorenz, Ricard Durall, Jansi Keuper | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking | Motion2VecSets:用于非刚性形状重建和跟踪的 4D 潜在向量集扩散 | Wei Cao, Chang Luo, Biao Zhang, Matthias Nießner, Jiapeng Tang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Dynamic Behaviour of Connectionist Speech Recognition with Strong Latency Constraints | 具有强延迟约束的联结主义语音识别的动态行为 | Giampiero Salvi | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Resource-Efficient Gesture Recognition using Low-Resolution Thermal Camera via Spiking Neural Networks and Sparse Segmentation | 通过尖峰神经网络和稀疏分割,使用低分辨率热像仪进行资源高效的手势识别 | Ali Safa, Wout Mommen, Lars Keuninckx | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Multimodal Learning for detecting urban functional zones using remote sensing image and multi-semantic information | 利用遥感图像和多语义信息检测城市功能区的多模态学习 | Chuanji Shi, Yingying Zhang, Jiaotuan Wang, Qiqi Zhu | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning | 增强一致性并减少偏差:增量学习的数据重放方法 | Chenyang Wang, Junjun Jiang, Xingyu Hu, Xianming Liu, Xiangyang Ji | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Optimizing Feature Selection for Binary Classification with Noisy Labels: A Genetic Algorithm Approach | 优化带有噪声标签的二元分类的特征选择:遗传算法方法 | Vandad Imani, Elaheh Moradi, Carlos Sevilla-Salcedo, Vittorio Fortino, Jussi Tohka | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Robustness-Aware 3D Object Detection in Autonomous Driving: A Review and Outlook | 自动驾驶中的鲁棒性感知 3D 物体检测:回顾与展望 | Ziying Song, Lin Liu, Feiyang Jia, Yadan Luo, Guoxin Zhang, Lei Yang, Li Wang, Caiyan Jia | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Exploring Diverse Representations for Open Set Recognition | 探索开集识别的多样化表示 | Yu Wang, Junxian Mu, Pengfei Zhu, Qinghua Hu | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Frequency Masking for Universal Deepfake Detection | 用于通用 Deepfake 检测的频率掩蔽 | Chandler Timm Doloriel, Ngai-Man Cheung | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Improving the Detection of Small Oriented Objects in Aerial Images | 改进航空图像中小定向物体的检测 | Chandler Timm C. Doloriel, Rhandley D. Cajote | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Fully Automated Tumor Segmentation for Brain MRI data using Multiplanner UNet | 使用 Multiplanner UNet 对脑 MRI 数据进行全自动肿瘤分割 | Sumit Pandey, Satyasaran Changdar, Mathias Perslev, Erik B Dam | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Self-supervised Learning of Dense Hierarchical Representations for Medical Image Segmentation | 医学图像分割的密集层次表示的自监督学习 | Eytan Kats, Jochen G. Hirsch, Mattias P. Heinrich | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Improving Low-Light Image Recognition Performance Based on Image-adaptive Learnable Module | 基于图像自适应学习模块提高弱光图像识别性能 | Seitaro Ono, Yuka Ogino, Takahiro Toizumi, Atsushi Ito, Masato Tsukada | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding | UMG-CLIP:用于理解开放世界的统一多粒度视觉通才 | Bowen Shi, Peisen Zhao, Zichen Wang, Yuhang Zhang, Yaoming Wang, Jin Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian, et al. | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM optimization | SD-MVS:具有球面细化和 EM 优化的分段驱动变形多视图立体 | Zhenlong Yuan, Jiakai Cao, Zhaoxin Li, Hao Jiang, Zhaoqi Wang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | SamLP: A Customized Segment Anything Model for License Plate Detection | SamLP:用于车牌检测的定制分段任意模型 | Haoxuan Ding, Junyu Gao, Yuan Yuan, Qi Wang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Graph Relation Distillation for Efficient Biomedical Instance Segmentation | 用于高效生物医学实例分割的图关系蒸馏 | Xiaoyu Liu, Yueyi Zhang, Zhiwei Xiong, Wei Huang, Bo Hu, Xiaoyan Sun, Feng Wu | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | AffordanceLLM: Grounding Affordance from Vision Language Models | AffordanceLLM:视觉语言模型的基础可供性 | Shengyi Qian, Weifeng Chen, Min Bai, Xiong Zhou, Zhuowen Tu, Li Erran Li | arxiv.org/pdf/2401.06… | null |

Model Compression / Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-12 | Mutual Distillation Learning For Person Re-Identification | 用于人员重新识别的相互蒸馏学习 | Huiyuan Fu, Kuilong Cui, Chuanming Wang, Mengshi Qi, Huadong Ma | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | UPDP: A Unified Progressive Depth Pruner for CNN and Vision Transformer | UPDP:CNN 和 Vision Transformer 的统一渐进深度剪枝器 | Ji Liu, Dehua Tang, Yuanxian Huang, Li Zhang, Xiaocheng Zeng, Dong Li, Mingjie Lu, Jinzhang Peng, Yu Wang, Fan Jiang, et al. | arxiv.org/pdf/2401.06… | null |

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-12 | Decoupling Pixel Flipping and Occlusion Strategy for Consistent XAI Benchmarks | 解耦像素翻转和遮挡策略以实现一致的 XAI 基准 | Stefan Blücher, Johanna Vielhaben, Nils Strodthoff | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | 360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model | 360DVD:具有 360 度视频扩散模型的可控全景视频生成 | Qian Wang, Weiqi Li, Chong Mou, Xinhua Cheng, Jian Zhang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | RotationDrag: Point-based Image Editing with Rotated Diffusion Features | RotationDrag:具有旋转扩散功能的基于点的图像编辑 | Minxing Luo, Wentao Cheng, Jian Yang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | ModaVerse: Efficiently Transforming Modalities with LLMs | ModaVerse:利用法学硕士有效转变模式 | Xinyu Wang, Bohan Zhuang, Qi Wu | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Seek for Incantations: Towards Accurate Text-to-Image Diffusion Synthesis through Prompt Engineering | 寻求咒语:通过快速工程实现准确的文本到图像扩散合成 | Chang Yu, Junran Peng, Xiangyu Zhu, Zhaoxiang Zhang, Qi Tian, Zhen Lei | arxiv.org/pdf/2401.06… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-12 | Generalizing Visual Question Answering from Synthetic to Human-Written Questions via a Chain of QA with a Large Language Model | 通过具有大型语言模型的 QA 链将视觉问答从合成问题推广到人类编写的问题 | Taehee Kim, Yeongjae Cho, Heejun Shin, Yohan Jo, Dongmyung Shin | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Hyper-STTN: Social Group-aware Spatial-Temporal Transformer Network for Human Trajectory Prediction with Hypergraph Reasoning | Hyper-STTN:利用超图推理进行人体轨迹预测的社会群体感知时空变换器网络 | Weizheng Wang, Le Mao, Baijian Yang, Guohua Chen, Byung-Cheol Min | arxiv.org/pdf/2401.06… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-12 | PCB-Vision: A Multiscene RGB-Hyperspectral Benchmark Dataset of Printed Circuit Boards | PCB-Vision:印刷电路板的多场景 RGB 高光谱基准数据集 | Elias Arbash, Margret Fuchs, Behnood Rasti, Sandra Lorenz, Pedram Ghamisi, Richard Gloaguen | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | MedTransformer: Accurate AD Diagnosis for 3D MRI Images through 2D Vision Transformers | MedTransformer:通过 2D Vision Transformer 对 3D MRI 图像进行准确的 AD 诊断 | Yifeng Wang, Ke Chen, Yihan Zhang, Haohan Wang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention | 具有屏蔽帧间和帧内注意力的视频超分辨率变压器 | Xingyu Zhou, Leheng Zhang, Xiaorui Zhao, Keze Wang, Leida Li, Shuhang Gu | arxiv.org/pdf/2401.06… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-12 | 3D Reconstruction of Interacting Multi-Person in Clothing from a Single Image | 从单个图像重建多人服装交互的 3D 重建 | Junuk Cha, Hansol Lee, Jaewon Kim, Nhat Nguyen Bao Truong, Jae Shin Yoon, Seungryul Baek | arxiv.org/pdf/2401.06… | null |

GNN

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-12 | Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction | 用于行人意图预测的综合数据生成框架、数据集和高效深度模型 | Muhammad Naveed Riaz, Maciej Wielgosz, Abel Garcia Romera, Antonio M. Lopez | arxiv.org/pdf/2401.06… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-01-12 | AttributionScanner: A Visual Analytics System for Metadata-Free Data-Slicing Based Model Validation | AttributionScanner:用于基于无元数据数据切片的模型验证的可视化分析系统 | Xiwei Xuan, Jorge Piazentin Ono, Liang Gou, Kwan-Liu Ma, Liu Ren | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | UAV-borne Mapping Algorithms for Canopy-Level and High-Speed Drone Applications | 适用于冠层级和高速无人机应用的无人机载测绘算法 | Jincheng Zhang, Artur Wolek, Andrew R. Willis | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Application Of Vision-Language Models For Assessing Osteoarthritis Disease Severity | 视觉语言模型在评估骨关节炎疾病严重程度中的应用 | Banafshe Felfeliyan, Yuyue Zhou, Shrimanti Ghosh, Jessica Kupper, Shaobo Liu, Abhilash Hareendranathan, Jacob L. Jaremko | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Beyond the Surface: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation | 超越表面:文本到图像生成中视觉刻板印象的全球范围分析 | Akshita Jha, Vinodkumar Prabhakaran, Remi Denton, Sarah Laszlo, Shachi Dave, Rida Qadri, Chandan K. Reddy, Sunipa Dev | arxiv.org/pdf/2401.06… | null |