[Share][Daily Update][2024.02.07][CV_arxiv_papers]


[UPDATED!] 2024-02-07 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-07 | SPAD : Spatially Aware Multiview Diffusers | SPAD:空间感知多视图扩散器 | Yash Kant, Ziyi Wu, Michael Vasilkovsky, Guocheng Qian, Jian Ren, Riza Alp Guler, Bernard Ghanem, Sergey Tulyakov, Igor Gilitschenski, Aliaksandr Siarohin | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | Anatomically-Controllable Medical Image Generation with Segmentation-Guided Diffusion Models | 利用分割引导扩散模型生成解剖学可控的医学图像 | Nicholas Konz, Yuwen Chen, Haoyu Dong, Maciej A. Mazurowski | arxiv.org/pdf/2402.05… | link |
| 2024-02-07 | λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space | λ-ECLIPSE:利用 CLIP 潜在空间的多概念个性化文本到图像扩散模型 | Maitreya Patel, Sangmin Jung, Chitta Baral, Yezhou Yang | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation | LGM:用于高分辨率 3D 内容创建的大型多视图高斯模型 | Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, Ziwei Liu | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | Blue noise for diffusion models | 扩散模型的蓝噪声 | Xingchang Huang, Corentin Salaün, Cristina Vasconcelos, Christian Theobalt, Cengiz Öztireli, Gurprit Singh | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Source-Free Domain Adaptation with Diffusion-Guided Source Data Generation | 通过扩散引导源数据生成进行无源域适应 | Shivang Chopra, Suraj Kothawade, Houda Aynaou, Aman Chadha | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints | 通过具有审美约束的扩散模型实现对齐布局生成 | Jian Chen, Ruiyi Zhang, Yufan Zhou, Changyou Chen | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Cortical Surface Diffusion Generative Models | 皮质表面扩散生成模型 | Zhenshan Xie, Simon Dahan, Logan Z. J. Williams, M. Jorge Cardoso, Emma C. Robinson | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | EvoSeed: Unveiling the Threat on Deep Neural Networks with Real-World Illusions | EvoSeed:用现实世界的幻觉揭示深度神经网络的威胁 | Shashank Kotyan, PoYuan Mao, Danilo Vasconcellos Vargas | arxiv.org/pdf/2402.04… | link |
| 2024-02-07 | Noise Map Guidance: Inversion with Spatial Context for Real Image Editing | 噪声图指导:使用空间上下文进行反演以进行真实图像编辑 | Hansam Cho, Jonghyun Lee, Seoung Bum Kim, Tae-Hyun Oh, Yonghyun Jeong | arxiv.org/pdf/2402.04… | link |
| 2024-02-07 | Triplet-constraint Transformer with Multi-scale Refinement for Dose Prediction in Radiotherapy | 用于放射治疗剂量预测的多尺度细化三重态约束变压器 | Lu Wen, Qihun Zhang, Zhenghao Feng, Yuanyuan Xu, Xiao Chen, Jiliu Zhou, Yan Wang | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | BRI3L: A Brightness Illusion Image Dataset for Identification and Localization of Regions of Illusory Perception | BRI3L:用于识别和定位幻觉感知区域的亮度幻觉图像数据集 | Aniket Roy, Anirban Roy, Soma Mitra, Kuntal Ghosh | arxiv.org/pdf/2402.04… | link |
| 2024-02-07 | Text2Street: Controllable Text-to-image Generation for Street Views | Text2Street:街景的可控文本到图像生成 | Jinming Su, Songen Gu, Yiting Duan, Xingyue Chen, Junfeng Luo | arxiv.org/pdf/2402.04… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-07 | BIKED++: A Multimodal Dataset of 1.4 Million Bicycle Image and Parametric CAD Designs | BIKED++:包含 140 万张自行车图像和参数化 CAD 设计的多模式数据集 | Lyle Regenwetter, Yazan Abu Obaideh, Amin Heyrani Nobari, Faez Ahmed | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | Examining Modality Incongruity in Multimodal Federated Learning for Medical Vision and Language-based Disease Detection | 检查医学视觉和基于语言的疾病检测的多模态联合学习中的模态不一致性 | Pramit Saha, Divyanshu Mishra, Felix Wagner, Konstantinos Kamnitsas, J. Alison Noble | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation | 基于语言的增强解决对象目标导航中的快捷学习问题 | Dennis Hoftijzer, Gertjan Burghouts, Luuk Spreeuwers | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | Efficient Multi-Resolution Fusion for Remote Sensing Data with Label Uncertainty | 具有标签不确定性的遥感数据的高效多分辨率融合 | Hersh Vakharia, Xiaoxiao Du | arxiv.org/pdf/2402.05… | link |
| 2024-02-07 | Text or Image? What is More Important in Cross-Domain Generalization Capabilities of Hate Meme Detection Models? | 文字还是图像?仇恨模因检测模型的跨域泛化能力更重要的是什么? | Piush Aggarwal, Jawar Mehrabanian, Weigang Huang, Özge Alacam, Torsten Zesch | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark | MLLM 作为法官:使用视觉语言基准评估多模式 LLM 作为法官 | Dongping Chen, Ruoxi Chen, Shilin Zhang, Yinuo Liu, Yaochen Wang, Huichi Zhou, Qihui Zhang, Pan Zhou, Yao Wan, Lichao Sun | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior | InstructScene:具有语义图先验的指令驱动 3D 室内场景合成 | Chenguo Lin, Yadong Mu | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | ScreenAI: A Vision-Language Model for UI and Infographics Understanding | ScreenAI:用于 UI 和信息图表理解的视觉语言模型 | Gilles Baechler, Srinivas Sunkara, Maria Wang, Fedir Zubach, Hassan Mansoor, Vincent Etter, Victor Cărbune, Jason Lin, Jindong Chen, Abhanshu Sharma | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | ColorSwap: A Color and Word Order Dataset for Multimodal Evaluation | ColorSwap:用于多模式评估的颜色和词序数据集 | Jirayu Burapacheep, Ishan Gaur, Agam Bhatia, Tristan Thrush | arxiv.org/pdf/2402.04… | link |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-07 | NeRF as Non-Distant Environment Emitter in Physics-based Inverse Rendering | NeRF 作为基于物理的逆渲染中的非远程环境发射器 | Jingwang Ling, Ruihan Yu, Feng Xu, Chun Du, Shuang Zhao | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Mesh-based Gaussian Splatting for Real-time Large-scale Deformation | 用于实时大范围变形的基于网格的高斯分布 | Lin Gao, Jie Yang, Bo-Tao Zhang, Jia-Mu Sun, Yu-Jie Yuan, Hongbo Fu, Yu-Kun Lai | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | OV-NeRF: Open-vocabulary Neural Radiance Fields with Vision and Language Foundation Models for 3D Semantic Understanding | OV-NeRF:具有视觉和语言基础模型的开放词汇神经辐射场,用于 3D 语义理解 | Guibiao Liao, Kaichen Zhou, Zhenyu Bao, Kanglin Liu, Qing Li | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | BirdNeRF: Fast Neural Reconstruction of Large-Scale Scenes From Aerial Imagery | BirdNeRF:从航空图像中快速神经重建大规模场景 | Huiqing Zhang, Yifei Xue, Ming Liao, Yizhen Lao | arxiv.org/pdf/2402.04… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-07 | Knowledge Distillation for Road Detection based on cross-model Semi-Supervised Learning | 基于跨模型半监督学习的道路检测知识蒸馏 | Wanli Ma, Oktay Karakus, Paul L. Rosin | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss | EfficientViT-SAM:加速分段任何模型而不会造成性能损失 | Zhuoyang Zhang, Han Cai, Song Han | arxiv.org/pdf/2402.05… | link |
| 2024-02-07 | ConvLoRA and AdaBN based Domain Adaptation via Self-Training | 通过自训练进行基于 ConvLoRA 和 AdaBN 的域适应 | Sidra Aleem, Julia Dietlmeier, Eric Arazo, Suzanne Little | arxiv.org/pdf/2402.04… | link |
| 2024-02-07 | Group Distributionally Robust Dataset Distillation with Risk Minimization | 具有风险最小化的分组分布稳健数据集蒸馏 | Saeed Vahidian, Mingyu Wang, Jianyang Gu, Vyacheslav Kungurtsev, Wei Jiang, Yiran Chen | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection | G-NAS:用于单域泛化对象检测的泛化神经架构搜索 | Fan Wu, Jinling Gao, Lanqing Hong, Xinbing Wang, Chenghu Zhou, Nanyang Ye | arxiv.org/pdf/2402.04… | link |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-07 | Combining shape and contour features to improve tool wear monitoring in milling processes | 结合形状和轮廓特征来改进铣削过程中的刀具磨损监控 | M. T. García-Ordás, E. Alegre-Gutiérrez, V. González-Castro, R. Alaiz-Rodríguez | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | Tool wear monitoring using an online, automatic and low cost system based on local texture | 使用基于局部纹理的在线、自动和低成本系统进行刀具磨损监测 | M. T. García-Ordás, E. Alegre-Gutiérrez, R. Alaiz-Rodríguez, V. González-Castro | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | Self-calibrated convolution towards glioma segmentation | 用于神经胶质瘤分割的自校准卷积 | Felipe C. R. Salvagnini, Gerson O. Barbosa, Alexandre X. Falcao, Cid A. N. Santos | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | Enhancement of Bengali OCR by Specialized Models and Advanced Techniques for Diverse Document Types | 通过针对不同文档类型的专业模型和先进技术增强孟加拉语 OCR | AKM Shahariar Azad Rabby, Hasmot Ali, Md. Majedul Islam, Sheikh Abujar, Fuad Rahman | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation | Mamba-UNet:用于医学图像分割的类似 UNet 的纯视觉 Mamba | Ziyang Wang, Jian-Qing Zheng, Yichi Zhang, Ge Cui, Lei Li | arxiv.org/pdf/2402.05… | link |
| 2024-02-07 | Detection and Pose Estimation of flat, Texture-less Industry Objects on HoloLens using synthetic Training | 使用合成训练在 HoloLens 上检测平面、无纹理的工业对象并进行姿态估计 | Thomas Pöllabauer, Fabian Rücker, Andreas Franek, Felix Gorschlüter | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Channel-Selective Normalization for Label-Shift Robust Test-Time Adaptation | 用于标签移位鲁棒测试时间适应的通道选择性归一化 | Pedro Vianna, Muawiz Chaudhary, Paria Mehrbod, An Tang, Guy Cloutier, Guy Wolf, Michael Eickenberg, Eugene Belilovsky | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Is Two-shot All You Need? A Label-efficient Approach for Video Segmentation in Breast Ultrasound | 您只需要两次拍摄吗?乳腺超声视频分割的标签高效方法 | Jiajun Zeng, Ruobing Huang, Dong Ni | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Toward Accurate Camera-based 3D Object Detection via Cascade Depth Estimation and Calibration | 通过级联深度估计和校准实现基于相机的精确 3D 物体检测 | Chaoqun Wang, Yiran Qin, Zijian Kang, Ningning Ma, Ruimao Zhang | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | STAR: Shape-focused Texture Agnostic Representations for Improved Object Detection and 6D Pose Estimation | STAR:用于改进对象检测和 6D 姿势估计的形状聚焦纹理不可知表示 | Peter Hönig, Stefan Thalhammer, Jean-Baptiste Weibel, Matthias Hirschmanner, Markus Vincze | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Advancing Anomaly Detection: An Adaptation Model and a New Dataset | 推进异常检测:适应模型和新数据集 | Liyun Zhu, Arjun Raj, Lei Wang | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | SARI: Simplistic Average and Robust Identification based Noisy Partial Label Learning | SARI:基于噪声部分标签学习的简单平均和鲁棒识别 | Darshana Saravanan, Naresh Manwani, Vineet Gandhi | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Color Recognition in Challenging Lighting Environments: CNN Approach | 具有挑战性的照明环境中的颜色识别:CNN 方法 | Nizamuddin Maitlo, Nooruddin Noonari, Sajid Ahmed Ghanghro, Sathishkumar Duraisamy, Fayaz Ahmed | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Boundary-aware Contrastive Learning for Semi-supervised Nuclei Instance Segmentation | 半监督核实例分割的边界感知对比学习 | Ye Zhang, Ziyue Wang, Yifeng Wang, Hao Bian, Linghan Cai, Hengrui Li, Lingbo Zhang, Yongbing Zhang | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Adversarial Robustness Through Artifact Design | 通过工件设计实现对抗鲁棒性 | Tsufit Shua, Mahmood Sharif | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | GSN: Generalisable Segmentation in Neural Radiance Field | GSN:神经辐射领域的通用分割 | Vinayak Gupta, Rahul Goel, Sirikonda Dhawal, P. J. Narayanan | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors | LLM 与 VLM 相遇:利用细粒度描述符增强开放词汇对象检测 | Sheng Jin, Xueying Jiang, Jiaxing Huang, Lewei Lu, Shijian Lu | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Multi-Scale Semantic Segmentation with Modified MBConv Blocks | 使用修改的 MBConv 块进行多尺度语义分割 | Xi Chen, Yang Cai, Yuan Wu, Bo Xiong, Taesung Park | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Meet JEANIE: a Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment | 认识 JEANIE:通过时间视点对齐进行 3D 骨架序列的相似性测量 | Lei Wang, Jun Liu, Liang Zheng, Tom Gedeon, Piotr Koniusz | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Towards Improved Imbalance Robustness in Continual Multi-Label Learning with Dual Output Spiking Architecture (DOSA) | 利用双输出尖峰架构 (DOSA) 提高持续多标签学习中的不平衡鲁棒性 | Sourav Mishra, Shirin Dora, Suresh Sundaram | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Sparse Anatomical Prompt Semi-Supervised Learning with Masked Image Modeling for CBCT Tooth Segmentation | 用于 CBCT 牙齿分割的带有掩模图像建模的稀疏解剖提示半监督学习 | Pengyu Dai, Yafei Ou, Yang Liu, Yue Zhao | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Attention Guided CAM: Visual Explanations of Vision Transformer Guided by Self-Attention | Attention Guided CAM:自注意力引导的 Vision Transformer 的视觉解释 | Saebom Leem, Hyunseok Seo | arxiv.org/pdf/2402.04… | link |
| 2024-02-07 | FM-Fusion: Instance-aware Semantic Mapping Boosted by Vision-Language Foundation Models | FM-Fusion:视觉语言基础模型推动的实例感知语义映射 | Chuhao Liu, Ke Wang, Jieqi Shi, Zhijian Qiao, Shaojie Shen | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision | BioDrone:基于仿生无人机的单目标跟踪基准,用于鲁棒视觉 | Xin Zhao, Shiyu Hu, Yipei Wang, Jing Zhang, Yimin Hu, Rongshuai Liu, Haibin Ling, Yin Li, Renshu Li, Kun Liu, et al. | arxiv.org/pdf/2402.04… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-07 | A Psychological Study: Importance of Contrast and Luminance in Color to Grayscale Mapping | 心理学研究:颜色对比度和亮度对灰度映射的重要性 | Prasoon Ambalathankandy, Yafei Ou, Sae Kaneko, Masayuki Ikebe | arxiv.org/pdf/2402.04… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-07 | Dual-disentangled Deep Multiple Clustering | 双解纠缠深度多重聚类 | Jiawei Yao, Juhua Hu | arxiv.org/pdf/2402.05… | link |
| 2024-02-07 | Image captioning for Brazilian Portuguese using GRIT model | 使用 GRIT 模型为巴西葡萄牙语制作图像字幕 | Rafael Silva de Alencar, William Alberto Cruz Castañeda, Marcellus Amadeus | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | Dual-Path Coupled Image Deraining Network via Spatial-Frequency Interaction | 通过空间频率交互的双路径耦合图像去雨网络 | Yuhong He, Aiwen Jiang, Lingfang Jiang, Zhifeng Wang, Lu Wang | arxiv.org/pdf/2402.04… | link |
| 2024-02-09 | Spiking-PhysFormer: Camera-Based Remote Photoplethysmography with Parallel Spike-driven Transformer | Spiking-PhysFormer:基于相机的远程光电体积描记法,具有并行尖峰驱动变压器 | Mingxuan Liu, Jiankai Tang, Haoxiang Li, Jiahao Qi, Siwei Li, Kegang Wang, Yuntao Wang, Hong Chen | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Robot Interaction Behavior Generation based on Social Motion Forecasting for Human-Robot Interaction | 基于人机交互社会运动预测的机器人交互行为生成 | Esteve Valls Mascaro, Yashuai Yan, Dongheui Lee | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Troublemaker Learning for Low-Light Image Enhancement | 低光图像增强的麻烦制造者学习 | Yinghao Song, Zhiyuan Cao, Wanhong Xiang, Sifan Long, Bo Yang, Hongwei Ge, Yanchun Liang, Chunguo Wu | arxiv.org/pdf/2402.04… | link |
| 2024-02-07 | Progressive Conservative Adaptation for Evolving Target Domains | 针对不断变化的目标领域的渐进保守适应 | Gangming Zhao, Chaoqi Chen, Wenhao He, Chengwei Pan, Chaowei Fang, Jinpeng Li, Xilin Chen, Yizhou Yu | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | DMAT: A Dynamic Mask-Aware Transformer for Human De-occlusion | DMAT:用于人体去遮挡的动态掩模感知变压器 | Guoqiang Liang, Jiahao Hu, Qingyue Wang, Shizhou Zhang | arxiv.org/pdf/2402.04… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-07 | V2VSSC: A 3D Semantic Scene Completion Benchmark for Perception with Vehicle to Vehicle Communication | V2VSSC:车对车通信感知的 3D 语义场景完成基准 | Yuanfang Zhang, Junxuan Li, Kaiqing Luo, Yiying Yang, Jiayi Han, Nian Liu, Denghui Qin, Peng Han, Chengpei Xu | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | A Review on Digital Pixel Sensors | 数字像素传感器综述 | Md Rahatul Islam Udoy, Shamiul Alam, Md Mazharul Islam, Akhilesh Jaiswal, Ahmedullah Aziz | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | MIRT: a simultaneous reconstruction and affine motion compensation technique for four dimensional computed tomography (4DCT) | MIRT:四维计算机断层扫描 (4DCT) 的同时重建和仿射运动补偿技术 | Anh-Tuan Nguyen, Jens Renders, Domenico Iuso, Yves Maris, Jeroen Soete, Martine Wevers, Jan Sijbers, Jan De Beenhouwer | arxiv.org/pdf/2402.04… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-07 | RAGE for the Machine: Image Compression with Low-Cost Random Access for Embedded Applications | RAGE for the Machine:针对嵌入式应用的低成本随机访问图像压缩 | Christian D. Rask, Daniel E. Lucani | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | Physics Informed and Data Driven Simulation of Underwater Images via Residual Learning | 通过残差学习对水下图像进行物理知情和数据驱动的模拟 | Tanmoy Mondal, Ricardo Mendoza, Lucas Drumetz | arxiv.org/pdf/2402.05… | link |
| 2024-02-07 | A Survey on Domain Generalization for Medical Image Analysis | 医学图像分析领域泛化综述 | Ziwei Niu, Shuyi Ouyang, Shiao Xie, Yen-wei Chen, Lanfen Lin | arxiv.org/pdf/2402.05… | null |
| 2024-02-07 | 4-Dimensional deformation part model for pose estimation using Kalman filter constraints | 使用卡尔曼滤波器约束进行位姿估计的 4 维变形零件模型 | Enrique Martinez-Berti, Antonio-Jose Sanchez-Salmeron, Carlos Ricolfe-Viala | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | Data-efficient Large Vision Models through Sequential Autoregression | 通过顺序自回归实现数据高效的大视觉模型 | Jianyuan Guo, Zhiwei Hao, Chengcheng Wang, Yehui Tang, Han Wu, Han Hu, Kai Han, Chang Xu | arxiv.org/pdf/2402.04… | link |
| 2024-02-07 | AINS: Affordable Indoor Navigation Solution via Line Color Identification Using Mono-Camera for Autonomous Vehicles | AINS:通过使用单摄像头进行线条颜色识别的经济实惠的室内导航解决方案,适用于自动驾驶汽车 | Nizamuddin Maitlo, Nooruddin Noonari, Kaleem Arshid, Naveed Ahmed, Sathishkumar Duraisamy | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | The Influence of Autofocus Lenses in the Camera Calibration Process | 自动对焦镜头对相机标定过程的影响 | Carlos Ricolfe-Viala, Alicia Esparza | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | An Over Complete Deep Learning Method for Inverse Problems | 一种逆问题的超完备深度学习方法 | Moshe Eliasof, Eldad Haber, Eran Treister | arxiv.org/pdf/2402.04… | null |
| 2024-02-07 | BEBLID: Boosted efficient binary local image descriptor | BEBLID:提升高效的二进制局部图像描述符 | Iago Suárez, Ghesn Sfeir, José M. Buenaposada, Luis Baumela | arxiv.org/pdf/2402.04… | link |