[Share][Daily Update][2024.02.19][CV_arxiv_papers]


[UPDATED!] 2024-02-19 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-19 | FiT: Flexible Vision Transformer for Diffusion Model | FiT:用于扩散模型的灵活视觉变压器 | Zeyu Lu, Zidong Wang, Di Huang, Chengyue Wu, Xihui Liu, Wanli Ouyang, Lei Bai | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Mixed Gaussian Flow for Diverse Trajectory Prediction | 用于多种轨迹预测的混合高斯流 | Jiahe Chen, Jinkun Cao, Dahua Lin, Kris Kitani, Jiangmiao Pang | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling | AnyGPT:具有离散序列建模的统一多模态法学硕士 | Jun Zhan, Junqi Dai, Jiasheng Ye, Yunhua Zhou, Dong Zhang, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, et.al. | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Adversarial Feature Alignment: Balancing Robustness and Accuracy in Deep Learning via Adversarial Training | 对抗性特征对齐:通过对抗性训练平衡深度学习的鲁棒性和准确性 | Leo Hyun Park, Jaeuk Kim, Myung Gyo Oh, Jaewoo Park, Taekyoung Kwon | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | 3D Vascular Segmentation Supervised by 2D Annotation of Maximum Intensity Projection | 由最大强度投影的 2D 注释监督的 3D 血管分割 | Zhanqiang Guo, Zimeng Tan, Jianjiang Feng, Jie Zhou | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Human Video Translation via Query Warping | 通过查询变形进行人类视频翻译 | Haiming Zhu, Yangyang Xu, Shengfeng He | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Direct Consistency Optimization for Compositional Text-to-Image Personalization | 组合文本到图像个性化的直接一致性优化 | Kyungmin Lee, Sangkyung Kwak, Kihyuk Sohn, Jinwoo Shin | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Privacy-Preserving Low-Rank Adaptation for Latent Diffusion Models | 潜在扩散模型的隐私保护低阶适应 | Zihao Luo, Xilie Xu, Feng Liu, Yun Sing Koh, Di Wang, Jingfeng Zhang | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation | DiLightNet:用于基于扩散的图像生成的细粒度照明控制 | Chong Zeng, Yue Dong, Pieter Peers, Youkang Kong, Hongzhi Wu, Xin Tong | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | One2Avatar: Generative Implicit Head Avatar For Few-shot User Adaptation | One2Avatar:用于小样本用户适应的生成隐式头部头像 | Zhixuan Yu, Ziqian Bai, Abhimitra Meka, Feitong Tan, Qiangeng Xu, Rohit Pandey, Sean Fanello, Hyun Soo Park, Yinda Zhang | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | NOTE: Notable generation Of patient Text summaries through Efficient approach based on direct preference optimization | 注:通过基于直接偏好优化的有效方法生成显着的患者文本摘要 | Imjin Ahn, Hansle Gwon, Young-Hak Kim, Tae Joon Jun, Sanghyun Park | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | ComFusion: Personalized Subject Generation in Multiple Specific Scenes From Single Image | ComFusion:从单个图像在多个特定场景中生成个性化主题 | Yan Hong, Jianfu Zhang | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models | UnlearnCanvas:用于对扩散模型的机器遗忘进行基准测试的程式化图像数据集 | Yihua Zhang, Yimeng Zhang, Yuguang Yao, Jinghan Jia, Jiancheng Liu, Xiaoming Liu, Sijia Liu | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | WildFake: A Large-scale Challenging Dataset for AI-Generated Images Detection | WildFake:用于人工智能生成图像检测的大规模挑战性数据集 | Yan Hong, Jianfu Zhang | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | Statistical Test for Generated Hypotheses by Diffusion Models | 通过扩散模型生成的假设的统计检验 | Teruyuki Katsuoka, Tomohiro Shiraishi, Daiki Miwa, Vo Nguyen Le Duy, Ichiro Takeuchi | arxiv.org/pdf/2402.11… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-19 | Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models | 鲁棒 CLIP:鲁棒大型视觉语言模型的视觉嵌入的无监督对抗性微调 | Christian Schlarmann, Naman Deep Singh, Francesco Croce, Matthias Hein | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning | ChartX 和 ChartVLM:复杂图表推理的多功能基准和基础模型 | Renqiu Xia, Bo Zhang, Hancheng Ye, Xiangchao Yan, Qi Liu, Hongbin Zhou, Zijun Chen, Min Dou, Botian Shi, Junchi Yan, et.al. | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | LVCHAT: Facilitating Long Video Comprehension | LVCHAT:促进长视频理解 | Yu Wang, Zeyuan Zhang, Julian McAuley, Zexue He | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Scaffolding Coordinates to Promote Vision-Language Coordination in Large Multi-Modal Models | 脚手架坐标促进大型多模态模型中的视觉语言协调 | Xuanyu Lei, Zonghan Yang, Xinrui Chen, Peng Li, Yang Liu | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Semantic Textual Similarity Assessment in Chest X-ray Reports Using a Domain-Specific Cosine-Based Metric | 使用特定领域的基于余弦的度量对胸部 X 射线报告进行语义文本相似性评估 | Sayeh Gholipour Picha, Dawood Al Chanti, Alice Caplier | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging Scenarios | 揭开深度:应对挑战性场景的多模态融合框架 | Jialei Xu, Xianming Liu, Junjun Jiang, Kui Jiang, Rui Li, Kai Cheng, Xiangyang Ji | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | MM-SurvNet: Deep Learning-Based Survival Risk Stratification in Breast Cancer Through Multimodal Data Fusion | MM-SurvNet:通过多模态数据融合进行基于深度学习的乳腺癌生存风险分层 | Raktim Kumar Mondol, Ewan K. A. Millar, Arcot Sowmya, Erik Meijering | arxiv.org/pdf/2402.11… | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-19 | Binary Opacity Grids: Capturing Fine Geometric Detail for Mesh-Based View Synthesis | 二元不透明度网格:捕获精细的几何细节以进行基于网格的视图合成 | Christian Reiser, Stephan Garbin, Pratul P. Srinivasan, Dor Verbin, Richard Szeliski, Ben Mildenhall, Jonathan T. Barron, Peter Hedman, Andreas Geiger | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Colorizing Monochromatic Radiance Fields | 对单色辐射场进行着色 | Yean Cheng, Renjie Wan, Shuchen Weng, Chengxuan Zhu, Yakun Chang, Boxin Shi | arxiv.org/pdf/2402.12… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-19 | Interpretable Embedding for Ad-hoc Video Search | 用于临时视频搜索的可解释嵌入 | Jiaxin Wu, Chong-Wah Ngo | arxiv.org/pdf/2402.11… | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-19 | Landmark Stereo Dataset for Landmark Recognition and Moving Node Localization in a Non-GPS Battlefield Environment | 用于非 GPS 战场环境中地标识别和移动节点定位的地标立体数据集 | Ganesh Sapkota, Sanjay Madria | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | UncertaintyTrack: Exploiting Detection and Localization Uncertainty in Multi-Object Tracking | UncertaintyTrack:利用多目标跟踪中的检测和定位不确定性 | Chang Won Lee, Steven L. Waslander | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Zero shot VLMs for hate meme detection: Are we there yet? | 用于仇恨模因检测的零样本 VLM:我们到了吗? | Naquee Rizwan, Paramananda Bhaskar, Mithun Das, Swadhin Satyaprakash Majhi, Punyajoy Saha, Animesh Mukherjee | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Perceiving Longer Sequences With Bi-Directional Cross-Attention Transformers | 使用双向交叉注意力变压器感知更长的序列 | Markus Hiller, Krista A. Ehinger, Tom Drummond | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Towards Explainable LiDAR Point Cloud Semantic Segmentation via Gradient Based Target Localization | 通过基于梯度的目标定位实现可解释的激光雷达点云语义分割 | Abhishek Kuriyal, Vaibhav Kumar | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | ISCUTE: Instance Segmentation of Cables Using Text Embedding | ISCUTE:使用文本嵌入对电缆进行实例分割 | Shir Kozlovsky, Omkar Joglekar, Dotan Di Castro | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | Weakly Supervised Object Detection in Chest X-Rays with Differentiable ROI Proposal Networks and Soft ROI Pooling | 具有可微分 ROI 建议网络和软 ROI 池化的胸部 X 光弱监督对象检测 | Philip Müller, Felix Meissen, Georgios Kaissis, Daniel Rueckert | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | Event-Based Motion Magnification | 基于事件的运动放大 | Yutian Chen, Shi Guo, Fangzheng Yu, Feng Zhang, Jinwei Gu, Tianfan Xue | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | Separating common from salient patterns with Contrastive Representation Learning | 通过对比表征学习区分常见模式和显着模式 | Robin Louiset, Edouard Duchesnay, Antoine Grigis, Pietro Gori | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | Modularized Networks for Few-shot Hateful Meme Detection | 用于少量仇恨模因检测的模块化网络 | Rui Cao, Roy Ka-Wei Lee, Jing Jiang | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | Rock Classification Based on Residual Networks | 基于残差网络的岩石分类 | Sining Zhoubian, Yuyang Wang, Zhihuan Jiang | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | SDGE: Stereo Guided Depth Estimation for 360° Camera Sets | SDGE:360° 相机组的立体引导深度估计 | Jialei Xu, Xianming Liu, Junjun Jiang, Xiangyang Ji | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | FOD-Swin-Net: angular super resolution of fiber orientation distribution using a transformer-based deep model | FOD-Swin-Net:使用基于变压器的深度模型的纤维取向分布的角度超分辨率 | Mateus Oliveira da Silva, Caio Pinheiro Santana, Diedre Santos do Carmo, Letícia Rittner | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | Reinforcement Learning as a Parsimonious Alternative to Prediction Cascades: A Case Study on Image Segmentation | 强化学习作为预测级联的简约替代方案:图像分割的案例研究 | Bharat Srikishan, Anika Tabassum, Srikanth Allu, Ramakrishnan Kannan, Nikhil Muralidhar | arxiv.org/pdf/2402.11… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-19 | Evaluating Image Review Ability of Vision Language Models | 评估视觉语言模型的图像审查能力 | Shigeki Saito, Kazuki Hayashi, Yusuke Ide, Yusuke Sakai, Kazuma Onishi, Toma Suzuki, Seiji Gobara, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | An Endoscopic Chisel: Intraoperative Imaging Carves 3D Anatomical Models | 内窥镜凿子:术中成像雕刻 3D 解剖模型 | Jan Emily Mangulabnan, Roger D. Soberanis-Mukul, Timo Teufel, Manish Sahu, Jose L. Porras, S. Swaroop Vedula, Masaru Ishii, Gregory Hager, Russell H. Taylor, Mathias Unberath | arxiv.org/pdf/2402.11… | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-19 | Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships | Open3DSG:来自点云的开放词汇 3D 场景图,具有可查询对象和开放集关系 | Sebastian Koch, Narunas Vaskevicius, Mirco Colosi, Pedro Hermosilla, Timo Ropinski | arxiv.org/pdf/2402.12… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-19 | A Lightweight Parallel Framework for Blind Image Quality Assessment | 一种轻量级并行盲图像质量评估框架 | Qunyue Huang, Bin Fang | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Surround-View Fisheye Optics in Computer Vision and Simulation: Survey and Challenge | 计算机视觉和仿真中的环视鱼眼光学器件:调查和挑战 | Daniel Jakab, Brian Michael Deegan, Sushil Sharma, Eoin Martino Grua, Jonathan Horgan, Enda Ward, Pepijn Van De Ven, Anthony Scanlan, Ciaran Eising | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | AICAttack: Adversarial Image Captioning Attack with Attention-Based Optimization | AICAtack:基于注意力优化的对抗性图像字幕攻击 | Jiyao Li, Mingze Ni, Yifei Dong, Tianqing Zhu, Wei Liu | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | PhySU-Net: Long Temporal Context Transformer for rPPG with Self-Supervised Pre-training | PhySU-Net:具有自监督预训练的 rPPG 长时态上下文转换器 | Marko Savic, Guoying Zhao | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | Language-guided Image Reflection Separation | 语言引导的图像反射分离 | Haofeng Zhong, Yuchen Hong, Shuchen Weng, Jinxiu Liang, Boxin Shi | arxiv.org/pdf/2402.11… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-19 | Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability | 以容量和可扩展性推动 3D 形状生成的自回归模型 | Xuelin Qian, Yu Wang, Simian Luo, Yinda Zhang, Ying Tai, Zhenyu Zhang, Chengjie Wang, Xiangyang Xue, Bo Zhao, Tiejun Huang, et.al. | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Pan-Mamba: Effective pan-sharpening with State Space Model | Pan-Mamba:使用状态空间模型进行有效的全色锐化 | Xuanhua He, Ke Cao, Keyu Yan, Rui Li, Chengjun Xie, Jie Zhang, Man Zhou | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | A Spatiotemporal Illumination Model for 3D Image Fusion in Optical Coherence Tomography | 光学相干断层扫描中 3D 图像融合的时空照明模型 | Stefan Ploner, Jungeun Won, Julia Schottenhamml, Jessica Girgis, Kenneth Lam, Nadia Waheed, James Fujimoto, Andreas Maier | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Two Online Map Matching Algorithms Based on Analytic Hierarchy Process and Fuzzy Logic | 两种基于层次分析法和模糊逻辑的在线地图匹配算法 | Jeremy J. Lin, Tomoro Mochida, Riley C. W. O'Neill, Atsuro Yoshida, Masashi Yamazaki, Akinobu Sasada | arxiv.org/pdf/2402.11… | null |
| 2024-02-19 | DIO: Dataset of 3D Mesh Models of Indoor Objects for Robotics and Computer Vision Applications | DIO:用于机器人和计算机视觉应用的室内物体 3D 网格模型数据集 | Nillan Nimal, Wenbin Li, Ronald Clark, Sajad Saeedi | arxiv.org/pdf/2402.11… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-19 | Avoiding Feature Suppression in Contrastive Learning: Learning What Has Not Been Learned Before | 避免对比学习中的特征抑制:学习以前没有学过的东西 | Jihai Zhang, Xiang Lan, Xiaoye Qu, Yu Cheng, Mengling Feng, Bryan Hooi | arxiv.org/pdf/2402.11… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-02-19 | Regularization by denoising: Bayesian model and Langevin-within-split Gibbs sampling | 通过去噪进行正则化:贝叶斯模型和 Langevin-within-split Gibbs 采样 | Elhadji C. Faye, Mame Diarra Fall, Nicolas Dobigeon | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models | DriveVLM:自动驾驶和大型视觉语言模型的融合 | Xiaoyu Tian, Junru Gu, Bailin Li, Yicheng Liu, Chenxu Hu, Yang Wang, Kun Zhan, Peng Jia, Xianpeng Lang, Hang Zhao | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Revisiting Data Augmentation in Deep Reinforcement Learning | 重新审视深度强化学习中的数据增强 | Jianshu Hu, Yunpeng Jiang, Paul Weng | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Examining Monitoring System: Detecting Abnormal Behavior In Online Examinations | 考试监控系统:检测在线考试中的异常行为 | Dinh An Ngo, Thanh Dat Nguyen, Thi Le Chi Dang, Huy Hoan Le, Ton Bao Ho, Vo Thanh Khang Nguyen, Truong Thanh Hung Nguyen | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Major TOM: Expandable Datasets for Earth Observation | 主要 TOM:可扩展的地球观测数据集 | Alistair Francis, Mikolaj Czerkawski | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | Robustness and Exploration of Variational and Machine Learning Approaches to Inverse Problems: An Overview | 反问题变分和机器学习方法的鲁棒性和探索:概述 | Alexander Auras, Kanchana Vaishnavi Gandikota, Hannah Droege, Michael Moeller | arxiv.org/pdf/2402.12… | null |
| 2024-02-19 | InMD-X: Large Language Models for Internal Medicine Doctors | InMD-X:内科医生的大型语言模型 | Hansle Gwon, Imjin Ahn, Hyoje Jung, Byeolhee Kim, Young-Hak Kim, Tae Joon Jun | arxiv.org/pdf/2402.11… | null |