[Share][Daily Update][2024.02.27][CV_arxiv_papers]


[UPDATED!] 2024-02-27 (Publish Time)

Model Compression/Optimization

Publish Date | Title | Title_CN | Authors | PDF | Code
2024-02-27 | ShapeLLM: Universal 3D Object Understanding for Embodied Interaction | ShapeLLM:用于实体交互的通用 3D 对象理解 | Zekun Qi, Runpei Dong, Shaochen Zhang, Haoran Geng, Chunrui Han, Zheng Ge, Li Yi, Kaisheng Ma | arxiv.org/pdf/2402.17… | null
2024-02-27 | Adaptive quantization with mixed-precision based on low-cost proxy | 基于低成本代理的混合精度自适应量化 | Junzhe Chen, Qiao Yang, Senmao Tian, Shunli Zhang | arxiv.org/pdf/2402.17… | null
2024-02-27 | MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning | MCF-VC:减轻多模态视频字幕的类增量学习中的灾难性遗忘 | Huiyu Xiong, Lanxiao Wang, Heqian Qiu, Taijin Zhao, Benliu Qiu, Hongliang Li | arxiv.org/pdf/2402.17… | null
2024-02-27 | Neural Video Compression with Feature Modulation | 具有特征调制的神经视频压缩 | Jiahao Li, Bin Li, Yan Lu | arxiv.org/pdf/2402.17… | null
2024-02-27 | Towards Robust and Efficient Cloud-Edge Elastic Model Adaptation via Selective Entropy Distillation | 通过选择性熵蒸馏实现稳健高效的云边缘弹性模型适应 | Yaofo Chen, Shuaicheng Niu, Shoukai Xu, Hengjie Song, Yaowei Wang, Mingkui Tan | arxiv.org/pdf/2402.17… | null
2024-02-27 | LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment | LiveHPS:自由环境下基于激光雷达的场景级人体姿态和形状估计 | Yiming Ren, Xiao Han, Chengfeng Zhao, Jingya Wang, Lan Xu, Jingyi Yu, Yuexin Ma | arxiv.org/pdf/2402.17… | null
2024-02-27 | Structural Teacher-Student Normality Learning for Multi-Class Anomaly Detection and Localization | 用于多类异常检测和定位的结构性师生正态学习 | Hanqiu Deng, Xingyu Li | arxiv.org/pdf/2402.17… | null

Classification/Detection/Recognition/Segmentation/...

Publish Date | Title | Title_CN | Authors | PDF | Code
2024-02-27 | ADL4D: Towards A Contextually Rich Dataset for 4D Activities of Daily Living | ADL4D:为日常生活的 4D 活动建立上下文丰富的数据集 | Marsil Zakour, Partha Pratim Nath, Ludwig Lohmer, Emre Faik Gökçe, Martin Piccolrovazzi, Constantin Patsch, Yuankai Wu, Rahul Chaudhari, Eckehard Steinbach | arxiv.org/pdf/2402.17… | null
2024-02-27 | VRP-SAM: SAM with Visual Reference Prompt | VRP-SAM:带有视觉参考提示的 SAM | Yanpeng Sun, Jiahui Chen, Shan Zhang, Xinyu Zhang, Qiang Chen, Gang Zhang, Errui Ding, Jingdong Wang, Zechao Li | arxiv.org/pdf/2402.17… | null
2024-02-27 | MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation | MedContext:学习上下文线索以实现高效的体积医学分割 | Hanan Gani, Muzammal Naseer, Fahad Khan, Salman Khan | arxiv.org/pdf/2402.17… | null
2024-02-27 | SDF2Net: Shallow to Deep Feature Fusion Network for PolSAR Image Classification | SDF2Net:用于 PolSAR 图像分类的浅层到深层特征融合网络 | Mohammed Q. Alkhatib, M. Sami Zitouni, Mina Al-Saad, Nour Aburaed, Hussain Al-Ahmad | arxiv.org/pdf/2402.17… | null
2024-02-27 | Mitigating Distributional Shift in Semantic Segmentation via Uncertainty Estimation from Unlabelled Data | 通过未标记数据的不确定性估计来减轻语义分割中的分布变化 | David S. W. Williams, Daniele De Martini, Matthew Gadd, Paul Newman | arxiv.org/pdf/2402.17… | null
2024-02-27 | Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling | Masked Gamma-SSL:通过掩模图像建模学习不确定性估计 | David S. W. Williams, Matthew Gadd, Paul Newman, Daniele De Martini | arxiv.org/pdf/2402.17… | null
2024-02-27 | Adapt Before Comparison: A New Perspective on Cross-Domain Few-Shot Segmentation | 先适应后比较:跨域少样本分割的新视角 | Jonas Herzog | arxiv.org/pdf/2402.17… | null
2024-02-27 | A Large-scale Evaluation of Pretraining Paradigms for the Detection of Defects in Electroluminescence Solar Cell Images | 用于检测电致发光太阳能电池图像缺陷的预训练范式的大规模评估 | David Torpey, Lawrence Pratt, Richard Klein | arxiv.org/pdf/2402.17… | null
2024-02-27 | Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation with Its Class Label | Scribble 隐藏类:利用其类标签促进基于 Scribble 的弱监督语义分割 | Xinliang Zhang, Lei Zhu, Hangzhou He, Lujia Jin, Yanye Lu | arxiv.org/pdf/2402.17… | null
2024-02-27 | Robust Unsupervised Crowd Counting and Localization with Adaptive Resolution SAM | 具有自适应分辨率 SAM 的稳健无监督人群计数和定位 | Jia Wan, Qiangqiang Wu, Wei Lin, Antoni B. Chan | arxiv.org/pdf/2402.17… | null
2024-02-27 | FedLPPA: Learning Personalized Prompt and Aggregation for Federated Weakly-supervised Medical Image Segmentation | FedLPPA:学习联合弱监督医学图像分割的个性化提示和聚合 | Li Lin, Yixiang Liu, Jiewei Wu, Pujin Cheng, Zhiyuan Cai, Kenneth K. Y. Wong, Xiaoying Tang | arxiv.org/pdf/2402.17… | null
2024-02-27 | Automated Classification of Phonetic Segments in Child Speech Using Raw Ultrasound Imaging | 使用原始超声成像对儿童语音中的语音片段进行自动分类 | Saja Al Ani, Joanne Cleland, Ahmed Zoha | arxiv.org/pdf/2402.17… | null
2024-02-27 | Model X-ray:Detect Backdoored Models via Decision Boundary | 模型 X 射线:通过决策边界检测后门模型 | Yanghao Su, Jie Zhang, Ting Xu, Tianwei Zhang, Weiming Zhang, Nenghai Yu | arxiv.org/pdf/2402.17… | null
2024-02-27 | Segment anything model for head and neck tumor segmentation with CT, PET and MRI multi-modality images | 使用 CT、PET 和 MRI 多模态图像分割任何头颈部肿瘤模型 | Jintao Ren, Mathis Rasmussen, Jasper Nijkamp, Jesper Grau Eriksen, Stine Korreman | arxiv.org/pdf/2402.17… | null
2024-02-27 | ViTaL: An Advanced Framework for Automated Plant Disease Identification in Leaf Images Using Vision Transformers and Linear Projection For Feature Reduction | ViTaL:使用视觉变换器和线性投影进行特征缩减的叶子图像中自动植物病害识别的高级框架 | Abhishek Sebastian, Annis Fathima A, Pragna R, Madhan Kumar S, Yaswanth Kannan G, Vinay Murali | arxiv.org/pdf/2402.17… | null
2024-02-27 | PANDAS: Prototype-based Novel Class Discovery and Detection | PANDAS:基于原型的新类发现和检测 | Tyler L. Hayes, César R. de Souza, Namil Kim, Jiwon Kim, Riccardo Volpi, Diane Larlus | arxiv.org/pdf/2402.17… | null
2024-02-27 | CARZero: Cross-Attention Alignment for Radiology Zero-Shot Classification | CARZero:放射学零样本分类的交叉注意力对齐 | Haoran Lai, Qingsong Yao, Zihang Jiang, Rongsheng Wang, Zhiyang He, Xiaodong Tao, S. Kevin Zhou | arxiv.org/pdf/2402.17… | null
2024-02-27 | An Efficient MLP-based Point-guided Segmentation Network for Ore Images with Ambiguous Boundary | 一种基于MLP的高效的边界模糊矿石图像点引导分割网络 | Guodong Sun, Yuting Peng, Le Cheng, Mengya Xu, An Wang, Bo Wu, Hongliang Ren, Yang Zhang | arxiv.org/pdf/2402.17… | null
2024-02-27 | SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection | SDDGR:用于类增量对象检测的稳定的基于扩散的深度生成重放 | Junsu Kim, Hoseong Cho, Jihyeon Kim, Yihalem Yimolal Tiruneh, Seungryul Baek | arxiv.org/pdf/2402.17… | null
2024-02-27 | A Vanilla Multi-Task Framework for Dense Visual Prediction Solution to 1st VCL Challenge -- Multi-Task Robustness Track | 用于第一届 VCL 挑战赛密集视觉预测解决方案的普通多任务框架——多任务鲁棒性赛道 | Zehui Chen, Qiuchen Wang, Zhenyu Li, Jiaming Liu, Shanghang Zhang, Feng Zhao | arxiv.org/pdf/2402.17… | null
2024-02-27 | Scaling Supervised Local Learning with Augmented Auxiliary Networks | 使用增强辅助网络扩展监督本地学习 | Chenxiang Ma, Jibin Wu, Chenyang Si, Kay Chen Tan | arxiv.org/pdf/2402.17… | null
2024-02-27 | How we won BraTS 2023 Adult Glioma challenge? Just faking it! Enhanced Synthetic Data Augmentation and Model Ensemble for brain tumour segmentation | 我们如何赢得 BraTS 2023 成人胶质瘤挑战赛?只是假装而已!用于脑肿瘤分割的增强型合成数据增强和模型集成 | André Ferreira, Naida Solak, Jianning Li, Philipp Dammann, Jens Kleesiek, Victor Alves, Jan Egger | arxiv.org/pdf/2402.17… | null
2024-02-27 | Explicit Interaction for Fusion-Based Place Recognition | 基于融合的地点识别的显式交互 | Jingyi Xu, Junyi Ma, Qi Wu, Zijie Zhou, Yue Wang, Xieyuanli Chen, Ling Pei | arxiv.org/pdf/2402.17… | null
2024-02-27 | Deep Learning-Based Speech and Vision Synthesis to Improve Phishing Attack Detection through a Multi-layer Adaptive Framework | 基于深度学习的语音和视觉合成通过多层自适应框架改进网络钓鱼攻击检测 | Tosin Ige, Christopher Kiekintveld, Aritran Piplai | arxiv.org/pdf/2402.17… | null
2024-02-27 | SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging | SDR-Former:使用 3D 多相成像进行肝脏病变分类的连体双分辨率变压器 | Meng Lou, Hanning Ying, Xiaoqing Liu, Hong-Yu Zhou, Yuqing Zhang, Yizhou Yu | arxiv.org/pdf/2402.17… | null
2024-02-27 | Preserving Fairness Generalization in Deepfake Detection | 在 Deepfake 检测中保持公平泛化 | Li Lin, Xinan He, Yan Ju, Xin Wang, Feng Ding, Shu Hu | arxiv.org/pdf/2402.17… | null
2024-02-27 | Deployment Prior Injection for Run-time Calibratable Object Detection | 用于运行时可校准对象检测的部署预注入 | Mo Zhou, Yiding Yang, Haoxiang Li, Vishal M. Patel, Gang Hua | arxiv.org/pdf/2402.17… | null
2024-02-27 | PE-MVCNet: Multi-view and Cross-modal Fusion Network for Pulmonary Embolism Prediction | PE-MVCNet:用于肺栓塞预测的多视图和跨模态融合网络 | Zhaoxin Guo, Zhipeng Wang, Ruiquan Ge, Jianxun Yu, Feiwei Qin, Yuan Tian, Yuqing Peng, Yonghong Li, Changmiao Wang | arxiv.org/pdf/2402.17… | null
2024-02-27 | Lane2Seq: Towards Unified Lane Detection via Sequence Generation | Lane2Seq:通过序列生成实现统一车道检测 | Kunyang Zhou | arxiv.org/pdf/2402.17… | null
2024-02-27 | Few-shot adaptation for morphology-independent cell instance segmentation | 与形态无关的细胞实例分割的少样本适应 | Ram J. Zaveri, Voke Brume, Gianfranco Doretto | arxiv.org/pdf/2402.17… | null
2024-02-27 | NocPlace: Nocturnal Visual Place Recognition Using Generative and Inherited Knowledge Transfer | NocPlace:使用生成和遗传知识转移进行夜间视觉地点识别 | Bingxi Liu, Yiqun Wang, Huaqi Tao, Tingjun Huang, Fulin Tang, Yihong Wu, Jinqiang Cui, Hong Zhang | arxiv.org/pdf/2402.17… | null
2024-02-27 | Efficiently Leveraging Linguistic Priors for Scene Text Spotting | 有效利用语言先验进行场景文本识别 | Nguyen Nguyen, Yapeng Tian, Chenliang Xu | arxiv.org/pdf/2402.17… | null
2024-02-27 | SAM-DiffSR: Structure-Modulated Diffusion Model for Image Super-Resolution | SAM-DiffSR:用于图像超分辨率的结构调制扩散模型 | Chengcheng Wang, Zhiwei Hao, Yehui Tang, Jianyuan Guo, Yujie Yang, Kai Han, Yunhe Wang | arxiv.org/pdf/2402.17… | link
2024-02-27 | OSCaR: Object State Captioning and State Change Representation | OSCaR:对象状态描述和状态变化表示 | Nguyen Nguyen, Jing Bi, Ali Vosoughi, Yapeng Tian, Pooyan Fazli, Chenliang Xu | arxiv.org/pdf/2402.17… | null
2024-02-27 | In Defense and Revival of Bayesian Filtering for Thermal Infrared Object Tracking | 热红外物体跟踪贝叶斯过滤的防御和复兴 | Peng Gao, Shi-Min Li, Feng Gao, Fei Wang, Ru-Yue Yuan, Hamido Fujita | arxiv.org/pdf/2402.17… | null

OCR

Publish Date | Title | Title_CN | Authors | PDF | Code
2024-02-27 | Advancing Generative Model Evaluation: A Novel Algorithm for Realistic Image Synthesis and Comparison in OCR System | 推进生成模型评估:一种用于 OCR 系统中真实图像合成和比较的新算法 | Majid Memari, Khaled R. Ahmed, Shahram Rahimi, Noorbakhsh Amiri Golilarz | arxiv.org/pdf/2402.17… | null

Transformer

Publish Date | Title | Title_CN | Authors | PDF | Code
2024-02-27 | CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention | CAD-SIGNet:使用分层草图实例引导注意力从点云进行 CAD 语言推理 | Mohammad Sadil Khan, Elona Dupont, Sk Aziz Ali, Kseniya Cherenkova, Anis Kacem, Djamila Aouada | arxiv.org/pdf/2402.17… | null
2024-02-27 | Structure-Guided Adversarial Training of Diffusion Models | 结构引导的扩散模型对抗训练 | Ling Yang, Haotian Qian, Zhilong Zhang, Jingwei Liu, Bin Cui | arxiv.org/pdf/2402.17… | null
2024-02-27 | An Empirical Study of the Generalization Ability of Lidar 3D Object Detectors to Unseen Domains | 激光雷达 3D 物体探测器对不可见领域的泛化能力的实证研究 | George Eskandar, Chongzhe Zhang, Abhishek Kaushik, Karim Guirguis, Mohamed Sayed, Bin Yang | arxiv.org/pdf/2402.17… | null
2024-02-27 | Diffusion Model-Based Image Editing: A Survey | 基于扩散模型的图像编辑:调查 | Yi Huang, Jiancheng Huang, Yifan Liu, Mingfu Yan, Jiaxi Lv, Jianzhuang Liu, Wei Xiong, He Zhang, Shifeng Chen, Liangliang Cao | arxiv.org/pdf/2402.17… | null
2024-02-27 | Interactive Multi-Head Self-Attention with Linear Complexity | 具有线性复杂度的交互式多头自注意力 | Hankyul Kang, Ming-Hsuan Yang, Jongbin Ryu | arxiv.org/pdf/2402.17… | null
2024-02-27 | Bit Rate Matching Algorithm Optimization in JPEG-AI Verification Model | JPEG-AI验证模型中的比特率匹配算法优化 | Panqi Jia, A. Burakhan Koyuncu, Jue Mao, Ze Cui, Yi Ma, Tiansheng Guo, Timofey Solovyev, Alexander Karabutov, Yin Zhao, Jing Wang, et al. | arxiv.org/pdf/2402.17… | null
2024-02-27 | Leveraging Enhanced Queries of Point Sets for Vectorized Map Construction | 利用增强的点集查询进行矢量化地图构建 | Zihao Liu, Xiaoyu Zhang, Guangwei Liu, Ji Zhao, Ningyi Xu | arxiv.org/pdf/2402.17… | null
2024-02-27 | LSPT: Long-term Spatial Prompt Tuning for Visual Representation Learning | LSPT:视觉表示学习的长期空间提示调整 | Shentong Mo, Yansen Wang, Xufang Luo, Dongsheng Li | arxiv.org/pdf/2402.17… | null
2024-02-27 | CAPT: Category-level Articulation Estimation from a Single Point Cloud Using Transformer | CAPT:使用 Transformer 从单点云进行类别级清晰度估计 | Lian Fu, Ryoichi Ishikawa, Yoshihiro Sato, Takeshi Oishi | arxiv.org/pdf/2402.17… | null
2024-02-27 | Image-Text Matching with Multi-View Attention | 具有多视图注意力的图像文本匹配 | Rui Cheng, Wanqing Cui | arxiv.org/pdf/2402.17… | null
2024-02-27 | Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology | 特征重新嵌入:迈向计算病理学的基础模型级性能 | Wenhao Tang, Fengtao Zhou, Sheng Huang, Xiang Zhu, Yi Zhang, Bo Liu | arxiv.org/pdf/2402.17… | null
2024-02-27 | CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization | CharacterGen:通过多视图姿势规范化从单张图像高效生成 3D 角色 | Hao-Yang Peng, Jia-Peng Zhang, Meng-Hao Guo, Yan-Pei Cao, Shi-Min Hu | arxiv.org/pdf/2402.17… | null
2024-02-27 | CharNeRF: 3D Character Generation from Concept Art | CharNeRF:从概念艺术生成 3D 角色 | Eddy Chu, Yiyang Chen, Chedy Raissi, Anand Bhojan | arxiv.org/pdf/2402.17… | null

3D/CG

Publish Date | Title | Title_CN | Authors | PDF | Code
2024-02-27 | LoDIP: Low light phase retrieval with deep image prior | LoDIP:具有深度图像先验的低光相位检索 | Raunak Manekar, Elisa Negrini, Minh Pham, Daniel Jacobs, Jaideep Srivastava | arxiv.org/pdf/2402.17… | null
2024-02-27 | Analyzing Regional Organization of the Human Hippocampus in 3D-PLI Using Contrastive Learning and Geometric Unfolding | 使用对比学习和几何展开在 3D-PLI 中分析人类海马的区域组织 | Alexander Oberstrass, Jordan DeKraker, Nicola Palomero-Gallagher, Sascha E. A. Muenzing, Alan C. Evans, Markus Axer, Katrin Amunts, Timo Dickscheid | arxiv.org/pdf/2402.17… | null
2024-02-27 | PHNet: Patch-based Normalization for Portrait Harmonization | PHNet:基于补丁的标准化肖像协调 | Karen Efremyan, Elizaveta Petrova, Evgeny Kaskov, Alexander Kapitanov | arxiv.org/pdf/2402.17… | null
2024-02-27 | AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Point Cloud Analysis | AVS-Net:采用自适应体素大小的点采样进行 3D 点云分析 | Hongcheng Yang, Dingkang Liang, Dingyuan Zhang, Xingyu Jiang, Zhe Liu, Zhikang Zou, Yingying Zhu | arxiv.org/pdf/2402.17… | null
2024-02-27 | EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions | EMO:Emote Portrait Alive - 在弱条件下使用音视频扩散模型生成富有表现力的肖像视频 | Linrui Tian, Qi Wang, Bang Zhang, Liefeng Bo | arxiv.org/pdf/2402.17… | null
2024-02-27 | Generative 3D Part Assembly via Part-Whole-Hierarchy Message Passing | 通过部分-整体-层次结构消息传递生成 3D 零件组装 | Bi'an Du, Xiang Gao, Wei Hu, Renjie Liao | arxiv.org/pdf/2402.17… | null
2024-02-27 | VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction | VastGaussian:用于大型场景重建的 Vast 3D 高斯 | Jiaqi Lin, Zhihao Li, Xiao Tang, Jianzhuang Liu, Shiyong Liu, Jiayue Liu, Yangdi Lu, Xiaofei Wu, Songcen Xu, Youliang Yan, et al. | arxiv.org/pdf/2402.17… | null
2024-02-27 | DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Model | DiffuseKronA:个性化扩散模型的参数高效微调方法 | Shyam Marjit, Harshit Singh, Nityanand Mathur, Sayak Paul, Chia-Mu Yu, Pin-Yu Chen | arxiv.org/pdf/2402.17… | null
2024-02-27 | Sora Generates Videos with Stunning Geometrical Consistency | Sora 生成具有令人惊叹的几何一致性的视频 | Xuanyi Li, Daquan Zhou, Chenxu Zhang, Shaodong Wei, Qibin Hou, Ming-Ming Cheng | arxiv.org/pdf/2402.17… | null
2024-02-27 | Coupled Laplacian Eigenmaps for Locally-Aware 3D Rigid Point Cloud Matching | 用于局部感知 3D 刚性点云匹配的耦合拉普拉斯特征图 | Matteo Bastico, Etienne Decencière, Laurent Corté, Yannick Tillier, David Ryckelynck | arxiv.org/pdf/2402.17… | null
2024-02-27 | Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis | 学习动态四面体以实现高质量的头部说话合成 | Zicheng Zhang, Ruobing Zheng, Ziwen Liu, Congying Han, Tianqi Li, Meng Wang, Tiande Guo, Jingdong Chen, Bonan Li, Ming Yang | arxiv.org/pdf/2402.17… | null
2024-02-27 | ICP-Flow: LiDAR Scene Flow Estimation with ICP | ICP-Flow:利用 ICP 进行 LiDAR 场景流量估计 | Yancong Lin, Holger Caesar | arxiv.org/pdf/2402.17… | null
2024-02-27 | Denoising Diffusion Models for Inpainting of Healthy Brain Tissue | 用于修复健康脑组织的去噪扩散模型 | Alicia Durrer, Philippe C. Cattin, Julia Wolleb | arxiv.org/pdf/2402.17… | null
2024-02-27 | DivAvatar: Diverse 3D Avatar Generation with a Single Prompt | DivAvatar:通过单一提示生成多样化的 3D 头像 | Weijing Tao, Biwen Lei, Kunhao Liu, Shijian Lu, Miaomiao Cui, Xuansong Xie, Chunyan Miao | arxiv.org/pdf/2402.17… | null
2024-02-27 | Enhancing Hyperspectral Images via Diffusion Model and Group-Autoencoder Super-resolution Network | 通过扩散模型和组自动编码器超分辨率网络增强高光谱图像 | Zhaoyang Wang, Dongyang Li, Mingyang Zhang, Hao Luo, Maoguo Gong | arxiv.org/pdf/2402.17… | null
2024-02-27 | Differentiable Biomechanics Unlocks Opportunities for Markerless Motion Capture | 可微分生物力学为无标记运动捕捉带来机遇 | R. James Cotton | arxiv.org/pdf/2402.17… | null

Learning Paradigms

Publish Date | Title | Title_CN | Authors | PDF | Code
2024-02-27 | Opening Cabinets and Drawers in the Real World using a Commodity Mobile Manipulator | 使用商品移动机械手打开现实世界中的橱柜和抽屉 | Arjun Gupta, Michelle Zhang, Rishik Sathua, Saurabh Gupta | arxiv.org/pdf/2402.17… | null
2024-02-27 | PLReMix: Combating Noisy Labels with Pseudo-Label Relaxed Contrastive Representation Learning | PLReMix:用伪标签松弛对比表示学习对抗噪声标签 | Xiaoyu Liu, Beitong Zhou, Cheng Cheng | arxiv.org/pdf/2402.17… | null
2024-02-27 | Demonstrating and Reducing Shortcuts in Vision-Language Representation Learning | 展示和减少视觉语言表征学习的捷径 | Maurits Bleeker, Mariya Hendriksen, Andrew Yates, Maarten de Rijke | arxiv.org/pdf/2402.17… | null
2024-02-27 | MGE: A Training-Free and Efficient Model Generation and Enhancement Scheme | MGE:免训练的高效模型生成和增强方案 | Xuan Wang, Zeshan Pang, Yuliang Lu, Xuehu Yan | arxiv.org/pdf/2402.17… | null
2024-02-27 | ArcSin: Adaptive ranged cosine Similarity injected noise for Language-Driven Visual Tasks | ArcSin:用于语言驱动视觉任务的自适应范围余弦相似度注入噪声 | Yang Liu, Xiaomin Yu, Gongyu Zhang, Christos Bergeles, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin | arxiv.org/pdf/2402.17… | null
2024-02-27 | Context-based and Diversity-driven Specificity in Compositional Zero-Shot Learning | 组合零样本学习中基于上下文和多样性驱动的特异性 | Yun Li, Zhe Liu, Hang Chen, Lina Yao | arxiv.org/pdf/2402.17… | null

Other

Publish Date | Title | Title_CN | Authors | PDF | Code
2024-02-27 | Diffusion Meets DAgger: Supercharging Eye-in-hand Imitation Learning | Diffusion 遇上 DAgger:增强手眼模仿学习 | Xiaoyu Zhang, Matthew Chang, Pranav Kumar, Saurabh Gupta | arxiv.org/pdf/2402.17… | null
2024-02-27 | Towards Fairness-Aware Adversarial Learning | 迈向公平意识的对抗性学习 | Yanghao Zhang, Tianle Zhang, Ronghui Mu, Xiaowei Huang, Wenjie Ruan | arxiv.org/pdf/2402.17… | link
2024-02-27 | Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners | 视觉和听觉:具有扩散潜在对准器的开放域视音频生成 | Yazhou Xing, Yingqing He, Zeyue Tian, Xintao Wang, Qifeng Chen | arxiv.org/pdf/2402.17… | null
2024-02-27 | Bayesian Differentiable Physics for Cloth Digitalization | 用于布料数字化的贝叶斯微分物理 | Deshan Gong, Ningtao Mao, He Wang | arxiv.org/pdf/2402.17… | null
2024-02-27 | CustomSketching: Sketch Concept Extraction for Sketch-based Image Synthesis and Editing | CustomSketching:基于草图的图像合成和编辑的草图概念提取 | Chufeng Xiao, Hongbo Fu | arxiv.org/pdf/2402.17… | null
2024-02-27 | OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web | OmniACT:为桌面和 Web 启用多模式通才自治代理的数据集和基准 | Raghav Kapoor, Yash Parag Butala, Melisa Russak, Jing Yu Koh, Kiran Kamble, Waseem Alshikh, Ruslan Salakhutdinov | arxiv.org/pdf/2402.17… | null
2024-02-27 | Adapting Learned Image Codecs to Screen Content via Adjustable Transformations | 通过可调整的转换使学习的图像编解码器适应屏幕内容 | H. Burak Dogaroglu, A. Burakhan Koyuncu, Atanas Boev, Elena Alshina, Eckehard Steinbach | arxiv.org/pdf/2402.17… | null
2024-02-27 | Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control | 具有概率扩展控制的多模态学习稀疏检索 | Thong Nguyen, Mariya Hendriksen, Andrew Yates, Maarten de Rijke | arxiv.org/pdf/2402.17… | null
2024-02-27 | Black-box Adversarial Attacks Against Image Quality Assessment Models | 针对图像质量评估模型的黑盒对抗攻击 | Yu Ran, Ao-Xiang Zhang, Mingjie Li, Weixuan Tang, Yuan-Gen Wang | arxiv.org/pdf/2402.17… | null
2024-02-27 | AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis | AlignMiF:用于 LiDAR-相机联合合成的几何对齐多模态隐式场 | Tao Tang, Guangrun Wang, Yixing Lao, Peng Chen, Jie Liu, Liang Lin, Kaicheng Yu, Xiaodan Liang | arxiv.org/pdf/2402.17… | null
2024-02-27 | Bit Distribution Study and Implementation of Spatial Quality Map in the JPEG-AI Standardization | JPEG-AI标准化中空间质量图的比特分布研究与实现 | Panqi Jia, Jue Mao, Esin Koyuncu, A. Burakhan Koyuncu, Timofey Solovyev, Alexander Karabutov, Yin Zhao, Elena Alshina, Andre Kaup | arxiv.org/pdf/2402.17… | null
2024-02-27 | V2C-Long: Longitudinal Cortex Reconstruction with Spatiotemporal Correspondence | V2C-Long:具有时空对应性的纵向皮层重建 | Fabian Bongratz, Jan Fecht, Anne-Marie Rickmann, Christian Wachinger | arxiv.org/pdf/2402.17… | null
2024-02-27 | A novel image space formalism of Fourier domain interpolation neural networks for noise propagation analysis | 用于噪声传播分析的傅里叶域插值神经网络的新颖图像空间形式 | Peter Dawood, Felix Breuer, Istvan Homolya, Jannik Stebani, Maximilian Gram, Peter M. Jakob, Moritz Zaiss, Martin Blaimer | arxiv.org/pdf/2402.17… | null
2024-02-27 | Accelerating Diffusion Sampling with Optimized Time Steps | 通过优化时间步长加速扩散采样 | Shuchen Xue, Zhaoqiang Liu, Fei Chen, Shifeng Zhang, Tianyang Hu, Enze Xie, Zhenguo Li | arxiv.org/pdf/2402.17… | null
2024-02-27 | SocialCVAE: Predicting Pedestrian Trajectory via Interaction Conditioned Latents | SocialCVAE:通过交互条件潜伏预测行人轨迹 | Wei Xiang, Haoteng Yin, He Wang, Xiaogang Jin | arxiv.org/pdf/2402.17… | null
2024-02-27 | Method of Tracking and Analysis of Fluorescent-Labeled Cells Using Automatic Thresholding and Labeling | 使用自动阈值和标记跟踪和分析荧光标记细胞的方法 | Mizuki Fukasawa, Tomokazu Fukuda, Takuya Akashi | arxiv.org/pdf/2402.17… | null
2024-02-27 | Learning Exposure Correction in Dynamic Scenes | 学习动态场景中的曝光校正 | Jin Liu, Bo Wang, Chuanming Wang, Huiyuan Fu, Huadong Ma | arxiv.org/pdf/2402.17… | null
2024-02-27 | An Interpretable Evaluation of Entropy-based Novelty of Generative Models | 基于熵的生成模型新颖性的可解释评估 | Jingwei Zhang, Cheuk Ting Li, Farzan Farnia | arxiv.org/pdf/2402.17… | null
2024-02-27 | One-Shot Structure-Aware Stylized Image Synthesis | 一次性结构感知风格化图像合成 | Hansam Cho, Jonghyun Lee, Seunggyu Chang, Yonghyun Jeong | arxiv.org/pdf/2402.17… | null
2024-02-27 | Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation | Playground v2.5:增强文本到图像生成美学质量的三个见解 | Daiqing Li, Aleks Kamko, Ehsan Akhgari, Ali Sabet, Linmiao Xu, Suhail Doshi | arxiv.org/pdf/2402.17… | null
2024-02-27 | VCD: Knowledge Base Guided Visual Commonsense Discovery in Images | VCD:知识库引导图像中的视觉常识发现 | Xiangqing Shen, Yurun Song, Siwei Wu, Rui Xia | arxiv.org/pdf/2402.17… | null
2024-02-27 | Purified and Unified Steganographic Network | 纯净统一的隐写网络 | Guobiao Li, Sheng Li, Zicong Luo, Zhenxing Qian, Xinpeng Zhang | arxiv.org/pdf/2402.17… | null
2024-02-27 | Enhancing Quality of Compressed Images by Mitigating Enhancement Bias Towards Compression Domain | 通过减轻压缩域的增强偏差来增强压缩图像的质量 | Qunliang Xing, Mai Xu, Shengxi Li, Xin Deng, Meisong Zheng, Huaida Liu, Ying Chen | arxiv.org/pdf/2402.17… | null
2024-02-27 | Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models | Sora:大视觉模型的背景、技术、局限性和机遇回顾 | Yixin Liu, Kai Zhang, Yuan Li, Zhiling Yan, Chujie Gao, Ruoxi Chen, Zhengqing Yuan, Yue Huang, Hanchi Sun, Jianfeng Gao, et al. | arxiv.org/pdf/2402.17… | null
2024-02-27 | Deep Umbra: A Generative Approach for Sunlight Access Computation in Urban Spaces | Deep Umbra:城市空间中阳光照射计算的生成方法 | Kazi Shahrukh Omar, Gustavo Moreira, Daniel Hodczak, Maryam Hosseini, Nicola Colaninno, Marcos Lage, Fabio Miranda | arxiv.org/pdf/2402.17… | null
2024-02-27 | Video as the New Language for Real-World Decision Making | 视频作为现实世界决策的新语言 | Sherry Yang, Jacob Walker, Jack Parker-Holder, Yilun Du, Jake Bruce, Andre Barreto, Pieter Abbeel, Dale Schuurmans | arxiv.org/pdf/2402.17… | null
2024-02-27 | Transparent Image Layer Diffusion using Latent Transparency | 使用潜在透明度的透明图像层扩散 | Lvmin Zhang, Maneesh Agrawala | arxiv.org/pdf/2402.17… | null
2024-02-27 | T-HITL Effectively Addresses Problematic Associations in Image Generation and Maintains Overall Visual Quality | T-HITL 有效解决图像生成中的问题关联并保持整体视觉质量 | Susan Epstein, Li Chen, Alessandro Vecchiato, Ankit Jain | arxiv.org/pdf/2402.17… | null