[Share][Daily Update][2024.03.27][CV_arxiv_papers]

[UPDATED!] 2024-03-27 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion | ObjectDrop:引导反事实以实现逼真的对象删除和插入 | Daniel Winter, Matan Cohen, Shlomi Fruchter, Yael Pritch, Alex Rav-Acha, Yedid Hoshen | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation | ECoDepth:有效调节单眼深度估计的扩散模型 | Suraj Patni, Aradhye Agarwal, Chetan Arora | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | Object Pose Estimation via the Aggregation of Diffusion Features | 通过扩散特征的聚合进行物体姿态估计 | Tianfu Wang, Guosheng Hu, Hongguang Wang | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object | ImageNet-D:扩散合成对象上神经网络鲁棒性的基准测试 | Chenshuang Zhang, Fei Pan, Junmo Kim, In So Kweon, Chengzhi Mao | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | Semi-Supervised Learning for Deep Causal Generative Models | 深度因果生成模型的半监督学习 | Yasin Ibrahim, Hermione Warr, Konstantinos Kamnitsas | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object Interactions | HandBooster:通过手-物体交互的条件合成和采样来促进 3D 手网格重建 | Hao Xu, Haipeng Li, Yinqiao Wang, Shuaicheng Liu, Chi-Wing Fu | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | Artifact Reduction in 3D and 4D Cone-beam Computed Tomography Images with Deep Learning -- A Review | 利用深度学习减少 3D 和 4D 锥束计算机断层扫描图像中的伪影 - 综述 | Mohammadreza Amirian, Daniel Barco, Ivo Herzig, Frank-Peter Schilling | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection | CosalPure:从组图像中学习概念以实现稳健的协同显着性检测 | Jiayi Zhu, Qing Guo, Felix Juefei-Xu, Yihao Huang, Yang Liu, Geguang Pu | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | CT-3DFlow : Leveraging 3D Normalizing Flows for Unsupervised Detection of Pathological Pulmonary CT scans | CT-3DFlow:利用 3D 标准化流程进行病理性肺部 CT 扫描的无监督检测 | Aissam Djahnine, Alexandre Popoff, Emilien Jupin-Delevaux, Vincent Cottin, Olivier Nempont, Loic Boussel | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | DiffusionFace: Towards a Comprehensive Dataset for Diffusion-Based Face Forgery Analysis | DiffusionFace:面向基于扩散的人脸伪造分析的综合数据集 | Zhongxi Chen, Ke Sun, Ziyin Zhou, Xianming Lin, Xiaoshuai Sun, Liujuan Cao, Rongrong Ji | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | DiffStyler: Diffusion-based Localized Image Style Transfer | DiffStyler:基于扩散的本地化图像风格迁移 | Shaoxu Li | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model | SingularTrajectory:使用扩散模型的通用轨迹预测器 | Inhwan Bae, Young-Jae Park, Hae-Gon Jeon | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | U-Sketch: An Efficient Approach for Sketch to Image Diffusion Models | U-Sketch:草图到图像扩散模型的有效方法 | Ilias Mitsouras, Eleftherios Tsonis, Paraskevi Tzouveli, Athanasios Voulodimos | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | ECNet: Effective Controllable Text-to-Image Diffusion Models | ECNet:有效的可控文本到图像扩散模型 | Sicheng Li, Keqiang Sun, Zhixin Lai, Xiaoshi Wu, Feng Qiu, Haoran Xie, Kazunori Miyata, Hongsheng Li | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Colour and Brush Stroke Pattern Recognition in Abstract Art using Modified Deep Convolutional Generative Adversarial Networks | 使用改进的深度卷积生成对抗网络进行抽象艺术中的颜色和画笔描边图案识别 | Srinitish Srinivasan, Varenya Pathak | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Generative Multi-modal Models are Good Class-Incremental Learners | 生成式多模态模型是良好的课堂增量学习器 | Xusheng Cao, Haori Lu, Linlan Huang, Xialei Liu, Ming-Ming Cheng | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | Ship in Sight: Diffusion Models for Ship-Image Super Resolution | 船舶视线:船舶图像超分辨率的扩散模型 | Luigi Sigillo, Riccardo Fosco Gramaccioni, Alessandro Nicolosi, Danilo Comminiello | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | DODA: Diffusion for Object-detection Domain Adaptation in Agriculture | DODA:农业中目标检测领域适应的扩散 | Shuai Xiang, Pieter M. Blok, James Burridge, Haozhou Wang, Wei Guo | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment | DVLO:具有局部到全局特征融合和双向结构对准的深度视觉激光雷达里程计 | Jiuming Liu, Dong Zhuo, Zhiheng Feng, Siting Zhu, Chensheng Peng, Zhe Liu, Hesheng Wang | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding | 通过分层解码释放 SAM 在医学适应方面的潜力 | Zhiheng Cheng, Qingyue Wei, Hongru Zhu, Yan Wang, Liangqiong Qu, Wei Shao, Yuyin Zhou | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | Enhancing Generative Class Incremental Learning Performance with Model Forgetting Approach | 通过模型遗忘方法提高生成类增量学习性能 | Taro Togo, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation | NeuSDFusion:用于 3D 形状补全、重建和生成的空间感知生成模型 | Ruikai Cui, Weizhe Liu, Weixuan Sun, Senbo Wang, Taizhang Shang, Yang Li, Xibin Song, Han Yan, Zhennan Wu, Shenzhou Chen, et al. | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes | TAFormer:用于空中场景中视频和运动联合预测的统一目标感知变压器 | Liangyu Xu, Wanxuan Lu, Hongfeng Yu, Yongqiang Mao, Hanbo Bi, Chenglong Liu, Xian Sun, Kun Fu | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation | NeuroPictor:通过多个体预训练和多级调制改进 fMRI 到图像重建 | Jingyang Huo, Yikai Wang, Xuelin Qian, Yun Wang, Chong Li, Jianfeng Feng, Yanwei Fu | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Generative Medical Segmentation | 生成医疗分割 | Jiayu Huo, Xi Ouyang, Sébastien Ourselin, Rachel Sparks | arxiv.org/pdf/2403.18… | link |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models | Mini-Gemini:挖掘多模态视觉语言模型的潜力 | Yanwei Li, Yuechen Zhang, Chengyao Wang, Zhisheng Zhong, Yixin Chen, Ruihang Chu, Shaoteng Liu, Jiaya Jia | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding | 通过指令对比解码减轻大视觉语言模型中的幻觉 | Xintong Wang, Jingheng Pan, Liang Ding, Chris Biemann | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Bringing Textual Prompt to AI-Generated Image Quality Assessment | 将文本提示引入人工智能生成的图像质量评估 | Bowen Qu, Haohui Li, Wei Gao | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction | 语言能打败数值回归吗?基于语言的多模态轨迹预测 | Inhwan Bae, Junoh Lee, Hae-Gon Jeon | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | Quantifying and Mitigating Unimodal Biases in Multimodal Large Language Models: A Causal Perspective | 量化和减轻多模态大语言模型中的单模态偏差:因果视角 | Meiqi Chen, Yixin Cao, Yan Zhang, Chaochao Lu | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | H2ASeg: Hierarchical Adaptive Interaction and Weighting Network for Tumor Segmentation in PET/CT Images | H2ASeg:用于 PET/CT 图像中肿瘤分割的分层自适应交互和加权网络 | Jinpeng Lu, Jingyun Chen, Linghan Cai, Songhan Jiang, Yongbing Zhang | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models | 超越嵌入:多模态模型中视觉表的前景 | Yiwu Zhong, Zi-Yuan Hu, Michael R. Lyu, Liwei Wang | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | An Evolutionary Network Architecture Search Framework with Adaptive Multimodal Fusion for Hand Gesture Recognition | 用于手势识别的自适应多模态融合的进化网络架构搜索框架 | Yizhang Xia, Shihao Song, Zhanglu Hou, Junwen Xu, Juan Zou, Yuan Liu, Shengxiang Yang | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Middle Fusion and Multi-Stage, Multi-Form Prompts for Robust RGB-T Tracking | 中间融合和多阶段、多形式提示,实现强大的 RGB-T 跟踪 | Qiming Wang, Yongqiang Bai, Hongxing Song | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Online Embedding Multi-Scale CLIP Features into 3D Maps | 将多尺度 CLIP 特征在线嵌入到 3D 地图中 | Shun Taguchi, Hideki Deguchi | arxiv.org/pdf/2403.18… | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction | Gamba:将高斯泼溅法与 Mamba 结合起来进行单视图 3D 重建 | Qiuhong Shen, Xuanyu Yi, Zike Wu, Pan Zhou, Hanwang Zhang, Shuicheng Yan, Xinchao Wang | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | SAT-NGP : Unleashing Neural Graphics Primitives for Fast Relightable Transient-Free 3D reconstruction from Satellite Imagery | SAT-NGP:释放神经图形基元,从卫星图像进行快速可重新照明的无瞬态 3D 重建 | Camille Billouard, Dawa Derksen, Emmanuelle Sarrazin, Bruno Vallet | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Modeling uncertainty for Gaussian Splatting | 高斯泼溅的不确定性建模 | Luca Savant, Diego Valsesia, Enrico Magli | arxiv.org/pdf/2403.18… | null |

3DGS

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | SplatFace: Gaussian Splat Face Reconstruction Leveraging an Optimizable Surface | SplatFace:利用可优化表面的高斯 Splat 面重建 | Jiahao Luo, Jing Liu, James Davis | arxiv.org/pdf/2403.18… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | Dense Vision Transformer Compression with Few Samples | 少量样本的密集视觉变压器压缩 | Hanxiao Zhang, Yifan Zhou, Guo-Hua Wang, Jianxin Wu | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | I2CKD : Intra- and Inter-Class Knowledge Distillation for Semantic Segmentation | I2CKD:用于语义分割的类内和类间知识蒸馏 | Ayoub Karine, Thibault Napoléon, Maher Jridi | arxiv.org/pdf/2403.18… | null |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | Benchmarking Object Detectors with COCO: A New Path Forward | 使用 COCO 对目标检测器进行基准测试:一条新的前进道路 | Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | ModaLink: Unifying Modalities for Efficient Image-to-PointCloud Place Recognition | ModaLink:统一模式以实现高效的图像到点云位置识别 | Weidong Xie, Lun Luo, Nanfei Ye, Yi Ren, Shaoyi Du, Minhang Wang, Jintao Xu, Rui Ai, Weihao Gu, Xieyuanli Chen | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | Detection of subclinical atherosclerosis by image-based deep learning on chest x-ray | 基于图像的深度学习胸部 X 射线检测亚临床动脉粥样硬化 | Guglielmo Gallone, Francesco Iodice, Alberto Presta, Davide Tore, Ovidio de Filippo, Michele Visciano, Carlo Alberto Barbano, Alessandro Serafini, Paola Gorrini, Alessandro Bruno, et al. | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | A vascular synthetic model for improved aneurysm segmentation and detection via Deep Neural Networks | 通过深度神经网络改进动脉瘤分割和检测的血管合成模型 | Rafic Nader, Florent Autrusseau, Vincent L'Allinec, Romain Bourcier | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Annolid: Annotate, Segment, and Track Anything You Need | Annolid:注释、分段和跟踪您需要的任何内容 | Chen Yang, Thomas A. Cleland | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Addressing Data Annotation Challenges in Multiple Sensors: A Solution for Scania Collected Datasets | 解决多个传感器中的数据注释挑战:斯堪尼亚收集数据集的解决方案 | Ajinkya Khoche, Aron Asefaw, Alejandro Gonzalez, Bogdan Timus, Sina Sharif Mansouri, Patric Jensfelt | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Transformers-based architectures for stroke segmentation: A review | 基于 Transformers 的笔画分割架构:综述 | Yalda Zafari-Ghadim, Essam A. Rashed, Mohamed Mabrok | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Homogeneous Tokenizer Matters: Homogeneous Visual Tokenizer for Remote Sensing Image Understanding | 同质分词器很重要:用于遥感图像理解的同质视觉分词器 | Run Shao, Zhaoyang Zhang, Chao Tao, Yunsheng Zhang, Chengli Peng, Haifeng Li | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | The Impact of Uniform Inputs on Activation Sparsity and Energy-Latency Attacks in Computer Vision | 统一输入对计算机视觉中激活稀疏性和能量延迟攻击的影响 | Andreas Müller, Erwin Quiring | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes | 杂乱场景中高效的热图引导 6 自由度抓取检测 | Siang Chen, Wei Tang, Pengwei Xie, Wenming Yang, Guijin Wang | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | Direct mineral content prediction from drill core images via transfer learning | 通过迁移学习从钻芯图像直接预测矿物含量 | Romana Boiger, Sergey V. Churakov, Ignacio Ballester Llagaria, Georg Kosakowski, Raphael Wüst, Nikolaos I. Prasianakis | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Density-guided Translator Boosts Synthetic-to-Real Unsupervised Domain Adaptive Segmentation of 3D Point Clouds | 密度引导转换器促进 3D 点云的合成到真实无监督域自适应分割 | Zhimin Yuan, Wankang Zeng, Yanfei Su, Weiquan Liu, Ming Cheng, Yulan Guo, Cheng Wang | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Deep Learning Segmentation and Classification of Red Blood Cells Using a Large Multi-Scanner Dataset | 使用大型多扫描仪数据集对红细胞进行深度学习分割和分类 | Mohamed Elmanna, Ahmed Elsafty, Yomna Ahmed, Muhammad Rushdi, Ahmed Morsy | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | A Channel-ensemble Approach: Unbiased and Low-variance Pseudo-labels is Critical for Semi-supervised Classification | 通道集成方法:无偏且低方差的伪标签对于半监督分类至关重要 | Jiaqi Wu, Junbiao Pang, Baochang Zhang, Qingming Huang | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | BAM: Box Abstraction Monitors for Real-time OoD Detection in Object Detection | BAM:用于对象检测中实时 OoD 检测的框抽象监视器 | Changshun Wu, Weicheng He, Chih-Hong Cheng, Xiaowei Huang, Saddek Bensalem | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | ViTAR: Vision Transformer with Any Resolution | ViTAR:任何分辨率的视觉转换器 | Qihang Fan, Quanzeng You, Xiaotian Han, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Generating Diverse Agricultural Data for Vision-Based Farming Applications | 为基于视觉的农业应用生成多样化的农业数据 | Mikolaj Cieslak, Umabharathi Govindarajan, Alejandro Garcia, Anuradha Chandrashekar, Torsten Hädrich, Aleksander Mendoza-Drosik, Dominik L. Michels, Sören Pirk, Chia-Chun Fu, Wojciech Pałubicki | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | A Quantum Fuzzy-based Approach for Real-Time Detection of Solar Coronal Holes | 基于量子模糊的太阳冕洞实时检测方法 | Sanmoy Bandyopadhyay, Suman Kundu | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Learning Inclusion Matching for Animation Paint Bucket Colorization | 学习动画油漆桶着色的包含匹配 | Yuekun Dai, Shangchen Zhou, Qinyue Li, Chongyi Li, Chen Change Loy | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Tracking-Assisted Object Detection with Event Cameras | 使用事件摄像机进行跟踪辅助物体检测 | Ting-Kang Yen, Igor Morawski, Shusil Dangi, Kai He, Chung-Yi Lin, Jia-Fong Yeh, Hung-Ting Su, Winston Hsu | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | PIPNet3D: Interpretable Detection of Alzheimer in MRI Scans | PIPNet3D:MRI 扫描中阿尔茨海默病的可解释检测 | Lisa Anita De Santi, Jörg Schlötterer, Michael Scheschenja, Joel Wessendorf, Meike Nauta, Vincenzo Positano, Christin Seifert | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Uncertainty-Aware SAR ATR: Defending Against Adversarial Attacks via Bayesian Neural Networks | 不确定性感知 SAR ATR:通过贝叶斯神经网络防御对抗性攻击 | Tian Ye, Rajgopal Kannan, Viktor Prasanna, Carl Busart | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Selective Mixup Fine-Tuning for Optimizing Non-Decomposable Objectives | 用于优化不可分解目标的选择性混合微调 | Shrinivas Ramasubramanian, Harsh Rangwani, Sho Takemori, Kunal Samanta, Yuhei Umeda, Venkatesh Babu Radhakrishnan | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Multi-scale Unified Network for Image Classification | 用于图像分类的多尺度统一网络 | Wenzhuo Liu, Fei Zhu, Cheng-Lin Liu | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | SGDM: Static-Guided Dynamic Module Make Stronger Visual Models | SGDM:静态引导动态模块打造更强大的视觉模型 | Wenjie Xing, Zhenchao Cui, Jing Qi | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Benchmarking Image Transformers for Prostate Cancer Detection from Ultrasound Data | 用于根据超声数据检测前列腺癌的图像转换器的基准测试 | Mohamed Harmanani, Paul F. R. Wilson, Fahimeh Fooladgar, Amoon Jamzad, Mahdi Gilany, Minh Nguyen Nhat To, Brian Wodlinger, Purang Abolmaesumi, Parvin Mousavi | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Fourier or Wavelet bases as counterpart self-attention in spikformer for efficient visual classification | 傅里叶或小波基作为 spikformer 中对应的自注意力,以实现高效的视觉分类 | Qingyu Wang, Duzhen Zhang, Tilelin Zhang, Bo Xu | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Road Obstacle Detection based on Unknown Objectness Scores | 基于未知物体分数的道路障碍物检测 | Chihiro Noguchi, Toshiaki Ohgushi, Masao Yamanaka | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Few-shot Online Anomaly Detection and Segmentation | 少样本在线异常检测和分割 | Shenxing Wei, Xing Wei, Zhiheng Ma, Songlin Dong, Shaochen Zhang, Yihong Gong | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Looking Beyond What You See: An Empirical Analysis on Subgroup Intersectional Fairness for Multi-label Chest X-ray Classification Using Social Determinants of Racial Health Inequities | 超越您所看到的:利用种族健康不平等的社会决定因素对多标签胸部 X 射线分类的子组交叉公平性进行实证分析 | Dana Moukheiber, Saurabh Mahindre, Lama Moukheiber, Mira Moukheiber, Mingchen Gao | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Multi-Layer Dense Attention Decoder for Polyp Segmentation | 用于息肉分割的多层密集注意力解码器 | Krushi Patel, Fengjun Li, Guanghui Wang | arxiv.org/pdf/2403.18… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | Towards Image Ambient Lighting Normalization | 迈向图像环境照明标准化 | Florin-Alexandru Vasluianu, Tim Seizinger, Zongwei Wu, Rakesh Ranjan, Radu Timofte | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | $\mathrm{F^2Depth}$: Self-supervised Indoor Monocular Depth Estimation via Optical Flow Consistency and Feature Map Synthesis | $\mathrm{F^2Depth}$:通过光流一致性和特征图合成进行自监督室内单目深度估计 | Xiaotong Guo, Huijie Zhao, Shuwei Shao, Xudong Li, Baochang Zhang | arxiv.org/pdf/2403.18… | null |

LLM

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM | 图像网格值得制作视频:使用 VLM 进行零样本视频问答 | Wonkyun Kim, Changin Choi, Wonseok Lee, Wonjong Rhee | arxiv.org/pdf/2403.18… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | InstructBrush: Learning Attention-based Instruction Optimization for Image Editing | InstructBrush:学习基于注意力的图像编辑指令优化 | Ruoyu Zhao, Qingnan Fan, Fei Kou, Shuai Qin, Hong Gu, Wei Wu, Pengcheng Xu, Mingrui Zhu, Nannan Wang, Xinbo Gao | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Attention Calibration for Disentangled Text-to-Image Personalization | 用于解开文本到图像个性化的注意力校准 | Yanbing Zhang, Mengping Yang, Qin Zhou, Zhe Wang | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | A Semi-supervised Nighttime Dehazing Baseline with Spatial-Frequency Aware and Realistic Brightness Constraint | 具有空间频率感知和现实亮度约束的半监督夜间除雾基线 | Xiaofeng Cong, Jie Gui, Jing Zhang, Junming Hou, Hao Shen | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | HEMIT: H&E to Multiplex-immunohistochemistry Image Translation with Dual-Branch Pix2pix Generator | HEMIT:使用双分支 Pix2pix 生成器将 H&E 转换为多重免疫组织化学图像 | Chang Bian, Beth Philips, Tim Cootes, Martin Fergie | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Learning CNN on ViT: A Hybrid Model to Explicitly Class-specific Boundaries for Domain Adaptation | 在 ViT 上学习 CNN:用于域适应的显式特定类边界的混合模型 | Ba Hung Ngo, Nhat-Tuong Do-Tran, Tuan-Ngoc Nguyen, Hae-Gon Jeon, Tae Jong Choi | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Efficient Test-Time Adaptation of Vision-Language Models | 视觉语言模型的高效测试时适应 | Adilbek Karmanov, Dayan Guan, Shijian Lu, Abdulmotaleb El Saddik, Eric Xing | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Don't Look into the Dark: Latent Codes for Pluralistic Image Inpainting | 不要看黑暗:多元图像修复的潜在代码 | Haiwei Chen, Yajie Zhao | arxiv.org/pdf/2403.18… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark | 真实声场:视听室声学数据集和基准 | Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and Rendering | MetaCap:来自多视图图像的元学习先验,用于稀疏视图人类表现捕获和渲染 | Guoxing Sun, Rishabh Dabral, Pascal Fua, Christian Theobalt, Marc Habermann | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Garment3DGen: 3D Garment Stylization and Texture Generation | Garment3DGen:3D 服装风格化和纹理生成 | Nikolaos Sarafianos, Tuur Stuyck, Xiaoyu Xiang, Yilei Li, Jovan Popovic, Rakesh Ranjan | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment | Duolando:带有非策略强化学习的舞蹈伴奏跟随者 GPT | Li Siyao, Tianpei Gu, Zhitao Yang, Zhengyu Lin, Ziwei Liu, Henghui Ding, Lei Yang, Chen Change Loy | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | ParCo: Part-Coordinating Text-to-Motion Synthesis | ParCo:部分协调文本到动作合成 | Qiran Zou, Shangyuan Yuan, Shian Du, Yu Wang, Chang Liu, Yi Xu, Jie Chen, Xiangyang Ji | arxiv.org/pdf/2403.18… | link |
| 2024-03-27 | Backpropagation-free Network for 3D Test-time Adaptation | 用于 3D 测试时间适应的无反向传播网络 | Yanshuo Wang, Ali Cheraghian, Zeeshan Hayder, Jie Hong, Sameera Ramasinghe, Shafin Rahman, David Ahmedt-Aristizabal, Xuesong Li, Lars Petersson, Mehrtash Harandi | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | MonoHair: High-Fidelity Hair Modeling from a Monocular Video | MonoHair:通过单目视频进行高保真头发建模 | Keyu Wu, Lingchen Yang, Zhiyi Kuang, Yao Feng, Xutao Han, Yuefan Shen, Hongbo Fu, Kun Zhou, Youyi Zheng | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | AIR-HLoc: Adaptive Image Retrieval for Efficient Visual Localisation | AIR-HLoc:用于高效视觉定位的自适应图像检索 | Changkun Liu, Huajian Huang, Zhengyang Ma, Tristan Braud | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Toward Interactive Regional Understanding in Vision-Large Language Models | 迈向大视觉语言模型中的交互式区域理解 | Jungbeom Lee, Sanghyuk Chun, Sangdoo Yun | arxiv.org/pdf/2403.18… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning | OrCo:通过少样本类增量学习的正交性和对比实现更好的泛化 | Noor Ahmed, Anna Kukleva, Bernt Schiele | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Towards Non-Exemplar Semi-Supervised Class-Incremental Learning | 迈向非范例半监督课堂增量学习 | Wenzhuo Liu, Fei Zhu, Cheng-Lin Liu | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Image Deraining via Self-supervised Reinforcement Learning | 通过自监督强化学习进行图像去雨 | He-Hao Liao, Yan-Tsung Peng, Wen-Tao Chu, Ping-Chun Hsieh, Chung-Chi Tsai | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Branch-Tuning: Balancing Stability and Plasticity for Continual Self-Supervised Learning | 分支调优:平衡持续自我监督学习的稳定性和可塑性 | Wenzhuo Liu, Fei Zhu, Cheng-Lin Liu | arxiv.org/pdf/2403.18… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-27 | Enhancing Manufacturing Quality Prediction Models through the Integration of Explainability Methods | 通过集成可解释性方法增强制造质量预测模型 | Dennis Gross, Helge Spieker, Arnaud Gotlieb, Ricardo Knoblauch | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Deep Learning for Robust and Explainable Models in Computer Vision | 计算机视觉中稳健且可解释模型的深度学习 | Mohammadreza Amirian | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing | FlexEdit:灵活可控的基于扩散的以对象为中心的图像编辑 | Trong-Tung Nguyen, Duc-Anh Nguyen, Anh Tran, Cuong Pham | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | RAP: Retrieval-Augmented Planner for Adaptive Procedure Planning in Instructional Videos | RAP:用于教学视频中自适应程序规划的检索增强规划器 | Ali Zare, Yulei Niu, Hammad Ayyubi, Shih-fu Chang | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Users prefer Jpegli over same-sized libjpeg-turbo or MozJPEG | 与相同大小的 libjpeg-turbo 或 MozJPEG 相比,用户更喜欢 Jpegli | Martin Bruse, Luca Versari, Zoltan Szabadka, Jyrki Alakuijala | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Language Plays a Pivotal Role in the Object-Attribute Compositional Generalization of CLIP | 语言在 CLIP 的对象属性组合概括中发挥着关键作用 | Reza Abbasi, Mohammad Samiei, Mohammad Hossein Rohban, Mahdieh Soleymani Baghshah | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | VersaT2I: Improving Text-to-Image Models with Versatile Reward | VersaT2I:通过多功能奖励改进文本到图像模型 | Jianshu Guo, Wenhao Chai, Jie Deng, Hsiang-Wei Huang, Tian Ye, Yichen Xu, Jiawei Zhang, Jenq-Neng Hwang, Gaoang Wang | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Scaling Vision-and-Language Navigation With Offline RL | 通过离线强化学习扩展视觉和语言导航 | Valay Bundele, Mahesh Bhupati, Biplab Banerjee, Aditya Grover | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | FTBC: Forward Temporal Bias Correction for Optimizing ANN-SNN Conversion | FTBC:用于优化 ANN-SNN 转换的前向时间偏差校正 | Xiaofeng Wu, Velibor Bojkovic, Bin Gu, Kun Suo, Kai Zou | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | Implementation of the Principal Component Analysis onto High-Performance Computer Facilities for Hyperspectral Dimensionality Reduction: Results and Comparisons | 在高性能计算机设备上实施主成分分析以实现高光谱降维:结果和比较 | E. Martel, R. Lazcano, J. Lopez, D. Madroñal, R. Salvador, S. Lopez, E. Juarez, R. Guerra, C. Sanz, R. Sarmiento | arxiv.org/pdf/2403.18… | null |
| 2024-03-27 | LayoutFlow: Flow Matching for Layout Generation | LayoutFlow:布局生成的流匹配 | Julian Jorge Andrade Guerreiro, Naoto Inoue, Kento Masui, Mayu Otani, Hideki Nakayama | arxiv.org/pdf/2403.18… | null |