[UPDATED!] 2024-03-17 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | StainDiffuser: MultiTask Dual Diffusion Model for Virtual Staining | StainDiffuser:用于虚拟染色的多任务双扩散模型 | Tushar Kataria, Beatrice Knudsen, Shireen Y. Elhabian | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Enhancing Bandwidth Efficiency for Video Motion Transfer Applications using Deep Learning Based Keypoint Prediction | 使用基于深度学习的关键点预测提高视频运动传输应用的带宽效率 | Xue Bai, Tasmiah Haque, Sumit Mohan, Yuliang Cai, Byungheon Jeong, Adam Halasz, Srinjoy Das | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Advanced Knowledge Extraction of Physical Design Drawings, Translation and conversion to CAD formats using Deep Learning | 使用深度学习提取物理设计图纸的高级知识、翻译并转换为 CAD 格式 | Jesher Joshua M, Ragav V, Syed Ibrahim S P | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Fast Personalized Text-to-Image Syntheses With Attention Injection | 通过注意力注入进行快速个性化文本到图像合成 | Yuxuan Zhang, Yiren Song, Jinpeng Yu, Han Pan, Zhongliang Jing | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Stylized Face Sketch Extraction via Generative Prior with Limited Data | 通过有限数据的生成先验提取程式化脸部草图 | Kwan Yun, Kwanggyoon Seo, Chang Wook Seo, Soyeon Yoon, Seongcheol Kim, Soohyun Ji, Amirsaman Ashtari, Junyong Noh | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | THOR: Text to Human-Object Interaction Diffusion via Relation Intervention | THOR:通过关系干预实现文本到人-物交互的扩散生成 | Qianyang Wu, Ye Shi, Xiaoshui Huang, Jingyi Yu, Lan Xu, Jingya Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | MaskDiffusion: Exploiting Pre-trained Diffusion Models for Semantic Segmentation | MaskDiffusion:利用预训练的扩散模型进行语义分割 | Yasufumi Kawano, Yoshimitsu Aoki | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Artifact Feature Purification for Cross-domain Detection of AI-generated Images | 用于 AI 生成图像跨域检测的伪影特征纯化 | Zheling Meng, Bo Peng, Jing Dong, Tieniu Tan | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | CGI-DM: Digital Copyright Authentication for Diffusion Models via Contrasting Gradient Inversion | CGI-DM:通过对比梯度反转进行扩散模型的数字版权认证 | Xiaoyu Wu, Yang Hua, Chumeng Liang, Jiaru Zhang, Hao Wang, Tao Song, Haibing Guan | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model | 基于扩散模型的选择性沙漏映射通用图像恢复 | Dian Zheng, Xiao-Ming Wu, Shuzhou Yang, Jian Zhang, Jian-Fang Hu, Wei-Shi Zheng | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Omni-Recon: Towards General-Purpose Neural Radiance Fields for Versatile 3D Applications | Omni-Recon:面向多功能 3D 应用的通用神经辐射场 | Yonggan Fu, Huaizhi Qu, Zhifan Ye, Chaojian Li, Kevin Zhao, Yingyan Lin | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | 3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models | 使用生成模型利用合成数据在野外进行 3D 人体重建 | Yongtao Ge, Wenjia Wang, Yongfan Chen, Hao Chen, Chunhua Shen | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models | 源提示解缠结反演,通过扩散模型提高图像可编辑性 | Ruibin Li, Ruihuang Li, Song Guo, Lei Zhang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Adaptive Semantic-Enhanced Denoising Diffusion Probabilistic Model for Remote Sensing Image Super-Resolution | 遥感图像超分辨率自适应语义增强去噪扩散概率模型 | Jialu Sui, Xianping Ma, Xiaokang Zhang, Man-On Pun | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Zippo: Zipping Color and Transparency Distributions into a Single Diffusion Model | Zippo:将颜色和透明度分布压缩到单个扩散模型中 | Kangyang Xie, Binbin Yang, Hao Chen, Meng Wang, Cheng Zou, Hui Xue, Ming Yang, Chunhua Shen | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention | 通过交叉注意力揭示和减轻文本到图像扩散模型中的记忆 | Jie Ren, Yaxin Li, Shenglai Zen, Han Xu, Lingjuan Lyu, Yue Xing, Jiliang Tang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Endora: Video Generation Models as Endoscopy Simulators | Endora:作为内窥镜模拟器的视频生成模型 | Chenxin Li, Hengyu Liu, Yifan Liu, Brandon Y. Feng, Wuyang Li, Xinyu Liu, Zhen Chen, Jing Shao, Yixuan Yuan | arxiv.org/pdf/2403.11… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | Reconstruct before Query: Continual Missing Modality Learning with Decomposed Prompt Collaboration | 查询前重建:通过分解的即时协作进行持续缺失的模态学习 | Shu Zhao, Xiaohan Zou, Tan Yu, Huijuan Xu | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Few-Shot VQA with Frozen LLMs: A Tale of Two Approaches | 使用冻结 LLM 的少样本 VQA:两种方法的故事 | Igor Sterner, Weizhe Lin, Jinghong Chen, Bill Byrne | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Bilateral Propagation Network for Depth Completion | 用于深度补全的双边传播网络 | Jie Tang, Fei-Peng Tian, Boshi An, Jian Li, Ping Tan | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | FORCE: Dataset and Method for Intuitive Physics Guided Human-object Interaction | FORCE:直观物理引导人机交互的数据集和方法 | Xiaohan Zhang, Bharat Lal Bhatnagar, Sebastian Starke, Ilya Petrov, Vladimir Guzov, Helisa Dhamo, Eduardo Pérez-Pellitero, Gerard Pons-Moll | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | PhD: A Prompted Visual Hallucination Evaluation Dataset | PhD:提示式视觉幻觉评估数据集 | Jiazhen Liu, Yuhan Fu, Ruobing Xie, Runquan Xie, Xingwu Sun, Fengzong Lian, Zhanhui Kang, Xirong Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks | m&m's:评估多步骤多模式任务工具使用的基准 | Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Customizing Visual-Language Foundation Models for Multi-modal Anomaly Detection and Reasoning | 定制用于多模态异常检测和推理的视觉语言基础模型 | Xiaohao Xu, Yunkang Cao, Yongqi Chen, Weiming Shen, Xiaonan Huang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | From Pixels to Predictions: Spectrogram and Vision Transformer for Better Time Series Forecasting | 从像素到预测:频谱图和视觉转换器可实现更好的时间序列预测 | Zhen Zeng, Rachneet Kaur, Suchetha Siddagangappa, Tucker Balch, Manuela Veloso | arxiv.org/pdf/2403.11… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | Creating Seamless 3D Maps Using Radiance Fields | 使用辐射场创建无缝 3D 地图 | Sai Tarun Sathyan, Thomas B. Kinsman | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | SpikeNeRF: Learning Neural Radiance Fields from Continuous Spike Stream | SpikeNeRF:从连续尖峰流中学习神经辐射场 | Lin Zhu, Kangmin Jia, Yifan Zhao, Yunshan Qi, Lizhi Wang, Hua Huang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Recent Advances in 3D Gaussian Splatting | 3D 高斯泼溅的最新进展 | Tong Wu, Yu-Jie Yuan, Ling-Xiao Zhang, Jie Yang, Yan-Pei Cao, Ling-Qi Yan, Lin Gao | arxiv.org/pdf/2403.11… | null |
3DGS
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | 3DGS-ReLoc: 3D Gaussian Splatting for Map Representation and Visual ReLocalization | 3DGS-ReLoc:用于地图表示和视觉重定位的 3D 高斯泼溅 | Peng Jiang, Gaurav Pandey, Srikanth Saripalli | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | GeoGaussian: Geometry-aware Gaussian Splatting for Scene Rendering | GeoGaussian:用于场景渲染的几何感知高斯泼溅 | Yanyan Li, Chenyu Lyu, Yan Di, Guangyao Zhai, Gim Hee Lee, Federico Tombari | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | BrightDreamer: Generic 3D Gaussian Generative Framework for Fast Text-to-3D Synthesis | BrightDreamer:用于快速文本到 3D 合成的通用 3D 高斯生成框架 | Lutao Jiang, Lin Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Compact 3D Gaussian Splatting For Dense Visual SLAM | 用于密集视觉 SLAM 的紧凑 3D 高斯泼溅 | Tianchen Deng, Yaohui Chen, Leyan Zhang, Jianfei Yang, Shenghai Yuan, Danwei Wang, Weidong Chen | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration | Analytic-Splatting:通过解析积分实现抗锯齿 3D 高斯泼溅 | Zhihao Liang, Qi Zhang, Wenbo Hu, Ying Feng, Lei Zhu, Kui Jia | arxiv.org/pdf/2403.11… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | Self-Supervised Quantization-Aware Knowledge Distillation | 自监督量化感知知识蒸馏 | Kaiqi Zhao, Ming Zhao | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Graph Expansion in Pruned Recurrent Neural Network Layers Preserve Performance | 修剪后的循环神经网络层中的图扩展可保持性能 | Suryam Arnav Kalra, Arindam Biswas, Pabitra Mitra, Biswajit Basu | arxiv.org/pdf/2403.11… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | V2X-DGW: Domain Generalization for Multi-agent Perception under Adverse Weather Conditions | V2X-DGW:恶劣天气条件下多智能体感知的领域泛化 | Baolu Li, Jinlong Li, Xinyu Liu, Runsheng Xu, Zhengzhong Tu, Jiacheng Guo, Xiaopeng Li, Hongkai Yu | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Ensembling and Test Augmentation for Covid-19 Detection and Covid-19 Domain Adaptation from 3D CT-Scans | 通过 3D CT 扫描进行 Covid-19 检测和 Covid-19 域适应的集成和测试增强 | Fares Bougourzi, Feryal Windal Moula, Halim Benhabiles, Fadi Dornaika, Abdelmalik Taleb-Ahmed | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Domain-Guided Masked Autoencoders for Unique Player Identification | 用于唯一玩家识别的域引导屏蔽自动编码器 | Bavesh Balaji, Jerrin Bright, Sirisha Rambhatla, Yuhao Chen, Alexander Wong, John Zelek, David A Clausi | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | NeoNeXt: Novel neural network operator and architecture based on the patch-wise matrix multiplications | NeoNeXt:基于分片矩阵乘法的新型神经网络算子和架构 | Vladimir Korviakov, Denis Koposov | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | YOLOv9 for Fracture Detection in Pediatric Wrist Trauma X-ray Images | YOLOv9 用于儿童手腕创伤 X 射线图像中的骨折检测 | Chun-Tse Chien, Rui-Yang Ju, Kuang-Yi Chou, Jen-Shiun Chiang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Simple 2D Convolutional Neural Network-based Approach for COVID-19 Detection | 基于简单 2D 卷积神经网络的 COVID-19 检测方法 | Chih-Chung Hsu, Chia-Ming Lee, Yang Fan Chiang, Yi-Shiuan Chou, Chih-Yu Jiang, Shen-Chieh Tai, Chi-Han Tsai | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Concatenate, Fine-tuning, Re-training: A SAM-enabled Framework for Semi-supervised 3D Medical Image Segmentation | 连接、微调、再训练:支持 SAM 的半监督 3D 医学图像分割框架 | Shumeng Li, Lei Qi, Qian Yu, Jing Huo, Yinghuan Shi, Yang Gao | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations | CPA-Enhancer:用于未知退化下目标检测的思想链提示自适应增强器 | Yuwei Zhang, Yan Wu, Yanming Liu, Xinyue Peng | arxiv.org/pdf/2403.11… | link |
| 2024-03-17 | RCdpia: A Renal Carcinoma Digital Pathology Image Annotation dataset based on pathologists | RCdpia:基于病理学家的肾癌数字病理学图像注释数据集 | Qingrong Sun, Weixiang Zhong, Jie Zhou, Chong Lai, Xiaodong Teng, Maode Lai | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | TAG: Guidance-free Open-Vocabulary Semantic Segmentation | TAG:无引导的开放词汇语义分割 | Yasufumi Kawano, Yoshimitsu Aoki | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | NetTrack: Tracking Highly Dynamic Objects with a Net | NetTrack:使用网络跟踪高度动态的物体 | Guangze Zheng, Shijie Lin, Haobo Zuo, Changhong Fu, Jia Pan | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation | DuPL:具有值得信赖的渐进式学习的双重学生,用于鲁棒的弱监督语义分割 | Yuanchen Wu, Xichen Ye, Kequan Yang, Jide Li, Xiaoqiang Li | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | A lightweight deep learning pipeline with DRDA-Net and MobileNet for breast cancer classification | 使用 DRDA-Net 和 MobileNet 进行乳腺癌分类的轻量级深度学习管道 | Mahdie Ahmadi, Nader Karimi, Shadrokh Samavi | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | GRA: Detecting Oriented Objects through Group-wise Rotating and Attention | GRA:通过分组旋转和注意力检测定向物体 | Jiangshan Wang, Yifan Pu, Yizeng Han, Jiayi Guo, Yiru Wang, Xiu Li, Gao Huang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | LERENet: Eliminating Intra-class Differences for Metal Surface Defect Few-shot Semantic Segmentation | LERENet:消除金属表面缺陷的类内差异少样本语义分割 | Hanze Ding, Zhangkai Wu, Jiyan Zhang, Ming Ping, Yanfang Liu | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Local-consistent Transformation Learning for Rotation-invariant Point Cloud Analysis | 用于旋转不变点云分析的局部一致变换学习 | Yiyang Chen, Lunhao Duan, Shanshan Zhao, Changxing Ding, Dacheng Tao | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Self-supervised co-salient object detection via feature correspondence at multiple scales | 通过多尺度特征对应进行自监督共显着目标检测 | Souradeep Chakraborty, Dimitris Samaras | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Hierarchical Generative Network for Face Morphing Attacks | 用于面部变形攻击的分层生成网络 | Zuyuan He, Zongyong Deng, Qiaoyun He, Qijun Zhao | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Multitask frame-level learning for few-shot sound event detection | 用于少样本声音事件检测的多任务帧级学习 | Liang Zou, Genwei Yan, Ruoyu Wang, Jun Du, Meng Lei, Tian Gao, Xin Fang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Audio-Visual Segmentation via Unlabeled Frame Exploitation | 通过未标记帧利用进行视听分割 | Jinxiang Liu, Yikun Liu, Fei Zhang, Chen Ju, Ya Zhang, Yanfeng Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Tokensome: Towards a Genetic Vision-Language GPT for Explainable and Cognitive Karyotyping | Tokensome:迈向可解释和认知核型分析的遗传视觉语言 GPT | Haoxi Zhang, Xinxu Zhang, Yuanxin Lin, Maiqi Wang, Yi Lai, Yu Wang, Linfeng Yu, Yufeng Xu, Ran Cheng, Edward Szczerbicki | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Intelligent Railroad Grade Crossing: Leveraging Semantic Segmentation and Object Detection for Enhanced Safety | 智能铁路平交道口:利用语义分割和对象检测来增强安全性 | Al Amin, Deo Chimba, Kamrul Hasan, Emmanuel Samson | arxiv.org/pdf/2403.11… | null |
GNN
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | DynamicGlue: Epipolar and Time-Informed Data Association in Dynamic Environments using Graph Neural Networks | DynamicGlue:使用图神经网络在动态环境中进行极线和时间通知数据关联 | Theresa Huber, Simon Schaefer, Stefan Leutenegger | arxiv.org/pdf/2403.11… | null |
LLM
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | Large Language Models Powered Context-aware Motion Prediction | 大型语言模型支持上下文感知运动预测 | Xiaoji Zheng, Lixiu Wu, Zhijie Yan, Yuanrong Tang, Hao Zhao, Chen Zhong, Bokui Chen, Jiangtao Gong | arxiv.org/pdf/2403.11… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | Unifying Feature and Cost Aggregation with Transformers for Semantic and Visual Correspondence | 使用 Transformer 统一特征聚合与代价聚合,用于语义和视觉对应 | Sunghwan Hong, Seokju Cho, Seungryong Kim, Stephen Lin | arxiv.org/pdf/2403.11… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation | 3D 人体姿态估计中域泛化的双增强器框架 | Qucheng Peng, Ce Zheng, Chen Chen | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | STAIR: Semantic-Targeted Active Implicit Reconstruction | STAIR:以语义为目标的主动隐式重建 | Liren Jin, Haofei Kuang, Yue Pan, Cyrill Stachniss, Marija Popović | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Neural Markov Random Field for Stereo Matching | 用于立体匹配的神经马尔可夫随机场 | Tongfan Guan, Chen Wang, Yun-Hui Liu | arxiv.org/pdf/2403.11… | link |
| 2024-03-17 | Boosting Semi-Supervised Temporal Action Localization by Learning from Non-Target Classes | 通过向非目标类学习来促进半监督时间动作本地化 | Kun Xia, Le Wang, Sanping Zhou, Gang Hua, Wei Tang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Training A Small Emotional Vision Language Model for Visual Art Comprehension | 训练用于视觉艺术理解的小型情感视觉语言模型 | Jing Zhang, Liang Zheng, Dan Guo, Meng Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | A Versatile Framework for Multi-scene Person Re-identification | 多场景行人重识别的多功能框架 | Wei-Shi Zheng, Junkai Yan, Yi-Xing Peng | arxiv.org/pdf/2403.11… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | Uncertainty-Aware Pseudo-Label Filtering for Source-Free Unsupervised Domain Adaptation | 用于无源无监督域适应的不确定性感知伪标签过滤 | Xi Chen, Haosen Yang, Huicong Zhang, Hongxun Yao, Xiatian Zhu | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Universal Semi-Supervised Domain Adaptation by Mitigating Common-Class Bias | 通过减轻共同类偏差进行通用半监督域适应 | Wenyu Zhang, Qingmu Liu, Felix Ong Wei Cong, Mohamed Ragab, Chuan-Sheng Foo | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Controllable Relation Disentanglement for Few-Shot Class-Incremental Learning | 小样本类增量学习的可控关系解开 | Yuan Zhou, Richang Hong, Yanrong Guo, Lin Liu, Shijie Hao, Hanwang Zhang | arxiv.org/pdf/2403.11… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-17 | SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant | SQ-LLaVA:大型视觉语言助手的自我提问 | Guohao Sun, Can Qin, Jiamian Wang, Zeyuan Chen, Ran Xu, Zhiqiang Tao | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Order-One Rolling Shutter Cameras | 一阶卷帘快门相机 | Marvin Anas Hahn, Kathlén Kohn, Orlando Marigliano, Tomas Pajdla | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data | MindEye2:共享受试者模型可通过 1 小时的数据实现 fMRI 转图像 | Paul S. Scotti, Mihir Tripathy, Cesar Kadir Torrico Villanueva, Reese Kneeland, Tong Chen, Ashutosh Narang, Charan Santhirasegaran, Jonathan Xu, Thomas Naselaris, Kenneth A. Norman, et al. | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Self-Supervised Video Desmoking for Laparoscopic Surgery | 腹腔镜手术的自我监督视频除烟 | Renlong Wu, Zhilu Zhang, Shuohao Zhang, Longfei Gou, Haobin Chen, Lei Zhang, Hao Chen, Wangmeng Zuo | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Quality-Aware Image-Text Alignment for Real-World Image Quality Assessment | 用于真实世界图像质量评估的质量感知图像文本对齐 | Lorenzo Agnolucci, Leonardo Galteri, Marco Bertini | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts | 迷失在翻译中?多语言概念的文本到图像模型公平评估的翻译错误和挑战 | Michael Saxon, Yiran Luo, Sharon Levy, Chitta Baral, Yezhou Yang, William Yang Wang | arxiv.org/pdf/2403.11… | null |
| 2024-03-17 | OSTAF: A One-Shot Tuning Method for Improved Attribute-Focused T2I Personalization | OSTAF:一种用于改进以属性为中心的 T2I 个性化的一次性调整方法 | Ye Wang, Zili Yi, Rui Ma | arxiv.org/pdf/2403.11… | null |