[UPDATED!] 2024-03-08 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-08 | VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models | VideoElevator:通过多功能文本到图像扩散模型提高视频生成质量 | Yabo Zhang, Yuxiang Wei, Xianhui Lin, Zheng Hui, Peiran Ren, Xuansong Xie, Xiangyang Ji, Wangmeng Zuo | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | A Data Augmentation Pipeline to Generate Synthetic Labeled Datasets of 3D Echocardiography Images using a GAN | 使用 GAN 生成 3D 超声心动图图像的合成标记数据集的数据增强管道 | Cristiana Tiago, Andrew Gilbert, Ahmed S. Beela, Svein Arne Aase, Sten Roar Snare, Jurica Sprem | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Enhancing Plausibility Evaluation for Generated Designs with Denoising Autoencoder | 使用去噪自动编码器增强生成设计的合理性评估 | Jiajie Fan, Amal Trigui, Thomas Bäck, Hao Wang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Federated Learning Method for Preserving Privacy in Face Recognition System | 人脸识别系统中保护隐私的联邦学习方法 | Enoch Solomon, Abraham Woubie | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | DiffSF: Diffusion Models for Scene Flow Estimation | DiffSF:用于场景流估计的扩散模型 | Yushan Zhang, Bastian Wandt, Maria Magnusson, Michael Felsberg | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Noise Level Adaptive Diffusion Model for Robust Reconstruction of Accelerated MRI | 用于加速 MRI 鲁棒重建的噪声水平自适应扩散模型 | Shoujin Huang, Guanxiong Luo, Xi Wang, Ziran Chen, Yuwan Wang, Huaishui Yang, Pheng-Ann Heng, Lingyan Zhang, Mengye Lyu | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation | 在基于文本的人类图像生成的扩散模型中有效利用以人为中心的先验 | Junyan Wang, Zhenhong Sun, Zhiyu Tan, Xuanbai Chen, Weihua Chen, Hao Li, Cheng Zhang, Yang Song | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Denoising Autoregressive Representation Learning | 去噪自回归表示学习 | Yazhe Li, Jorg Bornschein, Ting Chen | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | DiffuLT: How to Make Diffusion Model Useful for Long-tail Recognition | DiffuLT:如何使扩散模型对长尾识别有用 | Jie Shao, Ke Zhu, Hanxiao Zhang, Jianxin Wu | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | GSEdit: Efficient Text-Guided Editing of 3D Objects via Gaussian Splatting | GSEdit:通过高斯泼溅对 3D 对象进行高效的文本引导编辑 | Francesco Palandra, Andrea Sanchietti, Daniele Baieri, Emanuele Rodolà | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Improving Diffusion Models for Virtual Try-on | 改进虚拟试穿的扩散模型 | Yisol Choi, Sangkyung Kwak, Kyungmin Lee, Hyungwon Choi, Jinwoo Shin | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment | ELLA:为扩散模型配备 LLM 以增强语义对齐 | Xiwei Hu, Rui Wang, Yixiao Fang, Bin Fu, Pei Cheng, Gang Yu | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation | Sora 作为 AGI 世界模型?关于文本到视频生成的完整调查 | Joseph Cho, Fachrina Dewi Puspitasari, Sheng Zheng, Jingyao Zheng, Lik-Hang Lee, Tae-Ho Kim, Choong Seon Hong, Chaoning Zhang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Evaluating Text-to-Image Generative Models: An Empirical Study on Human Image Synthesis | 评估文本到图像生成模型:人类图像合成的实证研究 | Muxi Chen, Yi Liu, Jian Yi, Changran Xu, Qiuxia Lai, Hongliang Wang, Tsung-Yi Ho, Qiang Xu | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion | CogView3:通过中继扩散生成更精细、更快的文本到图像 | Wendi Zheng, Jiayan Teng, Zhuoyi Yang, Weihan Wang, Jidong Chen, Xiaotao Gu, Yuxiao Dong, Ming Ding, Jie Tang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Face2Diffusion for Fast and Editable Face Personalization | Face2Diffusion 用于快速且可编辑的面部个性化 | Kaede Shiohara, Toshihiko Yamasaki | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile | 基于对比学习和频谱滤波器配置文件的用于细化图像生成 (STIG) 的频谱转换 | Seokjun Lee, Seung-Won Jung, Hyunseok Seo | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Improving Diffusion-Based Generative Models via Approximated Optimal Transport | 通过近似最优传输改进基于扩散的生成模型 | Daegyu Kim, Jooyoung Choi, Chaehun Shin, Uiwon Hwang, Sungroh Yoon | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution | XPSR:基于扩散的图像超分辨率的跨模态先验 | Yunpeng Qu, Kun Yuan, Kai Zhao, Qizhi Xie, Jinhua Hao, Ming Sun, Chao Zhou | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model | CRM:使用卷积重建模型将单图像转换为 3D 纹理网格 | Zhengyi Wang, Yikai Wang, Yifei Chen, Chendong Xiang, Shuo Chen, Dajiang Yu, Chongxuan Li, Hang Su, Jun Zhu | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | A Probabilistic Hadamard U-Net for MRI Bias Field Correction | 用于 MRI 偏差场校正的概率 Hadamard U-Net | Xin Zhu, Hongyi Pan, Yury Velichko, Adam B. Murphy, Ashley Ross, Baris Turkbey, Ahmet Enis Cetin, Ulas Bagci | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | InstructGIE: Towards Generalizable Image Editing | InstructGIE:走向通用图像编辑 | Zichong Meng, Changdi Yang, Jun Liu, Hao Tang, Pu Zhao, Yanzhi Wang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | DiffClass: Diffusion-Based Class Incremental Learning | DiffClass:基于扩散的类增量学习 | Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | StereoDiffusion: Training-Free Stereo Image Generation Using Latent Diffusion Models | StereoDiffusion:使用潜在扩散模型生成免训练立体图像 | Lezhong Wang, Jeppe Revall Frisvad, Mark Bo Jensen, Siavash Arjomand Bigdeli | arxiv.org/pdf/2403.04… | null |
| 2024-03-08 | C2P-GCN: Cell-to-Patch Graph Convolutional Network for Colorectal Cancer Grading | C2P-GCN:用于结直肠癌分级的细胞到贴片图卷积网络 | Sudipta Paul, Bulent Yener, Amanda W. Lund | arxiv.org/pdf/2403.04… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-08 | Probabilistic Image-Driven Traffic Modeling via Remote Sensing | 通过遥感进行概率图像驱动的交通建模 | Scott Workman, Armin Hadzic | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | OmniCount: Multi-label Object Counting with Semantic-Geometric Priors | OmniCount:使用语义几何先验进行多标签对象计数 | Anindya Mondal, Sauradip Nag, Xiatian Zhu, Anjan Dutta | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | OccFusion: Depth Estimation Free Multi-sensor Fusion for 3D Occupancy Prediction | OccFusion:用于 3D 占用预测的无深度估计多传感器融合 | Ji Zhang, Yiran Ding | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Synthetic Privileged Information Enhances Medical Image Representation Learning | 综合特权信息增强医学图像表示学习 | Lucas Farndale, Chris Walsh, Robert Insall, Ke Yuan | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Unlocking the Potential of Multimodal Unified Discrete Representation through Training-Free Codebook Optimization and Hierarchical Alignment | 通过免训练码本优化和分层对齐释放多模态统一离散表示的潜力 | Hai Huang, Yan Xia, Shengpeng Ji, Shulei Wang, Hanting Wang, Jieming Zhu, Zhenhua Dong, Zhou Zhao | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | LVIC: Multi-modality segmentation by Lifting Visual Info as Cue | LVIC:通过提升视觉信息作为提示进行多模态分割 | Zichao Dong, Bowen Pang, Xufeng Huang, Hang Ji, Xin Zhan, Junbo Chen | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Med3DInsight: Enhancing 3D Medical Image Understanding with 2D Multi-Modal Large Language Models | Med3DInsight:利用 2D 多模态大型语言模型增强 3D 医学图像理解 | Qiuhui Chen, Huping Ye, Yi Hong | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Learning to Rematch Mismatched Pairs for Robust Cross-Modal Retrieval | 学习重新匹配不匹配的对以实现稳健的跨模态检索 | Haochen Han, Qinghua Zheng, Guang Dai, Minnan Luo, Jingdong Wang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Towards Multimodal Sentiment Analysis Debiasing via Bias Purification | 通过偏差净化实现多模态情感分析去偏差 | Dingkang Yang, Mingcheng Li, Dongling Xiao, Yang Liu, Kun Yang, Zhaoyu Chen, Yuzheng Wang, Peng Zhai, Ke Li, Lihua Zhang | arxiv.org/pdf/2403.05… | null |
3DGS
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-08 | SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting | SplattingAvatar:具有网格嵌入式高斯泼溅的逼真实时人体化身 | Zhijing Shao, Zhaolong Wang, Zhuang Li, Duotun Wang, Xiangru Lin, Yu Zhang, Mingming Fan, Zeyu Wang | arxiv.org/pdf/2403.05… | link |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-08 | Attention-guided Feature Distillation for Semantic Segmentation | 用于语义分割的注意力引导特征蒸馏 | Amir M. Mansourian, Arya Jalali, Rozhan Ahmadi, Shohreh Kasaei | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | Generalized Correspondence Matching via Flexible Hierarchical Refinement and Patch Descriptor Distillation | 通过灵活的分层细化和补丁描述符蒸馏进行广义对应匹配 | Yu Han, Ziwei Long, Yanting Zhang, Jin Wu, Zhijun Fang, Rui Fan | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Fine-tuning a Multiple Instance Learning Feature Extractor with Masked Context Modelling and Knowledge Distillation | 使用屏蔽上下文建模和知识蒸馏来微调多实例学习特征提取器 | Juan I. Pisula, Katarzyna Bozek | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Adversarial Sparse Teacher: Defense Against Distillation-Based Model Stealing Attacks Using Adversarial Examples | 对抗性稀疏教师:使用对抗性示例防御基于蒸馏的模型窃取攻击 | Eda Yilmaz, Hacer Yalim Keles | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | ECToNAS: Evolutionary Cross-Topology Neural Architecture Search | ECToNAS:进化跨拓扑神经架构搜索 | Elisabeth J. Schiessler, Roland C. Aydin, Christian J. Cyron | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features | RadarDistill:通过 LiDAR 特征的知识蒸馏提高基于雷达的物体检测性能 | Geonho Bang, Kwangjin Choi, Jisong Kim, Dongsuk Kum, Jun Won Choi | arxiv.org/pdf/2403.05… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-08 | Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets | 无需验证的调整:搜索训练集上的学习率和权重衰减 | Lorenzo Brigato, Stavroula Mougiakakou | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Part-aware Personalized Segment Anything Model for Patient-Specific Segmentation | 用于特定患者分割的部件感知个性化分割任何模型 | Chenhui Zhao, Liyue Shen | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | EVD4UAV: An Altitude-Sensitive Benchmark to Evade Vehicle Detection in UAV | EVD4UAV:无人机中逃避车辆检测的高度敏感基准 | Huiming Sun, Jiacheng Guo, Zibo Meng, Tianyun Zhang, Jianwu Fang, Yuewei Lin, Hongkai Yu | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery | 重新思考多光谱卫星图像的 Transformer 预训练 | Mubashir Noman, Muzammal Naseer, Hisham Cholakkal, Rao Muhammad Anwar, Salman Khan, Fahad Shahbaz Khan | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection | SIRST-5K:通过自监督学习探索大量负片合成,实现鲁棒红外小目标检测 | Yahao Lu, Yupei Lin, Han Wu, Xiaoyu Xian, Yukai Shi, Liang Lin | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | FedFMS: Exploring Federated Foundation Models for Medical Image Segmentation | FedFMS:探索医学图像分割的联邦基础模型 | Yuxi Liu, Guibo Luo, Yuesheng Zhu | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | A Deep Learning Method for Classification of Biophilic Artworks | 生物亲和艺术作品分类的深度学习方法 | Purna Kar, Jordan J. Bird, Yangang Xing, Alexander Sumich, Andrew Knight, Ahmad Lotfi, Benedict Carpenter van Barthold | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery | 探索卫星图像中少样本物体检测的鲁棒特征 | Xavier Bou, Gabriele Facciolo, Rafael Grompone von Gioi, Jean-Michel Morel, Thibaud Ehret | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Spectrogram-Based Detection of Auto-Tuned Vocals in Music Recordings | 基于频谱图的音乐录音中自动调谐人声检测 | Mahyar Gohari, Paolo Bestagini, Sergio Benini, Nicola Adami | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Self-Supervised Multiple Instance Learning for Acute Myeloid Leukemia Classification | 急性髓系白血病分类的自监督多实例学习 | Salome Kazeminia, Max Joosten, Dragan Bosnacki, Carsten Marr | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Frequency-Adaptive Dilated Convolution for Semantic Segmentation | 用于语义分割的频率自适应扩张卷积 | Linwei Chen, Lin Gu, Ying Fu | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | Hybridized Convolutional Neural Networks and Long Short-Term Memory for Improved Alzheimer's Disease Diagnosis from MRI Scans | 混合卷积神经网络和长短期记忆可改善 MRI 扫描对阿尔茨海默病的诊断 | Maleka Khatun, Md Manowarul Islam, Habibur Rahman Rifat, Md. Shamim Bin Shahid, Md. Alamin Talukder, Md Ashraf Uddin | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Multiple Instance Learning with random sampling for Whole Slide Image Classification | 用于全切片图像分类的随机采样多实例学习 | H. Keshvarikhojasteh, J. P. W. Pluim, M. Veta | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model | VLM-PL:使用视觉语言模型的高级伪标记方法类增量对象检测 | Junsu Kim, Yunhoe Ku, Jihyeon Kim, Junuk Cha, Seungryul Baek | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Embedded Deployment of Semantic Segmentation in Medicine through Low-Resolution Inputs | 通过低分辨率输入在医学中进行语义分割的嵌入式部署 | Erik Ostrowski, Muhammad Shafique | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | PEEB: Part-based Image Classifiers with an Explainable and Editable Language Bottleneck | PEEB:具有可解释和可编辑语言瓶颈的基于部分的图像分类器 | Thang M. Pham, Peijie Chen, Tin Nguyen, Seunghyun Yoon, Trung Bui, Anh Nguyen | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Debiasing Large Visual Language Models | 消除大型视觉语言模型的偏差 | Yi-Fan Zhang, Weichen Yu, Qingsong Wen, Xue Wang, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval | 用于图像文本检索的跨模态和单模态软标签对齐 | Hailang Huang, Zhijie Nie, Ziqiao Wang, Ziyu Shang | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds | 隐藏在灌木丛中:在 3D 点云上生成难以察觉且合理的对抗性扰动 | Tianrui Lou, Xiaojun Jia, Jindong Gu, Li Liu, Siyuan Liang, Bangyan He, Xiaochun Cao | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation | LightM-UNet:Mamba 助力轻量级 UNet 进行医学图像分割 | Weibin Liao, Yinghao Zhu, Xinyuan Wang, Chengwei Pan, Yasha Wang, Liantao Ma | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | Benchmarking Micro-action Recognition: Dataset, Methods, and Applications | 微动作识别基准测试:数据集、方法和应用 | Dan Guo, Kun Li, Bin Hu, Yan Zhang, Meng Wang | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | Improving the Successful Robotic Grasp Detection Using Convolutional Neural Networks | 使用卷积神经网络改进成功的机器人抓取检测 | Hamed Hosseini, Mehdi Tale Masouleh, Ahmad Kalhor | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Learning Expressive And Generalizable Motion Features For Face Forgery Detection | 学习用于人脸伪造检测的富有表现力和可推广的运动特征 | Jingyi Zhang, Peng Zhang, Jingjing Wang, Di Xie, Shiliang Pu | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models | MamMIL:使用状态空间模型对全切片图像进行多实例学习 | Zijie Fang, Yifeng Wang, Zhi Wang, Jian Zhang, Xiangyang Ji, Yongbing Zhang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | LanePtrNet: Revisiting Lane Detection as Point Voting and Grouping on Curves | LanePtrNet:重新审视车道检测作为点投票和曲线分组 | Jiayan Cao, Xueyu Zhu, Cheng Qian | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation | APPLE:潜在嵌入的对抗性隐私感知扰动以减轻不公平现象 | Zikang Xu, Fenghe Tang, Quan Quan, Qingsong Yao, S. Kevin Zhou | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | From Registration Uncertainty to Segmentation Uncertainty | 从配准不确定性到分割不确定性 | Junyu Chen, Yihao Liu, Shuwen Wei, Zhangxing Bian, Aaron Carass, Yong Du | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Beyond MOT: Semantic Multi-Object Tracking | 超越 MOT:语义多目标跟踪 | Yunhao Li, Hao Wang, Qin Li, Xue Ma, Jiali Yao, Shaohua Dong, Heng Fan, Libo Zhang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Robust Surgical Tool Tracking with Pixel-based Probabilities for Projected Geometric Primitives | 具有基于像素的投影几何基元概率的鲁棒手术工具跟踪 | Christopher D'Ambrosia, Florian Richter, Zih-Yun Chiu, Nikhil Shinde, Fei Liu, Henrik I. Christensen, Michael C. Yip | arxiv.org/pdf/2403.04… | null |
| 2024-03-08 | ActFormer: Scalable Collaborative Perception via Active Queries | ActFormer:通过主动查询的可扩展协作感知 | Suozhi Huang, Juexiao Zhang, Yiming Li, Chen Feng | arxiv.org/pdf/2403.04… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-08 | ContrastDiagnosis: Enhancing Interpretability in Lung Nodule Diagnosis Using Contrastive Learning | 对比诊断:使用对比学习增强肺结节诊断的可解释性 | Chenglong Wang, Yinqiao Yi, Yida Wang, Chengxiu Zhang, Yun Liu, Kensaku Mori, Mei Yuan, Guang Yang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation | 窃取稳定扩散先验以实现稳健的单目深度估计 | Yifan Mao, Jian Liu, Xianming Liu | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | PrimeComposer: Faster Progressively Combined Diffusion for Image Composition with Attention Steering | PrimeComposer:通过注意力引导进行图像合成的更快的渐进组合扩散 | Yibin Wang, Weizhong Zhang, Jianwei Zheng, Cheng Jin | arxiv.org/pdf/2403.05… | link |
LLM
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-08 | Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola | 超越有限数据:通过 Extrapola 实现无数据分布外泛化 | Yijiang Li, Sucheng Ren, Weipeng Deng, Yuzhi Xu, Ying Gao, Edith Ngai, Haohan Wang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Will GPT-4 Run DOOM? | GPT-4 会运行《DOOM》吗? | Adrian de Wynter | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | DiffChat: Learning to Chat with Text-to-Image Synthesis Models for Interactive Image Creation | DiffChat:学习使用文本到图像合成模型进行聊天以创建交互式图像 | Jiapeng Wang, Chengyu Wang, Tingfeng Cao, Jun Huang, Lianwen Jin | arxiv.org/pdf/2403.04… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-08 | JointMotion: Joint Self-supervision for Joint Motion Prediction | JointMotion:联合自监督联合运动预测 | Royden Wagner, Ömer Şahin Taş, Marvin Klemp, Carlos Fernandez | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | DualBEV: CNN is All You Need in View Transformation | DualBEV:CNN 是您在视图转换中所需要的一切 | Peidong Li, Wancheng Shen, Qihao Huang, Dixiao Cui | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance | 跟踪遇见 LoRA:更快的训练、更大的模型、更强的性能 | Liting Lin, Heng Fan, Zhipeng Zhang, Yaowei Wang, Yong Xu, Haibin Ling | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | UFORecon: Generalizable Sparse-View Surface Reconstruction from Arbitrary and UnFavOrable Data Sets | UFORecon:从任意和不利数据集进行可推广的稀疏视图表面重建 | Youngju Na, Woo Jae Kim, Kyu Beom Han, Suhyeon Ha, Sung-eui Yoon | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | Agile Multi-Source-Free Domain Adaptation | 敏捷的多源自由域适应 | Xinyao Li, Jingjing Li, Fengling Li, Lei Zhu, Ke Lu | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | REPS: Reconstruction-based Point Cloud Sampling | REPS:基于重建的点云采样 | Guoqing Zhang, Wenbo Zhao, Jian Liu, Xianming Liu | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | DITTO: Dual and Integrated Latent Topologies for Implicit 3D Reconstruction | DITTO:用于隐式 3D 重建的双重和集成潜在拓扑 | Jaehyeok Shim, Kyungdon Joo | arxiv.org/pdf/2403.05… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-08 | Grasping Trajectory Optimization with Point Clouds | 利用点云抓取轨迹优化 | Yu Xiang, Sai Haneesh Allu, Rohith Peddi, Tyler Summers, Vibhav Gogate | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | 3D Face Reconstruction Using A Spectral-Based Graph Convolution Encoder | 使用基于频谱的图卷积编码器进行 3D 人脸重建 | Haoxin Xu, Zezheng Zhao, Yuxin Cao, Chunyu Chen, Hao Ge, Ziyao Liu | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Overcoming Data Inequality across Domains with Semi-Supervised Domain Generalization | 通过半监督域泛化克服跨域的数据不平等 | Jinha Park, Wonguk Cho, Taesup Kim | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy | 用于胃内窥镜检查低成本技能评估的运动引导双摄像头跟踪器 | Yuelin Zhang, Wanquan Yan, Kim Yan, Chun Ping Lam, Yufu Qiu, Pengyu Zheng, Raymond Shing-Yan Tang, Shing Shin Cheng | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Arbitrary-Scale Point Cloud Upsampling by Voxel-Based Network with Latent Geometric-Consistent Learning | 通过基于体素的网络和潜在几何一致学习进行任意尺度点云上采样 | Hang Du, Xuejun Yan, Jingjing Wang, Di Xie, Shiliang Pu | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | Enhancing Texture Generation with High-Fidelity Using Advanced Texture Priors | 使用高级纹理先验增强高保真度纹理生成 | Kuo Xu, Maoyu Wang, Muyu Wang, Lincong Feng, Tianhui Zhang, Xiaoli Liu | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | MUC: Mixture of Uncalibrated Cameras for Robust 3D Human Body Reconstruction | MUC:用于稳健 3D 人体重建的未校准相机混合 | Yitao Zhu, Sheng Wang, Mengjie Xu, Zixu Zhuang, Zhixin Wang, Kaidong Wang, Han Zhang, Qian Wang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | ERASOR++: Height Coding Plus Egocentric Ratio Based Dynamic Object Removal for Static Point Cloud Mapping | ERASOR++:静态点云映射的基于高度编码和自心比的动态对象去除 | Jiabao Zhang, Yu Zhang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Robust automated calcification meshing for biomechanical cardiac digital twins | 用于生物力学心脏数字孪生的鲁棒自动化钙化网格划分 | Daniel H. Pak, Minliang Liu, Theodore Kim, Caglar Ozturk, Raymond McKay, Ellen T. Roche, Rudolph Gleason, James S. Duncan | arxiv.org/pdf/2403.04… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-08 | Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos | 告诉,不要展示!:语言指导简化了图像和视频中跨域的传输 | Tarun Kalluri, Bodhisattwa Prasad Majumder, Manmohan Chandraker | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Poly-View Contrastive Learning | 多视角对比学习 | Amitis Shidani, Devon Hjelm, Jason Ramapuram, Russ Webb, Eeshan Gunesh Dhekane, Dan Busbridge | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Continual Learning and Catastrophic Forgetting | 持续学习和灾难性遗忘 | Gido M. van de Ven, Nicholas Soures, Dhireesha Kudithipudi | arxiv.org/pdf/2403.05… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-08 | The R2D2 deep neural network series paradigm for fast precision imaging in radio astronomy | 用于射电天文学快速精确成像的 R2D2 深度神经网络系列范式 | Amir Aghabiglou, Chung San Chu, Arwa Dabbech, Yves Wiaux | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction | HistGen:通过局部-全局特征编码和跨模式上下文交互生成组织病理学报告 | Zhengrui Guo, Jiabo Ma, Yingxue Xu, Yihui Wang, Liansheng Wang, Hao Chen | arxiv.org/pdf/2403.05… | link |
| 2024-03-08 | DuDoUniNeXt: Dual-domain unified hybrid model for single and multi-contrast undersampled MRI reconstruction | DuDoUniNeXt:用于单对比和多对比欠采样 MRI 重建的双域统一混合模型 | Ziqi Gao, Yue Zhang, Xinwen Liu, Kaiyan Li, S. Kevin Zhou | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | CLIP-Gaze: Towards General Gaze Estimation via Visual-Linguistic Model | CLIP-Gaze:通过视觉语言模型进行一般注视估计 | Pengwei Yin, Guanzhong Zeng, Jingjing Wang, Di Xie | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | Exploring the Adversarial Frontier: Quantifying Robustness via Adversarial Hypervolume | 探索对抗性前沿:通过对抗性超容量量化鲁棒性 | Ping Guo, Cheng Gong, Xi Lin, Zhiyuan Yang, Qingfu Zhang | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | DyRoNet: A Low-Rank Adapter Enhanced Dynamic Routing Network for Streaming Perception | DyRoNet:用于流感知的低阶适配器增强型动态路由网络 | Xiang Huang, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Wangmeng Xiang, Baigui Sun, Xiao Wu | arxiv.org/pdf/2403.05… | null |
| 2024-03-08 | PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts | PromptIQA:通过提示提高无参考图像质量评估的性能和泛化 | Zewen Chen, Haina Qin, Juan Wang, Chunfeng Yuan, Bing Li, Weiming Hu, Liang Wang | arxiv.org/pdf/2403.04… | null |
| 2024-03-08 | PIPsUS: Self-Supervised Dense Point Tracking in Ultrasound | PIPsUS:超声波中的自监督密集点跟踪 | Wanwen Chen, Adam Schmidt, Eitan Prisman, Septimiu E Salcudean | arxiv.org/pdf/2403.04… | null |