[UPDATED!] 2024-03-05 (Publish Time)
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-05 | Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | 扩展整流流 Transformer 以实现高分辨率图像合成 | Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | Triple-CFN: Restructuring Conceptual Spaces for Enhancing Abstract Reasoning process | Triple-CFN:重构概念空间以增强抽象推理过程 | Ruizhuo Song, Beiming Yuan | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors | NRDF:用于学习铰接姿势先验的神经黎曼距离场 | Yannan He, Garvita Tiwari, Tolga Birdal, Jan Eric Lenssen, Gerard Pons-Moll | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | Doubly Abductive Counterfactual Inference for Text-based Image Editing | 基于文本的图像编辑的双重溯因反事实推理 | Xue Song, Jiequan Cui, Hanwang Zhang, Jingjing Chen, Richang Hong, Yu-Gang Jiang | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Neural Image Compression with Text-guided Encoding for both Pixel-level and Perceptual Fidelity | 具有文本引导编码的神经图像压缩,可实现像素级和感知保真度 | Hagyeong Lee, Minkyu Kim, Jun-Hyuk Kim, Seungeon Kim, Dokwan Oh, Jaeho Lee | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Cross-Domain Image Conversion by CycleDM | CycleDM 的跨域图像转换 | Sho Shimotsumagari, Shumpei Takezaki, Daichi Haraguchi, Seiichi Uchida | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Enhancing the Rate-Distortion-Perception Flexibility of Learned Image Codecs with Conditional Diffusion Decoders | 使用条件扩散解码器增强学习图像编解码器的率失真感知灵活性 | Daniele Mari, Simone Milani | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Zero-LED: Zero-Reference Lighting Estimation Diffusion Model for Low-Light Image Enhancement | Zero-LED:用于低光图像增强的零参考照明估计扩散模型 | Jinhong He, Minglong Xue, Zhipu Liu, Chengyun Song, Senming Zhong | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation | 用于高保真图像到视频生成的免调谐噪声校正 | Weijie Li, Litong Gong, Yiran Zhu, Fanda Fan, Biao Wang, Tiezheng Ge, Bo Zheng | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Fast, Scale-Adaptive, and Uncertainty-Aware Downscaling of Earth System Model Fields with Generative Foundation Models | 使用生成基础模型对地球系统模型场进行快速、尺度自适应和不确定性感知缩减 | Philipp Hess, Michael Aich, Baoxiang Pan, Niklas Boers | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Few-shot Learner Parameterization by Diffusion Time-steps | 通过扩散时间步长进行少样本学习器参数化 | Zhongqi Yue, Pan Zhou, Richang Hong, Hanwang Zhang, Qianru Sun | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Enhancing Weakly Supervised 3D Medical Image Segmentation through Probabilistic-aware Learning | 通过概率感知学习增强弱监督 3D 医学图像分割 | Zhaoxin Fan, Runmin Jiang, Junhao Wu, Xin Huang, Tianyang Wang, Heng Huang, Min Xu | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Semantic Human Mesh Reconstruction with Textures | 使用纹理进行语义人体网格重建 | Xiaoyu Zhan, Jianxin Yang, Yuanqi Li, Jie Guo, Yanwen Guo, Wenping Wang | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Updating the Minimum Information about CLinical Artificial Intelligence (MI-CLAIM) checklist for generative modeling research | 更新生成建模研究的临床人工智能 (MI-CLAIM) 最低信息清单 | Brenda Y. Miao, Irene Y. Chen, Christopher YK Williams, Jaysón Davidson, Augusto Garcia-Agundez, Harry Sun, Travis Zack, Atul J. Butte, Madhumita Sushil | arxiv.org/pdf/2403.02… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-05 | Self-supervised 3D Patient Modeling with Multi-modal Attentive Fusion | 具有多模态注意力融合的自监督 3D 患者建模 | Meng Zheng, Benjamin Planche, Xuan Gong, Fan Yang, Terrence Chen, Ziyan Wu | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | Design2Code: How Far Are We From Automating Front-End Engineering? | Design2Code:我们距离自动化前端工程还有多远? | Chenglei Si, Yanzhe Zhang, Zhengyuan Yang, Ruibo Liu, Diyi Yang | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models | 大饱眼福:多模态大语言模型的混合分辨率自适应 | Gen Luo, Yiyi Zhou, Yuxin Zhang, Xiawu Zheng, Xiaoshuai Sun, Rongrong Ji | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer | MADTP:多模态对齐引导的动态令牌修剪,用于加速视觉语言 Transformer | Jianjian Cao, Peng Ye, Shengze Li, Chong Yu, Yansong Tang, Jiwen Lu, Tao Chen | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception | 具有细粒度视觉感知的多模态指令微调 LLM | Junwen He, Yifan Wang, Lijun Wang, Huchuan Lu, Jun-Yan He, Jin-Peng Lan, Bin Luo, Xuansong Xie | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Enhancing Conceptual Understanding in Multimodal Contrastive Learning through Hard Negative Samples | 通过硬负样本增强多模态对比学习中的概念理解 | Philipp J. Rösch, Norbert Oswald, Michaela Geierhos, Jindřich Libovický | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Enhancing Generalization in Medical Visual Question Answering Tasks via Gradient-Guided Model Perturbation | 通过梯度引导模型扰动增强医学视觉问答任务的泛化 | Gang Liu, Hongyang Li, Zerui He, Shenjun Zhong | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters | 微调的多模态语言模型是高质量的图像文本数据过滤器 | Weizhi Wang, Khalil Mrini, Linjie Yang, Sateesh Kumar, Yu Tian, Xifeng Yan, Heng Wang | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Interactive Continual Learning: Fast and Slow Thinking | 交互式持续学习:快思考和慢思考 | Biqing Qi, Xingquan Chen, Junqi Gao, Jianxing Liu, Ligang Wu, Bowen Zhou | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | VEglue: Testing Visual Entailment Systems via Object-Aligned Joint Erasing | VEglue:通过对象对齐联合擦除测试视觉蕴涵系统 | Zhiyuan Chang, Mingyang Li, Junjie Wang, Cheng Li, Qing Wang | arxiv.org/pdf/2403.02… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-05 | PromptKD: Unsupervised Prompt Distillation for Vision-Language Models | PromptKD:视觉语言模型的无监督快速蒸馏 | Zheng Li, Xiang Li, Xinyi Fu, Xing Zhang, Weiqiang Wang, Jian Yang | arxiv.org/pdf/2403.02… | null |
Classification/Detection/Recognition/Segmentation/...
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-05 | Solving the bongard-logo problem by modeling a probabilistic model | 通过建立概率模型来解决 bongard-logo 问题 | Ruizhuo Song, Beiming Yuan | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | PalmProbNet: A Probabilistic Approach to Understanding Palm Distributions in Ecuadorian Tropical Forest via Transfer Learning | PalmProbNet:通过迁移学习了解厄瓜多尔热带森林棕榈分布的概率方法 | Kangning Cui, Zishan Shao, Gregory Larsen, Victor Pauca, Sarra Alqahtani, David Segurado, João Pinheiro, Manqi Wang, David Lutz, Robert Plemmons, et al. | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | Simplicity in Complexity | 复杂中的简单 | Kevin Shen, Surabhi S Nath, Aenne Brielmann, Peter Dayan | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | Motion-Corrected Moving Average: Including Post-Hoc Temporal Information for Improved Video Segmentation | 运动校正移动平均:包括事后时间信息以改进视频分割 | Robert Mendel, Tobias Rueckert, Dirk Wilhelm, Daniel Rueckert, Christoph Palm | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | Improved LiDAR Odometry and Mapping using Deep Semantic Segmentation and Novel Outliers Detection | 使用深度语义分割和新颖的异常值检测改进 LiDAR 里程计和地图绘制 | Mohamed Afifi, Mohamed ElHelw | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding | MiKASA:用于 3D 视觉定位的多键锚点和场景感知 Transformer | Chun-Peng Chang, Shaoxiang Wang, Alain Pagani, Didier Stricker | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | CrackNex: a Few-shot Low-light Crack Segmentation Model Based on Retinex Theory for UAV Inspections | CrackNex:基于 Retinex 理论的无人机检测少镜头微光裂纹分割模型 | Zhen Yao, Jiawei Xu, Shuhang Hou, Mooi Choo Chuah | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | ChatGPT and biometrics: an assessment of face recognition, gender detection, and age estimation capabilities | ChatGPT 和生物识别:面部识别、性别检测和年龄估计能力的评估 | Ahmad Hassanpour, Yasamin Kowsari, Hatef Otroshi Shahreza, Bian Yang, Sebastien Marcel | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | XAI-Based Detection of Adversarial Attacks on Deepfake Detectors | 基于 XAI 的 Deepfake 探测器对抗性攻击检测 | Ben Pinhasov, Raz Lapid, Rony Ohayon, Moshe Sipper, Yehudit Aperstein | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Citizen Science and Machine Learning for Research and Nature Conservation: The Case of Eurasian Lynx, Free-ranging Rodents and Insects | 用于研究和自然保护的公民科学和机器学习:欧亚山猫、自由放养的啮齿动物和昆虫的案例 | Kinga Skorupska, Rafał Stryjek, Izabela Wierzbowska, Piotr Bebas, Maciej Grzeszczuk, Piotr Gago, Jarosław Kowalski, Maciej Krzywicki, Jagoda Lazarek, Wiesław Kopeć | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Enhancing Long-Term Person Re-Identification Using Global, Local Body Part, and Head Streams | 使用全局、局部身体部位和头部流增强长期人员重新识别 | Duy Tran Thanh, Yeejin Lee, Byeongkeun Kang | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Revisiting Confidence Estimation: Towards Reliable Failure Prediction | 重新审视置信度估计:实现可靠的故障预测 | Fei Zhu, Xu-Yao Zhang, Zhen Cheng, Cheng-Lin Liu | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | ActiveAD: Planning-Oriented Active Learning for End-to-End Autonomous Driving | ActiveAD:面向规划的端到端自动驾驶主动学习 | Han Lu, Xiaosong Jia, Yichen Xie, Wenlong Liao, Xiaokang Yang, Junchi Yan | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Are Dense Labels Always Necessary for 3D Object Detection from Point Cloud? | 从点云进行 3D 物体检测是否始终需要密集标签? | Chenqiang Gao, Chuandong Liu, Jun Shu, Fangcen Liu, Jiang Liu, Luyu Yang, Xinbo Gao, Deyu Meng | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | DDF: A Novel Dual-Domain Image Fusion Strategy for Remote Sensing Image Semantic Segmentation with Unsupervised Domain Adaptation | DDF:一种新型双域图像融合策略,用于具有无监督域适应的遥感图像语义分割 | Lingyan Ran, Lushuang Wang, Tao Zhuo, Yinghui Xing | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | HUNTER: Unsupervised Human-centric 3D Detection via Transferring Knowledge from Synthetic Instances to Real Scenes | HUNTER:通过将合成实例的知识转移到真实场景进行无监督的以人为中心的 3D 检测 | Yichen Yao, Zimo Jiang, Yujing Sun, Zhencai Zhu, Xinge Zhu, Runnan Chen, Yuexin Ma | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking | DeconfuseTrack:处理多目标跟踪的混乱 | Cheng Huang, Shoudong Han, Mengyu He, Wenbo Zheng, Yuhao Wei | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels | 在没有确切指导的情况下学习:根据低分辨率历史标签更新大规模高分辨率土地覆盖图 | Zhuohong Li, Wei He, Jiepan Li, Fangxiao Lu, Hongyan Zhang | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Bootstrapping Rare Object Detection in High-Resolution Satellite Imagery | 在高分辨率卫星图像中引导稀有物体检测 | Akram Zaytar, Caleb Robinson, Gilles Q. Hacheme, Girmaw A. Tadesse, Rahul Dodhia, Juan M. Lavista Ferres, Lacey F. Hughey, Jared A. Stabach, Irene Amoke | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird's-Eye View and Perspective View | FastOcc:通过融合 2D 鸟瞰图和透视图加速 3D 占用预测 | Jiawei Hou, Xiaoyan Li, Wenhao Guan, Gang Zhang, Di Feng, Yuheng Du, Xiangyang Xue, Jian Pu | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Deep Common Feature Mining for Efficient Video Semantic Segmentation | 深度共同特征挖掘实现高效视频语义分割 | Yaoyan Zheng, Hongyu Yang, Di Huang | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | UFO: Uncertainty-aware LiDAR-image Fusion for Off-road Semantic Terrain Map Estimation | UFO:用于越野语义地形图估计的不确定性激光雷达图像融合 | Ohn Kim, Junwon Seo, Seongyong Ahn, Chong Hui Kim | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | False Positive Sampling-based Data Augmentation for Enhanced 3D Object Detection Accuracy | 基于误报采样的数据增强可增强 3D 对象检测的准确性 | Jiyong Oh, Junhaeng Lee, Woongchan Byun, Minsang Kong, Sang Hun Lee | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | BSDP: Brain-inspired Streaming Dual-level Perturbations for Online Open World Object Detection | BSDP:用于在线开放世界对象检测的受大脑启发的流式双级扰动 | Yu Chen, Liyan Ma, Liping Jing, Jian Yu | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use | 建模协作者:通过 LLM 工具使用以最少的人力实现主观视觉分类 | Imad Eddine Toubal, Aditya Avinash, Neil Gordon Alldrin, Jan Dlabal, Wenlei Zhou, Enming Luo, Otilia Stretcu, Hao Xiong, Chun-Ta Lu, Howard Zhou, et al. | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Systemic Biases in Sign Language AI Research: A Deaf-Led Call to Reevaluate Research Agendas | 手语人工智能研究中的系统性偏见:聋人主导的重新评估研究议程的呼吁 | Aashaka Desai, Maartje De Meulder, Julie A. Hochgesang, Annemarie Kocab, Alex X. Lu | arxiv.org/pdf/2403.02… | null |
LLM
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-05 | ImgTrojan: Jailbreaking Vision-Language Models with ONE Image | ImgTrojan:使用一张图像越狱视觉语言模型 | Xijia Tao, Shuai Zhong, Lei Li, Qi Liu, Lingpeng Kong | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Android in the Zoo: Chain-of-Action-Thought for GUI Agents | 动物园里的 Android:GUI 代理的行动链思想 | Jiwen Zhang, Jihao Wu, Yihua Teng, Minghui Liao, Nuo Xu, Xiao Xiao, Zhongyu Wei, Duyu Tang | arxiv.org/pdf/2403.02… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-05 | FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation | FAR:灵活、准确且鲁棒的 6DoF 相对相机姿态估计 | Chris Rockwell, Nilesh Kulkarni, Linyi Jin, Jeong Joon Park, Justin Johnson, David F. Fouhey | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning | 具有多金字塔 Transformer 和对比学习的显微镜散焦去模糊统一框架 | Yuelin Zhang, Pengyu Zheng, Wanquan Yan, Chengyu Fang, Shing Shin Cheng | arxiv.org/pdf/2403.02… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-05 | HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative | HoloVIC:多传感器全息交叉口和车路协同的大规模数据集和基准 | Cong Ma, Lei Qiao, Chengkai Zhu, Kai Liu, Zelong Kong, Qing Li, Xueqi Zhou, Yuheng Kan, Wei Wu | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Towards Geometric-Photometric Joint Alignment for Facial Mesh Registration | 面向面部网格配准的几何光度联合对准 | Xizhi Wang, Yaxiong Wang, Mengjian Li | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning | 低分辨率引领潮流:通过自监督学习提高超分辨率的泛化能力 | Haoyu Chen, Wenbo Li, Jinjin Gu, Jingjing Ren, Haoze Sun, Xueyi Zou, Zhensong Zhang, Youliang Yan, Lei Zhu | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Pooling Image Datasets With Multiple Covariate Shift and Imbalance | 具有多个协变量偏移和不平衡的池化图像数据集 | Sotirios Panagiotis Chytas, Vishnu Suresh Lokhande, Peiran Li, Vikas Singh | arxiv.org/pdf/2403.02… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-05 | Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization | 对偶平均教师:用于视听源定位的无偏半监督框架 | Yuxin Guo, Shijie Ma, Hu Su, Zhiqing Wang, Yuhao Zhao, Wei Zou, Siyang Sun, Yun Zheng | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source Localization | 用于半监督视听源定位的交叉伪标签 | Yuxin Guo, Shijie Ma, Yuhao Zhao, Hu Su, Wei Zou | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | Recall-Oriented Continual Learning with Generative Adversarial Meta-Model | 具有生成对抗性元模型的面向回忆的持续学习 | Haneol Kang, Dong-Wan Choi | arxiv.org/pdf/2403.03… | link |
| 2024-03-05 | Rehabilitation Exercise Quality Assessment through Supervised Contrastive Learning with Hard and Soft Negatives | 通过硬阴性和软阴性的监督对比学习进行康复运动质量评估 | Mark Karlov, Ali Abedi, Shehroz S. Khan | arxiv.org/pdf/2403.02… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-03-05 | A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives | 充满技能的背包:以自我为中心的视频理解与多样化的任务视角 | Simone Alberto Peirone, Francesca Pistilli, Antonio Alliegro, Giuseppe Averta | arxiv.org/pdf/2403.03… | null |
| 2024-03-05 | Gaze-Vector Estimation in the Dark with Temporally Encoded Event-driven Neural Networks | 使用时间编码事件驱动神经网络在黑暗中进行注视矢量估计 | Abeer Banerjee, Naval K. Mehta, Shyam S. Prasad, Himanshu, Sumeet Saurav, Sanjay Singh | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Towards Robust Federated Learning via Logits Calibration on Non-IID Data | 通过非 IID 数据的 Logits 校准实现稳健的联邦学习 | Yu Qiao, Apurba Adhikary, Chaoning Zhang, Choong Seon Hong | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos | 为什么不使用你的教科书?教学视频的知识增强程序规划 | Kumaranage Ravindu Yasas Nagasinghe, Honglu Zhou, Malitha Gunawardhana, Martin Renqiang Min, Daniel Harari, Muhammad Haris Khan | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Learning Group Activity Features Through Person Attribute Prediction | 通过人员属性预测学习群体活动特征 | Chihiro Nakatani, Hiroaki Kawashima, Norimichi Ukita | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | DomainVerse: A Benchmark Towards Real-World Distribution Shifts For Tuning-Free Adaptive Domain Generalization | DomainVerse:免调整自适应域泛化的现实世界分布转变的基准 | Feng Hou, Jin Yuan, Ying Yang, Yang Liu, Yang Zhang, Cheng Zhong, Zhongchao Shi, Jianping Fan, Yong Rui, Zhiqiang He | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Dirichlet-based Per-Sample Weighting by Transition Matrix for Noisy Label Learning | 通过转移矩阵进行基于狄利克雷的每样本加权,用于噪声标签学习 | HeeSun Bae, Seungjae Shin, Byeonghu Na, Il-Chul Moon | arxiv.org/pdf/2403.02… | link |
| 2024-03-05 | What do we learn from inverting CLIP models? | 我们从反演 CLIP 模型中学到了什么? | Hamid Kazemi, Atoosa Chegini, Jonas Geiping, Soheil Feizi, Tom Goldstein | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | DPAdapter: Improving Differentially Private Deep Learning through Noise Tolerance Pre-training | DPAdapter:通过噪声容忍预训练改进差分隐私深度学习 | Zihao Wang, Rui Zhu, Dongruo Zhou, Zhikun Zhang, John Mitchell, Haixu Tang, XiaoFeng Wang | arxiv.org/pdf/2403.02… | null |