[UPDATED!] 2024-03-02 (Publish Time)
Generative Models
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-02 | SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code | SceneCraft:用于将 3D 场景合成为 Blender 代码的 LLM 代理 | Ziniu Hu, Ahmet Iscen, Aashi Jain, Thomas Kipf, Yisong Yue, David A. Ross, Cordelia Schmid, Alireza Fathi | arxiv.org/pdf/2403.01… | null |
2024-03-02 | DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction | DiffSal:用于扩散显著性预测的联合音频和视频学习 | Junwen Xiong, Peng Zhang, Tao You, Chuanyue Li, Wei Huang, Yufei Zha | arxiv.org/pdf/2403.01… | null |
2024-03-02 | TCIG: Two-Stage Controlled Image Generation with Quality Enhancement through Diffusion | TCIG:两阶段控制图像生成,通过扩散增强质量 | Salaheldin Mohamed | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Training Unbiased Diffusion Models From Biased Dataset | 从有偏数据集训练无偏扩散模型 | Yeongmin Kim, Byeonghu Na, Minsang Park, JoonHo Jang, Dongjun Kim, Wanmo Kang, Il-Chul Moon | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Dynamic 3D Point Cloud Sequences as 2D Videos | 作为 2D 视频的动态 3D 点云序列 | Yiming Zeng, Junhui Hou, Qijian Zhang, Siyu Ren, Wenping Wang | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Text-guided Explorable Image Super-resolution | 文本引导的可探索图像超分辨率 | Kanchana Vaishnavi Gandikota, Paramanand Chandramouli | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Face Swap via Diffusion Model | 通过扩散模型进行面部交换 | Feifei Wang | arxiv.org/pdf/2403.01… | null |
Multimodal
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-02 | DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions | DNA 系列:通过块级监督增强权重共享 NAS | Guangrun Wang, Changlin Li, Liuchun Yuan, Jiefeng Peng, Xiaoyu Xian, Xiaodan Liang, Xiaojun Chang, Liang Lin | arxiv.org/pdf/2403.01… | null |
2024-03-02 | TUMTraf V2X Cooperative Perception Dataset | TUMTraf V2X 协作感知数据集 | Walter Zimmer, Gerhard Arya Wardana, Suren Sritharan, Xingcheng Zhou, Rui Song, Alois C. Knoll | arxiv.org/pdf/2403.01… | null |
2024-03-02 | ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation | ICC:量化多模态数据集管理的图像描述的具体性 | Moran Yanuka, Morris Alper, Hadar Averbuch-Elor, Raja Giryes | arxiv.org/pdf/2403.01… | null |
2024-03-02 | REWIND Dataset: Privacy-preserving Speaking Status Segmentation from Multimodal Body Movement Signals in the Wild | REWIND 数据集:根据野外多模态身体运动信号进行隐私保护的说话状态分割 | Jose Vargas Quiros, Chirag Raman, Stephanie Tan, Ekin Gedik, Laura Cabrera-Quiros, Hayley Hung | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Adversarial Testing for Visual Grounding via Image-Aware Property Reduction | 通过图像感知属性减少进行视觉接地的对抗性测试 | Zhiyuan Chang, Mingyang Li, Junjie Wang, Cheng Li, Boyu Wu, Fanjiang Xu, Qing Wang | arxiv.org/pdf/2403.01… | null |
NeRF
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-02 | NeRF-VPT: Learning Novel View Representations with Neural Radiance Fields via View Prompt Tuning | NeRF-VPT:通过视图提示调整学习具有神经辐射场的新颖视图表示 | Linsheng Chen, Guangrun Wang, Liuchun Yuan, Keze Wang, Ken Deng, Philip H. S. Torr | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Neural radiance fields-based holography [Invited] | 基于神经辐射场的全息术 [邀请] | Minsung Kang, Fan Wang, Kai Kumano, Tomoyoshi Ito, Tomoyoshi Shimobaba | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Neural Field Classifiers via Target Encoding and Classification Loss | 通过目标编码和分类损失的神经场分类器 | Xindi Yang, Zeke Xie, Xiong Zhou, Boyu Liu, Buhua Liu, Yi Liu, Haoran Wang, Yunfeng Cai, Mingming Sun | arxiv.org/pdf/2403.01… | null |
Model Compression/Optimization
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-02 | Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models | 用于扩散和流动模型快速采样的定制非平稳求解器 | Neta Shaul, Uriel Singer, Ricky T. Q. Chen, Matthew Le, Ali Thabet, Albert Pumarola, Yaron Lipman | arxiv.org/pdf/2403.01… | null |
2024-03-02 | On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving | 走向便携性:压缩自动驾驶端到端运动规划器 | Kaituo Feng, Changsheng Li, Dongchun Ren, Ye Yuan, Guoren Wang | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Extracting Usable Predictions from Quantized Networks through Uncertainty Quantification for OOD Detection | 通过不确定性量化从量化网络中提取可用预测以进行 OOD 检测 | Rishi Singhal, Srinath Srinivasan | arxiv.org/pdf/2403.01… | null |
Classification/Detection/Recognition/Segmentation/...
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-02 | Image-Based Dietary Assessment: A Healthy Eating Plate Estimation System | 基于图像的饮食评估:健康饮食餐盘估计系统 | Assylzhan Izbassar, Pakizar Shamoi | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection | 因果模式多路复用器:无偏多光谱行人检测的新颖框架 | Taeheon Kim, Sebin Shin, Youngjoon Yu, Hak Gu Kim, Yong Man Ro | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Fast Low-parameter Video Activity Localization in Collaborative Learning Environments | 协作学习环境中的快速低参数视频活动定位 | Venkatesh Jatla, Sravani Teeparthi, Ugesh Egala, Sylvia Celedon Pattichis, Marios S. Pattichis | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Benchmarking Segmentation Models with Mask-Preserved Attribute Editing | 使用掩模保留属性编辑对分割模型进行基准测试 | Zijin Yin, Kongming Liang, Bing Li, Zhanyu Ma, Jun Guo | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Boosting Box-supervised Instance Segmentation with Pseudo Depth | 使用伪深度增强盒监督实例分割 | Xinyi Yu, Ling Yan, Pengtao Jiang, Hao Chen, Bo Li, Lin Yuanbo Wu, Linlin Ou | arxiv.org/pdf/2403.01… | null |
2024-03-02 | SAR-AE-SFP: SAR Imagery Adversarial Example in Real Physics domain with Target Scattering Feature Parameters | SAR-AE-SFP:具有目标散射特征参数的真实物理域中的 SAR 图像对抗示例 | Jiahao Cui, Jiale Duan, Binyan Luo, Hang Cao, Wang Guo, Haifeng Li | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning | 通过 LLM 支持的即时调整进行无数据多标签图像识别 | Shuo Yang, Zirui Shang, Yongqi Wang, Derong Deng, Hongwei Chen, Qiyuan Cheng, Xinxiao Wu | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Leveraging Self-Supervised Learning for Scene Recognition in Child Sexual Abuse Imagery | 利用自我监督学习进行儿童性虐待图像的场景识别 | Pedro H. V. Valois, João Macedo, Leo S. F. Ribeiro, Jefersson A. dos Santos, Sandra Avila | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Run-time Introspection of 2D Object Detection in Automated Driving Systems Using Learning Representations | 使用学习表示对自动驾驶系统中的 2D 对象检测进行运行时自省 | Hakan Yekta Yatbaz, Mehrdad Dianati, Konstantinos Koufos, Roger Woodman | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Learn Suspected Anomalies from Event Prompts for Video Anomaly Detection | 从事件提示中了解可疑异常情况以进行视频异常检测 | Chenchen Tao, Chong Wang, Yuexian Zou, Xiaohao Peng, Jiafei Wu, Jiangbo Qian | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation | 辅助任务增强弱监督语义分割的双亲和力学习 | Lian Xu, Mohammed Bennamoun, Farid Boussaid, Wanli Ouyang, Ferdous Sohel, Dan Xu | arxiv.org/pdf/2403.01… | null |
2024-03-02 | ELA: Efficient Local Attention for Deep Convolutional Neural Networks | ELA:深度卷积神经网络的高效局部注意力 | Wei Xu, Yi Wan | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Beyond Night Visibility: Adaptive Multi-Scale Fusion of Infrared and Visible Images | 超越夜间能见度:红外和可见光图像的自适应多尺度融合 | Shufan Pei, Junhong Lin, Wenxi Liu, Tiesong Zhao, Chia-Wen Lin | arxiv.org/pdf/2403.01… | null |
Image Understanding
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-02 | Depth Information Assisted Collaborative Mutual Promotion Network for Single Image Dehazing | 深度信息辅助单幅图像去雾协作互促网络 | Yafei Zhang, Shen Zhou, Huafeng Li | arxiv.org/pdf/2403.01… | null |
Transformer
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-02 | Dual Graph Attention based Disentanglement Multiple Instance Learning for Brain Age Estimation | 基于双图注意力的解缠多实例学习用于脑年龄估计 | Fanzhe Yan, Gang Yang, Yu Li, Aiping Liu, Xun Chen | arxiv.org/pdf/2403.01… | null |
3D/CG
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-02 | Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling | 看到看不见的东西:通过几何约束概率建模发现新的生物医学概念 | Jianan Fan, Dongnan Liu, Hang Chang, Heng Huang, Mei Chen, Weidong Cai | arxiv.org/pdf/2403.01… | null |
Other
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-02 | ShapeBoost: Boosting Human Shape Estimation with Part-Based Parameterization and Clothing-Preserving Augmentation | ShapeBoost:通过基于部位的参数化和服装保留增强来促进人体形状估计 | Siyuan Bian, Jiefeng Li, Jiasheng Tang, Cewu Lu | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Mitigating the Bias in the Model for Continual Test-Time Adaptation | 减轻持续测试时间适应模型中的偏差 | Inseop Chung, Kyomin Hwang, Jayeon Yoo, Nojun Kwak | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Single-image camera calibration with model-free distortion correction | 具有无模型畸变校正的单图像相机校准 | Katia Genovese | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Consistent and Asymptotically Statistically-Efficient Solution to Camera Motion Estimation | 相机运动估计的一致且渐近统计有效的解决方案 | Guangyang Zeng, Qingcheng Zeng, Xinghan Li, Biqiang Mu, Jiming Chen, Ling Shi, Junfeng Wu | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Edge-guided Low-light Image Enhancement with Inertial Bregman Alternating Linearized Minimization | 采用惯性 Bregman 交替线性化最小化的边缘引导低光图像增强 | Chaoyan Huang, Zhongming Wu, Tieyong Zeng | arxiv.org/pdf/2403.01… | null |
2024-03-02 | Towards Accurate Lip-to-Speech Synthesis in-the-Wild | 实现野外准确的唇语合成 | Sindhu Hegde, Rudrabha Mukhopadhyay, C. V. Jawahar, Vinay Namboodiri | arxiv.org/pdf/2403.01… | null |