UPDATED -- 2024-01-09
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-09 | Advancing Ante-Hoc Explainable Models through Generative Adversarial Networks | 通过生成对抗网络推进事前可解释模型 | Tanmay Garg, Deepika Vemuri, Vineeth N Balasubramanian | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Effective pruning of web-scale datasets based on complexity of concept clusters | 基于概念簇复杂度的网络规模数据集的有效剪枝 | Amro Abbas, Evgenia Rusak, Kushal Tirumala, Wieland Brendel, Kamalika Chaudhuri, Ari S. Morcos | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Iterative Feedback Network for Unsupervised Point Cloud Registration | 用于无监督点云配准的迭代反馈网络 | Yifan Xie, Boyu Wang, Shiqi Li, Jihua Zhu | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness | 预训练模型引导的零样本对抗鲁棒性微调 | Sibo Wang, Jie Zhang, Zheng Yuan, Shiguang Shan | arxiv.org/pdf/2401.04… | null |
Classification/Detection/Recognition/Segmentation
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-09 | U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation | U-Mamba:增强生物医学图像分割的远程依赖性 | Jun Ma, Feifei Li, Bo Wang | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Low-resource finetuning of foundation models beats state-of-the-art in histopathology | 基础模型的低资源微调击败了组织病理学领域的最先进技术 | Benedikt Roth, Valentin Koch, Sophia J. Wagner, Julia A. Schnabel, Carsten Marr, Tingying Peng | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Benchmark Analysis of Various Pre-trained Deep Learning Models on ASSIRA Cats and Dogs Dataset | ASSIRA猫狗数据集上各种预训练深度学习模型的基准分析 | Galib Muhammad Shahriar Himel, Md. Masudul Islam | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Learning to Prompt Segment Anything Models | 学习提示分割任何模型 | Jiaxing Huang, Kai Jiang, Jingyi Zhang, Han Qiu, Lewei Lu, Shijian Lu, Eric Xing | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Generic Knowledge Boosted Pre-training For Remote Sensing Images | 通用知识促进遥感图像的预训练 | Ziyue Huang, Mingming Zhang, Yuan Gong, Qingjie Liu, Yunhong Wang | arxiv.org/pdf/2401.04… | link |
| 2024-01-09 | Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding | Let's Go Shopping (LGS) -- 用于视觉概念理解的网络规模图像文本数据集 | Yatong Bai, Utsav Garg, Apaar Shanker, Haoming Zhang, Samyak Parajuli, Erhan Bas, Isidora Filipovic, Amelia N. Chu, Eugenia D Fomitcheva, Elliot Branson, et al. | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | An Automatic Cascaded Model for Hemorrhagic Stroke Segmentation and Hemorrhagic Volume Estimation | 用于出血性卒中分割和出血量估计的自动级联模型 | Weijin Xu, Zhuang Sha, Huihua Yang, Rongcai Jiang, Zhanying Li, Wentao Liu, Ruisheng Su | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | PhilEO Bench: Evaluating Geo-Spatial Foundation Models | PhilEO Bench:评估地理空间基础模型 | Casper Fibaek, Luke Camilleri, Andreas Luyts, Nikolaos Dionelis, Bertrand Le Saux | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | D3AD: Dynamic Denoising Diffusion Probabilistic Model for Anomaly Detection | D3AD:用于异常检测的动态去噪扩散概率模型 | Justin Tebbe, Jawad Tayyub | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | A Novel Dataset for Non-Destructive Inspection of Handwritten Documents | 用于手写文档无损检测的新型数据集 | Eleonora Breci, Luca Guarnera, Sebastiano Battiato | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Image classification network enhancement methods based on knowledge injection | 基于知识注入的图像分类网络增强方法 | Yishuang Tian, Ning Wang, Liang Zhang | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Empirical Analysis of Anomaly Detection on Hyperspectral Imaging Using Dimension Reduction Methods | 使用降维方法进行高光谱成像异常检测的实证分析 | Dongeon Kim, YeongHyeon Park | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Meta-forests: Domain generalization on random forests with meta-learning | 元森林:通过元学习对随机森林进行领域泛化 | Yuyang Sun, Panagiotis Kosmas | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | MapAI: Precision in Building Segmentation | MapAI:精确的建筑分割 | Sander Riisøen Jyhne, Morten Goodwin, Per Arne Andersen, Ivar Oveland, Alexander Salveson Nossum, Karianne Ormseth, Mathilde Ørstavik, Andrew C. Flatman | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Optimal Transcoding Resolution Prediction for Efficient Per-Title Bitrate Ladder Estimation | 用于高效每标题比特率阶梯估计的最佳转码分辨率预测 | Jinhai Yang, Mengxi Guo, Shijie Zhao, Junlin Li, Li Zhang | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | MST: Adaptive Multi-Scale Tokens Guided Interactive Segmentation | MST:自适应多尺度令牌引导交互式分割 | Long Xu, Shanghong Li, Yongquan Chen, Jun Luo | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | SoK: Facial Deepfake Detectors | SoK:面部 Deepfake 探测器 | Binh M. Le, Jiwon Kim, Shahroz Tariq, Kristen Moore, Alsharif Abuadbba, Simon S. Woo | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition | 用于场景识别的知识增强多视角视频表示学习 | Xuzheng Yu, Chen Jiang, Wei Zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | BD-MSA: Body decouple VHR Remote Sensing Image Change Detection method guided by multi-scale feature information aggregation | BD-MSA:多尺度特征信息聚合引导的体解耦VHR遥感图像变化检测方法 | Yonghui Tan, Xiaolong Li, Yishu Chen, Jinquan Ai | arxiv.org/pdf/2401.04… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-09 | Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models | 扩散模型训练后量化的增强分布对齐 | Xuewen Liu, Zhikai Li, Junrui Xiao, Qingyi Gu | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Memory-Efficient Personalization using Quantized Diffusion Model | 使用量化扩散模型进行内存高效的个性化 | Hyogon Ryu, Seohyun Lim, Hyunjung Shim | arxiv.org/pdf/2401.04… | null |
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-09 | Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation | 可变形扩散:用于单图像头像创建的 3D 一致扩散 | Xiyi Chen, Marko Mihajlovic, Shaofei Wang, Sergey Prokudin, Siyu Tang | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Low-Resource Vision Challenges for Foundation Models | 基础模型的低资源视觉挑战 | Yunhua Zhang, Hazel Doughty, Cees G. M. Snoek | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion Models | EmoGen:使用文本到图像扩散模型生成情感图像内容 | Jingyuan Yang, Jiawei Feng, Hui Huang | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation | MagicVideo-V2:多阶段高美视频生成 | Weimin Wang, Jiawei Liu, Zhijie Lin, Jiangqiao Yan, Shuo Chen, Chetwin Low, Tuyen Hoang, Jie Wu, Jun Hao Liew, Hanshu Yan, et al. | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Representative Feature Extraction During Diffusion Process for Sketch Extraction with One Example | 草图提取扩散过程中的代表性特征提取(以一例为例) | Kwan Yun, Youngseo Kim, Kwanggyoon Seo, Chang Wook Seo, Junyong Noh | arxiv.org/pdf/2401.04… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-09 | Vision Reimagined: AI-Powered Breakthroughs in WiFi Indoor Imaging | 视觉重新构想:人工智能驱动的 WiFi 室内成像突破 | Jianyang Shi, Bowen Zhang, Amartansh Dubey, Ross Murch, Liwen Jing | arxiv.org/pdf/2401.04… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-09 | Jump Cut Smoothing for Talking Heads | 说话头像的跳切平滑 | Xiaojuan Wang, Taesung Park, Yang Zhou, Eli Shechtman, Richard Zhang | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | WaveletFormerNet: A Transformer-based Wavelet Network for Real-world Non-homogeneous and Dense Fog Removal | WaveletFormerNet:基于变压器的小波网络,用于现实世界的非均匀和密集除雾 | Shengli Zhang, Zhiyong Tao, Sen Lin | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Take A Shortcut Back: Mitigating the Gradient Vanishing for Training Spiking Neural Networks | 走捷径回来:减轻训练尖峰神经网络的梯度消失 | Yufei Guo, Yuanpei Chen | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Learning with Noisy Labels: Interconnection of Two Expectation-Maximizations | 使用噪声标签学习:两个期望最大化的互连 | Heewon Kim, Hyun Sung Chang, Kiho Cho, Jaeyun Lee, Bohyung Han | arxiv.org/pdf/2401.04… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-09 | A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars | 使用 3D 头像进行口语到手语翻译的简单基线 | Ronglai Zuo, Fangyun Wei, Zenggui Chen, Brian Mak, Jiaolong Yang, Xin Tong | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Uncertainty-aware Sampling for Long-tailed Semi-supervised Learning | 长尾半监督学习的不确定性采样 | Kuo Yang, Duo Li, Menghan Hu, Guangtao Zhai, Xiaokang Yang, Xiao-Ping Zhang | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | RomniStereo: Recurrent Omnidirectional Stereo Matching | RomniStereo:循环全向立体匹配 | Hualie Jiang, Rui Xu, Minglang Tan, Wenjie Jiang | arxiv.org/pdf/2401.04… | null |
Image Understanding
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-09 | RadarCam-Depth: Radar-Camera Fusion for Depth Estimation with Learned Metric Scale | RadarCam-Depth:雷达相机融合,通过学习的公制尺度进行深度估计 | Han Li, Yukai Ma, Yaqing Gu, Kewei Hu, Yong Liu, Xingxing Zuo | arxiv.org/pdf/2401.04… | null |
Others
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-09 | Revisiting Adversarial Training at Scale | 重新审视大规模对抗性训练 | Zeyu Wang, Xianhang Li, Hongru Zhu, Cihang Xie | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | CoordGate: Efficiently Computing Spatially-Varying Convolutions in Convolutional Neural Networks | CoordGate:在卷积神经网络中高效计算空间变化的卷积 | Sunny Howard, Peter Norreys, Andreas Döpp | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Phase-shifted remote photoplethysmography for estimating heart rate and blood pressure from facial video | 相移远程光电体积描记法,用于根据面部视频估算心率和血压 | Gyutae Hwang, Sang Jun Lee | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Towards Real-World Aerial Vision Guidance with Categorical 6D Pose Tracker | 通过分类 6D 姿势跟踪器实现真实世界的空中视觉引导 | Jingtao Sun, Yaonan Wang, Danwei Wang | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | Mix-GENEO: A flexible filtration for multiparameter persistent homology detects digital images | Mix-GENEO:用于多参数持久同源性检测数字图像的灵活过滤 | Jiaxing He, Bingzhe Hou, Tieru Wu, Yue Xin | arxiv.org/pdf/2401.04… | null |
| 2024-01-09 | StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments | StarCraftImage:用于多代理环境空间推理方法原型设计的数据集 | Sean Kulinski, Nicholas R. Waytowich, James Z. Hare, David I. Inouye | arxiv.org/pdf/2401.04… | null |