UPDATED: 2024-01-11
Classification/Detection/Recognition/Segmentation
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-11 | Automatic UAV-based Airport Pavement Inspection Using Mixed Real and Virtual Scenarios | 使用混合真实和虚拟场景的基于无人机的自动机场路面检查 | Pablo Alonso, Jon Ander Iñiguez de Gordoa, Juan Diego Ortega, Sara García, Francisco Javier Iriarte, Marcos Nieto | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | Attention to detail: inter-resolution knowledge distillation | 关注细节:分辨率间知识蒸馏 | Rocío del Amor, Julio Silva-Rodríguez, Adrián Colomer, Valery Naranjo | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | Sea ice detection using concurrent multispectral and synthetic aperture radar imagery | 使用并发多光谱和合成孔径雷达图像进行海冰探测 | Martin S J Rogers, Maria Fox, Andrew Fleming, Louisa van Zeeland, Jeremy Wilkinson, J. Scott Hosking | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | Body-Area Capacitive or Electric Field Sensing for Human Activity Recognition and Human-Computer Interaction: A Comprehensive Survey | 用于人类活动识别和人机交互的身体区域电容或电场感应:综合调查 | Sizhen Bian, Mengxi Liu, Bo Zhou, Paul Lukowicz, Michele Magno | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | CoSSegGaussians: Compact and Swift Scene Segmenting 3D Gaussians | CoSSegGaussians:紧凑且快速的场景分割 3D 高斯 | Bin Dou, Tianyu Zhang, Yongjia Ma, Zhaohui Wang, Zejian Yuan | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | PartSTAD: 2D-to-3D Part Segmentation Task Adaptation | PartSTAD:2D 到 3D 零件分割任务适配 | Hyunjin Kim, Minhyuk Sung | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Implications of Noise in Resistive Memory on Deep Neural Networks for Image Classification | 电阻存储器中的噪声对图像分类深度神经网络的影响 | Yannick Emonds, Kai Xi, Holger Fröning | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Learn From Zoom: Decoupled Supervised Contrastive Learning For WCE Image Classification | 向 Zoom 学习:WCE 图像分类的解耦监督对比学习 | Kunpeng Qiu, Zhiying Zhou, Yongxin Guo | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Evaluating Data Augmentation Techniques for Coffee Leaf Disease Classification | 评估咖啡叶病分类的数据增强技术 | Adrian Gheorghiu, Iulian-Marius Tăiatu, Dumitru-Clementin Cercel, Iuliana Marin, Florin Pop | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | LKCA: Large Kernel Convolutional Attention | LKCA:大核卷积注意力 | Chenghao Li, Boheng Zeng, Yi Lu, Pengbo Shi, Qingzi Chen, Jirui Liu, Lingyun Zhu | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Video Anomaly Detection and Explanation via Large Language Models | 通过大型语言模型进行视频异常检测和解释 | Hui Lv, Qianru Sun | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition | HiCMAE:用于自监督视听情感识别的分层对比屏蔽自动编码器 | Licai Sun, Zheng Lian, Bin Liu, Jianhua Tao | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Exploring Self- and Cross-Triplet Correlations for Human-Object Interaction Detection | 探索人机交互检测的自三元组相关性和互三元组相关性 | Weibo Jiang, Weihong Ren, Jiandong Tian, Liangqiong Qu, Zhiyong Wang, Honghai Liu | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Masked Attribute Description Embedding for Cloth-Changing Person Re-identification | 用于换衣人员重新识别的屏蔽属性描述嵌入 | Chunlei Peng, Boyu Wang, Decheng Liu, Nannan Wang, Ruimin Hu, Xinbo Gao | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | MatSAM: Efficient Materials Microstructure Extraction via Visual Large Model | MatSAM:通过视觉大模型高效提取材料微观结构 | Changtai Li, Xu Han, Chao Yao, Xiaojuan Ban | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | REBUS: A Robust Evaluation Benchmark of Understanding Symbols | REBUS:理解符号的稳健评估基准 | Andrew Gritsevskiy, Arjun Panickssery, Aaron Kirtland, Derik Kauffman, Hans Gundlach, Irina Gritsevskaya, Joe Cavanagh, Jonathan Chiang, Lydia La Roux, Michelle Hung | arxiv.org/pdf/2401.05… | link |
| 2024-01-11 | Nucleus subtype classification using inter-modality learning | 使用跨模态学习进行细胞核亚型分类 | Lucas W. Remedios, Shunxing Bao, Samuel W. Remedios, Ho Hin Lee, Leon Y. Cai, Thomas Li, Ruining Deng, Can Cui, Jia Li, Qi Liu, et al. | arxiv.org/pdf/2401.05… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-11 | E$^{2}$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation | E$^{2}$GAN:用于图像到图像翻译的高效 GAN 的高效训练 | Yifan Gong, Zheng Zhan, Qing Jin, Yanyu Li, Yerlan Idelbayev, Xian Liu, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, et al. | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | PALP: Prompt Aligned Personalization of Text-to-Image Models | PALP:文本到图像模型的快速对齐个性化 | Moab Arar, Andrey Voynov, Amir Hertz, Omri Avrahami, Shlomi Fruchter, Yael Pritch, Daniel Cohen-Or, Ariel Shamir | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | TRIPS: Trilinear Point Splatting for Real-Time Radiance Field Rendering | TRIPS:用于实时辐射场渲染的三线性点溅射 | Linus Franke, Darius Rückert, Laura Fink, Marc Stamminger | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | A Lightweight Feature Fusion Architecture For Resource-Constrained Crowd Counting | 用于资源受限人群计数的轻量级特征融合架构 | Yashwardhan Chaudhuri, Ankit Kumar, Orchid Chetia Phukan, Arun Balaji Buduru | arxiv.org/pdf/2401.05… | null |
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-11 | RAVEN: Rethinking Adversarial Video Generation with Efficient Tri-plane Networks | RAVEN:重新思考使用高效三平面网络的对抗性视频生成 | Partha Ghosh, Soubhik Sanyal, Cordelia Schmid, Bernhard Schölkopf | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | GE-AdvGAN: Improving the transferability of adversarial samples by gradient editing-based adversarial generative model | GE-AdvGAN:通过基于梯度编辑的对抗生成模型提高对抗样本的可转移性 | Zhiyu Zhu, Huaming Chen, Xinyi Wang, Jiayu Zhang, Zhibo Jin, Kim-Kwang Raymond Choo | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | How does the primate brain combine generative and discriminative computations in vision? | 灵长类动物的大脑如何将视觉中的生成计算和判别计算结合起来? | Benjamin Peters, James J. DiCarlo, Todd Gureckis, Ralf Haefner, Leyla Isik, Joshua Tenenbaum, Talia Konkle, Thomas Naselaris, Kimberly Stachenfeld, Zenna Tavares, et al. | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | An attempt to generate new bridge types from latent space of PixelCNN | 尝试从 PixelCNN 的潜在空间生成新的桥类型 | Hongjun Zhang | arxiv.org/pdf/2401.05… | link |
| 2024-01-11 | Efficient Image Deblurring Networks based on Diffusion Models | 基于扩散模型的高效图像去模糊网络 | Kang Chen, Yuanjie Liu | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models | HiCAST:高度定制的任意风格转移,带有适配器增强扩散模型 | Hanzhang Wang, Haoran Wang, Jinze Yang, Zhongrui Yu, Zeke Xie, Lei Tian, Xinyan Xiao, Junjun Jiang, Xianming Liu, Mingming Sun | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | EraseDiff: Erasing Data Influence in Diffusion Models | EraseDiff:消除扩散模型中的数据影响 | Jing Wu, Trung Le, Munawar Hayat, Mehrtash Harandi | arxiv.org/pdf/2401.05… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-11 | LEGO: Language Enhanced Multi-modal Grounding Model | 乐高:语言增强多模式接地模型 | Zhaowei Li, Qi Xu, Dong Zhang, Hang Song, Yiqing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Van Tu Vu, et al. | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | Hallucination Benchmark in Medical Visual Question Answering | 医学视觉问答中的幻觉基准 | Jinge Wu, Yunsoo Kim, Honghan Wu | arxiv.org/pdf/2401.05… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-11 | Surface Normal Estimation with Transformers | 使用 Transformer 进行表面法线估计 | Barry Shichen Hu, Siyun Liang, Johannes Paetzold, Huy H. Nguyen, Isao Echizen, Jiapeng Tang | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Object-Centric Diffusion for Efficient Video Editing | 以对象为中心的扩散,实现高效视频编辑 | Kumara Kahatapitiya, Adil Karjauv, Davide Abati, Fatih Porikli, Yuki M. Asano, Amirhossein Habibian | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach | 变换图像超分辨率:一种基于 ConvFormer 的高效方法 | Gang Wu, Junjun Jiang, Junpeng Jiang, Xianming Liu | arxiv.org/pdf/2401.05… | null |
NeRF
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-11 | Fast High Dynamic Range Radiance Fields for Dynamic Scenes | 适用于动态场景的快速高动态范围辐射场 | Guanjun Wu, Taoran Yi, Jiemin Fang, Wenyu Liu, Xinggang Wang | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | GO-NeRF: Generating Virtual Objects in Neural Radiance Fields | GO-NeRF:在神经辐射场中生成虚拟对象 | Peng Dai, Feitong Tan, Xin Yu, Yinda Zhang, Xiaojuan Qi | arxiv.org/pdf/2401.05… | null |
3DGS
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-11 | Gaussian Shadow Casting for Neural Characters | 神经角色的高斯阴影投射 | Luis Bolanos, Shih-Yang Su, Helge Rhodin | arxiv.org/pdf/2401.06… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-11 | Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors | 适合所有人的配音:使用神经渲染先验进行数据高效的视觉配音 | Jack Saunders, Vinay Namboodiri | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | MatSynth: A Modern PBR Materials Dataset | MatSynth:现代 PBR 材料数据集 | Giuseppe Vecchio, Valentin Deschaintre | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | Surgical-DINO: Adapter Learning of Foundation Model for Depth Estimation in Endoscopic Surgery | Surgical-DINO:内窥镜手术深度估计基础模型的适配器学习 | Cui Beilei, Islam Mobarakol, Bai Long, Ren Hongliang | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | UAVD4L: A Large-Scale Dataset for UAV 6-DoF Localization | UAVD4L:无人机 6 自由度定位的大规模数据集 | Rouwan Wu, Xiaoya Cheng, Juelin Zhu, Xuxiang Liu, Maojun Zhang, Shen Yan | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | LiDAR data acquisition and processing for ecology applications | 用于生态应用的激光雷达数据采集和处理 | Ion Ciobotari, Adriana Príncipe, Maria Alexandra Oliveira, João Nuno Silva | arxiv.org/pdf/2401.05… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-11 | Distilling Vision-Language Models on Millions of Videos | 从数百万个视频中提取视觉语言模型 | Yue Zhao, Long Zhao, Xingyi Zhou, Jialin Wu, Chun-Te Chu, Hui Miao, Florian Schroff, Hartwig Adam, Ting Liu, Boqing Gong, et al. | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | ConKeD: Multiview contrastive descriptor learning for keypoint-based retinal image registration | ConKeD:基于关键点的视网膜图像配准的多视图对比描述符学习 | David Rivas-Villar, Álvaro S. Hervella, José Rouco, Jorge Novo | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Enhancing Contrastive Learning with Efficient Combinatorial Positive Pairing | 通过有效的组合正配对增强对比学习 | Jaeill Kim, Duhun Hwang, Eunjung Lee, Jangwon Suh, Jimyeong Kim, Wonjong Rhee | arxiv.org/pdf/2401.05… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-11 | Manipulating Feature Visualizations with Gradient Slingshots | 使用渐变弹弓操作特征可视化 | Dilyara Bareeva, Marina M.-C. Höhne, Alexander Warnecke, Lukas Pirch, Klaus-Robert Müller, Konrad Rieck, Kirill Bykov | arxiv.org/pdf/2401.06… | null |
| 2024-01-11 | MGARD: A multigrid framework for high-performance, error-controlled data compression and refactoring | MGARD:用于高性能、错误控制数据压缩和重构的多重网格框架 | Qian Gong, Jieyang Chen, Ben Whitney, Xin Liang, Viktor Reshniak, Tania Banerjee, Jaemoon Lee, Anand Rangarajan, Lipeng Wan, Nicolas Vidal, et al. | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | YOIO: You Only Iterate Once by mining and fusing multiple necessary global information in the optical flow estimation | YOIO:通过在光流估计中挖掘和融合多个必要的全局信息,您只需迭代一次 | Yu Jing, Tan Yujuan, Ren Ao, Liu Duo | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | On the representation and methodology for wide and short range head pose estimation | 宽短程头部姿态估计的表示和方法 | Alejandro Cobo, Roberto Valle, José M. Buenaposada, Luis Baumela | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | CLIP-Driven Semantic Discovery Network for Visible-Infrared Person Re-Identification | CLIP 驱动的语义发现网络,用于可见红外人员重新识别 | Xiaoyan Yu, Neng Dong, Liehuang Zhu, Hao Peng, Dapeng Tao | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Knowledge Translation: A New Pathway for Model Compression | 知识翻译:模型压缩的新途径 | Wujie Sun, Defang Chen, Jiawei Chen, Yan Feng, Chun Chen, Can Wang | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Learning Generalizable Models via Disentangling Spurious and Enhancing Potential Correlations | 通过消除虚假和增强潜在相关性来学习可推广模型 | Na Wang, Lei Qi, Jintao Guo, Yinghuan Shi, Yang Gao | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Self Expanding Convolutional Neural Networks | 自扩展卷积神经网络 | Blaise Appolinary, Alex Deaconu, Sophia Yang | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation | Parrot:用于文本到图像生成的帕累托最优多奖励强化学习框架 | Seung Hyun Lee, Yinxiao Li, Junjie Ke, Innfarn Yoo, Han Zhang, Jiahui Yu, Qifei Wang, Fei Deng, Glenn Entis, Junfeng He, et al. | arxiv.org/pdf/2401.05… | null |
| 2024-01-11 | Face-GPS: A Comprehensive Technique for Quantifying Facial Muscle Dynamics in Videos | Face-GPS:量化视频中面部肌肉动态的综合技术 | Juni Kim, Zhikang Dong, Pawel Polak | arxiv.org/pdf/2401.05… | null |