[UPDATED!] 2024-01-12 (Publish Time)
Classification/Detection/Recognition/Segmentation
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-12 | Seeing the roads through the trees: A benchmark for modeling spatial dependencies with aerial imagery | 透过树林看路:利用航空图像对空间依赖性进行建模的基准 | Caleb Robinson, Isaac Corley, Anthony Ortiz, Rahul Dodhia, Juan M. Lavista Ferres, Peyman Najafirad | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Scalable 3D Panoptic Segmentation With Superpoint Graph Clustering | 具有超点图聚类的可扩展 3D 全景分割 | Damien Robert, Hugo Raguet, Loic Landrieu | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Embedded Planogram Compliance Control System | 嵌入式货架合规控制系统 | M. Erkin Yücel, Serkan Topaloğlu, Cem Ünsalan | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Adversarial Examples are Misaligned in Diffusion Model Manifolds | 扩散模型流形中的对抗性示例未对齐 | Peter Lorenz, Ricard Durall, Jansi Keuper | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking | Motion2VecSets:用于非刚性形状重建和跟踪的 4D 潜在向量集扩散 | Wei Cao, Chang Luo, Biao Zhang, Matthias Nießner, Jiapeng Tang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Dynamic Behaviour of Connectionist Speech Recognition with Strong Latency Constraints | 具有强延迟约束的联结主义语音识别的动态行为 | Giampiero Salvi | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Resource-Efficient Gesture Recognition using Low-Resolution Thermal Camera via Spiking Neural Networks and Sparse Segmentation | 通过尖峰神经网络和稀疏分割,使用低分辨率热像仪进行资源高效的手势识别 | Ali Safa, Wout Mommen, Lars Keuninckx | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Multimodal Learning for detecting urban functional zones using remote sensing image and multi-semantic information | 利用遥感图像和多语义信息检测城市功能区的多模态学习 | Chuanji Shi, Yingying Zhang, Jiaotuan Wang, Qiqi Zhu | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning | 增强一致性并减少偏差:增量学习的数据重放方法 | Chenyang Wang, Junjun Jiang, Xingyu Hu, Xianming Liu, Xiangyang Ji | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Optimizing Feature Selection for Binary Classification with Noisy Labels: A Genetic Algorithm Approach | 优化带有噪声标签的二元分类的特征选择:遗传算法方法 | Vandad Imani, Elaheh Moradi, Carlos Sevilla-Salcedo, Vittorio Fortino, Jussi Tohka | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Robustness-Aware 3D Object Detection in Autonomous Driving: A Review and Outlook | 自动驾驶中的鲁棒性感知 3D 物体检测:回顾与展望 | Ziying Song, Lin Liu, Feiyang Jia, Yadan Luo, Guoxin Zhang, Lei Yang, Li Wang, Caiyan Jia | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Exploring Diverse Representations for Open Set Recognition | 探索开集识别的多样化表示 | Yu Wang, Junxian Mu, Pengfei Zhu, Qinghua Hu | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Frequency Masking for Universal Deepfake Detection | 用于通用 Deepfake 检测的频率掩蔽 | Chandler Timm Doloriel, Ngai-Man Cheung | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Improving the Detection of Small Oriented Objects in Aerial Images | 改进航空图像中小定向物体的检测 | Chandler Timm C. Doloriel, Rhandley D. Cajote | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Fully Automated Tumor Segmentation for Brain MRI data using Multiplanner UNet | 使用 Multiplanner UNet 对脑 MRI 数据进行全自动肿瘤分割 | Sumit Pandey, Satyasaran Changdar, Mathias Perslev, Erik B Dam | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Self-supervised Learning of Dense Hierarchical Representations for Medical Image Segmentation | 医学图像分割的密集层次表示的自监督学习 | Eytan Kats, Jochen G. Hirsch, Mattias P. Heinrich | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Improving Low-Light Image Recognition Performance Based on Image-adaptive Learnable Module | 基于图像自适应学习模块提高弱光图像识别性能 | Seitaro Ono, Yuka Ogino, Takahiro Toizumi, Atsushi Ito, Masato Tsukada | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding | UMG-CLIP:用于理解开放世界的统一多粒度视觉通才 | Bowen Shi, Peisen Zhao, Zichen Wang, Yuhang Zhang, Yaoming Wang, Jin Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian, et al. | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical Refinement and EM optimization | SD-MVS:具有球面细化和 EM 优化的分段驱动变形多视图立体 | Zhenlong Yuan, Jiakai Cao, Zhaoxin Li, Hao Jiang, Zhaoqi Wang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | SamLP: A Customized Segment Anything Model for License Plate Detection | SamLP:用于车牌检测的定制分段任意模型 | Haoxuan Ding, Junyu Gao, Yuan Yuan, Qi Wang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Graph Relation Distillation for Efficient Biomedical Instance Segmentation | 用于高效生物医学实例分割的图关系蒸馏 | Xiaoyu Liu, Yueyi Zhang, Zhiwei Xiong, Wei Huang, Bo Hu, Xiaoyan Sun, Feng Wu | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | AffordanceLLM: Grounding Affordance from Vision Language Models | AffordanceLLM:视觉语言模型的基础可供性 | Shengyi Qian, Weifeng Chen, Min Bai, Xiong Zhou, Zhuowen Tu, Li Erran Li | arxiv.org/pdf/2401.06… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-12 | Mutual Distillation Learning For Person Re-Identification | 用于人员重新识别的相互蒸馏学习 | Huiyuan Fu, Kuilong Cui, Chuanming Wang, Mengshi Qi, Huadong Ma | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | UPDP: A Unified Progressive Depth Pruner for CNN and Vision Transformer | UPDP:CNN 和 Vision Transformer 的统一渐进深度剪枝器 | Ji Liu, Dehua Tang, Yuanxian Huang, Li Zhang, Xiaocheng Zeng, Dong Li, Mingjie Lu, Jinzhang Peng, Yu Wang, Fan Jiang, et al. | arxiv.org/pdf/2401.06… | null |
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-12 | Decoupling Pixel Flipping and Occlusion Strategy for Consistent XAI Benchmarks | 解耦像素翻转和遮挡策略以实现一致的 XAI 基准 | Stefan Blücher, Johanna Vielhaben, Nils Strodthoff | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | 360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model | 360DVD:具有 360 度视频扩散模型的可控全景视频生成 | Qian Wang, Weiqi Li, Chong Mou, Xinhua Cheng, Jian Zhang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | RotationDrag: Point-based Image Editing with Rotated Diffusion Features | RotationDrag:具有旋转扩散功能的基于点的图像编辑 | Minxing Luo, Wentao Cheng, Jian Yang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | ModaVerse: Efficiently Transforming Modalities with LLMs | ModaVerse:利用 LLM 高效转换模态 | Xinyu Wang, Bohan Zhuang, Qi Wu | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Seek for Incantations: Towards Accurate Text-to-Image Diffusion Synthesis through Prompt Engineering | 寻求咒语:通过提示工程实现准确的文本到图像扩散合成 | Chang Yu, Junran Peng, Xiangyu Zhu, Zhaoxiang Zhang, Qi Tian, Zhen Lei | arxiv.org/pdf/2401.06… | null |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-12 | Generalizing Visual Question Answering from Synthetic to Human-Written Questions via a Chain of QA with a Large Language Model | 通过具有大型语言模型的 QA 链将视觉问答从合成问题推广到人类编写的问题 | Taehee Kim, Yeongjae Cho, Heejun Shin, Yohan Jo, Dongmyung Shin | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Hyper-STTN: Social Group-aware Spatial-Temporal Transformer Network for Human Trajectory Prediction with Hypergraph Reasoning | Hyper-STTN:利用超图推理进行人体轨迹预测的社会群体感知时空变换器网络 | Weizheng Wang, Le Mao, Baijian Yang, Guohua Chen, Byung-Cheol Min | arxiv.org/pdf/2401.06… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-12 | PCB-Vision: A Multiscene RGB-Hyperspectral Benchmark Dataset of Printed Circuit Boards | PCB-Vision:印刷电路板的多场景 RGB 高光谱基准数据集 | Elias Arbash, Margret Fuchs, Behnood Rasti, Sandra Lorenz, Pedram Ghamisi, Richard Gloaguen | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | MedTransformer: Accurate AD Diagnosis for 3D MRI Images through 2D Vision Transformers | MedTransformer:通过 2D Vision Transformer 对 3D MRI 图像进行准确的 AD 诊断 | Yifeng Wang, Ke Chen, Yihan Zhang, Haohan Wang | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention | 具有掩码帧间和帧内注意力的视频超分辨率 Transformer | Xingyu Zhou, Leheng Zhang, Xiaorui Zhao, Keze Wang, Leida Li, Shuhang Gu | arxiv.org/pdf/2401.06… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-12 | 3D Reconstruction of Interacting Multi-Person in Clothing from a Single Image | 从单个图像重建多人服装交互的 3D 重建 | Junuk Cha, Hansol Lee, Jaewon Kim, Nhat Nguyen Bao Truong, Jae Shin Yoon, Seungryul Baek | arxiv.org/pdf/2401.06… | null |
GNN
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-12 | Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction | 用于行人意图预测的综合数据生成框架、数据集和高效深度模型 | Muhammad Naveed Riaz, Maciej Wielgosz, Abel Garcia Romera, Antonio M. Lopez | arxiv.org/pdf/2401.06… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-01-12 | AttributionScanner: A Visual Analytics System for Metadata-Free Data-Slicing Based Model Validation | AttributionScanner:用于基于无元数据数据切片的模型验证的可视化分析系统 | Xiwei Xuan, Jorge Piazentin Ono, Liang Gou, Kwan-Liu Ma, Liu Ren | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | UAV-borne Mapping Algorithms for Canopy-Level and High-Speed Drone Applications | 适用于冠层级和高速无人机应用的无人机载测绘算法 | Jincheng Zhang, Artur Wolek, Andrew R. Willis | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Application Of Vision-Language Models For Assessing Osteoarthritis Disease Severity | 视觉语言模型在评估骨关节炎疾病严重程度中的应用 | Banafshe Felfeliyan, Yuyue Zhou, Shrimanti Ghosh, Jessica Kupper, Shaobo Liu, Abhilash Hareendranathan, Jacob L. Jaremko | arxiv.org/pdf/2401.06… | null |
| 2024-01-12 | Beyond the Surface: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation | 超越表面:文本到图像生成中视觉刻板印象的全球范围分析 | Akshita Jha, Vinodkumar Prabhakaran, Remi Denton, Sarah Laszlo, Shachi Dave, Rida Qadri, Chandan K. Reddy, Sunipa Dev | arxiv.org/pdf/2401.06… | null |