[UPDATED!] 2024-02-01 (Publish Time)
Classification/Detection/Recognition/Segmentation
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-01 | We're Not Using Videos Effectively: An Updated Domain Adaptive Video Segmentation Baseline | 我们没有有效地使用视频:更新的域自适应视频分割基准 | Simar Kareer, Vivek Vijaykumar, Harsh Maheshwari, Prithvijit Chattopadhyay, Judy Hoffman, Viraj Prabhu | arxiv.org/pdf/2402.00… | link |
| 2024-02-01 | Towards Optimal Feature-Shaping Methods for Out-of-Distribution Detection | 面向分布外检测的最佳特征整形方法 | Qinyu Zhao, Ming Xu, Kartik Gupta, Akshay Asthana, Liang Zheng, Stephen Gould | arxiv.org/pdf/2402.00… | link |
| 2024-02-01 | Automatic Segmentation of the Spinal Cord Nerve Rootlets | 脊髓神经根的自动分割 | Jan Valosek, Theo Mathieu, Raphaelle Schlienger, Olivia S. Kowalczyk, Julien Cohen-Adad | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Vehicle Perception from Satellite | 卫星车辆感知 | Bin Zhao, Pengfei Han, Xuelong Li | arxiv.org/pdf/2402.00… | link |
| 2024-02-01 | Approximating Optimal Morphing Attacks using Template Inversion | 使用模板反转近似最佳变形攻击 | Laurent Colbois, Hatef Otroshi Shahreza, Sébastien Marcel | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | A Framework for Building Point Cloud Cleaning, Plane Detection and Semantic Segmentation | 构建点云清理、平面检测和语义分割的框架 | Ilyass Abouelaziz, Youssef Mourchid | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Coronary Artery Disease Classification with Different Lesion Degree Ranges based on Deep Learning | 基于深度学习的不同病变程度范围的冠状动脉疾病分类 | Ariadna Jiménez-Partinen, Karl Thurnhofer-Hemsi, Esteban J. Palomo, Jorge Rodríguez-Capitán, Ana I. Molina-Ramos | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | CADICA: a new dataset for coronary artery disease detection by using invasive coronary angiography | CADICA:使用侵入性冠状动脉造影检测冠状动脉疾病的新数据集 | Ariadna Jiménez-Partinen, Miguel A. Molina-Cabello, Karl Thurnhofer-Hemsi, Esteban J. Palomo, Jorge Rodríguez-Capitán, Ana I. Molina-Ramos, Manuel Jiménez-Navarro | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | A Single Graph Convolution Is All You Need: Efficient Grayscale Image Classification | 您只需要一个图卷积即可:高效的灰度图像分类 | Jacob Fein-Ashley, Tian Ye, Sachini Wickramasinghe, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Masked Conditional Diffusion Model for Enhancing Deepfake Detection | 用于增强 Deepfake 检测的屏蔽条件扩散模型 | Tiewen Chen, Shanmin Yang, Shu Hu, Zhenghan Fang, Ying Fu, Xi Wu, Xin Wang | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | A Manifold Representation of the Key in Vision Transformers | 视觉 Transformer 中键的流形表示 | Li Meng, Morten Goodwin, Anis Yazidi, Paal Engelstad | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Bias Mitigating Few-Shot Class-Incremental Learning | 减少少样本类增量学习的偏差 | Li-Jun Zhao, Zhen-Duo Chen, Zi-Chao Zhang, Xin Luo, Xin-Shun Xu | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Can you see me now? Blind spot estimation for autonomous vehicles using scenario-based simulation with random reference sensors | 你现在能看见我吗?使用基于场景的模拟和随机参考传感器来估计自动驾驶汽车的盲点 | Marc Uecker, J. Marius Zöllner | arxiv.org/pdf/2402.00… | link |
| 2024-02-01 | Dual-Student Knowledge Distillation Networks for Unsupervised Anomaly Detection | 用于无监督异常检测的双学生知识蒸馏网络 | Liyi Yao, Shaobing Gao | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Lightweight Pixel Difference Networks for Efficient Visual Representation Learning | 用于高效视觉表示学习的轻量级像素差分网络 | Zhuo Su, Jiehua Zhang, Longguang Wang, Hua Zhang, Zhen Liu, Matti Pietikäinen, Li Liu | arxiv.org/pdf/2402.00… | link |
| 2024-02-01 | Disentangled Multimodal Brain MR Image Translation via Transformer-based Modality Infuser | 通过基于 Transformer 的模态注入器解开多模态大脑 MR 图像翻译 | Jihoon Cho, Xiaofeng Liu, Fangxu Xing, Jinsong Ouyang, Georges El Fakhri, Jinah Park, Jonghye Woo | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | High-Quality Medical Image Generation from Free-hand Sketch | 从手绘草图生成高质量的医学图像 | Quan Huu Cap, Atsushi Fukuda | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Machine Unlearning for Image-to-Image Generative Models | 图像到图像生成模型的机器遗忘 | Guihong Li, Hsiang Hsu, Chun-Fu Chen, Radu Marculescu | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Self-supervised learning of video representations from a child's perspective | 从儿童的角度进行视频表示的自我监督学习 | A. Emin Orhan, Wentao Wang, Alex N. Wang, Mengye Ren, Brenden M. Lake | arxiv.org/pdf/2402.00… | link |
| 2024-02-01 | Comparative Evaluation of Traditional and Deep Learning-Based Segmentation Methods for Spoil Pile Delineation Using UAV Images | 使用无人机图像进行弃土堆描绘的传统分割方法和基于深度学习的分割方法的比较评估 | Sureka Thiruchittampalam, Bikram P. Banerjee, Nancy F. Glenn, Simit Raval | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotation | FineBio:具有分层注释的生物实验细粒度视频数据集 | Takuma Yagi, Misaki Ohashi, Yifei Huang, Ryosuke Furuta, Shungo Adachi, Toutai Mitsuyama, Yoichi Sato | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Guided Interpretable Facial Expression Recognition via Spatial Action Unit Cues | 通过空间动作单元线索引导可解释的面部表情识别 | Soufiane Belharbi, Marco Pedersoli, Alessandro Lameiras Koerich, Simon Bacon, Eric Granger | arxiv.org/pdf/2402.00… | link |
| 2024-02-01 | LRDif: Diffusion Models for Under-Display Camera Emotion Recognition | LRDif:用于屏下摄像头情绪识别的扩散模型 | Zhifeng Wang, Kaihao Zhang, Ramesh Sankaranarayana | arxiv.org/pdf/2402.00… | null |
Model Compression/Optimization
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-01 | AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning | AnimateLCM:通过解耦一致性学习加速个性化扩散模型和适配器的动画 | Fu-Yun Wang, Zhaoyang Huang, Xiaoyu Shi, Weikang Bian, Guanglu Song, Yu Liu, Hongsheng Li | arxiv.org/pdf/2402.00… | link |
Generative Models
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-01 | ViCA-NeRF: View-Consistency-Aware 3D Editing of Neural Radiance Fields | ViCA-NeRF:神经辐射场的视图一致性感知 3D 编辑 | Jiahua Dong, Yu-Xiong Wang | arxiv.org/pdf/2402.00… | link |
| 2024-02-01 | Emo-Avatar: Efficient Monocular Video Style Avatar through Texture Rendering | Emo-Avatar:通过纹理渲染高效的单目视频风格头像 | Pinxin Liu, Luchuan Song, Daoan Zhang, Hang Hua, Yunlong Tang, Huaijin Tu, Jiebo Luo, Chenliang Xu | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | CapHuman: Capture Your Moments in Parallel Universes | CapHuman:在平行宇宙中捕捉你的瞬间 | Chao Liang, Fan Ma, Linchao Zhu, Yingying Deng, Yi Yang | arxiv.org/pdf/2402.00… | link |
| 2024-02-01 | Dynamic Texture Transfer using PatchMatch and Transformers | 使用 PatchMatch 和 Transformer 进行动态纹理传输 | Guo Pu, Shiyao Xu, Xixin Cao, Zhouhui Lian | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Image2Points: A 3D Point-based Context Clusters GAN for High-Quality PET Image Reconstruction | Image2Points:用于高质量 PET 图像重建的基于 3D 点的上下文聚类 GAN | Jiaqi Cui, Yan Wang, Lu Wen, Pinxian Zeng, Xi Wu, Jiliu Zhou, Dinggang Shen | arxiv.org/pdf/2402.00… | link |
Multimodal
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-01 | In-Bed Pose Estimation: A Review | 床上姿势估计:回顾 | Ziya Ata Yazıcı, Sara Colantonio, Hazım Kemal Ekenel | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Fisheye Camera and Ultrasonic Sensor Fusion For Near-Field Obstacle Perception in Bird's-Eye-View | 鱼眼相机和超声波传感器融合,实现鸟瞰近场障碍物感知 | Arindam Das, Sudarshan Paul, Niko Scholz, Akhilesh Kumar Malviya, Ganesh Sistu, Ujjwal Bhattacharya, Ciarán Eising | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Instruction Makes a Difference | 指导有所作为 | Tosin Adewumi, Nudrat Habib, Lama Alkhaled, Elisa Barney | arxiv.org/pdf/2402.00… | link |
| 2024-02-01 | Safety of Multimodal Large Language Models on Images and Text | 图像和文本多模态大语言模型的安全性 | Xin Liu, Yichen Zhu, Yunshi Lan, Chao Yang, Yu Qiao | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Multimodal Embodied Interactive Agent for Cafe Scene | 咖啡馆场景的多模态实体交互代理 | Yang Liu, Xinshuai Song, Kaixuan Jiang, Weixing Chen, Jingzhou Luo, Guanbin Li, Liang Lin | arxiv.org/pdf/2402.00… | null |
LLM
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-01 | Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks | 视觉 LLM 可以用自己生成的排版攻击来欺骗自己 | Maan Qraitem, Nazia Tasnim, Kate Saenko, Bryan A. Plummer | arxiv.org/pdf/2402.00… | null |
Transformer
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-01 | 360-GS: Layout-guided Panoramic Gaussian Splatting For Indoor Roaming | 360-GS:用于室内漫游的布局引导的全景高斯泼溅 | Jiayang Bai, Letian Huang, Jie Guo, Wen Gong, Yuanqi Li, Yanwen Guo | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | GS++: Error Analyzing and Optimal Gaussian Splatting | GS++:误差分析和最佳高斯泼溅 | Letian Huang, Jiayang Bai, Jie Guo, Yanwen Guo | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | LVC-LGMC: Joint Local and Global Motion Compensation for Learned Video Compression | LVC-LGMC:用于学习视频压缩的联合局部和全局运动补偿 | Wei Jiang, Junru Li, Kai Zhang, Li Zhang | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Merging Multi-Task Models via Weight-Ensembling Mixture of Experts | 通过专家权重组合合并多任务模型 | Anke Tang, Li Shen, Yong Luo, Nan Yin, Lefei Zhang, Dacheng Tao | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | SmartCooper: Vehicular Collaborative Perception with Adaptive Fusion and Judger Mechanism | SmartCooper:具有自适应融合和判断机制的车辆协同感知 | Yuang Zhang, Haonan An, Zhengru Fang, Guowen Xu, Yuan Zhou, Xianhao Chen, Yuguang Fang | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | A Survey on Hallucination in Large Vision-Language Models | 大视觉语言模型中幻觉的调查 | Hanchao Liu, Wenyuan Xue, Yifei Chen, Dapeng Chen, Xiutian Zhao, Ke Wang, Liping Hou, Rongjun Li, Wei Peng | arxiv.org/pdf/2402.00… | null |
3DGS
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-01 | StopThePop: Sorted Gaussian Splatting for View-Consistent Real-time Rendering | StopThePop:用于视图一致实时渲染的排序高斯泼溅 | Lukas Radl, Michael Steiner, Mathias Parger, Alexander Weinrauch, Bernhard Kerbl, Markus Steinberger | arxiv.org/pdf/2402.00… | null |
3D/CG
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-01 | AToM: Amortized Text-to-Mesh using 2D Diffusion | AToM:使用 2D 扩散的摊销文本到网格 | Guocheng Qian, Junli Cao, Aliaksandr Siarohin, Yash Kant, Chaoyang Wang, Michael Vasilkovsky, Hsin-Ying Lee, Yuwei Fang, Ivan Skorokhodov, Peiye Zhuang, et al. | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Geometry Transfer for Stylizing Radiance Fields | 用于风格化辐射场的几何传递 | Hyunyoung Jung, Seonghyeon Nam, Nikolaos Sarafianos, Sungjoo Yoo, Alexander Sorkine-Hornung, Rakesh Ranjan | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | DRSM: efficient neural 4d decomposition for dynamic reconstruction in stationary monocular cameras | DRSM:用于固定单目相机动态重建的高效神经 4d 分解 | Weixing Xie, Xiao Dong, Yong Yang, Qiqin Lin, Jingze Chen, Junfeng Yao, Xiaohu Guo | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Diffusion-based Light Field Synthesis | 基于扩散的光场合成 | Ruisheng Gao, Yutong Liu, Zeyu Xiao, Zhiwei Xiong | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Recasting Regional Lighting for Shadow Removal | 重铸区域照明以消除阴影 | Yuhao Liu, Zhanghan Ke, Ke Xu, Fang Liu, Zhenwei Wang, Rynson W. H. Lau | arxiv.org/pdf/2402.00… | null |
Learning Paradigms
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-01 | Exploring Homogeneous and Heterogeneous Consistent Label Associations for Unsupervised Visible-Infrared Person ReID | 探索无监督可见红外行人再识别的同质和异质一致标签关联 | Lingfeng He, De Cheng, Nannan Wang, Xinbo Gao | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Deep Clustering Using the Soft Silhouette Score: Towards Compact and Well-Separated Clusters | 使用 Soft Silhouette 分数进行深度聚类:实现紧凑且分离良好的聚类 | Georgios Vardakas, Ioannis Papakostas, Aristidis Likas | arxiv.org/pdf/2402.00… | null |
Other
| Publish Date | Title | Title_CN | Authors | PDF | Code |
|---|---|---|---|---|---|
| 2024-02-01 | BootsTAP: Bootstrapped Training for Tracking-Any-Point | BootsTAP:用于跟踪任意点的引导训练 | Carl Doersch, Yi Yang, Dilara Gokay, Pauline Luc, Skanda Koppula, Ankush Gupta, Joseph Heyward, Ross Goroshin, João Carreira, Andrew Zisserman | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | ChaosBench: A Multi-Channel, Physics-Based Benchmark for Subseasonal-to-Seasonal Climate Prediction | ChaosBench:基于物理的多通道次季节到季节气候预测基准 | Juan Nathaniel, Yongquan Qu, Tung Nguyen, Sungduk Yu, Julius Busecke, Aditya Grover, Pierre Gentine | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Deep Robot Sketching: An application of Deep Q-Learning Networks for human-like sketching | 深度机器人素描:深度 Q 学习网络在类人素描中的应用 | Raul Fernandez-Fernandez, Juan G. Victores, Carlos Balaguer | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Tropical Decision Boundaries for Neural Networks Are Robust Against Adversarial Attacks | 神经网络的热带决策边界对于对抗性攻击具有鲁棒性 | Kurt Pasque, Christopher Teska, Ruriko Yoshida, Keiji Miura, Jefferson Huang | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Short: Benchmarking transferable adversarial attacks | 简短:可转移对抗性攻击的基准测试 | Zhibo Jin, Jiayu Zhang, Zhiyu Zhu, Huaming Chen | arxiv.org/pdf/2402.00… | link |
| 2024-02-01 | LM-HT SNN: Enhancing the Performance of SNN to ANN Counterpart through Learnable Multi-hierarchical Threshold Model | LM-HT SNN:通过可学习的多层次阈值模型增强 SNN 与 ANN 对应物的性能 | Zecheng Hao, Xinyu Shi, Zhiyu Pan, Yujia Liu, Zhaofei Yu, Tiejun Huang | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | InfMAE: A Foundation Model in Infrared Modality | InfMAE:红外模态的基础模型 | Fangcen Liu, Chenqiang Gao, Yaming Zhang, Junjie Guo, Jinhao Wang, Deyu Meng | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | SCO-VIST: Social Interaction Commonsense Knowledge-based Visual Storytelling | SCO-VIST:基于社交互动常识知识的视觉叙事 | Eileen Wang, Soyeon Caren Han, Josiah Poon | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Invariance-powered Trustworthy Defense via Remove Then Restore | 通过删除然后恢复实现不变性的可信防御 | Xiaowei Fu, Yuhang Zhou, Lina Ma, Lei Zhang | arxiv.org/pdf/2402.00… | null |
| 2024-02-01 | Understanding Neural Network Systems for Image Analysis using Vector Spaces and Inverse Maps | 了解使用向量空间和逆映射进行图像分析的神经网络系统 | Rebecca Pattichis, Marios S. Pattichis | arxiv.org/pdf/2402.00… | null |