[UPDATED!] 2024-03-13 (Publish Time)
Generative Models
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | VLOGGER:用于具体化身合成的多模态扩散 | Enric Corona, Andrei Zanfir, Eduard Gabriel Bazavan, Nikos Kolotouros, Thiemo Alldieck, Cristian Sminchisescu | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Spatiotemporal Diffusion Model with Paired Sampling for Accelerated Cardiac Cine MRI | 加速心脏电影 MRI 的成对采样时空扩散模型 | Shihan Qiu, Shaoyan Pan, Yikang Liu, Lin Zhao, Jian Xu, Qi Liu, Terrence Chen, Eric Z. Chen, Xiao Chen, Shanhui Sun | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Clinically Feasible Diffusion Reconstruction for Highly-Accelerated Cardiac Cine MRI | 临床上可行的高加速心脏电影 MRI 扩散重建 | Shihan Qiu, Shaoyan Pan, Yikang Liu, Lin Zhao, Jian Xu, Qi Liu, Terrence Chen, Eric Z. Chen, Xiao Chen, Shanhui Sun | arxiv.org/pdf/2403.08… | null |
2024-03-13 | GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing | GaussCtrl:多视图一致文本驱动的 3D 高斯泼溅编辑 | Jing Wu, Jia-Wang Bian, Xinghui Li, Guangrun Wang, Ian Reid, Philip Torr, Victor Adrian Prisacariu | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models trained on Corrupted Data | 环境扩散后采样:使用受损坏数据训练的扩散模型解决逆问题 | Asad Aali, Giannis Daras, Brett Levac, Sidharth Kumar, Alexandros G. Dimakis, Jonathan I. Tamir | arxiv.org/pdf/2403.08… | link |
2024-03-13 | Data Augmentation in Human-Centric Vision | 以人为本的视觉中的数据增强 | Wentao Jiang, Yige Zhang, Shaozhong Zheng, Si Liu, Shuicheng Yan | arxiv.org/pdf/2403.08… | null |
2024-03-13 | ActionDiffusion: An Action-aware Diffusion Model for Procedure Planning in Instructional Videos | ActionDiffusion:用于教学视频中的程序规划的动作感知扩散模型 | Lei Shi, Paul Bürkner, Andreas Bulling | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Model Will Tell: Training Membership Inference for Diffusion Models | 模型会告诉我们:训练扩散模型的成员推理 | Xiaomeng Fu, Xi Wang, Qiao Li, Jin Liu, Jiao Dai, Jizhong Han | arxiv.org/pdf/2403.08… | null |
2024-03-13 | MD-Dose: A Diffusion Model based on the Mamba for Radiotherapy Dose Prediction | MD-Dose:基于 Mamba 的放疗剂量预测扩散模型 | Linjie Fu, Xia Li, Xiuding Cai, Yingkai Wang, Xueyao Wang, Yali Shen, Yu Yao | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Diffusion Models with Implicit Guidance for Medical Anomaly Detection | 具有隐式指导的医疗异常检测的扩散模型 | Cosmin I. Bercea, Benedikt Wiestler, Daniel Rueckert, Julia A. Schnabel | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Towards Dense and Accurate Radar Perception Via Efficient Cross-Modal Diffusion Model | 通过高效的跨模态扩散模型实现密集且准确的雷达感知 | Ruibin Zhang, Donglai Xue, Yuhan Wang, Ruixu Geng, Fei Gao | arxiv.org/pdf/2403.08… | null |
2024-03-13 | PFStorer: Personalized Face Restoration and Super-Resolution | PFStorer:个性化面部恢复和超分辨率 | Tuomas Varanka, Tapani Toivonen, Soumya Tripathy, Guoying Zhao, Erman Acar | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Iterative Online Image Synthesis via Diffusion Model for Imbalanced Classification | 通过扩散模型进行不平衡分类的迭代在线图像合成 | Shuhan Li, Yi Lin, Hao Chen, Kwang-Ting Cheng | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Tackling the Singularities at the Endpoints of Time Intervals in Diffusion Models | 解决扩散模型中时间间隔端点的奇异性 | Pengze Zhang, Hubery Yin, Chen Li, Xiaohua Xie | arxiv.org/pdf/2403.08… | link |
2024-03-13 | Mitigate Target-level Insensitivity of Infrared Small Target Detection via Posterior Distribution Modeling | 通过后验分布建模减轻红外小目标检测的目标级不敏感性 | Haoqing Li, Jinfu Yang, Yifei Xu, Runshi Wang | arxiv.org/pdf/2403.08… | link |
2024-03-13 | Attack Deterministic Conditional Image Generative Models for Diverse and Controllable Generation | 攻击确定性条件图像生成模型,生成多样化、可控 | Tianyi Chu, Wei Xing, Jiafu Chen, Zhizhong Wang, Jiakai Sun, Lei Zhao, Haibo Chen, Huaizhong Lin | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation | 开放式多智能体导航的分层自动组织系统 | Zhonghan Zhao, Kewei Chen, Dongxu Guo, Wenhao Chai, Tian Ye, Yanting Zhang, Gaoang Wang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | VIGFace: Virtual Identity Generation Model for Face Image Synthesis | VIGFace:人脸图像合成的虚拟身份生成模型 | Minsoo Kim, Min-Cheol Sagong, Gi Pyo Nam, Junghyun Cho, Ig-Jae Kim | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Sketch2Manga: Shaded Manga Screening from Sketch with Diffusion Models | Sketch2Manga:使用扩散模型从 Sketch 进行阴影漫画筛选 | Jian Lin, Xueting Liu, Chengze Li, Minshan Xie, Tien-Tsin Wong | arxiv.org/pdf/2403.08… | null |
2024-03-13 | CoroNetGAN: Controlled Pruning of GANs via Hypernetworks | CoroNetGAN:通过超网络控制 GAN 修剪 | Aman Kumar, Khushboo Anand, Shubham Mandloi, Ashutosh Mishra, Avinash Thakur, Neeraj Kasera, Prathosh A P | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Make Me Happier: Evoking Emotions Through Image Diffusion Models | 让我更快乐:通过图像扩散模型唤起情绪 | Qing Lin, Jingfeng Zhang, Yew Soon Ong, Mengmi Zhang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Point Cloud Compression via Constrained Optimal Transport | 通过约束最优传输进行点云压缩 | Zezeng Li, Weimin Wang, Ziliang Wang, Na Lei | arxiv.org/pdf/2403.08… | link |
2024-03-13 | PaddingFlow: Improving Normalizing Flows with Padding-Dimensional Noise | PaddingFlow:利用填充维噪声改进标准化流 | Qinglong Meng, Chongkun Xia, Xueqian Wang | arxiv.org/pdf/2403.08… | link |
2024-03-13 | ShadowRemovalNet: Efficient Real-Time Shadow Removal | ShadowRemovalNet:高效实时阴影去除 | Alzayat Saleh, Alex Olsen, Jake Wood, Bronson Philippa, Mostafa Rahimi Azghadi | arxiv.org/pdf/2403.08… | null |
Multimodal
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization | 通过引导偏好优化强化多模态大语言模型 | Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | A Multimodal Fusion Network For Student Emotion Recognition Based on Transformer and Tensor Product | 基于变压器和张量积的学生情绪识别多模态融合网络 | Ao Xiang, Zongqing Qi, Han Wang, Qin Yang, Danqing Ma | arxiv.org/pdf/2403.08… | null |
2024-03-13 | CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model | CoIN:多模型大语言模型持续指令调优的基准 | Cheng Chen, Junchen Zhu, Xu Luo, Hengtao Shen, Lianli Gao, Jingkuan Song | arxiv.org/pdf/2403.08… | link |
2024-03-13 | REPAIR: Rank Correlation and Noisy Pair Half-replacing with Memory for Noisy Correspondence | 修复:等级相关性和噪声对用内存替换一半以实现噪声对应 | Ruochen Zheng, Jiahao Hong, Changxin Gao, Nong Sang | arxiv.org/pdf/2403.08… | null |
NeRF
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | Gaussian Splatting in Style | 高斯泼溅风格 | Abhishek Saroha, Mariia Gladkova, Cecilia Curreli, Tarun Yenamandra, Daniel Cremers | arxiv.org/pdf/2403.08… | null |
2024-03-13 | StyleDyRF: Zero-shot 4D Style Transfer for Dynamic Neural Radiance Fields | StyleDyRF:动态神经辐射场的零样本 4D 风格迁移 | Hongbin Xu, Weitao Chen, Feng Xiao, Baigui Sun, Wenxiong Kang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | NeRF-Supervised Feature Point Detection and Description | NeRF 监督的特征点检测和描述 | Ali Youssef, Francisco Vasconcelos | arxiv.org/pdf/2403.08… | null |
3DGS
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting | GaussianImage:通过 2D 高斯分布进行 1000 FPS 图像表示和压缩 | Xinjie Zhang, Xingtong Ge, Tongda Xu, Dailan He, Yan Wang, Hongwei Qin, Guo Lu, Jing Geng, Jun Zhang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation | ManiGaussian:用于多任务机器人操作的动态高斯泼溅 | Guanxing Lu, Shiyi Zhang, Ziwei Wang, Changliu Liu, Jiwen Lu, Yansong Tang | arxiv.org/pdf/2403.08… | null |
Model Compression/Optimization
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | MonoOcc: Digging into Monocular Semantic Occupancy Prediction | MonoOcc:深入研究单目语义占用预测 | Yupeng Zheng, Xiang Li, Pengfei Li, Yuhang Zheng, Bu Jin, Chengliang Zhong, Xiaoxiao Long, Hao Zhao, Qichao Zhang | arxiv.org/pdf/2403.08… | link |
2024-03-13 | Deep Learning for In-Orbit Cloud Segmentation and Classification in Hyperspectral Satellite Data | 高光谱卫星数据在轨云分割和分类的深度学习 | Daniel Kovac, Jan Mucha, Jon Alvarez Justo, Jiri Mekyska, Zoltan Galaz, Krystof Novotny, Radoslav Pitonak, Jan Knezik, Jonas Herec, Tor Arne Johansen | arxiv.org/pdf/2403.08… | null |
2024-03-13 | LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual Semantic Segmentation for Autonomous Driving | LIX:将空间几何先验知识隐式融入自动驾驶的视觉语义分割中 | Sicen Guo, Zhiyuan Wu, Qijun Chen, Ioannis Pitas, Rui Fan | arxiv.org/pdf/2403.08… | null |
2024-03-13 | AutoDFP: Automatic Data-Free Pruning via Channel Similarity Reconstruction | AutoDFP:通过渠道相似性重建进行自动无数据修剪 | Siqi Li, Jun Chen, Jingyang Xiang, Chengrui Zhu, Yong Liu | arxiv.org/pdf/2403.08… | null |
Classification/Detection/Recognition/Segmentation/...
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | Segmentation of Knee Bones for Osteoarthritis Assessment: A Comparative Analysis of Supervised, Few-Shot, and Zero-Shot Learning Approaches | 用于骨关节炎评估的膝骨分割:监督式、少样本和零样本学习方法的比较分析 | Yun Xin Teoh, Alice Othmani, Siew Li Goh, Juliana Usman, Khin Wee Lai | arxiv.org/pdf/2403.08… | null |
2024-03-13 | MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning | MIM4D:用于自动驾驶表示学习的多视图视频蒙版建模 | Jialv Zou, Bencheng Liao, Qian Zhang, Wenyu Liu, Xinggang Wang | arxiv.org/pdf/2403.08… | link |
2024-03-13 | DAM: Dynamic Adapter Merging for Continual Video QA Learning | DAM:用于持续视频 QA 学习的动态适配器合并 | Feng Cheng, Ziyang Wang, Yi-Lin Sung, Yan-Bo Lin, Mohit Bansal, Gedas Bertasius | arxiv.org/pdf/2403.08… | link |
2024-03-13 | Real-time 3D semantic occupancy prediction for autonomous vehicles using memory-efficient sparse convolution | 使用内存高效的稀疏卷积对自动驾驶车辆进行实时 3D 语义占用预测 | Samuel Sze, Lars Kunze | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Historical Astronomical Diagrams Decomposition in Geometric Primitives | 历史天文图的几何基元分解 | Syrine Kalleli, Scott Trigg, Ségolène Albouy, Mathieu Husson, Mathieu Aubry | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Exploiting Structural Consistency of Chest Anatomy for Unsupervised Anomaly Detection in Radiography Images | 利用胸部解剖结构的一致性进行放射摄影图像中的无监督异常检测 | Tiange Xiang, Yixiao Zhang, Yongyi Lu, Alan Yuille, Chaoyi Zhang, Weidong Cai, Zongwei Zhou | arxiv.org/pdf/2403.08… | null |
2024-03-13 | OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework | OneVOS:通过一体化 Transformer 框架统一视频对象分割 | Wanyun Li, Pinxue Guo, Xinyu Zhou, Lingyi Hong, Yangji He, Xiangyu Zheng, Wei Zhang, Wenqiang Zhang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | A Decade's Battle on Dataset Bias: Are We There Yet? | 十年来对抗数据集偏差的斗争:我们到了吗? | Zhuang Liu, Kaiming He | arxiv.org/pdf/2403.08… | link |
2024-03-13 | PRAGO: Differentiable Multi-View Pose Optimization From Objectness Detections | PRAGO:通过物体检测进行可微分多视图姿势优化 | Matteo Taiana, Matteo Toso, Stuart James, Alessio Del Bue | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Leveraging Compressed Frame Sizes For Ultra-Fast Video Classification | 利用压缩帧大小进行超快速视频分类 | Yuxing Han, Yunan Ding, Chen Ye Gan, Jiangtao Wen | arxiv.org/pdf/2403.08… | null |
2024-03-13 | CINA: Conditional Implicit Neural Atlas for Spatio-Temporal Representation of Fetal Brains | CINA:胎儿大脑时空表征的条件隐式神经图谱 | Maik Dannecker, Vanessa Kyriakopoulou, Lucilio Cordero-Grande, Anthony N. Price, Joseph V. Hajnal, Daniel Rueckert | arxiv.org/pdf/2403.08… | null |
2024-03-13 | AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models | AIGC 也让人工智能感到困惑:调查和解释大型视觉语言模型中合成图像引起的幻觉 | Yifei Gao, Jiaqi Wang, Zhiyu Lin, Jitao Sang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | HOLMES: HOLonym-MEronym based Semantic inspection for Convolutional Image Classifiers | HOLMES:基于 HOLonym-MEronym 的卷积图像分类器语义检查 | Francesco Dibitonto, Fabio Garcea, André Panisson, Alan Perotti, Lia Morra | arxiv.org/pdf/2403.08… | link |
2024-03-13 | Pig aggression classification using CNN, Transformers and Recurrent Networks | 使用 CNN、Transformers 和循环网络对猪的攻击行为进行分类 | Junior Silva Souza, Eduardo Bedin, Gabriel Toshio Hirokawa Higa, Newton Loebens, Hemerson Pistori | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Improved YOLOv5 Based on Attention Mechanism and FasterNet for Foreign Object Detection on Railway and Airway tracks | 基于注意力机制和FasterNet的改进YOLOv5用于铁路和航空轨道上的异物检测 | Zongqing Qi, Danqing Ma, Jingyu Xu, Ao Xiang, Hedi Qu | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation | 语言驱动的零样本语义分割视觉共识 | Zicheng Zhang, Tong Zhang, Yi Zhu, Jianzhuang Liu, Xiaodan Liang, QiXiang Ye, Wei Ke | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Low-Cost and Real-Time Industrial Human Action Recognitions Based on Large-Scale Foundation Models | 基于大规模基础模型的低成本实时工业人体动作识别 | Wensheng Liang, Ruiyan Zhuang, Xianwei Shi, Shuai Li, Zhicheng Wang, Xiaoguang Ma | arxiv.org/pdf/2403.08… | null |
2024-03-13 | The Development and Performance of a Machine Learning Based Mobile Platform for Visually Determining the Etiology of Penile Pathology | 基于机器学习的移动平台的开发和性能,用于直观地确定阴茎病理学的病因 | Lao-Tzu Allan-Blitz, Sithira Ambepitiya, Raghavendra Tirupathi, Jeffrey D. Klausner, Yudara Kularathne | arxiv.org/pdf/2403.08… | null |
2024-03-13 | RAF-GI: Towards Robust, Accurate and Fast-Convergent Gradient Inversion Attack in Federated Learning | RAF-GI:联邦学习中稳健、准确和快速收敛的梯度反转攻击 | Can Liu, Jin Wang, Dongyang Yu | arxiv.org/pdf/2403.08… | null |
2024-03-13 | A Generalized Framework with Adaptive Weighted Soft-Margin for Imbalanced SVM Classification | 具有自适应加权软间隔的不平衡SVM分类的通用框架 | Lu Jiang, Qi Wang, Yuhang Chang, Jianing Song, Haoyue Fu | arxiv.org/pdf/2403.08… | null |
2024-03-13 | DrFER: Learning Disentangled Representations for 3D Facial Expression Recognition | DrFER:学习 3D 面部表情识别的解缠结表示 | Hebeizi Li, Hongyu Yang, Di Huang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | MGIC: A Multi-Label Gradient Inversion Attack based on Canny Edge Detection on Federated Learning | MGIC:联邦学习上基于 Canny 边缘检测的多标签梯度反转攻击 | Can Liu, Jin Wang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Optimized Detection and Classification on GTRSB: Advancing Traffic Sign Recognition with Convolutional Neural Networks | GTRSB 的优化检测和分类:利用卷积神经网络推进交通标志识别 | Dhruv Toshniwal, Saurabh Loya, Anuj Khot, Yash Marda | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Pre-examinations Improve Automated Metastases Detection on Cranial MRI | 预检查可改善颅脑 MRI 转移瘤的自动检测 | Katerina Deike-Hofmann, Dorottya Dancs, Daniel Paech, Heinz-Peter Schlemmer, Klaus Maier-Hein, Philipp Bäumer, Alexander Radbruch, Michael Götz | arxiv.org/pdf/2403.08… | null |
2024-03-13 | LiqD: A Dynamic Liquid Level Detection Model under Tricky Small Containers | LiqD:棘手小容器下的动态液位检测模型 | Yukun Ma, Zikun Mao | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Efficient Prompt Tuning of Large Vision-Language Model for Fine-Grained Ship Classification | 高效快速调整大视觉语言模型以实现细粒度船舶分类 | Long Lan, Fengxiang Wang, Shuyan Li, Xiangtao Zheng, Zengmao Wang, Xinwang Liu | arxiv.org/pdf/2403.08… | null |
2024-03-13 | IG-FIQA: Improving Face Image Quality Assessment through Intra-class Variance Guidance robust to Inaccurate Pseudo-Labels | IG-FIQA:通过对不准确的伪标签稳健的类内方差指导来改进人脸图像质量评估 | Minsoo Kim, Gi Pyo Nam, Haksub Kim, Haesol Park, Ig-Jae Kim | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Continuous Object State Recognition for Cooking Robots Using Pre-Trained Vision-Language Models and Black-box Optimization | 使用预先训练的视觉语言模型和黑盒优化对烹饪机器人进行连续物体状态识别 | Kento Kawaharazuka, Naoaki Kanazawa, Yoshiki Obinata, Kei Okada, Masayuki Inaba | arxiv.org/pdf/2403.08… | null |
2024-03-13 | P2LHAP:Wearable sensor-based human activity recognition, segmentation and forecast through Patch-to-Label Seq2Seq Transformer | P2LHAP:通过 Patch-to-Label Seq2Seq Transformer 基于可穿戴传感器的人体活动识别、分割和预测 | Shuangjian Li, Tao Zhu, Mingxing Nie, Huansheng Ning, Zhenyu Liu, Liming Chen | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks | 提高人工智能系统的安全性:一种检测深度神经网络后门的新方法 | Khondoker Murad Hossain, Tim Oates | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Versatile Defense Against Adversarial Attacks on Image Recognition | 针对图像识别的对抗性攻击的多功能防御 | Haibo Zhang, Zhihua Yao, Kouichi Sakurai | arxiv.org/pdf/2403.08… | null |
2024-03-13 | LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition | LAFS:基于地标的面部自监督学习用于人脸识别 | Zhonglin Sun, Chen Feng, Ioannis Patras, Georgios Tzimiropoulos | arxiv.org/pdf/2403.08… | link |
2024-03-13 | Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks | 用于改进卷积神经网络中特征提取的多尺度低频存储网络 | Fuzhi Wu, Jiasong Wu, Youyong Kong, Chunfeng Yang, Guanyu Yang, Huazhong Shu, Guy Carrault, Lotfi Senhadji | arxiv.org/pdf/2403.08… | link |
Image Understanding
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | SM4Depth: Seamless Monocular Metric Depth Estimation across Multiple Cameras and Scenes by One Model | SM4Depth:通过一个模型跨多个摄像机和场景进行无缝单目度量深度估计 | Yihao Liu, Feng Xue, Anlong Ming | arxiv.org/pdf/2403.08… | link |
2024-03-13 | METER: a mobile vision transformer architecture for monocular depth estimation | METER:用于单目深度估计的移动视觉变压器架构 | L. Papa, P. Russo, I. Amerini | arxiv.org/pdf/2403.08… | link |
LLM
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | Masked Generative Story Transformer with Character Guidance and Caption Augmentation | 具有角色指导和字幕增强功能的蒙面生成故事变压器 | Christos Papadimitriou, Giorgos Filandrianos, Maria Lymperaiou, Giorgos Stamou | arxiv.org/pdf/2403.08… | link |
Transformer
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | Content-aware Masked Image Modeling Transformer for Stereo Image Compression | 用于立体图像压缩的内容感知蒙版图像建模转换器 | Xinjie Zhang, Shenyuan Gao, Zhening Liu, Xingtong Ge, Dailan He, Tongda Xu, Yan Wang, Jun Zhang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | AADNet: Attention aware Demoiréing Network | AADNet:注意感知 Demoiréing 网络 | M Rakesh Reddy, Shubham Mandloi, Aman Kumar | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Activating Wider Areas in Image Super-Resolution | 激活更广泛的图像超分辨率区域 | Cheng Cheng, Hang Wang, Hongbin Sun | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Identity-aware Dual-constraint Network for Cloth-Changing Person Re-identification | 用于换衣服人员重新识别的身份感知双约束网络 | Peini Guo, Mengyuan Liu, Hong Liu, Ruijia Fan, Guoquan Wang, Bin He | arxiv.org/pdf/2403.08… | null |
2024-03-13 | SeCG: Semantic-Enhanced 3D Visual Grounding via Cross-modal Graph Attention | SeCG:通过跨模态图注意力进行语义增强的 3D 视觉基础 | Feng Xiao, Hongbin Xu, Qiuxia Wu, Wenxiong Kang | arxiv.org/pdf/2403.08… | null |
3D/CG
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | FastMAC: Stochastic Spectral Sampling of Correspondence Graph | FastMAC:对应图的随机谱采样 | Yifei Zhang, Hao Zhao, Hongyang Li, Siheng Chen | arxiv.org/pdf/2403.08… | null |
2024-03-13 | 3DFIRES: Few Image 3D REconstruction for Scenes with Hidden Surface | 3DFIRES:具有隐藏表面的场景的少量图像 3D 重建 | Linyi Jin, Nilesh Kulkarni, David Fouhey | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Refractive COLMAP: Refractive Structure-from-Motion Revisited | 折射 COLMAP:重新审视运动中的折射结构 | Mengkun She, Felix Seegräber, David Nakath, Kevin Köser | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Scaling Up Dynamic Human-Scene Interaction Modeling | 扩大动态人景交互建模 | Nan Jiang, Zhiyuan Zhang, Hongjie Li, Xiaoxuan Ma, Zan Wang, Yixin Chen, Tengyu Liu, Yixin Zhu, Siyuan Huang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | A Novel Implicit Neural Representation for Volume Data | 一种新颖的体数据隐式神经表示 | Armin Sheibanifard, Hongchuan Yu | arxiv.org/pdf/2403.08… | null |
2024-03-13 | UniLiDAR: Bridge the domain gap among different LiDARs for continual learning | UniLiDAR:弥合不同 LiDAR 之间的领域差距以实现持续学习 | Zikun Xu, Jianqiang Wang, Shaobing Xu | arxiv.org/pdf/2403.08… | null |
2024-03-13 | OccFiner: Offboard Occupancy Refinement with Hybrid Propagation | OccFiner:通过混合传播优化船外占用率 | Hao Shi, Song Wang, Jiaming Zhang, Xiaoting Yin, Zhongdao Wang, Zhijian Zhao, Guangming Wang, Jianke Zhu, Kailun Yang, Kaiwei Wang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation | NaturalVLM:利用细粒度自然语言进行可供性引导的视觉操作 | Ran Xu, Yan Shen, Xiaoqi Li, Ruihai Wu, Hao Dong | arxiv.org/pdf/2403.08… | null |
2024-03-13 | STMPL: Human Soft-Tissue Simulation | STMPL:人体软组织模拟 | Anton Agafonov, Lihi Zelnik-Manor | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts | Follow-Your-Click:通过简短提示进行开放域区域图像动画 | Yue Ma, Yingqing He, Hongfa Wang, Andong Wang, Chenyang Qi, Chengfei Cai, Xiu Li, Zhifeng Li, Heung-Yeung Shum, Wei Liu, et al. | arxiv.org/pdf/2403.08… | link |
2024-03-13 | BiTT: Bi-directional Texture Reconstruction of Interacting Two Hands from a Single Image | BiTT:从单个图像中交互两只手的双向纹理重建 | Minje Kim, Tae-Kyun Kim | arxiv.org/pdf/2403.08… | null |
2024-03-13 | PNeSM: Arbitrary 3D Scene Stylization via Prompt-Based Neural Style Mapping | PNeSM:通过基于提示的神经风格映射进行任意 3D 场景风格化 | Jiafu Chen, Wei Xing, Jiakai Sun, Tianyi Chu, Yiling Huang, Boyan Ji, Lei Zhao, Huaizhong Lin, Haibo Chen, Zhizhong Wang | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Iterative Learning for Joint Image Denoising and Motion Artifact Correction of 3D Brain MRI | 3D 脑 MRI 联合图像去噪和运动伪影校正的迭代学习 | Lintao Zhang, Mengqi Wu, Lihong Wang, David C. Steffens, Guy G. Potter, Mingxia Liu | arxiv.org/pdf/2403.08… | link |
Learning Paradigms
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | iCONTRA: Toward Thematic Collection Design Via Interactive Concept Transfer | iCONTRA:通过交互式概念转移实现主题系列设计 | Dinh-Khoi Vo, Duy-Nam Ly, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le | arxiv.org/pdf/2403.08… | link |
2024-03-13 | Consistent Prompting for Rehearsal-Free Continual Learning | 持续提示,无需排练的持续学习 | Zhanxin Gao, Jun Cen, Xiaobin Chang | arxiv.org/pdf/2403.08… | link |
2024-03-13 | Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts | 通过稀疏插值专家释放元调整的力量以实现少样本泛化 | Shengzhuang Chen, Jihoon Tack, Yunqiao Yang, Yee Whye Teh, Jonathan Richard Schwarz, Ying Wei | arxiv.org/pdf/2403.08… | link |
Other
Publish Date | Title | Title_CN | Authors | PDF | Code |
---|---|---|---|---|---|
2024-03-13 | Diffusion-based Iterative Counterfactual Explanations for Fetal Ultrasound Image Quality Assessment | 基于扩散的胎儿超声图像质量评估的迭代反事实解释 | Paraskevas Pegios, Manxi Lin, Nina Weng, Morten Bo Søndergaard Svendsen, Zahra Bashir, Siavash Bigdeli, Anders Nymark Christensen, Martin Tolsgaard, Aasa Feragen | arxiv.org/pdf/2403.08… | null |
2024-03-13 | HAIFIT: Human-Centered AI for Fashion Image Translation | HAIFIT:以人为本的时尚图像翻译人工智能 | Jianan Jiang, Xinglin Li, Weiren Yu, Di Wu | arxiv.org/pdf/2403.08… | link |
2024-03-13 | A Causal Inspired Early-Branching Structure for Domain Generalization | 用于领域泛化的因果启发的早期分支结构 | Liang Chen, Yong Zhang, Yibing Song, Zhen Zhang, Lingqiao Liu | arxiv.org/pdf/2403.08… | link |
2024-03-13 | HIMap: HybrId Representation Learning for End-to-end Vectorized HD Map Construction | HIMap:用于端到端矢量化高精地图构建的混合表示学习 | Yi Zhou, Hui Zhang, Jiaqian Yu, Yifan Yang, Sangil Jung, Seung-In Park, ByungIn Yoo | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Occluded Cloth-Changing Person Re-Identification | 遮挡换布人员重新识别 | Zhihao Chen, Yiyuan Ge | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Better Fit: Accommodate Variations in Clothing Types for Virtual Try-on | 更合身:适应虚拟试穿服装类型的变化 | Xuanpu Zhang, Dan Song, Pengxin Zhan, Qingguo Chen, Kuilong Liu, Anan Liu | arxiv.org/pdf/2403.08… | null |
2024-03-13 | An Empirical Study of Parameter Efficient Fine-tuning on Vision-Language Pre-train Model | 视觉语言预训练模型参数高效微调的实证研究 | Yuxin Tian, Mouxing Yang, Yunfan Li, Dayiheng Liu, Xingzhang Ren, Xi Peng, Jiancheng Lv | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Improved Image-based Pose Regressor Models for Underwater Environments | 改进的水下环境中基于图像的姿态回归模型 | Luyuan Peng, Hari Vishnu, Mandar Chitre, Yuen Min Too, Bharath Kalyan, Rajat Mishra | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Data augmentation with automated machine learning: approaches and performance comparison with classical data augmentation methods | 通过自动化机器学习进行数据增强:与经典数据增强方法的方法和性能比较 | Alhassan Mumuni, Fuseini Mumuni | arxiv.org/pdf/2403.08… | null |
2024-03-13 | A Dual-domain Regularization Method for Ring Artifact Removal of X-ray CT | X射线CT环形伪影去除的双域正则化方法 | Hongyang Zhu, Xin Lu, Yanwei Qin, Xinran Yu, Tianjiao Sun, Yunsong Zhao | arxiv.org/pdf/2403.08… | null |
2024-03-13 | Matching Non-Identical Objects | 匹配不同的对象 | Yusuke Marumo, Kazuhiko Kawamoto, Hiroshi Kera | arxiv.org/pdf/2403.08… | null |