[Share][Daily Update][2024.03.04][CV_arxiv_papers]

[UPDATED!] 2024-03-04 (Publish Time)

Generative Models

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-04 | Gradient Correlation Subspace Learning against Catastrophic Forgetting | 针对灾难性遗忘的梯度相关子空间学习 | Tammuz Dubnov, Vishal Thengane | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control | UniCtrl:通过免训练统一注意力控制提高文本到视频扩散模型的时空一致性 | Xuweiyi Chen, Tian Xia, Sihan Xu | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | 3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors | 3DTopia:具有混合扩散先验的大型文本到 3D 生成模型 | Fangzhou Hong, Jiaxiang Tang, Ziang Cao, Min Shi, Tong Wu, Zhaoxi Chen, Tengfei Wang, Liang Pan, Dahua Lin, Ziwei Liu | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | DragTex: Generative Point-Based Texture Editing on 3D Mesh | DragTex:3D 网格上基于点的生成纹理编辑 | Yudi Zhang, Qi Xu, Lei Zhang | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Domain adaptation, Explainability & Fairness in AI for Medical Image Analysis: Diagnosis of COVID-19 based on 3-D Chest CT-scans | 医学图像分析 AI 的领域适应、可解释性和公平性:基于 3D 胸部 CT 扫描的 COVID-19 诊断 | Dimitrios Kollias, Anastasios Arsenos, Stefanos Kollias | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Point2Building: Reconstructing Buildings from Airborne LiDAR Point Clouds | Point2Building:利用机载 LiDAR 点云重建建筑物 | Yujia Liu, Anton Obukhov, Jan Dirk Wegner, Konrad Schindler | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models | ResAdapter:扩散模型的域一致分辨率适配器 | Jiaxiang Cheng, Pan Xie, Xin Xia, Jiashi Li, Jie Wu, Yuxi Ren, Huixia Li, Xuefeng Xiao, Min Zheng, Lean Fu | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Semi-Supervised Semantic Segmentation Based on Pseudo-Labels: A Survey | 基于伪标签的半监督语义分割:一项调查 | Lingyan Ran, Yali Li, Guoqiang Liang, Yanning Zhang | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio | FaceChain-ImagineID:从解开的音频中自由制作高保真多样化的说话面孔 | Chao Xu, Yang Liu, Jiazheng Xing, Weida Wang, Mingze Sun, Jun Dan, Tianxin Huang, Siyuan Li, Zhi-Qi Cheng, Ying Tai, et al. | arxiv.org/pdf/2403.01… | link |
| 2024-03-04 | ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models | ViewDiff:使用文本到图像模型生成 3D 一致图像 | Lukas Höllein, Aljaž Božič, Norman Müller, David Novotny, Hung-Yu Tseng, Christian Richardt, Michael Zollhöfer, Matthias Nießner | arxiv.org/pdf/2403.01… | link |
| 2024-03-04 | OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on | OOTDiffusion:基于舾装融合的潜在扩散,用于可控虚拟试穿 | Yuhao Xu, Tao Gu, Weifeng Chen, Chengcai Chen | arxiv.org/pdf/2403.01… | link |
| 2024-03-04 | AFBT GAN: enhanced explainability and diagnostic performance for cognitive decline by counterfactual generative adversarial network | AFBT GAN:通过反事实生成对抗网络增强认知能力下降的可解释性和诊断性能 | Xiongri Shen, Zhenxi Song, Zhiguo Zhang | arxiv.org/pdf/2403.01… | link |
| 2024-03-04 | HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances | HanDiffuser:具有逼真手部外观的文本到图像生成 | Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen, Ishita Dasgupta, Saayan Mitra, Minh Hoai | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | Improving Adversarial Energy-Based Model via Diffusion Process | 通过扩散过程改进基于能量的对抗模型 | Cong Geng, Tian Han, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Søren Hauberg, Bo Li | arxiv.org/pdf/2403.01… | null |

Multimodal

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-04 | Density-based Isometric Mapping | 基于密度的等轴测图 | Bardia Yousefi, Mélina Khansari, Ryan Trask, Patrick Tallon, Carina Carino, Arman Afrasiyabi, Vikas Kundra, Lan Ma, Lei Ren, Keyvan Farahani, et al. | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Differentially Private Representation Learning via Image Captioning | 通过图像字幕进行差分私人表示学习 | Tom Sander, Yaodong Yu, Maziar Sanjabi, Alain Durmus, Yi Ma, Kamalika Chaudhuri, Chuan Guo | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Vision-Language Models for Medical Report Generation and Visual Question Answering: A Review | 用于医疗报告生成和视觉问答的视觉语言模型:综述 | Iryna Hartsock, Ghulam Rasool | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | COMMIT: Certifying Robustness of Multi-Sensor Fusion Systems against Semantic Attacks | COMMIT:证明多传感器融合系统针对语义攻击的鲁棒性 | Zijian Huang, Wenda Chu, Linyi Li, Chejian Xu, Bo Li | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation | 超越专业化:评估 MLLM 在年龄和性别估计方面的能力 | Maksim Kuprashevich, Grigorii Alekseenko, Irina Tolstykh | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | A New Perspective on Smiling and Laughter Detection: Intensity Levels Matter | 微笑和笑声检测的新视角:强度水平很重要 | Hugo Bohy, Kevin El Haddad, Thierry Dutoit | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations | 多模式社交互动建模:具有密集对齐表示的新挑战和基线 | Sangmin Lee, Bolin Lai, Fiona Ryan, Bikram Boote, James M. Rehg | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Modality-Aware and Shift Mixer for Multi-modal Brain Tumor Segmentation | 用于多模态脑肿瘤分割的模态感知和移位混合器 | Zhongzhen Huang, Linda Wei, Shaoting Zhang, Xiaofan Zhang | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | TNF: Tri-branch Neural Fusion for Multimodal Medical Data Classification | TNF:用于多模式医疗数据分类的三分支神经融合 | Tong Zheng, Shusaku Sone, Yoshitaka Ushiku, Yuki Oba, Jiaxin Ma | arxiv.org/pdf/2403.01… | null |
| 2024-03-05 | NPHardEval4V: A Dynamic Reasoning Benchmark of Multimodal Large Language Models | NPHardEval4V:多模态大语言模型的动态推理基准 | Lizhou Fan, Wenyue Hua, Xiang Li, Kaijie Zhu, Mingyu Jin, Lingyao Li, Haoyang Ling, Jinkui Chi, Jindong Wang, Xin Ma, et al. | arxiv.org/pdf/2403.01… | null |

NeRF

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-04 | DaReNeRF: Direction-aware Representation for Dynamic Scenes | DaReNeRF:动态场景的方向感知表示 | Ange Lou, Benjamin Planche, Zhongpai Gao, Yamin Li, Tianyu Luan, Hao Ding, Terrence Chen, Jack Noble, Ziyan Wu | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Depth-Guided Robust and Fast Point Cloud Fusion NeRF for Sparse Input Views | 用于稀疏输入视图的深度引导鲁棒快速点云融合 NeRF | Shuai Guo, Qiuwen Wang, Yijie Gao, Rong Xie, Li Song | arxiv.org/pdf/2403.02… | null |

Model Compression/Optimization

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-04 | Encodings for Prediction-based Neural Architecture Search | 基于预测的神经架构搜索的编码 | Yash Akhauri, Mohamed S. Abdelfattah | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | On Latency Predictors for Neural Architecture Search | 神经架构搜索的延迟预测器 | Yash Akhauri, Mohamed S. Abdelfattah | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | UB-FineNet: Urban Building Fine-grained Classification Network for Open-access Satellite Images | UB-FineNet:开放获取卫星图像的城市建筑细粒度分类网络 | Zhiyi He, Wei Yao, Jie Shao, Puzuo Wang | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | CSE: Surface Anomaly Detection with Contrastively Selected Embedding | CSE:通过对比选择嵌入进行表面异常检测 | Simon Thomine, Hichem Snoussi | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | NASH: Neural Architecture Search for Hardware-Optimized Machine Learning Models | NASH:硬件优化机器学习模型的神经架构搜索 | Mengfei Ji, Zaid Al-Ars | arxiv.org/pdf/2403.01… | link |
| 2024-03-04 | Neural Network Assisted Lifting Steps For Improved Fully Scalable Lossy Image Compression in JPEG 2000 | 用于改进 JPEG 2000 中完全可扩展有损图像压缩的神经网络辅助提升步骤 | Xinyue Li, Aous Naman, David Taubman | arxiv.org/pdf/2403.01… | link |

Classification/Detection/Recognition/Segmentation/...

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-04 | Coronary artery segmentation in non-contrast calcium scoring CT images using deep learning | 使用深度学习进行非造影钙评分 CT 图像中的冠状动脉分割 | Mariusz Bujny, Katarzyna Jesionek, Jakub Nalepa, Karol Miszalski-Jamka, Katarzyna Widawka-Żak, Sabina Wolny, Marcin Kostur | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | When do Convolutional Neural Networks Stop Learning? | 卷积神经网络什么时候停止学习? | Sahan Ahmad, Gabriel Trahan, Aminul Islam | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | Anatomically Constrained Tractography of the Fetal Brain | 胎儿大脑的解剖学约束纤维束成像 | Camilo Calixto, Camilo Jaimes, Matheus D. Soldatelli, Simon K. Warfield, Ali Gholipour, Davood Karimi | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function | NiNformer:具有令牌混合生成门控功能的网络变压器中的网络 | Abdullah Nazhat Abdullah, Tarkan Aydin | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | Brand Visibility in Packaging: A Deep Learning Approach for Logo Detection, Saliency-Map Prediction, and Logo Placement Analysis | 包装中的品牌可见度:用于徽标检测、显着图预测和徽标放置分析的深度学习方法 | Alireza Hosseini, Kiana Hooshanfar, Pouria Omrani, Reza Toosi, Ramin Toosi, Zahra Ebrahimian, Mohammad Ali Akhaee | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | RegionGPT: Towards Region Understanding Vision Language Model | RegionGPT:迈向区域理解视觉语言模型 | Qiushan Guo, Shalini De Mello, Hongxu Yin, Wonmin Byeon, Ka Chun Cheung, Yizhou Yu, Ping Luo, Sifei Liu | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training | 对比区域指导:无需训练即可改善视觉语言模型的基础 | David Wan, Jaemin Cho, Elias Stengel-Eskin, Mohit Bansal | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Bayesian Uncertainty Estimation by Hamiltonian Monte Carlo: Applications to Cardiac MRI Segmentation | 哈密顿蒙特卡罗贝叶斯不确定性估计:在心脏 MRI 分割中的应用 | Yidong Zhao, Joao Tourais, Iain Pierce, Christian Nitsche, Thomas A. Treibel, Sebastian Weingärtner, Artur M. Schweidtmann, Qian Tao | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures | Vision-RWKV:使用类似 RWKV 的架构实现高效且可扩展的视觉感知 | Yuchen Duan, Weiyun Wang, Zhe Chen, Xizhou Zhu, Lewei Lu, Tong Lu, Yu Qiao, Hongsheng Li, Jifeng Dai, Wenhai Wang | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | Harnessing Intra-group Variations Via a Population-Level Context for Pathology Detection | 通过群体水平背景利用组内变异进行病理检测 | P. Bilha Githinji, Xi Yuan, Zhenglin Chen, Ijaz Gul, Dingqi Shang, Wen Liang, Jianming Deng, Dan Zeng, Dongmei yu, Chenggang Yan, et al. | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | REAL-Colon: A dataset for developing real-world AI applications in colonoscopy | REAL-Colon:用于开发结肠镜检查中真实人工智能应用的数据集 | Carlo Biffi, Giulio Antonelli, Sebastian Bernhofer, Cesare Hassan, Daizen Hirata, Mineo Iwatate, Andreas Maieron, Pietro Salvagnini, Andrea Cherubini | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection | MiM-ISTD:用于高效红外小目标检测的 Mamba-in-Mamba | Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Jieping Ye, Nenghai Yu | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Self-Supervised Facial Representation Learning with Facial Region Awareness | 具有面部区域意识的自监督面部表征学习 | Zheng Gao, Ioannis Patras | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | LOCR: Location-Guided Transformer for Optical Character Recognition | LOCR:用于光学字符识别的位置引导变压器 | Yu Sun, Dongzhan Zhou, Chen Lin, Conghui He, Wanli Ouyang, Han-Sen Zhong | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT | VTG-GPT:使用 GPT 的免调整零镜头视频临时接地 | Yifang Xu, Yunzhuo Sun, Zien Xie, Benxiang Zhai, Sidan Du | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | DiffMOT: A Real-time Diffusion-based Multiple Object Tracker with Non-linear Prediction | DiffMOT:具有非线性预测的基于扩散的实时多目标跟踪器 | Weiyi Lv, Yuhang Huang, Ning Zhang, Ruei-Sung Lin, Mei Han, Dan Zeng | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | HyperPredict: Estimating Hyperparameter Effects for Instance-Specific Regularization in Deformable Image Registration | HyperPredict:估计可变形图像配准中实例特定正则化的超参数效应 | Aisha L. Shuaibu, Ivor J. A. Simpson | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | A Generative Approach for Wikipedia-Scale Visual Entity Recognition | 维基百科规模视觉实体识别的生成方法 | Mathilde Caron, Ahmet Iscen, Alireza Fathi, Cordelia Schmid | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Scalable Vision-Based 3D Object Detection and Monocular Depth Estimation for Autonomous Driving | 适用于自动驾驶的可扩展视觉 3D 物体检测和单目深度估计 | Yuxuan Liu | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | Leveraging Anchor-based LiDAR 3D Object Detection via Point Assisted Sample Selection | 通过点辅助样本选择利用基于锚点的 LiDAR 3D 物体检测 | Shitao Chen, Haolin Zhang, Nanning Zheng | arxiv.org/pdf/2403.01… | link |
| 2024-03-04 | Explicit Motion Handling and Interactive Prompting for Video Camouflaged Object Detection | 用于视频伪装物体检测的显式运动处理和交互式提示 | Xin Zhang, Tao Xiao, Gepeng Ji, Xuan Wu, Keren Fu, Qijun Zhao | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | Enhancing Information Maximization with Distance-Aware Contrastive Learning for Source-Free Cross-Domain Few-Shot Learning | 通过距离感知对比学习增强信息最大化,实现无源跨域少样本学习 | Huali Xu, Li Liu, Shuaifeng Zhi, Shaojing Fu, Zhuo Su, Ming-Ming Cheng, Yongxiang Liu | arxiv.org/pdf/2403.01… | link |
| 2024-03-05 | Fourier-basis Functions to Bridge Augmentation Gap: Rethinking Frequency Augmentation in Image Classification | 弥合增强差距的傅里叶基函数:重新思考图像分类中的频率增强 | Puru Vaish, Shunxin Wang, Nicola Strisciuglio | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | xT: Nested Tokenization for Larger Context in Large Images | xT:大图像中更大上下文的嵌套标记化 | Ritwik Gupta, Shufan Li, Tyler Zhu, Jitendra Malik, Trevor Darrell, Karttikeya Mangalam | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | Map-aided annotation for pole base detection | 用于杆基检测的地图辅助注释 | Benjamin Missaoui, Maxime Noizet, Philippe Xu | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | FreeA: Human-object Interaction Detection using Free Annotation Labels | FreeA:使用免费注释标签进行人机交互检测 | Yuxiao Wang, Zhenao Wei, Xinyu Jiang, Yu Lei, Weiying Xue, Jinxiu Liu, Qi Liu | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation | AllSpark:从 Transformer 中未标记的特征中重生,用于半监督语义分割 | Haonan Wang, Qixiang Zhang, Yi Li, Xiaomeng Li | arxiv.org/pdf/2403.01… | link |
| 2024-03-04 | A Simple Baseline for Efficient Hand Mesh Reconstruction | 高效手部网格重建的简单基线 | Zhishan Zhou, Shihao. zhou, Zhi Lv, Minqiang Zou, Yao Tang, Jiajun Liang | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | PointCore: Efficient Unsupervised Point Cloud Anomaly Detector Using Local-Global Features | PointCore:使用局部-全局特征的高效无监督点云异常检测器 | Baozhu Zhao, Qiwei Xiong, Xiaohan Zhang, Jingfeng Guo, Qi Liu, Xiaofen Xing, Xiangmin Xu | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | RankED: Addressing Imbalance and Uncertainty in Edge Detection Using Ranking-based Losses | RankED:使用基于排名的损失解决边缘检测中的不平衡和不确定性 | Bedrettin Cetinkaya, Sinan Kalkan, Emre Akbas | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | Exposing the Deception: Uncovering More Forgery Clues for Deepfake Detection | 揭露欺骗:发现更多用于 Deepfake 检测的伪造线索 | Zhongjie Ba, Qingyu Liu, Zhenguang Liu, Shuang Wu, Feng Lin, Li Lu, Kui Ren | arxiv.org/pdf/2403.01… | link |
| 2024-03-04 | Integrating Efficient Optimal Transport and Functional Maps For Unsupervised Shape Correspondence Learning | 集成高效的最优传输和功能图以实现无监督形状对应学习 | Tung Le, Khai Nguyen, Shanlin Sun, Nhat Ho, Xiaohui Xie | arxiv.org/pdf/2403.01… | null |
| 2024-03-05 | Attention Guidance Mechanism for Handwritten Mathematical Expression Recognition | 手写数学表达式识别的注意力引导机制 | Yutian Liu, Wenjun Ke, Jianguo Wei | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | Training-Free Pretrained Model Merging | 免训练预训练模型合并 | Zhengqi Xu, Ke Yuan, Huiqiong Wang, Yong Wang, Mingli Song, Jie Song | arxiv.org/pdf/2403.01… | link |
| 2024-03-04 | Lightweight Object Detection: A Study Based on YOLOv7 Integrated with ShuffleNetv2 and Vision Transformer | 轻量级目标检测:基于YOLOv7结合ShuffleNetv2和Vision Transformer的研究 | Wenkai Gong | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | RISeg: Robot Interactive Object Segmentation via Body Frame-Invariant Features | RISeg:通过身体框架不变特征进行机器人交互式对象分割 | Howard H. Qian, Yangxiao Lu, Kejia Ren, Gaotian Wang, Ninad Khargonkar, Yu Xiang, Kaiyu Hang | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | MCA: Moment Channel Attention Networks | MCA:时刻通道注意力网络 | Yangbo Jiang, Zhiwei Jiang, Le Han, Zenan Huang, Nenggan Zheng | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | PI-AstroDeconv: A Physics-Informed Unsupervised Learning Method for Astronomical Image Deconvolution | PI-AstroDeconv:一种基于物理的天文图像反卷积无监督学习方法 | Shulei Ni, Yisheng Qiu, Yunchun Chen, Zihao Song, Hao Chen, Xuejian Jiang, Huaxi Chen | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | Zero-shot Generalizable Incremental Learning for Vision-Language Object Detection | 用于视觉语言目标检测的零样本可推广增量学习 | Jieren Deng, Haojian Zhang, Kun Ding, Jianhua Hu, Xingxuan Zhang, Yunkuan Wang | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | PillarGen: Enhancing Radar Point Cloud Density and Quality via Pillar-based Point Generation Network | PillarGen:通过基于 Pillar 的点生成网络增强雷达点云密度和质量 | Jisong Kim, Geonho Bang, Kwangjin Choi, Minjae Seong, Jaechang Yoo, Eunjong Pyo, Jun Won Choi | arxiv.org/pdf/2403.01… | null |

Image Understanding

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-04 | Iterative Occlusion-Aware Light Field Depth Estimation using 4D Geometrical Cues | 使用 4D 几何线索进行迭代遮挡感知光场深度估计 | Rui Lourenço, Lucas Thomaz, Eduardo A. B. Silva, Sergio M. M. Faria | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | DD-VNB: A Depth-based Dual-Loop Framework for Real-time Visually Navigated Bronchoscopy | DD-VNB:基于深度的实时视觉导航支气管镜双环框架 | Qingyao Tian, Huai Liao, Xinyan Huang, Jian Chen, Zihui Zhang, Bingyu Yang, Sebastien Ourselin, Hongbin Liu | arxiv.org/pdf/2403.01… | null |

Transformer

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-04 | A Spatio-temporal Aligned SUNet Model for Low-light Video Enhancement | 用于低光视频增强的时空对齐 SUNet 模型 | Ruirui Lin, Nantheera Anantrasirichai, Alexandra Malyugina, David Bull | arxiv.org/pdf/2403.02… | null |
| 2024-03-05 | Neural Redshift: Random Networks are not Random Functions | 神经红移:随机网络不是随机函数 | Damien Teney, Armand Nicolicioiu, Valentin Hartmann, Ehsan Abbasnejad | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | TripoSR: Fast 3D Object Reconstruction from a Single Image | TripoSR:从单个图像快速重建 3D 对象 | Dmitry Tochilkin, David Pankratz, Zexiang Liu, Zixuan Huang, Adam Letts, Yangguang Li, Ding Liang, Christian Laforte, Varun Jampani, Yan-Pei Cao | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | Position Paper: Towards Implicit Prompt For Text-To-Image Models | 立场文件:走向文本到图像模型的隐式提示 | Yue Yang, Yuqi lin, Hong Liu, Wenqi Shao, Runjian Chen, Hailong Shang, Yu Wang, Yu Qiao, Kaipeng Zhang, Ping Luo | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | DEMOS: Dynamic Environment Motion Synthesis in 3D Scenes via Local Spherical-BEV Perception | 演示:通过局部球形 BEV 感知在 3D 场景中进行动态环境运动合成 | Jingyu Gong, Min Wang, Wentao Liu, Chen Qian, Zhizhong Zhang, Yuan Xie, Lizhuang Ma | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | 3D Hand Reconstruction via Aggregating Intra and Inter Graphs Guided by Prior Knowledge for Hand-Object Interaction Scenario | 在手-物体交互场景的先验知识的指导下,通过聚合帧内图和帧间图来重建 3D 手部 | Feng Shuang, Wenbo He, Shaodong Li | arxiv.org/pdf/2403.01… | null |

3D/CG

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-04 | A dataset of over one thousand computed tomography scans of battery cells | 超过一千次电池计算机断层扫描的数据集 | Amariah Condon, Bailey Buscarino, Eric Moch, William J. Sehnert, Owen Miles, Patrick K. Herring, Peter M. Attia | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Twisting Lids Off with Two Hands | 用两只手拧开盖子 | Toru Lin, Zhao-Heng Yin, Haozhi Qi, Pieter Abbeel, Jitendra Malik | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Physics-Informed Learning for Time-Resolved Angiographic Contrast Agent Concentration Reconstruction | 用于时间分辨血管造影造影剂浓度重建的物理知情学习 | Noah Maul, Annette Birkhold, Fabian Wagner, Mareike Thies, Maximilian Rohleder, Philipp Berg, Markus Kowarschik, Andreas Maier | arxiv.org/pdf/2403.01… | null |
| 2024-03-05 | Tree Counting by Bridging 3D Point Clouds with Imagery | 通过将 3D 点云与图像桥接来进行树木计数 | Lei Li, Tianfang Zhang, Zhongyu Jiang, Cheng-Yen Yang, Jenq-Neng Hwang, Stefan Oehmcke, Dimitri Pierre Johannes Gominski, Fabian Gieseke, Christian Igel | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | AiSDF: Structure-aware Neural Signed Distance Fields in Indoor Scenes | AiSDF:室内场景中的结构感知神经符号距离场 | Jaehoon Jang, Inha Lee, Minje Kim, Kyungdon Joo | arxiv.org/pdf/2403.01… | null |

Learning Paradigms

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-04 | Superpixel Graph Contrastive Clustering with Semantic-Invariant Augmentations for Hyperspectral Images | 高光谱图像的超像素图对比聚类与语义不变增强 | Jianhan Qi, Yuheng Jia, Hui Liu, Junhui Hou | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | Open-world Machine Learning: A Review and New Outlooks | 开放世界机器学习:回顾与新展望 | Fei Zhu, Shijie Ma, Zhen Cheng, Xu-Yao Zhang, Zhaoxiang Zhang, Cheng-Lin Liu | arxiv.org/pdf/2403.01… | null |

Other

| Publish Date | Title | Title_CN | Authors | PDF | Code |
| --- | --- | --- | --- | --- | --- |
| 2024-03-04 | Towards Calibrated Deep Clustering Network | 迈向校准深度聚类网络 | Yuheng Jia, Jianhong Cheng, Hui Liu, Junhui Hou | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Optimizing Illuminant Estimation in Dual-Exposure HDR Imaging | 优化双曝光 HDR 成像中的光源估计 | Mahmoud Afifi, Zhenhua Hu, Liang Liang | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Non-autoregressive Sequence-to-Sequence Vision-Language Models | 非自回归序列到序列视觉语言模型 | Kunyu Shi, Qi Dong, Luis Goncalves, Zhuowen Tu, Stefano Soatto | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Interpretable Models for Detecting and Monitoring Elevated Intracranial Pressure | 用于检测和监测颅内压升高的可解释模型 | Darryl Hannan, Steven C. Nesbit, Ximing Wen, Glen Smith, Qiao Zhang, Alberto Goffi, Vincent Chan, Michael J. Morris, John C. Hunninghake, Nicholas E. Villalobos, et al. | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Perceptive self-supervised learning network for noisy image watermark removal | 用于去除噪声图像水印的感知自监督学习网络 | Chunwei Tian, Menghua Zheng, Bo Li, Yanning Zhang, Shichao Zhang, David Zhang | arxiv.org/pdf/2403.02… | null |
| 2024-03-04 | Multi-Spectral Remote Sensing Image Retrieval Using Geospatial Foundation Models | 使用地理空间基础模型的多光谱遥感图像检索 | Benedikt Blumenstiel, Viktoria Moor, Romeo Kienzler, Thomas Brunschwiler | arxiv.org/pdf/2403.02… | link |
| 2024-03-04 | TTA-Nav: Test-time Adaptive Reconstruction for Point-Goal Navigation under Visual Corruptions | TTA-Nav:视觉损坏下点目标导航的测试时自适应重建 | Maytus Piriyajitakonkij, Mingfei Sun, Mengmi Zhang, Wei Pan | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | Advancing Gene Selection in Oncology: A Fusion of Deep Learning and Sparsity for Precision Gene Selection | 推进肿瘤学基因选择:深度学习和稀疏性的融合实现精准基因选择 | Akhila Krishna, Ravi Kant Gupta, Pranav Jeevan, Amit Sethi | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | Revisiting Learning-based Video Motion Magnification for Real-time Processing | 重新审视基于学习的视频运动放大以进行实时处理 | Hyunwoo Ha, Oh Hyun-Bin, Kim Jun-Seong, Kwon Byung-Ki, Kim Sung-Bin, Linh-Tam Tran, Ji-Yun Kim, Sung-Ho Bae, Tae-Hyun Oh | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis | PLACE:用于语义图像合成的自适应布局-语义融合 | Zhengyao Lv, Yuxiang Wei, Wangmeng Zuo, Kwan-Yee K. Wong | arxiv.org/pdf/2403.01… | link |
| 2024-03-04 | One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models | 一个提示词足以提高预训练视觉语言模型的对抗鲁棒性 | Lin Li, Haoyan Guan, Jianing Qiu, Michael Spratling | arxiv.org/pdf/2403.01… | link |
| 2024-03-05 | AtomoVideo: High Fidelity Image-to-Video Generation | AtomoVideo:高保真图像到视频生成 | Litong Gong, Yiran Zhu, Weijie Li, Xiaoyang Kang, Biao Wang, Tiezheng Ge, Bo Zheng | arxiv.org/pdf/2403.01… | null |
| 2024-03-04 | Improving Visual Perception of a Social Robot for Controlled and In-the-wild Human-robot Interaction | 改善社交机器人的视觉感知,以实现受控和野外人机交互 | Wangjie Zhong, Leimin Tian, Duy Tho Le, Hamid Rezatofighi | arxiv.org/pdf/2403.01… | null |