!UPDATED -- 2024-01-01
Classification/Detection/Recognition/Segmentation
| Publish Date | Title | Authors | Paper | Code |
|---|---|---|---|---|
| 2024-01-01 | Directional Antenna Systems for Long-Range Through-Wall Human Activity Recognition | Julian Strohmayer, Martin Kampel | arxiv.org/abs/2401.01… | null |
| 2024-01-01 | DiffAugment: Diffusion based Long-Tailed Visual Relationship Recognition | Parul Gupta, Tuan Nguyen, Abhinav Dhall, Munawar Hayat, Trung Le, Thanh-Toan Do | arxiv.org/abs/2401.01… | null |
| 2024-01-01 | Tissue Artifact Segmentation and Severity Analysis for Automated Diagnosis Using Whole Slide Images | Galib Muhammad Shahriar Himel | arxiv.org/abs/2401.01… | null |
| 2024-01-01 | Efficient Multi-domain Text Recognition Deep Neural Network Parameterization with Residual Adapters | Jiayou Chao, Wei Zhu | arxiv.org/abs/2401.00… | link |
| 2024-01-01 | Data Augmentation Techniques for Cross-Domain WiFi CSI-based Human Activity Recognition | Julian Strohmayer, Martin Kampel | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Skeleton2vec: A Self-supervised Learning Framework with Contextualized Target Representations for Skeleton Sequence | Ruizhuo Xu, Linzhi Huang, Mei Wang, Jiani Hu, Weihong Deng | arxiv.org/abs/2401.00… | link |
| 2024-01-01 | MultiFusionNet: Multilayer Multimodal Fusion of Deep Neural Networks for Chest X-Ray Image Classification | Saurabh Agarwal, K. V. Arya, Yogesh Kumar Meena | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | BRAU-Net++: U-Shaped Hybrid CNN-Transformer Network for Medical Image Segmentation | Libin Lan, Pengzhou Cai, Lu Jiang, Xiaojuan Liu, Yongmei Li, Yudong Zhang | arxiv.org/abs/2401.00… | link |
| 2024-01-01 | Depth Map Denoising Network and Lightweight Fusion Network for Enhanced 3D Face Recognition | Ruizhuo Xu, Ke Wang, Chao Deng, Mei Wang, Xi Chen, Wenhui Huang, Junlan Feng, Weihong Deng | arxiv.org/abs/2401.00… | null |
| 2024-01-03 | Credible Teacher for Semi-Supervised Object Detection in Open Scene | Jingyu Zhuang, Kuo Wang, Liang Lin, Guanbin Li | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Self-supervised learning for skin cancer diagnosis with limited training data | Hamish Haggerty, Rohitash Chandra | arxiv.org/abs/2401.00… | link |
| 2024-01-01 | 1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation | Zhuoyan Luo, Yicheng Xiao, Yong Liu, Yitong Wang, Yansong Tang, Xiu Li, Yujiu Yang | arxiv.org/abs/2401.00… | link |
Transformer
| Publish Date | Title | Authors | Paper | Code |
|---|---|---|---|---|
| 2024-01-01 | Boundary Attention: Learning to Find Faint Boundaries at Any Resolution | Mia Gaia Polansky, Charles Herrmann, Junhwa Hur, Deqing Sun, Dor Verbin, Todd Zickler | arxiv.org/abs/2401.00… | null |
| 2024-01-04 | Accurate Leukocyte Detection Based on Deformable-DETR and Multi-Level Feature Fusion for Aiding Diagnosis of Blood Diseases | Yifei Chen, Chenyan Zhang, Ben Chen, Yiyu Huang, Yifei Sun, Changmiao Wang, Xianjun Fu, Yuxing Dai, Feiwei Qin, Yong Peng, et al. | arxiv.org/abs/2401.00… | link |
| 2024-01-01 | ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention | Chenhang He, Ruihuang Li, Guowen Zhang, Lei Zhang | arxiv.org/abs/2401.00… | link |
| 2024-01-01 | Mocap Everyone Everywhere: Lightweight Motion Capture With Smartwatches and a Head-Mounted Camera | Jiye Lee, Hanbyul Joo | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Rethinking RAFT for Efficient Optical Flow | Navid Eslami, Farnoosh Arefi, Amir M. Mansourian, Shohreh Kasaei | arxiv.org/abs/2401.00… | link |
| 2024-01-01 | Beyond Subspace Isolation: Many-to-Many Transformer for Light Field Image Super-resolution | Zeke Zexi Hu, Xiaoming Chen, Vera Yuk Ying Chung, Yiran Shen | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Optimizing ADMM and Over-Relaxed ADMM Parameters for Linear Quadratic Problems | Jintao Song, Wenqi Lu, Yunwen Lei, Yuchao Tang, Zhenkuan Pan, Jinming Duan | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Towards Improved Proxy-based Deep Metric Learning via Data-Augmented Domain Adaptation | Li Ren, Chen Chen, Liqiang Wang, Kien Hua | arxiv.org/abs/2401.00… | link |
Generative Models
| Publish Date | Title | Authors | Paper | Code |
|---|---|---|---|---|
| 2024-01-01 | DiffMorph: Text-less Image Morphing with Diffusion Models | Shounak Chatterjee | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Diffusion Models, Image Super-Resolution And Everything: A Survey | Brian B. Moser, Arundhati S. Shanbhag, Federico Raue, Stanislav Frolov, Sebastian Palacio, Andreas Dengel | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | An attempt to generate new bridge types from latent space of generative adversarial network | Hongjun Zhang | arxiv.org/abs/2401.00… | link |
| 2024-01-01 | From Covert Hiding to Visual Editing: Robust Generative Video Steganography | Xueying Mao, Xiaoxiao Hu, Wanli Peng, Zhenliang Gan, Qichao Ying, Zhenxing Qian, Sheng Li, Xinpeng Zhang | arxiv.org/abs/2401.00… | null |
| 2024-01-02 | GD^2-NeRF: Generative Detail Compensation via GAN and Diffusion for One-shot Generalizable Neural Radiance Fields | Xiao Pan, Zongxin Yang, Shuai Bai, Yi Yang | arxiv.org/abs/2401.00… | null |
Multimodal
| Publish Date | Title | Authors | Paper | Code |
|---|---|---|---|---|
| 2024-01-01 | Exploring Multi-Modal Control in Music-Driven Dance Generation | Ronghui Li, Yuqin Dai, Yachao Zhang, Jun Li, Jian Yang, Jie Guo, Xiu Li | arxiv.org/abs/2401.01… | null |
| 2024-01-01 | COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training | Alex Jinpeng Wang, Linjie Li, Kevin Qinghong Lin, Jianfeng Wang, Kevin Lin, Zhengyuan Yang, Lijuan Wang, Mike Zheng Shou | arxiv.org/abs/2401.00… | null |
| 2024-01-03 | Retrieval-Augmented Egocentric Video Captioning | Jilan Xu, Yifei Huang, Junlin Hou, Guo Chen, Yuejie Zhang, Rui Feng, Weidi Xie | arxiv.org/abs/2401.00… | null |
3D-Related
| Publish Date | Title | Authors | Paper | Code |
|---|---|---|---|---|
| 2024-01-01 | GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation, Demonstration, and Imitation | Zifan Wang, Junyu Chen, Ziqing Chen, Pengwei Xie, Rui Chen, Li Yi | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Deblurring 3D Gaussian Splatting | Byeonghyeon Lee, Howoong Lee, Xiangyu Sun, Usman Ali, Eunbyung Park | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Sharp-NeRF: Grid-based Fast Deblurring Neural Radiance Fields Using Sharpness Prior | Byeonghyeon Lee, Howoong Lee, Usman Ali, Eunbyung Park | arxiv.org/abs/2401.00… | link |
| 2024-01-01 | GLIMPSE: Generalized Local Imaging with MLPs | AmirEhsan Khorashadizadeh, Valentin Debarnot, Tianlin Liu, Ivan Dokmanić | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | NightRain: Nighttime Video Deraining via Adaptive-Rain-Removal and Adaptive-Correction | Beibei Lin, Yeying Jin, Wending Yan, Wei Ye, Yuan Yuan, Shunli Zhang, Robby Tan | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Text2Avatar: Text to 3D Human Avatar Generation with Codebook-Driven Body Controllable Attribute | Chaoqun Gong, Yuqin Dai, Ronghui Li, Achun Bao, Jun Li, Jian Yang, Yachao Zhang, Xiu Li | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Geometry Depth Consistency in RGBD Relative Pose Estimation | Sourav Kumar, Chiang-Heng Chien, Benjamin Kimia | arxiv.org/abs/2401.00… | null |
GNN
| Publish Date | Title | Authors | Paper | Code |
|---|---|---|---|---|
| 2024-01-01 | Predicting Infant Brain Connectivity with Federated Multi-Trajectory GNNs using Scarce Data | Michalis Pistos, Islem Rekik | arxiv.org/abs/2401.01… | null |
Other
| Publish Date | Title | Authors | Paper | Code |
|---|---|---|---|---|
| 2024-01-01 | Backdoor Attack on Unpaired Medical Image-Text Foundation Models: A Pilot Study on MedCLIP | Ruinan Jin, Chun-Yin Huang, Chenyu You, Xiaoxiao Li | arxiv.org/abs/2401.01… | null |
| 2024-01-01 | Refining Pre-Trained Motion Models | Xinglong Sun, Adam W. Harley, Leonidas J. Guibas | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Bracketing is All You Need: Unifying Image Restoration and Enhancement Tasks with Multi-Exposure Images | Zhilu Zhang, Shuohao Zhang, Renlong Wu, Zifei Yan, Wangmeng Zuo | arxiv.org/abs/2401.00… | link |
| 2024-01-01 | New Job, New Gender? Measuring the Social Bias in Image Generation Models | Wenxuan Wang, Haonan Bai, Jen-tse Huang, Yuxuan Wan, Youliang Yuan, Haoyi Qiu, Nanyun Peng, Michael R. Lyu | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Revisiting Nonlocal Self-Similarity from Continuous Representation | Yisi Luo, Xile Zhao, Deyu Meng | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning | Kaibin Tian, Yanhua Cheng, Yi Liu, Xinglin Hou, Quan Chen, Han Li | arxiv.org/abs/2401.00… | null |
| 2024-01-01 | PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning | Xuntao Liu, Yuzhou Yang, Qichao Ying, Zhenxing Qian, Xinpeng Zhang, Sheng Li | arxiv.org/abs/2401.00… | null |