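Here is a roundup of AIGC-related research papers (May 29 to June 4). Take a look if you're interested; these are all carefully selected, cutting-edge papers~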
Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors arxiv.org/abs/2305.18…
Break-A-Scene: Extracting Multiple Concepts from a Single Image arxiv.org/abs/2305.16…
RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths arxiv.org/abs/2305.18…
Generating Images with Multimodal Language Models arxiv.org/abs/2305.17…
Photoswap: Personalized Subject Swapping in Images arxiv.org/abs/2305.18…
GlyphControl: Glyph Conditional Control for Visual Text Generation arxiv.org/abs/2305.18…
A Neural Space-Time Representation for Text-to-Image Personalization arxiv.org/abs/2305.15…
PaLI-X: On Scaling up a Multilingual Vision and Language Model arxiv.org/abs/2305.18…
Video Colorization with Pre-trained Text-to-Image Diffusion Models arxiv.org/abs/2306.01…
Multilingual Conceptual Coverage in Text-to-Image Models arxiv.org/abs/2306.01…
Accelerating science with human-aware artificial intelligence arxiv.org/abs/2306.01…
TimelineQA: A Benchmark for Question Answering over Timelines arxiv.org/abs/2306.01…
Diffusion Self-Guidance for Controllable Image Generation arxiv.org/abs/2306.00…
StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners arxiv.org/abs/2306.00…
AI Imagery and the Overton Window arxiv.org/abs/2306.00…
Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images arxiv.org/abs/2306.00…
Analysis of ChatGPT on Source Code arxiv.org/abs/2306.00…
ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing arxiv.org/abs/2306.00…
LIV: Language-Image Representations and Rewards for Robotic Control arxiv.org/abs/2306.00…
ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation arxiv.org/abs/2306.00…
SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds arxiv.org/abs/2306.00…
StyleDrop: Text-to-Image Generation in Any Style arxiv.org/abs/2306.00…
Thought Cloning: Learning to Think while Acting by Imitating Human Thinking arxiv.org/abs/2306.00…
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft arxiv.org/abs/2306.00…
Controllable Text-to-Image Generation with GPT-4 arxiv.org/abs/2305.18…
StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation arxiv.org/abs/2305.19…
Unsupervised Melody-to-Lyric Generation arxiv.org/abs/2305.19…
Make-A-Voice: Unified Voice Synthesis With Discrete Representation arxiv.org/abs/2305.19…
Intent-aligned AI systems deplete human agency: the need for agency foundations research in AI safety arxiv.org/abs/2305.19…
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models arxiv.org/abs/2305.16…
Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models arxiv.org/abs/2305.13…
PandaGPT: One Model To Instruction-Follow Them All arxiv.org/abs/2305.16…
Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach arxiv.org/abs/2305.13…