Weekly Must-Reads | A Roundup of Cutting-Edge AIGC Papers (AI Is Moving Almost Too Fast to Keep Up With)

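Here is a roundup of AIGC-related research papers from the week of May 29 to June 4. They are all carefully selected frontier papers, so take a look if you're interested.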

Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors arxiv.org/abs/2305.18…

Break-A-Scene: Extracting Multiple Concepts from a Single Image arxiv.org/abs/2305.16…

RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths arxiv.org/abs/2305.18…

Generating Images with Multimodal Language Models arxiv.org/abs/2305.17…

Photoswap: Personalized Subject Swapping in Images arxiv.org/abs/2305.18…

GlyphControl: Glyph Conditional Control for Visual Text Generation arxiv.org/abs/2305.18…

A Neural Space-Time Representation for Text-to-Image Personalization arxiv.org/abs/2305.15…

PaLI-X: On Scaling up a Multilingual Vision and Language Model arxiv.org/abs/2305.18…

Video Colorization with Pre-trained Text-to-Image Diffusion Models arxiv.org/abs/2306.01…

Multilingual Conceptual Coverage in Text-to-Image Models arxiv.org/abs/2306.01…

Accelerating science with human-aware artificial intelligence arxiv.org/abs/2306.01…

TimelineQA: A Benchmark for Question Answering over Timelines arxiv.org/abs/2306.01…

Diffusion Self-Guidance for Controllable Image Generation arxiv.org/abs/2306.00…

StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners arxiv.org/abs/2306.00…

AI Imagery and the Overton Window arxiv.org/abs/2306.00…

Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images arxiv.org/abs/2306.00…

Analysis of ChatGPT on Source Code arxiv.org/abs/2306.00…

ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing arxiv.org/abs/2306.00…

LIV: Language-Image Representations and Rewards for Robotic Control arxiv.org/abs/2306.00…

ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation arxiv.org/abs/2306.00…

SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds arxiv.org/abs/2306.00…

StyleDrop: Text-to-Image Generation in Any Style arxiv.org/abs/2306.00…

Thought Cloning: Learning to Think while Acting by Imitating Human Thinking arxiv.org/abs/2306.00…

STEVE-1: A Generative Model for Text-to-Behavior in Minecraft arxiv.org/abs/2306.00…

Controllable Text-to-Image Generation with GPT-4 arxiv.org/abs/2305.18…

StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation arxiv.org/abs/2305.19…

Unsupervised Melody-to-Lyric Generation arxiv.org/abs/2305.19…

Make-A-Voice: Unified Voice Synthesis With Discrete Representation arxiv.org/abs/2305.19…

Intent-aligned AI systems deplete human agency: the need for agency foundations research in AI safety arxiv.org/abs/2305.19…

Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models arxiv.org/abs/2305.16…

Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models arxiv.org/abs/2305.13…

PandaGPT: One Model To Instruction-Follow Them All arxiv.org/abs/2305.16…

Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach arxiv.org/abs/2305.13…