Examples
As described in juejin.cn/post/723154…, the output of generative learning is a structured object composed of smaller elements.
- Sentence: composed of tokens (character in Chinese, word piece in English), e.g., ChatGPT
- Image: composed of pixels, e.g., Midjourney
- Video: e.g., Imagen Video
- Audio: e.g., InstructTTS
- Sound: e.g., AudioLDM
AI that can use tools
- New Bing
- WebGPT
- Toolformer
Strategies
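- Autoregressive (AR) model: "tackle them one by one", i.e., generate the elements one at a time, each conditioned on the elements already generated
- Non-autoregressive (NAR) model: "all in one go", i.e., generate all elements at once in a single step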
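
The difference between the two strategies can be sketched with a toy example. This is only an illustration under stated assumptions, not a real generative model: the vocabulary `VOCAB` and the functions `next_token_weights` and `position_weights` are hypothetical stand-ins for a trained network.

```python
import random

VOCAB = ["A", "B", "C"]  # hypothetical toy vocabulary


def next_token_weights(prefix):
    # Hypothetical stand-in for an AR model: it sees the tokens generated
    # so far and gives zero weight to repeating the previous token.
    if not prefix:
        return [1.0, 1.0, 1.0]
    return [0.0 if tok == prefix[-1] else 1.0 for tok in VOCAB]


def position_weights(position):
    # Hypothetical stand-in for a NAR model: it sees only the position,
    # never the tokens chosen at other positions.
    return [1.0, 1.0, 1.0]


def generate_ar(length, rng):
    """Autoregressive: one token per step, each conditioned on the prefix."""
    out = []
    for _ in range(length):
        weights = next_token_weights(out)
        out.append(rng.choices(VOCAB, weights=weights, k=1)[0])
    return out


def generate_nar(length, rng):
    """Non-autoregressive: every position is sampled independently."""
    return [
        rng.choices(VOCAB, weights=position_weights(i), k=1)[0]
        for i in range(length)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    print("AR: ", "".join(generate_ar(10, rng)))
    print("NAR:", "".join(generate_nar(10, rng)))
```

In this sketch the AR generator needs `length` sequential steps, but it can keep neighboring tokens consistent (here, it never repeats the previous token). The NAR generator has no dependency between positions, so all positions could be computed in parallel, but the positions cannot coordinate with each other.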
Comparison
Combination
- This is actually the basic concept behind the diffusion model.