"Reinforcement Learning", Part 1: Summary

2019-09-01

Reinforcement learning methods can be classified along four dimensions:

Width: sample update / expected update
Depth: one-step lookahead / full lookahead to the end of the episode
Policy: on-policy / off-policy
Function approximation (covered in detail in Part 2)
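
To make the first three dimensions concrete, here is a minimal tabular sketch in Python. It is not taken from the book. The function names, the `transitions` model format, and the dictionary-based value tables are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical model format (an assumption for this sketch):
# transitions[state] = list of (probability, reward, next_state)
Transitions = Dict[str, List[Tuple[float, float, str]]]


def expected_update(V: Dict[str, float], transitions: Transitions,
                    s: str, gamma: float) -> float:
    """Width = expected, depth = one step.
    Averages over every possible successor, so it needs a model (DP style)."""
    return sum(p * (r + gamma * V[s2]) for p, r, s2 in transitions[s])


def sample_update(V: Dict[str, float], s: str, r: float, s2: str,
                  alpha: float, gamma: float) -> float:
    """Width = sample, depth = one step.
    TD(0): uses a single observed transition (s, r, s2) and bootstraps from V[s2]."""
    return V[s] + alpha * (r + gamma * V[s2] - V[s])


def monte_carlo_update(V: Dict[str, float], s: str, rewards: List[float],
                       alpha: float, gamma: float) -> float:
    """Width = sample, depth = full episode.
    Moves V[s] toward the discounted return observed from s to the episode's end."""
    G = 0.0
    for r in reversed(rewards):
        G = r + gamma * G
    return V[s] + alpha * (G - V[s])


def sarsa_target(Q: Dict[Tuple[str, str], float], r: float,
                 s2: str, a2: str, gamma: float) -> float:
    """On-policy target: bootstrap from the action a2 that was actually taken in s2."""
    return r + gamma * Q[(s2, a2)]


def q_learning_target(Q: Dict[Tuple[str, str], float], r: float,
                      s2: str, actions: List[str], gamma: float) -> float:
    """Off-policy target: bootstrap from the greedy action in s2,
    regardless of which action the behaviour policy took."""
    return r + gamma * max(Q[(s2, a)] for a in actions)
```

Each function returns a new value or target rather than mutating the table, which keeps the comparison between the update styles side by side. Function approximation would replace the dictionaries `V` and `Q` with parameterised functions.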