Reinforcement Learning: An Introduction to the Concepts, Applications and Code

Part 1: An introduction to reinforcement learning, explaining common terms, concepts and applications.

In this series of reinforcement learning blog posts, I will try to give a simplified explanation of the concepts required to understand reinforcement learning and its applications. In this initial post, I highlight some of the main concepts and terminology in reinforcement learning. These concepts will be explained in more depth in future blog posts, along with applications and implementations on real-world problems.

Reinforcement Learning

Reinforcement learning (RL) can be viewed as an approach that falls between supervised and unsupervised learning. It is not strictly supervised because it does not rely solely on a set of labelled training data, and it is not unsupervised because there is a reward signal the agent is trying to maximise. The agent needs to find the “right” actions to take in different situations to achieve its overall goal.

Reinforcement learning is the science of decision making.

Reinforcement learning involves no supervisor; only a reward signal tells the agent whether it is doing well or not. Time is a key component in RL: the process is sequential and feedback is often delayed. Each action the agent takes affects the data it receives next.
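To make this interaction loop concrete, here is a minimal sketch of an agent acting in an environment and receiving only a scalar reward as feedback. The `CoinFlipEnv` class and the random agent are made-up toy stand-ins, not part of any RL library:

```python
import random

class CoinFlipEnv:
    """Toy environment: guess the coin flip, reward +1 if correct, -1 otherwise."""
    def step(self, action):
        outcome = random.choice(["heads", "tails"])
        reward = 1.0 if action == outcome else -1.0
        observation = outcome          # what the agent observes after acting
        done = False                   # this toy task never terminates on its own
        return observation, reward, done

env = CoinFlipEnv()
total_reward = 0.0

for t in range(10):
    action = random.choice(["heads", "tails"])    # the agent picks an action
    observation, reward, done = env.step(action)  # the environment responds
    total_reward += reward                        # reward is the only learning signal
    if done:
        break

print("cumulative reward:", total_reward)
```

There is no labelled “correct answer” at each step, only the reward signal accumulated over time, which is exactly what distinguishes this loop from supervised learning.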

Reinforcement Learning applied to Atari games by DeepMind

What is the reinforcement learning problem?

So far we have said that the agent needs to find the “right” action. The right action depends on the rewards.

Reward: The reward Rₜ is a scalar feedback signal which indicates how well the agent is doing at step time t.

In reinforcement learning we need to define our problem so that it satisfies the reward hypothesis. An example would be a game of chess, where the agent gets a positive reward for winning the game and a negative reward for losing it.

Reward Hypothesis: All goals can be described by the maximisation of expected cumulative reward.
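As a small illustration of the reward hypothesis, the goal “win the game of chess” can be encoded entirely as a scalar reward. The outcome labels below are hypothetical, chosen just for this sketch:

```python
def chess_reward(outcome: str) -> float:
    """Map a game outcome to a scalar reward signal.

    The agent only receives meaningful feedback at the end of the game;
    every intermediate move gets zero reward, so the feedback is delayed.
    """
    rewards = {"win": 1.0, "loss": -1.0, "draw": 0.0, "ongoing": 0.0}
    return rewards[outcome]

# The goal "win the game" is expressed purely as maximising expected reward.
print(chess_reward("win"))   # 1.0
print(chess_reward("loss"))  # -1.0
```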

Since the process involves sequential decision making, actions taken early on may have long-term consequences for the overall goal. Sometimes it may be better to sacrifice immediate reward (the reward Rₜ at time step t) to gain more long-term reward. In chess, for example, it may be worth sacrificing a pawn in order to capture a rook at a later stage.

Goal: The goal is to select actions to maximise total future reward.
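A short sketch of what “total future reward” means in code: sum the rewards the agent will receive from now on. The `gamma` parameter below is the standard discount factor, which is not introduced until later in this series; with `gamma = 1.0` this is just the plain cumulative reward.

```python
def total_future_reward(rewards, gamma=1.0):
    """Compute G = R_1 + gamma*R_2 + gamma^2*R_3 + ...

    With gamma = 1.0 this is the plain cumulative reward; a gamma < 1
    (the discount factor) weights immediate rewards more heavily than
    distant ones.
    """
    g = 0.0
    for k, r in enumerate(rewards):
        g += (gamma ** k) * r
    return g

# Sacrificing immediate reward can pay off: losing a pawn now (-1)
# to capture a rook later (+5) still yields a positive total reward.
print(total_future_reward([-1.0, 0.0, 5.0]))  # 4.0
```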