Thread by @dqmonn, I think Reinforcement Learning is one of the most fun applications of [...]

Dom

dqmonn


I think Reinforcement Learning is one of the most fun applications of ML. Train an algorithm by rewarding it.

Here's how it works and what it's good for.

In traditional (supervised) ML, you train an algorithm on a dataset.

You have 100k data points and their respective labels; all that's left is to fit an algorithm to them. Still impressive, but old news.

What if you didn't have that data beforehand? Instead, you'd have to learn "on the job"?

With Reinforcement Learning, you throw an algorithm into an environment.

At each step the algorithm gets an observation of the environment – an image or some measurements – and then needs to decide what to do.

At first, the algorithm will act quite randomly, just exploring the scene.

But over time, it will reach a goal and claim a reward (a score point, usually). Now the algorithm knows it was on the right track!
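The observe, act, reward loop described above can be sketched in a few lines. This is only a toy illustration: `CorridorEnv`, its `reset`/`step` methods, and `run_random_episode` are made-up names for this example, not the API of any real library, and the "agent" here acts purely at random, like the early exploration phase.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: cells 0..length-1, goal at the right end."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # the observation is just the current cell

    def step(self, action):
        # action is -1 (left) or +1 (right); the ends of the corridor clamp the move
        self.position = max(0, min(self.length - 1, self.position + action))
        done = self.position == self.length - 1
        reward = 1 if done else 0  # a single score point for reaching the goal
        return self.position, reward, done


def run_random_episode(env, rng, max_steps=1000):
    """Act randomly until the goal is reached or the step budget runs out."""
    observation = env.reset()
    total_reward = 0
    for step_count in range(1, max_steps + 1):
        action = rng.choice([-1, 1])  # pure exploration, no learning yet
        observation, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward, step_count


total_reward, steps_taken = run_random_episode(CorridorEnv(), random.Random(0))
print(total_reward, steps_taken)
```

A learning agent would replace the `rng.choice` line with a decision based on the observation and on the rewards it has seen so far.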

The earliest example of this I remember is when I built an RL agent to navigate a maze in Doom.

For each second it took, the agent lost a point. But when the agent cleared the maze, it gained 100!
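As a rough sketch, that scoring rule could be written like this. The function name `episode_score` and the use of whole seconds are assumptions made for illustration; the thread does not show the original code.

```python
def episode_score(seconds_taken, cleared_maze):
    """Score for one run: minus 1 per second, plus 100 if the maze was cleared."""
    score = -seconds_taken
    if cleared_maze:
        score += 100
    return score


print(episode_score(30, True))   # 70
print(episode_score(60, False))  # -60
```

The per-second penalty is what pushes the agent toward fast solutions: clearing the maze in 30 seconds scores higher than clearing it in 60.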

For the first 10,000 iterations or so, the agent navigated randomly and never collected the 100-point reward.

But then, out of pure luck, it reached the end and understood the goal!

Suddenly, the agent navigated more confidently.

Skip to 100,000 iterations later and the agent had worked out a strategy to clear the maze, no matter where it started.
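To make "worked out a strategy, no matter where it started" concrete, here is a minimal sketch using tabular Q-learning on a tiny grid maze. This is a stand-in, not the Doom agent: the thread does not say which algorithm was used, and the maze layout, hyperparameters, and all names below are assumptions chosen for illustration. The reward mirrors the thread's scheme, with -1 per step and +100 for reaching the goal.

```python
import random

# Hypothetical 4x4 maze: '.' is open floor, '#' is a wall, 'G' is the goal.
MAZE = [
    "..#G",
    ".#..",
    "....",
    "..#.",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
OPEN_CELLS = [(r, c) for r, row in enumerate(MAZE)
              for c, ch in enumerate(row) if ch == "."]
GOAL = next((r, c) for r, row in enumerate(MAZE)
            for c, ch in enumerate(row) if ch == "G")


def move(cell, action):
    """Apply an action; bumping into a wall or the edge leaves the cell unchanged."""
    r, c = cell[0] + action[0], cell[1] + action[1]
    if 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0]) and MAZE[r][c] != "#":
        return (r, c)
    return cell


def train(episodes=5000, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration and random starts."""
    rng = random.Random(seed)
    q = {(cell, a): 0.0 for cell in OPEN_CELLS for a in range(4)}
    for _ in range(episodes):
        cell = rng.choice(OPEN_CELLS)  # start somewhere different each episode
        for _ in range(200):  # step budget per episode
            if rng.random() < epsilon:
                a = rng.randrange(4)  # explore
            else:
                a = max(range(4), key=lambda i: q[(cell, i)])  # exploit
            nxt = move(cell, ACTIONS[a])
            done = nxt == GOAL
            reward = 100.0 if done else -1.0
            best_next = 0.0 if done else max(q[(nxt, i)] for i in range(4))
            q[(cell, a)] += alpha * (reward + gamma * best_next - q[(cell, a)])
            cell = nxt
            if done:
                break
    return q


def greedy_steps(q, start, limit=50):
    """Follow the learned greedy policy; return steps to the goal, or None."""
    cell, steps = start, 0
    while cell != GOAL and steps < limit:
        a = max(range(4), key=lambda i: q[(cell, i)])
        cell = move(cell, ACTIONS[a])
        steps += 1
    return steps if cell == GOAL else None


q_table = train()
results = {start: greedy_steps(q_table, start) for start in OPEN_CELLS}
print(results)
```

After training, `results` maps every open starting cell to the number of steps the greedy policy needs to reach the goal. Starting each episode from a random cell is what lets the table cover the whole maze instead of a single route.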

The true superpower here is that those 100,000 plays took just an hour or so. A human would have needed 34 days to play as many.
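The thread doesn't say how the 34-day figure was worked out. One reading, assuming a human plays around the clock with no breaks and that "an hour or so" means exactly one hour, gives roughly half a minute per play:

```python
plays = 100_000
human_days = 34
machine_hours = 1  # "an hour or so" in the thread, taken as exactly 1 here

# Seconds a human would spend per play if playing nonstop for 34 days
seconds_per_human_play = human_days * 24 * 3600 / plays

# How many times faster the simulated agent gets through the same plays
speedup = human_days * 24 / machine_hours

print(round(seconds_per_human_play, 1))  # 29.4
print(speedup)  # 816.0
```

Under those assumptions the agent collects experience about 800 times faster than a human could.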

The same principle applies to things like @DeepMind's AlphaStar. Humans have a head start, but there's only so much StarCraft they can play in a day.

An algorithm can digest thousands of playing hours in seconds!

No call to action here, but this is an amazing watch about AlphaStar, the AI that beats professional StarCraft players at their own game.

