So MuZero from @deepmind can learn to play a game that it doesn't know the rules of. What does that mean, exactly?

Well, imagine I am going to teach you the game of Go. I tell you there are black and white stones, and you can put them on the board or take them off. 1/
Then, I tell you to start, and every time you do something illegal I reverse it. And when a game finishes, I tell you if you won, or I did, or it was a tie.

Will you eventually know the rules of Go? Yes. Did anyone tell you the rules of Go? Not really, no. 2/
That's what MuZero does. It doesn't eliminate the need for some piece of software to know the rules of the game, so that it can say what is legal and what is not. But the rules can be external to the algorithm itself. 3/
Which, if you think about it, is how human beings learn grammar in their native languages. No one tells us the rules; they just tell us when we are doing things wrong. 4/
So let's say you have a complicated set of rules and you want to know if they are broken in some way. Write code that will know all the valid moves in a given state, and encode that as a game for MuZero to learn. Train it, and see how it behaves. 5/
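To make "write code that will know all the valid moves" concrete, here is a minimal sketch using a toy game (Nim: take 1 to 3 stones, whoever takes the last stone wins). Everything here is made up for illustration: `NimGame`, `legal_actions`, `step` and `play_random_game` are hypothetical names, not MuZero's actual interface, and the random player is only a stand-in for a learner.

```python
import random


class NimGame:
    """Toy rules engine: players alternate taking 1-3 stones; taking the last stone wins."""

    def __init__(self, stones=10):
        self.stones = stones
        self.to_play = 0      # player 0 moves first
        self.winner = None    # set once the game is over

    def legal_actions(self):
        # The only place the rules live: which moves are valid in this state.
        return [n for n in (1, 2, 3) if n <= self.stones]

    def step(self, action):
        # Refuse illegal moves and leave the state untouched (the "I reverse it" part).
        if self.winner is not None or action not in self.legal_actions():
            return False
        self.stones -= action
        if self.stones == 0:
            self.winner = self.to_play
        else:
            self.to_play = 1 - self.to_play
        return True


def play_random_game(game, rng):
    # Stand-in for a learner (assumption): it never reads the rules, it only
    # tries moves, gets accepted or refused, and is told who won at the end.
    while game.winner is None:
        game.step(rng.choice([1, 2, 3]))
    return game.winner
```

The point of the design is that the player only ever talks to `step` and reads `winner` at the end. Swap in your own complicated rule set behind the same two methods and the learner side doesn't change.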
If there is a dominant strategy, for example, MuZero will find it, and play it. Then you can change the rules to eliminate it. I wonder if game designers are doing this yet. 6/6
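For a game small enough to write down as a payoff table, you don't even need a learner to see what "dominant strategy" means; you can check the definition directly. A rough sketch, with made-up payoffs and a hypothetical helper name `dominant_strategy`: rock-paper-scissors has no dominant move, but bolt on an overpowered fourth move and it shows up immediately. For a big game you'd instead be watching what the trained agent keeps choosing.

```python
def dominant_strategy(payoff):
    """Return the index of a strictly dominant row strategy, or None.

    payoff[i][j] is the row player's payoff for playing i against j.
    Row i is strictly dominant if it beats every other row in every column.
    """
    n = len(payoff)
    for i in range(n):
        if all(payoff[i][j] > payoff[k][j]
               for k in range(n) if k != i
               for j in range(len(payoff[i]))):
            return i
    return None


# Rock, paper, scissors: every move loses to something.
rps = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

# Same game plus a hypothetical fourth move that crushes the other three.
broken = [
    [0, -1, 1, -2],
    [1, 0, -1, -2],
    [-1, 1, 0, -2],
    [2, 2, 2, 0],
]

print(dominant_strategy(rps))     # None
print(dominant_strategy(broken))  # 3
```

Change the rules (here, the payoffs of the fourth move) until the check comes back `None` again, and you've eliminated it.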
You can follow @RoundTableLaw.