Karthik Nair

March 27, 2017

Deep Reinforcement Learning (Q-Learning)


Deep learning was first seen in 1965, when Ivakhnenko and Lapa published a paper on multilayer perceptrons. Fast forward to 2015 (yeah, just recently): Google released TensorFlow, one of the most amazing frameworks I have ever seen. It not only lets you work with multi-dimensional data (tensors) but also gives you the flexibility to handle data at a much larger scale.

Problem Statement

Ping-Pong is something every '90s kid will remember. We have all played it on a Nintendo at a friend's place (of course, you would never have one yourself...). Well, computer scientists took it to a whole different level. They thought: why don't we make one player a bot that learns by playing against another bot, which they call the Evil AI? The catch in this setup is that the Evil AI is perfect, i.e. the ball always hits the center of its paddle.

We have to figure out an algorithm that beats the Evil AI (of course, there is a loophole!).

Approach

What is Reinforcement Learning?: Let's take an example. You have a dog named Pablo (he is a BoneLord), and you have taken on the responsibility of training him. As a normal trainer, what would you do? You would give Pablo a biscuit whenever he does a good job and punish him when he does a bad one.

That's how we roll here: our algorithm is trained in an agent-environment loop. The bot is rewarded when it counters the AI and penalized when it moves away from its goal, so it gets better over time.
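To make the agent-environment loop concrete, here is a minimal sketch in plain Python. It is not the project's code: `ToyPaddleEnv`, `run_episode` and `chase_ball` are hypothetical names, and the "game" is just a ball and a paddle on a short line, assumed purely for illustration. The reward rule mirrors the idea above: moving toward the goal earns a reward, moving away earns a penalty.

```python
import random


class ToyPaddleEnv:
    """Hypothetical stand-in for the game: a ball and a paddle on a short line."""

    def __init__(self, width=5, seed=0):
        self.width = width
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Drop the ball at a random position and center the paddle.
        self.ball = self.rng.randrange(self.width)
        self.paddle = self.width // 2
        return (self.ball, self.paddle)

    def step(self, action):
        # action: -1 = move left, 0 = stay, +1 = move right
        old_distance = abs(self.ball - self.paddle)
        self.paddle = min(self.width - 1, max(0, self.paddle + action))
        new_distance = abs(self.ball - self.paddle)
        # +1 for getting closer to the ball, -1 for moving away, 0 otherwise.
        reward = old_distance - new_distance
        done = new_distance == 0
        return (self.ball, self.paddle), reward, done


def run_episode(env, policy, max_steps=20):
    """One pass through the loop: observe, act, collect the reward, repeat."""
    state = env.reset()
    total_reward = 0
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward


def chase_ball(state):
    """Hand-written policy (assumed for the demo): always step toward the ball."""
    ball, paddle = state
    return (ball > paddle) - (ball < paddle)
```

Calling `run_episode(ToyPaddleEnv(seed=1), chase_ball)` runs one episode with the hand-written policy. In the real setup the policy would be the neural network, and the rewards would be used to improve it.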

How do we apply Deep Learning?: We take the raw pixels from the screen as input and analyze them with a Convolutional Neural Network. We set an initial epsilon (its starting value) and a final epsilon (its final value); epsilon is the probability that the bot picks a random move instead of the one the network currently rates best. The activation function used here is the Rectified Linear Unit (ReLU), which passes positive values through unchanged and clips negative values to zero. As each frame passes through the network we get an output tensor; we compare it with the output tensor of the previous frame and reward or penalize the bot according to its performance. TensorFlow makes such a complex task easy to handle; check out my explanation of TensorFlow.
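Here is a rough sketch of the pieces mentioned above, in plain Python rather than TensorFlow so it runs anywhere. All the constants (`INITIAL_EPSILON`, `FINAL_EPSILON`, `EXPLORE_STEPS`, `GAMMA`) are values I assumed for the demo, not the ones from the project. The linear epsilon schedule and the `q_target` formula are the standard Q-learning recipe, which is my assumption about how the frame-to-frame comparison is done.

```python
import random

INITIAL_EPSILON = 1.0  # assumed starting value: act randomly at first
FINAL_EPSILON = 0.05   # assumed final value: mostly trust the network
EXPLORE_STEPS = 1000   # assumed number of steps over which epsilon shrinks
GAMMA = 0.99           # assumed discount factor for future rewards


def relu(x):
    """Rectified Linear Unit: keep positive values, clip negatives to zero."""
    return max(0.0, x)


def epsilon_at(step):
    """Linearly anneal epsilon from INITIAL_EPSILON down to FINAL_EPSILON."""
    fraction = min(1.0, step / EXPLORE_STEPS)
    return INITIAL_EPSILON + fraction * (FINAL_EPSILON - INITIAL_EPSILON)


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random move with probability epsilon, else the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_target(reward, next_q_values, done):
    """Standard Q-learning target: reward now plus discounted best future value."""
    if done:
        return reward
    return reward + GAMMA * max(next_q_values)
```

In a full implementation, `q_values` and `next_q_values` would be the network's output tensors for the current and the next frame, and training would push the predicted value of the chosen action toward `q_target`.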

Conclusion

As the game proceeds, the bot we created tries to match the Evil AI, and at some point it learns to place the ball in one of the corners of the opposite side (loophole!!). Remember, the Evil AI was programmed so that the ball always hits the middle of its paddle; if the bot places the ball in a corner, the Evil AI cannot play further, and that's how you beat the Evil AI.


Thank you for reading! Hope you liked it.

Peace Out !!!


GitHub link to project

TAGS: Deep Learning - Reinforcement Learning - TensorFlow - Convolutional Neural Networks

