Python – DANIEL SLATER'S BLOG

Book: Python Deep Learning

My co-authors and I recently finished work on a book called Python Deep Learning. It is now available on Amazon.

The book aims to give a broad introduction to deep learning and show how to implement and use various techniques in Python. It includes examples of many applications of deep learning, including image recognition, speech recognition, and anomaly detection in financial data.

My two chapters focus on my particular interest in using deep learning to play games. I’ve included examples of building AI in Python with TensorFlow that can master Pong, Breakout and Go.

If you are interested, the code KVGRSF30 gives a 30% discount on the e-book version from the publisher's website.

Neural Networks, Python, Reinforcement Learning

AlphaToe

AlphaGo is an AI developed by Google DeepMind that recently became the first machine to beat a top-level human Go player.

AlphaToe is an attempt to apply the same techniques used in AlphaGo to Tic-Tac-Toe. Why, I hear you ask? Tic-Tac-Toe is a very simple game and can be solved using basic min-max.

Because it’s a good platform for experimenting with some of the AlphaGo techniques, which, it turns out, do work at this scale. The neural networks involved can also be trained on my laptop in under an hour, as opposed to the weeks on an array of supercomputers that AlphaGo required.
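For reference, here is roughly what the basic min-max solution mentioned above looks like. This is a minimal sketch, not code from the AlphaToe repository: the board encoding (a tuple of nine cells holding 1, -1 or 0) and the names `winner`, `minimax` and `best_move` are my own assumptions for illustration.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board whose cells are indexed 0..8, row by row.
WIN_LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
)


def winner(board):
    """Return 1 or -1 if that player has three in a row, otherwise 0."""
    for a, b, c in WIN_LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0


@lru_cache(maxsize=None)
def minimax(board, player):
    """Value of `board` for player 1 (1 win, 0 draw, -1 loss) with `player` to move."""
    won = winner(board)
    if won != 0:
        return won
    moves = [i for i, cell in enumerate(board) if cell == 0]
    if not moves:
        return 0  # board full with no winner: a draw
    values = [minimax(board[:i] + (player,) + board[i + 1:], -player) for i in moves]
    # Player 1 maximises the value, player -1 minimises it.
    return max(values) if player == 1 else min(values)


def best_move(board, player):
    """Pick an empty cell with the best min-max value for `player`."""
    moves = [i for i, cell in enumerate(board) if cell == 0]
    pick = max if player == 1 else min
    return pick(moves, key=lambda i: minimax(board[:i] + (player,) + board[i + 1:], -player))


EMPTY_BOARD = (0,) * 9
print(minimax(EMPTY_BOARD, 1))  # 0: with perfect play from both sides the game is a draw
```

The whole game tree is small enough to search exhaustively, which is exactly why Tic-Tac-Toe makes a cheap test bed: the learned players can always be compared against a known perfect one.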

The project is written in Python using TensorFlow. The GitHub repository is here: https://github.com/DanielSlater/AlphaToe and it contains code for each step that AlphaGo used in its learning. It also contains code for Connect 4 and the ability to build games of Tic-Tac-Toe on larger boards.

Here is a sneak peek at how it did in the 3×3 game. In this graph it is training as first player and gets to an 85% win rate against a random opponent after 300,000 games.

I will do a longer write-up of this at some point, but in the meantime here is a talk I did about AlphaToe at a recent DataScienceFestival event in London, which gives a broad overview of the project:

Python, Reinforcement Learning

PyDataLondon 2016

Last week I gave a talk at PyDataLondon 2016, hosted at the Bloomberg offices in central London. If you don’t know anything about PyData, it is a community of Python data science enthusiasts that runs various meetups and conferences across the world. If you're interested in that sort of thing and they are running something near you, I would highly recommend checking it out.

Below is the YouTube video for my talk and this is the associated GitHub, which includes all the example code.

The complete collection of talks from the conference is here. The standard across the board was very high, but if you only have time to watch a few, here are three of those I saw that you might find interesting.

Vincent D Warmerdam – The Duct Tape of Heroes Bayesian statistics

Bayesian statistics is a fascinating subject with many applications. If you're trying to understand deep learning, at a certain point research papers such as Auto-Encoding Variational Bayes and Auxiliary Deep Generative Models will stop making any kind of sense unless you have a good understanding of Bayesian statistics (and even if you do, it can still be a struggle). This video works as a good introduction to the subject. His blog is also quite good.

Geoffrey French & Calvin Giles – Deep learning tutorial – advanced techniques

This has a good overview of useful techniques, mostly around computer vision (though they could be applied in other areas), such as computing the saliency of inputs in determining a classification and getting good classifications when there is only limited labelled data.

Ricardo Pio Monti – Modelling a text corpus using Deep Boltzmann Machines in python

This gives a good explanation of how a Restricted/Deep Boltzmann Machine works and then shows an interesting application where a Deep Boltzmann Machine was used to cluster groups of research papers.

Python, Reinforcement Learning

Mini-Pong and Half-Pong

I’m going to be giving a talk/tutorial at PyDataLondon 2016 on Friday the 6th of May. If you're in London that weekend I would recommend going, as there are going to be lots of interesting talks, and if you do go please say hi.

My talk is going to be a hands-on, step-by-step guide to building a Pong-playing AI using Q-learning. Unfortunately, training the agents even for very simple games still takes ages, and I really wanted to have something training while I do the talk, so I’ve built two little games that I hope will train a bit faster.
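For anyone who hasn't met Q-learning before, the core update is small enough to show on a toy problem. This is a minimal tabular sketch and not the code from the talk, which learns from screen pixels with a neural network rather than a lookup table. The chain environment, the function names and the values of `alpha` and `gamma` are all assumptions chosen for illustration.

```python
import random


def q_learning_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Bootstrap from the best action in the next state, unless the episode ended.
    best_next = 0.0 if done else max(q[next_state])
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def train_chain(n_states=5, episodes=500, seed=0):
    """Learn a toy chain: start at state 0, reward 1 for reaching the last state."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Q-learning is off-policy, so a purely random behaviour policy
            # is enough to explore this tiny environment.
            action = rng.randrange(2)
            next_state = max(state - 1, 0) if action == 0 else state + 1
            done = next_state == goal
            reward = 1.0 if done else 0.0
            q_learning_update(q, state, action, reward, next_state, done)
            state = next_state
    return q


q_table = train_chain()
# Greedy action per non-terminal state; 1 means "move right", towards the reward.
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(4)]
print(greedy_policy)
```

A Pong agent uses the same target, reward plus discounted best next value, but a table over every possible screen is impossible, so a network is trained to approximate the Q values instead.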

Mini-Pong

This is a version of Pong with some of the visual noise stripped out: no on-screen score and no lines around the board. Also, when you start it you can pass args for the screen width and height, and the gameplay should scale with these. This means you can run it with an 80×80 screen (or even 40×40) and save having to downsize the image when processing.
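To make the saving concrete, this is the kind of downsizing step that running at 80×80 lets you skip. It is a rough sketch that assumes the frame is already a 2D greyscale NumPy array; the `downsample` helper and the 160×160 source size are hypothetical and not taken from the Mini-Pong code.

```python
import numpy as np


def downsample(frame, factor):
    """Shrink a 2D greyscale frame by averaging each factor x factor block."""
    height, width = frame.shape
    # Crop any ragged edge so both dimensions divide evenly by the factor.
    cropped = frame[:height - height % factor, :width - width % factor]
    blocks = cropped.reshape(height // factor, factor, width // factor, factor)
    return blocks.mean(axis=(1, 3))


# A hypothetical 160x160 screen reduced to the 80x80 input the agent trains on.
screen = np.zeros((160, 160), dtype=np.float32)
small = downsample(screen, 2)
print(small.shape)  # (80, 80)
```

Rendering the game at the target size in the first place avoids doing this on every frame and also avoids the blurring that block averaging introduces.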

Half-Pong

This is an even kinder game than Pong. There is only the player's paddle, and you get points just for hitting the other side of the screen. I’ve found that if you fiddle with the parameters you can start to see reasonable performance in the game with an hour of training (results may vary, massively). That said, even after significant training the results I see are still some way off how well Google DeepMind report doing. Possibly they are using other tricks not reported in the paper, or just lots of hyperparameter tuning, or there are still more bugs in my implementation (entirely possible; if anyone finds any, please submit them).

I’ve also checked in some checkpoints of a trained Half-Pong player, in case anyone just wants to quickly see it running. Simply run this from the examples directory:

https://gist.github.com/DanielSlater/bc8c56d95967c0fcc9860a468f9364b0.js

It performs significantly better than random, though still looks pretty bad compared to a human.

Distance from building our future robot overlords, still significant.