Wednesday 1 November 2017

Building a recommendation engine for Dota2

This post builds a recommendation engine that suggests heroes for a Dota2 player. This was one of a set of three requirements to be completed. For more about Dota2, read on Wikipedia and this. The two other requirements were player comparison and a leaderboard.

The application that was built looks like this.

Requirement: Dota2 has a set of 113 heroes and a huge community of players. The requirement was to suggest a hero for a given player. The recommendation engine should take into account the play history of the specific player and of the heroes. Dota2 open API documentation is available online, and the application must use this API as its source of data for recommendations. The solution must be exposed as a REST API and also on a web page.

Approach: The solution is built as a collaborative item-to-item recommender. The data set for training and building the recommender comes from the Dota2 open API. An example of this API call to Dota2 looks like this. The result of this query is the hero play history (30 days back) for the player with id 87568060. Repeating the query for a list of players builds a data set. The advantage of this query is that it fits the recommender well, since it returns the play history for the heroes (by the players). Using the data for all the players is what makes the recommender collaborative. The resulting data set has one row for each player-to-hero pairing, with the following columns.

player_id -> Steam id of the player
hero_id -> id of the hero
plays -> number of times that this player played the hero
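
To make the shape of this data set concrete, here is a minimal sketch of flattening per-player API results into rows with the three columns above. It is only an illustration: the function name build_rows, the response field names "hero_id" and "games", and the second player id are assumptions made for the example, and no real API call is made.

```python
def build_rows(responses):
    """Flatten {player_id: [hero records]} into (player_id, hero_id, plays) rows.

    The record keys "hero_id" and "games" are assumed field names.
    """
    rows = []
    for player_id, records in responses.items():
        for record in records:
            plays = int(record.get("games", 0))
            # Heroes the player never played add nothing to the data set.
            if plays > 0:
                # Coerce the id in case it arrives as a string.
                rows.append((player_id, int(record["hero_id"]), plays))
    return rows


# Hand-written sample input standing in for cached API responses.
sample_responses = {
    87568060: [{"hero_id": "1", "games": 4}, {"hero_id": "2", "games": 0}],
    1001: [{"hero_id": "1", "games": 2}],  # hypothetical second player
}

rows = build_rows(sample_responses)
for row in rows:
    print(row)
```

Keeping the rows as plain tuples makes them easy to load into a Pandas DataFrame or a database table later.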

This information is used to build a co-occurrence matrix of all heroes against the player's heroes. This matrix is normalised using the Jaccard index, and a sorted weighted sum is applied to get the list of recommendations. The Dota2 open API allows only 3 requests per second, so memcached is used to cache the results of the external Dota2 API calls. The data set is divided into a training set and a test set, and the matrix is built on the training data. The matrix is built as follows
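
As a rough illustration of the steps described above (not the code from the original application), the sketch below builds the hero-to-hero Jaccard similarities and ranks unplayed heroes by a weighted sum. It uses plain Python rather than Pandas so that it runs on its own. The function names, the sample rows and the choice to weight each similarity by the player's play count are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def hero_players(rows):
    """Map each hero_id to the set of player_ids who played it at least once."""
    players = defaultdict(set)
    for player_id, hero_id, plays in rows:
        if plays > 0:
            players[hero_id].add(player_id)
    return dict(players)


def jaccard_similarity(players_by_hero):
    """Jaccard index |A & B| / |A | B| for every pair of heroes."""
    sim = defaultdict(dict)
    for a, b in combinations(sorted(players_by_hero), 2):
        both = len(players_by_hero[a] & players_by_hero[b])  # co-occurrence count
        either = len(players_by_hero[a] | players_by_hero[b])
        score = both / either if either else 0.0
        sim[a][b] = score
        sim[b][a] = score
    return sim


def recommend(player_plays, sim, top_n=3):
    """Rank heroes the player has not played by a play-weighted similarity sum."""
    scores = defaultdict(float)
    for played_hero, plays in player_plays.items():
        for other, score in sim.get(played_hero, {}).items():
            if other not in player_plays:
                # Assumption: each similarity is weighted by the play count.
                scores[other] += plays * score
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [hero for hero, _ in ranked[:top_n]]


# Made-up training rows: (player_id, hero_id, plays).
training_rows = [
    (1, 10, 5), (1, 20, 2),
    (2, 10, 3), (2, 30, 4),
    (3, 10, 1), (3, 20, 6), (3, 30, 2),
    (4, 20, 3), (4, 40, 1),
]

sim = jaccard_similarity(hero_players(training_rows))
recommendations = recommend({10: 5, 20: 2}, sim)
print(recommendations)
```

In this toy data, hero 30 shares two of its three possible players with hero 10, so it ranks ahead of hero 40 for a player who mostly plays hero 10.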

The disadvantage of this approach is that it needs a play history to be in place. Without one, the matrix would be all zeros. The same is true for a new hero. One approach is to suggest a set of easy-to-play heroes to beginners. Also, this implementation can build the matrix for a subset of the players or for the whole community. The subset case can be adapted to build the matrix from the play history of beginners only, which would take into account the hero plays of other beginners (collaborative).
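
The fallback for players without history could look something like the sketch below. It is an assumption about how such a fallback might be wired in, not part of the original application: the function name, the recommender callable and the placeholder hero ids are all hypothetical.

```python
# Placeholder ids standing in for a hand-picked list of easy-to-play heroes.
BEGINNER_HEROES = [1, 2, 3]


def recommend_with_fallback(player_plays, recommender, fallback=BEGINNER_HEROES, top_n=3):
    """Use the recommender when the player has history, else the beginner list.

    player_plays maps hero_id -> plays; recommender is any callable that takes
    that mapping and returns a ranked list of hero ids.
    """
    has_history = any(plays > 0 for plays in player_plays.values())
    if not has_history:
        return list(fallback[:top_n])
    return recommender(player_plays)[:top_n]


# A stand-in recommender that always returns the same ranking.
def fake_recommender(player_plays):
    return [30, 40, 50, 60]


print(recommend_with_fallback({}, fake_recommender))
print(recommend_with_fallback({10: 3}, fake_recommender))
```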

Tools used:

Python 3.5.3
Django 1.10 & Django REST Framework
Pandas 0.20.3
PostgreSQL 9.5


For a good basic introduction to recommender systems see
