I promised to keep you informed on the progress of our League of Legends betting project, so here we are. Today I give you an article outlining our approach to finding Value Bets in League of Legends (LoL) tournaments using a rating system.
I have already outlined the basics of developing a rating system in my article about Rating Models. Now I am going to report on my own experience doing this in practice, our approach, the tools used and the work involved.
Why League of Legends Betting?
I am regularly talking about ‘us’ instead of ‘me’ in this project, since I am doing it together with a friend. I think it is important to involve someone with practical understanding of the sport instead of just crunching data, the meaning of which you don’t quite understand. Furthermore, it’s a lot more fun to run a project with a friend instead of doing it all on your own. My friend is a specialist in LoL (League of Legends, one of the most popular eSports) so it was an obvious choice.
We decided to focus on the European and the North American leagues since those are the ones he follows most closely. Luckily, LoL runs all year round, so this was another plus.
On top of that, LoL fits the criteria for a suitable sport/competition that I outlined in my article on betting models. It is niche, relatively new and quickly growing. You can find a quick overview of the sport in my article on League of Legends betting. Since the sport is electronic, there must also be vast amounts of data available out there, which makes it perfect to model. At least in theory…
Soon after we had decided on the sport, we started looking for data. We checked a few familiar websites and found a few new ones. Of course, we started with the websites offering free data. One that publishes workable data is Oracle’s Elixir. There are some gaps in certain leagues, where the data is less detailed than for others, but it is a start.
The man behind Oracle’s Elixir, Tim Sevenhuysen, is professionally involved with the industry and actually gave a talk to ESPN on the data available for analysing eSports. Sadly, the talk is less than encouraging for those interested in eSports data. It turns out Riot Games (the company owning LoL) is not too generous in giving away data from LoL games. Surprisingly, some leagues in more traditional sports are much more advanced in that sense (such as, say, MLB for baseball).
Anyway, we thought Oracle’s Elixir has a good variety of data and also the blog offers some great insights into the stats. The website even offers a simple calculator for computing winning probabilities in-game based on gold and dragon differential. Hopefully we will be able to improve on that after analyzing the data in depth. The data series extends all the way back to 2016. It does not sound like a lot at first glance, but for such a young sport, it is actually pretty decent. However, one should not forget about the:
Speaking of building models with in-game data for live League of Legends betting, here we already see the instability of the averages in LoL and just how much new patches influence the game. Riot constantly tweaks the game in the desired direction, be it to make the game more dynamic or to balance some overpowered champions. This has a significant influence on the game metrics. You could think of LoL as a sport whose rules change every few months. It would be hard for any model working on a long series of data to stay on top of this, but we will do our best.
Just to give a few examples of this phenomenon, I will quote some summary statistics that show how the game has changed over time. Unfortunately, some patches have too few games to judge their influence on certain metrics, so I compiled moving averages (of the last 500 / 1000 games) to show you the effect that rule changes have had over time:
Below I will look at some key metrics and how they changed throughout time. I will go through Game Duration, Gold Earned, Importance of Player Roles, Blue Side Advantage and the Relationship between Gold Differential and Winning Probability.
One obvious development (and one that was intended by Riot) is the reduction in total game duration over time. Oracle’s Elixir records the gold differential at the 15th minute mark because the 15th minute could be considered the transition point from early to mid/late game. However, the average game duration has been decreasing ever since. One could argue that, in order to compare early game states consistently across the years, one must multiply the gold differential at 15 minutes by some sort of game duration factor.
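As a rough illustration, such a duration factor could be sketched in Python as follows. The form of the factor (a reference duration divided by the average duration of the period), the 35-minute reference and the function name are all assumptions made for this example. They are not something derived from the data.

```python
def duration_adjusted_gold_diff(gold_diff_15, avg_duration_min, reference_duration_min=35.0):
    """Scale a gold differential at 15 minutes by a simple game duration factor.

    Assumption: the shorter the average game of a period, the further along
    a game is at minute 15, so the same lead is scaled up.
    """
    if avg_duration_min <= 0:
        raise ValueError("average duration must be positive")
    factor = reference_duration_min / avg_duration_min
    return gold_diff_15 * factor


# Hypothetical numbers: the same 1000 gold lead in two different periods.
lead_in_long_games = duration_adjusted_gold_diff(1000, avg_duration_min=35.0)
lead_in_short_games = duration_adjusted_gold_diff(1000, avg_duration_min=28.0)
```

Whether a linear factor like this is the right adjustment is an open question. It only shows where such a correction would enter the calculation.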
The gold earned per game has also been steadily increasing, which should be expected as the game duration was getting lower. However, one metric that can perhaps be a bit more stable would be gold earned per minute. The numbers there look as follows:
Even though some trends are noticeable, this is a relatively stable metric that can be used in a long-term model, perhaps with slight adjustments.
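For readers who want to reproduce this kind of series, here is a minimal Python sketch of gold per minute smoothed with a trailing moving average. The input layout (total gold and duration in seconds per game, in chronological order) and all numbers are made up for the example. The window of 2 only keeps the sample short, where the charts in this article use the last 500 / 1000 games.

```python
from collections import deque


def gold_per_minute(total_gold, duration_seconds):
    """Gold earned per minute of game time."""
    return total_gold / (duration_seconds / 60.0)


def trailing_average(values, window):
    """Average of the last `window` values at each position.

    The first positions use however many values are available so far.
    """
    recent = deque()
    running_sum = 0.0
    averages = []
    for value in values:
        recent.append(value)
        running_sum += value
        if len(recent) > window:
            running_sum -= recent.popleft()
        averages.append(running_sum / len(recent))
    return averages


# Hypothetical games in chronological order: (total gold, duration in seconds).
games = [(120000, 2400), (110000, 2100), (105000, 1800), (98000, 1800)]
gpm = [gold_per_minute(gold, seconds) for gold, seconds in games]
smoothed = trailing_average(gpm, window=2)
```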
Importance of player roles
In order to assign a rating to a single player’s performance you need to correctly assess what is expected from this player in his role in a given game. This is similar to assigning ratings to defenders and attackers in football. You cannot compare them on the number of goals scored, since the expectations from the forward in that regard would be higher.
Therefore, it would be useful to see what share of the total gold has been earned by players in a given role:
You see how in time the relative importance of the Mid player, the Jungler and the Support was decreasing. On the other hand, the AD Carry and the Top player have steadily increased their gold shares. Lately, the Top role has overshadowed the Mid one in terms of gold earned.
When compiling the total rating of a team, such developments would help us to decide what weighting to give to each of the five players’ ratings.
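One naive way to turn gold shares into weights is sketched below in Python. It assumes that a team rating is simply the share-weighted sum of the five player ratings. The role labels, gold figures and ratings are invented for the example.

```python
ROLES = ("top", "jungle", "mid", "adc", "support")


def gold_shares(gold_by_role):
    """Share of the team's total gold earned by each role."""
    total = sum(gold_by_role[role] for role in ROLES)
    return {role: gold_by_role[role] / total for role in ROLES}


def team_rating(player_ratings, weights):
    """Weighted sum of the five player ratings (weights should sum to 1)."""
    return sum(weights[role] * player_ratings[role] for role in ROLES)


# Hypothetical average gold per role and hypothetical player ratings.
gold = {"top": 12000, "jungle": 10000, "mid": 13000, "adc": 14000, "support": 7000}
shares = gold_shares(gold)
ratings = {"top": 1520, "jungle": 1480, "mid": 1550, "adc": 1500, "support": 1470}
rating = team_rating(ratings, shares)
```

Using raw gold shares as weights is only a starting point, since a low-earning role is not necessarily an unimportant one.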
Do not compare apples to oranges
Note that this graph only tells us how the importance of a certain role has developed compared to the same role in the past. Comparing roles with each other based on gold earned can be tricky. The Jungler is the second-worst earning role, yet is considered by many to be the most important role in a LoL team. The Jungler is crucial for map control and shines in the early phases of the game, while the ADC takes over as the end of the game approaches.
One possible solution to this issue is to use slightly adjusted or different metrics to measure the performance of different roles. While for a Jungler one can focus on the gold collected in the early game, for an ADC the total gold collected over the duration of the game might be a better indicator. The same goes for wards placed/destroyed by a Support player. Those are not too relevant for the total budget but have a huge influence on the visibility and, consequently, the map control of a team.
Starting corner (Blue advantage)
Now, LoL has its own version of the home field advantage and it is the side of the map each team is starting from. The two starting positions are the lower left and the upper right corner, commonly referred to as Blue and Red. These are also the respective team colours during the round.
The map is not fully symmetrical, so the starting position matters. There are different factors benefitting one side or the other, but historically the blue team has always had the advantage. Most importantly, the blue team has better vision of what is going on, since the LoL camera projects from a certain angle and so covers a wider area in the upper right corner of the screen than in the lower left one. You can find more detail on the side advantage in the following article and the development of the side advantage below:
Finally, it is interesting to observe the…
Relationship between gold differential and winning chance
It is clear at first glance that there must be some relationship between gold advantage and chance of winning. In LoL you can measure just about anything by gold: creep kills, champion kills and assists, destroyed turrets and so on. A player who plays better is expected to win more gold than one who plays badly. But just how strong is that relationship?
I have looked separately at the gold differentials at the 10th and at the 15th minute mark. Then I have clustered the data points into brackets of 1000. Of course, you have a smaller sample size for larger gold differences. Therefore, I only look at the brackets with more than 100 games for the 15th minute mark. Finally, I have put the 10th minute one next to it for comparison. Here are the results:
Obviously, the relationship between the difference in gold and the chance to win the game is strong. This is a useful finding for in-play League of Legends betting. Furthermore, it must be taken into account in any model covering the early- and mid-game stages of a game.
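The bracketing described above can be sketched in a few lines of Python. The input format (a gold differential and a win flag from one team's perspective) and the function name are assumptions for the example. The bracket size of 1000 and the rule of keeping only brackets with more than 100 games follow the description above.

```python
import math
from collections import defaultdict


def win_rate_by_bracket(games, bracket_size=1000, min_games=100):
    """Win rate per gold differential bracket.

    `games` is an iterable of (gold_diff, won) pairs seen from one team's
    perspective. Brackets are keyed by their lower bound, and only brackets
    with more than `min_games` games are returned.
    """
    counts = defaultdict(lambda: [0, 0])  # lower bound -> [games, wins]
    for gold_diff, won in games:
        lower = math.floor(gold_diff / bracket_size) * bracket_size
        counts[lower][0] += 1
        counts[lower][1] += int(won)
    return {
        lower: wins / played
        for lower, (played, wins) in sorted(counts.items())
        if played > min_games
    }


# Tiny hypothetical sample, so the minimum sample size is lowered to 1.
sample = [(1500, True)] * 3 + [(1200, False)] + [(-300, False)] * 2
rates = win_rate_by_bracket(sample, min_games=1)
```

With real data the same function would be called once with the 10th minute differentials and once with the 15th minute ones.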
Having looked at that game data, one component that was still missing was…
Odds data would be essential to backtest the profitability of a League of Legends betting model. You would like to see the odds you would have bet on, had you been placing the bets your model identifies. For us that meant Pinnacle odds. After all, we are all painfully aware of the fact that the soft books offering eSports would kick us out the moment they noticed we beat the closing line. Furthermore, Pinnacle have announced their strategic focus on eSports, so we might expect the best offering there.
In the best case we would like to have opening odds to see what we can bet on. The closing odds will help us evaluate if our model picks are beating the closing line. In that way we can confirm or reject the profitability of our model within a much shorter sample.
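To make the closing line comparison concrete, here is a minimal Python sketch. It uses one common definition of closing line value, the odds taken divided by the closing odds minus one, and ignores the bookmaker margin. The odds in the example are made up.

```python
def closing_line_value(bet_odds, closing_odds):
    """Relative edge of the odds taken over the closing odds (decimal odds).

    Positive values mean the bet was placed at a better price than the close.
    """
    return bet_odds / closing_odds - 1.0


def average_clv(bets):
    """Mean closing line value over (bet_odds, closing_odds) pairs."""
    values = [closing_line_value(taken, closing) for taken, closing in bets]
    return sum(values) / len(values)


# Hypothetical bets: (odds taken, closing odds on the same selection).
bets = [(2.10, 2.00), (1.80, 1.85), (3.00, 2.80)]
clv = average_clv(bets)
```

A consistently positive average over a few hundred bets would be the signal we are looking for.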
It turns out Pinnacle’s API stopped offering odds for eSports more than a year ago. In the comments, Blog-a-Bet users speculate that Pinnacle’s lines on eSports were not sharp enough. So, in order not to offer an edge to sharp punters for free, they had cut the odds feed for eSports. I am not sure if it is true, but it sounds like a plausible explanation. In any case, this was a heavy hit for our young project.
From here on we have several options.
- First, develop a model with no odds whatsoever. You don’t need odds to make a model calculating event outcome probabilities. The only issue is there is nothing to test your model against, so you basically start betting ‘blind’. This can be solved by collecting opening and closing odds manually. Eventually after a couple of hundred bets we can make a meaningful evaluation of the profitability of the model.
- Second, we can pay someone to build a scraper for us. This will cost a minimum of several hundred $/€, possibly with additional maintenance needed in the future.
- Third, we can build a scraper ourselves. We both have some programming knowledge, but it will still require quite some time to get it up and running, and again it might need some maintenance as we go.
EDIT (8.8.2019 09:32 ECT):
A reader has informed me that Pinnacle in fact have very different business reasons for not providing the data in their API than their eSports lines being weak. It has something to do with their B2B offering and (dis-)agreements with business partners. So approach the above Blog-a-Bet users’ speculation with caution. Unsurprisingly, those odds are not so easy to beat.
What do we do?
We decided to go for the first option for now. Therefore it is not quite clear how far we will go with this project. We don’t want to invest too much money in it, which puts the scraper off the table for now. Needless to say, manually collecting odds can be a tough task for someone working full time. So we will see how far we will get with that. But even some data is better than no data, so we will try this approach. As the project grows, it might make sense at a certain point to switch to option 3. But until then, we will focus on:
Finding further data
We will check a couple of other sources to see if we have missed anything. Oracle’s Elixir data is pretty good as is. Still, there are certain stats that we are missing (for example, cool stuff like jungler proximity). But that is perhaps too much to ask for.
However, we should also be careful not to overdo it. I have already written in my article on Rating Models about a problem known as overparameterization. This is the practice of over-complicating your model in order to have it better fit the observed results without necessarily improving its predictive power.
Some stats are just generally not that important and shouldn’t be given the same weight as other, more important ones. In our case, we decided that gold is the go-to metric for our model. We will probably abstain from including additional data points that are already reflected in the gold earned, such as kills, deaths, creep score, etc.
Once we are done with that, we will be…
Organising our data into a database
If we use data from several sources (it is still not clear if we will) we will need to combine the data, using the game ID as the primary key. We might add some structure by creating tables for players, champions and other entities if we decide this would bring value to the model. This way we will avoid data redundancy and improve data quality. Splitting the data across many tables can hurt query performance though, so we must be careful here.
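A minimal sketch of such a layout, using Python's built-in sqlite3 module, could look like this. The table names, column names and the sample row are all hypothetical. They are not the actual Oracle’s Elixir format, only an illustration of joining two sources on the game ID.

```python
import sqlite3

# In-memory database for the sketch; a file path would be used in practice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (
    game_id   TEXT PRIMARY KEY,
    league    TEXT NOT NULL,
    patch     TEXT,
    blue_team TEXT NOT NULL,
    red_team  TEXT NOT NULL,
    winner    TEXT NOT NULL CHECK (winner IN ('blue', 'red'))
);
-- One row per player and game, so team and game details are not repeated.
CREATE TABLE player_games (
    game_id  TEXT NOT NULL REFERENCES games(game_id),
    player   TEXT NOT NULL,
    role     TEXT NOT NULL,
    champion TEXT NOT NULL,
    gold     INTEGER,
    PRIMARY KEY (game_id, player)
);
-- Odds from a second source, matched to the game data via game_id.
CREATE TABLE odds (
    game_id      TEXT PRIMARY KEY REFERENCES games(game_id),
    blue_opening REAL,
    blue_closing REAL
);
""")

# Hypothetical sample rows.
conn.execute(
    "INSERT INTO games VALUES (?, ?, ?, ?, ?, ?)",
    ("g1", "LEC", "9.14", "Team A", "Team B", "blue"),
)
conn.execute("INSERT INTO odds VALUES (?, ?, ?)", ("g1", 1.80, 1.72))

row = conn.execute(
    "SELECT g.league, g.winner, o.blue_closing "
    "FROM games g JOIN odds o ON o.game_id = g.game_id"
).fetchone()
```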
League of Legends Betting Model
Only after we are done with the above can we proceed to the fun part: the modelling! We are contemplating improving our Python skills in order to be able to use certain ready-made statistical packages. As an added benefit we might learn how to scrape, since Python is often used for this purpose as well. But we must see if that is not overkill. You can also build quite nice models in Excel. In fact, Andrew Mack’s Statistical Sports Models in Excel is lying on my desk and is next on my reading list, so I hope to find some help there.
Some Elo Ratings
Now, I am a beginner in Python but have some coding experience with VBA. So I decided to calculate some Elo ratings for fun, to warm myself up for the challenge to come. Using a K-factor of 30, here is what I came up with:
These are the current top-4 leagues before this weekend’s games have ended. I have no idea if those ratings are profitable to bet on; I guess not. They are far from perfect, quite simplistic and disregard things like game importance. This means a game from the WC counts the same as a game from Rift Rivals.
But by and large they do make sense and are somewhat aligned with the only LoL Elo ratings I have found online so far, which means they are a good starting point for building a model. We will have to test those against actual results, with improving prediction accuracy as the goal. One could opt for Least Squares or perhaps for improving the Ranked Probability Score.
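For reference, a basic Elo update with a K-factor of 30 can be sketched in Python as below. This is the textbook formula, not a copy of my VBA code. The starting rating of 1500 and the 400-point scale are the usual conventions and are assumptions here, as are the team names.

```python
def expected_score(rating_a, rating_b):
    """Win probability of A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, a_won, k=30.0):
    """Return the new ratings of A and B after one game."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def rate_games(games, start=1500.0, k=30.0):
    """Process (team_a, team_b, a_won) tuples in chronological order."""
    ratings = {}
    for team_a, team_b, a_won in games:
        rating_a = ratings.get(team_a, start)
        rating_b = ratings.get(team_b, start)
        ratings[team_a], ratings[team_b] = update_elo(rating_a, rating_b, a_won, k)
    return ratings


# Hypothetical results between three made-up teams.
results = [("Team A", "Team B", True), ("Team B", "Team C", True), ("Team A", "Team C", False)]
ratings = rate_games(results)
```

Every game moves the same number of points from loser to winner, which is exactly why a WC game and a Rift Rivals game count the same in this simple version.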
Of course, we have already made the first considerations for the model to be built. Or rather the models. Since there are different markets available there will also be different things we will want to measure.
League of Legends Betting Markets
The moneyline market is the obvious candidate (and the biggest / most liquid market). But there are also other (perhaps even more interesting) alternatives. Let’s have a close look at the markets at our disposal:
The two-way market is the biggest market on a LoL event. You can bet on the winner of one of the maps as well as of the event as a whole. The event win probabilities can be derived directly from the single-map probabilities, assuming map results are independent of each other. One can see that in a single event the win probabilities per map differ from each other. That is most probably due to the Blue advantage (see above).
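Under that independence assumption, the derivation can be sketched in Python as follows. The function takes one win probability per possible map, so different values per map (for example due to side) are allowed. The probabilities in the example are made up.

```python
def series_win_probability(map_probs):
    """Probability of winning a best-of-N series from per-map win probabilities.

    `map_probs` holds the team's win probability on each of the N possible
    maps (N must be odd). Maps are treated as independent. Playing out all N
    maps on paper gives the same series winner as stopping early.
    """
    n = len(map_probs)
    if n % 2 == 0:
        raise ValueError("a series needs an odd number of maps")
    needed = n // 2 + 1
    dist = [1.0]  # dist[w] = probability of exactly w map wins so far
    for p in map_probs:
        new_dist = [0.0] * (len(dist) + 1)
        for wins, prob in enumerate(dist):
            new_dist[wins] += prob * (1.0 - p)
            new_dist[wins + 1] += prob * p
        dist = new_dist
    return sum(dist[needed:])


# Hypothetical best-of-3 with a slightly better chance on maps 1 and 3.
bo3 = series_win_probability([0.58, 0.52, 0.58])
```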
The overround for the bigger leagues (e.g. LCK, LEC, LCS) in Pinnacle currently varies around 4.8%.
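The overround is simply the sum of the implied probabilities minus one. The short Python sketch below computes it and also strips the margin proportionally, which is only one of several possible methods and an assumption on my side. The odds are made up.

```python
def overround(decimal_odds):
    """Bookmaker margin: sum of implied probabilities minus 1."""
    return sum(1.0 / odds for odds in decimal_odds) - 1.0


def fair_probabilities(decimal_odds):
    """Implied probabilities rescaled to sum to 1 (proportional method)."""
    implied = [1.0 / odds for odds in decimal_odds]
    total = sum(implied)
    return [p / total for p in implied]


# Hypothetical two-way moneyline.
odds = [1.55, 2.50]
margin = overround(odds)
fair = fair_probabilities(odds)
```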
This will probably be the main target of our modeling efforts. Even though it is the sharpest, it has a beatable margin and higher limits than some of the smaller markets. We will probably try to calculate win probabilities by calculating player ratings and summing them up to arrive at a team’s rating.
There are currently several types of prop bets on offer for the bigger LoL games. You have First Baron, First Dragon, First Blood, First Inhibitor, kills per player/team and a few other ones. I won’t look at each one separately. Instead, I will give an example of how I would go about handicapping one of those. Let us take the example of the First Blood market.
One of the props available for the bigger LoL games – the first blood – predicts which team will make the first kill in a game.
I have tried to calculate some basic stats for this market. It is perhaps useful to know how often each role takes the first blood in the game. For the whole data set the FB % look like this:
Then, one might want to know which champions are most likely to win the first blood. A differentiation per role might make sense, since players only pick champions who fit their role. For the most likely role to win the FB – the jungler – the FB champions look like this (>100 picks only):
Add to it the champion preferences of the team member who plays the jungle and you should get a decent estimation of his FB probability if he is playing a team of similar strength. For example, perhaps the most notorious FB taker across all leagues, Lira (Clutch Gaming), plays exclusively the top champions in this list.
Repeat the procedure for all 5 players and you have a good estimation of the FB probability of the team.
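One possible reading of this procedure is sketched in Python below. It assumes that each champion's FB rate is the share of games in which that champion took the first blood, weights those rates by how often the player picks each champion, and adds up the five players (only one player can take the first blood, so the events are mutually exclusive). The champion names and all numbers are placeholders, not real stats.

```python
def player_fb_rate(pick_counts, champion_fb_rates):
    """Expected first blood rate of a player.

    Champion FB rates are weighted by how often the player picks each champion.
    """
    total_picks = sum(pick_counts.values())
    return sum(
        count / total_picks * champion_fb_rates[champion]
        for champion, count in pick_counts.items()
    )


def team_fb_probability(player_rates):
    """Sum of the five players' FB rates (mutually exclusive events)."""
    return sum(player_rates)


# Placeholder champions and numbers for a jungler.
jungle_fb_rates = {"champion_x": 0.20, "champion_y": 0.10}
jungler_picks = {"champion_x": 3, "champion_y": 1}
jungler_rate = player_fb_rate(jungler_picks, jungle_fb_rates)

# Placeholder rates for the other four roles of the same team.
team_probability = team_fb_probability([jungler_rate, 0.10, 0.09, 0.08, 0.05])
```

This ignores the strength of the opponent, which still has to be factored in separately.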
Of course, at this point you should also factor in the difference in strength between teams. Even though FB is the most random of all props, the stronger team still has a slight advantage. The advantage weighs in more for the First Dragon odds and the most for First Baron and First Inhibitor.
I haven’t thought further about the best methodology for that and I haven’t tested the profitability of the concept. But this is how I would probably approach First Blood betting for a start. For other prop bets, perhaps an emphasis on the team instead of the player would be more suitable. Killing a baron / dragon / inhibitor is more of a team effort than taking the first blood.
For the prop bets in Pinnacle you would be looking at a margin of around 9-11%.
Those were the first baby steps of our project of building a Rating Model for League of Legends betting. In a few months’ time, I will report on the (hopefully tremendous) progress on our plan above.
Thanks for reading and if you want to find out more on the topic, please feel free to check out my articles on League of Legends betting basics and Rating Models. Furthermore, make sure to follow me on Twitter and subscribe to my Newsletter (in the upper right corner of this page) to stay updated on new content. See you around!