Deriving Home Field Advantage (HFA) in football, in terms of Elo points, from bookmakers' closing odds: a thread.
My work was inspired by the following 2 great papers:
1) http://www.collective-behavior.com/publ/ELO.pdf
2) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5988281/pdf/pone.0198668.pdf
The authors derived Elo ratings for football teams using bookmakers' odds. However, both papers use HFA=80 for all leagues and all seasons.
I rather think this value changes over time and should be specific to each competition (just like @clubelo does).
Thus I have implemented my own procedure to derive HFA dynamically. All the steps involved in the preliminary process of collecting and analysing the data are described in the attached figure.
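By averaging dr (the odds-implied rating difference of each match) for each league and each season, we can calculate HFA from odds: over a full season every team meets every opponent both home and away, so the rating differences between teams roughly cancel out and what remains is HFA. A few considerations: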
-) HFA is slightly different from one competition to another
-) HFA can change from one season to another
-) The effect of Covid-19 on HFA is quite big, as probably expected
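The per-league, per-season averaging can be sketched roughly as below. This is only a minimal illustration, not necessarily the exact procedure from the figure: it assumes decimal 1X2 odds, removes the bookmaker margin by simple normalisation, takes the home team's expected score as P(home win) + 0.5 * P(draw), and inverts the standard Elo curve. The function names `dr_from_odds` and `hfa_by_league_season` are hypothetical.

```python
import math
from collections import defaultdict


def dr_from_odds(odds_home, odds_draw, odds_away):
    """Rating difference (Elo points) implied by decimal 1X2 odds.

    Assumption: the margin is removed by simple normalisation and a draw
    counts as half a win; the original procedure may differ.
    """
    raw = [1.0 / odds_home, 1.0 / odds_draw, 1.0 / odds_away]
    total = sum(raw)
    p_home, p_draw, _ = (r / total for r in raw)
    expected = p_home + 0.5 * p_draw  # expected score of the home team
    # Invert the standard Elo curve E = 1 / (1 + 10 ** (-dr / 400)).
    return 400.0 * math.log10(expected / (1.0 - expected))


def hfa_by_league_season(matches):
    """Average dr per (league, season).

    `matches` is an iterable of tuples
    (league, season, odds_home, odds_draw, odds_away).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for league, season, odds_home, odds_draw, odds_away in matches:
        key = (league, season)
        sums[key] += dr_from_odds(odds_home, odds_draw, odds_away)
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

The average is only a fair HFA estimate when the sample covers a complete home-and-away schedule, so that team strength differences cancel.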
So, with reference to the papers mentioned above, using the actual HFA rather than a fixed value would already be an improvement, especially considering that HFA=80 is surely an overestimation. But we can improve this even further by calculating HFA dynamically.
For each match in a league we have a dr value which, besides the Elo ratings of the home and away teams, also incorporates HFA. Instead of calculating a yearly average, we could calculate a moving (rolling) average to follow the development of HFA over time.
The important question is how many matches to include in the calculation (subset size). A shorter-term moving average gives a more up-to-date HFA estimate, but it is more sensitive to random fluctuations and to the match schedule (fixtures).
An example of the HFA rolling-average evaluation for all Premier League matches in the database is shown below. Results for 3 different subset sizes are shown: 5 matchdays (n=50), 10 matchdays (n=100) and 20 matchdays (n=200). After some tests I decided to set n=150 as the best trade-off.
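A rolling average of this kind could be computed along these lines. It is a minimal sketch, assuming the dr values of one league are already sorted chronologically and that a plain trailing mean is used; the function name `rolling_hfa` is hypothetical and the exact windowing in the original analysis may differ.

```python
from collections import deque


def rolling_hfa(dr_values, n=150):
    """Trailing mean of the last n dr values (chronological order).

    Returns one entry per match; entries are None until n matches
    are available.
    """
    window = deque()
    running_sum = 0.0
    result = []
    for dr in dr_values:
        window.append(dr)
        running_sum += dr
        if len(window) > n:
            # Drop the oldest match so the window keeps exactly n values.
            running_sum -= window.popleft()
        result.append(running_sum / n if len(window) == n else None)
    return result
```

For example, `rolling_hfa(dr_values, n=50)` and `rolling_hfa(dr_values, n=200)` would give the short- and long-window curves for comparison.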
So finally, this is what dynamic HFA with n=150 looks like for the Big 5 European leagues since August 2014, after applying a smoothing function. I have heard many times that bookies often overlook HFA, but the data seem to suggest otherwise. END