
A popular narrative at the end of last season: Son is better without Kane.
This season Son has been on fire alongside Kane.


Let me explain how this happened and why listening to such theories is a bad idea [THREAD]
This narrative probably started after Kane’s infamous hamstring injury, which featured heavily in the Amazon series.
With Kane on the sidelines, Son played amazingly and scored 4 goals in 5 games before his arm injury and the Covid-19 lockdown
After lockdown Kane was back and Son went 4 games without a single goal and looked abysmal.
By looking at the recent data at that time one might jump to the conclusion that Son was better off with Kane out of the team

What should we do when we read narratives like this based on small data samples?
I always like to look at a bigger picture and see if the pattern is repeatable over time

The duo has played together for 5 years. If Son really is better without Kane then there should be more supporting evidence when we look at a bigger time frame than Son’s great run of 5 games in the first two months of 2020.
Let’s compare PL games where only Son started vs games where both Kane and Son started, between 2017 and 2020.
Minutes played
w/ Kane: 4993 mins
wo/Kane: 1406 mins


Starting with Goal Involvements (GI) per 90 minutes.
w/ Kane: 0.79 GI
wo/Kane: 0.70 GI
Very similar results. Son's output was somewhat better with Kane.
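For anyone who wants to see the per-90 arithmetic, here is a minimal sketch. The function names are my own, and the raw totals it prints are back-calculated from the rounded rates and minutes above, so treat them as approximations rather than official counts.

```python
def per_90(total: float, minutes: float) -> float:
    """Scale a raw total (goals, assists, xG ...) to a rate per 90 minutes."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total / minutes * 90


def implied_total(rate_per_90: float, minutes: float) -> float:
    """Back out the approximate raw total behind a per-90 rate."""
    return rate_per_90 * minutes / 90


# Rates and minutes are the ones quoted in the thread.
# The totals are ASSUMED approximations derived from rounded rates.
gi_with_kane = implied_total(0.79, 4993)
gi_without_kane = implied_total(0.70, 1406)

print(f"w/ Kane: ~{gi_with_kane:.0f} GI, wo/Kane: ~{gi_without_kane:.0f} GI")
```

The sample without Kane is less than a third of the size of the sample with him, so a couple of goals either way moves the wo/Kane rate far more.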


Let's compare that with Expected Goal Involvements per 90 minutes.
w/ Kane: 0.57 xGI
wo/Kane: 0.51 xGI
The same pattern can be observed. Very similar results although Son's performance is slightly better with Kane.


Let's break it down to expected goals (xG) and assists (xA)
Goals
w/ Kane: 0.35 xG
wo/Kane: 0.40 xG
Assists
w/ Kane: 0.22 xA
wo/Kane: 0.11 xA
Slightly higher xG and lower xA without Kane
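A quick sanity check on the breakdown, as a sketch: xG and xA per 90 should add up to the xGI per 90 quoted earlier. The dictionary layout and helper names are my own; the numbers are the ones in the thread.

```python
# Per-90 figures quoted in the thread (PL, 2017-2020).
rates = {
    "w/ Kane": {"xG": 0.35, "xA": 0.22, "xGI": 0.57},
    "wo/Kane": {"xG": 0.40, "xA": 0.11, "xGI": 0.51},
}


def components_match(split: dict, tol: float = 0.005) -> bool:
    """True if xG + xA equals xGI within rounding tolerance."""
    return abs(split["xG"] + split["xA"] - split["xGI"]) <= tol


def change_without_kane(metric: str) -> float:
    """Per-90 change in a metric when Kane does not start."""
    return round(rates["wo/Kane"][metric] - rates["w/ Kane"][metric], 2)


for name, split in rates.items():
    print(name, "adds up:", components_match(split))

for metric in ("xG", "xA", "xGI"):
    print(metric, "change wo/Kane:", change_without_kane(metric))
```

Both splits add up, and the xG gain without Kane is smaller than the xA loss.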
Excluding penalties and looking at non-penalty xG
w/ Kane: 0.35 npxG
wo/Kane: 0.35 npxG
Son has the exact same xG performance with and without Kane if we take penalties out of the equation.


Son's small increase in xG due to taking penalties seems to be nullified by the decrease in xA without Kane.
The differences are very small overall, so it would be reasonable to conclude that there is practically no difference in Son's performance playing with or without Kane.
Son has played 1406 mins wo/ Kane between 2017-20. If we divide that into smaller blocks by looking at limited time frames we get unreliable results.
E.g. if we divide the 3-year period into 6 half-seasons, we can see why looking at sample sizes like this is not a great idea.
Looking at xGI per 90 minutes for each of the 6 half-seasons, we can see that the conclusion would be very different each time we looked at the data.
The conclusion would alternate between Son being better and worse without Kane.
If we were to use any one of these small time frames to predict the next, we would be wrong every time.
The same data presented as a bar chart.
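The "predict the next half-season from this one" test can be sketched like this. Big caveat: the six pairs of numbers below are PLACEHOLDERS I made up to show the alternating shape, not the real FBref values from the chart, and the function names are my own.

```python
def verdict(xgi_with: float, xgi_without: float) -> str:
    """Say which split looks better within a single time block."""
    if xgi_without > xgi_with:
        return "better without Kane"
    if xgi_without < xgi_with:
        return "better with Kane"
    return "no difference"


def naive_prediction_hits(blocks: list) -> int:
    """Count how often one block's verdict matches the next block's."""
    verdicts = [verdict(w, wo) for w, wo in blocks]
    return sum(a == b for a, b in zip(verdicts, verdicts[1:]))


# PLACEHOLDER (xGI/90 with Kane, xGI/90 without Kane) per half-season.
# Invented for illustration only; swap in the real figures to reproduce.
placeholder_blocks = [
    (0.60, 0.40),
    (0.50, 0.70),
    (0.62, 0.45),
    (0.55, 0.65),
    (0.58, 0.30),
    (0.45, 0.90),
]

hits = naive_prediction_hits(placeholder_blocks)
print(f"{hits} correct predictions out of {len(placeholder_blocks) - 1}")
```

When the verdict flips every block, as in these placeholder numbers, the naive prediction never lands.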
Looking at no. 6 (last half of 19/20) Son performed two times better in games without Kane.
In this time frame Son started only 5 games without Kane, and 78% of his xG came in two games against the worst defenses in the league (Aston Villa and Norwich).
This clearly shows why looking for patterns in small sample sizes is ill-advised.
Unless a pattern is backed up by long-term supporting evidence or very strong practical reasons, we should give it very little or no weight in our decision-making process.
CONCLUSION
Son is not better without Kane
Do not look at patterns in small sample sizes
If new theories like this arise, try and look at the bigger picture



If you liked this thread you might like this one as well: https://twitter.com/yonkersfpl/status/1309231347060240384?s=20
SOURCE: http://FBref.com