Facts vs. Proxies

In sports analytics, it's important to be able to distinguish between metrics that are facts and metrics that are proxies.

A thread 👇
1/ Put simply:

- A metric is a fact if it indisputably quantifies what it's intended to quantify

- A metric is a proxy if it’s intended to at least roughly represent something that is difficult or impossible to quantify
2/ First of all, it's important to understand that whether a metric is a fact or a proxy depends not only on the metric itself, but on what the metric is being used to quantify.
3/ A metric could be a fact when used to quantify one thing and a proxy when used to quantify something else.
4/ Take the metric Points Scored in 🏀 as an example:

If we use Points Scored to quantify the # of points scored by a player, the metric is a fact.

If we use Points Scored to quantify a player's overall offensive skill, the metric is a proxy.
5/ Points Scored indisputably quantifies how many points were scored. There's no arguing with that.

On the other hand, Points Scored does not indisputably quantify a player's overall offensive skill. We can debate how representative it is of overall offensive skill.
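To make the distinction concrete, here is a minimal Python sketch. The player names, the numbers, and the `shots` field are made up for illustration, and points per shot is just one hypothetical alternative proxy, not a recommended measure. The point total is a fact, but once it is used to rank "offensive skill", a different reasonable proxy can give a different answer.

```python
# Hypothetical box-score lines (made-up numbers, for illustration only).
games = {
    "Player A": [{"points": 30, "shots": 28}, {"points": 26, "shots": 25}],
    "Player B": [{"points": 22, "shots": 12}, {"points": 24, "shots": 14}],
}


def points_scored(player_games):
    """Fact: a direct count of the points a player scored."""
    return sum(g["points"] for g in player_games)


def points_per_shot(player_games):
    """Assumed alternative proxy for offensive skill: scoring efficiency."""
    total_shots = sum(g["shots"] for g in player_games)
    return points_scored(player_games) / total_shots


def rank_by(metric):
    """Order players from best to worst according to the given metric."""
    return sorted(games, key=lambda name: metric(games[name]), reverse=True)


# Nobody can dispute the totals themselves.
print({name: points_scored(g) for name, g in games.items()})

# As proxies for "offensive skill", the two metrics disagree on who is better.
print(rank_by(points_scored))
print(rank_by(points_per_shot))
```

Neither ranking is wrong as arithmetic. The disagreement only appears when the numbers are asked to stand in for something abstract.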
6/ Let's look at a common example of a proxy metric: Power Rankings

Power Rankings are an attempt to quantify the relative strengths of teams in a league and can be constructed in a number of ways, with or without analytics.
7/ However, as good as a Power Ranking may be, it’s not a purely factual ranking.

The relative strengths of teams in a league are too abstract to indisputably quantify.
8/ We can all agree that a team won 10 games or scored 50 points. Those are facts.

We may not all agree about how strong a team is relative to its competitors.
9/ Of course, just because Power Rankings aren’t statements of fact doesn’t mean they aren’t valuable or useful.

Power Rankings can still be useful proxies for the relative strengths of teams.
10/ As you delve further into advanced sports analytics metrics, you’ll find that essentially all advanced metrics are proxies.

Metrics like Win Probability, Expected Goals, & Wins Above Replacement (WAR) are all proxies.
11/ If all we did was look at facts (like Points Scored), analytics wouldn't be all that useful or interesting.

Analytics becomes more valuable when it helps us quantify (via proxies) things that are difficult to measure.
12/ Yes, proxies will never give us a definitive answer to "What is the overall offensive skill level of Player X?", but they might give us a decent estimate, and that's still incredibly valuable.
13/ Finally, it's critical to remember that proxies are not purely factual: subjective human judgment goes into how each metric is constructed and how it's tied to whatever it’s meant to quantify.
14/ Always keep in mind that proxies inherently contain this subjective influence.

While analytical measures often help us cut through human biases, it’s also possible that human biases inform how a proxy metric is constructed.
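As a rough illustration of that subjective influence, here is a small Python sketch of a toy Power Ranking. The team names, stats, scoring formula, and weights are all hypothetical assumptions, not a real ranking method. The inputs are facts, but the choice of weights is a human judgment, and changing it changes the ranking.

```python
# Hypothetical season stats (made-up numbers). These inputs are facts.
teams = {
    "Hawks": {"win_pct": 0.70, "point_diff": 2.0},
    "Owls": {"win_pct": 0.60, "point_diff": 8.0},
    "Bears": {"win_pct": 0.40, "point_diff": -5.0},
}


def power_ranking(stats, w_win, w_diff):
    """Rank teams by a weighted score; the weights are a subjective choice."""

    def score(name):
        s = stats[name]
        # Dividing point differential by 10 is an arbitrary scaling assumption.
        return w_win * s["win_pct"] + w_diff * (s["point_diff"] / 10)

    return sorted(stats, key=score, reverse=True)


# Analyst 1 only trusts win percentage.
print(power_ranking(teams, w_win=1.0, w_diff=0.0))

# Analyst 2 weighs point differential equally and gets a different top team.
print(power_ranking(teams, w_win=0.5, w_diff=0.5))
```

Both analysts start from the same facts, yet their rankings differ because of how each decided to build the proxy.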
You can follow @brendankent.