Thread by @KoalatyStats, People creating predictions using statistical models in football is fantastic. I love [...]

Joseph Bryan

KoalatyStats

People creating predictions using statistical models in football is fantastic. I love to see it, but you should *always* want to see their “out of sample” R^2. #thread

If you are unfamiliar, OOS R^2 measures how much of the variance the model explains in data that it *has never seen before.* In simple terms, it’s how good the model is at *actually* predicting a result.

Your normal/adjusted R^2 could be great, but if the OOS is trash, the model itself isn’t worth much since it can’t predict anything it’s never seen before.

For example, say we are creating a model to predict weekly fantasy points using data from 2014 to 2019. You would want to train your model on about 70% of that, leaving the other 30% aside until you have completed your model.

After completing your model, you would *test* your model on this 30% you set aside. This 30% is new data to your model. It’s just like a new week in the NFL that the model has never seen before.
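Here is a rough sketch of that workflow in plain Python, not anyone's actual model. Everything in it is an assumption made up for illustration: the data is synthetic (a hypothetical `touches` feature driving `points`), the model is a one-variable least-squares line, and the split is a random 70/30 shuffle. OOS R^2 is computed here as 1 - SS_res / SS_tot on the held-out rows, which is one common convention.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(ys, preds):
    """1 - SS_res / SS_tot, both measured on the rows passed in."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic stand-in for real player-week data (assumption, not real stats).
rng = random.Random(0)
rows = []
for _ in range(500):
    touches = rng.uniform(5, 30)
    points = 2.0 + 0.6 * touches + rng.gauss(0, 3.0)
    rows.append((touches, points))

# Set 30% aside before fitting anything.
rng.shuffle(rows)
cut = int(0.7 * len(rows))
train, test = rows[:cut], rows[cut:]

train_x = [r[0] for r in train]
train_y = [r[1] for r in train]
test_x = [r[0] for r in test]
test_y = [r[1] for r in test]

# Fit on the 70% only.
a, b = fit_line(train_x, train_y)

# Score on both pieces; the second number is the one to ask for.
in_sample_r2 = r_squared(train_y, [a + b * x for x in train_x])
oos_r2 = r_squared(test_y, [a + b * x for x in test_x])

print(f"in-sample R^2:     {in_sample_r2:.3f}")
print(f"out-of-sample R^2: {oos_r2:.3f}")
```

With a simple model like this the two numbers should land close together. A big gap between them (great in-sample, poor OOS) is the warning sign. With real season data you might also prefer to hold out the most recent weeks rather than a random 30%, so the test set really does look like "a new week."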

If your model does a good job at predicting that 30%, then you have yourself a good model. OOS R^2 is *critical* and if you ever see someone post predictions using a model, you should ask for the OOS R^2 if they didn’t post it.
