It’s fantastic that people are building statistical models to make predictions in football. I love to see it, but you should *always* want to see their “out of sample” (OOS) R^2. #thread
If you are unfamiliar, OOS R^2 measures how much of the variance the model explains in data that it *has never seen before.* In simple terms, it’s how good the model is at *actually* predicting a result.
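For anyone who wants it written out, here is one common way to define it. The symbols are my own notation, not something from the thread: y_i is the actual value for held-out observation i, \hat{y}_i is the model's prediction for it, \bar{y} is the mean of the held-out actual values, and the sums run only over the held-out set. Some people use the training-set mean for \bar{y} instead, so treat this as a sketch of the usual convention.

```latex
R^2_{\text{OOS}} = 1 - \frac{\sum_{i \in \text{test}} \left( y_i - \hat{y}_i \right)^2}{\sum_{i \in \text{test}} \left( y_i - \bar{y} \right)^2}
```

A value near 1 means the predictions track the unseen data closely; a value at or below 0 means the model does no better than guessing the average every time.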
Your normal/adjusted R^2 could be great, but if the OOS is trash, the model itself isn’t worth much since it can’t predict anything it’s never seen before.
For example, say we are creating a model to predict weekly fantasy points using data from 2014 to 2019. You would want to train your model on about 70% of that, leaving the other 30% to the side until you have completed your model.
After completing your model, you would *test* your model on this 30% you set aside. This 30% is new data to your model. It’s just like a new week in the NFL that the model has never seen before.
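The train-then-test steps above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the data is synthetic (not real 2014 to 2019 fantasy data), `touches` is a made-up single predictor, and `fit_line` and `r_squared` are hypothetical helper names. It is a minimal sketch of a 70/30 split with a simple one-variable regression, not anyone's actual model.

```python
import random


def r_squared(actual, predicted):
    """Return 1 - SS_res / SS_tot, with SS_tot taken around the mean of `actual`."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = covariance / variance_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic stand-in data (assumption): weekly touches and noisy fantasy points.
rng = random.Random(0)
touches = [rng.uniform(5, 30) for _ in range(500)]
points = [0.6 * t + rng.gauss(0, 4) for t in touches]

# Shuffle the rows, then keep 70% for training and set 30% aside for testing.
rows = list(zip(touches, points))
rng.shuffle(rows)
cut = int(0.7 * len(rows))
train, test = rows[:cut], rows[cut:]

train_x, train_y = zip(*train)
test_x, test_y = zip(*test)

# Fit on the training rows only; the test rows are never used for fitting.
intercept, slope = fit_line(train_x, train_y)

in_sample_r2 = r_squared(train_y, [intercept + slope * x for x in train_x])
oos_r2 = r_squared(test_y, [intercept + slope * x for x in test_x])

print(f"in-sample R^2:     {in_sample_r2:.3f}")
print(f"out-of-sample R^2: {oos_r2:.3f}")
```

The point to take away is that the model's coefficients come from the 70% alone, and the OOS R^2 is computed only on the 30% that was set aside.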
If your model does a good job of predicting that 30%, then you have yourself a good model. OOS R^2 is *critical*. If you ever see someone post predictions using a model, you should ask for the OOS R^2 if they didn’t post it.
You can follow @KoalatyStats.