I decided to try something new for #TidyTuesday. I am always really interested in participating, but find myself getting absorbed and ignoring work tasks, so I decided to set a timer for myself... 1/?
I started with a 30-minute timer and tried to do the most interesting thing I could think of. I began with some exploratory #dataviz of the penguin data and realized some observations were missing sex information. 2/?
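A minimal sketch of that kind of missing-data check, not the code from the linked script. It assumes the penguin data is loaded from the {palmerpenguins} package and that "complete data" means the four body measurements are present:

```r
library(dplyr)
library(palmerpenguins)  # assumption: same penguin data as the #TidyTuesday file

# How many penguins fall in each sex category, including NA?
sex_counts <- penguins %>% count(sex)
sex_counts

# Penguins missing sex but complete in all four body measurements
missing_sex <- penguins %>%
  filter(
    is.na(sex),
    !is.na(bill_length_mm),
    !is.na(bill_depth_mm),
    !is.na(flipper_length_mm),
    !is.na(body_mass_g)
  )

nrow(missing_sex)
```

The rows in `missing_sex` are the candidates for prediction, since a model can only score observations whose predictors are present.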
I decided to use {tidymodels} to fit a simple model to the data to predict the missing sex information for the 9 penguins with complete data other than the sex variable. 3/?
I managed to get all the way through fitting the model and predicting the sex of the 9 penguins before my timer ran out! 4/?
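Here is a rough sketch of what a fit-then-predict workflow like this can look like in {tidymodels}. The thread does not say which model or predictors were used, so the logistic regression, the formula, and the object names (`known`, `unknown`, `sex_fit`, `preds`) are all assumptions for illustration; the linked script has the actual analysis.

```r
library(tidymodels)      # attaches parsnip, dplyr, and friends
library(palmerpenguins)  # assumption: same penguin data as the #TidyTuesday file

# Training rows: penguins whose sex is recorded
known <- penguins %>%
  filter(!is.na(sex))

# Rows to predict: sex missing, body measurements present
unknown <- penguins %>%
  filter(
    is.na(sex),
    !is.na(bill_length_mm),
    !is.na(bill_depth_mm),
    !is.na(flipper_length_mm),
    !is.na(body_mass_g)
  )

# Assumed model: logistic regression on species plus the four measurements
sex_fit <- logistic_reg() %>%
  set_engine("glm") %>%
  fit(
    sex ~ species + bill_length_mm + bill_depth_mm +
      flipper_length_mm + body_mass_g,
    data = known
  )

# One predicted class per penguin with missing sex
preds <- predict(sex_fit, new_data = unknown) %>%
  bind_cols(unknown %>% select(species, island, body_mass_g))

preds
```

Adding `type = "prob"` to the `predict()` call returns class probabilities instead, which helps show how confident the model is about each penguin.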
My timer running out coincided with a morning meeting, so I put the penguins aside for a while. I came back after my meeting and started a stopwatch, with the goal of making a meaningful summary viz of my model in a short period of time (< 30 mins). 5/?
I ended up with these two scatterplots, visualizing the two most important variables in the classification problem on the axes, and then coloring & faceting by sex and species. I stopped my stopwatch at 20 mins, 20 secs. 6/?
Looking at the 2nd plot, it's clear where the model is separating male/female penguins!
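A sketch of a scatterplot in this style with {ggplot2}. The thread does not name the two most important variables, so body mass and bill depth are placeholders here, and the color/facet assignment is one guess at the layout:

```r
library(ggplot2)
library(palmerpenguins)  # assumption: same penguin data as the #TidyTuesday file

# Keep only penguins with a recorded sex for this illustration
plot_data <- subset(penguins, !is.na(sex))

# Placeholder axes: body_mass_g and bill_depth_mm are assumed, not taken from the thread
p <- ggplot(plot_data, aes(x = body_mass_g, y = bill_depth_mm, color = sex)) +
  geom_point(alpha = 0.7) +
  facet_wrap(~ species) +
  labs(
    x = "Body mass (g)",
    y = "Bill depth (mm)",
    color = "Sex"
  )

p
```

Swapping the roles (`color = species` with `facet_wrap(~ sex)`) gives the complementary view.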

The script for my analysis is up on my GitHub: https://github.com/sctyner/tidytuesday/blob/master/data/2020/2020-07-28/thirty-minutes.R

Thanks for reading!
7/7
You can follow @sctyner.