There are lots of great resources that help data scientists land their first job, or learn about specific subjects. But I haven't seen any that focus on long-term professional development. I've been thinking about this topic, and here's what I think it involves. 👇
In a sense it's about constant improvement, but not just so that you can make more money or pass a test. It's about continuously breathing new life into your self-efficacy—giving yourself increasingly strong evidence that you can take on more responsibility and be more ambitious.
You're lucky if you can get this evidence in an organic way through the work you're already doing. But I have the impression that many data scientists, like me, eventually start to worry that certain skills are atrophying and that they are becoming over-specialized or complacent.
If you want to continually increase your self-efficacy, I think there are four areas to focus on:

1. Concepts
2. Skills
3. Tools
4. Resources

1. Concepts

Concepts are about what you know. Think:

- Bias-variance tradeoff
- Central limit theorem
- Curse of dimensionality
- Bayes’ theorem

Ideally, you are constantly deepening your knowledge of fundamental concepts, and branching out to new ones that interest you.
Some people do this by regularly reading papers or textbooks that force them to call fundamentals to mind. It's a good approach (much better than nothing!), but for me it lacks something crucial: testing yourself explicitly on your understanding.
I see two main ways to test yourself. One is through a spaced repetition practice, using software like Anki to help you stay sharp. The other is communication: explaining concepts to others through writing or speaking.
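To make the spaced repetition idea a bit more concrete, here is a minimal sketch of a Leitner-style scheduler in Python. It is not how Anki schedules cards. The `Card` class, the `INTERVALS` values, and the function names are all assumptions invented for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical review intervals in days, one per Leitner box. Tune to taste.
INTERVALS = [1, 3, 7, 14, 30]


@dataclass
class Card:
    prompt: str
    answer: str
    box: int = 0          # index into INTERVALS
    due: date = date.min  # a new card is due immediately


def review(card: Card, correct: bool, today: date) -> Card:
    """Promote the card one box on a correct answer, else reset it to box 0."""
    if correct:
        card.box = min(card.box + 1, len(INTERVALS) - 1)
    else:
        card.box = 0
    card.due = today + timedelta(days=INTERVALS[card.box])
    return card


def due_cards(cards: list[Card], today: date) -> list[Card]:
    """Return the cards whose due date has arrived."""
    return [c for c in cards if c.due <= today]


# Usage: one card, answered correctly once and then missed.
card = Card(
    prompt="What does the central limit theorem say?",
    answer="Means of many i.i.d. draws with finite variance are roughly normal.",
)
start = date(2024, 1, 1)
review(card, correct=True, today=start)                        # moves to box 1
review(card, correct=False, today=start + timedelta(days=3))   # back to box 0
```

The design choice is that cards you get right come back less often, while cards you miss return to the shortest interval, so your review time goes to the concepts you are shakiest on.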
I'm especially excited by the prospect of evergreen note-taking as a practice for engaging more deeply with data science concepts. If you haven't heard of this idea, check out @andy_matuschak's notes on it: https://notes.andymatuschak.org/z4SDCZQeRo4xFEQ8H4qrSqd68ucpgE6LU155C

2. Skills

Skills are about what you can do. Think:

- Wrangle data
- Run statistical tests
- Write software packages
- Create data visualizations
- Train and validate models

Again, constantly go deeper on fundamentals and broader on your interests.
I see two main approaches to improving skills. One is to work on projects, ideally ones that are meaningful to you. The other is to regularly work through targeted exercises, like those on Project Euler or Brilliant. The key is that you're *doing* something.

3. Tools

Tools are about what you use to do the things you do. Think:

- Specific programming languages
- Packages
- IDEs

It's hard to say whether there even *are* fundamentals when it comes to tools, since they change year by year. But you can't ignore them.
There's overlap here with concepts and skills. For example, becoming a better Python programmer involves both learning new Python concepts and writing more Python code. So you can use spaced repetition, communication, projects, or targeted exercises. The key is to not neglect your tools.

4. Resources

Resources are about what you reference when you haven't stored important information in your brain. Think:

- Textbooks
- Online courses
- Cheatsheets
- Subscriptions
- Wikis

The goal here is to build yourself a "second brain", to borrow @fortelabs's phrase.
Improving your familiarity with data science resources is probably the easiest way to boost your professional self-efficacy: it takes little effort, yet it acts as a force multiplier for everything you do to learn concepts, skills, and tools.
Essentially, you want to build yourself a central repository of the best resources on every relevant topic, so that you always know where to go to find information you haven't memorized or to refresh your understanding of material you feel rusty on.
So: concepts, skills, tools, and resources. My suggestion is that you should try not to neglect any one of these categories for too long. If you go a whole year without making an active effort to increase your conceptual understanding, something is probably missing.
But at the same time, you don't need to constantly work on them all in parallel. One or another might be more or less rewarding depending on your mood at a given time. Think of them like directions on a joystick.
There is more to say about the value of finding good mentors, where and how to get good feedback, and the value of connecting with a community of other data scientists. But, for now, the core idea is: try to constantly level up on concepts, skills, tools, and resources.
If you are a data scientist (or DS-adjacent), I would love to hear how you think about your professional development. Why does it matter to you? How do you go about it? What do you struggle with? What practices do you find most rewarding?
You can follow @davidklaing.