People often ask me how to build better intuitions about different machine learning and deep learning methods. This is a thread about my experience (as an NLP researcher) building those intuitions, including resources and tips.

🧵
Overview -- Building intuitions about concepts related to a field requires investing a lot of time and effort. For ML, it is no different. In this thread, I will share a bit of my journey and personal experience building intuitions about DL/ML algorithms & new research ideas.
I don't claim that the tips I share here will work for everyone. Doing a Ph.D. gave me enough time to explore ways to dig deeper into topics, so context matters. I also had access to great advisors who provided me with a learning path to be productive in learning and building things.
High-level overview -- Before jumping deep into ML, I took courses like data mining & text mining to build a high-level understanding of methods for building predictive systems. Having this background allowed me to spend time on the problems/methods that I found interesting.
Here are books that I used in my studies to get that **high-level overview**:
📘 Artificial Intelligence: A Modern Approach
📘 Data Mining: Concepts and Techniques
📘 Text Mining: Predictive Methods for Analyzing Unstructured Information
Hands-on experience -- Using that high-level knowledge, I built and trained a lot of models from scratch using tools like R and Python. To better understand these models, I adapted them to different problems, including working on different datasets/tasks.
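To illustrate that kind of from-scratch exercise, here is a minimal sketch (with made-up toy data and arbitrary hyperparameters) of fitting a linear model with plain gradient descent in NumPy:

```python
import numpy as np

# Toy example: fit y = 2x + 1 by gradient descent on the mean squared error.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=100)
y = 2.0 * X + 1.0 + rng.normal(scale=0.05, size=100)

w, b = 0.0, 0.0   # parameters, initialized at zero
lr = 0.1          # learning rate

for _ in range(500):
    y_hat = w * X + b
    grad_w = 2 * np.mean((y_hat - y) * X)   # d(MSE)/dw
    grad_b = 2 * np.mean(y_hat - y)         # d(MSE)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # should land near the true 2.0 and 1.0
```

Writing the update rule out by hand like this, rather than calling `model.fit()`, is what made the mechanics of training click for me.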
These books helped with getting that initial hands-on experience:
📘 Data Science from Scratch
📘 Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

* I also dabbled with Kaggle during my studies.
Scope -- As I focused on a family of approaches that helped with the problems I was interested in (mostly NLP-related), I began to get interested in the inner workings of these models. I used a combination of visualizations, coding, and math to help me build deeper intuition.
Math -- Understanding the math behind ML models really helped me build enough intuition to get comfortable experimenting with different ML models. I had a strong math background going into my graduate studies, so I mostly had to refresh on statistics & advanced calculus...
These two books helped with improving my mathematical understanding of predictive models:
📘 Pattern Recognition and Machine Learning (by Christopher M. Bishop)
📘 The Elements of Statistical Learning (by Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman)
Another book that I have more recently found to be an excellent resource for understanding the mathematics behind machine learning is the following:
📘 Mathematics for Machine Learning (by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong)
Visualization -- Understanding the mathematics behind machine learning models is a difficult process, especially if you lack a background in math. The good news is that we have excellent tools to help us with this. Graphing and visualization tools really come in handy here.
In terms of visualization, the following skills help:
- Visualizing/understanding different probability distributions
- Plotting 2D/3D charts (line charts, scatter plots, bar charts, heat maps, etc.)

Visualization is a powerful tool to build intuition... get enough practice here.
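As a small illustration of the first skill, here is a sketch (the distributions, sample sizes, and output filename are arbitrary choices) that plots two probability distributions side by side with NumPy and Matplotlib:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Draw samples from two distributions and compare their shapes.
rng = np.random.default_rng(42)
samples = {
    "normal(0, 1)": rng.normal(0, 1, 10_000),
    "exponential(1)": rng.exponential(1, 10_000),
}

fig, axes = plt.subplots(1, 2, figsize=(8, 3))
for ax, (name, data) in zip(axes, samples.items()):
    ax.hist(data, bins=50, density=True)  # normalized histogram ~ density
    ax.set_title(name)
fig.tight_layout()
fig.savefig("distributions.png")
```

Swapping in other distributions (beta, Poisson, mixtures) and watching the shapes change is a cheap way to build the probabilistic intuition the math books assume.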
Deep learning involves a lot of different types of data transformations. It's important to understand what these transformations do (e.g., dot product, softmax, ReLU, etc.) to get better intuitions about what these models are attempting to do. It's all about exploring here.
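A quick sketch of what a few of these transformations do to a toy vector (the numbers are arbitrary):

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0])

# ReLU: zero out negatives, pass positives through unchanged.
relu = np.maximum(0, x)            # -> [1., 0., 3.]

# Softmax: exponentiate and normalize into a probability distribution.
z = np.exp(x - x.max())            # subtract the max for numerical stability
softmax = z / z.sum()              # entries are positive and sum to 1

# Dot product: a similarity-like score between two vectors.
w = np.array([0.5, 0.5, 0.5])
score = x @ w                      # 0.5*1 + 0.5*(-2) + 0.5*3 = 1.0
```

Printing intermediate arrays like these while stepping through a model is one of the simplest ways to see what each layer is actually doing.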
These days we have a variety of interactive tools that make it so much easier to plot charts, debug models, visualize weights and predictions, produce loss curves, etc. Check out a few interactive tools put together by @__MLT__: https://github.com/Machine-Learning-Tokyo/Interactive_Tools
Interpretability -- Even though this is not discussed as much, being able to understand/evaluate an ML model's features/predictions is not only a great way to get better intuitions but also an important skill as you aim to push ML models into the real world for decision making.
There is a whole area of research around evaluating model explanations and interpreting machine learning models. Here is a book I found really useful to get more familiar with how to make black box models explainable:
📘 Interpretable Machine Learning (by Christoph Molnar)
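As one concrete example from this area, here is a sketch of permutation importance, a model-agnostic explanation technique covered in that literature (the dataset and model below are arbitrary choices for illustration), using scikit-learn's `permutation_importance`:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Fit a black-box model on a held-out split.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature and measure how much test accuracy drops:
# the bigger the drop, the more the model relies on that feature.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
top = X.columns[result.importances_mean.argsort()[::-1][:5]]
print(list(top))  # the five features the model leans on most
```

Inspecting which features drive predictions like this often reveals data leakage or spurious shortcuts long before a formal evaluation does.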
Tools -- The tools you use really shouldn't matter, but the more tools you know, the better. Just use what works for you or your team. I use Plotly, pandas, scikit-learn, TensorFlow, and PyTorch. I explore a lot of deep learning models, so I tend to use PyTorch more often.
Overall, I build intuitions around ML methods by:
- reading key literature/establishing background/understanding theory
- running code (if available) & additional experiments
- analyzing/visualizing loss curves, weights, etc.
- analyzing/interpreting/explaining predictions, etc.
If this thread is helpful, I will improve it and publish it as an article in my blog, which you can find here: https://elvissaravia.substack.com/ 