🧵 1/n A quick primer on GPT-3 for anyone who's heard about it but doesn't know what it is.

Why? GPT is a game-changer in AI with the potential to disrupt a huge number of fields, and it may point the way toward truly generalized AI problem solvers.
2/ GPT is a series of language-based machine learning models built by @OpenAI. The goal of language models is essentially text generation: look at a sentence → predict the next word(s).
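To make "predict the next word(s)" concrete, here's a minimal sketch using GPT-2 (GPT-3's openly released predecessor) via Hugging Face's transformers library — the library choice & prompt are my own illustration, not something the thread prescribes:

```python
# Minimal next-word(s) prediction with GPT-2 via Hugging Face transformers.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "GPT is a series of language models built by"
inputs = tokenizer(prompt, return_tensors="pt")

# Extend the prompt by a few tokens (greedy decoding).
outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False)
print(tokenizer.decode(outputs[0]))
```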
3/ The premise behind the GPT models: how much data & computing power can you throw at an unsupervised deep learning model? What are the performance limits before you start getting diminishing returns?
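For a feel of the scaling question, OpenAI's scaling-law work (Kaplan et al., 2020) found that test loss falls as a smooth power law in model size. A toy illustration of that shape — the constants are roughly the paper's fits, but treat this as illustrative only:

```python
# Toy power-law scaling trend: loss(N) = (N_c / N) ** alpha.
# Constants roughly follow Kaplan et al. (2020); illustrative only.
def loss(n_params, n_c=8.8e13, alpha=0.076):
    return (n_c / n_params) ** alpha

for n in [1.5e9, 1.75e11]:  # GPT-2 (~1.5B) vs GPT-3 (175B) parameters
    print(f"{n:.1e} params -> predicted loss ~ {loss(n):.2f}")
```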
4/ To do this, you have to design a model & build physical infrastructure so that these huge inputs of data + computing power are possible. This is much easier said than done.
5/ Whereas many models are *specific* (translation, chatbots, etc.), GPT is *generalized*: it takes in a very broad set of data & learns general patterns from it. It's unsupervised (or self-supervised), i.e. it needs no labeled data, unlike domain-specific ML models such as image recognition.
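What "self-supervised" means in practice: the training labels are just the input shifted by one token, so no human annotation is required. A tiny sketch (the whitespace split is a stand-in for a real subword tokenizer):

```python
# Self-supervision in a language model: the "labels" are simply the
# input sequence shifted by one token -- no human annotation needed.
text = "the cat sat on the mat"
tokens = text.split()  # stand-in for a real subword tokenizer

inputs = tokens[:-1]   # model sees:     the cat sat on the
targets = tokens[1:]   # model predicts:     cat sat on the mat
for x, y in zip(inputs, targets):
    print(f"given ...{x!r} -> predict {y!r}")
```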
7/ In fact, results for GPT-2 were so impressive that OpenAI initially withheld the full model to prevent bad actors from using it maliciously (breaking an unwritten norm in the usually open ML research community).
9/ As to the importance of hardware infrastructure: GPT-3 was trained on Microsoft's latest supercomputer, with 285,000+ CPU cores, 10,000 GPUs, and 400Gb/s of network connectivity per GPU server. Estimated training cost: ~$12M.
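A back-of-envelope check on why that hardware was needed, using the ~3,640 petaflop/s-days of training compute reported in the GPT-3 paper. The per-GPU throughput below is my assumption, purely for illustration:

```python
# Back-of-envelope: why GPT-3 needed a supercomputer.
# The GPT-3 paper reports ~3,640 petaflop/s-days of training compute.
total_flops = 3640 * 1e15 * 86400            # ~3.1e23 FLOPs

# Assume 10,000 GPUs at an effective ~30 TFLOP/s each
# (the utilization figure is an assumption, for illustration only).
cluster_flops = 10_000 * 30e12
days = total_flops / cluster_flops / 86400
print(f"~{days:.0f} days on the cluster")   # on the order of weeks
```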
10/ The generalization of the model is the real killer here. The SAME model can output customer service chats, music/movie recs, legal docs, summaries of sporting events, poetry, code functions, and complex descriptions from simple text prompts.
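Roughly what that looks like in practice: "few-shot" prompting. These example prompts are my own illustration; each would be sent to the SAME model, and you'd read the answer off its continuation:

```python
# Same weights, different tasks -- behavior is steered by the prompt alone.
translation = (
    "English: cheese\nFrench: fromage\n"
    "English: house\nFrench: maison\n"
    "English: cat\nFrench:"          # model should continue " chat"
)
code_gen = (
    "# Python function that squares a number\n"
    "def square(x):"                 # model should continue the body
)
for p in (translation, code_gen):
    print(p, "\n---")
```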
12/ For a bunch of amazing examples of what people are doing with this, check out this thread. https://twitter.com/xuenay/status/1283312640199196673
14/ GPT-3 isn't perfect of course — it has failure modes & wouldn't fully pass the Turing test. http://lacker.io/ai/2020/07/06/giving-gpt-3-a-turing-test.html
15/ What will GPT-5 or its equivalent look like? Knowledge is encoded in language & GPT is learning to "understand" how the world works by learning human communication patterns. This is pretty epic & will have far-reaching implications.

/end