Real Python Podcast | Moving NLP forward with transformer models and attention

August 12, 2022


What’s the big breakthrough in Natural Language Processing (NLP) that dramatically advanced machine learning into deep learning? What makes transformer models unique, and what is “attention”? In this episode of the Real Python podcast, I continue my talk with Chris about how machine learning (ML) models understand and generate text.

This episode continues the conversation from episode #119. I build on the concepts of bag-of-words, word2vec, and simple embedding models. We talk about the breakthrough mechanism called “attention,” which allows model training to be parallelized.
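The episode doesn’t walk through code, but the core idea of attention can be sketched in a few lines of NumPy. This is a minimal, illustrative version of scaled dot-product attention (the building block of transformers), not the full multi-head mechanism used in real models; the toy embeddings are made up for the example.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d)) V for query, key, and value matrices."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # how much each query attends to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ V                              # weighted mix of value vectors

# Toy self-attention: three "tokens" with 4-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(X, X, X)
print(out.shape)  # (3, 4): one contextualized vector per token
```

Because every token’s output is computed from all tokens at once via matrix multiplications, the whole sequence can be processed in parallel, unlike the step-by-step processing of recurrent models.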

We also discuss the two major transformer models, BERT and GPT-3. I share multiple resources to help you keep exploring modeling and NLP with Python.