Transformer

A deep learning model architecture that relies entirely on attention mechanisms to draw global dependencies between input and output.

Description

The Transformer is a deep learning model architecture introduced in the 2017 paper "Attention Is All You Need" (Vaswani et al.). It uses self-attention mechanisms to process sequential data, allowing it to handle long-range dependencies more effectively than traditional recurrent neural networks, since every position can attend to every other position in a single step rather than through a chain of recurrent updates. Transformers have revolutionized natural language processing and underpin state-of-the-art systems for tasks such as machine translation, text summarization, and question answering.
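The self-attention operation described above can be sketched in a few lines of numpy. This is a minimal, single-head illustration of scaled dot-product attention; the function name and weight matrices here are illustrative, not taken from any particular library.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention (illustrative sketch)."""
    # Project the input sequence into queries, keys, and values.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    # Every position scores every other position: this all-to-all
    # comparison is how long-range dependencies are captured directly.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over the key axis turns scores into weights summing to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted mix of all value vectors.
    return weights @ V

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                      # toy sizes for illustration
X = rng.normal(size=(seq_len, d_model))      # a "sequence" of 4 token vectors
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one output vector per input position
```

A full Transformer stacks many such attention heads with feed-forward layers, residual connections, and layer normalization, but the core mechanism is this weighted mixing of the whole sequence at every position.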

Examples

  • 🤖 BERT
  • 📝 GPT models
  • 🔄 T5

Applications

  • 🌐 Machine translation
  • 📝 Text generation
  • 😃 Sentiment analysis

Related Terms