Vocabulary
The set of unique tokens known to a language model or NLP system.
Description
In natural language processing, a vocabulary is the set of unique tokens that a model or system recognizes. Depending on the tokenization method, those tokens may be words, subwords, or characters. The vocabulary is typically built from the training data and determines which inputs the model can represent directly. Its size and composition affect model performance, memory usage, and how out-of-vocabulary words are handled: a larger vocabulary covers more words as single tokens but needs more memory, while a smaller one relies on splitting rare words into pieces or mapping them to an unknown token.
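As a rough illustration of building a vocabulary from training data and handling out-of-vocabulary words, here is a minimal Python sketch. The `<unk>` token name, the helper names `build_vocab` and `encode`, the whitespace tokenization, and the toy corpus are all assumptions made for this example, not part of any particular library.

```python
from collections import Counter

UNK = "<unk>"  # hypothetical placeholder token for out-of-vocabulary words


def build_vocab(corpus, max_size=None):
    """Map each unique whitespace-separated token to an integer ID.

    max_size optionally caps the number of corpus tokens kept,
    preferring the most frequent ones.
    """
    counts = Counter(tok for line in corpus for tok in line.lower().split())
    vocab = {UNK: 0}  # reserve ID 0 for unknown tokens
    for tok, _ in counts.most_common(max_size):
        vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab):
    """Convert text to token IDs, falling back to the UNK ID."""
    return [vocab.get(tok, vocab[UNK]) for tok in text.lower().split()]


corpus = ["the cat sat", "the dog sat"]  # toy training data
vocab = build_vocab(corpus)
ids = encode("the bird sat", vocab)  # "bird" was never seen in the corpus
```

Here `bird` is not in the vocabulary, so it is encoded with the same ID as `<unk>`. Passing a small `max_size` shrinks the vocabulary and pushes more rare words onto that fallback.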
Examples
- Word-level vocabulary
- Subword vocabulary (e.g., in BERT or GPT models)
- Character-level vocabulary
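The three granularities listed here can be compared with a small Python sketch. The subword function is a simplified greedy longest-match splitter in the style of WordPiece, shown only for illustration; it is not the actual tokenizer of BERT or GPT (GPT models use byte-pair encoding instead). The function names, the `##` continuation prefix, the `[UNK]` token, and `toy_vocab` are assumptions for this example.

```python
def word_tokens(text):
    """Word-level: split on whitespace."""
    return text.split()


def char_tokens(text):
    """Character-level: every character is a token."""
    return list(text)


def subword_tokens(word, vocab):
    """Subword-level: greedy longest-match-first split of a single word.

    Pieces after the first carry a "##" prefix. If some position cannot
    be matched by any vocabulary entry, the whole word becomes "[UNK]".
    """
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece
            if piece in vocab:
                break
            end -= 1
        if end == start:  # no vocabulary entry matched at this position
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


toy_vocab = {"token", "##ize", "##r", "##s"}  # hypothetical subword vocabulary

words = word_tokens("fast tokenizers")
chars = char_tokens("cat")
pieces = subword_tokens("tokenizers", toy_vocab)
unknown = subword_tokens("xyz", toy_vocab)
```

With this toy vocabulary, `tokenizers` is covered by four known pieces even though the full word is not an entry, which is the main appeal of subword vocabularies: few unknown tokens without the very long sequences that character-level tokenization produces.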
Applications
Related Terms
