AI Glossary
The definitive dictionary for AI, Machine Learning, and Governance terminology. From Flash Attention to RAG — look up any term.
A
Adversarial Attack
An input deliberately crafted to fool an AI model into making incorrect predictions. Adversarial examples often look normal to humans but cause models to fail spectacularly.
Agent Memory
Systems that give AI agents persistent storage for facts, preferences, and conversation history across sessions. Memory enables agents to build cumulative knowledge over time.
Agentic AI
AI systems designed to operate with high autonomy — planning, executing, and adapting without constant human oversight. Agentic AI emphasizes independent action-taking to accomplish user goals.
Agentic Memory Systems
Architectures for managing different types of memory in AI agents — working memory for current tasks, episodic memory for past interactions, and semantic memory for accumulated knowledge.
Agentic RAG
An advanced RAG pattern where an AI agent dynamically decides what to retrieve, how to refine queries, and when to search again based on the quality of initial results.
Agentic Workflow
A multi-step process where an AI agent autonomously plans, executes, evaluates, and iterates on tasks, making decisions at each step rather than following a fixed pipeline.
AI Agent
An AI system that can autonomously plan, reason, and take actions to accomplish goals. Unlike simple chatbots, agents can use tools, make decisions, execute multi-step workflows, and adapt their approach based on results.
AI Alignment Tax
The performance cost of making AI models safer and more aligned with human values. Safety training sometimes reduces raw capability on certain tasks.
AI Chip
A semiconductor designed specifically for artificial intelligence workloads, optimized for the mathematical operations (matrix multiplication, convolution) that neural networks require.
AI Coding Assistant
An AI tool that helps developers write, debug, review, and refactor code through natural language interaction and code completion. Modern coding assistants use LLMs fine-tuned on code.
AI Memory
Systems that give AI models the ability to retain and recall information across conversations or sessions. Memory enables persistent context, user preferences, and accumulated knowledge.
AI Orchestration Layer
The middleware that coordinates AI model calls, tool execution, memory management, and error handling in complex AI applications. It manages the flow between components.
Approximate Nearest Neighbor
An algorithm that finds vectors approximately closest to a query vector, trading perfect accuracy for dramatic speed improvements. ANN makes vector search practical at scale.
Artificial General Intelligence
A hypothetical AI system with human-level cognitive abilities across all domains — able to reason, learn, plan, and understand any intellectual task that a human can. AGI does not yet exist.
Artificial Intelligence
The broad field of computer science focused on creating systems capable of performing tasks that typically require human intelligence. This includes learning, reasoning, problem-solving, perception, and language understanding.
Artificial Superintelligence
A theoretical AI system that vastly surpasses human intelligence across all domains including creativity, problem-solving, and social intelligence. ASI remains purely hypothetical.
ASIC
Application-Specific Integrated Circuit — a chip designed for a single specific purpose. In AI, ASICs like Google's TPUs are designed exclusively for neural network operations.
Attention Head
A single attention computation within multi-head attention. Each head independently computes attention scores, allowing different heads to specialize in different types of relationships.
Attention Map
A visualization showing which parts of the input an AI model focuses on when making predictions. Attention maps reveal the model's internal focus patterns.
Attention Mechanism
A component in neural networks that allows the model to focus on the most relevant parts of the input when producing each part of the output. It assigns different weights to different input elements based on their relevance.
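In essence, each output position is a weighted average of value vectors, with the weights derived from query-key similarity. A minimal NumPy sketch of scaled dot-product attention (toy random inputs, not any particular model):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Weight each value by how relevant its key is to the query."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted sum of values

# 3 tokens, embedding dimension 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

Production implementations add masking, batching, and multiple heads, but the core computation is this weighted sum.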
Attention Score
The numerical value representing how much one token should focus on another token in the attention mechanism. Higher scores mean stronger relationships between tokens.
Attention Sink
A phenomenon in transformers where the first few tokens in a sequence receive disproportionately high attention scores regardless of their content, acting as 'sinks' for excess attention.
Attention Window
The range of tokens that an attention mechanism can attend to in a single computation. Different attention patterns (local, global, sliding) use different window sizes.
Autonomous Agent Framework
A software framework providing the infrastructure for building AI agents including planning, memory, tool integration, error handling, and multi-agent coordination.
Autonomous AI
AI systems capable of making decisions and taking actions independently without continuous human guidance. Autonomous AI can plan, execute, and adapt to changing circumstances on its own.
Autonomous Vehicle
A vehicle that can navigate and operate without human input using AI systems for perception (cameras, lidar), decision-making, and control. Self-driving technology uses computer vision, sensor fusion, and planning.
B
Backdoor Attack
A type of data poisoning where a model is trained to behave maliciously when a specific trigger pattern is present in the input, while behaving normally otherwise.
Beam Search
A search algorithm used in text generation that explores multiple possible output sequences simultaneously, keeping the top-scoring candidates at each step. It often finds higher-quality outputs than greedy decoding.
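A toy sketch of the algorithm, using a made-up probability function in place of a real model. Note how a locally tempting first token ('a') loses to a sequence that starts worse but ends better:

```python
import math

def beam_search(next_probs, start, steps, beam_width=2):
    """Keep the `beam_width` highest-scoring partial sequences at each step."""
    beams = [([start], 0.0)]                        # (sequence, log-probability)
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for token, p in next_probs(seq).items():
                candidates.append((seq + [token], score + math.log(p)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]             # prune to the best few
    return beams[0][0]

# Toy 'model': 'a' is locally tempting, but 'b' leads to a better continuation.
def toy_next_probs(seq):
    if seq[-1] == "<s>":
        return {"a": 0.6, "b": 0.4}
    return {"x": 0.9, "y": 0.1} if seq[-1] == "b" else {"x": 0.5, "y": 0.5}

print(beam_search(toy_next_probs, "<s>", 2))  # ['<s>', 'b', 'x']
```

Greedy decoding would commit to 'a' (probability 0.6 × 0.5 = 0.3); the beam keeps 'b' alive long enough to find the higher-probability sequence (0.4 × 0.9 = 0.36).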
Benchmark
A standardized test or dataset used to evaluate and compare the performance of AI models. Benchmarks provide consistent metrics that allow fair comparisons between different approaches.
Benchmark Contamination
When a model's training data inadvertently includes test data from benchmarks, leading to inflated performance scores that do not reflect true capability.
BERT
Bidirectional Encoder Representations from Transformers — a language model developed by Google that reads text in both directions simultaneously. BERT excels at understanding language rather than generating it.
Black Box
A model or system whose internal workings are not visible or understandable to the user — you can see the inputs and outputs but not the reasoning in between. Most deep learning models are considered black boxes.
BM25
Best Matching 25 — a widely used ranking function for keyword-based information retrieval. BM25 scores documents based on query term frequency, document length, and corpus statistics.
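A compact sketch of the scoring function, using the common smoothed-IDF variant; real implementations precompute corpus statistics rather than rescanning per query:

```python
import math
from collections import Counter

def bm25_score(query, doc, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query (illustrative sketch)."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N            # average document length
    tf = Counter(doc)
    score = 0.0
    for term in query:
        df = sum(1 for d in corpus if term in d)       # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)  # smoothed IDF
        freq = tf[term]
        # term frequency saturates (k1) and is normalized by document length (b)
        score += idf * freq * (k1 + 1) / (freq + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [["vector", "search"], ["keyword", "search", "ranking"], ["bm25", "ranking"]]
print(bm25_score(["search", "ranking"], corpus[1], corpus))
```

The document containing both query terms scores higher than one containing only one of them, as expected.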
Byte-Pair Encoding
A subword tokenization algorithm that starts with individual characters and iteratively merges the most frequent pairs to create a vocabulary of subword units. It balances vocabulary size with handling of rare words.
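The merge loop can be sketched on a tiny toy corpus; production tokenizers add byte-level handling, special tokens, and far larger vocabularies:

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn BPE merge rules from a tiny corpus (toy sketch of the algorithm)."""
    # Represent each word as a tuple of symbols, starting from single characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)          # most frequent adjacent pair
        merges.append(best)
        new_vocab = Counter()
        for word, freq in vocab.items():          # apply the merge everywhere
            merged, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    merged.append(word[i] + word[i + 1])
                    i += 2
                else:
                    merged.append(word[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

merges = bpe_merges(["low", "lower", "lowest"] * 3, 2)
print(merges)  # [('l', 'o'), ('lo', 'w')]
```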
C
Capability Elicitation
Techniques for discovering and activating latent capabilities in AI models — abilities that exist but are not obvious from standard testing or usage.
Chain-of-Thought
A prompting technique where the model is encouraged to show its step-by-step reasoning process before arriving at a final answer. This improves accuracy on complex reasoning tasks.
Chatbot
An AI application designed to simulate conversation with human users through text or voice. Modern chatbots use LLMs to provide natural, contextually aware responses.
ChatGPT
OpenAI's consumer-facing AI chatbot powered by GPT models. ChatGPT brought LLMs to the mainstream when it launched in November 2022, reaching 100 million users in two months.
Chinchilla Scaling
Research by DeepMind showing that many LLMs were significantly undertrained — for a given compute budget, training a smaller model on more data yields better performance.
Chunking
The process of breaking large documents into smaller pieces (chunks) before creating embeddings for use in RAG systems. Chunk size and strategy significantly impact retrieval quality.
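One common strategy is fixed-size chunks with overlap so that context spanning a boundary is not lost. A minimal character-based sketch (real systems often chunk by tokens, sentences, or document structure; assumes overlap < chunk_size):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping fixed-size chunks."""
    chunks = []
    step = chunk_size - overlap          # advance less than a full chunk
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "".join(chr(65 + i % 26) for i in range(500))  # toy 500-character document
chunks = chunk_text(doc, chunk_size=200, overlap=50)
print(len(chunks))  # 3
```

The last 50 characters of each chunk reappear at the start of the next, so a sentence straddling a boundary is retrievable from either side.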
CI/CD for ML
Continuous Integration and Continuous Deployment applied to machine learning — automating the testing, validation, and deployment of ML models whenever code or data changes.
Claude
Anthropic's family of AI assistants known for their focus on safety, helpfulness, and honesty. Claude models are designed with Constitutional AI principles for safer, more reliable AI interactions.
CLIP
Contrastive Language-Image Pre-training — an OpenAI model trained to understand the relationship between images and text. CLIP can match images to text descriptions without being trained on specific image categories.
Closed Source AI
AI models where the architecture, weights, and training details are proprietary and not publicly available. Users access them only through APIs or products controlled by the developer.
Code Generation
The AI capability of producing functional source code from natural language descriptions, specifications, or partial code. Modern LLMs can generate code in dozens of programming languages.
Cognitive Architecture
A framework or blueprint for building AI systems that mimics aspects of human cognition, including perception, memory, reasoning, learning, and action.
Compute
The computational resources (processing power, memory, time) required to train or run AI models. Compute is measured in FLOPs (floating-point operations) and is a primary constraint and cost in AI development.
Compute-Optimal Training
Allocating a fixed compute budget optimally between model size and training data quantity, based on scaling law research like the Chinchilla findings.
Computer Vision
A field of AI that trains computers to interpret and understand visual information from the world — images, videos, and real-time camera feeds. It enables machines to 'see' and make decisions based on what they see.
Confidence Score
A numerical value (typically 0-1) indicating how certain a model is about its prediction. Higher scores indicate greater confidence in the output.
Constrained Generation
Techniques that force LLM output to conform to specific formats, schemas, or grammars. This ensures outputs are always valid JSON, SQL, or match a defined structure.
Constraint Satisfaction
The problem of finding values for variables that satisfy a set of constraints. In AI, it is used in scheduling, planning, and configuration tasks.
Context Management
Strategies for efficiently using an LLM's limited context window, including what information to include, how to compress it, and when to summarize or truncate.
Context Window
The maximum amount of text (measured in tokens) that a language model can process in a single interaction. It includes both the input prompt and the generated output. Larger context windows allow models to handle longer documents.
Continuous Batching
A serving technique where new requests are added to an in-progress batch as existing requests complete, maximizing GPU utilization rather than waiting for an entire batch to finish.
Conversational AI
AI technology that enables natural, multi-turn conversations between humans and machines. It combines NLU, dialog management, and NLG to maintain coherent, contextual interactions.
Counterfactual Explanation
An explanation of an AI decision that describes what would need to change in the input for the model to produce a different output. It answers 'What if?' questions about predictions.
CUDA
Compute Unified Device Architecture — NVIDIA's parallel computing platform that enables GPU programming for AI workloads. CUDA is the dominant software ecosystem for AI computation.
D
DALL-E
A text-to-image AI model created by OpenAI that generates original images from text descriptions. DALL-E can create realistic images, art, and conceptual visualizations from natural language prompts.
Denoising
The process of removing noise from data to recover the underlying clean signal. In generative AI, denoising is the core mechanism of diffusion models.
Dense Retrieval
Information retrieval using learned vector embeddings to find semantically similar documents. Called 'dense' because documents are represented as compact vectors in which most values are nonzero, in contrast to the mostly-zero sparse vectors used by keyword methods.
Deployment
The process of making a trained ML model available for use in production applications. Deployment involves packaging the model, setting up serving infrastructure, and establishing monitoring.
Deterministic Output
When an AI model produces the same output every time for the same input. Achieved by setting temperature to 0 and using fixed random seeds.
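A sketch of why temperature 0 is deterministic: it collapses sampling to argmax, while any higher temperature draws from a distribution (reproducible only with a fixed seed). Toy logits, not a real model:

```python
import numpy as np

def sample_token(logits, temperature, rng):
    """Temperature 0 -> argmax (deterministic); higher -> more random."""
    logits = np.asarray(logits, dtype=float)
    if temperature == 0:
        return int(np.argmax(logits))
    probs = np.exp(logits / temperature)
    probs /= probs.sum()                 # softmax at the given temperature
    return int(rng.choice(len(logits), p=probs))

logits = [2.0, 1.0, 0.5]
rng = np.random.default_rng(42)          # fixed seed for reproducibility
print(sample_token(logits, 0, rng))      # always 0, regardless of the seed
```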
Diffusion Model
A type of generative AI model that creates data by starting with random noise and gradually removing it, step by step, until a coherent output (like an image) emerges. This process is called denoising.
Digital Twin
A virtual replica of a physical system, process, or object that uses real-time data and AI to simulate, predict, and optimize the behavior of its physical counterpart.
Document Processing
AI-powered extraction and understanding of information from documents including PDFs, images, forms, and scanned papers. It combines OCR, NLP, and computer vision.
E
Edge Inference
Running AI models directly on local devices (phones, IoT sensors, cameras) rather than sending data to the cloud. This reduces latency, preserves privacy, and works without internet connectivity.
Embedding
A numerical representation of data (text, images, etc.) as a vector of numbers in a high-dimensional space. Similar items are placed closer together in this space, enabling machines to understand semantic relationships.
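'Closer together' is usually measured with cosine similarity. A sketch with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity between two embedding vectors: 1 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings: 'cat' and 'kitten' point in similar directions; 'car' does not.
cat = np.array([0.9, 0.8, 0.1])
kitten = np.array([0.85, 0.75, 0.2])
car = np.array([0.1, 0.2, 0.95])

print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))  # True
```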
Embedding Dimension
The number of numerical values in a vector embedding. Higher dimensions can capture more nuanced relationships but require more storage and computation.
Embedding Drift
Changes in embedding vector distributions over time as the underlying data, vocabulary, or user behavior shifts. Drift degrades retrieval quality in RAG and search systems.
Embedding Model
A specialized model designed to convert text, images, or other data into vector embeddings. Embedding models are optimized for producing meaningful numerical representations rather than generating text.
Embedding Space
The high-dimensional geometric space in which embeddings exist. In this space, the distance and direction between points encode semantic relationships between the items they represent.
Embeddings as a Service
Cloud APIs that convert text or other data into vector embeddings without requiring users to host or manage embedding models themselves.
Emergent Behavior
Capabilities that appear in large AI models that were not explicitly trained for and were not present in smaller versions. Emergent abilities seem to appear suddenly at certain scale thresholds.
Encoder-Decoder
An architecture where the encoder compresses input into a fixed representation and the decoder generates output from that representation. This structure is used in translation, summarization, and image captioning.
Evaluation
The systematic process of measuring an AI model's performance, safety, and reliability using various metrics, benchmarks, and testing methodologies.
Evaluation Framework
A structured system for measuring AI model performance across multiple dimensions including accuracy, safety, fairness, robustness, and user satisfaction.
Evaluation Harness
A standardized testing framework for running AI models through suites of benchmarks and evaluation tasks. It ensures consistent, reproducible evaluation across models.
Expert System
An early AI system that mimics human expertise in a specific domain using a knowledge base of rules and facts. Expert systems were the dominant AI approach in the 1980s.
F
Federated Inference
Running AI model inference across multiple distributed devices or locations, rather than centralizing it in one place. Each device processes its own data locally.
Few-Shot Learning
A technique where a model learns to perform a task from only a few examples provided in the prompt. Instead of training on thousands of examples, the model generalizes from just 2-5 demonstrations.
Few-Shot Prompting
A prompt engineering technique where a small number of input-output examples are provided before the actual query, demonstrating the desired format and behavior to the model.
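A sketch of the prompt format, here for a hypothetical sentiment task; the exact labels and layout are illustrative:

```python
def few_shot_prompt(examples, query):
    """Build a few-shot prompt: demonstrations first, then the real query."""
    lines = []
    for inp, out in examples:
        lines.append(f"Input: {inp}\nOutput: {out}\n")
    lines.append(f"Input: {query}\nOutput:")      # model completes from here
    return "\n".join(lines)

examples = [
    ("The movie was fantastic", "positive"),
    ("Terrible service, never again", "negative"),
]
prompt = few_shot_prompt(examples, "A solid, enjoyable read")
print(prompt)
```

The trailing "Output:" cues the model to continue in the demonstrated format.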
Fine-Tuning vs RAG
The strategic decision between customizing a model's weights (fine-tuning) or providing external knowledge at inference time (RAG). Each approach has different strengths and use cases.
Flash Attention
An optimized implementation of the attention mechanism that reduces memory usage and increases speed by tiling the computation and avoiding materializing the full attention matrix in memory.
FLOPS
Floating Point Operations Per Second — a measure of computing speed that quantifies how many mathematical calculations a processor can perform each second. Used to measure AI hardware performance.
Foundation Model
A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks. Foundation models serve as the base upon which specialized applications are built.
Frontier Model
The most capable and advanced AI models available at any given time, typically characterized by the highest performance across multiple benchmarks. These models push the boundaries of AI capabilities.
Function Calling
A capability where an LLM can generate structured output to invoke specific functions or APIs. The model decides which function to call and what parameters to pass based on the user's request.
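A sketch of the application side: a tool schema in the JSON-schema style most provider APIs use, and a dispatch on the model's structured output. The exact wire format varies by provider; `get_weather` and the response shape here are illustrative:

```python
import json

# Hypothetical tool definition, described to the model alongside the prompt.
get_weather_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# The model returns a structured call like this; your code executes it.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
call = json.loads(model_output)

def get_weather(city):                   # the application-side implementation
    return f"Weather for {city}"

dispatch = {"get_weather": get_weather}
result = dispatch[call["name"]](**call["arguments"])
print(result)  # Weather for Paris
```

The function result is typically fed back to the model so it can compose a final answer.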
G
Gemini
Google DeepMind's family of multimodal AI models designed to understand and generate text, code, images, audio, and video. Gemini is Google's flagship AI model series.
Generative Adversarial Network
A framework where two neural networks compete — a generator creates fake data and a discriminator tries to tell real from fake. This adversarial process drives both networks to improve, producing increasingly realistic outputs.
Generative AI
AI systems that can create new content — text, images, music, code, video — rather than just analyzing or classifying existing data. These models learn patterns from training data and generate novel outputs that resemble the original data.
GGUF
A file format for storing quantized language models designed for efficient CPU inference. GGUF is the standard format used by llama.cpp and is popular for local LLM deployment.
GPT
Generative Pre-trained Transformer — a family of large language models developed by OpenAI. GPT models are trained to predict the next token in a sequence and can generate coherent, contextually relevant text across many tasks.
GPU
Graphics Processing Unit — originally designed for rendering graphics, GPUs excel at the parallel mathematical operations needed for training and running AI models. They are the primary hardware for modern AI.
GraphRAG
A RAG approach that uses knowledge graphs rather than vector databases for retrieval. It combines graph traversal with LLM generation to answer questions requiring multi-hop reasoning.
Greedy Decoding
A simple text generation strategy where the model always selects the most probable next token at each step. It is fast but can produce repetitive or suboptimal outputs.
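The strategy reduces to an argmax loop. A sketch with a made-up next-token function standing in for a real model:

```python
import numpy as np

def greedy_decode(logits_fn, start, steps):
    """Generate by always taking the single most probable next token."""
    seq = list(start)
    for _ in range(steps):
        logits = logits_fn(seq)
        seq.append(int(np.argmax(logits)))   # pick the top token only
    return seq

# Toy 'model': token i is most likely followed by token (i + 1) % 5.
def toy_logits(seq):
    logits = np.zeros(5)
    logits[(seq[-1] + 1) % 5] = 1.0
    return logits

print(greedy_decode(toy_logits, [0], 4))  # [0, 1, 2, 3, 4]
```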
Grounding
The practice of connecting AI model outputs to verifiable sources of information, ensuring responses are based on factual data rather than the model's potentially unreliable internal knowledge.
Guardrail Model
A separate, specialized AI model that monitors the inputs and outputs of a primary LLM to detect and block harmful, off-topic, or policy-violating content.
H
Hallucination
When an AI model generates information that sounds plausible and confident but is factually incorrect, fabricated, or not grounded in its training data or provided context. The model essentially 'makes things up'.
Hallucination Detection
Methods and systems for automatically identifying when an AI model has generated false or unsupported information. Detection can compare outputs against source documents or use consistency checks.
Hallucination Rate
The frequency at which an AI model generates incorrect or fabricated information. It is typically measured as a percentage of responses containing hallucinations.
Hardware Acceleration
Using specialized hardware (GPUs, TPUs, FPGAs, ASICs) to speed up AI computation compared to general-purpose CPUs. Accelerators are optimized for the specific math operations used in neural networks.
Hugging Face
The leading open-source platform for sharing and discovering AI models, datasets, and applications. Hugging Face hosts the Transformers library and a community hub with thousands of pre-trained models.
Human Evaluation
Using human judges to assess AI model quality on subjective dimensions like helpfulness, coherence, creativity, and safety that automated metrics cannot fully capture.
Human-in-the-Loop
A system design where humans are integrated into the AI workflow to provide oversight, make decisions, correct errors, or handle edge cases that the AI cannot reliably manage alone.
Hybrid Search
A search approach that combines keyword-based (lexical) search with semantic (vector) search to get the benefits of both — exact matching for specific terms and meaning-based matching for conceptual queries.
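One common way to combine the two result lists is reciprocal rank fusion (RRF), which needs only ranks, not comparable scores. A sketch with hypothetical document IDs:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists (e.g. BM25 + vector) via RRF scores."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Documents ranked highly in any list accumulate more score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc3", "doc1", "doc7"]   # lexical (keyword) ranking
vector_hits = ["doc1", "doc5", "doc3"]    # semantic (vector) ranking
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

"doc1" wins because it ranks well in both lists; the constant k=60 is the value commonly used in the RRF literature.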
I
Image Classification
A computer vision task that assigns a category label to an entire image. The model determines what the main subject of the image is from a predefined set of categories.
Image Segmentation
A computer vision task that assigns a label to every pixel in an image, dividing it into meaningful regions. It identifies not just what objects are present but their exact shapes and boundaries.
In-Context Learning
An LLM's ability to learn new tasks from examples or instructions provided within the prompt, without any weight updates or fine-tuning. The model adapts its behavior based on the context given.
Inference
The process of using a trained model to make predictions on new, previously unseen data. Inference is what happens when an AI model is deployed and actively serving results to users.
Inference Optimization
Techniques for making AI model inference faster, cheaper, and more efficient. This includes quantization, batching, caching, speculative decoding, and hardware optimization.
Information Extraction
The task of automatically extracting structured information (entities, relationships, events) from unstructured text documents.
Instruction Following
An LLM's ability to accurately understand and execute user instructions, including complex multi-step directives with specific constraints on format, tone, length, and content.
Instruction Hierarchy
A framework for prioritizing different levels of instructions when they conflict — system prompts typically override user prompts, which override context from retrieved documents.
Instructor Embedding
An embedding approach where you provide instructions that describe the task alongside the text, producing task-specific embeddings from a single model.
K
Knowledge Cutoff
The date after which a language model has no training data. The model cannot reliably answer questions about events that occurred after its knowledge cutoff.
KV Cache
Key-Value Cache — a mechanism that stores previously computed attention key and value vectors during autoregressive generation, avoiding redundant computation for tokens already processed.
L
LangChain
A popular open-source framework for building applications powered by language models. It provides tools for prompt management, chains, agents, memory, and integration with external tools and data sources.
Large Language Model
A type of AI model trained on massive amounts of text data that can understand and generate human-like text. LLMs use transformer architecture and typically have billions of parameters, enabling them to perform a wide range of language tasks.
Latency
The time delay between sending a request to an AI model and receiving the response. In ML systems, latency includes data preprocessing, model inference, and network transmission time.
Leaderboard
A ranking of AI models by performance on specific benchmarks. Leaderboards drive competition and provide quick comparisons but can encourage gaming and narrow optimization.
Llama
A family of open-weight large language models released by Meta. Llama models are available for download and customization, making them the most widely adopted open-source LLM family.
LLM-as-Judge
Using a large language model to evaluate the quality of another model's outputs, replacing or supplementing human evaluators. The judge LLM scores responses on various quality dimensions.
Long Context
The ability of AI models to process very large amounts of input text — typically 100K tokens or more — enabling analysis of entire books, codebases, or document collections.
M
Machine Translation
The use of AI to automatically translate text or speech from one language to another. Modern neural machine translation uses transformer models and achieves near-human quality for many language pairs.
Mistral
A French AI company and their family of efficient, high-performance open-weight language models. Mistral models are known for strong performance relative to their size.
Mixture of Agents
An architecture where multiple different AI models collaborate on a task, with each model contributing its strengths. A routing or aggregation layer combines their outputs.
Mixture of Depths
A transformer architecture where different tokens use different numbers of layers, allowing the model to spend more computation on complex tokens and less on simple ones.
Mixture of Experts
An architecture where a model consists of multiple specialized sub-networks (experts) and a gating mechanism that routes each input to only the most relevant experts. Only a fraction of the total parameters are active per input.
Mixture of Modalities
AI architectures that natively process and generate multiple data types within a single unified model, rather than using separate models connected together.
MLOps
Machine Learning Operations — the set of practices that combine ML, DevOps, and data engineering to deploy and maintain ML models in production reliably and efficiently.
Model Collapse
A phenomenon where AI models trained on AI-generated content progressively lose quality and diversity, eventually producing repetitive, low-quality outputs. Each successive generation of models degrades further.
Model Context Protocol
An open protocol that standardizes how AI models connect to external tools, data sources, and services. MCP provides a universal interface for LLMs to access context from any compatible system.
Model Drift
The gradual degradation of a model's predictive performance over time as the real-world environment changes. Model drift can be caused by data drift, concept drift, or both.
Model Evaluation Pipeline
An automated system that runs a comprehensive suite of evaluations on AI models, generating reports on accuracy, safety, bias, robustness, and other quality dimensions.
Model Hub
A platform for hosting, discovering, and sharing pre-trained AI models. Model hubs provide standardized access to thousands of models across different tasks and architectures.
Model Interpretability Tool
Software tools that help understand how ML models make predictions, including feature importance, attention visualization, counterfactual explanations, and decision path analysis.
Model Monitoring
The practice of continuously tracking an ML model's performance, predictions, and input data in production to detect degradation, drift, or anomalies after deployment.
Model Registry
A centralized repository for storing, versioning, and managing trained ML models along with their metadata (metrics, parameters, lineage). It serves as the system of record for models.
Model Serving
The infrastructure and process of deploying trained ML models to production where they can receive requests and return predictions in real time. It includes scaling, load balancing, and version management.
Model Size
The number of parameters in a model, typically expressed in millions (M) or billions (B). Model size correlates loosely with capability but also determines compute and memory requirements.
Model Weights
The collection of all learned parameter values in a neural network. Model weights are what you download when you get a pre-trained model — they encode everything the model learned.
Multi-Agent System
An architecture where multiple AI agents collaborate, each with specialized roles or capabilities, to accomplish complex tasks that no single agent could handle alone.
Multi-Head Attention
An extension of attention where multiple attention mechanisms (heads) run in parallel, each learning to focus on different types of relationships in the data. The outputs are then combined.
Multilingual AI
AI models capable of understanding and generating text in multiple languages. Modern LLMs often support 50-100+ languages, though performance varies significantly across languages.
Multimodal AI
AI systems that can process and generate multiple types of data — text, images, audio, video — within a single model. Multimodal models understand the relationships between different data types.
Multimodal Embedding
Embeddings that map different data types (text, images, audio) into the same vector space, enabling cross-modal search and comparison.
Multimodal RAG
Retrieval-augmented generation that works across multiple data types — retrieving and reasoning over text, images, tables, and charts to answer questions that require multimodal understanding.
Multimodal Search
Search systems that can query across different data types — finding images with text, videos with audio descriptions, or documents that contain specific visual elements.
N
Named Entity Recognition
The NLP task of identifying and classifying named entities in text into predefined categories such as person names, organizations, locations, dates, monetary values, and more.
Narrow AI
AI systems designed and trained for a specific task or narrow set of tasks. All current AI systems are narrow AI — they excel in their domain but cannot generalize outside it.
Natural Language Generation
The AI capability of producing human-readable text from structured data, internal representations, or prompts. NLG is the output side of language AI — turning machine understanding into human words.
Natural Language Inference
The NLP task of determining the logical relationship between two sentences — whether one entails, contradicts, or is neutral with respect to the other.
Natural Language Processing
The branch of AI that deals with the interaction between computers and human language. NLP enables machines to read, understand, generate, and make sense of human language in a useful way.
Natural Language Understanding
The ability of an AI system to comprehend the meaning, intent, and context of human language, going beyond surface-level word matching to grasp semantics, pragmatics, and implied meaning.
Neuro-Symbolic AI
Approaches that combine neural networks (pattern recognition, learning from data) with symbolic AI (logical reasoning, knowledge representation) to get the strengths of both.
O
Object Detection
A computer vision task that identifies and locates specific objects within an image or video, providing both the object class and its position (usually as a bounding box).
Observability
The ability to understand the internal state and behavior of an AI system through its external outputs, including logging, tracing, and monitoring of LLM calls and agent actions.
ONNX
Open Neural Network Exchange — an open format for representing machine learning models that enables interoperability between different ML frameworks and deployment targets.
Open Source AI
AI models and tools released with open licenses that allow anyone to use, modify, and distribute them. Open-source AI democratizes access and enables community-driven improvement.
Optical Character Recognition
Technology that converts images of text (typed, handwritten, or printed) into machine-readable digital text. Modern OCR uses deep learning for high accuracy even on difficult inputs.
Orchestration
The coordination and management of multiple AI components, tools, and services to accomplish complex workflows. Orchestration handles routing, sequencing, error handling, and resource allocation.
P
Parallel Function Calling
The ability of an LLM to invoke multiple tool calls simultaneously in a single response, rather than sequentially. This enables faster task completion for independent operations.
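The concurrency pattern behind parallel tool calls can be sketched with plain `asyncio` — here `get_weather` and `get_stock` are hypothetical stand-ins for real tool implementations, and the runner awaits all independent calls at once rather than one after another:

```python
import asyncio

# Hypothetical stand-ins for real tool implementations.
async def get_weather(city: str) -> str:
    await asyncio.sleep(0)  # stands in for a network call
    return f"weather({city})"

async def get_stock(ticker: str) -> str:
    await asyncio.sleep(0)
    return f"stock({ticker})"

async def run_tool_calls(calls):
    """Execute independent tool calls concurrently, as a model
    emitting parallel function calls would expect."""
    tasks = [tool(**args) for tool, args in calls]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_tool_calls([
    (get_weather, {"city": "Paris"}),
    (get_stock, {"ticker": "ACME"}),
]))
```

Because the calls are independent, total latency is roughly that of the slowest call, not the sum of all of them.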
Planning
An AI agent's ability to break down complex goals into a sequence of steps and determine the best order of actions to accomplish a task. Planning involves reasoning about dependencies, priorities, and contingencies.
Playground
An interactive web interface where users can experiment with AI models by adjusting parameters, testing prompts, and seeing results in real time without writing code.
Positional Encoding
A technique used in transformers to inject information about the position of each token in a sequence. Since transformers process all tokens in parallel, they need explicit position information.
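The sinusoidal scheme from the original transformer paper can be computed directly — even dimensions use sine, odd dimensions cosine, with geometrically increasing wavelengths. A minimal sketch:

```python
import math

def sinusoidal_position(pos: int, d_model: int) -> list[float]:
    """Sinusoidal encoding from 'Attention Is All You Need':
    even dimensions use sin, odd dimensions use cos, with
    wavelengths growing geometrically up to 10000 * 2*pi."""
    enc = []
    for i in range(d_model):
        angle = pos / (10000 ** (2 * (i // 2) / d_model))
        enc.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return enc

# Position 0 encodes as alternating 0s (sin 0) and 1s (cos 0).
print(sinusoidal_position(0, 4))  # [0.0, 1.0, 0.0, 1.0]
```

Each position gets a unique, fixed vector that is simply added to the token embedding; many modern models instead learn positions or use rotary encodings.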
Prompt Attack Surface
The total set of potential vulnerabilities in an LLM application that can be exploited through prompt-based attacks, including injection, leaking, and jailbreaking vectors.
Prompt Caching
A technique that stores and reuses the processed form of frequently used prompt prefixes, avoiding redundant computation. It speeds up inference and reduces costs for repeated prompts.
Prompt Chaining
A technique where the output of one LLM call becomes the input for the next, creating a pipeline of prompts that together accomplish a complex task.
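A minimal two-step chain, with `call_llm` as a stub standing in for a real model API — the first call summarizes, and its output feeds the second call:

```python
def call_llm(prompt: str) -> str:
    # Stub: echoes a transformed prompt so the chain is runnable.
    return f"[model output for: {prompt}]"

def chain(document: str) -> str:
    """Two-step chain: summarize first, then translate the summary."""
    summary = call_llm(f"Summarize: {document}")
    translation = call_llm(f"Translate to French: {summary}")
    return translation

result = chain("Quarterly revenue grew 12%.")
```

Splitting a task this way lets each prompt stay focused and makes intermediate outputs inspectable and testable.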
Prompt Compression
Techniques for reducing the token count of prompts while preserving their essential meaning, enabling more efficient use of context windows and reducing API costs.
Prompt Engineering
The practice of designing and optimizing input prompts to get the best possible output from AI models. It involves crafting instructions, providing examples, and structuring queries to guide the model toward desired responses.
Prompt Injection
A security vulnerability where malicious input is crafted to override or manipulate an LLM's system prompt or instructions, causing it to behave in unintended ways.
Prompt Injection Defense
Techniques and strategies for protecting LLM applications from prompt injection attacks, including input sanitization, output filtering, and architectural defenses.
Prompt Leaking
When a user extracts an application's hidden system prompt through clever questioning. Prompt leaking reveals proprietary instructions, business logic, and safety configurations.
Prompt Library
A curated collection of tested, optimized prompts organized by use case. Prompt libraries accelerate development by providing proven starting points for common tasks.
Prompt Management
The practice of versioning, testing, and managing prompts used in LLM applications. It treats prompts as code that needs proper lifecycle management.
Prompt Optimization
Systematic techniques for improving prompt effectiveness, including automated prompt search, A/B testing of prompt variants, and iterative refinement based on output quality metrics.
Prompt Template
A pre-defined structure for formatting prompts to AI models, with placeholders for dynamic content. Templates ensure consistent, optimized prompt formatting across applications.
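A sketch using Python's standard `string.Template`; the template text and placeholder names here are hypothetical:

```python
from string import Template

# Hypothetical template; $customer_name and $issue are placeholders
# filled in with dynamic content at request time.
SUPPORT_TEMPLATE = Template(
    "You are a support agent. Customer $customer_name reports: $issue.\n"
    "Reply politely and propose one next step."
)

prompt = SUPPORT_TEMPLATE.substitute(
    customer_name="Ada", issue="login fails after password reset"
)
```

`substitute` raises a `KeyError` if a placeholder is missing, which catches template/data mismatches early.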
Prompt Versioning
Tracking different versions of prompts over time, including changes, performance metrics, and rollback capabilities. Essential for managing prompts in production AI applications.
Q
Quantization
The process of reducing the precision of a model's numerical weights (e.g., from 32-bit to 8-bit or 4-bit), making the model smaller and faster while accepting a small trade-off in accuracy.
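A toy sketch of symmetric 8-bit quantization with a single per-tensor scale (real quantizers add per-channel scales, calibration data, and outlier handling):

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric 8-bit quantization: map floats onto [-127, 127]
    using one per-tensor scale, rounding to the nearest integer."""
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [x * scale for x in q]

w = [0.4, -1.27, 0.02]
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)  # close to w, up to rounding error
```

Storing `q` as int8 plus one float scale uses roughly a quarter of the memory of 32-bit floats, at the cost of a bounded rounding error per weight.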
Question Answering
An NLP task where the model provides direct answers to questions, either from a given context passage (extractive QA) or from general knowledge (open-domain QA).
R
RAG Pipeline
The complete end-to-end system for retrieval-augmented generation, including document ingestion, chunking, embedding, indexing, retrieval, reranking, prompt construction, and generation.
Reasoning
An AI model's ability to think logically, make inferences, draw conclusions, and solve problems that require multi-step thought. Reasoning goes beyond pattern matching to genuine logical analysis.
Recommendation System
An AI system that predicts and suggests items a user might be interested in based on their behavior, preferences, and similarities to other users.
Relation Extraction
The NLP task of identifying and classifying semantic relationships between entities mentioned in text. It extracts structured facts from unstructured text.
Reranking
A second-stage ranking process that takes initial search results and reorders them using a more sophisticated model. Reranking improves precision by applying deeper analysis to a smaller candidate set.
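The two-stage idea can be sketched with toy scorers — `cheap_score` stands in for a first-stage retriever and `expensive_score` for a cross-encoder-style reranker applied only to the shortlist:

```python
def cheap_score(query: str, doc: str) -> int:
    # First stage: crude word-overlap score over the whole corpus.
    return len(set(query.split()) & set(doc.split()))

def expensive_score(query: str, doc: str) -> float:
    # Second stage stand-in: overlap weighted by document brevity.
    return cheap_score(query, doc) / (1 + len(doc.split()))

def rerank(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Shortlist the top-k candidates cheaply, then reorder
    only that small set with the expensive scorer."""
    shortlist = sorted(docs, key=lambda d: cheap_score(query, d),
                       reverse=True)[:k]
    return sorted(shortlist, key=lambda d: expensive_score(query, d),
                  reverse=True)

docs = ["fast vector search engine", "vector search", "cooking recipes"]
top = rerank("vector search", docs)
```

The expensive model sees only `k` candidates, so its cost stays constant even as the corpus grows.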
Retrieval
The process of finding and extracting relevant information from a large collection of documents or data in response to a query. In AI systems, retrieval is often the first step before generation.
Retrieval Evaluation
Methods for measuring how well a retrieval system finds relevant documents. Key metrics include recall at K, mean reciprocal rank, and normalized discounted cumulative gain.
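Two of these metrics are straightforward to compute for a single query; a sketch:

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents found in the top k results."""
    return len(set(retrieved[:k]) & relevant) / len(relevant)

def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0 if none appears.
    Averaging this over many queries gives mean reciprocal rank."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1 / rank
    return 0.0

hits = ["d3", "d1", "d9"]   # system output, best first
gold = {"d1", "d2"}          # human-labeled relevant docs
print(recall_at_k(hits, gold, 3))     # 0.5: one of two relevant docs found
print(reciprocal_rank(hits, gold))    # 0.5: first relevant doc at rank 2
```

Recall@K ignores ordering within the top K, while reciprocal rank rewards putting a relevant document near the top — reporting both gives a fuller picture.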
Retrieval Latency
The time it takes for a retrieval system to search through stored documents or embeddings and return relevant results. Measured in milliseconds, it is a critical component of RAG system performance.
Retrieval Quality
A measure of how relevant and accurate the documents retrieved by a search or RAG system are relative to the user's query. Poor retrieval quality is the leading cause of RAG failures.
Retrieval-Augmented Generation
A technique that enhances LLM outputs by first retrieving relevant information from external knowledge sources and then using that information as context for generation. RAG combines the power of search with the fluency of language models.
Retrieval-Augmented Reasoning
An advanced approach where an AI model interleaves retrieval with reasoning steps, fetching new information mid-reasoning rather than retrieving everything upfront.
Reward Hacking
When an AI system finds unintended ways to maximize its reward signal that do not align with the designer's actual goals. The system technically optimizes the metric but violates the spirit of the objective.
Robustness
The ability of an AI model to maintain reliable performance when faced with unexpected inputs, adversarial attacks, data distribution changes, or edge cases.
Role Prompting
A technique where the model is instructed to adopt a specific persona, expertise, or perspective in its responses. The assigned role shapes tone, depth, terminology, and reasoning approach.
S
Sampling Strategy
The method used to select the next token during text generation. Different strategies (greedy, top-k, top-p, temperature-based) produce different tradeoffs between quality and diversity.
Scaling Hypothesis
The theory that increasing model size, data, and compute will continue to improve AI capabilities predictably, and may eventually lead to artificial general intelligence.
Scaling Laws
Empirical findings showing predictable relationships between model performance and factors like model size (parameters), dataset size, and compute budget. Performance improves as a power law with these factors.
Self-Attention
A mechanism where each element in a sequence attends to all other elements to compute a representation, determining how much focus to place on each part of the input. It is the core innovation of the transformer.
Self-Consistency
A decoding strategy where the model generates multiple reasoning paths for the same question and selects the answer that appears most frequently across paths. It improves accuracy on reasoning tasks.
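A sketch of the majority-vote step, with a stub sampler standing in for temperature-sampled model calls:

```python
from collections import Counter

def self_consistent_answer(sample_fn, question: str, n: int = 5) -> str:
    """Sample n independent reasoning paths and return the final
    answer that appears most often across them."""
    answers = [sample_fn(question) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Stub sampler: imagine 3 of 5 sampled paths reach "42"
# while the other two diverge.
fake_samples = iter(["42", "41", "42", "42", "17"])
answer = self_consistent_answer(lambda q: next(fake_samples), "What is 6*7?")
```

The vote is over final answers only; two paths with different intermediate reasoning but the same conclusion count as agreeing.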
Semantic Caching
Caching LLM responses based on the semantic meaning of queries rather than exact string matching. Semantically similar questions return cached answers, reducing latency and cost.
Semantic Chunking
An intelligent chunking strategy for RAG that splits documents based on semantic meaning rather than fixed character counts, keeping coherent topics together.
Semantic Kernel
Microsoft's open-source SDK for integrating LLMs with programming languages. It provides a framework for orchestrating AI capabilities with conventional code.
Semantic Router
A system that routes user queries to appropriate handlers based on semantic meaning rather than keyword matching. It acts as the traffic director of an AI application, sending each query to the right handler.
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords. It uses embeddings to find results that are conceptually related even if they use different words.
Semantic Similarity
A measure of how similar in meaning two pieces of text are, regardless of the specific words used. Semantic similarity captures conceptual relatedness rather than lexical overlap.
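In practice this is usually computed as the cosine similarity between embedding vectors; a sketch with toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors:
    1.0 = same direction, 0.0 = orthogonal (unrelated)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0
```

Because cosine ignores vector length, it compares direction (meaning) rather than magnitude.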
Sentence Embedding
A vector representation of an entire sentence or paragraph that captures its overall meaning. Sentence embeddings enable comparing the meanings of text passages.
Sentiment Analysis
The NLP task of identifying and classifying the emotional tone or opinion expressed in text as positive, negative, or neutral. Advanced systems detect nuanced emotions like frustration, excitement, or sarcasm.
Sequence-to-Sequence
A model architecture that transforms one sequence into another, where the input and output can be different lengths. It uses an encoder to process input and a decoder to generate output.
Singularity
A hypothetical future point at which AI self-improvement becomes so rapid that it triggers an intelligence explosion, leading to changes so profound they are impossible to predict.
Sparse Attention
A variant of attention where each token only attends to a subset of other tokens rather than all of them, reducing computational cost from O(n²) to O(n√n) or O(n log n).
Sparse Model
A neural network where most parameters are zero or inactive for any given input. Sparse models achieve high capacity with lower computational cost by only using relevant parameters.
Sparse Retrieval
Information retrieval using traditional keyword matching and term frequency methods (like BM25). Called 'sparse' because document representations have mostly zero values.
Speculative Decoding
A technique that uses a small, fast model to draft multiple tokens ahead, then uses the large model to verify them in parallel. It speeds up inference without changing output quality.
Speech-to-Text
AI technology that converts spoken audio into written text (also called automatic speech recognition or ASR). Modern systems handle accents, background noise, and multiple speakers.
Stable Diffusion
An open-source text-to-image diffusion model that generates detailed images from text descriptions. It works in a compressed latent space, making it more efficient than pixel-level diffusion.
Streaming
Delivering LLM output token-by-token as it is generated rather than waiting for the complete response. Streaming dramatically improves perceived latency and user experience.
Structured Output
The ability of an LLM to generate responses in a specific format like JSON, XML, or a defined schema. Structured output makes AI responses parseable by other software systems.
Summarization
The NLP task of condensing a longer text into a shorter version while preserving the key information and main points. Summarization can be extractive (selecting key sentences) or abstractive (generating new text).
Swarm Intelligence
Collective behavior emerging from the interaction of multiple simple agents that together produce sophisticated solutions. Inspired by natural swarms like ant colonies, bee hives, and bird flocks.
Symbolic AI
An approach to AI that represents knowledge using symbols and rules, and reasons by manipulating those symbols logically. Symbolic AI dominated before the deep learning era.
Synthetic Benchmark
A benchmark composed of artificially generated or carefully curated evaluation tasks designed to test specific AI capabilities, rather than using naturally occurring data.
Synthetic Evaluation
Using AI models to evaluate other AI models, generating test cases and scoring outputs automatically. This scales evaluation far beyond what human reviewers alone can achieve.
System Prompt
Hidden instructions provided to an LLM that define its behavior, personality, constraints, and capabilities for a conversation. System prompts set the rules of engagement before the user interacts.
T
Temperature
A parameter that controls the randomness or creativity of an LLM's output. Lower temperatures (closer to 0) make outputs more deterministic and focused; higher temperatures increase randomness and creativity.
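Mechanically, temperature divides the logits before the softmax — low values sharpen the distribution toward the top token, high values flatten it. A sketch:

```python
import math

def softmax_with_temperature(logits: list[float],
                             temperature: float) -> list[float]:
    """Scale logits by 1/temperature, then apply softmax."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]
cold = softmax_with_temperature(logits, 0.5)  # sharp, near-deterministic
hot = softmax_with_temperature(logits, 2.0)   # flatter, more diverse
```

As temperature approaches 0 this converges to greedy decoding; very high temperatures approach a uniform distribution over the vocabulary.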
Test-Time Compute
Allocating additional computation during inference (not training) to improve output quality. Techniques include chain-of-thought, self-consistency, and iterative refinement.
Text Classification
The NLP task of assigning predefined categories or labels to text documents. It is one of the most common and commercially important NLP applications.
Text Mining
The process of deriving meaningful patterns, trends, and insights from large collections of text data using NLP and statistical techniques.
Text-to-Image
AI models that generate visual images from natural language text descriptions (prompts). This technology converts written descriptions into original images, illustrations, or photorealistic visuals.
Text-to-Speech
AI technology that converts written text into natural-sounding human speech. Modern TTS systems can generate voices with realistic intonation, emotion, and even clone specific voices.
Throughput
The number of requests or predictions a model can process in a given time period. High throughput means the system can serve many users simultaneously.
Token
The basic unit of text that language models process. A token can be a word, part of a word, or a punctuation mark. Text is broken into tokens before being fed into an LLM, and the model generates output one token at a time.
Token Limit
The maximum number of tokens a model can process in a single request, including both the input prompt and the generated output. Exceeding the limit results in truncated input or errors.
Tokenization
The process of breaking text into smaller units (tokens) for processing by NLP models. Tokenization can split text into words, subwords, or characters depending on the method used.
Tokenization Strategy
The approach and rules for how text is split into tokens. Different strategies (word-level, subword, character-level) make different tradeoffs between vocabulary size and sequence length.
Tokenizer
A component that converts raw text into tokens (numerical representations) that a language model can process. Different tokenizers split text differently, affecting model performance and efficiency.
Tokenizer Efficiency
How effectively a tokenizer represents text — measured by the average number of tokens needed to represent a given amount of text. More efficient tokenizers produce fewer tokens for the same content.
Tokenizer Training
The process of building a tokenizer's vocabulary from a corpus of text. The tokenizer learns which subword units to use based on frequency patterns in the training corpus.
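One training step of byte-pair encoding (BPE), a common subword method, can be sketched as counting the most frequent adjacent symbol pair in the corpus and merging it into a new token; repeating this grows the vocabulary:

```python
from collections import Counter

def most_frequent_pair(words: list[list[str]]) -> tuple[str, str]:
    """Count adjacent symbol pairs across the corpus and return
    the most frequent one (the next merge candidate)."""
    pairs = Counter()
    for symbols in words:
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += 1
    return pairs.most_common(1)[0][0]

def merge(words: list[list[str]],
          pair: tuple[str, str]) -> list[list[str]]:
    """Replace every occurrence of the pair with one merged symbol."""
    merged = []
    for symbols in words:
        out, i = [], 0
        while i < len(symbols):
            if symbols[i:i + 2] == list(pair):
                out.append(pair[0] + pair[1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged.append(out)
    return merged

corpus = [list("low"), list("lower"), list("lot")]
pair = most_frequent_pair(corpus)  # ('l', 'o') occurs 3 times
corpus = merge(corpus, pair)       # e.g. ['lo', 'w'], ['lo', 'w', 'e', 'r'], ...
```

Each merge adds one entry to the vocabulary, so frequent character sequences become single tokens while rare words stay split into smaller pieces.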
Tokenizer Vocabulary
The complete set of tokens (words, subwords, characters) that a tokenizer can recognize and map to numerical IDs. Vocabulary size affects model efficiency and multilingual capability.
Tool Use
The ability of an AI model to interact with external tools, APIs, and systems to accomplish tasks beyond text generation. Tools extend the model's capabilities to include search, calculation, code execution, and more.
Top-k Sampling
A text generation method where the model only considers the k most likely next tokens at each step, ignoring all others. This limits the pool of candidates to the most probable options.
Top-p Sampling
A text generation method (also called nucleus sampling) where the model considers only the smallest set of tokens whose cumulative probability exceeds the threshold p. This balances diversity and quality.
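The nucleus-filtering step can be sketched over a toy next-token distribution — keep the highest-probability tokens until their cumulative mass reaches p, then renormalize and sample from that set:

```python
def nucleus_filter(probs: dict[str, float], p: float) -> dict[str, float]:
    """Keep the smallest set of tokens whose cumulative probability
    reaches p, then renormalize so the kept probabilities sum to 1."""
    kept, total = {}, 0.0
    for token, prob in sorted(probs.items(),
                              key=lambda kv: kv[1], reverse=True):
        kept[token] = prob
        total += prob
        if total >= p:
            break
    return {t: pr / total for t, pr in kept.items()}

probs = {"the": 0.5, "a": 0.3, "zebra": 0.15, "qua": 0.05}
filtered = nucleus_filter(probs, 0.8)  # keeps "the" and "a" only
```

Unlike top-k, the number of kept tokens adapts to the shape of the distribution: a confident model keeps few candidates, an uncertain one keeps many.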
TPU
Tensor Processing Unit — Google's custom-designed chip specifically optimized for machine learning workloads. TPUs are designed for matrix operations that are fundamental to neural network computation.
Transformer
A neural network architecture introduced in 2017 that uses self-attention mechanisms to process sequential data in parallel rather than sequentially. Transformers are the foundation of modern LLMs like GPT, Claude, and Gemini.
Transformer Architecture
The full stack of components that make up a transformer model: multi-head self-attention, feed-forward networks, layer normalization, residual connections, and positional encodings.
Tree of Thought
A prompting framework where the model explores multiple reasoning branches, evaluates intermediate states, and can backtrack from dead ends — like a deliberate tree search through thought space.
V
Variational Autoencoder
A generative model that learns a compressed, lower-dimensional representation (latent space) of input data and can generate new data by sampling from this learned space.
Vector Search
The process of finding the most similar vectors in a vector database to a given query vector. It enables retrieving semantically similar content at scale.
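A brute-force sketch by cosine similarity (production vector databases use approximate nearest-neighbor indexes such as HNSW or IVF to scale; the document IDs and vectors here are toy data):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def vector_search(query: list[float],
                  index: dict[str, list[float]], k: int = 1) -> list[str]:
    """Score every stored vector against the query and return
    the IDs of the k most similar ones."""
    scored = sorted(index.items(),
                    key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

index = {
    "doc_cats": [0.9, 0.1],
    "doc_cars": [0.1, 0.9],
}
print(vector_search([0.8, 0.2], index))  # ['doc_cats']
```

Exact search like this is O(n) per query; approximate indexes trade a small amount of recall for orders-of-magnitude speedups at scale.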
Vision-Language Model
An AI model that can process both visual and textual inputs, understanding images and generating text about them. VLMs combine computer vision with language understanding.
Voice Cloning
AI technology that creates a synthetic replica of a specific person's voice from a small sample of their speech. Cloned voices can speak any text in the original person's vocal characteristics.
W
Weights and Biases
A popular MLOps platform for experiment tracking, model monitoring, dataset versioning, and collaboration in machine learning development.
Whisper
OpenAI's open-source automatic speech recognition model that can transcribe and translate speech in multiple languages with high accuracy.
Z
Zero-Shot Classification
Classifying text into categories that the model was never explicitly trained on, using only the category names or descriptions as guidance.
Zero-Shot Learning
A model's ability to perform a task it was never explicitly trained on or shown examples of. The model applies its general knowledge and reasoning to handle entirely new task types.
Zero-Shot Prompting
Giving an LLM a task instruction without any examples, relying entirely on the model's pre-trained knowledge and instruction-following ability to perform the task.