Foundations of AI Research
Every field has its language. These are the twenty-four core terms that every serious AI researcher must know: the ones that show up in papers, talks, and production code alike.
Model
A mathematical function or system that maps inputs to outputs. In AI, models are trained to approximate patterns in data.
Dataset
A structured collection of examples used to train, validate, and test models. Quality and diversity matter more than size alone.
Training
The process of adjusting model parameters to minimize error on a dataset.
Loss Function
The numeric measure of how far a model’s predictions deviate from desired outputs. Lower is better.
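A minimal sketch in NumPy, assuming mean squared error as the loss (the function name is illustrative):

```python
import numpy as np

def mse_loss(predictions, targets):
    # Mean squared error: average squared gap between prediction and target.
    return np.mean((predictions - targets) ** 2)

print(mse_loss(np.array([2.5, 0.0]), np.array([3.0, -0.5])))  # 0.25
```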
Gradient Descent
The optimization method that iteratively adjusts parameters in the direction that reduces the loss function.
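A sketch of the idea on a one-dimensional toy loss, f(w) = (w - 3)^2, whose minimum sits at w = 3 (the learning rate and step count are illustrative):

```python
w = 0.0              # starting parameter value
learning_rate = 0.1

for step in range(50):
    gradient = 2 * (w - 3)         # derivative of (w - 3)**2 at the current w
    w -= learning_rate * gradient  # step against the gradient to lower the loss

print(w)  # converges toward 3.0, the minimum
```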
Backpropagation
The algorithm that computes gradients through layers of a neural network, enabling training.
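A hand-rolled NumPy sketch of the chain rule through a tiny two-layer network (the shapes and the ReLU are illustrative; real frameworks automate this):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))            # batch of 4 inputs, 3 features each
y = rng.normal(size=(4, 1))            # targets
W1 = rng.normal(size=(3, 5))
W2 = rng.normal(size=(5, 1))

# Forward pass: two layers with a ReLU in between.
h = np.maximum(0, x @ W1)
pred = h @ W2
loss = np.mean((pred - y) ** 2)

# Backward pass: apply the chain rule layer by layer, output to input.
grad_pred = 2 * (pred - y) / y.size    # dLoss/dPred
grad_W2 = h.T @ grad_pred              # dLoss/dW2
grad_h = grad_pred @ W2.T              # dLoss/dH
grad_h[h <= 0] = 0                     # ReLU passes gradient only where it fired
grad_W1 = x.T @ grad_h                 # dLoss/dW1
```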
Weights & Biases
The internal parameters of a neural network that determine how inputs are transformed at each layer.
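A sketch of one layer's transformation (the values are illustrative): the weights mix and scale the inputs, the bias shifts the result.

```python
import numpy as np

def linear_layer(x, W, b):
    # Weights W mix and scale the inputs; bias b shifts each output.
    return x @ W + b

x = np.array([1.0, 2.0])
W = np.array([[0.5, -1.0],
              [0.3,  0.8]])
b = np.array([0.1, 0.0])
print(linear_layer(x, W, b))  # [1.2  0.6]
```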
Activation Function
The non-linear transformation applied to neuron outputs; examples include ReLU, Sigmoid, and GELU.
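Sketches of the three named here, in NumPy (the GELU uses its common tanh approximation):

```python
import numpy as np

def relu(x):
    # Zero out negatives, pass positives through unchanged.
    return np.maximum(0, x)

def sigmoid(x):
    # Squash any real number into the range (0, 1).
    return 1 / (1 + np.exp(-x))

def gelu(x):
    # Tanh approximation of the Gaussian Error Linear Unit.
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))
```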
Epoch
One complete pass through the training dataset during learning.
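A sketch of one epoch as shuffled mini-batches (update_fn is a hypothetical callback that takes one gradient step):

```python
import numpy as np

def run_epoch(X, y, batch_size, update_fn):
    # One epoch: every training example is seen exactly once.
    order = np.random.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        update_fn(X[idx], y[idx])  # one parameter update per mini-batch
```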
Overfitting
When a model learns patterns specific to the training data but fails to generalize to new data.
Regularization
Techniques (like dropout or weight decay) that reduce overfitting by constraining what the model can learn.
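Minimal sketches of both named techniques (the hyperparameter values are illustrative):

```python
import numpy as np

def sgd_step_with_weight_decay(w, grad, lr=0.01, lam=1e-4):
    # Weight decay (L2): the lam * w term pulls weights toward zero,
    # discouraging the large weights that overfit models tend to grow.
    return w - lr * (grad + lam * w)

def dropout(h, drop_prob=0.5, training=True):
    # Dropout: randomly zero activations during training so no single
    # neuron is relied on; survivors are rescaled to preserve the mean.
    if not training:
        return h
    mask = (np.random.rand(*h.shape) > drop_prob) / (1 - drop_prob)
    return h * mask
```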
Batch Normalization
A method to stabilize and speed up training by normalizing activations within a batch.
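A sketch of the training-time forward pass (at inference, running averages replace the per-batch statistics):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature to zero mean and unit variance within the
    # batch, then let the learned gamma and beta rescale and reshift.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```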
Transformer
A neural architecture using self-attention to model long-range dependencies in sequences; the foundation of modern LLMs.
Attention Mechanism
A way for models to weigh different input elements based on their relevance to the current task.
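A sketch of scaled dot-product attention, the variant at the heart of the Transformer (single head, unbatched, in NumPy):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Scores measure each query's relevance to each key; softmax turns
    # them into weights; the output is a weighted mix of the values.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```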
Embedding
A dense vector representation of discrete data (like words or images) that captures semantic relationships.
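A sketch of an embedding lookup (the vocabulary and vectors here are random stand-ins; training is what makes related items land close together):

```python
import numpy as np

vocab = {"cat": 0, "dog": 1, "car": 2, "truck": 3, "the": 4}
embedding_table = np.random.randn(len(vocab), 3)  # one row per word

def embed(word):
    # An embedding is a learned table row: word in, dense vector out.
    return embedding_table[vocab[word]]

def cosine_similarity(a, b):
    # After training, semantically related words score near 1.0 here.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
```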
Fine-Tuning
Adapting a pretrained model to a new dataset or task by continuing its training with smaller, task-specific updates.
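One common pattern, sketched with PyTorch (the backbone here is a stand-in; in practice you would load pretrained weights): freeze the pretrained layers and train only a new task head.

```python
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))
head = nn.Linear(128, 10)  # new task-specific output layer

for param in backbone.parameters():
    param.requires_grad = False  # freeze the pretrained weights

model = nn.Sequential(backbone, head)
# During training, gradients flow only into head's parameters.
```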
Inference
The stage where a trained model generates outputs or predictions on new, unseen inputs.
Parameter-Efficient Fine-Tuning (PEFT)
A family of methods, such as LoRA, that adapt large models efficiently by training only a small set of new or existing parameters while the pretrained weights stay frozen.
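A NumPy sketch of the LoRA idea (the dimensions and scaling are illustrative): the pretrained matrix W stays frozen, and only the small low-rank factors A and B train.

```python
import numpy as np

d, r = 512, 8                      # model width and low-rank bottleneck
W = np.random.randn(d, d)          # frozen pretrained weights (stand-in)
A = np.random.randn(r, d) * 0.01   # trainable, small random init
B = np.zeros((d, r))               # trainable, zero init: no change at start

def lora_forward(x, alpha=16):
    # Trains 2*d*r values instead of d*d; here 8,192 instead of 262,144.
    return x @ (W + (alpha / r) * B @ A).T
```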
Quantization
Reducing numerical precision (e.g., 32-bit → 8-bit) to make models faster and smaller with minimal accuracy loss.
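A sketch of symmetric 8-bit quantization in NumPy (production schemes add per-channel scales, calibration, and more):

```python
import numpy as np

def quantize_int8(x):
    # Map floats onto integer levels; keep the scale for reconstruction.
    scale = max(np.abs(x).max(), 1e-12) / 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Approximate the original floats; the gap is the quantization error.
    return q.astype(np.float32) * scale
```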
Prompting
Crafting text or structured inputs to guide language models toward specific behaviors or outputs.
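A sketch of a few-shot prompt template (the task and examples are hypothetical): the worked examples steer the model toward the desired format.

```python
def build_prompt(review):
    # Two labeled examples, then the real input in the same format.
    return (
        "Classify the sentiment of each review as positive or negative.\n"
        "Review: The plot dragged on forever. Sentiment: negative\n"
        "Review: A stunning, heartfelt film. Sentiment: positive\n"
        f"Review: {review} Sentiment:"
    )
```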
Evaluation Metric
The criteria used to measure model performance; examples include accuracy, F1 score, BLEU, and perplexity.
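Sketches of two of the metrics named above, computed from scratch:

```python
def accuracy(preds, labels):
    # Fraction of predictions that exactly match their labels.
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

def f1_score(preds, labels, positive=1):
    # Harmonic mean of precision and recall for the positive class.
    tp = sum(p == positive and l == positive for p, l in zip(preds, labels))
    fp = sum(p == positive and l != positive for p, l in zip(preds, labels))
    fn = sum(p != positive and l == positive for p, l in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0
```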
Inference Pipeline
The runtime environment and sequence of steps where a trained model receives inputs, processes them, and returns results.
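A framework-agnostic sketch (the callables are hypothetical stand-ins); most serving stacks reduce to these three stages:

```python
def inference_pipeline(raw_input, preprocess, model, postprocess):
    features = preprocess(raw_input)  # e.g., tokenize, normalize, batch
    outputs = model(features)         # forward pass only, no weight updates
    return postprocess(outputs)       # e.g., decode ids, apply thresholds
```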
Scalability
The ability of models, data pipelines, and compute systems to handle increasing workloads efficiently.
Alignment
Ensuring that AI systems behave according to human values, intentions, and safety expectations.