Foundations of AI Research
Every field has its language. These are the twenty-four core terms that every serious AI researcher must know: the ones that show up in papers, talks, and production code alike.
Model
A mathematical function or system that maps inputs to outputs. In AI, models are trained to approximate patterns in data.
Dataset
A structured collection of examples used to train, validate, and test models. Quality and diversity matter more than size alone.
Training
The process of adjusting model parameters to minimize error on a dataset.
Loss Function
The numeric measure of how far a model’s predictions deviate from desired outputs. Lower is better.
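A minimal sketch in NumPy, assuming mean squared error as the loss (the function name is illustrative):

```python
import numpy as np

def mse_loss(predictions, targets):
    # Mean squared error: average squared gap between prediction and target.
    return np.mean((predictions - targets) ** 2)

print(mse_loss(np.array([2.5, 0.0]), np.array([3.0, -0.5])))  # 0.25
```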
Gradient Descent
The optimization method that iteratively adjusts parameters in the direction that reduces the loss function.
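A sketch of the idea on a one-dimensional toy loss, f(w) = (w - 3)^2, whose minimum sits at w = 3 (the learning rate and step count are illustrative):

```python
w = 0.0              # starting parameter value
learning_rate = 0.1

for step in range(50):
    gradient = 2 * (w - 3)         # derivative of (w - 3)**2 at the current w
    w -= learning_rate * gradient  # step against the gradient to lower the loss

print(w)  # converges toward 3.0, the minimum
```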
Backpropagation
The algorithm that computes gradients through layers of a neural network, enabling training.
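A hand-rolled NumPy sketch of the chain rule through a tiny two-layer network (the shapes and the ReLU are illustrative; real frameworks automate this):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))            # batch of 4 inputs, 3 features each
y = rng.normal(size=(4, 1))            # targets
W1 = rng.normal(size=(3, 5))
W2 = rng.normal(size=(5, 1))

# Forward pass: two layers with a ReLU in between.
h = np.maximum(0, x @ W1)
pred = h @ W2
loss = np.mean((pred - y) ** 2)

# Backward pass: apply the chain rule layer by layer, output to input.
grad_pred = 2 * (pred - y) / y.size    # dLoss/dPred
grad_W2 = h.T @ grad_pred              # dLoss/dW2
grad_h = grad_pred @ W2.T              # dLoss/dH
grad_h[h <= 0] = 0                     # ReLU passes gradient only where it fired
grad_W1 = x.T @ grad_h                 # dLoss/dW1
```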
Weights & Biases
The internal parameters of a neural network that determine how inputs are transformed at each layer.
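A sketch of one layer's transformation (the values are illustrative): the weights mix and scale the inputs, the bias shifts the result.

```python
import numpy as np

def linear_layer(x, W, b):
    # Weights W mix and scale the inputs; bias b shifts each output.
    return x @ W + b

x = np.array([1.0, 2.0])
W = np.array([[0.5, -1.0],
              [0.3,  0.8]])
b = np.array([0.1, 0.0])
print(linear_layer(x, W, b))  # [1.2  0.6]
```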
Activation Function
The non-linear transformation applied to neuron outputs; examples include ReLU, Sigmoid, and GELU.
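Sketches of the three named here, in NumPy (the GELU uses its common tanh approximation):

```python
import numpy as np

def relu(x):
    # Zero out negatives, pass positives through unchanged.
    return np.maximum(0, x)

def sigmoid(x):
    # Squash any real number into the range (0, 1).
    return 1 / (1 + np.exp(-x))

def gelu(x):
    # Tanh approximation of the Gaussian Error Linear Unit.
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))
```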
Epoch
One complete pass through the training dataset during learning.
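A sketch of one epoch as shuffled mini-batches (update_fn is a hypothetical callback that takes one gradient step):

```python
import numpy as np

def run_epoch(X, y, batch_size, update_fn):
    # One epoch: every training example is seen exactly once.
    order = np.random.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        update_fn(X[idx], y[idx])  # one parameter update per mini-batch
```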
Overfitting
When a model learns patterns specific to the training data but fails to generalize to new data.
Regularization
Techniques (like dropout or weight decay) that reduce overfitting by constraining what the model can learn.
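Minimal sketches of both named techniques (the hyperparameter values are illustrative):

```python
import numpy as np

def sgd_step_with_weight_decay(w, grad, lr=0.01, lam=1e-4):
    # Weight decay (L2): the lam * w term pulls weights toward zero,
    # discouraging the large weights that overfit models tend to grow.
    return w - lr * (grad + lam * w)

def dropout(h, drop_prob=0.5, training=True):
    # Dropout: randomly zero activations during training so no single
    # neuron is relied on; survivors are rescaled to preserve the mean.
    if not training:
        return h
    mask = (np.random.rand(*h.shape) > drop_prob) / (1 - drop_prob)
    return h * mask
```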
Batch Normalization
A method to stabilize and speed up training by normalizing activations within a batch.
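A sketch of the training-time forward pass (at inference, running averages replace the per-batch statistics):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature to zero mean and unit variance within the
    # batch, then let the learned gamma and beta rescale and reshift.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```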
Transformer
A neural architecture using self-attention to model long-range dependencies in sequences; the foundation of modern LLMs.
Attention Mechanism
A way for models to weigh different input elements based on their relevance to the current task.
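A sketch of scaled dot-product attention, the variant at the heart of the Transformer (single head, unbatched, in NumPy):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Scores measure each query's relevance to each key; softmax turns
    # them into weights; the output is a weighted mix of the values.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```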
Embedding
A dense vector representation of discrete data (like words or images) that captures semantic relationships.
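A sketch of an embedding lookup (the vocabulary and vectors here are random stand-ins; training is what makes related items land close together):

```python
import numpy as np

vocab = {"cat": 0, "dog": 1, "car": 2, "truck": 3, "the": 4}
embedding_table = np.random.randn(len(vocab), 3)  # one row per word

def embed(word):
    # An embedding is a learned table row: word in, dense vector out.
    return embedding_table[vocab[word]]

def cosine_similarity(a, b):
    # After training, semantically related words score near 1.0 here.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
```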
Fine-Tuning
Adapting a pretrained model to a new dataset or task by continuing its training with smaller, task-specific updates.
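One common pattern, sketched with PyTorch (the backbone here is a stand-in; in practice you would load pretrained weights): freeze the pretrained layers and train only a new task head.

```python
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))
head = nn.Linear(128, 10)  # new task-specific output layer

for param in backbone.parameters():
    param.requires_grad = False  # freeze the pretrained weights

model = nn.Sequential(backbone, head)
# During training, gradients flow only into head's parameters.
```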
Inference
The stage where a trained model generates outputs or predictions on new, unseen inputs.
Parameter-Efficient Fine-Tuning (PEFT)
A family of methods, such as LoRA, that adapt large models efficiently by training only a small set of new or existing parameters while the pretrained weights stay frozen.
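A NumPy sketch of the LoRA idea (the dimensions and scaling are illustrative): the pretrained matrix W stays frozen, and only the small low-rank factors A and B train.

```python
import numpy as np

d, r = 512, 8                      # model width and low-rank bottleneck
W = np.random.randn(d, d)          # frozen pretrained weights (stand-in)
A = np.random.randn(r, d) * 0.01   # trainable, small random init
B = np.zeros((d, r))               # trainable, zero init: no change at start

def lora_forward(x, alpha=16):
    # Trains 2*d*r values instead of d*d; here 8,192 instead of 262,144.
    return x @ (W + (alpha / r) * B @ A).T
```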
Quantization
Reducing numerical precision (e.g., 32-bit → 8-bit) to make models faster and smaller with minimal accuracy loss.
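A sketch of symmetric 8-bit quantization in NumPy (production schemes add per-channel scales, calibration, and more):

```python
import numpy as np

def quantize_int8(x):
    # Map floats onto integer levels; keep the scale for reconstruction.
    scale = max(np.abs(x).max(), 1e-12) / 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Approximate the original floats; the gap is the quantization error.
    return q.astype(np.float32) * scale
```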
Prompting
Crafting text or structured inputs to guide language models toward specific behaviors or outputs.
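A sketch of a few-shot prompt template (the task and examples are hypothetical): the worked examples steer the model toward the desired format.

```python
def build_prompt(review):
    # Two labeled examples, then the real input in the same format.
    return (
        "Classify the sentiment of each review as positive or negative.\n"
        "Review: The plot dragged on forever. Sentiment: negative\n"
        "Review: A stunning, heartfelt film. Sentiment: positive\n"
        f"Review: {review} Sentiment:"
    )
```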
Evaluation Metric
The criteria used to measure model performance; examples include accuracy, F1 score, BLEU, and perplexity.
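Sketches of two of the metrics named above, computed from scratch:

```python
def accuracy(preds, labels):
    # Fraction of predictions that exactly match their labels.
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

def f1_score(preds, labels, positive=1):
    # Harmonic mean of precision and recall for the positive class.
    tp = sum(p == positive and l == positive for p, l in zip(preds, labels))
    fp = sum(p == positive and l != positive for p, l in zip(preds, labels))
    fn = sum(p != positive and l == positive for p, l in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0
```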
Inference Pipeline
The runtime environment and sequence of steps where a trained model receives inputs, processes them, and returns results.
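A framework-agnostic sketch (the callables are hypothetical stand-ins); most serving stacks reduce to these three stages:

```python
def inference_pipeline(raw_input, preprocess, model, postprocess):
    features = preprocess(raw_input)  # e.g., tokenize, normalize, batch
    outputs = model(features)         # forward pass only, no weight updates
    return postprocess(outputs)       # e.g., decode ids, apply thresholds
```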
Scalability
The ability of models, data pipelines, and compute systems to handle increasing workloads efficiently.
Alignment
Ensuring that AI systems behave according to human values, intentions, and safety expectations.