Glossary
Every term we use at Ephizen, defined in plain English with real context.
A
- Activation Function (ReLU · GELU · Sigmoid) [Deep Learning]
A non-linear function applied element-wise to a neuron's output so a stack of layers can represent more than a single linear map.
- Agent Loop [Agents]
The control loop an LLM agent runs — call the model, execute any tool calls it returns, feed the results back, repeat until it stops asking for tools.
- Attention [Deep Learning]
A mechanism that lets a model weigh which other tokens in a sequence matter most when computing a representation for the current one.
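The agent loop described above fits in a few lines. This is a minimal sketch, not a real client: `call_model` and `execute_tool` are hypothetical stand-ins for an actual LLM API and tool dispatcher, and the message format is an assumption.

```python
def run_agent(call_model, execute_tool, user_message, max_turns=10):
    """Minimal agent loop: call the model, execute any tool calls it
    returns, feed the results back, and stop once it stops asking."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        reply = call_model(messages)             # one LLM call
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:                       # no tools requested: done
            return reply["content"]
        for call in tool_calls:                  # run each requested tool
            result = execute_tool(call["name"], call["args"])
            messages.append({"role": "tool", "name": call["name"],
                             "content": result})
    raise RuntimeError("agent did not finish within max_turns")
```

The `max_turns` cap matters in practice: without it, a model that keeps requesting tools loops forever.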
B
- Backpropagation (backprop) [Deep Learning]
The algorithm that computes gradients of the loss with respect to every weight in a network by applying the chain rule backwards.
- Batch Normalization (BatchNorm · BN) [Deep Learning]
A layer that normalizes activations across a mini-batch so each feature has roughly zero mean and unit variance, then rescales them with learned parameters.
- Beam Search [LLMs]
A decoding strategy that keeps the top-k highest-probability partial sequences at each step instead of greedily picking one.
- Bias [Classical ML]
The statistical error introduced when a model is too simple to capture the true structure of the data — not the societal fairness kind.
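Beam search is easiest to see on a toy model. In this sketch the `next_log_probs` callback is a hypothetical stand-in for a real language model; it maps a partial sequence to a token → log-probability dict.

```python
def beam_search(next_log_probs, beam_width, max_len, bos=0):
    """Keep the top-`beam_width` partial sequences by total
    log-probability at every step, instead of greedily committing
    to the single best token."""
    beams = [([bos], 0.0)]                       # (sequence, score)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, logp in next_log_probs(seq).items():
                candidates.append((seq + [token], score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]          # prune to the best k
    return beams
```

The point of keeping k beams: a token that looks best locally can lead to weak continuations, and beam search can back out of that trap where greedy decoding cannot.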
C
- Chain of Thought (CoT) [LLMs]
Prompting an LLM to write out its reasoning step by step before giving a final answer, which usually improves accuracy on multi-step problems.
- Chunking [RAG]
Splitting documents into smaller, retrievable pieces before embedding them so retrieval returns the right span instead of the wrong book.
- Context Window (context length) [LLMs]
The maximum number of tokens an LLM can consider in a single forward pass — prompt plus generated output.
- Convolutional Neural Network (ConvNet · CNN) [Vision]
A neural network architecture that uses learned convolution filters over local regions of an input grid — images, audio spectrograms, or any tensor with spatial structure.
- Cross-Validation (CV · k-fold) [Classical ML]
A resampling technique that estimates how well a model will generalize by training and evaluating on multiple data splits.
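The splitting step behind k-fold cross-validation can be sketched with indices alone; fitting and scoring a model on each split is left out.

```python
def k_fold_indices(n, k):
    """Yield (train, validation) index lists for k-fold cross-validation.
    Every point lands in the validation fold exactly once."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    indices = list(range(n))
    start = 0
    for size in sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size
```

Averaging the validation score over all k folds gives a generalization estimate that is far less sensitive to one lucky or unlucky split.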
D
- Diffusion Model [Deep Learning]
A generative model that learns to reverse a gradual noising process — you train it to denoise, then sample by iteratively denoising pure noise.
- Drift (data drift · concept drift) [MLOps]
When the data or relationships a model sees in production move away from what it was trained on, quietly degrading performance.
- Dropout [Deep Learning]
A regularization trick that randomly zeros out a fraction of activations during training so the network can't depend on any single neuron.
- Knowledge Distillation (Distillation) [MLOps]
Training a smaller "student" model to mimic the outputs of a larger "teacher" so you get most of the quality at a fraction of the cost.
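The dropout entry in code. This sketch assumes the common "inverted" variant, which scales survivors during training so inference needs no change; the entry itself doesn't name a variant.

```python
import random

def dropout(activations, p, training=True):
    """Zero each activation with probability p during training and
    scale survivors by 1/(1-p) so the expected value is unchanged;
    at inference the layer is the identity."""
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if random.random() < keep else 0.0
            for a in activations]
```

Because of the 1/(1-p) scaling, the layer can simply be switched off at inference time — no weight rescaling required.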
E
- Embedding (embeddings) [Deep Learning]
A dense vector that encodes the meaning of a piece of data so similar things land near each other in vector space.
- Encoder [Deep Learning]
The part of a model that reads an input sequence and produces a fixed-size or per-token vector representation for downstream use.
- Epoch [Classical ML]
One full pass of the training algorithm over the entire training dataset.
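"Near each other in vector space" is usually measured with cosine similarity; a minimal version:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors: 1.0 for the
    same direction, 0.0 for orthogonal, -1.0 for opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Cosine ignores vector length and compares direction only, which is why it's the default for comparing embeddings of different texts.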
F
- F1 Score (F1) [Eval]
The harmonic mean of precision and recall — a single number that punishes you for being lopsided on either.
- Feature Engineering [Classical ML]
The craft of deriving inputs that make a model's job easier — usually by encoding domain knowledge the raw data doesn't surface.
- Few-Shot Learning [LLMs]
Teaching a model a new task by giving it a handful of example input/output pairs in the prompt rather than by fine-tuning.
- Fine-Tuning [LLMs]
Continuing training on a pretrained model with a smaller, task-specific dataset to specialize its behavior.
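The F1 entry made concrete from raw confusion counts:

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, computed from true
    positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

The "punishes lopsidedness" point: precision 1.0 with recall 0.1 gives F1 ≈ 0.18, nowhere near the arithmetic mean's flattering 0.55.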
G
- Generative Adversarial Network (GAN) [Deep Learning]
Two networks trained against each other — a generator that creates fake samples and a discriminator that tries to tell them from real ones.
- Gradient Descent (SGD) [Deep Learning]
An iterative optimization method that nudges parameters in the opposite direction of the gradient to reduce a loss.
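The gradient-descent entry in one loop, minimizing a toy quadratic:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient to shrink the loss."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x
```

For the loss (x - 3)², the gradient is 2(x - 3), so each step multiplies the distance to the minimum by (1 - 2·lr) — a concrete view of why the learning rate controls convergence.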
K
- k-Means Clustering [Classical ML]
An unsupervised algorithm that partitions n points into k clusters by iteratively assigning points to the nearest centroid and moving centroids to the mean of their cluster.
- KL Divergence (Kullback–Leibler divergence) [Math]
An asymmetric measure of how much one probability distribution differs from another — zero when they match, larger as they diverge.
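For discrete distributions the KL-divergence entry is one sum, and the asymmetry is easy to check directly:

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i). Zero when the
    distributions match; note it is not symmetric in p and q."""
    return sum(pi * math.log(pi / qi)
               for pi, qi in zip(p, q) if pi > 0.0)
```

The `pi > 0.0` guard uses the convention 0·log 0 = 0; terms where p puts no mass contribute nothing.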
L
- Latent Space [Deep Learning]
The lower-dimensional vector space a model compresses its inputs into, where similar things end up near each other.
- Learning Rate [Deep Learning]
The scalar that controls how big a step the optimizer takes against the gradient — the single most important training hyperparameter.
- LoRA (Low-Rank Adaptation) [LLMs]
A parameter-efficient fine-tuning method that freezes the base model and trains small low-rank update matrices on top.
- Loss Function (objective · cost function) [Deep Learning]
A scalar score that measures how wrong a model's predictions are — the thing the optimizer tries to make smaller.
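One concrete loss function for the entry above — mean squared error, the standard choice for regression:

```python
def mse_loss(predictions, targets):
    """Mean squared error: the average squared gap between prediction
    and target. Perfect predictions score 0.0; training pushes this
    number down."""
    assert len(predictions) == len(targets)
    return (sum((p - t) ** 2 for p, t in zip(predictions, targets))
            / len(predictions))
```

Squaring makes the loss differentiable everywhere and penalizes large errors much more heavily than small ones.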
M
- Mixture of Experts (MoE) [Deep Learning]
An architecture where only a few specialized sub-networks ("experts") fire for any given input, chosen by a learned router.
- Model Context Protocol (MCP) [Agents]
An open protocol for connecting LLM applications to tools, data sources, and prompts in a standard way — like USB for AI tool integrations.
- Multi-Head Attention [Deep Learning]
Running several attention operations in parallel with different learned projections so the model can attend to multiple relationships at once.
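A toy sketch of the MoE routing step. `experts` and `gate` are hypothetical callables standing in for trained sub-networks and the learned router; real MoE layers operate on tensors, not scalars.

```python
import math

def moe_forward(x, experts, gate, k=2):
    """Score every expert with the gate, keep the top-k, renormalize
    their scores with a softmax, and mix the chosen experts' outputs."""
    scores = gate(x)                                  # one logit per expert
    top = sorted(range(len(experts)),
                 key=lambda i: scores[i], reverse=True)[:k]
    exps = [math.exp(scores[i]) for i in top]
    weights = [e / sum(exps) for e in exps]           # softmax over top-k
    return sum(w * experts[i](x) for w, i in zip(weights, top))
```

The efficiency win is that only k of the experts run per input, so a model can hold many more parameters than any single forward pass touches.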
P
- Perplexity (PPL) [Eval]
An intrinsic LLM metric — the exponentiated average negative log-likelihood the model assigns to held-out text. Lower is better.
- Precision and Recall [Eval]
Two complementary metrics for classification — precision is "of what I flagged, how much was right"; recall is "of what was actually there, how much did I catch".
- Prompt Injection [LLMs]
An attack where malicious instructions hidden in untrusted input override the developer's prompt and steer the model into doing something it shouldn't.
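The two quoted questions in the precision-and-recall entry, computed literally over sets of item IDs:

```python
def precision_recall(flagged, actual):
    """Precision: of what I flagged, how much was right.
    Recall: of what was actually there, how much did I catch."""
    flagged, actual = set(flagged), set(actual)
    true_positives = len(flagged & actual)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall
```

Flagging everything drives recall to 1.0 while precision collapses, and vice versa — which is exactly the trade-off the F1 score exists to summarize.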
R
- ReAct [Agents]
An agent pattern that interleaves reasoning ("think") with tool calls ("act") so the model can observe results and adjust mid-task.
- Regularization [Classical ML]
Anything you add to training that discourages the model from fitting noise — usually a penalty on weight magnitude or randomness in the forward pass.
- Reranker (cross-encoder) [RAG]
A second-stage model that re-scores the top results from a fast retriever to push the most relevant ones to the top.
- Retrieval-Augmented Generation (RAG) [RAG]
A pattern where you retrieve relevant documents at query time and stuff them into the LLM's prompt so it can ground its answer in real sources.
- RLHF (Reinforcement Learning from Human Feedback) [LLMs]
Fine-tuning a language model using a reward model trained on human preference data, with reinforcement learning to optimize for the reward.
- ROC-AUC (AUC · AUROC) [Eval]
The area under the ROC curve — a threshold-independent measure of how well a classifier ranks positives above negatives.
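The RAG entry end to end, minus the model call: retrieve by similarity, then stuff the winners into the prompt. The embeddings here are pre-computed toy vectors; a real system would produce them with an embedding model.

```python
def retrieve(query_vec, corpus, top_k=2):
    """corpus: list of (text, embedding) pairs. Rank passages by cosine
    similarity to the query embedding and return the top_k texts."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (sum(a * a for a in u) ** 0.5
                      * sum(b * b for b in v) ** 0.5)
    ranked = sorted(corpus, key=lambda d: cos(query_vec, d[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]

def build_rag_prompt(question, passages):
    """Ground the model by pasting the retrieved passages into the prompt."""
    context = "\n\n".join(passages)
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")
```

A production pipeline would typically add a reranker between `retrieve` and `build_rag_prompt` to re-score the candidates before they reach the model.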
T
- Temperature [LLMs]
A scalar that divides logits before softmax at sampling time — lower temperature makes the model more deterministic, higher makes it more random.
- Token [LLMs]
The atomic unit an LLM actually reads and writes — usually a sub-word fragment, not a whole word.
- Tokenizer [LLMs]
The component that converts text to and from the integer token IDs an LLM actually consumes.
- Transfer Learning [Deep Learning]
Taking a model pretrained on a large general dataset and reusing it — with or without further training — on a different, usually smaller, task.
- Transformer [Deep Learning]
A neural network architecture built on stacked self-attention and feed-forward layers — the backbone of every modern LLM.
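The temperature entry as the literal logit transformation (the sampling step itself is omitted):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then softmax. Low temperature
    sharpens the distribution toward the argmax; high temperature
    flattens it toward uniform."""
    scaled = [l / temperature for l in logits]
    top = max(scaled)                       # subtract max for stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

At temperature → 0 the distribution collapses onto the highest logit (greedy decoding); at very high temperatures every token becomes nearly equally likely.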
V
- Vanishing Gradient [Deep Learning]
When gradients shrink exponentially as they propagate back through many layers, so early layers barely update and the network won't train.
- Vector Database (vector store) [RAG]
A database whose primary index is built for fast approximate nearest-neighbour search over high-dimensional embeddings.
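The vanishing-gradient entry can be demonstrated numerically: the sigmoid's derivative never exceeds 0.25, so chaining it through many layers shrinks the gradient exponentially. (This sketch ignores the weight matrices, which can worsen or offset the effect.)

```python
import math

def sigmoid_derivative(x):
    """d/dx sigmoid(x) = s * (1 - s); its maximum is 0.25, at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def gradient_after_depth(depth, x=0.0):
    """Chain-rule product of `depth` sigmoid derivatives, as backprop
    would accumulate through a deep stack of sigmoid activations."""
    factor = 1.0
    for _ in range(depth):
        factor *= sigmoid_derivative(x)
    return factor
```

This shrinkage is a major reason modern networks prefer ReLU-family activations (derivative 1 on the active side) and residual connections.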