Become an AI/ML engineer.
Ephizen Wiki is the handbook we use to onboard engineers and the reference we send to candidates. It tracks how we think about foundations, models, tools, and the papers worth reading — organised as concrete roadmaps from zero to shipped.
Choose your roadmap
1 active · 5 upcoming

Builds LLM-powered products. Strong at API integration, RAG, agents, and shipping reliable AI features into real software.
Designs autonomous systems that perceive, reason, plan, and act. Specializes in agent loops, memory, tool use, and evaluation.
Lives in SQL and BI tools. Turns operational data into dashboards, reports, and answers the rest of the company can use.
Asks the right questions, runs the right experiments, and turns data into decisions the business can act on.
Trains, deploys, and maintains models in production. Bridges Data Science and SWE.
Builds the platform every other ML role runs on. CI/CD, infrastructure, observability, and cost control for machine learning.
Guides
The calculus you need to understand what backprop is actually doing. Derivatives, gradients, and the chain rule — the rest you can Google.
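The chain rule the guide refers to is easy to sanity-check numerically. A minimal sketch, using a made-up function f(x) = sin(x²) chosen purely for illustration: differentiate the outer function, multiply by the derivative of the inner one, and compare against a finite difference — which is all backprop does, layer by layer.

```python
import math

def f(x):
    # Toy composite function: sin(x^2)
    return math.sin(x * x)

def dfdx(x):
    # Chain rule: outer derivative cos(x^2) times inner derivative 2x
    return math.cos(x * x) * 2 * x

# Backprop applies this same rule through every layer; verify it
# numerically with a central finite difference.
x, h = 1.3, 1e-6
numeric = (f(x + h) - f(x - h)) / (2 * h)
```

The central difference agrees with the analytic derivative to roughly `h²`, which is the standard quick check when hand-deriving gradients.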
When RAG isn't enough and you actually need to teach the model something new. How to decide, how to do it, and what it'll cost you.
The winner on tabular data for the last decade. How it works, which library to pick, and the three mistakes everyone makes tuning it.
The data structure ML engineers use more than any other — for deduping data, counting features, caching embeddings, and 80% of interview problems.
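The three everyday jobs named above — deduping, counting, caching — are each a few lines with a hash map or hash set. A minimal sketch (the record IDs, tokens, and the stub `embed` function are hypothetical placeholders, not a real model call):

```python
# Deduping: a hash set remembers which record IDs we've already seen.
seen = set()
records = ["a", "b", "a", "c", "b"]
unique = [r for r in records if r not in seen and not seen.add(r)]

# Counting features: token -> frequency in a plain dict.
counts = {}
for tok in ["cat", "dog", "cat"]:
    counts[tok] = counts.get(tok, 0) + 1

# Caching embeddings: text -> vector, computed at most once.
cache = {}
def embed(text):
    if text not in cache:
        cache[text] = [float(len(text))]  # stand-in for a real embedding model
    return cache[text]
```

All three patterns are O(1) expected time per lookup, which is why they dominate both production data pipelines and interview problems.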
The simplest useful classifier, and a surprisingly strong baseline for most problems. Understand it deeply and you understand half of classical ML.
NumPy is the substrate every ML framework is built on. Vectorized operations, broadcasting, axis semantics — the stuff that makes the difference between a fast and a slow model.
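Vectorization, broadcasting, and axis semantics can be shown in one small sketch. The data here is a made-up 3×2 matrix; the point is the shapes: a `(2,)` row broadcasts across all rows of a `(3, 2)` array, and a matrix product replaces a Python loop over pairs.

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # 3 samples, 2 features

# axis=0 reduces over samples, producing a (2,) row of per-feature means.
# Broadcasting then stretches that row across all 3 rows of X.
centered = X - X.mean(axis=0)

# Vectorized pairwise dot products instead of a nested Python loop.
gram = X @ X.T  # (3, 3) similarity matrix
```

Getting `axis=` wrong (reducing over features instead of samples) is the classic silent bug the guide warns about — the code runs, the shapes happen to work, and the model quietly trains on garbage.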
Tools we use
Featured papers
view all →The paper that replaced recurrence with self-attention and set off the transformer era.
A clean recipe for grounding generation in retrieved documents. The root of modern RAG.
Fine-tune a 7B model on a single GPU by training a handful of low-rank matrices.
Reasoning traces in the prompt improve multi-step answers from large models.
Glossary
Every term we use at Ephizen, defined in plain English with real context.
Self-attention: A mechanism that lets a model weigh which other tokens in a sequence matter most when computing a representation for the current one.
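That weighing is one matrix expression: softmax(QKᵀ/√d)V. A minimal NumPy sketch, using random toy vectors in place of learned projections (real transformers compute Q, K, and V with separate learned weight matrices):

```python
import numpy as np

def self_attention(Q, K, V):
    # Scaled dot-product attention: each row of the result is a
    # weighted mix of the value vectors, with weights given by how
    # strongly that token's query matches every token's key.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (seq, seq) relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

X = np.random.default_rng(0).normal(size=(4, 8))    # 4 tokens, d = 8
out = self_attention(X, X, X)                       # toy case: Q = K = V = X
```

Each output row lives in the same space as the inputs but blends information from the whole sequence — that blend is what "attending" means.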
Embedding: A dense vector that encodes the meaning of a piece of data so similar things land near each other in vector space.
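"Near each other" is usually measured with cosine similarity. A toy sketch with hand-made 3-dimensional vectors (real embeddings come from a model and have hundreds or thousands of dimensions):

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity: 1.0 for identical directions, ~0 for unrelated.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings chosen so that related concepts point the same way.
cat = np.array([0.9, 0.1, 0.0])
kitten = np.array([0.8, 0.2, 0.1])
invoice = np.array([0.0, 0.1, 0.9])

assert cosine(cat, kitten) > cosine(cat, invoice)  # similar things land closer
```

Every vector database query and every RAG retrieval step reduces to this comparison, done at scale.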
Retrieval-augmented generation (RAG): Retrieve relevant documents at query time and stuff them into the LLM's prompt so it can ground its answer.
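The whole loop fits in a dozen lines. A minimal sketch in which the document store, the keyword-overlap scorer, and the `llm` callable are all placeholders — production systems rank by embedding similarity and call a real model:

```python
# Hypothetical two-document corpus standing in for a real document store.
DOCS = [
    "Ephizen onboarding starts with the roadmap page.",
    "LoRA trains small low-rank update matrices.",
]

def retrieve(query, k=1):
    # Keyword overlap stands in for embedding similarity here.
    def score(doc):
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(DOCS, key=score, reverse=True)[:k]

def answer(query, llm):
    # Stuff the retrieved evidence into the prompt so the model is grounded.
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm(prompt)
```

Swap the scorer for vector search and `llm` for an API call and this is the skeleton of every RAG system.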
Vector database: A database whose primary index is built for fast approximate nearest-neighbour search over high-dimensional embeddings.
Chain-of-thought prompting: Prompting an LLM to write out its reasoning step by step before giving a final answer, which usually improves accuracy on multi-step problems.
LoRA: Parameter-efficient fine-tuning that freezes the base model and trains small low-rank update matrices on top.
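The trick is replacing a full d×d weight update with two skinny factors B (d×r) and A (r×d), where r ≪ d. A NumPy sketch with made-up dimensions (a real implementation also scales the update and trains A and B with backprop):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 512, 8                       # full width vs. low rank, r << d

W = rng.normal(size=(d, d))         # frozen base weight: never updated
A = rng.normal(size=(r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                # starts at zero so the update starts at zero

def forward(x):
    # Base path plus the low-rank update path; only A and B would train.
    return x @ W.T + x @ (B @ A).T

# Trainable parameters drop from d*d to 2*d*r.
full_params, lora_params = d * d, 2 * d * r
```

At d = 512 and r = 8 that is 262,144 versus 8,192 trainable parameters — a 32× reduction, which is how a 7B model fits on a single GPU.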