MLflow
Open-source platform for the ML lifecycle — experiment tracking, model registry, packaging, and deployment.
Category
MLOps
Difficulty
Intermediate
When to use
You need experiment tracking and a model registry without buying a full MLOps platform, and you want something self-hostable.
When not to use
You're a solo researcher running ten experiments a month — a spreadsheet or Weights & Biases free tier is easier.
Alternatives
Weights & Biases, Neptune, Comet, ClearML
At a glance
| Field | Value |
|---|---|
| Category | ML lifecycle platform |
| Difficulty | Intermediate |
| When to use | Experiment tracking + registry, self-hosted |
| When not to use | Tiny projects; heavy-collaboration teams that want polished UX |
| Alternatives | Weights & Biases, Neptune, Comet, ClearML |
What it is
MLflow has four components: Tracking (log params, metrics, artifacts per run), Projects (packaging training code), Models (a standard format with flavors for sklearn, PyTorch, XGBoost, etc.), and Model Registry (versioned, stage-tagged model store). It runs against a backend store (Postgres is standard) and an artifact store (S3/GCS).
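A minimal self-hosted setup wires those two stores into the tracking server. A sketch, assuming a Postgres instance and an S3 bucket already exist (the connection string and bucket name below are placeholders):

```shell
# Tracking server backed by Postgres (runs, params, metrics, registry)
# and S3 (artifacts). DSN and bucket are placeholders.
mlflow server \
  --backend-store-uri postgresql://mlflow:mlflow@db:5432/mlflow \
  --default-artifact-root s3://my-mlflow-artifacts \
  --host 0.0.0.0 --port 5000
```

Clients then point at it via the `MLFLOW_TRACKING_URI` environment variable or `mlflow.set_tracking_uri(...)`. Note that the Model Registry requires a database-backed store; a plain file store won't do.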
When we reach for it at Ephizen
- Comparing training runs side by side with clear lineage from code commit to metrics.
- The registry as the source of truth for “which model is in production”.
- Packaging classical ML and small deep learning models into a standard serve-able format.
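The "source of truth" pattern hinges on MLflow's `models:/<name>/<stage>` URI scheme: consumers load "whatever is in Production" by stage, never by hard-coded version number. A sketch (the helper function is ours, not MLflow API; the model name matches the Getting started snippet):

```python
def production_model_uri(name: str) -> str:
    """Build the registry URI for the Production-stage version of a model."""
    return f"models:/{name}/Production"

uri = production_model_uri("churn-xgb")  # "models:/churn-xgb/Production"

# With a tracking server configured (MLFLOW_TRACKING_URI), this URI loads
# the current production version, whatever version number that happens to be:
#   model = mlflow.pyfunc.load_model(uri)
#   preds = model.predict(features_df)
```

Promoting a new version in the registry is then enough to change what every consumer loads, with no code changes downstream.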
Getting started
```python
import mlflow

mlflow.set_experiment("churn")

with mlflow.start_run():
    mlflow.log_params({"lr": 0.05, "max_depth": 6})
    mlflow.log_metric("val_auc", 0.87)
    # clf is a trained scikit-learn-compatible estimator
    mlflow.sklearn.log_model(clf, "model", registered_model_name="churn-xgb")
```
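Without a tracking server configured, runs like the one above land in a local `mlruns/` directory; the bundled UI can browse them:

```shell
# Browse locally logged runs at http://localhost:5000
mlflow ui
```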
Gotchas
- The tracking UI is functional but dated. Weights & Biases is much nicer if budget allows.
- Self-hosting needs real backing storage — SQLite + local files is fine for one user, not a team.
- `log_artifact` on large files is slow over the network; prefer logging URIs for huge artifacts.