arXiv 2023

LLaMA: Open and Efficient Foundation Language Models

Touvron, Lavril, Izacard, et al.

TL;DR
A family of 7B–65B decoder transformers trained on public data. The 13B model matches GPT-3 and the weights (eventually) leaked, kicking off the modern open-weights era.

What it says

Meta trains a family of decoder-only transformers (7B, 13B, 33B, 65B) on 1.0–1.4T tokens of publicly available text — deliberately more tokens per parameter than the Chinchilla-optimal ratio, trading extra training compute for cheaper inference. Architectural choices: pre-normalization with RMSNorm, SwiGLU activations, and rotary position embeddings (RoPE). They report that LLaMA-13B matches GPT-3 (175B) on most benchmarks while being far cheaper to serve, and that LLaMA-65B is competitive with PaLM-540B.
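The three architectural choices above became the de facto open-model defaults. A minimal NumPy sketch of each, to make the math concrete — function names, shapes, and the single-head RoPE layout are illustrative assumptions, not the paper's code:

```python
import numpy as np

def rms_norm(x, gain, eps=1e-6):
    # RMSNorm: rescale by root-mean-square only -- no mean subtraction,
    # no variance, unlike LayerNorm. Applied pre-attention and pre-FFN.
    return x * gain / np.sqrt(np.mean(x**2, axis=-1, keepdims=True) + eps)

def swiglu_ffn(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward: a SiLU-gated linear unit in place of ReLU/GELU.
    gate = x @ w_gate
    silu = gate / (1.0 + np.exp(-gate))      # SiLU(z) = z * sigmoid(z)
    return (silu * (x @ w_up)) @ w_down

def rope(x, pos, base=10000.0):
    # Rotary position embedding: rotate consecutive dimension pairs of a
    # query/key vector by position-dependent angles theta_i = pos * base^(-2i/d).
    d = x.shape[-1]
    theta = pos * base ** (-np.arange(0, d, 2) / d)
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```

Two properties worth noticing: RoPE at position 0 is the identity, and because each pair is a pure rotation it preserves vector norms, so attention dot products depend only on relative position.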

Why it matters

LLaMA (and especially its follow-ups LLaMA-2 and LLaMA-3) is the reason production-quality open-weights LLMs exist. Nearly every “open” model family — Mistral, Qwen, Yi, DeepSeek — adopted the same architectural defaults, and the huge ecosystem of fine-tunes, quantizations, and inference engines (llama.cpp, vLLM, Ollama) grew directly out of this release.

  • LLaMA 2 (Touvron et al., 2023) — chat-tuned, with a license allowing commercial use.
  • Mistral 7B (Jiang et al., 2023) — adds GQA and sliding-window attention.
  • Chinchilla (Hoffmann et al., 2022) — the scaling-law paper that informed LLaMA’s data-heavy recipe.