NeurIPS 2020

Language Models are Few-Shot Learners

Brown, Mann, Ryder, Subbiah, et al.

TL;DR
Scales the GPT decoder to 175B parameters and shows that a single model, with no gradient updates, can do many tasks from a handful of in-context examples.

What it says

The authors train a 175B-parameter autoregressive transformer (GPT-3) on a filtered Common Crawl corpus plus books and Wikipedia. With no fine-tuning at all, they show that dropping a few examples into the prompt lets GPT-3 do translation, question answering, arithmetic, SAT-style analogies, code generation, and more. Performance scales smoothly with model size — bigger models are not just better; they unlock qualitatively new behaviors from in-context examples alone.
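The core mechanic is simple enough to sketch: the "training signal" is just k worked examples concatenated into the prompt, and the model is asked to complete the next line. A minimal illustration, assuming a translation task framing of my own (the helper name, separator, and examples are illustrative, not from the paper):

```python
# Sketch of GPT-3-style few-shot prompting: no gradient updates occur;
# the task is conveyed entirely by k demonstrations placed in the prompt.
# The "=>" separator and example pairs are hypothetical choices for clarity.

def few_shot_prompt(task_description, examples, query):
    """Assemble an in-context learning prompt from k demonstrations."""
    lines = [task_description]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")  # the model is expected to complete this line
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Translate English to French:",
    [("cheese", "fromage"), ("house", "maison")],
    "cat",
)
print(prompt)
```

The paper's finding is that, at 175B parameters, completing such a prompt correctly works across many tasks, and that zero-shot (no examples) and one-shot variants of the same format also improve sharply with scale.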

Why it matters

GPT-3 is the paper that made “prompt it” a serious engineering approach. It validated decoder-only scaling, in-context learning, and general-purpose LLMs as products. Everything in the current chat-assistant ecosystem — ChatGPT, Claude, Gemini — is a descendant of the recipe established here.

Related reading

  • Scaling Laws for Neural Language Models (Kaplan et al., 2020) — the loss curves behind the bet on scale.
  • InstructGPT (Ouyang et al., 2022) — turning a base GPT into a helpful assistant.
  • Chinchilla (Hoffmann et al., 2022) — compute-optimal scaling, correcting GPT-3-era assumptions.