List of GPT models
Ciro Santilli (@cirosantilli), 2025-08-08
Table of contents
- GPT model by OpenAI
  - GPT-1
    - Improving Language Understanding by Generative Pre-Training
  - GPT-2
    - Language Models are Unsupervised Multitask Learners
    - GPT-2 implementation
    - GPT-2 implementation in PyTorch
      - nanoGPT
    - GPT-2 variant
      - GPT-2 medium
      - GPT-2 large
      - GPT-2 XL
  - GPT-3
  - GPT-4
    - GPT-4 Turbo
- Llama (language model)
  - Llama 2
    - Llama 2 7B
  - Llama 3
    - Llama 3.1
      - Llama 3.1 8B
      - Llama 3.1 70B
      - Llama 3.1 405B
GPT model by OpenAI
GPT-1 (117 M parameters, 2018-06)
Improving Language Understanding by Generative Pre-Training (GPT-1 paper)
GPT-2 (124 M parameters, 2019-11-05)
- Vocabulary size (V): 50,257
- Hidden size (d_model): 768
- Context length (n_ctx): 1024
- QKV size (d_head): 64
- Attention heads (h): 12
- FFN inner size (d_ff): 3072
- Layers (L): 12
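These numbers pin down the model size almost completely. As a rough illustration, the parameter count can be recovered from them with a small sketch (a hypothetical helper, not from this article), assuming learned positional embeddings, GPT-2-style biases, and an output head tied to the token embedding; d_head and h do not appear because d_head × h = d_model.

```python
# Approximate GPT-2-style parameter count from the hyperparameters above.
# Hypothetical sketch: assumes learned positional embeddings, GPT-2-style
# biases, and an output head tied to the token embedding.
def gpt_param_count(V, d_model, n_ctx, d_ff, n_layer):
    embed = V * d_model + n_ctx * d_model        # token + position embeddings
    attn = 4 * d_model * d_model + 4 * d_model   # fused QKV + output projection, with biases
    mlp = 2 * d_model * d_ff + d_ff + d_model    # the two FFN linear layers, with biases
    norms = 2 * 2 * d_model                      # two LayerNorms per block (scale + bias)
    return embed + n_layer * (attn + mlp + norms) + 2 * d_model  # + final LayerNorm

print(gpt_param_count(V=50_257, d_model=768, n_ctx=1024, d_ff=3072, n_layer=12))
# 124,439,808, i.e. the ~124 M figure quoted above
```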
Language Models are Unsupervised Multitask Learners (GPT-2 paper)
cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
GPT-2 implementation
GPT-2 implementation in PyTorch
Tags: PyTorch model
nanoGPT
github.com/karpathy/nanoGPT
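nanoGPT reimplements the GPT-2 architecture in a few hundred lines of PyTorch. As a flavour of the core building block such an implementation contains, here is a minimal causal self-attention sketch (not nanoGPT's actual code), defaulting to the GPT-2 small hyperparameters listed above:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal GPT-2-style causal self-attention, a sketch in the spirit of nanoGPT
# (not its actual code); defaults match the GPT-2 small hyperparameters above.
class CausalSelfAttention(nn.Module):
    def __init__(self, d_model=768, n_head=12, n_ctx=1024):
        super().__init__()
        assert d_model % n_head == 0
        self.n_head = n_head
        self.c_attn = nn.Linear(d_model, 3 * d_model)  # fused Q, K, V projection
        self.c_proj = nn.Linear(d_model, d_model)      # output projection
        mask = torch.tril(torch.ones(n_ctx, n_ctx)).view(1, 1, n_ctx, n_ctx)
        self.register_buffer("mask", mask)             # causal mask up to n_ctx tokens

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.c_attn(x).split(C, dim=2)
        # reshape to (B, n_head, T, d_head) with d_head = d_model / n_head
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / math.sqrt(k.size(-1))
        att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
        y = F.softmax(att, dim=-1) @ v
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.c_proj(y)

x = torch.randn(2, 16, 768)            # (batch, sequence, d_model)
print(CausalSelfAttention()(x).shape)  # torch.Size([2, 16, 768])
```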
GPT-2 variant
GPT-2 medium (355 M parameters)
GPT-2 large (774 M parameters)
GPT-2 XL (1.5 B parameters)
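The released GPT-2 checkpoints for all four sizes (small plus these three variants) are published on the Hugging Face Hub as gpt2, gpt2-medium, gpt2-large and gpt2-xl. Assuming the transformers library is installed and the weights can be downloaded, a quick sketch to confirm the parameter counts quoted above:

```python
from transformers import GPT2LMHeadModel

# Print the exact parameter count of each released GPT-2 variant.
# Assumes transformers is installed and the checkpoints are downloadable.
for name in ["gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl"]:
    model = GPT2LMHeadModel.from_pretrained(name)
    print(name, sum(p.numel() for p in model.parameters()))
```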
GPT-3 (175 B parameters, 2020-06)
- Vocabulary size (V): 50,257
- Hidden size (d_model): 12,288
- Context length (n_ctx): 2048
- QKV size (d_head): 128
- Attention heads (h): 96
- FFN inner size (d_ff): 4 × 12,288 = 49,152
- Layers (L): 96
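The listed values satisfy the usual GPT scaling relations, which a self-contained snippet (hypothetical, not from this article) can sanity-check; biases are omitted here, so the total lands slightly under the quoted 175 B:

```python
# Sanity-check the GPT-3 hyperparameters listed above (hypothetical snippet).
V, d_model, n_ctx, n_head, d_head, d_ff, n_layer = 50_257, 12_288, 2048, 96, 128, 49_152, 96
assert d_head == d_model // n_head   # 12,288 / 96 = 128
assert d_ff == 4 * d_model           # 4 * 12,288 = 49,152

embed = V * d_model + n_ctx * d_model                    # token + position embeddings
per_layer = 4 * d_model * d_model + 2 * d_model * d_ff   # attention + FFN weight matrices
print(embed + n_layer * per_layer)                       # ~1.75e11, consistent with 175 B
```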
GPT-4
GPT-4 Turbo
platform.openai.com/docs/models/gpt-4-turbo
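GPT-4 Turbo is only available as a hosted model through the OpenAI API. A minimal sketch of calling it with the official openai Python package, assuming the package is installed and OPENAI_API_KEY is set in the environment:

```python
from openai import OpenAI

# Minimal sketch of querying GPT-4 Turbo through the OpenAI API.
# Assumes the openai package is installed and OPENAI_API_KEY is set.
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "List the GPT models released by OpenAI."}],
)
print(response.choices[0].message.content)
```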
Llama (language model)
Tags: Open weight LLM model, Software developed by Facebook
Homepage: www.llama.com/
Llama 2 (2023)
Page: www.llama.com/llama2/
Llama 2 7B
Llama 3 (2024)
www.llama.com/models/llama-3/
Llama 3.1
Llama 3.1 8B
Llama 3.1 70B
Llama 3.1 405B
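Unlike the OpenAI models above, the Llama weights are open and can be run locally, for example through Hugging Face transformers. A minimal sketch, assuming the gated meta-llama/Llama-3.1-8B repository id (an assumption here, not stated in this article) and that Meta's license has been accepted on the Hub:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch of running an open-weight Llama checkpoint locally.
# "meta-llama/Llama-3.1-8B" is an assumed Hub repository id; the weights are
# gated, so Meta's license must be accepted and a Hub token configured first.
name = "meta-llama/Llama-3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.bfloat16)

inputs = tokenizer("The Llama series of language models", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```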