GPT-2 (124 M parameters, 2019-11-05)
Ciro Santilli (@cirosantilli)
Vocabulary size (V): 50,257
Hidden size (d_model): 768
Context length (n_ctx): 1024
QKV size (d_head): 64
Attention heads (h): 12
FFN inner size (d_ff): 3072
Layers (L): 12
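These hyperparameters fully determine the parameter count. A minimal sketch recomputing the headline 124 M figure, assuming the standard GPT-2 layout (learned positional embeddings, biases on every linear layer and LayerNorm, and an LM head weight-tied to the token embedding):

```python
# Recompute GPT-2 small's parameter count from the hyperparameters above.
V, d, n_ctx, d_ff, L = 50257, 768, 1024, 3072, 12

embed = V * d + n_ctx * d                    # token + learned position embeddings
attn  = (3 * d * d + 3 * d) + (d * d + d)    # QKV projections + output projection
ffn   = (d * d_ff + d_ff) + (d_ff * d + d)   # up- and down-projection
ln    = 2 * d                                # LayerNorm scale + bias

total = embed + L * (attn + ffn + 2 * ln) + ln  # + final LayerNorm; LM head is tied
print(total)  # 124439808, the "124 M" in the title
```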
Table of contents
Language Models are Unsupervised Multitask Learners
GPT-2 implementation
GPT-2 implementation in PyTorch
  nanoGPT
GPT-2 variant
  GPT-2 medium
  GPT-2 large
  GPT-2 XL
Language Models are Unsupervised Multitask Learners (GPT-2 paper)
cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
GPT-2 implementation
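As a concrete starting point, a minimal sketch of sampling from the pretrained 124 M checkpoint, assuming the Hugging Face transformers package is installed (one widely used PyTorch implementation):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")           # "gpt2" is the 124 M checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The meaning of life is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, do_sample=True, top_k=40)
print(tok.decode(out[0]))
```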
GPT-2 implementation in PyTorch
Tags: PyTorch model
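To make this stub concrete: a minimal sketch of one GPT-2 transformer block in plain PyTorch, wired with the hyperparameters listed at the top (pre-LayerNorm residual layout as in GPT-2; dropout and weight initialization omitted for brevity):

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """One GPT-2 transformer block: causal self-attention + FFN,
    each wrapped in a pre-LayerNorm residual connection."""
    def __init__(self, d_model=768, n_head=12, d_ff=3072):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x):
        # Causal mask: True marks positions a query may NOT attend to.
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), 1)
        a = self.ln1(x)
        a, _ = self.attn(a, a, a, attn_mask=mask, need_weights=False)
        x = x + a
        return x + self.ffn(self.ln2(x))

x = torch.randn(1, 1024, 768)
print(Block()(x).shape)  # torch.Size([1, 1024, 768])
```

Stacking L = 12 of these between the embeddings and the tied LM head yields the full 124 M model.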
nanoGPT
github.com/karpathy/nanoGPT
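A hedged sketch of driving nanoGPT programmatically, assuming the repo is checked out and its model.py is importable; the repo also ships a sample.py script for the same purpose. GPT.from_pretrained downloads the Hugging Face GPT-2 weights and remaps them into nanoGPT's module layout:

```python
import torch
from model import GPT  # nanoGPT's model.py

model = GPT.from_pretrained("gpt2")  # fetches and converts the 124 M weights
model.eval()

# Generate 20 tokens from a trivial one-token prompt (BPE token 0 is "!").
idx = torch.zeros((1, 1), dtype=torch.long)
out = model.generate(idx, max_new_tokens=20)
print(out[0].tolist())
```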
GPT-2 variant
GPT-2 medium (355 M parameters)
GPT-2 large (774 M parameters)
GPT-2 XL (1.5 B parameters)
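All four variants share the GPT-2 small architecture and differ only in depth and width. A sketch of the configurations, with layer, width, and head counts from the GPT-2 paper (the parameter totals are the commonly quoted figures, as used e.g. by nanoGPT):

```python
# GPT-2 family: only L (layers), d_model, and h change; n_ctx = 1024 and
# V = 50,257 are shared, and d_head = d_model / h = 64 throughout.
GPT2_CONFIGS = {
    "gpt2":        dict(L=12, d_model=768,  h=12),  # ~124 M parameters
    "gpt2-medium": dict(L=24, d_model=1024, h=16),  # ~355 M parameters
    "gpt2-large":  dict(L=36, d_model=1280, h=20),  # ~774 M parameters
    "gpt2-xl":     dict(L=48, d_model=1600, h=25),  # ~1.5 B parameters
}
```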
Incoming links (1)
Number of multiplications per token in a GPT model
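That linked article derives the count in full; as a hedged back-of-envelope for the 124 M model, counting one multiply per weight-matrix entry per token plus the attention score and value products at full context:

```python
# Rough multiplications per token for GPT-2 small (an estimate, not the
# linked article's exact derivation).
d, d_ff, L, n_ctx, V = 768, 3072, 12, 1024, 50257

per_layer  = 4 * d * d + 2 * d * d_ff   # QKV + output projection + two FFN matmuls
per_layer += 2 * n_ctx * d              # attention scores and weighted sum of values
lm_head    = V * d                      # final projection to vocabulary logits

total = L * per_layer + lm_head
print(f"~{total / 1e6:.0f} M multiplications per token")  # ~142 M
```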