Ciro Santilli (@cirosantilli)
Articles
Llama 3.1 405B (by Ciro Santilli, 2025-08-08)
Llama 3.1 70B
Llama 3.1 8B
GPT-2 XL
GPT-2 large
GPT-2 medium
nanoGPT (github.com/karpathy/nanoGPT)
Llama 3.1
Llama 2 7B
GPT 4 Turbo (platform.openai.com/docs/models/gpt-4-turbo)
GPT-2 variant
GPT-2 implementation in PyTorch
GPT-2 implementation
Language Models are Unsupervised Multitask Learners (cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)
Improving Language Understanding by Generative Pre-Training
Llama 3 (www.llama.com/models/llama-3/)
Llama 2 (www.llama.com/llama2/)
llama-cli inference batching

As of llama.cpp commit 79e0b68c178656bb0632cb8602d2940b755077f8 there is a --parallel option, but it is unclear what it does.

Bibliography:
github.com/ggml-org/llama.cpp/discussions/3222
www.reddit.com/r/LocalLLaMA/comments/12aj0ze/what_is_batchsize_in_llamacpp_also_known_as_n/
www.reddit.com/r/LocalLLaMA/comments/12gtanv/batch_queries/
Related, for the server: www.reddit.com/r/LocalLLaMA/comments/1f19t2l/parallel_requests_using_llamaserver
GPT-4
GPT-3

Vocabulary size (V): 50,257
Hidden size (d_model): 12,288
Context length: 2048
Q/K/V size (d_head): 128
Attention heads (h): 96
FFN inner size (d_ff): 4 × 12,288 = 49,152
Layers (L): 96
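These hyperparameters fit together: the per-head size times the head count recovers the hidden size, and the standard back-of-the-envelope transformer parameter estimate (an assumption here, not an official OpenAI figure) lands near GPT-3's well-known 175B parameter count. A quick Python sanity check:

```python
# Sanity-check the GPT-3 hyperparameters listed above.
# The values come from the list; the 12 * L * d_model^2 estimate is the
# standard rough formula for a decoder-only transformer's block
# parameters, not an official OpenAI number.

V = 50_257          # vocabulary size
d_model = 12_288    # hidden size
d_head = 128        # per-head Q/K/V size
h = 96              # attention heads
d_ff = 4 * d_model  # FFN inner size
L = 96              # layers

# Heads partition the hidden size: h * d_head == d_model.
assert h * d_head == d_model
assert d_ff == 49_152

# Per layer: ~4*d_model^2 for attention (Q, K, V, output projections)
# plus ~8*d_model^2 for the two FFN matrices (d_model x 4*d_model, twice),
# i.e. ~12*d_model^2; add the token embedding matrix on top.
total = 12 * L * d_model**2 + V * d_model
print(f"~{total / 1e9:.0f}B parameters")  # close to the well-known 175B
```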
Total articles: 11064