GPT-2
The following estimates the number of multiplications performed by a "classic" GPT-2-style model, counting both the attention and the feedforward layers.
Summing the per-head attention terms and the feedforward terms over all L layers, the total is:
L * (
  h * (
    2 * d_model * d_head +
    n_ctx * d_head +
    d_model * d_model +
    n_ctx * d_model
  ) +
  2 * d_model * d_ff
)
This is coded at: llm_count_mults.py.
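For concreteness, here is a minimal sketch of such a script, plugging in the GPT-2 small hyperparameters (L = 12, h = 12, d_model = 768, d_head = 64, n_ctx = 1024, d_ff = 4 * d_model = 3072); this is an illustration of the formula above, not the actual llm_count_mults.py:

#!/usr/bin/env python3
# Evaluate the multiplication count formula above for GPT-2 small.
L = 12         # number of layers
h = 12         # number of attention heads
d_model = 768  # embedding dimension
d_head = 64    # dimension per head, d_model / h
n_ctx = 1024   # context length
d_ff = 3072    # feedforward dimension, 4 * d_model

mults = L * (
    h * (
        2 * d_model * d_head +
        n_ctx * d_head +
        d_model * d_model +
        n_ctx * d_model
    ) +
    2 * d_model * d_ff
)
print(mults)

With these values the script prints 278396928, i.e. on the order of 10^8 multiplications.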
This example attempts to keep the temperature at a fixed point by turning on a fan when a thermistor gets too hot.
You can test it easily, provided that the room itself is not too hot, by holding the thermistor between your fingers to warm it up and turn on the fan.
You can use a simple LED in place of the fan if you don't have one handy.
In Ciro's ASCII art circuit diagram notation:
            +----------FAN-----------+
            |                        |
            |                        |
RPI_PICO_W__gnd__gpio26Adc__3.3V@36__gpio2
            |    |          |
            |    |          |
            |    |          |
            |    +-THERMISTOR
            |    |
            |    |
            R_10-+

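A minimal MicroPython sketch of the control loop could look as follows. The threshold is an assumption to be calibrated for your thermistor; with the wiring above (thermistor on the 3.3V side of the divider) an NTC thermistor makes the ADC reading rise as it warms up:

# Hypothetical control loop for the circuit above (MicroPython on the Pico W).
from machine import ADC, Pin
import time

adc = ADC(26)          # thermistor / 10k divider tap on GPIO26 (ADC0)
fan = Pin(2, Pin.OUT)  # fan (or stand-in LED) on GPIO2

THRESHOLD = 30000      # raw 16-bit ADC value, tune for your setup

while True:
    reading = adc.read_u16()  # 0..65535
    # NTC resistance drops as temperature rises, so the divider tap
    # voltage and thus the reading rise; flip the comparison if your
    # thermistor or wiring behaves the other way around.
    fan.value(1 if reading > THRESHOLD else 0)
    time.sleep_ms(200)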