Source: /cirosantilli/amazon-ec2-gpu

= Amazon EC2 GPU
{c}

As of December 2023, the cheapest instance with an <Nvidia GPU> is <g4dn.xlarge>, so let's try that out. On that instance, <lspci> contains:
``
00:1e.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)
``
TODO meaning of "dn"? "n" presumably means <Nvidia>, but what is the "d"?

Be careful not to confuse it with <g4ad.xlarge>, which has an <AMD GPU> instead. TODO meaning of "ad"? "a" presumably means <AMD>, but what is the "d"?

Some documentation on which GPU is in each instance can be seen at: https://docs.aws.amazon.com/dlami/latest/devguide/gpu.html (https://web.archive.org/web/20231126224245/https://docs.aws.amazon.com/dlami/latest/devguide/gpu.html[archive]) with a list of which GPUs they have at that random point in time. Can the GPU ever change for a given instance name? Likely not. Also as of December 2023 the list is already outdated, e.g. P5 is not shown there, even though it is mentioned at: https://aws.amazon.com/ec2/instance-types/p5/

When selecting the instance to launch, the GPU apparently does not show anywhere on the instance information page, which is so bad!
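
A workaround is to query the instance type metadata from the AWS CLI instead, which does report the GPU. A minimal sketch, assuming the AWS CLI is installed and configured:
``
aws ec2 describe-instance-types \
  --instance-types g4dn.xlarge g4ad.xlarge \
  --query 'InstanceTypes[].[InstanceType,GpuInfo.Gpus[0].Manufacturer,GpuInfo.Gpus[0].Name,GpuInfo.Gpus[0].MemoryInfo.SizeInMiB]' \
  --output table
``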

Also note that this instance has 4 vCPUs, so on a new account you must first make a customer support request to Amazon to increase your limit from the default of 0 to 4, see also: https://stackoverflow.com/questions/68347900/you-have-requested-more-vcpu-capacity-than-your-current-vcpu-limit-of-0[], otherwise instance launch will fail with:
\Q[You have requested more vCPU capacity than your current vCPU limit of 0 allows for the instance bucket that the specified instance type belongs to. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit.]
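
If you want to check the current limit before trying to launch, one way should be the Service Quotas API. A sketch, assuming the relevant quota is the one whose name contains "G and VT" (the G-family on-demand vCPU quota):
``
aws service-quotas list-service-quotas \
  --service-code ec2 \
  --query "Quotas[?contains(QuotaName, 'G and VT')].[QuotaName,Value]" \
  --output table
``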

When starting up the instance, also select:
* image: <Ubuntu 22.04>
* storage size: 30 GB (maximum free tier allowance)
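
For reference, the equivalent launch from the AWS CLI might look something like the following sketch, where the AMI ID and key pair name are placeholders you have to substitute for your region and account:
``
# ami-0123456789abcdef0 and mykey are placeholders.
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type g4dn.xlarge \
  --key-name mykey \
  --block-device-mappings '[{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":30,"VolumeType":"gp3"}}]'
``
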
Once we finally manage to <SSH> into the instance, we first have to install the drivers and reboot:
``
sudo apt update
sudo apt install nvidia-driver-510 nvidia-utils-510 nvidia-cuda-toolkit
sudo reboot
``
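
If you are unsure which driver version to pick, an alternative sketch is to let Ubuntu recommend one via the `ubuntu-drivers` tool, assuming it detects the Tesla T4 correctly:
``
# Install the tool if it is not already present.
sudo apt install ubuntu-drivers-common
# List the drivers Ubuntu recommends for the detected hardware.
sudo ubuntu-drivers devices
# Or just install the recommended one directly.
sudo ubuntu-drivers autoinstall
``
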
After the reboot, running:
``
nvidia-smi
``
shows something like:
``
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   25C    P8    12W /  70W |      2MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
``
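
For scripting, `nvidia-smi` can also print just a few selected fields in machine readable form, which is a quick way to confirm the GPU model, driver and memory without parsing the table above:
``
nvidia-smi --query-gpu=name,driver_version,memory.total,utilization.gpu --format=csv
``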

If we start from the raw <Ubuntu 22.04> image as above, we have to install the drivers ourselves. Some references on that:
* https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/install-nvidia-driver.html official docs
* https://stackoverflow.com/questions/63689325/how-to-activate-the-use-of-a-gpu-on-aws-ec2-instance
* https://askubuntu.com/questions/1109662/how-do-i-install-cuda-on-an-ec2-ubuntu-18-04-instance
* https://askubuntu.com/questions/1397934/how-to-install-nvidia-cuda-driver-on-aws-ec2-instance

From there basically everything should just work as normal. E.g. we were able to run a <CUDA hello world> just fine with:
``
nvcc inc.cu
./a.out
``

One issue with this setup, besides the time it takes to set up, is that you might also have to pay some network charges as it downloads a bunch of stuff into the instance. We should try out some of the pre-built images. But it is also good to know this pristine setup just in case.

Some stuff we then managed to run:
``
curl https://ollama.ai/install.sh | sh
/bin/time ollama run llama2 'What is quantum field theory?'
``
which gave:
``
0.07user 0.05system 0:16.91elapsed 0%CPU (0avgtext+0avgdata 16896maxresident)k
0inputs+0outputs (0major+1960minor)pagefaults 0swaps
``
so way faster than on my local desktop <CPU>, hurray.
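
To double check that the model is actually running on the GPU rather than silently falling back to the CPU, you can keep `nvidia-smi` refreshing in a second terminal while the query runs; the ollama server process should then show up in the process list holding a few GiB of GPU memory:
``
watch -n 1 nvidia-smi
``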

After the setup described at https://askubuntu.com/a/1309774/52975 we were able to run:
``
head -n1000 pap.txt | ARGOS_DEVICE_TYPE=cuda time argos-translate --from-lang en --to-lang fr > pap-fr.txt
``
which gave:
``
77.95user 2.87system 0:39.93elapsed 202%CPU (0avgtext+0avgdata 4345988maxresident)k
0inputs+88outputs (0major+910748minor)pagefaults 0swaps
``
so only marginally better than on <ciro santilli s hardware/P14s>. It would be fun to see how much faster we could make things on a more powerful GPU.
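
To quantify how much the GPU is actually contributing here, a simple follow-up, assuming `ARGOS_DEVICE_TYPE` also accepts `cpu` as its documentation suggests, would be to rerun the exact same command forced onto the instance's CPU and compare the timings:
``
head -n1000 pap.txt | ARGOS_DEVICE_TYPE=cpu time argos-translate --from-lang en --to-lang fr > pap-fr-cpu.txt
``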