@cirosantilli/_file/python/pytorch/python/pytorch/matmul.py Updated 2025-01-01 +Created 1970-01-01
Matrix multiplication example.
Fundamental since deep learning is mostly matrix multiplication.
NumPy does not automatically use the GPU for it: stackoverflow.com/questions/49605231/does-numpy-automatically-detect-and-use-gpu, and PyTorch is one of the most notable compatible implementations, as it uses the same memory structure as NumPy arrays.
Invertible matrices. Or if you think a bit more generally, an invertible linear map.
Non-invertible are excluded "because" otherwise it would not form a group (every element must have an inverse). This is therefore the largest possible group under matrix multiplication, other matrix multiplication groups being subgroups of it.
The set of all invertible matrices forms a group: the general linear group with matrix multiplication. Non-invertible matrices don't form a group due to the lack of inverse.
For this sub-case, we can define the Lie algebra of a Lie group as the set of all matrices such that for all :If we fix a given and vary , we obtain a subgroup of . This type of subgroup is known as a one parameter subgroup.
The immediate question is then if every element of can be reached in a unique way (i.e. is the exponential map a bijection). By looking at the matrix logarithm however we conclude that this is not the case for real matrices, but it is for complex matrices.
TODO example it can be seen that the Lie algebra is not closed matrix multiplication, even though the corresponding group is by definition. But it is closed under the Lie bracket operation.
A common case is , and .
One thing that makes such functions particularly simple is that they can be fully specified by specifyin how they act on all possible combinations of input basis vectors: they are therefore specified by only a finite number of elements of .
Every linear map in finite dimension can be represented by a matrix, the points of the domain being represented as vectors.
As such, when we say "linear map", we can think of a generalization of matrix multiplication that makes sense in infinite dimensional spaces like Hilbert spaces, since calling such infinite dimensional maps "matrices" is stretching it a bit, since we would need to specify infinitely many rows and columns.
The prototypical building block of infinite dimensional linear map is the derivative. In that case, the vectors being operated upon are functions, which cannot therefore be specified by a finite number of parameters, e.g.
For example, the left side of the time-independent Schrödinger equation is a linear map. And the time-independent Schrödinger equation can be seen as a eigenvalue problem.
The matrix ring of degree n is the set of all n-by-n square matrices together with the usual vector space and matrix multiplication operations.
This set forms a ring.
This is a quick tutorial on how a quantum computer programmer thinks about how a quantum computer works. If you know:a concrete and precise hello world operation can be understood in 30 minutes.
- what a complex number is
- how to do matrix multiplication
- what is a probability
Although there are several types of quantum computer under development, there exists a single high level model that represents what most of those computers can do, and we are going to explain that model here. This model is the is the digital quantum computer model, which uses a quantum circuit, that is made up of many quantum gates.
Beyond that basic model, programmers only may have to consider the imperfections of their hardware, but the starting point will almost always be this basic model, and tooling that automates mapping the high level model to real hardware considering those imperfections (i.e. quantum compilers) is already getting better and better.
The way quantum programmers think about a quantum computer in order to program can be described as follows:
- the input of a N qubit quantum computer is a vector of dimension N containing classic bits 0 and 1
- the quantum program, also known as circuit, is a unitary matrix of complex numbers that operates on the input to generate the output
- the output of a N qubit computer is also a vector of dimension N containing classic bits 0 and 1
To operate a quantum computer, you follow the step of operation of a quantum computer:
- set the input qubits to classic input bits (state initialization)
- press a big red "RUN" button
- read the classic output bits (readout)
Each time you do this, you are literally conducting a physical experiment of the specific physical implementation of the computer:and each run as the above can is simply called "an experiment" or "a measurement".
- setup your physical system to represent the classical 0/1 inputs
- let the state evolve for long enough
- measure the classical output back out
The output comes out "instantly" in the sense that it is physically impossible to observe any intermediate state of the system, i.e. there are no clocks like in classical computers, further discussion at: quantum circuits vs classical circuits. Setting up, running the experiment and taking the does take some time however, and this is important because you have to run the same experiment multiple times because results are probabilistic as mentioned below.
Unlike in a classical computer, the output of a quantum computer is not deterministic however.
But the each output is not equally likely either, otherwise the computer would be useless except as random number generator!
This is because the probabilities of each output for a given input depends on the program (unitary matrix) it went through.
Therefore, what we have to do is to design the quantum circuit in a way that the right or better answers will come out more likely than the bad answers.
We then calculate the error bound for our circuit based on its design, and then determine how many times we have to run the experiment to reach the desired accuracy.
The probability of each output of a quantum computer is derived from the input and the circuit as follows.
First we take the classic input vector of dimension N of 0's and 1's and convert it to a "quantum state vector" of dimension :
We are after all going to multiply it by the program matrix, as you would expect, and that has dimension !
Note that this initial transformation also transforms the discrete zeroes and ones into complex numbers.
For example, in a 3 qubit computer, the quantum state vector has dimension and the following shows all 8 possible conversions from the classic input to the quantum state vector:
000 -> 1000 0000 == (1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
001 -> 0100 0000 == (0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
010 -> 0010 0000 == (0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0)
011 -> 0001 0000 == (0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0)
100 -> 0000 1000 == (0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0)
101 -> 0000 0100 == (0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0)
110 -> 0000 0010 == (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0)
111 -> 0000 0001 == (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)
This can be intuitively interpreted as:
- if the classic input is
000
, then we are certain that all three bits are 0.Therefore, the probability of all three 0's is 1.0, and all other possible combinations have 0 probability. - if the classic input is
001
, then we are certain that bit one and two are 0, and bit three is 1. The probability of that is 1.0, and all others are zero. - and so on
Now that we finally have our quantum state vector, we just multiply it by the unitary matrix of the quantum circuit, and obtain the dimensional output quantum state vector :
And at long last, the probability of each classical outcome of the measurement is proportional to the square of the length of each entry in the quantum vector, analogously to what is done in the Schrödinger equation.
For example, suppose that the 3 qubit output were:
Then, the probability of each possible outcomes would be the length of each component squared:i.e. 75% for the first, and 25% for the third outcomes, where just like for the input:
- first outcome means
000
: all output bits are zero - third outcome means
010
: the first and third bits are zero, but the second one is 1
All other outcomes have probability 0 and cannot occur, e.g.:
001
is impossible.Keep in mind that the quantum state vector can also contain complex numbers because we are doing quantum mechanics, but we just take their magnitude in that case, e.g. the following quantum state would lead to the same probabilities as the previous one:
This interpretation of the quantum state vector clarifies a few things:
- the input quantum state is just a simple state where we are certain of the value of each classic input bit
- the matrix has to be unitary because the total probability of all possible outcomes must be 1.0This is true for the input matrix, and unitary matrices have the probability of maintaining that property after multiplication.Unitary matrices are a bit analogous to self-adjoint operators in general quantum mechanics (self-adjoint in finite dimensions implies is stronger)This also allows us to understand intuitively why quantum computers may be capable of accelerating certain algorithms exponentially: that is because the quantum computer is able to quickly do an unitary matrix multiplication of a humongous sized matrix.If we are able to encode our algorithm in that matrix multiplication, considering the probabilistic interpretation of the output, then we stand a chance of getting that speedup.
As we could see, this model is was simple to understand, being only marginally more complex than that of a classical computer, see also: quantumcomputing.stackexchange.com/questions/6639/is-my-background-sufficient-to-start-quantum-computing/14317#14317 The situation of quantum computers today in the 2020's is somewhat analogous to that of the early days of classical circuits and computers in the 1950's and 1960's, before CPU came along and software ate the world. Even though the exact physics of a classical computer might be hard to understand and vary across different types of integrated circuits, those early hardware pioneers (and to this day modern CPU designers), can usefully view circuits from a higher level point of view, thinking only about concepts such as:as modelled at the register transfer level, and only in a separate compilation step translated into actual chips. This high level understanding of how a classical computer works is what we can call "the programmer's model of a classical computer". So we are now going to describe the quantum analogue of it.
- logic gates like AND, NOR and NOT
- a clock + registers
Bibliography:
- arxiv.org/pdf/1804.03719.pdf Quantum Algorithm Implementations for Beginners by Abhijith et al. 2020
When it distributes it inverts the order of the matrix multiplication: