Calculus

This is a general philosophy that Ciro Santilli, and likely others, observes over and over.

Basically, continuity, or higher order conditions like differentiability seem to impose greater constraints on problems, which make them more solvable.

Some good examples of that:

complex discrete problems:
- classification of finite groups
simple continuous problems:
- characterization of Lie groups

Discrete

Something that is very not continuous.

Notably studied in discrete mathematics.

Infinity ( $\infty$ )

Chuck Norris counted to infinity. Twice.

There are a few related concepts that are called infinity in mathematics:

limits that are greater than any number
the cardinality of a set that does not have a finite number of elements
in some number systems, there is an explicit "element at infinity" that is not a limit, e.g. projective geometry

L'Hôpital's rule (limit of a ratio)

Derivative

 3  0

The derivative of a function gives its slope at a point.

More precisely, it give sthe inclination of a tangent line that passes through that point.

Infinitely differentiable function ( $C^{\infty}$ )

Flat top bump function

math.stackexchange.com/questions/1786964/is-it-possible-to-construct-a-smooth-flat-top-bump-function

Maxima and minima

Given a function

f

from some space. For beginners the real numbers but more generally topological spaces should work in general
to the real numbers

we want to find the points

x

of the domain of

f

where the value of

f

is smaller (for minima, or larger for maxima) than all other points in some neighbourhood of

x

In the case of Functionals, this problem is treated under the theory of the calculus of variations.

Calculus of variations

Lifegard problem

pumphandle.consulting/2020/09/04/the-lifeguard-problem-solved/

Partial derivative symbol ( $\partial$ )

Nope, it is not a Greek letter, notably it is not a lowercase delta. It is just some random made up symbol that looks like a letter D. Which is of course derived from delta, which is why it is all so damn confusing.

I think the symbol is usually just read as "D" as in "d f d x" for

\frac{\partial F ( x , y , z )}{\partial x}

Partial label partial derivative notation ( $\partial_{x} F$ , $\partial_{y} F$ )

Partial index partial derivative notation ( $\partial_{0} F$ , $\partial_{1} F$ )

This notation is not so common in basic mathematics, but it is so incredibly convenient, especially with Einstein notation as shown at Section "Einstein notation for partial derivatives":

\partial_{0} F (x, y, z) = \frac{\partial F ( x , y , z )}{\partial x} \partial_{1} F (x, y, z) = \frac{\partial F ( x , y , z )}{\partial y} \partial_{2} F (x, y, z) = \frac{\partial F ( x , y , z )}{\partial x}

This notation is similar to partial label partial derivative notation, but it uses indices instead of labels such as

x

y

, etc.

Total derivative

The total derivative of a function assigns for every point of the domain a linear map with same domain, which is the best linear approximation to the function value around this point, i.e. the tangent plane.

E.g. in 1D:

T o t a l d er i v a t i v e = D [f (x_{0})] (x) = f (x_{0}) + \frac{\partial f}{\partial x} (x_{0}) \times x

and in 2D:

D [f (x_{0}, y_{0})] (x, y) = f (x_{0}, y_{0}) + \frac{\partial f}{\partial x} (x_{0}, y_{0}) \times x + \frac{\partial f}{\partial y} (x_{0}, y_{0}) \times y

(2)

Directional derivative

Integral

 3  0

Area

Volume

3D area.

Riemann integral

The easy and less generic integral. The harder one is the Lebesgue integral.

Lebesgue integral

"More complex and general" integral. Matches the Riemann integral for "simple functions", but also works for some "funkier" functions that Riemann does not work for.

Ciro Santilli sometimes wonders how much someone can gain from learning this besides the beauty of mathematics, since we can hand-wave a Lebesgue integral on almost anything that is of practical use. The beauty is good reason enough though.

Lebesgue integral vs Riemann integral

Advantages over Riemann:

Lebesgue integral of $L^{p}$ is complete but Riemann isn't.
youtu.be/PGPZ0P1PJfw?t=710 you are able to switch the order of integrals and limits of function sequences on non-uniform convergence. TODO why do we care? This is linked to the Fourier series of course, but concrete example?

Riemann integral vs. Lebesgue integral by The Bright Side Of Mathematics (2018)

Source.

youtube.com/watch?v=PGPZ0P1PJfw&t=808 shows how Lebesgue can be visualized as a partition of the function range instead of domain, and then you just have to be able to measure the size of pre-images.

One advantage of that is that the range is always one dimensional.

But the main advantage is that having infinitely many discontinuities does not matter.

Infinitely many discontinuities can make the Riemann partitioning diverge.

But in Lebesgue, you are instead measuring the size of preimage, and to fit infinitely many discontinuities in a finite domain, the size of this preimage is going to be zero.

So then the question becomes more of "how to define the measure of a subset of the domain".

Which is why we then fall into measure theory!

Real world applications of the Lebesgue integral

In "practice" it is likely "useless", because the functions that it can integrate that Riemann can't are just too funky to appear in practice :-)

Its value is much more indirect and subtle, as in "it serves as a solid basis of quantum mechanics" due to the definition of Hilbert spaces.

Bibliography:

Lebesgue measurable

Lebesgue integral of $L^{p}$ is complete but Riemann isn't

L^{p}

is:

complete under the Lebesgue integral, this result is may be called the Riesz-Fischer theorem
not complete under the Riemann integral: math.stackexchange.com/questions/397369/space-of-riemann-integrable-functions-not-complete

And then this is why quantum mechanics basically lives in

L^{2}

: not being complete makes no sense physically, it would mean that you can get closer and closer to states that don't exist!

TODO intuition

Riesz-Fischer theorem

A measurable function defined on a closed interval is square integrable (and therefore in

L^{2}

) if and only if Fourier series converges in

L^{2}

norm the function:

lim_{N \to \infty} ∥ S_{N} f - f ∥_{2} = 0

$L^{p}$ is complete

TODO

Fourier basis is complete for $L^{2}$

math.stackexchange.com/questions/316235/proving-that-the-fourier-basis-is-complete-for-cr-2-pi-c-with-l2-norm

Riesz-Fischer theorem is a norm version of it, and Carleson's theorem is stronger pointwise almost everywhere version.

Note that the Riesz-Fischer theorem is weaker because the pointwise limit could not exist just according to it:

L^{p}

norm sequence convergence does not imply pointwise convergence.

$L^{p}$ norm sequence convergence does not imply pointwise convergence

math.stackexchange.com/questions/138043/does-convergence-in-lp-imply-convergence-almost-everywhere

There are explicit examples of this. We can have ever thinner disturbances to convergence that keep getting less and less area, but never cease to move around.

If it does converge pointwise to something, then it must match of course.

Carleson's theorem

The Fourier series of an

L^{2}

function (i.e. the function generated from the infinite sum of weighted sines) converges to the function pointwise almost everywhere.

The theorem also seems to hold (maybe trivially given the transform result) for the Fourier series (TODO if trivially, why trivially).

Only proved in 1966, and known to be a hard result without any known simple proof.

This theorem of course implies that Fourier basis is complete for

L^{2}

, as it explicitly constructs a decomposition into the Fourier basis for every single function.

TODO vs Riesz-Fischer theorem. Is this just a stronger pointwise result, while Riesz-Fischer is about norms only?

One of the many fourier inversion theorems.

Lp space ( $L^{p}$ )

is complete but Riemann isn't

Integrable functions to the power

p

, usually and in this text assumed under the Lebesgue integral because: Lebesgue integral of

L^{p}

$L^{1}$

$L^{2}$

L^{p}

for

p == 2

L^{2}

is by far the most important of

L^{p}

because it is quantum mechanics states live, because the total probability of being in any state has to be 1!

L^{2}

has some crucially important properties that other

L^{p}

don't (TODO confirm and make those more precise):

it is the only $L^{p}$ that is Hilbert space because it is the only one where an inner product compatible with the metric can be defined:
- math.stackexchange.com/questions/2005632/l2-is-the-only-hilbert-space-parallelogram-law-and-particular-ft-gt
- www.quora.com/Why-is-L2-a-Hilbert-space-but-not-Lp-or-higher-where-p-2
Fourier basis is complete for $L^{2}$ , which is great for solving differential equation

Plancherel theorem

Some sources say that this is just the part that says that the norm of a

L^{2}

function is the same as the norm of its Fourier transform.

Others say that this theorem actually says that the Fourier transform is bijective.

The comment at math.stackexchange.com/questions/446870/bijectiveness-injectiveness-and-surjectiveness-of-fourier-transformation-define/1235725#1235725 may be of interest, it says that the bijection statement is an easy consequence from the norm one, thus the confusion.

TODO does it require it to be in

L^{1}

as well? Wikipedia en.wikipedia.org/w/index.php?title=Plancherel_theorem&oldid=987110841 says yes, but courses.maths.ox.ac.uk/node/view_material/53981 does not mention it.

The Fourier transform is a bijection in $L^{2}$

As mentioned at Section "Plancherel theorem", some people call this part of Plancherel theorem, while others say it is just a corollary.

This is an important fact in quantum mechanics, since it is because of this that it makes sense to talk about position and momentum space as two dual representations of the wave function that contain the exact same amount of information.

Every Riemann integrable function is Lebesgue integrable

But only for the proper Riemann integral: math.stackexchange.com/questions/2293902/functions-that-are-riemann-integrable-but-not-lebesgue-integrable

Measure theory

Main motivation: Lebesgue integral.

The Bright Side Of Mathematics 2019 playlist: www.youtube.com/watch?v=xZ69KEg7ccU&list=PLBh2i93oe2qvMVqAzsX1Kuv6-4fjazZ8j

The key idea, is that we can't define a measure for the power set of R. Rather, we must select a large measurable subset, and the Borel sigma algebra is a good choice that matches intuitions.

Fourier series

Approximates an original function by sines. If the function is "well behaved enough", the approximation is to arbitrary precision.

Fourier's original motivation, and a key application, is solving partial differential equations with the Fourier series.

Can only be used to approximate for periodic functions (obviously from its definition!). The Fourier transform however overcomes that restriction:

The Fourier series behaves really nicely in

L^{2}

, where it always exists and converges pointwise to the function: Carleson's theorem.

But what is a Fourier series? by 3Blue1Brown (2019)

Source. Amazing 2D visualization of the decomposition of complex functions.

Applications of the Fourier series

Solving partial differential equations with the Fourier series

See: math.stackexchange.com/questions/579453/real-world-application-of-fourier-series/3729366#3729366 from heat equation solution with Fourier series.

Separation of variables of certain equations like the heat equation and wave equation are solved immediately by calculating the Fourier series of initial conditions!

Other basis besides the Fourier series show up for other equations, e.g.:

Discrete Fourier transform (DFT)

Input: a sequence of

N

complex numbers

x_{k}

Output: another sequence of

N

complex numbers

X_{k}

such that:

x_{n} = \frac{1}{N} \sum_{k = 0}^{N - 1} X_{k} e^{i 2 π \frac{kn}{N}}

Intuitively, this means that we are braking up the complex signal into

N

sinusoidal frequencies:

$X_{0}$ : is kind of magic and ends up being a constant added to the signal because $e^{i 2 π \frac{kn}{N}} = e^{0} = 1$
$X_{1}$ : sinusoidal that completes one cycle over the signal. The larger the $N$ , the larger the resolution of that sinusoidal. But it completes one cycle regardless.
$X_{2}$ : sinusoidal that completes two cycles over the signal
...
$X_{N - 1}$ : sinusoidal that completes $N - 1$ cycles over the signal

and is the amplitude of each sine.

We use Zero-based numbering in our definitions because it just makes every formula simpler.

Motivation: similar to the Fourier transform:

compression: a sine would use N points in the time domain, but in the frequency domain just one, so we can throw the rest away. A sum of two sines, only two. So if your signal has periodicity, in general you can compress it with the transform
noise removal: many systems add noise only at certain frequencies, which are hopefully different from the main frequencies of the actual signal. By doing the transform, we can remove those frequencies to attain a better signal-to-noise

In particular, the discrete Fourier transform is used in signal processing after a analog-to-digital converter. Digital signal processing historically likely grew more and more over analog processing as digital processors got faster and faster as it gives more flexibility in algorithm design.

Sample software implementations:

numpy.fft, notably see the example: numpy/fft.py

Figure 1.
DFT of $2 sin (t) + cos (4 t)$ with 25 points
. This is a simple example of a discrete Fourier transform for a real input signal. It illustrates how the DFT takes N complex numbers as input, and produces N complex numbers as output. It also illustrates how the discrete Fourier transform of a real signal is symmetric around the center point.

Discrete Fourier transform of a real signal

See sections: "Example 1 - N even", "Example 2 - N odd" and "Representation in terms of sines and cosines" of www.statlect.com/matrix-algebra/discrete-Fourier-transform-of-a-real-signal

The transform still has complex numbers.

Summary:

$X_{0}$ is real
$X_{1} = \overset{ˉ}{X_{N - 1}}$
$X_{2} = \overset{ˉ}{X_{N - 2}}$
$X_{k} = \overset{ˉ}{X_{N - k}}$

Therefore, we only need about half of

X_{k}

to represent the signal, as the other half can be derived by conjugation.

"Representation in terms of sines and cosines" from www.statlect.com/matrix-algebra/discrete-Fourier-transform-of-a-real-signal then gives explicit formulas in terms of

X_{k}

NumPy for example has "Real FFTs" for this: numpy.org/doc/1.24/reference/routines.fft.html#real-ffts

Figure 1.
DFT of $2 sin (t) + cos (4 t)$ with 25 points
. Source at: numpy/fft_plot.py. This plot illustrates how the DFT of a real signal is symmetric around the middle point, and so only half of the transform points are needed to reconstruct the original signal. We also see how the phase of the sinusoids determines if their DFT components are real or imaginary.

Normalized DFT

There are actually two possible definitions for the DFT:

1/N, given as "the default" in many sources:
$x_{n} = \frac{1}{N} \sum_{k = 0}^{N - 1} X_{k} e^{i 2 π \frac{kn}{N}}$
(1)
$1/ N$ , known as the "normalized DFT" by some sources: www.dsprelated.com/freebooks/mdft/Normalized_DFT.html, definition which we adopt:
$x_{n} = \frac{1}{N} \sum_{k = 0}^{N - 1} X_{k} e^{i 2 π \frac{kn}{N}}$
(2)

The

1/ N

is nicer mathematically as the inverse becomse more symmetric, and power is conserved between time and frequency domains.

Fast Fourier transform

An efficient algorithm to calculate the discrete Fourier transform.

Fourier transform

Continuous version of the Fourier series.

Can be used to represent functions that are not periodic: math.stackexchange.com/questions/221137/what-is-the-difference-between-fourier-series-and-fourier-transformation while the Fourier series is only for periodic functions.

Of course, every function defined on a finite line segment (i.e. a compact space).

Therefore, the Fourier transform can be seen as a generalization of the Fourier series that can also decompose functions defined on the entire real line.

As a more concrete example, just like the Fourier series is how you solve the heat equation on a line segment with Dirichlet boundary conditions as shown at: Section "Solving partial differential equations with the Fourier series", the Fourier transform is what you need to solve the problem when the domain is the entire real line.

Multidimensional Fourier transform

Lecture notes:

www.robots.ox.ac.uk/~az/lectures/ia/lect2.pdf Lecture 2: 2D Fourier transforms and applications by A. Zisserman (2014)

How the 2D FFT works by Mike X Cohen (2017)

Source. Animations showing how the 2D Fourier transform looks like for simple inpuf functions.

Fourier inversion theorem

A set of theorems that prove under different conditions that the Fourier transform has an inverse for a given space, examples:

Carleson's theorem for $L^{2}$

Laplace transform

The Laplace Transform: A Generalized Fourier Transform by Steve Brunton (2020)

Source. Explains how the Laplace transform works for functions that do not go to zero on infinity, which is a requirement for the Fourier transform. No applications in that video yet unfortunately.

History of the Fourier series

First published by Fourier in 1807 to solve the heat equation.

Topology

Topology is the plumbing of calculus.

The key concept of topology is a neighbourhood.

Just by havin the notion of neighbourhood, concepts such as limit and continuity can be defined without the need to specify a precise numerical value to the distance between two points with a metric.

As an example. consider the orthogonal group, which is also naturally a topological space. That group does not usually have a notion of distance defined for it by default. However, we can still talk about certain properties of it, e.g. that the orthogonal group is compact, and that the orthogonal group has two connected components.

Covering space

Basically it is a larger space such that there exists a surjection from the large space onto the smaller space, while still being compatible with the topology of the small space.

We can characterize the cover by how injective the function is. E.g. if two elements of the large space map to each element of the small space, then we have a double cover and so on.

Double cover

Neighbourhood (mathematics)

The key concept of topology.

Topological space

Manifold

We map each point and a small enough neighbourhood of it to

R^{n}

, so we can talk about the manifold points in terms of coordinates.

Does not require any further structure besides a consistent topological map. Notably, does not require metric nor an addition operation to make a vector space.

Manifolds are cool. Especially differentiable manifolds which we can do calculus on.

A notable example of a Non-Euclidean geometry manifold is the space of generalized coordinates of a Lagrangian. For example, in a problem such as the double pendulum, some of those generalized coordinates could be angles, which wrap around and thus are not euclidean.

Atlas (topology)

Collection of coordinate charts.

The key element in the definition of a manifold.

Coordinate chart

Covariant derivative

A generalized definition of derivative that works on manifolds.

TODO: how does it maintain a single value even across different coordinate charts?

Differentiable manifold

TODO find a concrete numerical example of doing calculus on a differentiable manifold and visualizing it. Likely start with a boring circle. That would be sweet...

Tangent space

TODO what's the point of it.

Bibliography:

www.youtube.com/watch?v=j1PAxNKB_Zc Manifolds #6 - Tangent Space (Detail) by WHYB maths (2020). This is worth looking into.
- www.youtube.com/watch?v=oxB4aH8h5j4 actually gives a more concrete example. Basically, the vectors are defined by saying "we are doing the Directional derivative of any function along this direction".
  One thing to remember is that of course, the most convenient way to define a function $f$ and to specify a direction, is by using one of the coordinate charts.
  We can then just switch between charts by change of basis.
jakobschwichtenberg.com/lie-algebra-able-describe-group/ by Jakob Schwichtenberg
math.stackexchange.com/questions/1388144/what-exactly-is-a-tangent-vector/2714944 What exactly is a tangent vector? on Stack Exchange

Tangent vector to a manifold

A member of a tangent space.

One-form

www.youtube.com/watch?v=tq7sb3toTww&list=PLxBAVPVHJPcrNrcEBKbqC_ykiVqfxZgNl&index=19 mentions that it is a bit like a dot product but for a tangent vector to a manifold: it measures how much that vector derives along a given direction.

Metric (mathematics, $d (x, y)$ )

A metric is a function that give the distance, i.e. a real number, between any two elements of a space.

A metric may be induced from a norm as shown at: Section "Metric induced by a norm".

Because a norm can be induced by an inner product, and the inner product given by the matrix representation of a positive definite symmetric bilinear form, in simple cases metrics can also be represented by a matrix.

Metric space

Canonical example: Euclidean space.

Euclidean space

Metric space vs normed vector space vs inner product space

TODO examples:

metric space that is not a normed vector space
norm vs metric: a norm gives size of one element. A metric is the distance between two elements. Given a norm in a space with subtraction, we can obtain a distance function: the metric induced by a norm.

Figure 1.
Hierarchy of topological, metric, normed and inner product spaces
. Source.

Complete metric space

is complete but Riemann isn't.

In plain English: the space has no visible holes. If you start walking less and less on each step, you always converge to something that also falls in the space.

One notable example where completeness matters: Lebesgue integral of

L^{p}

Normed vector space

Inner product space

Subcase of a normed vector space, therefore also necessarily a vector space.

Inner product

Appears to be analogous to the dot product, but also defined for infinite dimensions.

Norm (mathematics, $∣ x ∣$ )

Vs metric:

a norm is the size of one element. A metric is the distance between two elements.
a norm is only defined on a vector space. A metric could be defined on something that is not a vector space. Most basic examples however are also vector spaces.

Norm induced by an inner product

An inner product

x \cdot y

induces a norm with:

∣ x ∣ = < x, x >

Norm induced by the complex dot product

Metric induced by a norm

In a vector space, a metric may be induced from a norm by using subtraction:

d (x, y) = ∣ x - y ∣

Pseudometric space

Metric space but where the distance between two distinct points can be zero.

Notable example: Minkowski space.

Minkowski space

Compact space

Dense set

Connected space

Connected component

When a disconnected space is made up of several smaller connected spaces, then each smaller component is called a "connected component" of the larger space.

See for example the

Simply connected space

Loop (topology)

Homotopy

Generalized Poincaré conjecture

There are two cases:

(topological) manifolds
differential manifolds

Questions: are all compact manifolds / differential manifolds homotopic / diffeomorphic to the sphere in that dimension?

for topological manifolds: this is a generalization of the Poincaré conjecture.
Original problem posed, $n = 3$ for topological manifolds.
Millennium Prize Problems.
Last to be proven, only the 4-differential manifold case missing as of 2013.
Even the truth for all $n > 4$ was proven in the 60's!
Why is low dimension harder than high dimension?? Surprise!
AKA: classification of compact 3-manifolds. The result turned out to be even simpler than compact 2-manifolds: there is only one, and it is equal to the 3-sphere.
For dimension two, we know there are infinitely many: classification of closed surfaces
for differential manifolds:
Not true in general. First counter example is $n = 7$ . Surprise: what is special about the number 7!?
Counter examples are called exotic spheres.
Totally unpredictable count table:
Dimension Smooth types
1 1
2 1
3 1
4 ?
5 1
6 1
7 28
8 2
9 8
10 6
11 992
12 1
13 3
14 2
15 16256
16 2
17 16
18 16
19 523264
20 24
$n = 4$ is an open problem, there could even be infinitely many. Again, why are things more complicated in lower dimensions??

Dimension	Smooth types
1	1
2	1
3	1
4	?
5	1
6	1
7	28
8	2
9	8
10	6
11	992
12	1
13	3
14	2
15	16256
16	2
17	16
18	16
19	523264
20	24

Exotic sphere

Poincaré conjecture

Classification of closed surfaces