Well summarized as "the branch of mathematics that deals with limits".
A fancy name for calculus, with the "more advanced" connotation.
The fundamental concept of calculus!
The reason why the epsilon delta definition is so venerated is that it fits directly into well known methods of the formalization of mathematics, making the notion completely precise.
This is a general philosophy that Ciro Santilli, and likely others, observes over and over.
Basically, continuity, or higher order conditions like differentiability, seem to impose greater constraints on problems, which makes them more solvable.
Some good examples of that:
Something that is very not continuous.
Notably studied in discrete mathematics.
Chuck Norris counted to infinity. Twice.
There are a few related concepts that are called infinity in mathematics:
  • limits that are greater than any number
  • the cardinality of a set that does not have a finite number of elements
  • in some number systems, there is an explicit "element at infinity" that is not a limit, e.g. projective geometry
Here's an example of the chain rule. Suppose we want to calculate:
So we have:
and so:
Therefore the final result is:
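As a concrete instance of these steps, here is a minimal sketch that assumes we want the derivative of the hypothetical function $e^{2x}$ (our choice for illustration, with outer function `f(u) = e^u` and inner function `g(x) = 2x`). It compares the chain rule result $f'(g(x)) g'(x) = 2 e^{2x}$ against a finite difference estimate:

```python
import math

def f(u):
    # Outer function (hypothetical choice for this sketch): f(u) = e^u
    return math.exp(u)

def g(x):
    # Inner function (hypothetical choice for this sketch): g(x) = 2x
    return 2.0 * x

def chain_rule_derivative(x):
    # f'(g(x)) * g'(x), using f'(u) = e^u and g'(x) = 2
    return math.exp(g(x)) * 2.0

def numerical_derivative(h, x, eps=1e-6):
    # Central finite difference, used only as an independent check
    return (h(x + eps) - h(x - eps)) / (2.0 * eps)

x0 = 0.5
analytic = chain_rule_derivative(x0)
numeric = numerical_derivative(lambda x: f(g(x)), x0)
print(analytic, numeric)
```

Both numbers should agree to several decimal places, which is a cheap sanity check for any hand-computed chain rule.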
Given a real-valued function $f$, we want to find the points $x$ of the domain of $f$ where the value of $f$ is smaller (for minima, or larger for maxima) than at all other points in some neighbourhood of $x$.
In the case of Functionals, this problem is treated under the theory of the calculus of variations.
Nope, it is not a Greek letter, notably it is not a lowercase delta. It is just some random made up symbol that looks like a letter D. Which is of course derived from delta, which is why it is all so damn confusing.
I think the symbol is usually just read as "D" as in "d f d x" for $\frac{\partial f}{\partial x}$.
This notation is not so common in basic mathematics, but it is so incredibly convenient, especially with Einstein notation as shown at Section "Einstein notation for partial derivatives":
This notation is similar to partial label partial derivative notation, but it uses indices instead of labels such as $x$, $y$, etc.
The total derivative of a function assigns to every point of the domain a linear map with the same domain, which is the best linear approximation to the function value around this point, i.e. the tangent plane.
E.g. in 1D:
and in 2D:
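A small numerical sketch of the 2D case, assuming the hypothetical function $f(x, y) = x^2 + 3xy$ (not from the text above). The total derivative at a point is returned as a linear map on displacements, and the approximation error shrinks quadratically as the displacement shrinks, which is what "best linear approximation" means in practice:

```python
def f(x, y):
    # Hypothetical example function: f(x, y) = x^2 + 3xy
    return x**2 + 3.0 * x * y

def total_derivative(x, y):
    # Returns the linear map (hx, hy) -> df(hx, hy) at the point (x, y),
    # built from the partial derivatives of f computed by hand.
    dfdx = 2.0 * x + 3.0 * y
    dfdy = 3.0 * x
    def linear_map(hx, hy):
        return dfdx * hx + dfdy * hy
    return linear_map

def approximation_error(x, y, hx, hy):
    # Difference between f and its tangent plane approximation
    df = total_derivative(x, y)
    return f(x + hx, y + hy) - (f(x, y) + df(hx, hy))

err_big = approximation_error(1.0, 2.0, 0.01, -0.02)
err_small = approximation_error(1.0, 2.0, 0.001, -0.002)
print(err_big, err_small)
```

Making the displacement 10 times smaller makes the error about 100 times smaller, since only second order terms are left over.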
The easy and less generic integral. The harder one is the Lebesgue integral.
"More complex and general" integral. Matches the Riemann integral for "simple functions", but also works for some "funkier" functions that Riemann does not work for.
Ciro Santilli sometimes wonders how much someone can gain from learning this besides the beauty of mathematics, since we can hand-wave a Lebesgue integral on almost anything that is of practical use. The beauty is good reason enough though.
Advantages over Riemann:
Video 1. Riemann integral vs. Lebesgue integral by The Bright Side Of Mathematics (2018) Source.
youtube.com/watch?v=PGPZ0P1PJfw&t=808 shows how Lebesgue can be visualized as a partition of the function range instead of domain, and then you just have to be able to measure the size of pre-images.
One advantage of that is that the range is always one dimensional.
But the main advantage is that having infinitely many discontinuities does not matter.
Infinitely many discontinuities can make the Riemann partitioning diverge.
But in Lebesgue, you are instead measuring the size of the preimage, and to fit infinitely many discontinuities in a finite domain, the size of this preimage is going to be zero.
So then the question becomes more of "how to define the measure of a subset of the domain".
Which is why we then fall into measure theory!
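The "partition the range instead of the domain" idea can be sketched numerically. This is only an illustration under loud assumptions: the function $x^2$ on $[0, 1]$ is our choice, and the measure of each preimage is estimated by plain sampling of the domain, which a real measure-theoretic definition of course does not do:

```python
def riemann_integral(f, a, b, n=10_000):
    # Partition the DOMAIN into n slices and sum f(midpoint) * width
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def lebesgue_integral(f, a, b, y_min, y_max, bins=1_000, samples=100_000):
    # Partition the RANGE [y_min, y_max] into bins, estimate the measure of
    # the preimage of each bin, then sum (bin lower value) * (preimage measure).
    dy = (y_max - y_min) / bins
    measure = [0.0] * bins
    dx = (b - a) / samples
    for i in range(samples):
        y = f(a + (i + 0.5) * dx)
        j = min(int((y - y_min) / dy), bins - 1)
        measure[j] += dx  # crude sampled estimate of the preimage size
    return sum((y_min + j * dy) * measure[j] for j in range(bins))

square = lambda x: x * x
riemann = riemann_integral(square, 0.0, 1.0)
lebesgue = lebesgue_integral(square, 0.0, 1.0, 0.0, 1.0)
print(riemann, lebesgue)
```

For a tame function like this one both approaches give roughly $1/3$; the Lebesgue-style sum approaches it from below because each bin is weighted by its lower value.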
In "practice" it is likely "useless", because the functions that it can integrate that Riemann can't are just too funky to appear in practice :-)
Its value is much more indirect and subtle, as in "it serves as a solid basis of quantum mechanics" due to the definition of Hilbert spaces.
And then this is why quantum mechanics basically lives in $L^2$: not being complete makes no sense physically, it would mean that you can get closer and closer to states that don't exist!
TODO intuition
A measurable function defined on a closed interval is square integrable (and therefore in $L^2$) if and only if its Fourier series converges to the function in the $L^2$ norm:
$$\lim_{N \to \infty} \lVert S_N f - f \rVert_2 = 0$$
where $S_N f$ is the $N$-th partial sum of the Fourier series of $f$.
The Riesz-Fischer theorem is a norm version of it, and Carleson's theorem is a stronger pointwise almost everywhere version.
Note that the Riesz-Fischer theorem is weaker because, according to it alone, the pointwise limit might not exist: convergence of a sequence in norm does not imply pointwise convergence.
There are explicit examples of this. We can have ever thinner disturbances to convergence that keep getting less and less area, but never cease to move around.
If it does converge pointwise to something, then it must match of course.
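One classic explicit example of norm convergence without pointwise convergence is the so-called "typewriter sequence" of indicator functions on $[0, 1)$. Note that this is a generic sequence of functions and not a Fourier series, so it only illustrates the general phenomenon. The function names below are ours:

```python
import math

def typewriter_interval(n):
    # For n >= 1, write n = 2**k + j with 0 <= j < 2**k.
    # The n-th bump lives on [j / 2**k, (j + 1) / 2**k).
    k = n.bit_length() - 1
    j = n - (1 << k)
    width = 1.0 / (1 << k)
    return j * width, (j + 1) * width

def f(n, x):
    # Indicator function of the n-th interval
    lo, hi = typewriter_interval(n)
    return 1.0 if lo <= x < hi else 0.0

def l2_norm(n):
    # The L2 norm of an indicator is the square root of the interval length
    lo, hi = typewriter_interval(n)
    return math.sqrt(hi - lo)

x0 = 0.3
values = [f(n, x0) for n in range(1, 1025)]
norms = [l2_norm(n) for n in range(1, 1025)]
print(sum(values), norms[-1])
```

The norms go to zero, so the sequence converges to the zero function in norm, but the bump keeps sweeping over `x0` once per level, so the values at `x0` never settle down.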
The Fourier series of an $L^2$ function (i.e. the function generated from the infinite sum of weighted sines) converges to the function pointwise almost everywhere.
The theorem also seems to hold (maybe trivially given the transform result) for the Fourier series (TODO if trivially, why trivially).
Only proved in 1966, and known to be a hard result without any known simple proof.
This theorem of course implies that the Fourier basis is complete for $L^2$, as it explicitly constructs a decomposition into the Fourier basis for every single function.
TODO vs Riesz-Fischer theorem. Is this just a stronger pointwise result, while Riesz-Fischer is about norms only?
Integrable functions to the power $p$, usually and in this text assumed under the Lebesgue integral because: Lebesgue integral of $L^p$ is complete but Riemann isn't
for .
$L^2$ is by far the most important of the $L^p$ spaces because it is where quantum mechanics states live, because the total probability of being in any state has to be 1!
$L^2$ has some crucially important properties that other $L^p$ don't (TODO confirm and make those more precise):
Some sources say that this is just the part that says that the norm of a function is the same as the norm of its Fourier transform.
Others say that this theorem actually says that the Fourier transform is bijective.
The comment at math.stackexchange.com/questions/446870/bijectiveness-injectiveness-and-surjectiveness-of-fourier-transformation-define/1235725#1235725 may be of interest: it says that the bijection statement is an easy consequence of the norm one, thus the confusion.
As mentioned at Section "Plancherel theorem", some people call this part of the Plancherel theorem, while others say it is just a corollary.
This is an important fact in quantum mechanics, since it is because of this that it makes sense to talk about position and momentum space as two dual representations of the wave function that contain the exact same amount of information.
Main motivation: Lebesgue integral.
The key idea is that we can't define a measure on the whole power set of $\mathbb{R}$. Rather, we must select a large collection of measurable subsets, and the Borel sigma algebra is a good choice that matches intuitions.
Approximates an original function by sines. If the function is "well behaved enough", the approximation is to arbitrary precision.
Fourier's original motivation, and a key application, is solving partial differential equations with the Fourier series.
The Fourier series behaves really nicely in $L^2$, where it always exists and converges pointwise almost everywhere to the function: Carleson's theorem.
Video 1. But what is a Fourier series? by 3Blue1Brown (2019) Source. Amazing 2D visualization of the decomposition of complex functions.
Certain equations that admit separation of variables, like the heat equation and the wave equation, are solved immediately by calculating the Fourier series of the initial conditions!
Other bases besides the Fourier series show up for other equations, e.g.:
Input: a sequence of $N$ complex numbers $x_n$.
Output: another sequence of $N$ complex numbers $X_k$ such that:
$$x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k e^{2 \pi i \frac{k n}{N}}$$
Intuitively, this means that we are breaking up the complex signal into $N$ sinusoidal frequencies:
  • $X_0$: is kind of magic and ends up being a constant added to the signal because $e^{2 \pi i \frac{0 n}{N}} = e^0 = 1$
  • $X_1$: sinusoidal that completes one cycle over the signal. The larger the $N$, the larger the resolution of that sinusoidal. But it completes one cycle regardless.
  • $X_2$: sinusoidal that completes two cycles over the signal
  • ...
  • $X_{N-1}$: sinusoidal that completes $N - 1$ cycles over the signal
and $X_k$ is the amplitude of each sine.
We use Zero-based numbering in our definitions because it just makes every formula simpler.
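A direct, unoptimized sketch of the transform and its inverse, with zero-based indices. It assumes the common sign and normalization convention (negative exponent in the forward transform, the $1/N$ factor in the inverse); other conventions exist and only move the factor or the sign around. The names `dft` and `inverse_dft` are ours:

```python
import cmath

def dft(x):
    # Forward transform: X_k = sum_n x_n * exp(-2*pi*i*k*n/N)
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

def inverse_dft(X):
    # Inverse transform: x_n = (1/N) * sum_k X_k * exp(2*pi*i*k*n/N)
    N = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]

signal = [1.0, 2.0, 0.0, -1.0]
spectrum = dft(signal)
recovered = inverse_dft(spectrum)
print(spectrum)
print(recovered)
```

This takes a number of operations proportional to $N^2$, which is fine for toy inputs and makes the definition easy to read.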
Motivation: similar to the Fourier transform:
  • compression: a sine would use N points in the time domain, but in the frequency domain just one, so we can throw the rest away. A sum of two sines, only two. So if your signal has periodicity, in general you can compress it with the transform
  • noise removal: many systems add noise only at certain frequencies, which are hopefully different from the main frequencies of the actual signal. By doing the transform, we can remove those frequencies to attain a better signal-to-noise ratio
In particular, the discrete Fourier transform is used in signal processing after an analog-to-digital converter. Digital signal processing historically likely grew more and more over analog processing as digital processors got faster and faster, as it gives more flexibility in algorithm design.
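Both motivations can be seen on a toy signal. The sketch below assumes a clean sine that completes one cycle over 16 samples, plus a made-up "noise" sine at 5 cycles; both choices are ours. Note that a real sine occupies two bins, $k$ and $N - k$, which are complex conjugates of each other, while a single complex exponential would occupy just one:

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

def inverse_dft(X):
    N = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]

N = 16
clean = [math.sin(2 * math.pi * n / N) for n in range(N)]
noise = [0.3 * math.sin(2 * math.pi * 5 * n / N) for n in range(N)]
noisy = [c + d for c, d in zip(clean, noise)]

spectrum = dft(noisy)
# Compression: only a handful of bins are non-negligible
significant = [k for k in range(N) if abs(spectrum[k]) > 1e-9]

# Noise removal: zero the bins of the unwanted frequency and transform back
filtered = list(spectrum)
for k in (5, N - 5):
    filtered[k] = 0
denoised = [v.real for v in inverse_dft(filtered)]
print(significant)
```

Sixteen time samples are described by four frequency bins here, and dropping the two noise bins gives back the clean sine up to rounding.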
Sample software implementations:
See sections: "Example 1 - N even", "Example 2 - N odd" and "Representation in terms of sines and cosines" of www.statlect.com/matrix-algebra/discrete-Fourier-transform-of-a-real-signal
The transform still has complex numbers.
  • $X_0$ is real
Therefore, we only need about half of the $X_k$ to represent the signal, as the other half can be derived by conjugation: $X_{N-k} = \overline{X_k}$.
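A quick check of the conjugate symmetry on an arbitrary real signal (the sample values are made up). Only the first `N // 2 + 1` coefficients are kept, and the rest are rebuilt by conjugation:

```python
import cmath

def dft(x):
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

real_signal = [0.5, -1.0, 2.0, 4.0, 1.5]
X = dft(real_signal)
N = len(X)

# Keep only the first half (plus the middle term when N is even)
half = X[: N // 2 + 1]

# Rebuild the dropped coefficients with X[k] = conj(X[N - k])
rebuilt = half + [half[N - k].conjugate() for k in range(N // 2 + 1, N)]
print(X[0], rebuilt)
```

The same indexing works for even `N`, where the middle coefficient is its own conjugate partner and therefore real.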
"Representation in terms of sines and cosines" from www.statlect.com/matrix-algebra/discrete-Fourier-transform-of-a-real-signal then gives explicit formulas in terms of .
An efficient algorithm to calculate the discrete Fourier transform.
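The text does not say which variant is meant, so as an illustration here is a minimal sketch of one well known one, the recursive radix-2 Cooley-Tukey algorithm. It assumes the input length is a power of two, and is checked against the direct $N^2$ definition:

```python
import cmath

def fft(x):
    # Recursive radix-2 Cooley-Tukey: split into even and odd indexed
    # samples, transform each half, then combine with twiddle factors.
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * N
    for k in range(N // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle
        out[k + N // 2] = even[k] - twiddle
    return out

def naive_dft(x):
    # Direct definition, used only to check the fast version
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

samples = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, -2.0]
fast = fft(samples)
slow = naive_dft(samples)
print(fast)
```

The recursion does about $N \log_2 N$ work instead of $N^2$, which is where the speedup comes from.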
Continuous version of the Fourier series.
Can be used to represent functions that are not periodic, while the Fourier series is only for periodic functions: math.stackexchange.com/questions/221137/what-is-the-difference-between-fourier-series-and-fourier-transformation
Of course, every function defined on a finite line segment (i.e. a compact space) can be extended into a periodic function, so the Fourier series also handles that case.
Therefore, the Fourier transform can be seen as a generalization of the Fourier series that can also decompose functions defined on the entire real line.
As a more concrete example, just like the Fourier series is how you solve the heat equation on a line segment with Dirichlet boundary conditions as shown at: Section "Solving partial differential equations with the Fourier series", the Fourier transform is what you need to solve the problem when the domain is the entire real line.
Lecture notes:
Video 1. How the 2D FFT works by Mike X Cohen (2017) Source. Animations showing what the 2D Fourier transform looks like for simple input functions.
A set of theorems that prove under different conditions that the Fourier transform has an inverse for a given space, examples:
Video 1. The Laplace Transform: A Generalized Fourier Transform by Steve Brunton (2020) Source. Explains how the Laplace transform works for functions that do not go to zero at infinity, which is a requirement for the Fourier transform. No applications in that video yet unfortunately.
First published by Fourier in 1807 to solve the heat equation.
Topology is the plumbing of calculus.
The key concept of topology is a neighbourhood.
Just by having the notion of neighbourhood, concepts such as limit and continuity can be defined without the need to specify a precise numerical value for the distance between two points with a metric.
As an example, consider the orthogonal group, which is also naturally a topological space. That group does not usually have a notion of distance defined for it by default. However, we can still talk about certain properties of it, e.g. that the orthogonal group is compact, and that the orthogonal group has two connected components.
Basically it is a larger space such that there exists a surjection from the large space onto the smaller space, while still being compatible with the topology of the small space.
We can characterize the cover by how injective the function is. E.g. if two elements of the large space map to each element of the small space, then we have a double cover and so on.
The key concept of topology.
We map each point and a small enough neighbourhood of it to $\mathbb{R}^n$, so we can talk about the manifold points in terms of coordinates.
Does not require any further structure besides a consistent topological map. Notably, does not require a metric nor an addition operation to make a vector space.
Manifolds are cool. Especially differentiable manifolds which we can do calculus on.
A notable example of a Non-Euclidean geometry manifold is the space of generalized coordinates of a Lagrangian. For example, in a problem such as the double pendulum, some of those generalized coordinates could be angles, which wrap around and thus are not euclidean.
Collection of coordinate charts.
The key element in the definition of a manifold.
A generalized definition of derivative that works on manifolds.
TODO: how does it maintain a single value even across different coordinate charts?
TODO find a concrete numerical example of doing calculus on a differentiable manifold and visualizing it. Likely start with a boring circle. That would be sweet...
TODO what's the point of it.
A member of a tangent space.
www.youtube.com/watch?v=tq7sb3toTww&list=PLxBAVPVHJPcrNrcEBKbqC_ykiVqfxZgNl&index=19 mentions that it is a bit like a dot product but for a tangent vector to a manifold: it measures how much that vector derives along a given direction.
A metric is a function that gives the distance, i.e. a real number, between any two elements of a space.
A metric may be induced from a norm as shown at: Section "Metric induced by a norm".
Because a norm can be induced by an inner product, and the inner product can be given by the matrix representation of a positive definite symmetric bilinear form, in simple cases metrics can also be represented by a matrix.
Canonical example: Euclidean space.
TODO examples:
Figure 1. Hierarchy of topological, metric, normed and inner product spaces. Source.
In plain English: the space has no visible holes. If you start walking less and less on each step, you always converge to something that also falls in the space.
One notable example where completeness matters: Lebesgue integral of $L^p$ is complete but Riemann isn't.
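A simpler space with "holes" than the function spaces above is the rational numbers. The sketch below is our own illustration, not taken from the text: it runs Newton's iteration for $\sqrt{2}$ with exact fractions. Every step stays rational and the steps get smaller and smaller, yet no rational number squares to exactly 2, so the point being approached is not in the space:

```python
from fractions import Fraction

def newton_sqrt2_steps(count):
    # Newton's iteration x -> (x + 2/x) / 2, done in exact rational arithmetic
    x = Fraction(1)
    steps = [x]
    for _ in range(count):
        x = (x + 2 / x) / 2
        steps.append(x)
    return steps

steps = newton_sqrt2_steps(6)

# Size of each step taken: shrinks quickly
gaps = [abs(b - a) for a, b in zip(steps, steps[1:])]

# How far each x*x is from 2: gets tiny but is never exactly zero
residuals = [abs(x * x - 2) for x in steps]
print(steps[3], float(gaps[-1]), float(residuals[-1]))
```

In the real numbers, which are complete, the same sequence does have a limit inside the space.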
Subcase of a normed vector space, therefore also necessarily a vector space.
Appears to be analogous to the dot product, but also defined for infinite dimensions.
Vs metric:
  • a norm is the size of one element. A metric is the distance between two elements.
  • a norm is only defined on a vector space. A metric could be defined on something that is not a vector space. Most basic examples however are also vector spaces.
An inner product induces a norm with:
$$\lVert x \rVert = \sqrt{\langle x, x \rangle}$$
In a vector space, a metric may be induced from a norm by using subtraction:
$$d(x, y) = \lVert x - y \rVert$$
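The chain from inner product to norm to metric can be sketched in a few lines, here assuming the usual dot product on real coordinate vectors as the inner product (the function names are ours):

```python
import math

def inner(u, v):
    # Standard dot product on real coordinate vectors
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    # Norm induced by the inner product: sqrt(<u, u>)
    return math.sqrt(inner(u, u))

def metric(u, v):
    # Metric induced by the norm: the norm of the difference
    return norm([a - b for a, b in zip(u, v)])

print(norm([3.0, 4.0]))
print(metric([1.0, 1.0], [4.0, 5.0]))
```

Swapping `inner` for another positive definite symmetric bilinear form changes the induced norm and metric without touching the other two functions.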
Metric space but where the distance between two distinct points can be zero.
Notable example: Minkowski space.
When a disconnected space is made up of several smaller connected spaces, then each smaller component is called a "connected component" of the larger space.
See for example the
There are two cases:
  • (topological) manifolds
  • differential manifolds
Questions: are all compact manifolds / differential manifolds homotopic / diffeomorphic to the sphere in that dimension?
  • for topological manifolds: this is a generalization of the Poincaré conjecture.
    Original problem posed, for topological manifolds.
    Last to be proven, only the 4-differential manifold case missing as of 2013.
    Even the truth for all $n \geq 5$ was proven in the 60's!
    Why is low dimension harder than high dimension?? Surprise!
    AKA: classification of compact 3-manifolds. The result turned out to be even simpler than compact 2-manifolds: there is only one, and it is equal to the 3-sphere.
    For dimension two, we know there are infinitely many: classification of closed surfaces
  • for differential manifolds:
    Not true in general. First counterexample is $S^7$. Surprise: what is special about the number 7!?
    Counter examples are called exotic spheres.
    Totally unpredictable count table:
    Dimension    | 1 | 2 | 3 | 4 | 5 | 6 | 7  | 8 | 9 | 10 | 11  | 12 | 13 | 14 | 15    | 16 | 17 | 18 | 19     | 20 |
    Smooth types | 1 | 1 | 1 | ? | 1 | 1 | 28 | 2 | 8 | 6  | 992 | 1  | 3  | 2  | 16256 | 2  | 16 | 16 | 523264 | 24 |
    Dimension 4 is an open problem, there could even be infinitely many. Again, why are things more complicated in lower dimensions??
So simple!! You can either:
  • cut two holes and glue a handle. This is easy to visualize as it can be embedded in $\mathbb{R}^3$: you just get a torus, then a double torus, and so on
  • cut a single hole and glue a Möbius strip in it. Keep in mind that this is possible because the Möbius strip has a single boundary just like the hole you just cut. This leads to another infinite family that starts with:
A handle cancels out a Möbius strip, so adding one of each does not lead to a new object.
You can glue a Möbius strip into a single hole in dimension larger than 3! One strip gives you the real projective plane, and two give you a Klein bottle!
Intuitively speaking, they can be seen as the smooth surfaces in N-dimensional space (called an embedding), such that deforming them is allowed. 4 dimensions are enough to embed all the cases: 3 is not enough because of the Klein bottle and family.