Certain feelings in my body lead me to believel that I have to stduy Fourier Series and Fourier Transform for a better understanding of probability theory, measure theory,entroy and information theory.
For a periodic function f(x) with period 1 (you can normalize the period to 1 by scaling the x-axis), we can write it as a Fourier Series:
f(x)=n=−∞∑∞cne2πinx
where cn is the Fourier coefficient, which is the projection of f(x) onto the basis e2πinx:
cn=⟨f(x),e2πinx⟩=∫01f(x)e−2πinxdx
I like using the inner product notation to represent the Fourier coefficient, because it is more intuitive to me. The inner product is the projection of f(x) onto the basis e2πinx, which is the same as the Fourier coefficient.
Before we moving on, let’s discuss the properties of the Fourier coefficient cn:
cn is a complex number, which can be written as cn=an+ibn, where an and bn are real numbers.
when f(x) is a real function, cn=c−n, where c−n is the complex conjugate of c−n. This is because the basis e2πinx is a complex number, and the inner product of two complex numbers is a complex number. When f(x) is a real function, the inner product of f(x) and e2πinx is a real number, which is the same as the inner product of f(x) and e−2πinx.
c0 is the average of f(x), which is a real number:c0=∫01f(x)dx
when f(x) is even, cn is also even, which means cn=c−n. cn=∫01f(x)e−2πinxdx=−∫0−1f(−s)e2πinsdslet s=−x=∫−10f(−s)e−2πin(−s)ds=c−s=c−n
when f(x) is odd, cn is odd, which means cn=−c−n.
when f(x) is real and even, cn is real and even, which means cn=c−n.
when f(x) is real and odd, cn is imaginary and odd, which means cn=−c−n.
cn=−c−nwhen f(x) is oddcn=c−nwhen f(x) is real⟶−c−n=c−nonly possible when it is pure imaginary
Those properties are very useful when we calculate the Fourier coefficient as they could help us to verify the correctness of our calculation. Since most of signals are real, we can use those properties in practice.
Two Examples
Conside a square wave of period 1, which is defined as:
Notice, f(x) is an odd function, so cn is imaginary and odd, which means cn=−c−n. Notice that
1−e−πin=1−cos(πn)−isin(πn)=1−(−1)n−isin(πn)={02n is evenn is odd
So the series can be simplified as:
f(x)=n=−∞,n=0∑∞cne2πinx=n is odd∑πin2e2πinx
Reflections
We have shown that when the function is real and odd, the fourier coefficients
are pure imaginary and odd.
Now, we combine the positive and negative terms together:
e2πinx−e−2πinx=2isin(2πnx)
let n=2k+1, we have:
f(x)=n is odd∑πin2e2πinx=−∞∑∞πi(2k+1)2e2πi(2k+1)x=−∞∑∞πi(2k+1)2(e2πi(2k+1)x−e−2πi(2k+1)x)=k=0∑∞π(2k+1)4sin[2π(2k+1)x]=π4k=0∑∞2k+11sin[2π(2k+1)x]
Here is the visualization of the Fourier Series of the square wave (when N=100, you can click the right bottom corner to see the animation):
From the above example,we can see that the fourier series is ‘converging’ to the square wave. The more terms we add, the more similar it is to the square wave. However, we
also see discontinuity at the jump points. This is called Gibbs phenomenon. Since
both sine and cosine are continuous, the fourier series of a function is also continuous. Therefore the fourier series of a discontinuous function will have discontinuity at the jump points.
Now, let’s see another example - traingle wave - which is defined as:
f(t)=21−∣t∣={21+t21−t−21≤t<00≤t<21
Figure 2. Illustration of triangle wave.
The coefficient of f(t) is at n=0 is the average of f(t), which is 1/4. For n=0, we have:
cn=A(n)+A(−n)=4π2n21[1+eπin(πin−1)]+4π2n21[1+e−πin(−πin−1)]=4π2n21[2+eπin(πin−1)+e−πin(−πin−1)]=4π2n21[2+(cos(πn)+isin(πn))(πin−1)−(cos(πn)−isin(πn))(πin+1)]=4π2n21[2+cos(πn)(πin−1)−cos(πn)(πin+1)]=2π2n21(1−cos(πn))={0π2n21n is evenn is odd
Now, let’s write down the fourier series of f(t):
f(t)=n=−∞∑∞cne2πint=n is odd∑π2n21e2πint=−∞∑0π2n21e2πint+1∑∞π2n21e2πint=c−ne−2πint+cne2πint=cn(e2πint+e−2πint)=π2n22cos(2πnt)=41+k=0∑∞π2(2k+1)21cos[2π(2k+1)t]
For this example, there is no joumping points, so there is no Gibbs phenomenon. The fourier series is converging to the triangle wave. However, since we have infinite terms, the fourier series is not a triangle wave. It is a smooth triangle wave. The fourier series is a smooth approximation of the triangle wave. The more terms we add, the more similar it is to the triangle wave.
This is due to the fact that the fourier series is a linear combination of the basis e2πint. The basis e2πint is a smooth function, so the fourier series is also a smooth function. Or put it in another way, both sines and cosines are differentiable to any order, so the fourier series is also differentiable to any order.
In summary, a discontinuoity in any order derivative of a periodic function will
force an infinite number of terms in the fourier series to approximate the function.
Note also that for the triangle wave the coefficients decrease like 1/n2 while
for the square wave they decrease like 1/n. Or, it takes around N=100 terms to approximate the square wave, but it only takes around N=10 terms to approximate the triangle wave. This has exactly do do wit the fact that the square wave is discontinuous while the triangle wave is continuous but its derivative is discontinuous.
Reflections
I hope those two examples could give you the sense of how the fourier series works and how it converges to the original function in terms of the speed and the smoothness.
Convergence of Fourier Series
Until now, we have assumed that the period is always 1. Now, let’s assume f is
periodic at interval L from [a,b], which means f(x+L)=f(x). We can write the fourier series as:
cn=f^(n)=L1∫abf(x)e−2πinx/Ldx,n∈Z(1)
The N-th partial sum of the fourier series is:
SN(f)(x)=n=−N∑Nf^(n)e2πinx/L(2)
Now, we try to answer the following questions:
Does the fourier series converge to f(x)?
In what sense does SN(f)(x) converge to f(x) as N→∞?
Roughly speaking, there are three senses of convergence:
Pointwise Convergence: SN(f)(x) converges to f(x) for every x.
Uniform Convergence: SN(f)(x) converges to f(x) uniformly. In words, when N is large, the partial sum SN(f)(x) is close to f(x) for every x over the
entire interval [a,b].
Mean Square Convergence: SN(f)(x) converges to f(x) in the mean square sense. In words, the average of the square of the difference between SN(f)(x) and f(x) converges to 0 as N→∞, meaning:
N→∞lim∫ab∣SN(f)(x)−f(x)∣2dx=0(3)
We will not prove the convergence of the fourier series here. We refer the readers to the two examples we have shown above. The square wave is discontinuous, so the fourier series converges to the square wave in the mean square sense. The triangle wave is continuous, so the fourier series converges to the triangle wave uniformly. Generally speaking, uniform convergence is the strongest form of convergence. Pointwise convergence is the weakest form of convergence. Mean square convergence is in between, which is also very subtle to study.
Dirichlet Kernel
After introducing the partial sum, it is natural to ask how good is the partial sum SN(f)(x) in approximating f(x). The answer is given by the Dirichlet kernel. Now, let’s examine the partial sum SN(f)(x) (to simplify the notation, we assume L=1):
We will not discuss the derivation of the Dirichelt kernel here. We will learn more about the Dirichlet kernel in the future when we talk about the convolution.
Orthogonality of the basis
In the previous post, we have show that
en(t)=e2πint
is an orthogonal basis. From this, we could derive Pythagoras’s Theorem for the inner product:
⟨f,g⟩=∫abf(x)g(x)dx
For our basis, we have:
⟨en,em⟩=∫abe2πinte2πimtdt={10n=mn=m
The Pythagoras’s Theorem for the inner product is:
From the above equation, we can see that f(x) is an odd function, so the fourier coefficient is imaginary and odd. Based on Rayleigh’s Identity, we have:
Reflections
I have mentioned that learning Fourer Series will help us to understand probability
theory. Here is an example. The above example is related to the Basel problem and Zeta
function, which are related to Gamma function. When you study probability theory, you will see Gamma function a lot, for example, the Gamma distribution or the Gamma distribution as the conjugate prior of the Poisson distribution, etc.