# Ansatz to Gram-Schmidt Orthonormalization

The Gram–Schmidt process is a method for orthonormalising a set of vectors in an inner product space, and an easy way to remember it is through an ansatz:

Let $|v_{1}> , |v_{2}> , \hdots , |v_{n}>$ be a set of normalized basis vectors that we would also like to make orthogonal. We will call the orthonormalized set of basis vectors formed out of them $|v_{1}^{'}> , |v_{2}^{'}> , \hdots , |v_{n}^{'}>$.

$|v_{1}^{'} > = |v_{1}>$

Now we construct a second vector $|v_{2}^{'}>$ out of $|v_{1}^{'}>$ and $|v_{2}>$:

$|v_{2}^{'} > = |v_{2}> - \lambda |v_{1}^{'}>$

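But what must be true of $|v_{2}^{'} >$ is that $|v_{1}^{'}>$ and $|v_{2}^{'}>$ must be orthogonal, i.e. $\langle v_{1}^{'} | v_{2}^{'} \rangle = 0$. Taking the inner product of the ansatz with $|v_{1}^{'}>$:

$\langle v_{1}^{'} | v_{2}^{'} \rangle = \langle v_{1}^{'} | v_{2} \rangle - \lambda \langle v_{1}^{'} | v_{1}^{'} \rangle$

Since $|v_{1}^{'}>$ is normalized, $\langle v_{1}^{'} | v_{1}^{'} \rangle = 1$, and so:

$0 = \langle v_{1}^{'} | v_{2} \rangle - \lambda$

$\lambda = \langle v_{1}^{'} | v_{2} \rangle$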

Therefore we get the following expression for $|v_{2}^{'} >$:

$|v_{2}^{'} > = |v_{2}> - \langle v_{1}^{'} | v_{2} \rangle |v_{1}^{'}>$

which upon normalization looks like so:

$|v_{2}^{'} > = \frac{|v_{2}^{'} >}{\sqrt{\langle v_{2}^{'} | v_{2}^{'} \rangle}}$

That might have seemed trivial geometrically, but this process can be generalized to any n-dimensional inner product space. Let’s continue Gram–Schmidt for the third vector by choosing $|v_{3}^{'} >$ of the following form and then generalizing:

$|v_{3}^{'} > = |v_{3}> - \lambda_{1} |v_{1}^{'}> - \lambda_{2} |v_{2}^{'}>$

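Requiring $|v_{3}^{'} >$ to be orthogonal to both $|v_{1}^{'}>$ and $|v_{2}^{'}>$ gives the values of $\lambda_{1}$ and $\lambda_{2}$:

$\lambda_{1} = \langle v_{1}^{'} | v_{3} \rangle$

$\lambda_{2} = \langle v_{2}^{'} | v_{3} \rangle$

Therefore we get,

$|v_{3}^{'} > = |v_{3}> - \langle v_{1}^{'} | v_{3} \rangle |v_{1}^{'}> - \langle v_{2}^{'} | v_{3} \rangle |v_{2}^{'}>$ (or)

$|v_{3}^{'} > = |v_{3}> - \sum\limits_{j=1,2} \langle v_{j}^{'} | v_{3} \rangle |v_{j}^{'}>$

$|v_{3}^{'} > = \frac{|v_{3}^{'} >}{\sqrt{\langle v_{3}^{'} | v_{3}^{'} \rangle}}$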

Generalizing, we obtain:

$|v_{i}^{'} > = |v_{i}> - \sum\limits_{j=1,2,...,i-1} \langle v_{j}^{'} | v_{i} \rangle |v_{j}^{'}>$

$|v_{i}^{'} > = \frac{|v_{i}^{'} >}{\sqrt{\langle v_{i}^{'} | v_{i}^{'} \rangle}}$

Now, you never need to memorize the above expression, because you can derive it on the spot with this procedure; what is essential is understanding where it comes from.
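The procedure above can be sketched in a few lines of Python. This is only a minimal illustration: the helper names `inner` and `gram_schmidt` and the three sample vectors are my own choices, and the sketch normalizes every vector as it goes, so the inputs do not need to be normalized beforehand.

```python
import math

def inner(u, v):
    """Inner product of u and v, conjugating the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalise linearly independent vectors using the ansatz:
    subtract lambda_j times each earlier primed vector, then normalise."""
    ortho = []
    for v in vectors:
        w = [complex(x) for x in v]
        for q in ortho:
            lam = inner(q, v)  # lambda_j = inner(v_j', v_i)
            w = [wk - lam * qk for wk, qk in zip(w, q)]
        norm = math.sqrt(inner(w, w).real)
        ortho.append([wk / norm for wk in w])
    return ortho

# Sample (hypothetical) input: three linearly independent vectors in 3D.
basis = gram_schmidt([[1, 0, 0], [1, 1, 0], [1, 1, 1]])
for b in basis:
    print(b)
```

Every pair of vectors in `basis` should have inner product 0, and every vector should have inner product 1 with itself, which is exactly the orthonormality condition the ansatz was built to satisfy.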

Cheers!

# nth roots of unity : A geometric approach

When dealing with complex numbers, it is often useful to think of them as transformations. The problem at hand is to find the nth roots of unity, i.e. the solutions of

$z^n = 1$

## Multiplication as a Transformation

Multiplication in the complex plane is just rotation and scaling, i.e.

$z_{1} = r_{1}e^{i\theta_{1}}, z_{2} = r_{2}e^{i\theta_{2}}$

$z_{1}z_{2} = \underbrace{r_{1} r_{2}}_{scaling} \underbrace{e^{i(\theta_{1} + \theta_{2})}}_{rotation}$

Now what does finding the n roots of unity mean?

If you start at 1 and perform n equal rotations (because multiplication is nothing but rotation + scaling), you should again end up at 1.

We just need to find the complex numbers that do this, i.e.

$z^n = 1$

$\underbrace{zz \hdots z}_{n} = 1$

$z = re^{i\theta}$

$r^{n}e^{i(\theta + \theta + \hdots + \theta)} = 1e^{2\pi k i}$

$r^{n}e^{in\theta} =1e^{2\pi k i}$

This implies that:

$\theta = \frac{2\pi k}{n}, r = 1$, where $k = 0, 1, \hdots, n-1$ gives the n distinct roots.

And therefore:

$z = e^{\frac{2\pi k i}{n}}$

Take a circle, slice it into n equal parts and voila you have your n roots of unity.
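As a quick numerical check of the formula above, here is a small Python sketch using the standard `cmath` module. The function name `roots_of_unity` and the choice $n = 4$ are just for illustration.

```python
import cmath

def roots_of_unity(n):
    """Return the n complex numbers exp(2*pi*i*k/n) for k = 0, ..., n-1."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

roots = roots_of_unity(4)
for z in roots:
    # Each root has modulus 1, and raising it to the 4th power returns to 1
    # (up to floating-point rounding).
    print(z, abs(z), z ** 4)
```

For $n = 4$ the roots are approximately $1, i, -1, -i$: the circle sliced into four equal parts.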

## Okay, but what does this imply ?

Multiplication by 1 is a $0^\circ$ (or $360^\circ$) rotation.

When you multiply a positive real number (say 1) by 1, you get a number (1) that is on the same positive real axis.

Multiplication by (-1) is a $180^\circ$ rotation.

When you multiply a positive real number (say 1) by -1, you get a number (-1) that is on the negative real axis.

The act of multiplying 1 by (-1) has resulted in a $180^\circ$ rotation. And doing it again gets us back to 1.

Multiplication by $i$ is a $90^\circ$ rotation.

Similarly, multiplying by $i$ takes 1 from the real axis to the imaginary axis, which is a $90^\circ$ rotation.

This applies to $-i$ as well, except that the rotation is by $90^\circ$ in the opposite direction.

That’s about it! – That’s what the nth roots of unity mean geometrically. Have a good one!

# Solving the Laplacian in Spherical Coordinates (#1)

In this post, let’s derive a general solution of Laplace’s equation in Spherical Coordinates. In future posts, we shall look at the application of this equation in the context of Fluids and Quantum Mechanics.

$x = r\sin\theta \cos\phi$
$y = r\sin\theta \sin\phi$
$z = r\cos\theta$

where

$0 \leq r < \infty$
$0 \leq \theta \leq \pi$
$0 \leq \phi < 2\pi$

Laplace’s equation in Spherical coordinates, in its ultimate glory, is written as follows:

$\nabla ^{2}f ={\frac {1}{r^{2}}}{\frac {\partial }{\partial r}}\left(r^{2}{\frac {\partial f}{\partial r}}\right)+{\frac {1}{r^{2}\sin \theta }}{\frac {\partial }{\partial \theta }}\left(\sin \theta {\frac {\partial f}{\partial \theta }}\right)+{\frac {1}{r^{2}\sin ^{2}\theta }}{\frac {\partial ^{2}f}{\partial \phi ^{2}}} = 0$

To solve it we use the method of separation of variables.

$f = R(r)\Theta(\theta)\Phi(\phi)$

Plugging this form of $f$ into the equation, we get:

$\frac{\Theta \Phi}{r^2} \frac{d}{dr} \left( r^2\frac{dR}{dr} \right) + \frac{R \Phi}{r^2 \sin \theta} \frac{d}{d \theta} \left( \sin \theta \frac{d\Theta}{d\theta} \right) + \frac{\Theta R}{r^2 \sin^2 \theta} \frac{d^2 \Phi}{d\phi^2} = 0$

Dividing throughout by $R\Theta\Phi$ and multiplying throughout by $r^2$ simplifies this to:

$\underbrace{ \frac{1}{R} \frac{d}{dr} \left( r^2\frac{dR}{dr} \right)}_{h(r)} + \underbrace{\frac{1}{\Theta \sin \theta} \frac{d}{d \theta} \left( \sin \theta \frac{d\Theta}{d\theta} \right) + \frac{1}{\Phi \sin^2 \theta} \frac{d^2 \Phi}{d\phi^2}}_{g(\theta,\phi)} = 0$

It can be observed that the first expression in the differential equation depends only on $r$ and the remaining terms only on $\theta$ and $\phi$. Since their sum vanishes for every $r$, $\theta$ and $\phi$, each of them must be a constant. Therefore, we set the first expression equal to $\lambda = l(l+1)$ and the second to $-\lambda = -l(l+1)$. The reason for choosing the peculiar value of $l(l+1)$ is explained in another post.

$\underbrace{ \frac{1}{R} \frac{d}{dr} \left( r^2\frac{dR}{dr} \right)}_{l(l+1)} + \underbrace{\frac{1}{\Theta \sin \theta} \frac{d}{d \theta} \left( \sin \theta \frac{d\Theta}{d\theta} \right) + \frac{1}{\Phi \sin^2 \theta} \frac{d^2 \Phi}{d\phi^2}}_{-l(l+1)} = 0$ (1)

The first expression in (1) is the Euler-Cauchy equation in $r$.

$\frac{d}{dr} \left( r^2\frac{dR}{dr} \right) = l(l+1)R$

The general solution of this has been discussed in a previous post and it can be written as:

$R(r) = C_1 r^l + \frac{C_2}{r^{l+1}}$
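For completeness, here is a short sketch of where the two powers of $r$ come from. It assumes a trial solution of the form $R = r^k$, which is the usual guess for an Euler-Cauchy equation; the symbol $k$ is introduced here only for this check.

$\frac{d}{dr} \left( r^2 \frac{d}{dr} r^k \right) = \frac{d}{dr} \left( k r^{k+1} \right) = k(k+1) r^k$

$k(k+1) r^k = l(l+1) r^k$

$k^2 + k - l(l+1) = (k - l)(k + l + 1) = 0$

$k = l$ or $k = -(l+1)$

These two roots give the $r^l$ and $\frac{1}{r^{l+1}}$ terms in $R(r)$.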

Multiplying the second expression in (1) by $\sin^2 \theta$ and rearranging, it takes the following form:

$\frac{\sin \theta}{\Theta} \frac{d}{d \theta} \left( \sin \theta \frac{d\Theta}{d\theta} \right)+ l(l+1)\sin^2 \theta + \frac{1}{\Phi} \frac{d^2 \Phi}{d\phi^2} = 0$

An observation similar to the previous analysis can be made: the first two terms depend only on $\theta$ and the last only on $\phi$.

$\underbrace{\frac{\sin \theta}{\Theta} \frac{d}{d \theta} \left( \sin \theta \frac{d\Theta}{d\theta} \right)+ l(l+1)\sin^2 \theta }_{m^2} + \underbrace{\frac{1}{\Phi} \frac{d^2 \Phi}{d\phi^2}}_{-m^2} = 0$ (2)

The first expression in the above equation (2) is the Associated Legendre Differential equation.

$\frac{\sin \theta}{\Theta} \frac{d}{d \theta} \left( \sin \theta \frac{d\Theta}{d\theta} \right)+ l(l+1)\sin^2 \theta = m^2$

$\sin \theta \frac{d}{d \theta} \left( \sin \theta \frac{d\Theta}{d\theta} \right)+ \Theta \left( l(l+1)\sin^2 \theta - m^2 \right) = 0$

The general solution to this differential equation can be given as:
$\Theta(\theta) = C_3 P_l^m(\cos\theta) + C_4 Q_l^m(\cos\theta)$

The second expression in equation (2) is a simple one to solve:

$\frac{d^2 \Phi}{d\phi^2} = -m^2 \Phi$
$\Phi(\phi) = C_5 e^{im\phi} + C_6 e^{-im\phi}$

Therefore the general solution to Laplace’s equation in Spherical coordinates, for a given $l$ and $m$, is:

$R\Theta\Phi = \left(C_1 r^l + \frac{C_2}{r^{l+1}} \right) \left(C_3 P_l^m(\cos\theta) + C_4 Q_l^m(\cos\theta) \right) \left(C_5 e^{im\phi} + C_6 e^{-im\phi}\right)$
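One way to gain some confidence in this result is to plug a specific case into the spherical Laplacian numerically. The Python sketch below is only a sanity check under my own assumptions: it picks $l = 2$, $m = 1$, uses the fact that $P_2^1(\cos\theta)$ is proportional to $\sin\theta\cos\theta$, takes the real combination $\cos\phi$ of $e^{\pm i\phi}$, and estimates the derivatives with central differences of step `h`. The function names are hypothetical.

```python
import math

def laplacian(f, r, theta, phi, h=1e-4):
    """Central-difference estimate of the spherical Laplacian of f(r, theta, phi)."""
    f0 = f(r, theta, phi)
    # (1/r^2) d/dr ( r^2 df/dr )
    dr_plus = (r + h / 2) ** 2 * (f(r + h, theta, phi) - f0) / h
    dr_minus = (r - h / 2) ** 2 * (f0 - f(r - h, theta, phi)) / h
    radial = (dr_plus - dr_minus) / (h * r ** 2)
    # (1/(r^2 sin theta)) d/dtheta ( sin theta df/dtheta )
    dt_plus = math.sin(theta + h / 2) * (f(r, theta + h, phi) - f0) / h
    dt_minus = math.sin(theta - h / 2) * (f0 - f(r, theta - h, phi)) / h
    polar = (dt_plus - dt_minus) / (h * r ** 2 * math.sin(theta))
    # (1/(r^2 sin^2 theta)) d^2 f / dphi^2
    d2phi = (f(r, theta, phi + h) - 2 * f0 + f(r, theta, phi - h)) / h ** 2
    azimuthal = d2phi / (r ** 2 * math.sin(theta) ** 2)
    return radial + polar + azimuthal

def inner_solution(r, theta, phi):
    """The r^l branch for l = 2, m = 1."""
    return r ** 2 * math.sin(theta) * math.cos(theta) * math.cos(phi)

def outer_solution(r, theta, phi):
    """The 1/r^(l+1) branch for l = 2, m = 1."""
    return r ** -3 * math.sin(theta) * math.cos(theta) * math.cos(phi)

def not_a_solution(r, theta, phi):
    """Control case: r^2 alone has Laplacian 6, not 0."""
    return r ** 2

point = (1.3, 0.7, 0.4)  # arbitrary sample point away from the poles
print(laplacian(inner_solution, *point))
print(laplacian(outer_solution, *point))
print(laplacian(not_a_solution, *point))
```

The first two printed values should be zero up to discretization error, while the control case should come out close to 6.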

# Matrix Multiplication and Heisenberg Uncertainty Principle

We now understand that matrix multiplication is not commutative (Why?). What does this have to do with Quantum Mechanics?

Behold the commutator operator:
$[\hat{A}, \hat{B}] = \hat{A}\hat{B} - \hat{B}\hat{A}$

where $\hat{A},\hat{B}$ are operators that are acting on the wavefunction $\psi$. This is equal to 0 if they commute and something else if they don’t.

One of the most important results in Quantum Mechanics is Heisenberg’s Uncertainty Principle, and it is rooted in the commutator of the position operator (x) and the momentum operator (p):

$[\hat{x}, \hat{p}] = \hat{x}\hat{p} - \hat{p}\hat{x} = i\hbar$

If you think of p and x as linear transformations (just for the sake of simplicity), this means that measuring position and then momentum is not the same thing as measuring momentum and then position. Those two operators do not commute! You can sort of visualize them in the same way as in the post on why matrix multiplication is not commutative.

But in Quantum Mechanics, the matrices associated with $\hat{p}$ and $\hat{x}$ are infinite dimensional (the harmonic oscillator being the simplest example of this):

$\hat{x} = \sqrt{\frac{\hbar}{2m \omega}} \begin{bmatrix} 0 & \sqrt{1} & 0 & 0 & \hdots \\ \sqrt{1} & 0 &\sqrt{2} & 0 & \hdots \\ 0 & \sqrt{2} & 0 &\sqrt{3} & \hdots \\ 0 & 0 & \sqrt{3} & 0 & \hdots \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}$

$\hat{p} = \sqrt{\frac{\hbar m \omega}{2}} \begin{bmatrix} 0 & -i & 0 & 0 & \hdots \\ i & 0 & -i \sqrt{2} & 0 & \hdots \\ 0 & i\sqrt{2} & 0 & -i \sqrt{3} & \hdots \\ 0 & 0 & i\sqrt{3} & 0 & \hdots \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}$
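To see the non-commutativity concretely, one can cut these matrices off at a finite size and compute $\hat{x}\hat{p} - \hat{p}\hat{x}$. The Python sketch below does that under assumptions of my own: units with $\hbar = m = \omega = 1$, a truncation size `N = 6`, and a hand-written `matmul` helper. Because the true matrices are infinite, the truncation spoils the last diagonal entry, so only the upper-left block is meaningful.

```python
import math

N = 6  # truncation size (an arbitrary choice for illustration)

def matmul(A, B):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Build the truncated x and p matrices with hbar = m = omega = 1,
# so both prefactors equal sqrt(1/2).
X = [[0j] * N for _ in range(N)]
P = [[0j] * N for _ in range(N)]
for n in range(1, N):
    s = math.sqrt(n / 2)   # sqrt(n) times the prefactor sqrt(1/2)
    X[n - 1][n] = s
    X[n][n - 1] = s
    P[n - 1][n] = -1j * s  # entries above the diagonal
    P[n][n - 1] = 1j * s   # entries below the diagonal

XP, PX = matmul(X, P), matmul(P, X)
comm = [[XP[i][j] - PX[i][j] for j in range(N)] for i in range(N)]

for row in comm:
    print([complex(round(z.real, 10), round(z.imag, 10)) for z in row])
```

The upper-left $(N-1) \times (N-1)$ block of `comm` comes out as $i$ times the identity, i.e. $i\hbar$ in these units. The bottom-right entry is $-(N-1)i$ instead, which is an artifact of cutting off an infinite matrix.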

# Why on earth is matrix multiplication NOT commutative ? – Intuition

One is commonly asked to prove in college, as part of a linear algebra problem set, that matrix multiplication is not commutative, i.e. if A and B are two matrices then, in general:

$AB \neq BA$

But without getting into the algebra of it, why should this even be true? Let’s use linear transformations to get a feel for it.

Suppose A and B are two linear transformations, namely Rotation and Shear. Then the claim is that:

$(Rotation)(Shearing) \neq (Shearing)(Rotation)$

Is that true? Well, let’s perform these linear operations on a unit square and find out:

(Rotation)(Shearing)

(Shearing)(Rotation)

You can clearly see that the resultant shape is not the same for the two orderings. This means that the order of matrix multiplication matters a lot! (In other words, matrix multiplication is not commutative.)
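The same experiment can be run numerically. In the Python sketch below, the specific matrices are my own picks for illustration: a $90^\circ$ counter-clockwise rotation and a unit horizontal shear, applied to the corners of the unit square. The helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(M, point):
    """Apply the 2x2 matrix M to a point (x, y)."""
    x, y = point
    return (M[0][0] * x + M[0][1] * y, M[1][0] * x + M[1][1] * y)

rotation = [[0, -1], [1, 0]]  # 90 degrees counter-clockwise
shear = [[1, 1], [0, 1]]      # unit shear along the x-direction

rs = matmul(rotation, shear)  # shear first, then rotate
sr = matmul(shear, rotation)  # rotate first, then shear

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print("(Rotation)(Shearing):", [apply(rs, p) for p in square])
print("(Shearing)(Rotation):", [apply(sr, p) for p in square])
```

The two lists of corners differ, so the unit square ends up as two different parallelograms depending on the order.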

# Basis Vectors are instructions !

Basis vectors are best thought of in the context of roads.

Imagine you are in a city, City-X, which only has roads that are perpendicular to one another.

You can reach any part of the city, but the only constraint is that you need to move along these perpendicular roads to get there.

Now let’s say you go to another city, City-Y, which has a different structure of roads.

In this case as well you can get from one part of the city to any other, but you have to travel along these ‘Sheared cubic’ pathways to get there.

Just like these roads determine how you move about in the city, Basis Vectors encode information on how you move about on a plane. What do I mean by that?

The basis vectors of City-X are the columns of:

$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

This is to be read as: “If you would like to move in City-X, you can only do so by taking 1 step in the x-direction or 1 step in the y-direction.”

The basis vectors of City-Y are the columns of:

$\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$

This is to be read as: “If you would like to move in City-Y, you can only do so by taking 1 step in the x-direction or 1 step along the diagonal OB.”

Conclusion:

By knowing the Basis Vectors of any city, you can travel to any destination by merely scaling these basis vectors and adding them up.

As an example, let’s say you need to get to the point (3,2). Then in City-X, you would take 3 steps in the x-direction and 2 steps in the y-direction:

$\begin{bmatrix} 3 \\ 2 \end{bmatrix} = 3* \begin{bmatrix} 1 \\ 0 \end{bmatrix} + 2 * \begin{bmatrix} 0 \\ 1 \end{bmatrix}$

And similarly in City-Y, you would take 1 step along the x-direction and 2 steps along the diagonal OB:

$\begin{bmatrix} 3 \\ 2 \end{bmatrix} = 1* \begin{bmatrix} 1 \\ 0 \end{bmatrix} + 2 * \begin{bmatrix} 1 \\ 1 \end{bmatrix}$

Destination Arrived 😀
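If you would like to work out the number of steps for any destination, the Python sketch below solves for them with Cramer's rule. The function name `steps` is made up for this illustration, and it assumes the two basis vectors are not parallel (otherwise the determinant is zero and some destinations are unreachable).

```python
def steps(b1, b2, target):
    """Solve c1*b1 + c2*b2 = target for (c1, c2) using Cramer's rule."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    if det == 0:
        raise ValueError("basis vectors are parallel")
    c1 = (target[0] * b2[1] - b2[0] * target[1]) / det
    c2 = (b1[0] * target[1] - target[0] * b1[1]) / det
    return c1, c2

# City-X: steps along x and along y.
print(steps((1, 0), (0, 1), (3, 2)))
# City-Y: steps along x and along the diagonal OB.
print(steps((1, 0), (1, 1), (3, 2)))
```

The same destination (3,2) needs 3 and 2 steps in City-X but 1 and 2 steps in City-Y, matching the two equations above.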

# Inverse of an Infinite matrix

$\begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & \hdots \\ 0 & 0 & 2 & 0 & 0 & 0 & \hdots \\ 0 & 0 & 0 & 3 & 0 & 0 & \hdots \\ 0 & 0 & 0 & 0 & 4 & 0 & \hdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \end{bmatrix}$

What is the inverse of the above matrix? I would strongly suggest that you think about the above matrix and what its inverse would look like before you read on.

On the face of it, it is indeed startling to even think of an inverse of an infinite dimensional matrix. But the only reason why this matrix seems weird is because I have presented it out of context.

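You see, the popular name of the matrix is the Differentiation Matrix and it is commonly denoted as $D$. It acts on the coefficients $(a_0, a_1, a_2, \hdots)$ of a power series $a_0 + a_1 x + a_2 x^2 + \hdots$ and returns the coefficients of its derivative.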

The differentiation matrix is a beautiful matrix and we will discuss all about it in some other post, but in this post let’s talk about its inverse. The inverse of the differentiation matrix is (as you might have guessed) the Integration Matrix ($I^*$):

$I^* = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 & \hdots \\ 1 & 0 & 0 & 0 & 0 & 0 & \hdots \\ 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & \hdots \\ 0 & 0 & \frac{1}{3} & 0 & 0 & 0 & \hdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \end{bmatrix}$

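And it can be easily verified that $DI^* = I$, where $I$ is the Identity matrix. (In the other order, $I^*D$ is not quite the identity, because differentiation throws away the constant term, so strictly speaking $I^*$ is a right inverse of $D$.)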
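Here is a small Python sketch of that verification on a finite piece of the two matrices. The truncation is my own assumption: `D` is kept as an $N \times (N+1)$ block and `Istar` as an $(N+1) \times N$ block with `N = 5`, so that the product is not spoiled by the cut-off, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

N = 5  # number of rows of D that we keep (arbitrary choice)

def matmul(A, B):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# D[i][i+1] = i + 1: sends coefficients (a0, a1, a2, ...) to (a1, 2*a2, ...).
D = [[Fraction(i + 1) if j == i + 1 else Fraction(0)
      for j in range(N + 1)] for i in range(N)]

# Istar[j+1][j] = 1/(j+1): sends (a0, a1, ...) to (0, a0, a1/2, ...).
Istar = [[Fraction(1, j + 1) if i == j + 1 else Fraction(0)
          for j in range(N)] for i in range(N + 1)]

D_Istar = matmul(D, Istar)  # N x N
Istar_D = matmul(Istar, D)  # (N+1) x (N+1)

print([[int(x) for x in row] for row in D_Istar])
print([[int(x) for x in row] for row in Istar_D])
```

The first product is the identity matrix. The second is the identity except for a zero in the top-left corner, which is the lost constant of integration.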

Lesson learned: Infinite dimensional matrices can have inverses. 😀