Quantum Bites

Thursday, February 12, 2015

The Quantum Particle and its Hilbert Space

The Quantum Particle and The Two Slit Interference Experiment

A classical particle is an object that has mass, and with the property of having a definite location and momentum at any instant of time. In this post we will consider the quantum version of the classical particle, which we will refer to as the quantum particle or simply particle. In order for us to discover how the quantum description of a particle arises, we will revisit the well-known two slit experiment involving electrons. This will eventually lead us into the construction of the particle Hilbert space.

Consider a collimated weakly interacting (for all practical purposes non-interacting) electron beam incident upon a pair of slits. On the opposite side of the slits is a film that serves as a screen on which we observe the arrival of an electron. Also on the upper slit there is a detector that tells which slit the electron has passed through on its way to the screen. If the detector lights up, the electron went through the upper slit; otherwise, through the lower slit. We are all aware of the result of the experiment: in the presence of the detector, no interference appears on the screen; in the absence of the detector, interference appears. Let us go beyond appearances and quantify this result.

Let us suppose that the screen is dressed with an array of detectors with some width $\delta x$, and that the $j$-th detector is centered at the point $x_j$. The $x_j$'s then give the location where an electron is detected. First, let us consider when the detector is present. Let there be a total of $N$ electrons that arrived at the screen after the experiment. Let us denote the number of electrons that arrived at the detector at $x_j$ through the upper slit by $N_u(x_j)$, and through the lower slit by $N_l(x_j)$; moreover, let $N_t(x_j)$ be the total number of electrons that arrived at $x_j$. When the detector is present, we find that the total number of arrivals at $x_j$ is just $N_t(x_j)=N_u(x_j)+N_l(x_j)$.

Now when the detector is absent, so that there is no way for us to determine which slit an electron went through, an uncanny behavior is observed. The distribution of arriving electrons no longer resembles that of the former configuration. That means we can no longer describe the number of arrivals at a certain location as the sum of the electrons passing through the upper and lower slits. That is, we can no longer say that a certain number $N_u(x_j)$ passed through the upper slit or a certain number $N_l(x_j)$ through the lower slit. To describe what we observe we must go beyond counting. Nevertheless, we can still count the number of electrons that arrived at the location $x_j$, and say it is given by $N_t(x_j)$.

To describe the distribution on the screen we go through the following manipulation, almost a magical one, on the number $N_t(x_j)$. Since $N_t(x_j)$ is non-negative, we can always find a complex number $\eta(x_j)$ such that $N_t(x_j)=\bar{\eta}(x_j)\eta(x_j)$. Now it turns out that the only way to describe the result of the experiment is to associate a complex valued function to each of the upper and lower slits, $\eta_u(x_j)$ and $\eta_l(x_j)$, respectively; and that the complex valued function $\eta(x_j)$ is a linear sum of the two, $\eta(x_j)=\eta_u(x_j)+\eta_l(x_j)$. On writing $\eta_u(x_j)=|\eta_u(x_j)| \exp\left(i \phi_u(x_j)\right)$ and $\eta_l(x_j)=|\eta_l(x_j)| \exp\left(i \phi_l(x_j)\right)$, the total number of electrons that arrived at $x_j$ assumes the form \begin{equation} N_t(x_j)=\left|\eta_u(x_j)\right|^2+\left|\eta_l(x_j)\right|^2 + 2 \left|\eta_u(x_j)\right|\left|\eta_l(x_j)\right| \cos\left(\phi_u(x_j)-\phi_l(x_j)\right). \end{equation} The first two terms are non-negative and the third term oscillates between $\pm 2 \left|\eta_u(x_j)\right|\left|\eta_l(x_j)\right|$ as $j$ varies. Clearly it is the third term that is responsible for the observed interference pattern on the screen.
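The expansion of $|\eta_u+\eta_l|^2$ into two intensities plus a cross term $2|\eta_u||\eta_l|\cos(\phi_u-\phi_l)$ can be checked numerically. The following is a minimal sketch in Python; the function name `arrivals` and the sample amplitudes are assumptions made for illustration and are not part of the experiment described here.

```python
import cmath
import math

def arrivals(eta_u, eta_l):
    """Return |eta_u + eta_l|^2 computed directly and via the expanded form."""
    direct = abs(eta_u + eta_l) ** 2
    phase_diff = cmath.phase(eta_u) - cmath.phase(eta_l)
    expanded = (abs(eta_u) ** 2 + abs(eta_l) ** 2
                + 2 * abs(eta_u) * abs(eta_l) * math.cos(phase_diff))
    return direct, expanded

# Hypothetical amplitudes: equal moduli, three different phase differences.
for phase in (0.0, math.pi / 2, math.pi):
    eta_u = cmath.rect(3.0, phase)
    eta_l = cmath.rect(3.0, 0.0)
    direct, expanded = arrivals(eta_u, eta_l)
    print(f"phase difference {phase:.2f}: {direct:.3f} vs {expanded:.3f}")
```

With equal moduli the count swings between four times the single-slit value (phases equal) and zero (phases opposite), which is the fringe pattern on the screen.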

This tells us that in order to describe the result of the experiment, we have to introduce complex quantities out of the measurable quantities. In this case knowledge of the complex quantities can determine the distribution of the results of the experiments. Note that the result of the first experiment can be brought into the framework of the second experiment by identifying $N_u(x_j)=\left|\eta_u(x_j)\right|^2$ and $N_l(x_j)=\left|\eta_l(x_j)\right|^2$, with the third term missing. It can be interpreted that the act of extracting information on which slit an electron has passed through has caused the third term to disappear. How the measurement process reduces the third term away is the central problem of quantum measurement theory which we will consider in detail later.

A characteristic feature of the two experiments is that where a single electron arrives on the screen is completely random. We can imagine several similar experiments done simultaneously. Necessary care is made such that the conditions of these experiments are the same; in particular, their initial conditions are the same. If we observe where the first electron of each experimental set-up ends up on the screen, we find that they do not land on the same point. The same thing with the second, the third, and so on. More importantly, we cannot, on the basis of the complete knowledge of the initial conditions of the experimental set-up, predict where a single electron will land on the screen.

However, we find that as the arriving electrons become more and more numerous, a pattern emerges on the screen. And this pattern is identical in all experiments involving an arbitrarily large number of electrons. While the result of a single measurement is unpredictable and is not determined by the initial configurations of the measuring instruments, the result of a large number of measurements is predictable and is determined by the initial configuration of the system and the measuring apparatus. The existence of an underlying distribution of results implies that, while the occurrence of an event is random, one can assign a definite probability for the occurrence of that event. It is the business of quantum mechanics to provide the recipe for computing such probabilities.

The Wavefunction

We found above that in order to describe the result of the two slit experiment we need to introduce a complex valued quantity. This, in the context of the experiment, is what we call a number amplitude, because its squared modulus gives the number of electrons arriving at a given point on the screen. Let $N$ be the total number of electrons that arrived at the screen during the experiment. Then consider the quantity $N_t(x_j)/N \delta x = \delta P(x_j)/\delta x$, where $\delta P(x_j)=N_t(x_j)/N$ gives the probability of locating a single electron in the immediate vicinity of $x_j$. The quantity $\delta P(x_j)/\delta x$ gives us the probability density.

Now let the number $N$ of electrons grow indefinitely and let the width of the detector $\delta x$ shrink indefinitely as well. Quantum mechanics presupposes that there exists a well defined limit, \begin{equation} \rho(x)=\lim_{\delta x\rightarrow 0,\; N\rightarrow \infty} \frac{N_t(x)}{N\delta x}. \end{equation} Moreover the quantities $\eta(x_j)/\sqrt{N\delta x}$, $\eta_u(x_j)/\sqrt{N\delta x}$, and $\eta_l(x_j)/\sqrt{N\delta x}$ likewise obtain the limiting values $\psi(x)$, $\psi_u(x)$, and $\psi_l(x)$, respectively, such that $\rho(x)=\bar{\psi}(x) \psi(x)=\left|\psi(x)\right|^2$ and $\psi(x)=\psi_u(x)+\psi_l(x)$.

From the way we have arrived at $\psi(x)$ the quantity $\left|\psi(x)\right|^2 dx$ yields the probability for the electron to arrive at $x$ or for us to find the electron in the neighborhood of $x$. Knowledge of $\psi(x)$ does not allow us to predict where an electron will arrive on the screen; however, it gives complete information on the distribution of outcome of position measurements, which is given by $|\psi(x)|^2$. The complex valued function $\psi(x)$ is known as the wave-function.
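The passage from detector counts $N_t(x_j)$ to a probability density $N_t(x_j)/N\delta x$ can be illustrated with simulated arrivals. This is only a sketch: the arrival positions are drawn from a unit Gaussian as an assumed stand-in for real screen data, and `estimate_density` is a hypothetical helper name.

```python
import math
import random

def estimate_density(samples, x_min, x_max, n_bins):
    """Return (centers, density) with density[j] = N_t(x_j) / (N * dx)."""
    dx = (x_max - x_min) / n_bins
    counts = [0] * n_bins
    for x in samples:
        j = math.floor((x - x_min) / dx)
        if 0 <= j < n_bins:          # arrivals outside the detector array are ignored
            counts[j] += 1
    n_total = sum(counts)            # N: electrons registered by the array
    centers = [x_min + (j + 0.5) * dx for j in range(n_bins)]
    density = [c / (n_total * dx) for c in counts]
    return centers, density

random.seed(0)
# Assumed stand-in for the screen data: positions drawn from a unit Gaussian.
samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]
centers, density = estimate_density(samples, -5.0, 5.0, 50)
print(density[25], 1.0 / math.sqrt(2.0 * math.pi))
```

Increasing the number of samples while narrowing the bins is the numerical counterpart of the limit $N\rightarrow\infty$, $\delta x\rightarrow 0$; the estimate settles toward the underlying density.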

The Superposition Principle

The second experiment teaches us a very important lesson on the quantum description of a quantum particle. Two or more results are mutually exclusive if one result excludes the other. An electron taking the upper slit and the same electron taking the lower slit are mutually exclusive because a detector interrogating the electron can only register one of the two possibilities. If the detector lights up, then the electron went through the upper slit and not through the lower slit, and vice versa. Now if the experimental set up is such that no information can be extracted on which of the mutually exclusive events is the case, then the description of the object requires that a complex valued quantity be assigned to each mutually exclusive event and that the state of the object is given by a complex valued quantity which is the linear sum of these complex valued quantities corresponding to each possibility. This is the well-known phenomenon of self-interference, which is embodied by the quantum superposition principle.

The Hilbert Space of the Quantum Particle

Quantum mechanics postulates that the state of a quantum particle in one dimension is completely described by the wave-function $\psi(x)$. We now turn to constructing the Hilbert space of the quantum particle. We start with the identification that $|\psi(x)|^2$ is the probability density of locating the particle at $x$, so that the probability of finding it in the interval $\Delta$ is given by the integral \begin{equation} P(\Delta)=\int_{\Delta} |\psi(x)|^2 \mbox{d}x. \end{equation} We can partition the entire line into disjoint intervals, $\{\Delta_k\}$, and obtain the probability of finding the particle in each interval, $\{P(\Delta_k)\}$. The law of probability demands that the sum of all probabilities must be unity, so that $\sum_k P(\Delta_k)=1$. This translates into what is known as the normalization condition \begin{equation} \int_{\mathbb{R}}|\psi(x)|^2\, \mbox{d}x=1 . \end{equation} If it happens that the wave-function is not normalized, i.e. $\int_{\mathbb{R}}|\psi(x)|^2\, \mbox{d}x\neq 1$, it can be normalized with the replacement $\psi(x)\rightarrow \psi(x)/\sqrt{\int_{\mathbb{R}}|\psi(x)|^2\mbox{d}x}$. Quantum mechanics does not distinguish these two functions: they represent the same quantum state of the particle.
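Normalization amounts to dividing the wave-function by the square root of $\int|\psi|^2\mbox{d}x$, so that the squared modulus integrates to one. Here is a rough numerical sketch; the midpoint-rule integrator, the truncation of the real line to $[-10,10]$, and the sample Gaussian are all assumptions chosen for illustration.

```python
import math

def norm_squared(psi, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of |psi(x)|^2 over [a, b]."""
    dx = (b - a) / n
    return sum(abs(psi(a + (i + 0.5) * dx)) ** 2 for i in range(n)) * dx

def normalize(psi, a, b):
    """Return psi divided by the square root of its norm squared."""
    scale = math.sqrt(norm_squared(psi, a, b))
    return lambda x: psi(x) / scale

# Hypothetical unnormalized wave-function: a Gaussian with a linear phase.
def psi(x):
    return 3.0 * math.exp(-x * x / 2.0) * complex(math.cos(2 * x), math.sin(2 * x))

psi_n = normalize(psi, -10.0, 10.0)
print(norm_squared(psi, -10.0, 10.0))    # not 1 in general
print(norm_squared(psi_n, -10.0, 10.0))  # close to 1
```

The phase factor drops out of $|\psi|^2$, so only the Gaussian envelope determines the scale.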

The foregoing dictates that any complex valued function $\psi(x)$ is a legitimate state of a quantum particle provided it is normalizable; in particular, it satisfies what is known as the square integrability condition \begin{equation} \int_{\mathbb{R}}\left|\psi(x)\right|^2 \, \mbox{d}x<\infty . \end{equation} Let us denote the set of complex valued, square-integrable functions on the real line by $\mathcal{S}^2(\mathbb{R})$. This set is a complex vector space under the usual rules of pointwise addition of functions, and multiplication of functions by complex numbers. This follows from the pointwise inequality $|a+b|^2\leq 2|a|^2+2|b|^2$ for complex numbers $a$ and $b$, which on integration gives \begin{equation} \int_{\mathbb{R}}\left|\alpha\psi(x)+\beta\phi(x)\right|^2\, \mbox{d}x\leq 2|\alpha|^2 \int_{\mathbb{R}} \left|\psi(x)\right|^2 \, \mbox{d}x + 2|\beta|^2 \int_{\mathbb{R}} \left|\phi(x)\right|^2 \, \mbox{d}x < \infty \end{equation} for every pair of functions $\psi(x)$ and $\phi(x)$ in $\mathcal{S}^2(\mathbb{R})$, and pair of complex numbers $\alpha$ and $\beta$. That is, if $\psi(x)$ and $\phi(x)$ are in $\mathcal{S}^2(\mathbb{R})$, then the function $\alpha\psi(x)+\beta\phi(x)$ is again in $\mathcal{S}^2(\mathbb{R})$.

By the quantum superposition principle, we identify the set $\mathcal{S}^2(\mathbb{R})$ to comprise a set of legitimate pure states of the quantum particle. Notice that we did not say "to comprise the whole class of pure states of the particle." The reason is that it is not yet the Hilbert space, which is supposed to contain all available pure states of the particle. To obtain the Hilbert space, we first need to turn $\mathcal{S}^2(\mathbb{R})$ into a pre-Hilbert space by assigning an inner product to it. Let $\left<\cdot\left|\cdot\right>\right.$ be the still undetermined inner product. The inner product must incorporate the square integrability condition. For every $\psi(x)$ in $\mathcal{S}^2(\mathbb{R})$ we define its norm or length by \begin{equation}\label{normcond} \left|\left|\psi\right|\right|^2 = \int_{\mathbb{R}}|\psi(x)|^2\, \mbox{d}x . \end{equation} The inner product must necessarily be consistent with this definition of the norm. For a Hilbert space, the relationship between the norm and the inner product is given by $\left|\left|\psi\right|\right|^2 = \left<\psi\left|\psi\right>\right.$. This is a necessary condition but not sufficient to determine the appropriate inner product.

The inner product is ultimately determined by the quantum principle of equivalence of mutual exclusivity and orthogonality of states. This requires us to define when two or more wave-functions are mutually exclusive. Let us consider two wave-functions $\psi_1(x)$ and $\psi_2(x)$ that are non-vanishing only in the intervals $\Delta_1$ and $\Delta_2$, respectively. Let us assume that the intervals do not overlap. If the particle is prepared in the wave-function $\psi_1(x)$, then the probability of finding the particle in the interval $\Delta_2$ is given by $\int_{\Delta_2}|\psi_1(x)|^2 \mbox{d}x=0$; the probability vanishes because $\psi_1(x)$ vanishes everywhere in $\Delta_2$. That is if the particle is prepared in $\psi_1(x)$, then any measurement of position of the particle excludes any result that is consistent with any result when the particle is prepared in the state $\psi_2(x)$. The converse of the statement is true. This leads us to the identification of mutually exclusive states: non-overlapping wave-functions are mutually exclusive.

The definition of the inner product must then be consistent with this definition of mutually exclusive states. For every pair of wave-functions $\psi(x)$ and $\phi(x)$, we define the functional $\int_{\mathbb{R}}F(x) \psi^*(x) \phi(x)\, \mbox{d}x$ where the function $F(x)$ is positive everywhere on the real line. This functional satisfies all the axioms of an inner product. Moreover, we find that when $\psi(x)$ and $\phi(x)$ are non-overlapping the value of the functional is zero. Hence it is consistent with the definition of mutual exclusivity of two non-overlapping wave-functions. However, it is arbitrary because of the arbitrariness of $F(x)$. The inner product is fixed by imposing the norm condition \ref{normcond}, which requires that $F(x)=1$. Finally we identify the inner product to be given by \begin{equation} \left<\psi\left|\phi\right>\right. = \int_{\mathbb{R}}\psi^*(x) \phi(x)\, \mbox{d}x . \end{equation} From this we have the metric \begin{equation} ||\psi-\phi||^2 = \int_{\mathbb{R}}|\psi(x)-\phi(x)|^2\, \mbox{d}x . \end{equation} And we have completed the construction of the pre-Hilbert space of the pure states of the particle. The Hilbert space is finally obtained by completing the pre-Hilbert space $\mathcal{S}^2(\mathbb{R})$, that is, by adjoining to it all limit points of the pre-Hilbert space. The Hilbert space is denoted by $\mathcal{L}^2(\mathbb{R})$, the Lebesgue square integrable complex valued functions. This Hilbert space is infinite dimensional and separable.
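The link between non-overlapping supports and a vanishing inner product can be seen in a small numerical sketch. The `bump` wave-functions, the intervals $[0,1]$ and $[2,3]$, and the midpoint-rule integrator are hypothetical choices made only for this illustration.

```python
import math

def inner(psi, phi, a, b, n=20_000):
    """Midpoint-rule approximation of <psi|phi> = integral of conj(psi) * phi."""
    dx = (b - a) / n
    total = 0j
    for i in range(n):
        x = a + (i + 0.5) * dx
        total += complex(psi(x)).conjugate() * phi(x)
    return total * dx

def bump(lo, hi):
    """Hypothetical wave-function supported only on [lo, hi], normalized there."""
    width = hi - lo
    def f(x):
        if lo <= x <= hi:
            return math.sqrt(2.0 / width) * math.sin(math.pi * (x - lo) / width)
        return 0.0
    return f

psi_1 = bump(0.0, 1.0)   # nonzero only on Delta_1 = [0, 1]
psi_2 = bump(2.0, 3.0)   # nonzero only on Delta_2 = [2, 3]
print(abs(inner(psi_1, psi_2, -1.0, 4.0)))  # vanishes: the supports do not overlap
print(inner(psi_1, psi_1, -1.0, 4.0).real)  # close to 1: the norm condition
```

Inserting any positive weight $F(x)$ into the integrand would still give zero for the first value, but would in general change the second, which is why the norm condition is what fixes $F(x)=1$.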

Quantum Particle in Three Dimensional Space

In the foregoing we considered a quantum particle restricted to a line. When the particle can move in all of three-dimensional space, the Hilbert space must be appropriately modified. The Hilbert space, denoted by $\mathcal{L}^2(\mathbb{R}^3)$, consists of all complex valued functions $\psi(\vec{r})$ satisfying the condition \begin{equation} \int_{\mathbb{R}^3} |\psi(\vec{r})|^2\, \mbox{d}^3 r < \infty . \end{equation} The inner product is given by \begin{equation} \left<\psi\left|\phi\right>\right. = \int_{\mathbb{R}^3} \psi^*(\vec{r})\phi(\vec{r})\, \mbox{d}^3 r , \end{equation} for all pairs of wave-functions $\psi(\vec{r})$ and $\phi(\vec{r})$ in $\mathcal{L}^2(\mathbb{R}^3)$. The inner product again is consistent with the condition of mutual exclusivity of two wave-functions that are non-overlapping, with the interpretation that the probability of finding the particle in the region $V$ is given by \begin{equation} P_{\psi}(V) = \int_{V} |\psi(\vec{r})|^2\, \mbox{d}^3 r, \end{equation} provided that $\psi(\vec{r})$ is normalized.

A Particle in a Box

For a particle confined in the interval $[a,b]$, where $a < b$, the Hilbert space consists of all square integrable complex valued functions in the interval $[a,b]$, i.e. those $\psi(x)$'s satisfying the condition \begin{equation} \int_a^b |\psi(x)|^2 \mbox{d} x < \infty . \end{equation} The Hilbert space is denoted by $\mathcal{L}^2[a,b]$. In this Hilbert space, the inner product is given by \begin{equation} \left<\psi\left|\phi\right>\right. = \int_{a}^b \psi^*(x)\phi(x)\, \mbox{d}x, \end{equation} for all pairs of wave-functions $\psi(x)$ and $\phi(x)$ in $\mathcal{L}^2[a,b]$. The same interpretation holds for the wave-functions $\psi(x)$ of the Hilbert space $\mathcal{L}^2[a,b]$ as those in the Hilbert spaces above.

Separable, Infinite Dimensional Hilbert Spaces and $l^2$

The Hilbert spaces $\mathcal{L}^2(\mathbb{R})$, $\mathcal{L}^2(\mathbb{R}^3)$ and $\mathcal{L}^2[a,b]$ are examples of separable and infinite dimensional Hilbert spaces. They represent distinct physical systems in different configuration spaces. However, each of them, and indeed any separable infinite dimensional Hilbert space, is isomorphic to the Hilbert space $l^2$. Two Hilbert spaces are isomorphic if there is a one-to-one correspondence between their vectors that preserves linear combinations and inner products; two such Hilbert spaces are mathematically equivalent. In an earlier post we have addressed the question whether two isomorphic Hilbert spaces represent the same quantum system or not.

The isomorphism of a separable and infinite dimensional Hilbert space $\mathcal{H}$ and $l^2$ can be established as follows. $\mathcal{H}$, being separable, possesses a countable set of orthonormal basis vectors, say $\{\phi_1,\phi_2,\phi_3,\dots\}$. Then every vector $\psi$ of $\mathcal{H}$ has the Fourier expansion \begin{equation} \psi=\sum_{k=1}^{\infty} a_k \phi_k,\;\;\; a_k = \left<\phi_k\left|\psi\right>_{\mathcal{H}}\right., \end{equation} in which $\left<\cdot|\cdot\right>_{\mathcal{H}}$ is the inner product in $\mathcal{H}$. Note that the complex Fourier coefficients $a_k$ are uniquely determined by $\psi$, so that we have the one-to-one correspondence \begin{equation} \psi \longleftrightarrow (a_1,a_2,a_3,\dots) \end{equation} Also, that $\psi$ belongs to $\mathcal{H}$ implies that it has a finite norm, that is \begin{equation}\label{xxx} ||\psi||^2 = \left<\psi\left|\psi\right>_{\mathcal{H}}\right. = \sum_{k=1}^{\infty} |a_k|^2 < \infty, \end{equation} where we have used the orthonormality of the basis vectors to arrive at the right hand side of the equation. Now for an arbitrary $\varphi$ in $\mathcal{H}$ with the Fourier expansion \begin{equation} \varphi=\sum_{k=1}^{\infty} b_k \phi_k,\;\;\; b_k = \left<\phi_k\left|\varphi\right>_{\mathcal{H}}\right., \end{equation} we have the inner product \begin{equation}\label{inner1} \left<\psi\left|\varphi\right>_{\mathcal{H}}\right. = \sum_{k=1}^{\infty} a_k^* b_k , \end{equation} where we have invoked again the orthonormality of the basis vectors.

We now recall that a vector $\zeta$ of $l^2$ is a sequence of complex numbers, $\zeta=(\alpha_1,\alpha_2,\alpha_3,\dots)$, with the sequence satisfying \begin{equation}\label{xxxx} \sum_{k=1}^{\infty} |\alpha_k|^2 < \infty. \end{equation} For an arbitrary pair of vectors $\zeta=(\alpha_1,\alpha_2,\alpha_3,\dots)$ and $\eta=(\beta_1,\beta_2,\beta_3,\dots)$ in $l^2$, the inner product is given by \begin{equation}\label{inner2} \left<\zeta\left|\eta\right>_{l^2}\right. = \sum_{k=1}^{\infty} \alpha_k^* \beta_k . \end{equation}

The fact that an arbitrary vector $\psi$ of $\mathcal{H}$ is uniquely identified with its complex Fourier coefficients $(a_1,a_2,a_3,\dots)$, and that these coefficients satisfy the same condition \ref{xxx} as the elements of $l^2$ \ref{xxxx}, allows us to set up a one-to-one correspondence between vectors of $\mathcal{H}$ and vectors of $l^2$. Moreover, the inner product in $l^2$ \ref{inner2} reproduces the inner product \ref{inner1} in $\mathcal{H}$, so that both Hilbert spaces have the same mathematical structure.
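To make the correspondence $\psi \leftrightarrow (a_1,a_2,a_3,\dots)$ concrete, here is a small sketch for $\mathcal{H}=\mathcal{L}^2[0,1]$. The sine basis $\sqrt{2}\sin(k\pi x)$ and the state $\sqrt{30}\,x(1-x)$ are assumptions picked for convenience; both are real, so complex conjugation plays no role in this particular example.

```python
import math

def inner(f, g, n=4_000):
    """Midpoint-rule approximation of the inner product of real functions on [0, 1]."""
    dx = 1.0 / n
    return sum(f((i + 0.5) * dx) * g((i + 0.5) * dx) for i in range(n)) * dx

def basis(k):
    """Assumed orthonormal basis of L^2[0, 1]: sqrt(2) sin(k pi x)."""
    return lambda x: math.sqrt(2.0) * math.sin(k * math.pi * x)

def psi(x):
    """Hypothetical normalized state on [0, 1]."""
    return math.sqrt(30.0) * x * (1.0 - x)

# Map psi to the first twenty entries of its l^2 sequence (a_1, a_2, ...).
coeffs = [inner(basis(k), psi) for k in range(1, 21)]
print(sum(a * a for a in coeffs))   # partial sum of |a_k|^2
print(inner(psi, psi))              # ||psi||^2 computed in L^2[0, 1]
```

The partial sum of $|a_k|^2$ approaches $||\psi||^2$ as more coefficients are kept, which is the statement that the $l^2$ norm of the sequence reproduces the norm in $\mathcal{H}$.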

Wednesday, February 4, 2015

Are there countable number of quantum systems?

Quantum mechanics demands that every quantum system corresponds to a separable Hilbert space. The separability condition requires that the Hilbert space possesses a countable set of basis vectors. A collection of objects or a set is called countable if it has a finite number of members or elements; or, in the case of an infinite number of elements, if one can arrange the elements in one to one correspondence with the natural numbers $\{1,2,3,\dots\}$. The condition restricts the available Hilbert spaces to $\mathbb{C}^2$, $\mathbb{C}^3$, $\mathbb{C}^4$, $\dots$, for finite dimensional quantum systems, and $l^2$ for infinite dimensional ones. Clearly the collection of available Hilbert spaces is countable. If one takes the position that systems with the same dimensions are equivalent, then we arrive at the conclusion that there is a countable number of quantum systems.

I am aware of at least one credible instance where equality of dimensions was taken to imply equivalence of quantum systems. Arno Bohm, in his "The Rigged Hilbert Space and Quantum Mechanics", motivated the introduction of the rigged Hilbert space as a replacement of the separable Hilbert space of standard quantum mechanics with the premise "that all (separable infinite dimensional) Hilbert spaces are isomorphic and so are—roughly speaking—their algebras of operators. This means that all physical systems would be equivalent, which is obviously not the case (unless one has only one physical system, which then would have to be the microphysical world)." This claim, pushed to its logical conclusion, implies that finite dimensional quantum systems with equal dimensions are as well equivalent because a finite dimensional Hilbert space is unique up to a unitary transformation.

But what is a quantum system? A quantum system is ultimately defined by the measurements and the outcomes of measurements performed on the system. Consider a photon and an electron. If we restrict our attention to the internal degrees of freedom of the photon and the electron, then we will be restricted in their polarization and spin degrees of freedom, respectively. The photon has vertical and horizontal polarizations; these polarization states are mutually exclusive and hence define a basis for internal degree of freedom of the photon. The Hilbert space is two dimensional and is given by $\mathbb{C}^2$. On the other hand, the electron has spin-up and spin-down states; these spin-states are also mutually exclusive and define a basis for the spin degree of freedom of the electron. The Hilbert space is again $\mathbb{C}^2$.

The two internal degrees of freedom reside in two isomorphically equivalent Hilbert spaces, yet they are clearly different physical systems. They are different systems because they involve two different sets of measuring instruments in identifying them. The photon polarization is identified with measurements involving, say, polarizers and beam splitters; on the other hand, the electron spin is identified with measurements involving, say, the magnetic field interacting with the magnetic moment of the electron. These two sets of measurements are not interchangeable. The electron spin cannot be measured by passing it through a polarizer; and the photon polarization cannot be measured by magnetic interaction.

While it is true that a quantum system is a Hilbert space in its barest mathematical sense, a separable Hilbert space is not a single quantum system but an abstraction of a family of quantum systems characterized by their common dimension. A specific quantum system is a particular physical realization of a Hilbert space, but it is by no means the only realization. There are potentially infinitely many possible realizations of a given Hilbert space. One only has to recognize the infinite class of interaction potentials that may act on a quantum particle, all representing distinct physical systems. So while the available separable Hilbert spaces are countable, the available quantum systems are uncountably many.

Wednesday, January 28, 2015

Quantum Systems and their Hilbert Spaces

What is a Quantum System

A quantum system is traditionally defined as an object or a collection of objects on the scale of the electron or of the atom, the description of which is governed by the laws of quantum mechanics. However, it has become increasingly evident that the reach of quantum mechanics extends well beyond this traditional scale. Recent experiments have shown that the quintessential signature of quantum behavior, self-interference, is manifested not only by electrons and atoms but also by much larger objects such as the fullerene, an arrangement of 60 carbon atoms in the shape of a soccer ball; and, more recently, by large organic molecules composed of 430 atoms. Also it is already known that much more complex systems with more than 1000 internal degrees of freedom manifest self-interference.

These recent experiments, together with the recent theoretical developments of quantum mechanics, point to the conclusion that it is only our technological ability to isolate objects from their environments that stands in the way in their manifesting quantum behavior. The theoretical developments have afforded us the understanding that even the lack of self-interference by larger objects can be explained entirely within the scope of quantum mechanics. All these taken together lead us to the conclusion that quantum mechanics applies to the entire spectrum of spatial sizes and structural complexities. Hence any object or a collection of objects that we can conceive of (say the chair you are sitting on right now) can be legitimately identified as a quantum object, with the full or degraded manifestation of self-interference determined only by the degree the system is isolated from its environment.

So, given a quantum system, how do we proceed in describing it quantum mechanically? Quantum mechanics demands that a quantum object is identified with a complex, separable Hilbert space. In short, a quantum system is a Hilbert space in its barest mathematical sense. The Hilbert space description of a quantum object is a natural consequence of the quantum superposition principle, the physical manifestation of which is the well-known self-interference phenomenon. Quantum mechanics as we know it is just the set of rules that relate the different structures of the underlying Hilbert space with outcomes of our measurements on the system. In this post we will see how the superposition principle, in conjunction with the concept of mutually exclusive outcomes, leads to the construction of the Hilbert space of a quantum system.

The Quantum Hilbert Space

Central to the description of any physical system is the concept of the state of a system. The state is the collection of minimal information that completely describes the system at a given moment. It encodes all the information that we can ever extract from the system. It is the initial data fed into the formal rules of the appropriate physics that yields predictions of the outcomes of measurements made on the system. The collection of all possible states constitutes the state space of the system. For example, for a classical particle, the position and momentum of the particle at any instant of time constitute the state of the particle at that instant; the initial position and momentum, that is, the initial state, fed into Newton's laws of motion predict all outcomes of measurements made on the particle. The volume of the phase space available to the particle constitutes its state space.

Now the state space of a quantum system splits in two in accordance with the predictability of outcomes of measurements on the system. A state is referred to as a pure state if there exists a measurement whose outcome is completely predictable when the system is prepared in that state. Otherwise, the state is referred to as a mixed state: there exists no measurement whose outcome is predictable when the system is prepared in that state. Mixed states are expressible in terms of pure states (some statistical averages of pure states), so that pure states lie at the foundation of the state space of a quantum system. Quantum mechanics postulates that pure states of a quantum system constitute a complex, separable Hilbert space. Constructing or assigning the Hilbert space of a quantum system is identifying all possible pure states of the system, through which the whole state space of the system is constituted.

A Hilbert space has two basic constituents: its dimension and its inner product. Once these two are specified the corresponding unique (up to a unitary transformation) Hilbert space can already be constructed. Quantum mechanics provides the recipe of identifying the dimension and the inner product; the recipe is based on measurements of observables of the system under consideration. Now an observable is any measurable property of the system, such as energy and momentum. Without losing generality, we will only consider discrete observables in this post. Not all possible observables of the system may serve to define the Hilbert space. The observables that are relevant to the construction of the Hilbert space are known as sharp observables, in contrast to the non-relevant ones known as unsharp observables. We will have a separate posting on unsharp observables later.

Let us consider some observable $A$ whose only measurable values are $\{a_1,a_2,\dots\}$. The number of measurable values may be finite or infinite. The observable $A$ is a sharp observable if it has the following property: successive measurements of $A$ always yield the same value, say $a_k$. Successive here means that each measurement is made immediately after the previous one. The outcomes of the measurement of $A$ are then said to be mutually exclusive. That is, if the result of a measurement is $a_1$, then a measurement made right after $a_1$ is obtained excludes all other possible outcomes except $a_1$ itself. For an unsharp observable, there is a chance that the outcome of a later measurement in the succession differs from the first outcome.

We can now proceed with the recipe in discovering the dimension of the system Hilbert space. Let $\{A,B,C,\dots\}$ comprise all possible sharp observables of the given system. Now measure $A$ an arbitrary number of times, with the system in arbitrary configuration each time, and determine the number of possible outcomes of the measurement, say $N_A$. Next measure $B$ using the same algorithm, obtaining in the process $N_B$ possible outcomes, and so on. Note that the actual value measured is not important here. It is the number of distinct outcomes of measurements that is required. The maximum of the set $\{N_A,N_B,N_C,\dots\}$ is the dimension of the system Hilbert space. If the dimension is finite, the system is called a finite dimensional quantum system; otherwise, it is called infinite dimensional.
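The counting recipe above can be sketched in a few lines: collect the outcomes recorded for each sharp observable, count the distinct values, and take the maximum. The measurement records below are made up for illustration, and `hilbert_dimension` is a hypothetical helper name.

```python
def hilbert_dimension(outcomes_by_observable):
    """Return the largest number of distinct outcomes over all sharp observables."""
    return max(len(set(outcomes)) for outcomes in outcomes_by_observable.values())

# Hypothetical measurement records for three sharp observables A, B, C.
records = {
    "A": [+1, -1, -1, +1, +1],
    "B": [0.5, 0.5, -0.5],
    "C": [2, 2, 2, 2],
}
print(hilbert_dimension(records))
```

Only the number of distinct values matters here, not the values themselves; in practice one would also need enough repetitions, in enough different configurations, to be confident that no outcome has been missed.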

Finite Dimensional Quantum Systems

Now let $\mathcal{H}$ be the Hilbert space of the system which is yet to be specified, except that we presuppose to have already discovered the dimension $N<\infty$ of $\mathcal{H}$. Consider now a sharp observable $A$ whose number of mutually exclusive values equals the dimension $N$ of $\mathcal{H}$, $\{a_1,a_2,\dots,a_N\}$; the values may be arranged in increasing order. Quantum mechanics postulates that to each outcome $a_k$ there corresponds a pure state $\psi_k$ of the system, with the $\psi_k$'s belonging to the system Hilbert space $\mathcal{H}$. The correspondence is that if the system is prepared in $\psi_k$ and $A$ is measured thereafter, the outcome of the measurement is invariably $a_k$. The outcomes $\{a_1,a_2,\dots,a_N\}$ are mutually exclusive so that the corresponding pure states $\{\psi_1,\psi_2,\dots,\psi_N\}$ have the property of being a set of mutually exclusive states. This property means that if the system is prepared in the state $\psi_k$ then measurement of $A$ excludes the possibility of obtaining an outcome corresponding to $\psi_l$ for $k\neq l$.

Let $\left<\cdot|\cdot\right>$ be the inner product of the Hilbert space $\mathcal{H}$, which until here is unspecified. Quantum mechanics postulates that the mutual exclusivity of the states $\{\psi_1,\psi_2,\dots,\psi_N\}$ translates to the statement that the set is an orthogonal set. Explicitly

\begin{equation} \psi_k,\, \psi_l\;\; \mbox{mutually exclusive} \;\; \longleftrightarrow \left<\psi_k\left|\psi_l\right>\right.=\left<\psi_l\left|\psi_k\right>\right. = 0, \;\; k\neq l . \end{equation}Without sacrificing generality, we can always assume that the $\psi_k$'s are normalized, i.e. $||\psi_k||=\sqrt{\left<\psi_k\left|\psi_k\right>\right.}=1$ or $\left<\psi_k\left|\psi_k\right>\right.=1$ . Then the set $\{\psi_1,\psi_2,\dots,\psi_N\}$ is an orthonormal set, i.e. $\left<\psi_l\left|\psi_k\right>\right.=\delta_{kl}$. We can now restate the principle
\begin{equation} \{\psi_1,\psi_2,\dots,\psi_N\}\;\; \mbox{mutually exclusive states} \;\; \longleftrightarrow \;\; \left<\psi_k\left|\psi_l\right>\right.=\delta_{kl}.\end{equation}That is, a set of mutually exclusive (normalized) pure states is an orthonormal set.

The entire vector space of pure states is constructed using the quantum superposition principle, which states that there exists a meaningful operation of addition $(+)$ of pure states and multiplication $(\cdot)$ of complex numbers and pure states such that the linear sum $\alpha\cdot \psi + \beta \cdot \phi$ is again a pure state of the system for every pair of pure states $\psi$, $\phi$ and complex numbers $\alpha$, $\beta$. Using this principle, quantum mechanics demands that the mutually exclusive states $\{\psi_1,\psi_2,\dots,\psi_N\}$ form an orthonormal basis of the system Hilbert space $\mathcal{H}$. Then every pure state $\psi$ or vector of $\mathcal{H}$ can be written in terms of this basis
\begin{equation}
\psi=\sum_{k=1}^N \alpha_k \psi_k
\end{equation} for some complex $\alpha_k$'s. Notice that the vector $\psi$ is uniquely identified with the $N$-tuple of complex numbers $\{\alpha_1,\alpha_2,\dots,\alpha_N\}$.

Finally we are in a position to identify the inner product of the Hilbert space $\mathcal{H}$. Given an arbitrary pair of vectors
\begin{equation}
\psi=\sum_{k=1}^N \alpha_k \psi_k, \;\;\; \phi=\sum_{k=1}^N \beta_k \psi_k ,
\end{equation}their inner product is deduced from the orthonormality of the basis vectors using the linear and antilinear properties of the inner product. We arrive at the inner product
\begin{equation}
\left<\psi\left|\phi\right>\right. =\sum_{k=1}^N \alpha_k^* \beta_k .
\end{equation} Again notice that the resulting inner product depends only on the $N$-tuple of complex numbers identifying the vectors. Finally we obtain that the Hilbert space of an $N$-dimensional quantum system is $\mathcal{H}=\mathbb{C}^N$.
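Since the inner product depends only on the $N$-tuples of coefficients, it can be computed directly from them. The following is a minimal sketch in plain Python; the function names `inner` and `basis_vector` and the sample vectors are illustrative assumptions, not part of the construction above.

```python
def inner(a, b):
    """<psi|phi> = sum_k conj(alpha_k) * beta_k for coefficient tuples a and b."""
    if len(a) != len(b):
        raise ValueError("both vectors must have the same dimension")
    return sum(x.conjugate() * y for x, y in zip(a, b))

def basis_vector(k, n):
    """Coefficient tuple of the k-th basis state (k = 0, ..., n-1) in dimension n."""
    return tuple(1 if j == k else 0 for j in range(n))

# Two sample vectors of a three dimensional system, given by their coefficients.
psi = (1 + 1j, 2, 0)
phi = (1j, 1, 3 - 1j)

psi_phi = inner(psi, phi)
phi_psi = inner(phi, psi)

# The basis states are orthonormal: <psi_k|psi_l> is 1 when k == l and 0 otherwise.
gram = [[inner(basis_vector(k, 3), basis_vector(l, 3)) for l in range(3)]
        for k in range(3)]
print(psi_phi, phi_psi, gram)
```

Swapping the two arguments conjugates the result, which is the antilinearity in the first slot used in the derivation.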

The foregoing is an abstract description of the Hilbert space of a finite dimensional quantum system. The Hilbert space is made concrete and practicable by an explicit representation of the pure states in terms of column matrices. This is possible because matrices of a fixed size form a vector space and the inner product can be implemented in terms of a known matrix operation.

To simplify our notation let us just consider the simplest quantum system, the qubit, which is a two dimensional quantum system with the Hilbert space $\mathbb{C}^2$. Vectors in $\mathbb{C}^2$ are uniquely identified with a 2-tuple of complex numbers $\{\alpha_1,\alpha_2\}$, and they can be represented as $2\times 1$ column vectors, \begin{equation} \psi = \left(\begin{array}{c} \alpha_1\\ \alpha_2 \end{array}\right) \nonumber. \end{equation} The vector space structure of $\mathbb{C}^2$ is implemented with the usual rules of matrix operations. In particular, for arbitrary complex numbers $\alpha,\beta$ and vectors $\psi,\phi$, with $\psi$ identified with $\{\alpha_1,\alpha_2\}$ and $\phi$ with $\{\beta_1,\beta_2\}$, we have \begin{eqnarray} \alpha\cdot \psi + \beta \cdot \phi =\left(\begin{array}{c} \alpha\alpha_1 + \beta\beta_1\\ \alpha\alpha_2 + \beta\beta_2 \end{array}\right) \nonumber . \end{eqnarray} On the other hand the inner product is implemented by means of the multiplication of matrices, \begin{equation} \left<\psi\left|\phi\right>\right. = (\alpha_1^*\;\; \alpha_2^*)\left(\begin{array}{c} \beta_1\\ \beta_2 \end{array}\right) = \alpha_1^* \beta_1 + \alpha_2^*\beta_2 \nonumber . \end{equation} Taking all these together, we identify the basis vectors representing a particular set of mutually exclusive states as \begin{equation} \psi_1 = \left(\begin{array}{c} 1\\ 0 \end{array}\right),\;\;\; \psi_2 = \left(\begin{array}{c} 0\\ 1 \end{array}\right) \nonumber . \end{equation} Of course, these are not the only possible basis vectors; there are, in fact, infinitely many of them. This means that the underlying quantum system has infinitely many sharp observables.

The extension of the column vector representation of $\mathbb{C}^N$ for arbitrary positive integer $N$ is evident.
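The remark that the qubit has infinitely many orthonormal bases can be checked numerically. The sketch below uses one particular one-parameter family, obtained by rotating the pair $\psi_1,\psi_2$ through an angle $\theta$; this family and the names `rotated_basis` and `is_orthonormal` are assumptions made for illustration, and many other bases exist.

```python
import math

def inner(a, b):
    """<a|b> = conj(a_1) * b_1 + conj(a_2) * b_2 for 2-tuples of numbers."""
    return sum(complex(x).conjugate() * complex(y) for x, y in zip(a, b))

def rotated_basis(theta):
    """Pair of column vectors obtained by rotating (1, 0) and (0, 1) by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c, s), (-s, c)

def is_orthonormal(basis, tol=1e-12):
    """Check <u_i|u_j> = delta_ij for every pair of vectors in the basis."""
    for i, u in enumerate(basis):
        for j, v in enumerate(basis):
            expected = 1.0 if i == j else 0.0
            if abs(inner(u, v) - expected) > tol:
                return False
    return True

# Every angle gives an orthonormal pair; theta = 0 recovers psi_1 and psi_2.
results = [is_orthonormal(rotated_basis(t)) for t in (0.0, 0.3, math.pi / 4, 1.0)]
print(results)
```

Each value of $\theta$ gives a different set of mutually exclusive states, and hence corresponds to a different sharp observable of the qubit.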

Infinite Dimensional Quantum Systems

For infinite dimensional quantum systems, there are subtle issues that arise in the construction of the Hilbert space. The problem lies in the possibility that a given measurable value corresponds to more than one distinct configuration or state of the system. This can be avoided by constructing a basis out of an observable of the system that has an infinite number of possible values, with the property that to every value there corresponds one and only one configuration or state of the system that yields that value. Such an observable is known as a non-degenerate observable; otherwise, it is known as degenerate. Once such an observable has been identified, we can proceed in a similar fashion as in the case of finite dimensional systems, except that one has to go via the pre-Hilbert space and then complete it to obtain the Hilbert space. This leads us to the Hilbert space $\mathcal{H}=l^2$ for an infinite dimensional quantum system. We will have a separate posting to elaborate on the infinite dimensional case.

Wednesday, January 21, 2015

The Hilbert Space of Quantum Mechanics

I. Introduction

A system is any object or a collection of objects whose properties are subject to observation and measurement. The state of a given system is the minimum collection of information that completely describes the system. For a classical particle, its state is specified by giving its position and momentum at any given time. The state space of the particle is the two-dimensional phase-space of position and momentum, every point in which describes a state of the particle. On the other hand, the state of a quantum object, such as an atom, is described not in terms of position and momentum but of complex valued functions that comprise the state space of the quantum object. The state space of a quantum object has the structure of a complex, separable Hilbert space. To describe a quantum object one needs to specify the Hilbert space that is uniquely associated with the system. Here I describe the Hilbert space that is relevant to quantum mechanics.

II. Vector Space

A complex vector space is a set $ V=\{\psi, \phi, \varphi, \dots\}$, together with a rule that defines the sum (denoted by $+$) of elements of $V$ and a rule that defines the product (denoted by $\cdot$) of any complex number and any element of $ V$. The sum and the product satisfy the following axioms: For all complex numbers $\alpha, \beta$ and elements $\psi,\varphi,\phi$ in $V$,

$\psi + \varphi = \varphi+\psi$ belongs to $V$,
$(\psi+\varphi)+\phi=\psi+(\varphi+\phi)$,
$\alpha\cdot \psi$ belongs to $V$,
$\alpha\cdot(\psi+\varphi)=\alpha\cdot\psi+\alpha\cdot\varphi$
$(\alpha+\beta)\cdot\psi=\alpha\cdot\psi + \beta\cdot\psi$
$(\alpha\beta)\cdot\psi=\alpha\cdot(\beta\cdot\psi)$
$1\cdot\psi=\psi$
There exists a unique element $\theta$ of $V$ such that for all $\psi$ in $V$ the equality $\psi+\theta=\psi$ holds.
To every element $\psi$ of $V$ there exists a unique element $\eta$ of $V$ such that $\psi+\eta=\theta$. The element $\eta$ is given by $\eta=(-1)\cdot\psi$, which we denote simply by $-\psi$.

The ordered triplet $(V,+,\cdot)$ constitutes a complex vector space, and the elements of $V$ are called vectors. More than one pair of rules $(+,\cdot)$ can be assigned to the given set $V$, with each pair defining a distinct vector space with $V$. If the pair $(+,\cdot)$ is clear from the outset, the set $V$ can be simply referred to as a vector space.

Example-1. Let $\mathbb{C}^n$ denote the set of all $n$-tuples of complex numbers. A typical element $z$ of this set is given by $z=\{\alpha_1,\alpha_2,\dots,\alpha_n\}$ where the $\alpha_k$'s are arbitrary complex numbers. On its own, $\mathbb{C}^n$ is just a set and not a vector space. It is turned into a vector space by defining the sum of every pair of its elements and the product of every element with a complex number, in such a way that the above axioms of a vector space are satisfied.

Given two arbitrary elements $z=\{\alpha_1,\alpha_2,\dots,\alpha_n\}$ and $w=\{\beta_1,\beta_2,\dots,\beta_n\}$ of $\mathbb{C}^n$, we define their sum to be $z+w=\{\alpha_1+\beta_1,\alpha_2+\beta_2,\dots,\alpha_n+\beta_n\}$, where $\alpha_k+\beta_k$, for all $k=1,\dots,n$, denotes the usual addition of complex numbers. Moreover, we define the product of any element $z$ and complex number $\lambda$ to be $\lambda \cdot z = \{\lambda \alpha_1,\lambda \alpha_2,\dots,\lambda\alpha_n\}$, where $\lambda\alpha_k$ denotes the usual multiplication of complex numbers. Using the known properties of complex numbers, it is not difficult to show that all the axioms 1 to 7 of a vector space are satisfied by these definitions.

Now the $n$-tuple of zeros, $z_0=\{0,\dots,0\}$, belongs to $\mathbb{C}^n$, the numeric zero being a complex number itself; moreover, $z_0$ satisfies $z+z_0=z$ for all $z$. Hence $z_0$ is the zero of $\mathbb{C}^n$ with respect to the given sum. From this, for every $z=\{\alpha_1,\alpha_2,\dots,\alpha_n\}$ we can find a $z'$ in $\mathbb{C}^n$ such that $z+z'=z_0$; it is given by $z'=\{-\alpha_1,-\alpha_2,\dots,-\alpha_n\}$. Then $\mathbb{C}^n$, together with the defined sum and multiplication, is a complex vector space. $\square$
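The componentwise sum and product of Example-1 translate directly into code. The following minimal sketch spot-checks a few of the axioms on sample vectors; a numerical check on particular vectors is of course not a proof, and the names `add`, `scale` and `close` are illustrative.

```python
def add(z, w):
    """Componentwise sum of two n-tuples of complex numbers."""
    return tuple(a + b for a, b in zip(z, w))

def scale(lam, z):
    """Product of the complex number lam with the n-tuple z."""
    return tuple(lam * a for a in z)

def close(z, w, tol=1e-12):
    """True if the two n-tuples agree component by component up to tol."""
    return all(abs(a - b) <= tol for a, b in zip(z, w))

z = (1 + 2j, -1j, 3)
w = (0.5, 2 - 1j, 1j)
lam, mu = 2 - 1j, 1j
zero = (0, 0, 0)          # the n-tuple of zeros
minus_z = scale(-1, z)    # the additive inverse of z

checks = {
    "commutativity": close(add(z, w), add(w, z)),
    "distributivity over vectors": close(scale(lam, add(z, w)),
                                         add(scale(lam, z), scale(lam, w))),
    "distributivity over scalars": close(scale(lam + mu, z),
                                         add(scale(lam, z), scale(mu, z))),
    "compatibility": close(scale(lam * mu, z), scale(lam, scale(mu, z))),
    "zero vector": close(add(z, zero), z),
    "additive inverse": close(add(z, minus_z), zero),
}
print(checks)
```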

Example-2. Let $C^2\!(\mathbb{R})$ denote the set of all continuous, complex valued, (Lebesgue) square integrable functions $\psi(x)$ on the real line $\mathbb{R}$; the last property of the elements of the set means $\int_{-\infty}^{\infty} |\psi(x)|^2 dx<\infty$. The function $O(x)=0$ for all real $x$ (the function that identically vanishes on the entire real line) belongs to this set.

The set $C^2\!(\mathbb{R})$ is a vector space under the usual rule of pointwise addition of functions and the usual rule of multiplication of complex numbers with any complex valued function. That is, if $\psi(x)$ and $\phi(x)$ belong to this set, then their sum is defined to be the function $\varphi(x)=\psi(x)+\phi(x)$, where the plus sign indicates the usual addition of complex numbers. The sum belongs to the set again. This follows from the well-known inequality, $\sqrt{\int |\psi(x)+\phi(x)|^2 dx} \leq \sqrt{\int |\psi(x)|^2 dx }+ \sqrt{\int |\phi(x)|^2 dx}$. Since the two terms on the right hand side are both finite, the left hand side must be finite, too. That is, $\varphi(x)=\psi(x)+\phi(x)$ is itself square integrable; it is also continuous, being the sum of two continuous functions. Also for any complex $\lambda$ and function $\psi(x)$ in $ C^2(\mathbb{R})$, the function $ \lambda \psi(x)$ is clearly in the set, too. Finally the zero vector of the set is $O(x)$. The rest of the axioms of a vector space can be easily shown to be satisfied. $\square$

III. Inner Product Space

A complex inner product space is a complex vector space $V$ together with a binary operation $\left<\cdot|\cdot\right>$ taking pairs of vectors of $V$ into complex numbers such that for all complex numbers $\alpha$, $\beta$ and vectors $\psi$, $\phi$, $\varphi$ in $V$ the following axioms are satisfied:

$\left<\psi|\phi\right>=\left<\phi|\psi\right>^*$,
$\left<\psi|\alpha\phi + \beta\varphi\right>=\alpha\left<\psi|\phi\right>+\beta\left<\psi|\varphi\right>$,
$0\leq \left<\psi|\psi\right>$, with $\left<\psi|\psi\right>=0$ if and only if $\psi$ is the zero vector.

The ordered pair $(V,\left<\cdot|\cdot\right>)$ constitutes an inner product space. If from the outset the inner product is clear, the set $V$ can be referred to as an inner product space.

Example-1. For the vector space $\mathbb{C}^n$, the following inner product can be introduced: For every pair of vectors $z=\{\alpha_1,\alpha_2,\dots,\alpha_n\}$ and $w=\{\beta_1,\beta_2,\dots,\beta_n\}$, define

\begin{equation} \label{c1} \left<z|w\right>=\sum_{k=1}^n \alpha_k^* \beta_k . \end{equation}

It is not difficult to show that this binary operation satisfies all the properties of an inner product. $\square$

Example-2. For the vector space $C^2\!(\mathbb{R})$, the following inner product can be introduced: For every pair of vectors $\psi(x)$ and $\varphi(x)$, define

\begin{equation}\label{c2}\left<\psi|\varphi\right>=\int_{-\infty}^{\infty} \psi^*(x) \varphi(x) dx.\end{equation}

From the linear property of the integral, it can be shown that this binary operation satisfies all the axioms of an inner product. $\square$

Then $\mathbb{C}^n$ and $C^2\!(\mathbb{R})$, together with their respective inner products defined above, are inner product spaces.
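The integral inner product of Example-2 can be approximated numerically. The sketch below replaces the integral over the whole real line by a midpoint sum over a finite interval, which is an approximation that is adequate only for rapidly decaying functions; the Gaussian test functions and all names in the code are illustrative assumptions.

```python
import math

def inner(psi, phi, a=-10.0, b=10.0, steps=20_000):
    """Midpoint-rule approximation of the integral of conj(psi(x)) * phi(x) over [a, b]."""
    h = (b - a) / steps
    total = 0j
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += complex(psi(x)).conjugate() * phi(x)
    return total * h

def gauss(x):
    """An even, continuous, square integrable function."""
    return math.exp(-x * x / 2)

def odd_gauss(x):
    """An odd function, hence orthogonal to gauss."""
    return x * math.exp(-x * x / 2)

def phase_gauss(x):
    """gauss multiplied by the imaginary unit."""
    return 1j * math.exp(-x * x / 2)

norm_squared = inner(gauss, gauss)          # close to sqrt(pi)
overlap = inner(gauss, odd_gauss)           # close to 0
forward = inner(gauss, phase_gauss)         # close to  i * sqrt(pi)
backward = inner(phase_gauss, gauss)        # close to -i * sqrt(pi)
print(norm_squared, overlap, forward, backward)
```

The last two values are complex conjugates of each other, in agreement with the first axiom of the inner product.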

IV. Normed Space

A complex normed space is a complex vector space $V$ together with an operation $||\cdot||$ taking vectors of $V$ into non-negative real numbers such that for every complex number $\alpha$ and vectors $\psi$, $\phi$ in $V$ the following axioms are satisfied:

$0\leq ||\psi||$, with $||\psi||=0$ if and only if $\psi$ is the zero vector,
$||\psi+\phi||\leq ||\psi||+||\phi||$,
$||\alpha\psi||=|\alpha| \cdot ||\psi||$.

The ordered pair $(V,||\cdot||)$ is called a complex normed space. The non-negative number $||\psi||$ is called the norm of the vector $\psi$. The norm is a generalization of the concept of length to an abstract set. An inner product space can be turned into a normed space by equipping the vector space with the norm $||\psi||=\sqrt{\left<\psi|\psi\right>}$.

Example-1. The inner product space $\mathbb{C}^n$ is a normed space under the norm induced by the inner product \ref{c1}. The norm is given by
\begin{equation}\label{d1}
||z||=\sqrt{\sum_{k=1}^n |\alpha_k|^2} .
\end{equation} $\square$
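As a quick illustration of the norm on $\mathbb{C}^n$ induced by the inner product, the sketch below computes it for two sample vectors and spot-checks the triangle inequality and the scaling axiom. The sample vectors and function names are illustrative assumptions.

```python
import math

def norm(z):
    """Norm on C^n induced by the inner product: sqrt(sum_k |alpha_k|^2)."""
    return math.sqrt(sum(abs(a) ** 2 for a in z))

def add(z, w):
    """Componentwise sum of two n-tuples."""
    return tuple(a + b for a, b in zip(z, w))

def scale(lam, z):
    """Product of the complex number lam with the n-tuple z."""
    return tuple(lam * a for a in z)

z = (3, 4j)
w = (1, -1j)
lam = 2j

norm_z = norm(z)
triangle_holds = norm(add(z, w)) <= norm(z) + norm(w)
scaling_holds = math.isclose(norm(scale(lam, z)), abs(lam) * norm(z))
print(norm_z, triangle_holds, scaling_holds)
```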

Example-2. The inner product space $C^2\!(\mathbb{R})$ is a normed space under the norm induced by the inner product \ref{c2}. The norm is given by
\begin{equation}\label{d2} ||\psi|| = \sqrt{\int_{-\infty}^{\infty} |\psi(x)|^2\, dx}\end{equation} $\square$

V. Metric Space

A complex metric space is a complex vector space $V$ together with a binary operation $d(\cdot,\cdot)$ taking pairs of vectors of $V$ into non-negative real numbers such that for all vectors $\psi$, $\phi$, $\varphi$ in $V$ the following axioms are satisfied:

$d(\phi,\psi)\geq 0$, with $d(\psi,\phi)=0$ if and only if $\psi=\phi$,
$d(\psi,\phi)=d(\phi,\psi)$,
$d(\psi,\phi)\leq d(\psi,\varphi)+d(\varphi,\phi)$.

The ordered pair $(V,d(\cdot,\cdot))$ constitutes a metric space. The metric is a generalization of the concept of distance between two points. Observe that a normed space can be made into a metric space by defining the metric as $d(\psi,\phi)=||\psi-\phi||$.

A metric defines equality of vectors, i.e. two vectors are equal if the distance between them is zero, even though they are not identical. (This point will be illustrated below.) Also a metric defines the convergence of a sequence of vectors in $V$ to a particular vector in $V$.

Example-1. The inner product space $\mathbb{C}^n$ is a metric space under the metric induced by the norm \ref{d1}. The metric is given by
\begin{equation}
d(z,w)=||z-w||=\sqrt{\sum_{k=1}^n |\alpha_k-\beta_k|^2} .
\end{equation} $\square$

Example-2. The inner product space $C^2\!(\mathbb{R})$ is a metric space under the metric induced by the norm \ref{d2}. The metric is given by
\begin{equation} ||\psi-\varphi|| = \sqrt{\int_{-\infty}^{\infty} |\psi(x)-\varphi(x)|^2\, dx}\end{equation} $\square$

VI. Hilbert Space

A complex pre-Hilbert space $H_P$ is a complex linear vector space equipped with an inner product $\left<\cdot|\cdot\right>$, with a norm given by $||\psi||=\sqrt{\left<\psi|\psi\right>}$, and with a metric $ d(\psi,\phi)=\sqrt{\left<\psi-\phi|\psi-\phi\right>}$. Thus to define a pre-Hilbert space one only needs to specify the vector space involved and to define the corresponding inner product. Note that a given vector space may be equipped with more than one inner product. In quantum mechanics the choice of inner product is not arbitrary but dictated by physical considerations.

Before we can define what a Hilbert space is, we need the concept of a Cauchy sequence. A sequence of vectors in a pre-Hilbert space $H_P$ is an indexed collection of vectors, i.e. $\{\varphi_1,\varphi_2,\varphi_3,\dots\}$ with each $\varphi_k$ belonging to $H_P$. A sequence is called Cauchy if $\lim_{m,n\rightarrow \infty}||\varphi_m-\varphi_n||=0$. The pre-Hilbert space $H_P$ is called complete if for every Cauchy sequence $\{\varphi_1,\varphi_2,\varphi_3,\dots\}$ in $H_P$ there exists a vector $\varphi$ in $H_P$ such that $\lim_{m\rightarrow \infty}||\varphi_m-\varphi||=0$.

A complete pre-Hilbert space is called a Hilbert space.

A pre-Hilbert space which is not complete can be turned into a Hilbert space by adding all limits of Cauchy sequences to the pre-Hilbert space itself. The resulting Hilbert space is called the completion or closure of the pre-Hilbert space. Generally Hilbert spaces are constructed in this way. That is one starts with a linear space, then equips that space with an inner product to make it into a pre-Hilbert space, then finally completes the pre-Hilbert space to get a Hilbert space.

Example-1. $\mathbb{C}^n$, together with the inner product \ref{c1}, is a pre-Hilbert space which is already a Hilbert space. $\square$

Example-2. $C^2\!(\mathbb{R})$, together with the inner product \ref{c2}, is a pre-Hilbert space which is not a Hilbert space. The reason is that there are Cauchy sequences in the vector space which do not converge to continuous functions. For example, the sequence of continuous functions given by

\begin{equation}\eta_n(x)=\left\{
\begin{array}{ll} 0 &, x\leq 0\\
n x &, 0\leq x\leq \frac{1}{n} \\
1 &, \frac{1}{n}\leq x\leq 1 \\
n+1-nx &, 1\leq x \leq 1+\frac{1}{n}\\
0&, 1+\frac{1}{n} \leq x
\end{array}
\right. , \; \;\;\;\; n=1, 2, \dots
\end{equation}

is a Cauchy sequence in $C^2\!(\mathbb{R})$. However, the sequence converges in norm to the function which has the value $1$ in the interval $(0,1)$ and $0$ elsewhere, which is clearly not continuous, although it is square integrable. The pre-Hilbert space is then not complete. To construct a Hilbert space out of $C^2\!(\mathbb{R})$, we adjoin to $C^2\!(\mathbb{R})$ all limits of its Cauchy sequences. The resulting Hilbert space then consists not only of continuous functions but also of non-continuous functions. The Hilbert space is denoted by $\mathcal{L}^2(\mathbb{R})$.
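The convergence of the $\eta_n$ to the discontinuous step can be checked numerically. A direct integration over the two ramps gives the distance $\sqrt{2/(3n)}$ between $\eta_n$ and the limit function, which tends to zero. The sketch below compares a midpoint-rule estimate with this value; the integration interval, step count and function names are assumptions chosen for illustration.

```python
import math

def eta(n, x):
    """The n-th trapezoidal function eta_n(x) of the example above."""
    if x <= 0:
        return 0.0
    if x <= 1 / n:
        return n * x
    if x <= 1:
        return 1.0
    if x <= 1 + 1 / n:
        return n + 1 - n * x
    return 0.0

def step(x):
    """The limit function: 1 on the interval (0, 1) and 0 elsewhere."""
    return 1.0 if 0 < x < 1 else 0.0

def l2_distance(f, g, a=-0.5, b=2.5, steps=30_000):
    """Midpoint-rule approximation of sqrt(integral of |f - g|^2) over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += abs(f(x) - g(x)) ** 2
    return math.sqrt(total * h)

ns = (1, 4, 16)
to_limit = [l2_distance(lambda x, n=n: eta(n, x), step) for n in ns]
exact = [math.sqrt(2 / (3 * n)) for n in ns]
# Distance between two members of the sequence, which also shrinks as n grows.
pair_4_16 = l2_distance(lambda x: eta(4, x), lambda x: eta(16, x))
print(to_limit, exact, pair_4_16)
```

The interval $[-0.5, 2.5]$ contains the support of every function involved, so nothing is lost by truncating the integral there.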

A peculiarity of $\mathcal{L}^2(\mathbb{R})$ is that its vectors are not individual functions but classes of functions. Consider the function $\eta(x)\in C^2\!(\mathbb{R})$, which, of course, belongs to $\mathcal{L}^2(\mathbb{R})$. Now define the function $\zeta(x)=\eta(x)$ for all $x\neq x_0$ and $\zeta(x)=\infty$ for $x=x_0$. While $\zeta(x)$ is infinite at $x_0$, the integral $\int_{\mathbb{R}}|\zeta(x)|^2\mbox{d}x$ exists, and in fact its value is given by $\int_{\mathbb{R}}|\zeta(x)|^2 \mbox{d}x=\int_{\mathbb{R}}|\eta(x)|^2\mbox{d}x$. That is so because the value of an integral does not change when the value of the integrand is changed at isolated points. Thus $\zeta(x)$ is a vector of $\mathcal{L}^2(\mathbb{R})$. Now $\zeta(x)-\eta(x)=0$ for all $x\neq x_0$, so that $\int_{\mathbb{R}}|\zeta(x)-\eta(x)|^2\, \mbox{d}x=0$ or $\|\zeta-\eta\|=0$. By the definition of equality of vectors of a metric space, the functions $\zeta(x)$ and $\eta(x)$, while they are not equal pointwise, represent the same vector in $\mathcal{L}^2(\mathbb{R})$. Also we can introduce the new function $\omega(x)=\eta(x)$ for all $x\neq x_1,x_2$, with $x_1\neq x_2$, and assign arbitrary values to $\omega(x)$ at $x_1$ and $x_2$. Again $\eta(x),\zeta(x),\omega(x)$ are not equal pointwise, but $\|\eta-\zeta\|=0$, $\|\eta-\omega\|=0$, $\|\omega-\zeta\|=0$ so that they represent the same vector in $\mathcal{L}^2(\mathbb{R})$. Clearly we can go on changing the values of $\eta(x)$ at isolated points to obtain an infinite number of functions that differ from $\eta(x)$ at isolated points only but are at zero distance from $\eta(x)$. We can now see that a class of functions represents a vector of $\mathcal{L}^2(\mathbb{R})$. $\square$

VII. Separable Hilbert Spaces

A collection of vectors, $\{\varphi_1,\varphi_2,\varphi_3,\dots,\varphi_n\}$, of a Hilbert space $\mathcal{H}$ is called linearly independent if the sum
\begin{equation}\alpha_1\varphi_1 + \alpha_2\varphi_2 + \dots +\alpha_n \varphi_n =0 \end{equation} implies that $\alpha_1=\alpha_2=\dots=\alpha_n=0$. The largest number of linearly independent vectors in a Hilbert space is the dimension $N$ of the Hilbert space. If $N$ is finite, the Hilbert space is finite dimensional; otherwise, it is infinite dimensional.

A Hilbert space $\mathcal{H}$ is called separable if there exists a set of linearly independent vectors in $\mathcal{H}$, $\{\phi_1,\phi_2,\phi_3,\dots\}$, such that for every vector $\psi$ in $\mathcal{H}$ there exists a set of complex numbers $\{\alpha_1,\alpha_2,\alpha_3,\dots\}$ such that\begin{equation}\label{vectorsum} \psi=\alpha_1\phi_1 + \alpha_2\phi_2 + \alpha_3\phi_3 + \dots \end{equation}
The number of such independent vectors must necessarily equal the dimensionality of the Hilbert space. When such a set satisfies the condition, the set is said to span the Hilbert space. Not all Hilbert spaces are separable. The Hilbert space of quantum mechanics is separable; because of that, whenever we refer to a Hilbert space from hereon we mean a separable Hilbert space.

Two vectors $\phi$ and $\psi$ are called orthogonal if $\left<\phi|\psi\right>=0$. A set of independent vectors, which we denote shorthand by $\{\phi_k\}$, is called an orthogonal set if $\left<\phi_i|\phi_j\right>=0$ when $i\neq j$. Moreover, the set is called an orthonormal set if $\left<\phi_i|\phi_j\right>=\delta_{ij}$ where $\delta_{ij}$ is the Kronecker delta, $\delta_{i\neq j}=0$ and $\delta_{i=j}=1$. When an orthonormal set $\{\phi_k\}$ spans the Hilbert space, the set is called an orthonormal basis for the Hilbert space $\mathcal{H}$.

Thus when $\{\phi_k\}$ is an orthonormal basis, every vector $\psi$ in $\mathcal{H}$ can be written as the sum given by equation-\ref{vectorsum}. The coefficients $\alpha_k$ are found using the orthonormality of the $\phi_k$'s and the linearity of the inner product,
\begin{eqnarray}
\left<\phi_k|\psi\right>&=&\sum_i\alpha_i\left<\phi_k|\phi_i\right>\nonumber\\ &=&\sum_i \alpha_i\delta_{ik} \nonumber\\
&=&\alpha_k.
\end{eqnarray}
Hence the coefficients are given by $\alpha_k=\left<\phi_k|\psi\right>$. The $\alpha_k$'s are called the Fourier coefficients of $\psi$ with respect to the basis set $\{\phi_k\}$. From this definition it is clear that a given set of orthonormal vectors $\{\phi_k\}$ forms a basis if and only if $\left<\phi_k|\psi\right>=0$ for all $k$ implies that $\psi=0$.
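The formula $\alpha_k=\left<\phi_k|\psi\right>$ can be tried out in the simplest setting, $\mathbb{C}^2$. The following minimal sketch computes the Fourier coefficients of a sample vector with respect to an orthonormal basis and rebuilds the vector from them. The particular basis and vector are illustrative choices.

```python
import math

def inner(a, b):
    """<a|b> = sum_k conj(a_k) * b_k for tuples of complex numbers."""
    return sum(complex(x).conjugate() * complex(y) for x, y in zip(a, b))

s = 1 / math.sqrt(2)
basis = [(s, s), (s, -s)]      # an orthonormal basis of C^2 (illustrative choice)
psi = (2 + 1j, -3j)            # an arbitrary vector to be expanded

# Fourier coefficients alpha_k = <phi_k|psi>.
coeffs = [inner(phi_k, psi) for phi_k in basis]

# Rebuild psi as the sum over k of alpha_k * phi_k, component by component.
rebuilt = tuple(
    sum(c * phi_k[j] for c, phi_k in zip(coeffs, basis))
    for j in range(len(psi))
)
error = max(abs(r - p) for r, p in zip(rebuilt, psi))
print(coeffs, rebuilt, error)
```

The reconstruction error is at the level of floating point rounding, as expected when the set spans the space.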

Tuesday, January 20, 2015

Vector Spaces

The entire quantum enterprise rests on the quantum superposition principle (QSP). The principle posits that states of a quantum system are elements of a certain set and that the sum of any pair of elements of the set is again an element of the set. The superposition principle endows the set with the mathematical structure of a vector space, the elements of which, referred to as vectors, constitute the state space of a quantum system. In simpler terms, the QSP states that if two vectors are legitimate states of a quantum system, then their linear sum is again a legitimate state of the same system. I will describe later in detail how the vector space arises naturally in the description of quantum systems. Here I give the precise mathematical definition of a vector space, which is the foundation of the mathematical structure of quantum mechanics.

Field

A field is a set $F$ together with the operation of addition, denoted by $+$, and the operation of multiplication, denoted by $\cdot$, of every pair of elements of $F$. The operations satisfy the following axioms:

For all $a, b$ in $F$, the sum $a+b$ and the product $a\cdot b$ belong to $F$ (closure property).
For all $a, b, c$ in $F$, the equalities $(a+b)+c=a+(b+c)$ and $(a\cdot b)\cdot c=a\cdot (b\cdot c)$ hold (associativity property).
For all $a, b$ in $F$, it holds that $a+b=b+a$ and $a\cdot b=b\cdot a$ (commutativity property).
There exists an element of $F$, denoted by $0$, such that for every $a$ in $F$ the equality $a+0=a$ holds; the element $0$ is called the additive identity of $F$ or simply the zero element of $F$.
There exists an element of $F$, denoted by $1$ and distinct from $0$, such that for every $a$ in $F$ the equality $a\cdot 1=a$ holds; the element $1$ is called the multiplicative identity of $F$ or simply the identity element of $F$.
For every $a$ in $F$ there exists an element of $F$, denoted by $(-a)$, such that $a+(-a)=0$; the element $(-a)$ is called the additive inverse of $a$. The existence of additive inverse for all elements of the set defines the operation of subtraction among elements of the set.
For every $a$ in $F$ other than the zero element $0$, there exists an element of $F$, denoted by $a^{-1}$ such that $a\cdot a^{-1}=1$; the element $a^{-1}$ is called the multiplicative inverse of $a$. The existence of multiplicative inverse for every non-zero element of the set defines the operation of division among elements of the set.
For all $a, b, c$ in $F$, the equality $a\cdot (b+c)=a\cdot b+ a\cdot c$ holds (distributive property).

A field is then an ordered triplet $(F,+,\cdot)$, and elements of the set $F$ are called scalars. When the two operations are clear from the outset, we can refer to the set $F$ as a field itself or a scalar field.

Example: The set of real numbers, denoted by $\mathbb{R}$, endowed with the usual addition and multiplication of real numbers is a field, called the real field. The set of complex numbers, denoted by $\mathbb{C}$, endowed with the usual addition and multiplication of complex numbers is a field, called the complex field. When we refer to $\mathbb{R}$ and $\mathbb{C}$ as real and complex fields, respectively, we will mean the sets themselves together with their respective operations of addition and multiplication. $\square$

Vector Space

A vector space has four components: (a) a scalar field $F$, (b) a set $V$, (c) an operation that defines the sum (denoted by $+$) of every pair of elements of the set $V$ and (d) an operation that defines the product (denoted by $\cdot$) of every scalar in the scalar field $F$ and every element of the set $V$. The sum and the product satisfy the following axioms:

For all $\psi, \varphi$ in $V$, the sum $\psi+\varphi$ belongs to $V$.
For every $\alpha$ in $F$ and $\psi$ in $V$, the product $\alpha\cdot\psi$ belongs to $V$.
For all $\psi, \varphi$ in $V$, the equality $\psi+\varphi=\varphi+\psi$ holds (commutative property).
For all $\psi, \varphi, \phi$ in $V$, the equality $(\psi+\varphi)+\phi=\psi+(\varphi+\phi)$ holds (associative property).
For every $\alpha$ in $F$ and all $\psi,\varphi$ in $V$, the equality $\alpha\cdot(\psi+\varphi)=\alpha\cdot\psi+\alpha\cdot\varphi$ holds.
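For all $\alpha,\beta$ in $F$ and every $\psi$ in $V$, the equality $(\alpha+\beta)\cdot\psi=\alpha\cdot\psi+\beta\cdot\psi$ holds.
For all $\alpha,\beta$ in $F$ and every $\psi$ in $V$, the equality $(\alpha\beta)\cdot\psi=\alpha\cdot(\beta\cdot\psi)$ holds.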
For every $\psi$ in $V$, the equality $1\cdot\psi=\psi$ holds, where $1$ is the identity of $F$.
There exists a unique element $\theta$ of $V$ such that for every $\psi$ in $V$ the equality $\psi+\theta=\psi$ holds; the element $\theta$ is called the zero element of $V$.
For every $\psi$ in $V$, there exists an element $\psi'$ of $V$ such that $\psi+\psi'=\theta$; the element $\psi'$ is denoted by $\psi'=-\psi$.

Then a vector space is the ordered quadruple $(F,V,+,\cdot)$. If the operations $+$ and $\cdot$ are clear from the outset, we can simply refer to the vector space as the vector space $V$ over the scalar field $F$ or the set $V$ as the vector space itself. The elements of $V$ are referred to as vectors. If the scalar field is the real number field, the vector space is called a real vector space; on the other hand, if the scalar field is the complex number field, the vector space is called a complex vector space.

Example: The quadruple $(\mathbb{R},\mathbb{R},+,\cdot)$, where $+$ is the usual addition of real numbers and $\cdot$ the usual multiplication of real numbers, is clearly a real vector space. Likewise the quadruple $(\mathbb{C},\mathbb{C},+,\cdot)$, where $+$ is the usual addition of complex numbers and $\cdot$ the usual multiplication of complex numbers, is a complex vector space.

Define the quadruple $(\mathbb{R},\mathbb{C},+,\cdot)$. The operation of addition $+$ is the usual addition of complex numbers. The operation of multiplication $\cdot$ is defined as follows: for every real $\alpha$ and complex number $z=\mbox{Re}\, z + i \mbox{Im}\, z$, we have $\alpha\cdot z=\alpha \mbox{Re}\, z + i \alpha \mbox{Im}\, z$, where the multiplication in the right hand side is the usual multiplication of real numbers. This quadruple is clearly a vector space; and, by definition, it is a real vector space (the real field being the scalar field) even though its vectors are complex numbers.

Now consider the quadruple $(\mathbb{C},\mathbb{R},+,\cdot)$. The operation $+$ is the usual addition of real numbers. The operation $\cdot$ is the usual multiplication of reals and complex numbers, as defined in the preceding paragraph. This quadruple is not a vector space, because for every complex $z$ in the scalar field and every real $\alpha$ in the set $\mathbb{R}$ the number $z\cdot\alpha$ is not in general real, i.e. it does not belong to the set $\mathbb{R}$. $\square$
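The contrast between the last two quadruples can be seen with a pair of sample numbers. In the sketch below, which uses Python's built-in complex type, a real scalar times a complex number is computed as in the definition above, while a complex scalar times a real number leaves the set of reals. The sample values and the helper `in_reals` are illustrative assumptions.

```python
def in_reals(z, tol=1e-12):
    """True if the number z has (numerically) vanishing imaginary part."""
    return abs(complex(z).imag) <= tol

# (R, C, +, .): real scalar alpha times complex vector z.
alpha, z = 2.5, 1 - 3j
product_rc = alpha * z            # equals alpha*Re z + i*alpha*Im z

# (C, R, +, .): complex scalar times real "vector" x.
scalar, x = 1j, 2.0
product_cr = scalar * x           # purely imaginary, hence not in R

print(product_rc, product_cr, in_reals(product_cr))
```

A single scalar and vector whose product fails to be real is enough to show that closure under multiplication fails for $(\mathbb{C},\mathbb{R},+,\cdot)$.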

Example: Let $\mathbb{R}^n$ denote the set of ordered $n$-tuples of real numbers. Elements of $\mathbb{R}^n$ are denoted by $x=\{x_1,x_2,\dots,x_n\}$, where all the $x_k$'s are real. We turn this set into a vector space by first defining the operation of addition among pairs of its elements, identifying an appropriate scalar field over which it is defined, and defining the operation of multiplication over every scalar of the field and every element of $\mathbb{R}^n$, with the two operations consistent with the above enumerated properties.

For all pairs of elements $x=\{x_1,x_2,\dots,x_n\}$ and $y=\{y_1,y_2,\dots,y_n\}$ of $\mathbb{R}^n$, we define their sum to be
\begin{eqnarray}
x+y&=&\{x_1,x_2,\dots,x_n\}+\{y_1,y_2,\dots,y_n\}\nonumber\\
&=&\{x_1+y_1,x_2+y_2,\dots,x_n+y_n\}, \label{sum1}
\end{eqnarray}where the addition in $(x_k+y_k)$ is the usual addition of real numbers. Since the reals form a field under their usual addition, the definition \ref{sum1} satisfies all the requirements for an operation of addition of pairs of elements of $\mathbb{R}^n$, in particular, the sum in \ref{sum1} belongs to $\mathbb{R}^n$.

Suppose we wish to define a vector space out of $\mathbb{R}^n$ over the real field $\mathbb{R}$. Then we choose our scalar field to be $\mathbb{R}$. For every real $r$ in $\mathbb{R}$ and element $x=\{x_1,x_2,\dots,x_n\}$ in $\mathbb{R}^n$, we define
\begin{equation}\label{prod1}
r\cdot x=r\cdot \{x_1,x_2,\dots,x_n\} = \{r x_1,r x_2,\dots,r x_n\},
\end{equation} where the product $r x_k$ is the usual product of real numbers. Again from the field properties of the reals under their usual multiplication, the product \ref{prod1} satisfies all the required properties. If we wanted to define a complex vector space out of $\mathbb{R}^n$ with the $\cdot$ operation given by \ref{prod1} and $r$ replaced by a complex number, it would not be possible, because the right hand side of \ref{prod1} would in general be an $n$-tuple of complex numbers, which is not an element of $\mathbb{R}^n$.

Now that we have defined the operations $+$ and $\cdot$, it remains to show that the last two properties are satisfied. The zero vector $\theta$ is usually not specified as part of the given set from the start. It is identified and shown to exist in the set only after the operation of addition has been defined. That is, the zero vector depends on the $+$ operation. It is important to recognize that different $+$ operations can be assigned to one and the same set, and that these different operations can lead to different zero elements of the set. Since $\theta$ depends on the $+$ operation, the solution to $\psi+\psi'=\theta$ likewise depends on the given $+$ operation.
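
As an illustration of this dependence (an example of our own choosing, not part of the construction of $\mathbb{R}^n$), take the set $\mathbb{R}$ and define, with $\oplus$ and $\odot$ as hypothetical symbols for a non-standard pair of operations,
\begin{equation}
x\oplus y = x+y-1, \qquad r\odot x = r(x-1)+1, \nonumber
\end{equation}for all real $x$, $y$ and $r$. Solving $x\oplus\theta=x$ gives $x+\theta-1=x$, so that the zero vector is $\theta=1$, not the number $0$. Likewise, solving $x\oplus x'=\theta$ gives $x+x'-1=1$, so that $x'=2-x$. The remaining properties can be checked in the same way; for instance, $r\odot(x\oplus y)=r(x+y-2)+1=(r\odot x)\oplus(r\odot y)$. The same set $\mathbb{R}$ thus carries two different vector space structures with two different zero vectors.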

For the $+$ operation defined in \ref{sum1}, the solution to $x+\theta=x$ for every $x$ in $\mathbb{R}^n$ is the element $\theta=\{0,0,\dots,0\}$, the $n$-tuple of zeros. Since the $n$-tuple of zeros belongs to $\mathbb{R}^n$, the zero vector exists for the given $+$ operation. We refer to $\theta$ as the zero vector not because it is an $n$-tuple of zeros; it is the zero vector because it is the solution to $x+\theta=x$ for the given $+$ operation. Also for every $x=\{x_1,x_2,\dots,x_n\}$ in $\mathbb{R}^n$, we have the unique solution $x'=\{-x_1,-x_2,\dots,-x_n\}$ to $x+x'=\theta$. We have thus established that all the properties of a vector space are satisfied.

Then the quadruple $(\mathbb{R},\mathbb{R}^n,+,\cdot)$, with $+$ and $\cdot$ defined by \ref{sum1} and \ref{prod1} respectively, is a real vector space. From now on whenever we refer to $\mathbb{R}^n$ as a vector space we will always mean this quadruple. $\square$
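
The construction above can be spot-checked numerically. What follows is a minimal Python sketch, assuming $n=3$ and representing elements of $\mathbb{R}^n$ as tuples of floats; the names `add`, `scale` and `in_Rn` are hypothetical helpers of our own. It tests a few of the properties for sample vectors only, so it is an illustration and not a proof.

```python
def add(x, y):
    """Componentwise sum of two n-tuples, as in the definition of x + y."""
    return tuple(a + b for a, b in zip(x, y))

def scale(r, x):
    """Product of a scalar r with an n-tuple x, as in the definition of r . x."""
    return tuple(r * a for a in x)

def in_Rn(x, n):
    """True if x is an n-tuple whose entries are all real (Python floats)."""
    return len(x) == n and all(isinstance(a, float) for a in x)

# Sample vectors and scalar; the values are exactly representable as floats,
# so the equality checks below are not affected by rounding.
x = (1.0, -2.0, 3.5)
y = (0.5, 4.0, -1.0)
r = 2.0
zero = (0.0, 0.0, 0.0)  # candidate zero vector: the n-tuple of zeros

assert in_Rn(add(x, y), 3)                # the sum stays in R^n
assert add(x, y) == add(y, x)             # addition is commutative
assert add(x, zero) == x                  # zero solves x + theta = x
assert add(x, scale(-1.0, x)) == zero     # x' = -x solves x + x' = theta
assert scale(r, add(x, y)) == add(scale(r, x), scale(r, y))  # distributivity

# A complex scalar takes us out of R^n: the entries are no longer real.
assert not in_Rn(scale(1j, x), 3)
```

The last check mirrors the remark that $\mathbb{R}^n$ with the product \ref{prod1} cannot be made into a complex vector space: multiplying by the complex scalar `1j` yields a tuple of complex numbers.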

Example: Let $\mathbb{C}^n$ denote the set of ordered $n$-tuples of complex numbers. Its elements are denoted by $z=\{z_1,z_2,\dots,z_n\}$. Following the steps leading to the construction of the real vector space $\mathbb{R}^n$, it can be shown that the quadruple $(\mathbb{C},\mathbb{C}^n,+,\cdot)$, with the operations $+$ and $\cdot$ defined respectively by
\begin{eqnarray}
z+w&=&\{z_1,z_2,\dots,z_n\}+\{w_1,w_2,\dots,w_n\}\nonumber\\
&=&\{z_1+w_1,z_2+w_2,\dots,z_n+w_n\}
\end{eqnarray}\begin{equation}
c\cdot z=c\cdot \{z_1,z_2,\dots,z_n\}=\{c z_1,c z_2,\dots, c z_n\}
\end{equation}for all $z,w$ in $\mathbb{C}^n$ and every complex $c$, is a complex vector space. From now on whenever we refer to $\mathbb{C}^n$ as a vector space we will mean the complex vector space $(\mathbb{C},\mathbb{C}^n,+,\cdot)$. $\square$

Example: Let $C(\mathbb{R})$ denote the set of continuous, complex-valued functions on the real line $\mathbb{R}$. The function $\zeta(x)=0$ for all $x\in\mathbb{R}$ belongs to the set $C(\mathbb{R})$. The quadruple $(\mathbb{R},C(\mathbb{R}),+,\cdot)$, with $+$ defined by the pointwise addition of functions and $\cdot$ by the pointwise multiplication of a function by a real scalar, is a real vector space with $\zeta(x)$ as the zero vector. On the other hand, the quadruple $(\mathbb{C},C(\mathbb{R}),+,\cdot)$, with $\cdot$ now the pointwise multiplication of a function by a complex scalar, is a complex vector space with $\zeta(x)$ as the zero vector. $\square$
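
The pointwise operations on $C(\mathbb{R})$ can be sketched in Python as follows. This is an illustration under our own assumptions: functions are represented as Python callables, the names `add`, `scale`, `zeta` and `agree` are hypothetical, and the properties are compared only at a handful of sample points, which cannot establish them for all $x$.

```python
import cmath

def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(c, f):
    """Pointwise scalar multiple: (c . f)(x) = c f(x)."""
    return lambda x: c * f(x)

def zeta(x):
    """The zero function, zeta(x) = 0 for all real x."""
    return 0.0

def agree(f, g, points, tol=1e-12):
    """Compare two functions at the sample points, up to a small tolerance."""
    return all(abs(f(x) - g(x)) <= tol for x in points)

# Two continuous, complex-valued sample functions on the real line.
f = lambda x: cmath.exp(1j * x)
g = lambda x: complex(x * x, -x)
points = [-2.0, -0.5, 0.0, 1.0, 3.0]

assert agree(add(f, zeta), f, points)             # zeta acts as the zero vector
assert agree(add(f, scale(-1, f)), zeta, points)  # -f is the inverse of f
# Distributivity with a complex scalar, as in the complex vector space case.
assert agree(scale(2j, add(f, g)), add(scale(2j, f), scale(2j, g)), points)
```

Restricting the scalar `c` to real values corresponds to the real vector space, while allowing complex values corresponds to the complex one; in both cases the result is again a complex-valued function.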

Monday, January 19, 2015

Where weirdness runs supreme

Quantum mechanics provides the most accurate description of our physical universe so far. It reigns supreme in its scope, from the smallest known particles that are the quarks to the largest conglomeration of objects that is the cosmos. It accounts for all the observed properties of the elements of the periodic table, and the molecules that form from them, so that life as we know it has quantum mechanics at its foundation. Every experiment to date to test quantum mechanics to exacting standards has only increased our confidence that it paints a correct picture of the universe. Yet quantum mechanics is weird to the core.

The confirmed predictions of quantum mechanics run smack against our sane expectations and corporeal experiences. Your pet can only be either dead or alive, but never dead and alive at once, a seemingly irrefutable fact established by our everyday experiences of “reality”. Yet quantum mechanics says otherwise. Your pet can not only be either dead or alive, it can also be both dead and alive at once: not a paralyzed half-dead-half-alive pet but a whole pet simultaneously dead and alive. This is clearly a travesty of logic of the highest order and heretical at best. But your pet in some vague state of being dead and alive is an established fact by real experiments that can be replicated anytime, anywhere. Physical reality according to quantum mechanics is not the physical reality that we know of. In some ways our experiences are mere reductions of the real reality according to quantum mechanics.

The culprit behind quantum weirdness is the principle of quantum superposition. Given two mutually exclusive alternatives, say dead or alive, the superposition principle says that if there is nothing in principle to distinguish the two alternatives then the alternatives are simultaneously true. So if you have a living cat and isolate it from the rest of the universe, say by placing it inside a sufficiently isolated box, then quantum mechanics says that the cat will eventually be in a morbid state of being dead and alive at once. That is so because isolating the cat from the rest of the universe removes all possibilities of distinguishing the two mutually exclusive states of the cat. By the superposition principle, the two alternatives must be simultaneously true so that the cat is dead and alive. But how do we know that this is indeed the correct description of the cat when we are not looking in the first place? Without the superposition principle, there would be no atoms.

The principle of superposition is weird. But it spawns a much weirder phenomenon: quantum entanglement. Entanglement arises when two initially isolated objects come into contact with each other, say when two electrons collide. The interaction makes the separate objects into a single, inseparable quantum object. The individual identities of the initial objects have been lost and one can only ascertain the state of the entangled objects as a whole but not the separate states of the constituent objects. Equivalently for entangled objects, we cannot infer the state of the whole by looking at its parts. This is contrary to our expectations. Given a dismantled chair, you can infer from the parts by mere inspection that they came from a chair. But not when the parts are quantum objects that have entangled. In our everyday experience, the information on the whole is contained entirely in its parts. For entangled objects, it is possible that no information on the whole is ever encoded in its parts, so that no amount of inspection of the parts may yield information on their combined system.

Quantum entanglement between two objects also arises when the objects have been produced at the same time from a common source, such as when the objects are fragments of a disintegrating body. This happens, for example, when a photon, the grain of light, disintegrates into an electron and a positron, a process known as pair creation. The electron and the positron are entangled in this case and are inextricably linked to one another. They are not separate objects but form a single entity, even when they fly apart to the “opposite ends” of the universe, separated by millions of light years. As such, anything that happens to the electron has an instantaneous effect on the positron, however far apart the two are. The link between them provided by quantum entanglement seems to act as a hyper-highway allowing an infinitely fast transmission of information from the electron to the positron, that is, at a speed infinitely faster than the speed of light. Einstein called it “spooky action at a distance”, which he could not accept as reflecting the true description of physical reality and thus rejected it. However, entanglement prevailed over Einstein.

Quantum mechanics is weird. And its weirdness does not stop at quantum entanglement. The quantum superposition principle has flung open the door to the odd reality that has been hidden from us. If there is anything we have learned from recent developments, it is this: The room that the superposition principle has led us into is vast and much remains to be explored. Perhaps farther down the room we may be able to reconcile our everyday version of reality with the version of quantum mechanics. However, it is more likely that our reality is a mere mirage of the weird quantum reality. And the best that we can do is to embrace weirdness and conquer the future with it.
