Shortcuts to adiabaticity applied to nonequilibrium entropy production: an information geometry viewpoint

Kazutaka Takahashi

doi:10.1088/1367-2630/aa9534

1. Introduction

Understanding nonequilibrium properties of dynamical systems is a fascinating topic in physics and has been studied intensively. The fluctuations of thermodynamic functions are considered to be key properties, and the Jarzynski equality [1, 2] and the fluctuation theorem [3, 4] play the prominent roles. The nonequilibrium entropy production is one of quantities to measure nonequilibrium properties of the system and has been studied in many contexts. Especially, knowing the lower bound is an important task since it determines irreversibility, dissipation properties, efficiency and so on [5–11].

Thermally isolated quantum systems can be treated by the unitary dynamics of the Schrödinger equation. In this paper we characterize the dynamics by the method of shortcuts to adiabaticity (STA). This method enables us to achieve an adiabatic dynamics in a finite time. To prevent the nonadiabatic transitions, we introduce an additional term called the counterdiabatic term. The fundamental idea was pointed out using a simple two-level Hamiltonian in [12] and the general formulation was developed in several works [13–16]. Since then, the method has been intensively studied in various way [17]. We can find applications to simple systems [18, 19], scale-invariant systems [20], many-body systems [21–23], and classical systems [24–27] and so on. The method can also be implemented experimentally to several systems [28–31]. It is also expected to be applied to quantum computations such as the quantum annealing.

It should be stressed that STA is applied to any dynamical systems. STA is useful not only to control the system but also to describe general unitary dynamics. As we describe below, the system Hamiltonian is separated into two parts, $\hat{H}(t)={\hat{H}}_{0}(t)+{\hat{H}}_{1}(t)$ , and the state satisfying the Schrödinger equation is given by adiabatic states of ${\hat{H}}_{0}(t)$ . Then, it would be an interesting problem to apply this separation to general nonequilibrium processes.

In STA, a cost of the time evolution was studied in [32–35]. A trade-off relation between time, energy fluctuation and state distance is known as the quantum speed limit [36] and was discussed in the context of STA in [37]. The applications of STA to thermodynamic systems were studied in several works [38–41]. A universal trade-off relation was derived from work fluctuation relations in [42].

In this paper, we study properties of the nonequilibrium entropy production that are applicable to general nonequilibrium processes. In thermally isolated systems, the entropy is directly related to the work average and the present result is essentially equivalent to the result in [42]. However, the entropy is represented by the Kullback–Leibler (KL) divergence, which leads us naturally to the information-geometric interpretation of the nonequilibrium process. Establishing this novel picture is the main aim of the present work.

The organization of this paper is as follows. In section 2, we discuss how a given Hamiltonian is separated into two parts. Then, the method is applied to the nonequilibrium entropy production in section 3. In section 4, we discuss lower bounds of the entropy by using the improved Jensen inequalities and derive a trade-off relation. The last section 5 is devoted to conclusions.

2. STA for general dynamical systems

2.1. General formula

We start from reviewing the method of STA, somewhat in a different way from the standard prescription [13, 15], with the emphasis on its applicability to general dynamical systems. For a given time-dependent Hamiltonian $\hat{H}(t)$ and an initial state $| \psi (0)\rangle$ , the time evolution of the state, $| \psi (t)\rangle$ , satisfies the Schrödinger equation

$\begin{eqnarray}&&{\rm{i}}\displaystyle \frac{\partial }{\partial t}| \psi (t)\rangle =\hat{H}(t)| \psi (t)\rangle ,\end{eqnarray} \tag{ 1 }$

where we put ${\hslash }=1$ . When we start the time evolution from an eigenstate of the initial Hamiltonian $| n\rangle$ satisfying $\hat{H}(0)| n\rangle ={\epsilon }_{n}(0)| n\rangle$ , the state is written as

$\begin{eqnarray}&&| {\psi }_{n}(t)\rangle =\hat{U}(t)| n\rangle ,\end{eqnarray} \tag{ 2 }$

where $\hat{U}(t)$ is the time evolution operator. Generally, the state vector is defined on a Hilbert space and the total number of indices is equal to the dimension of the space. The eigenstates $\{| n\rangle \}$ satisfy the orthonormal relation $\langle m| n\rangle ={\delta }_{m,n}$ and the completeness relation ${\sum }_{n}| n\rangle \langle n| =1$ .

We write the Hamiltonian using the basis $\{| {\psi }_{n}(t)\rangle \}$ . The Hamiltonian is separated into the diagonal and offdiagonal parts: $\hat{H}(t)={\hat{H}}_{0}(t)+{\hat{H}}_{1}(t)$ . They are written respectively as

$\begin{eqnarray}&&{\hat{H}}_{0}(t)=\displaystyle \sum _{n}| {\psi }_{n}(t)\rangle \langle {\psi }_{n}(t)| \hat{H}(t)| {\psi }_{n}(t)\rangle \langle {\psi }_{n}(t)| ,\end{eqnarray} \tag{ 3 }$

$\begin{eqnarray}&&{\hat{H}}_{1}(t)=\displaystyle \sum _{m\ne n}| {\psi }_{m}(t)\rangle \langle {\psi }_{m}(t)| \hat{H}(t)| {\psi }_{n}(t)\rangle \langle {\psi }_{n}(t)| .\end{eqnarray} \tag{ 4 }$

This separation indicates that the state $| \psi (t)\rangle$ is given by the eigenstates of ${\hat{H}}_{0}(t)$ with the eigenvalue ${\epsilon }_{n}(t)=\langle {\psi }_{n}(t)| \hat{H}(t)| {\psi }_{n}(t)\rangle$ . The most general form of the state is

$\begin{eqnarray}&&| \psi (t)\rangle =\sum _{n}{c}_{n}| {\psi }_{n}(t)\rangle ,\end{eqnarray} \tag{ 5 }$

where c_n is a time-independent constant. ${\hat{H}}_{1}(t)$ is called the counterdiabatic term and is rewritten by using the Schrödinger equation as

$\begin{eqnarray}&&{\hat{H}}_{1}(t)={\rm{i}}\displaystyle \sum _{m\ne n}| {\psi }_{m}(t)\rangle \langle {\psi }_{m}(t)| {\dot{\psi }}_{n}(t)\rangle \langle {\psi }_{n}(t)| ,\end{eqnarray} \tag{ 6 }$

where the dot symbol denotes the time derivative. The counterdiabatic term prevents nonadiabatic transitions between instantaneous eigenstates of ${\hat{H}}_{0}(t)$ .

We note that the eigenstate of ${\hat{H}}_{0}(t)$ does not necessarily satisfy the Schrödinger equation. Following the definition of the adiabatic state, we need to add an appropriate phase as

$\begin{eqnarray}&&| {\psi }_{n}(t)\rangle =\exp \left(-{\rm{i}}{\int }_{0}^{t}{\rm{d}}t^{\prime} \,{\epsilon }_{n}(t^{\prime} )-{\int }_{0}^{t}{\rm{d}}t^{\prime} \,\langle n(t^{\prime} )| \dot{n}(t^{\prime} )\rangle \right)| n(t)\rangle ,\end{eqnarray} \tag{ 7 }$

where $| n(t)\rangle$ is an eigenstate of ${\hat{H}}_{0}(t)$ . Using the eigenstate set $\{| n(t)\rangle \}$ , we can write the Hamiltonians (3) and (6) in the same form with the replacement $| {\psi }_{n}(t)\rangle \to | n(t)\rangle$ .

Thus, the problem of solving the Schrödinger equation for a given Hamiltonian $\hat{H}(t)$ reduces to finding the proper separation $\hat{H}(t)={\hat{H}}_{0}(t)+{\hat{H}}_{1}(t)$ . In the engineering problem, we consider ${\hat{H}}_{0}(t)$ as the original Hamiltonian and the additional counterdiabatic term is introduced to prevent the nonadiabatic transitions. However, this procedure is problematic in most cases since the counterdiabatic term generally takes a complicated form and is hard to manipulate [23]. Otherwise, we can consider the inverse engineering to keep the original form of the Hamiltonian [16]. Here we set up the problem by defining the total Hamiltonian so that the method is applicable to any dynamical systems. Although the separation of the Hamiltonian is possible in any systems, it is generally a difficult problem to find the proper separation.

The time dependence of the Hamiltonian appears through parameters in the Hamiltonian. Since the state is given by the instantaneous eigenstates of ${\hat{H}}_{0}$ , we consider that the eigenstates and eigenvalues depend on parameters $\lambda (t)$ as $| n(\lambda (t))\rangle$ and ${\epsilon }_{n}(\lambda (t))$ respectively. On the other hand, the time derivative appears in ${\hat{H}}_{1}$ which means that the counterdiabatic term is written as ${\hat{H}}_{1}=\dot{\lambda }(t)\hat{\xi }(\lambda (t))$ where

$\begin{eqnarray}&&\hat{\xi }(\lambda )={\rm{i}}\displaystyle \sum _{m\ne n}| m(\lambda )\rangle \langle m(\lambda )| {\partial }_{\lambda }n(\lambda )\rangle \langle n(\lambda )| .\end{eqnarray} \tag{ 8 }$

It was discussed in [27] that, for classical systems, ξ is represented by the λ-derivative of a generalized action. The action is introduced by using the Hamilton–Jacobi theory and reduces to the adiabatic invariant in a special case, which imply that the counterdiabatic term characterizes the dynamics. Although we discuss quantum systems in this paper, the following discussions are applicable to classical systems as well.

2.2. Quantum quench

It is worth mentioning that the method of STA is applicable even when a prepared initial state is driven by a time-independent total Hamiltonian. Using the time-dependent basis, we can introduce a time-dependent ${\hat{H}}_{0}(t)$ and ${\hat{H}}_{1}(t)$ to write $\hat{H}={\hat{H}}_{0}(t)+{\hat{H}}_{1}(t)$ . We note that the separation is useful only when the initial state is not in the eigenstate of the Hamiltonian. Such a situation is realized in the problem of quantum quench where we consider the state evolution under a sudden change of the Hamiltonian [43].

First, we prepare the state as an eigenstate of the Hamiltonian ${\hat{H}}^{(0)}$ . The eigenstate $| n\rangle$ satisfies the eigenvalue equation

$\begin{eqnarray}&&{\hat{H}}^{(0)}| n\rangle ={\epsilon }_{n}^{(0)}| n\rangle .\end{eqnarray} \tag{ 9 }$

Then, at t = 0, we start the state evolution by a different Hamiltonian $\hat{H}$ . The state is given by $| {\psi }_{n}(t)\rangle ={{\rm{e}}}^{-{\rm{i}}\hat{H}t}| n\rangle$ where we set the initial condition $| {\psi }_{n}(0)\rangle =| n\rangle$ .

As we explained in the general formulation, the Hamiltonian is separated into two parts by using the basis $| {\psi }_{n}(t)\rangle$ . Using the fact that the total Hamiltonian is time-independent at $t\gt 0$ , we can write $\hat{H}={\hat{H}}_{0}(0)+{\hat{H}}_{1}(0)$ where ${\hat{H}}_{0}(0)$ is the diagonal part with respect to the basis $\{| n\rangle \}$ and satisfies $[{\hat{H}}^{(0)},{\hat{H}}_{0}(0)]=0$ . ${\hat{H}}_{1}(0)$ is the offdiagonal part and is defined by $\hat{H}-{\hat{H}}_{0}(0)$ . For the time-evolved state, ${\hat{H}}_{0}(t)$ and ${\hat{H}}_{1}(t)$ are written respectively as

$\begin{eqnarray}&&{\hat{H}}_{0}(t)={{\rm{e}}}^{-{\rm{i}}\hat{H}t}{\hat{H}}_{0}(0){{\rm{e}}}^{{\rm{i}}\hat{H}t}=\displaystyle \sum _{n}{\epsilon }_{n}| {\psi }_{n}(t)\rangle \langle {\psi }_{n}(t)| ,\end{eqnarray} \tag{ 10 }$

$\begin{eqnarray}&&{\hat{H}}_{1}(t)={{\rm{e}}}^{-{\rm{i}}\hat{H}t}{\hat{H}}_{1}(0){{\rm{e}}}^{{\rm{i}}\hat{H}t},\end{eqnarray} \tag{ 11 }$

where ${\epsilon }_{n}=\langle {\psi }_{n}(t)| \hat{H}| {\psi }_{n}(t)\rangle$ is time independent. The problem of quantum quench is reduced to solving the eigenvalue equation if we know the form of ${\hat{H}}_{0}(t)$ . Of course, it is still a difficult problem in general.

2.3. Example

The simplest example is the system where the dimension of the Hilbert space is equal to two. The general Hamiltonian is written by using the Pauli-operator vector $\hat{{\boldsymbol{\sigma }}}$ as

$\begin{eqnarray}&&\hat{H}(t)=\displaystyle \frac{1}{2}{\boldsymbol{h}}(t)\cdot \hat{{\boldsymbol{\sigma }}}=\displaystyle \frac{1}{2}\left({{\boldsymbol{h}}}_{0}(t)+\displaystyle \frac{{{\boldsymbol{h}}}_{0}(t)\times {\dot{{\boldsymbol{h}}}}_{0}(t)}{{{\boldsymbol{h}}}_{0}^{2}(t)}\right)\cdot \hat{{\boldsymbol{\sigma }}}.\end{eqnarray} \tag{ 12 }$

The second equality denotes the separation of ${\hat{H}}_{0}(t)$ and ${\hat{H}}_{1}(t)$ . For a given vector function ${\boldsymbol{h}}(t)$ , we need to find ${{\boldsymbol{h}}}_{0}(t)$ . Although the explicit general formula to write ${{\boldsymbol{h}}}_{0}(t)$ in terms of ${\boldsymbol{h}}(t)$ is not known, it is clear from the above general discussion that such a function ${{\boldsymbol{h}}}_{0}(t)$ can be obtained in principle.

An example where the total Hamiltonian is time-independent was treated in [22]. We exploit that example to see below how the method works when it is applied to the problem of quantum quench.

We consider ${\boldsymbol{h}}=(0,0,h)$ where h is a real constant. In this case, the Schrödinger equation is easily solved by the standard textbook method. The general form of the state is given by

$\begin{eqnarray}&&| \psi (t)\rangle ={c}_{+}{{\rm{e}}}^{-\tfrac{{\rm{i}}{h}{t}}{2}}| +\rangle +{c}_{-}{{\rm{e}}}^{\tfrac{{\rm{i}}{h}{t}}{2}}| -\rangle ,\end{eqnarray} \tag{ 13 }$

where $| \pm \rangle$ are eigenstates of ${\hat{\sigma }}^{z}$ satisfying ${\hat{\sigma }}^{z}| \pm \rangle =\pm | \pm \rangle$ , and ${c}_{\pm }$ are complex constant values determined from the initial condition. If we start the time evolution from one of the eigenstates, the state remains in the same eigenstate throughout the time evolution.

We analyze the same system by using STA. Using the formula in (12), we can find the most general form of ${{\boldsymbol{h}}}_{0}(t)$ as

$\begin{eqnarray}&&{{\boldsymbol{h}}}_{0}(t)=h\cos {\theta }_{0}\left(\begin{array}{c}\sin {\theta }_{0}\cos ({ht}+{\varphi }_{0})\\ \sin {\theta }_{0}\sin ({ht}+{\varphi }_{0})\\ \cos {\theta }_{0}\end{array}\right),\end{eqnarray} \tag{ 14 }$

where ${\theta }_{0}$ and ${\varphi }_{0}$ are real constants. Each part of the Hamiltonian is given respectively as

$\begin{eqnarray}&&{\hat{H}}_{0}(t)=\displaystyle \frac{1}{2}h\cos {\theta }_{0}\left(\begin{array}{c}\sin {\theta }_{0}\cos ({ht}+{\varphi }_{0})\\ \sin {\theta }_{0}\sin ({ht}+{\varphi }_{0})\\ \cos {\theta }_{0}\end{array}\right)\cdot \hat{{\boldsymbol{\sigma }}},\end{eqnarray} \tag{ 15 }$

$\begin{eqnarray}&&{\hat{H}}_{1}(t)=-\displaystyle \frac{1}{2}h\sin {\theta }_{0}\left(\begin{array}{c}\cos {\theta }_{0}\cos ({ht}+{\varphi }_{0})\\ \cos {\theta }_{0}\sin ({ht}+{\varphi }_{0})\\ -\sin {\theta }_{0}\end{array}\right)\cdot \hat{{\boldsymbol{\sigma }}}.\end{eqnarray} \tag{ 16 }$

In this example, we see that the time-dependent parameter is given by $\lambda (t)={ht}$ . The corresponding state is given by a linear combination of the adiabatic states of ${\hat{H}}_{0}(t)$ . We obtain

$\begin{eqnarray}&&| \psi (t)\rangle ={\tilde{c}}_{+}{{\rm{e}}}^{-\tfrac{{\rm{i}}{h}{t}}{2}}\left(\begin{array}{c}\cos \tfrac{{\theta }_{0}}{2}\\ {{\rm{e}}}^{{\rm{i}}({ht}+{\varphi }_{0})}\sin \tfrac{{\theta }_{0}}{2}\end{array}\right)+{\tilde{c}}_{-}{{\rm{e}}}^{\tfrac{{\rm{i}}{h}{t}}{2}}\left(\begin{array}{c}-{{\rm{e}}}^{-{\rm{i}}({ht}+{\varphi }_{0})}\sin \tfrac{{\theta }_{0}}{2}\\ \cos \tfrac{{\theta }_{0}}{2}\end{array}\right),\end{eqnarray} \tag{ 17 }$

where the vector representation ${\left(\begin{array}{cc}a & b\end{array}\right)}^{{\rm{T}}}=a| +\rangle +b| -\rangle$ is used and ${\tilde{c}}_{\pm }$ are complex constant values. This state is equivalent to (13) but the separation of the vector has a different meaning. If we start the time evolution from one of the eigenstates for ${\hat{H}}_{0}(0)$ (one of two vectors at t = 0 in the above equation), the state remains in the same eigenstate of ${\hat{H}}_{0}(t)$ (one of two vectors at t in the above equation) throughout the time evolution. This picture holds even when the initial state is not in the eigenstate of the initial Hamiltonian. We note that the eigenstate is time dependent in this case. This result will be a useful tool to understand the quench dynamics.

The important conclusion in this section is that the separation of the Hamiltonian is very general and is applied for arbitrary choices of the Hamiltonian as we see in the above derivation. This means that the general dynamics is characterized by STA. As a possible application, we consider the nonequilibrium entropy production in the following.

3. Nonequilibrium entropy production

3.1. Entropy production and Pythagorean theorem

To characterize nonequilibrium states, we use the work done on the system as one of the measure. We prepare the state in contact with a bath and the initial state is in equilibrium. Then, the system is thermally isolated from the outside and is evolved under a control by the external agent. The work is obtained by the two-time measurement scheme.

The initial state is assumed to be distributed according to the Boltzmann distribution ${p}_{n}(0)={{\rm{e}}}^{-\beta {\epsilon }_{n}(0)}/{Z}_{0}$ where ${Z}_{0}={\sum }_{n}{{\rm{e}}}^{-\beta {\epsilon }_{n}(0)}$ and β is the inverse temperature. The time evolution of the system is described by the time-dependent Hamiltonian $\hat{H}(t)$ and the work is defined by the energy difference between the initial and final states. Since the initial state is distributed randomly, we can define the work distribution function

$\begin{eqnarray}P(W,t) & = & \displaystyle \sum _{n}{p}_{n}(0)\langle {\psi }_{n}(t)| \delta (W-(\hat{H}(t)-{\epsilon }_{n}(0)))| {\psi }_{n}(t)\rangle \\ & = & \displaystyle \sum _{n}{p}_{n}(0)\langle n(t)| \delta (W-(\hat{H}(t)-{\epsilon }_{n}(0)))| n(t)\rangle .\end{eqnarray} \tag{ 18 }$

The main question here is whether we can find any useful information on this work distribution by using the separation of the Hamiltonian $\hat{H}(t)={\hat{H}}_{0}(t)+{\hat{H}}_{1}(t)$ .

We are mainly interested in the work average ${[W]}_{t}=\int {\rm{d}}W\,P(W,t)W$ . In [42], it was shown that the average is given by

$\begin{eqnarray}&&{[W]}_{t}=\displaystyle \sum _{n}{p}_{n}(0)\langle n(t)| (\hat{H}(t)-{\epsilon }_{n}(0))| n(t)\rangle =\displaystyle \sum _{n}{p}_{n}(0)({\epsilon }_{n}(t)-{\epsilon }_{n}(0)),\end{eqnarray} \tag{ 19 }$

with the use of the relation $\langle n(t)| {\hat{H}}_{1}(t)| n(t)\rangle =0$ . This equation shows that the counterdiabatic term ${\hat{H}}_{1}(t)$ does not contribute to the average. It can also be shown that the squared average ${[{W}^{2}]}_{t}$ is separated into two parts as

$\begin{eqnarray}&&{[{W}^{2}]}_{t}=\displaystyle \sum _{n}{p}_{n}(0){({\epsilon }_{n}(t)-{\epsilon }_{n}(0))}^{2}+\displaystyle \sum _{n}{p}_{n}(0)\langle n(t)| {\hat{H}}_{1}^{2}(t)| n(t)\rangle ,\end{eqnarray} \tag{ 20 }$

and the second term has a geometric meaning as we discuss below.

Using the averaged work, we define the nonequilibrium entropy production

$\begin{eqnarray}&&{\rm{\Sigma }}(t)=\beta {[W]}_{t}-\beta ({F}_{t}-{F}_{0}),\end{eqnarray} \tag{ 21 }$

where F_t is the free energy for the Hamiltonian $\hat{H}(t)$ and is defined as $-\beta {F}_{t}=\mathrm{ln}{Z}_{t}$ . ${Z}_{t}={\mathrm{Tre}}^{-\beta \hat{H}(t)}$ is the partition function defined at each t. We note that the final state is not necessarily in equilibrium.

${\rm{\Sigma }}(t)$ is a non-negative quantity. This property is understood from the relation that ${\rm{\Sigma }}(t)$ is written by the KL divergence of two density operators:

$\begin{eqnarray}&&{\rm{\Sigma }}(t)={D}_{\mathrm{KL}}(\hat{\rho }(0\to t)| | \hat{\rho }(t)).\end{eqnarray} \tag{ 22 }$

We use (19) to derive this equation. The KL divergence is defined as ${D}_{\mathrm{KL}}(\hat{P}| | \hat{Q})=\mathrm{Tr}\hat{P}\mathrm{ln}\hat{P}-\mathrm{Tr}\hat{P}\mathrm{ln}\hat{Q}$ and the density operators are defined as

$\begin{eqnarray}&&\hat{\rho }(0\to t)=\displaystyle \frac{1}{{Z}_{0}}\hat{U}(t){{\rm{e}}}^{-\beta \hat{H}(0)}{\hat{U}}^{\dagger }(t)=\displaystyle \frac{1}{{Z}_{0}}\exp (-\beta \hat{U}(t)\hat{H}(0){\hat{U}}^{\dagger }(t)),\end{eqnarray} \tag{ 23 }$

$\begin{eqnarray}&&\hat{\rho }(t)=\displaystyle \frac{1}{{Z}_{t}}{{\rm{e}}}^{-\beta \hat{H}(t)}.\end{eqnarray} \tag{ 24 }$

$\hat{\rho }(0\to t)=\hat{U}(t)\hat{\rho }(0){\hat{U}}^{\dagger }(t)$ is the time-evolved state of the initial distribution $\hat{\rho }(0)$ and represents the actual distribution of states at each t. On the other hand, $\hat{\rho }(t)$ represents the distribution for the canonical equilibrium states of the Hamiltonian $\hat{H}(t)$ . Generally, the evolved state is not in equilibrium and these operators are not equal with each other. The above relation shows that the entropy production represents how far the nonequilibrium state is from the equilibrium one and is written by the divergence of two distributions.

As we see in (23), $\hat{\rho }(0\to t)$ is characterized by the effective Hamiltonian $\hat{H}(0\to t)=\hat{U}(t)\hat{H}(0){\hat{U}}^{\dagger }(t)$ . Its spectral decomposition is given by

$\begin{eqnarray}&&\hat{H}(0\to t)=\displaystyle \sum _{n}{\epsilon }_{n}(0)| n(t)\rangle \langle n(t)| ,\end{eqnarray} \tag{ 25 }$

which has a similar form to ${\hat{H}}_{0}(t)$ . The eigenstates are time dependent but the eigenvalues are not. This Hamiltonian satisfies the equation for the Lewis–Riesenfeld invariant [44]:

$\begin{eqnarray}&&{\rm{i}}\displaystyle \frac{\partial \hat{H}(0\to t)}{\partial t}=[\hat{H}(t),\hat{H}(0\to t)]=[{\hat{H}}_{1}(t),\hat{H}(0\to t)].\end{eqnarray} \tag{ 26 }$

We note that ${\hat{H}}_{0}(t)$ and $\hat{H}(0\to t)$ commutes with each other. This equation was studied systematically in [23] and the Lax form for classical nonlinear integrable systems is shown to be useful to find a pair $(\hat{H}(0\to t),{\hat{H}}_{1}(t))$ . The relation ${\hat{H}}_{0}(t)=\hat{H}(0\to t)$ holds when the energy eigenvalues of ${\hat{H}}_{0}(t)$ are independent of t. In this special case, the entropy is given by the divergence between two canonical distributions ${\hat{\rho }}_{0}(t)={{\rm{e}}}^{-\beta {\hat{H}}_{0}(t)}/{Z}_{t}^{(0)}$ and $\hat{\rho }(t)$ as

$\begin{eqnarray}&&{\rm{\Sigma }}(t)={D}_{\mathrm{KL}}(\hat{\rho }(0\to t)| | \hat{\rho }(t))={D}_{\mathrm{KL}}({\hat{\rho }}_{0}(t)| | \hat{\rho }(t))=\beta ({F}_{t}^{(0)}-{F}_{t}),\end{eqnarray} \tag{ 27 }$

where $-\beta {F}_{t}^{(0)}=\mathrm{ln}{Z}_{t}^{(0)}={\mathrm{lnTre}}^{-\beta {\hat{H}}_{0}(t)}$ . The entropy is given by the free energy difference. Here we use again $\langle n(t)| {\hat{H}}_{1}(t)| n(t)\rangle =0$ .

In the general case ${\hat{H}}_{0}(t)\ne \hat{H}(0\to t)$ , by using the separation of the Hamiltonian $\hat{H}(t)={\hat{H}}_{0}(t)+{\hat{H}}_{1}(t)$ , we can easily show the entropy production is separated into two parts:

$\begin{eqnarray}&&{\rm{\Sigma }}(t)={D}_{\mathrm{KL}}(\hat{\rho }(0\to t)| | \hat{\rho }(t))={{\rm{\Sigma }}}_{0}(t)+{{\rm{\Sigma }}}_{1}(t).\end{eqnarray} \tag{ 28 }$

Each term is written by using the KL divergence:

$\begin{eqnarray}{{\rm{\Sigma }}}_{0}(t) & = & {D}_{\mathrm{KL}}(\hat{\rho }(0\to t)| | {\hat{\rho }}_{0}(t))\\ & = & \displaystyle \frac{1}{{Z}_{0}}\displaystyle \sum _{n}{{\rm{e}}}^{-\beta {\epsilon }_{n}(0)}\beta ({\epsilon }_{n}(t)-{\epsilon }_{n}(0))+\beta ({F}_{0}-{F}_{t}^{(0)}),\end{eqnarray} \tag{ 29 }$

$\begin{eqnarray}&&{{\rm{\Sigma }}}_{1}(t)={D}_{\mathrm{KL}}({\hat{\rho }}_{0}(t)| | \hat{\rho }(t))=\beta ({F}_{t}^{(0)}-{F}_{t}).\end{eqnarray} \tag{ 30 }$

${{\rm{\Sigma }}}_{0}(t)$ represents the KL divergence between the canonical distributions of $\hat{H}(0\to t)$ and ${\hat{H}}_{0}(t)$ and is expressed by the spectrum distance between $\{{\epsilon }_{n}(0)\}$ and $\{{\epsilon }_{n}(t)\}$ . It is independent of ${\hat{H}}_{1}(t)$ . On the other hand, ${{\rm{\Sigma }}}_{1}(t)$ represents a distance due to the counterdiabatic term since it goes to zero when ${\hat{H}}_{1}(t)=0$ . Thus, the entropy production is separated into two parts and each part plays a different role.

We note that the difference between ${\rm{\Sigma }}(t)$ and ${{\rm{\Sigma }}}_{1}(t)$ has been studied in some works. In [39], the difference was studied for a process in an Otto cycle and the result was plotted for the harmonic-oscillator Hamiltonian. In [11], ${{\rm{\Sigma }}}_{1}(t)$ was defined in a process of the projective measurement to derive the inequality ${\rm{\Sigma }}(t)\geqslant {{\rm{\Sigma }}}_{1}(t)$ . Our result is derived as an equality and is applicable to general systems.

3.2. Information-geometric interpretation

It is well known that the KL divergence is a generalization of a squared distance measure. This means that (28) represents the Pythagorean theorem and has a geometric meaning. The theorem has been closely discussed in the field of information geometry [45]. In the following, we interpret the result (28) to refine the method of STA from the aspect of the information geometry [45].

The Hamiltonian is generally written as

$\begin{eqnarray}&&-\beta \hat{H}(\theta )=\displaystyle \sum _{i}{\theta }_{i}{\hat{X}}_{i}.\end{eqnarray} \tag{ 31 }$

Although the following discussions hold for classical systems as well, we use general finite-dimensional quantum systems for the description. Then, $\{{\hat{X}}_{i}\}$ represents a set of independent operators and the number of operators is determined by specifying the Hilbert space. The coefficient $\theta =({\theta }_{1},{\theta }_{2},\mathrm{...})$ plays the role of coordinates. The coordinate system is used to specify the probability distribution. In the present study we treat the canonical distribution

$\begin{eqnarray}&&\hat{\rho }(\theta )=\exp (-\beta \hat{H}(\theta )-\psi (\theta )).\end{eqnarray} \tag{ 32 }$

The normalization function $\psi (\theta )$ defined as $\psi (\theta )={\mathrm{lnTre}}^{-\beta \hat{H}(\theta )}$ is a convex function and represents the free energy $-\beta F$ in physics.

For a coordinate system θ where a convex function $\psi (\theta )$ is defined, we can introduce the dual coordinate system ${\theta }^{* }$ and the dual convex function ${\psi }^{* }({\theta }^{* })$ by using the Legendre transformation. The dual coordinate is defined by

$\begin{eqnarray}&&{\theta }_{i}^{* }=\displaystyle \frac{\partial }{\partial {\theta }_{i}}\psi (\theta ),\end{eqnarray} \tag{ 33 }$

and the corresponding convex function is

$\begin{eqnarray}&&{\psi }^{* }({\theta }^{* })=\theta \cdot {\theta }^{* }-\psi (\theta ),\end{eqnarray} \tag{ 34 }$

where we use the abbreviation $\theta \cdot {\theta }^{* }={\sum }_{i}{\theta }_{i}^{}{\theta }_{i}^{* }$ . In the canonical distribution, the dual coordinate is written as the canonical average of operators:

$\begin{eqnarray}&&{\theta }_{i}^{* }=\langle {\hat{X}}_{i}{\rangle }_{H}=\mathrm{Tr}{\hat{X}}_{i}\exp (-\beta \hat{H}(\theta )-\psi (\theta )).\end{eqnarray} \tag{ 35 }$

The state in the canonical distribution can also be uniquely specified by ${\theta }^{* }$ instead of using θ.

In a coordinate system where a convex function $\psi (\theta )$ is defined, the Bregman divergence is introduced as a distance measure

$\begin{eqnarray}&&{D}_{\psi }(\theta | | \theta ^{\prime} )=\psi (\theta )-\psi (\theta ^{\prime} )-\theta {{\prime} }^{* }\cdot (\theta -\theta ^{\prime} )=\psi (\theta )+{\psi }^{* }(\theta {{\prime} }^{* })-\theta \cdot \theta {{\prime} }^{* }.\end{eqnarray} \tag{ 36 }$

The last expression shows that the dual divergence can also be defined as ${D}_{\psi }^{* }({\theta }^{* }| | \theta {{\prime} }^{* })={D}_{\psi }(\theta ^{\prime} | | \theta )$ . We note that the divergence is not symmetric in general: ${D}_{\psi }(\theta | | \theta ^{\prime} )\ne {D}_{\psi }(\theta ^{\prime} | | \theta )$ . However, by considering an infinitesimal distance, it has a symmetric form, which defines the Riemannian metric in the manifold parametrized by the coordinate θ. In the present case where the probability distribution is given by the canonical distribution, the Bregman divergence is equivalent to the KL divergence:

$\begin{eqnarray}&&{D}_{\psi }(\theta | | \theta ^{\prime} )={D}_{\psi }^{* }(\theta {{\prime} }^{* }| | {\theta }^{* })={D}_{\mathrm{KL}}(\theta ^{\prime} | | \theta ).\end{eqnarray} \tag{ 37 }$

Now we discuss the geometric meaning of the Pythagorean theorem. Since we are interested in the KL divergence, we treat the dual divergence ${D}_{\psi }^{* }({\theta }^{* }| | \theta {{\prime} }^{* })={D}_{\mathrm{KL}}(\theta | | \theta ^{\prime} )$ . For given three points ${\theta }_{P}$ , ${\theta }_{Q}$ and ${\theta }_{R}$ , We consider the condition such that the dual Pythagorean theorem holds:

$\begin{eqnarray}&&{D}_{\psi }^{* }(P| | R)={D}_{\psi }^{* }(P| | Q)+{D}_{\psi }^{* }(Q| | R).\end{eqnarray} \tag{ 38 }$

A simple calculation gives

$\begin{eqnarray}&&({\theta }_{P}^{* }-{\theta }_{Q}^{* })\cdot ({\theta }_{Q}-{\theta }_{R})=0.\end{eqnarray} \tag{ 39 }$

The affine coordinate system θ introduced in (32) represents a dually flat manifold. The geodesic is parametrized as a straight line connecting two points, Q and R, as ${\theta }_{{QR}}(\tau )=\tau {\theta }_{Q}+(1-\tau ){\theta }_{R}$ where τ parametrizes the straight line and takes between 0 and 1. Then, ${\theta }_{Q}-{\theta }_{R}$ represents the tangent vector. In the same way the dual geodesic line is written by using the dual coordinate. Equation (39) means that the dual geodesic ${\theta }_{P}^{* }-{\theta }_{Q}^{* }$ is perpendicular to the geodesic ${\theta }_{Q}-{\theta }_{R}$ .

We apply the general argument to the present problem. Three points, P, Q and R, are represented by the canonical distributions of $\hat{H}(0\to t)$ , ${\hat{H}}_{0}(t)$ and $\hat{H}(t)$ respectively. Then, the counterdiabatic term is written as

$\begin{eqnarray}&&-\beta {\hat{H}}_{1}(t)=({\theta }_{R}(t)-{\theta }_{Q}(t))\cdot \hat{X}.\end{eqnarray} \tag{ 40 }$

To identify ${\theta }_{P}^{* }-{\theta }_{Q}^{* }$ , we note that the dual geodesic is represented by the canonical average of operators. The point P is the average in terms of $\hat{H}(0\to t)$ and Q is in terms of ${\hat{H}}_{0}(t)$ . These Hamiltonians are diagonalized by the basis $\{| n(t)\rangle \}$ . Then, the dual geodesic is defined on a submanifold where the Hamiltonian is diagonalized by the same basis. Using the relation $\langle n(t)| {\hat{H}}_{1}(t)| n(t)\rangle =0$ , we conclude that the perpendicular condition is represented as

$\begin{eqnarray}0 & = & -\beta \langle {\hat{H}}_{1}(t){\rangle }_{\hat{H}(0\to t)}+\beta \langle {\hat{H}}_{1}(t){\rangle }_{{\hat{H}}_{0}(t)}\\ & = & ({\theta }_{R}(t)-{\theta }_{Q}(t))\cdot (\langle \hat{X}{\rangle }_{\hat{H}(0\to t)}-\langle \hat{X}{\rangle }_{{\hat{H}}_{0}(t)})\\ & = & ({\theta }_{R}(t)-{\theta }_{Q}(t))\cdot ({\theta }_{P}^{* }(t)-{\theta }_{Q}^{* }(t)).\end{eqnarray} \tag{ 41 }$

The dual geodesic connecting P and Q is interpreted the dual projection of P to a flat submanifold. The flat submanifold includes points on the geodesic QR, and is parametrized by the coordinate θ. Each point is represented by the canonical distribution of the Hamiltonian ${\hat{H}}_{e}(t)={\hat{H}}_{0}(t)+{\theta }_{e}(t)\cdot \hat{X}$ . The coordinate ${\theta }_{e}(t)$ satisfies the perpendicular condition

$\begin{eqnarray}&&\langle n(t)| {\theta }_{e}(t)\cdot \hat{X}| n(t)\rangle =0.\end{eqnarray} \tag{ 42 }$

The point Q represents the nearest point of P in the 'e-flat' submanifold. In the same way, we can define the 'm-flat' submanifold including the dual geodesic PQ, which is parametrized by the dual coordinate. Then, the geodesic RQ represents the projection of the point R to the submanifold. This property is known as the projection theorem in the information geometry.

We summarize the information-geometric interpretation of the Pythagorean theorem in figure 1. This interpretation shows that the Hamiltonian ${\hat{H}}_{0}(t)$ plays an important role for the difference between $\hat{H}(0\to t)$ and $\hat{H}(t)$ . For a given flat manifold including R, the point Q is uniquely determined by the dual projection of P to the manifold. Of course, it is a difficult problem to find the proper manifold and this interpretation is of no use in general to find ${\hat{H}}_{0}(t)$ for a given $\hat{H}(t)$ . Nevertheless, we expect that this new picture will be a guiding principle to design the system.

**Figure 1.** Information-geometric interpretation of the nonequilibrium entropy production. The Hamiltonian is evolved from $\hat{H}(0)$ to three different forms $\hat{H}(0\to t)=\hat{U}(t)\hat{H}(0){\hat{U}}^{\dagger }(t)$ , ${\hat{H}}_{0}(t)$ and $\hat{H}(t)={\hat{H}}_{0}(t)+{\hat{H}}_{1}(t)$ , and their canonical distributions are denoted by P, Q and R respectively. Then, the points make a right triangle and the entropy production satisfies the Pythagorean theorem ${D}_{\mathrm{KL}}(P| | R)={D}_{\mathrm{KL}}(P| | Q)+{D}_{\mathrm{KL}}(Q| | R)$ . The dual geodesic PQ is perpendicular to the geodesic QR.
Download figure:
Standard image High-resolution image

4. Lower bounds of entropy production

4.1. Improved Jensen inequality

In the previous section, we studied that the nonequilibrium entropy is separated into two parts according to their roles and is represented by the KL divergence. The KL divergence represents a degree of separation of two points and the metric is obtained from an infinitesimal separation.

It is an interesting problem to study the lower bound of the entropy production since it characterizes the efficiency of the system control. The counterdiabatic term has a geometric meaning and is related to the Fubini–Study distance [46, 47]. In the setting we are studying in this paper, we consider a time evolution from $\hat{\rho }(0)$ to $\hat{\rho }(0\to t)=\hat{U}(t)\hat{\rho }(0){\hat{U}}^{\dagger }(t)$ . The length between two state is defined, according to [42], by

$\begin{eqnarray}{\ell }(\hat{\rho }(0),\hat{\rho }(0\to t)) & = & {\displaystyle \int }_{0}^{t}{\rm{d}}t^{\prime} \,\sqrt{\displaystyle \sum _{n}{p}_{n}(0)\langle \dot{n}(t^{\prime} )| {\hat{H}}_{1}^{2}(t^{\prime} )| \dot{n}(t^{\prime} )\rangle }\\ & = & {\displaystyle \int }_{0}^{t}{\rm{d}}t^{\prime} \,\sqrt{\displaystyle \sum _{n}{p}_{n}(0)\langle \dot{n}(t^{\prime} )| (1-| n(t^{\prime} )\rangle \langle n(t^{\prime} )| )| \dot{n}(t^{\prime} )\rangle }.\end{eqnarray} \tag{ 43 }$

This is considered to be a natural length since it satisfies ${\ell }(\hat{\rho }(0),\hat{\rho }(0\to t))\geqslant { \mathcal L }(\hat{\rho }(0),\hat{\rho }(0\to t))$ where ${ \mathcal L }$ represents the Bures distance:

$\begin{eqnarray}&&{ \mathcal L }({\hat{\rho }}_{1},{\hat{\rho }}_{2})=\arccos \,\mathrm{Tr}\sqrt{\sqrt{{\hat{\rho }}_{1}}{\hat{\rho }}_{2}\sqrt{{\hat{\rho }}_{1}}}.\end{eqnarray} \tag{ 44 }$

We note that the integrand in ℓappears in the squared average of the work as in (20). In the nonequilibrium entropy production, we treat the divergence between $\hat{\rho }(0\to t)$ and $\hat{\rho }(t)$ and it is expected to be bounded from below by an appropriate distance. From a mathematical point of view, the KL divergence is bounded below by the Bures distance. It was shown in [9, 10, 48] that

$\begin{eqnarray}&&{D}_{\mathrm{KL}}({\hat{\rho }}_{1}| | {\hat{\rho }}_{2})\geqslant \displaystyle \frac{8}{{\pi }^{2}}{{ \mathcal L }}^{2}({\hat{\rho }}_{1},{\hat{\rho }}_{2}).\end{eqnarray} \tag{ 45 }$

When this relation is applied to the entropy production ${\rm{\Sigma }}(t)={D}_{\mathrm{KL}}(\hat{\rho }(0\to t)| | \hat{\rho }(t))$ , we can find a lower limit. However, this relation holds at each t and does not suitable to characterize the efficiency of the time evolution. In this section, we study a different lower bound of the nonequilibrium entropy by using the improved Jensen inequality and apply it to derive a trade-off relation for the time evolution.

The property ${\rm{\Sigma }}(t)\geqslant 0$ can be shown by using the Jensen inequality

$\begin{eqnarray}&&{{\rm{e}}}^{-\beta {[W]}_{t}}\leqslant {[{{\rm{e}}}^{-\beta W}]}_{t}={{\rm{e}}}^{-\beta ({F}_{t}-{F}_{0})},\end{eqnarray} \tag{ 46 }$

where the equality represents the Jarzynski formula obtained from the definition of the work distribution (18). To find a nontrivial lower bound of ${\rm{\Sigma }}(t)$ , we use the improved Jensen inequality derived in [49]. Using the formula in appendix A, we obtain

$\begin{eqnarray}&&{{\rm{e}}}^{-\beta {[W]}_{t}}\leqslant \displaystyle \frac{{[{{\rm{e}}}^{-\beta W}]}_{t}}{1+\tfrac{{\beta }^{2}}{2}C(t){[{(W-{[W]}_{t})}^{2}]}_{t}},\end{eqnarray} \tag{ 47 }$

where C(t) is a positive function and is written as

$\begin{eqnarray}&&C(t)=\exp \left(-\displaystyle \frac{\beta }{3}\displaystyle \frac{{[{(W-{[W]}_{t})}^{3}]}_{t}}{{[{(W-{[W]}_{t})}^{2}]}_{t}}\right).\end{eqnarray} \tag{ 48 }$

This gives a tighter bound of ${\rm{\Sigma }}(t)$ than (46). We can write

$\begin{eqnarray}&&{\rm{\Sigma }}(t)\geqslant \mathrm{ln}\left(1+\displaystyle \frac{{\beta }^{2}}{2}C(t){[{(W-{[W]}_{t})}^{2}]}_{t}\right).\end{eqnarray} \tag{ 49 }$

The bound is written by the second variance of W. Using (19) and (20), we have

$\begin{eqnarray}&&{[{(W-{[W]}_{t})}^{2}]}_{t}\geqslant \displaystyle \sum _{n}{p}_{n}(0)\langle n(t)| {\hat{H}}_{1}^{2}(t)| n(t)\rangle .\end{eqnarray} \tag{ 50 }$

The form of the right-hand side appears in (43) and we can write

$\begin{eqnarray}{\displaystyle \int }_{0}^{t}{\rm{d}}t^{\prime} \,\sqrt{{{\rm{e}}}^{{\rm{\Sigma }}(t^{\prime} )}-1} & \geqslant & \sqrt{\displaystyle \frac{{\beta }^{2}}{2}{C}_{\min }}\,{\ell }(\hat{\rho }(0),\hat{\rho }(0\to t))\\ & \geqslant & \sqrt{\displaystyle \frac{{\beta }^{2}}{2}{C}_{\min }}\,{ \mathcal L }(\hat{\rho }(0),\hat{\rho }(0\to t)),\end{eqnarray} \tag{ 51 }$

where ${C}_{\min }$ represents the minimum value of C(t). This inequality represents a trade-off relation. The left-most hand side represents the time integration of a velocity and the right-most hand side represents a distance. We note that $\sqrt{{{\rm{e}}}^{{\rm{\Sigma }}(t)}-1}$ plays the role of velocity. This is a non-negative quantity and measures a degree of separation from the equilibrium state. For a given distance, a large-t is required for a near-equilibrium process, and a small-t for a far-from-equilibrium one. Thus, we have a trade-off relation between time, entropy and state distance.

We note that the coefficient C(t) is determined by the ratio of second- and third-order fluctuations of W. It highly depends on the system and is a nonuniversal quantity. On the other hand, the average of the work is bounded from below by the second fluctuation which is related to the universal geometric distance. We also note that this lower limit is negligible at the thermodynamic limit since the left-hand side of (49) depends linearly on the system size and the right-hand side logarithmically. The present result is important only for small systems where the fluctuation plays a significant role.

We can improve the bound by using the property that the entropy production is separated into two parts. It is possible to find a lower bound for each part, although the physical meaning is not evident. ${{\rm{\Sigma }}}_{0}(t)=D(\hat{\rho }(0\to t)| | {\hat{\rho }}_{0}(t))$ represents the divergence for systems without the counterdiabatic term and we obtain

$\begin{eqnarray}&&{{\rm{e}}}^{{{\rm{\Sigma }}}_{0}(t)}-1\geqslant \displaystyle \frac{{\beta }^{2}}{2}{C}_{0}(t){[{(W-{[W]}_{t}^{(0)})}^{2}]}_{t}^{(0)},\end{eqnarray} \tag{ 52 }$

where

$\begin{eqnarray}&&{C}_{0}(t)=\exp \left(-\displaystyle \frac{\beta }{3}\displaystyle \frac{{[{(W-{[W]}_{t}^{(0)})}^{3}]}_{t}^{(0)}}{{[{(W-{[W]}_{t}^{(0)})}^{2}]}_{t}^{(0)}}\right).\end{eqnarray} \tag{ 53 }$

The average denoted by ${[\ ]}_{t}^{(0)}$ is calculated from the distribution

$\begin{eqnarray}&&{P}^{(0)}(W,t)=\displaystyle \sum _{n}{p}_{n}(0)\langle n(t)| \delta (W-({\hat{H}}_{0}(t)-{\epsilon }_{n}(0)))| n(t)\rangle .\end{eqnarray} \tag{ 54 }$

This result is derived in the same way as ${\rm{\Sigma }}(t)$ from the improved Jensen inequality. We note that the relation ${[W]}_{t}={[W]}_{t}^{(0)}$ holds as we see from (19).

The bound of ${{\rm{\Sigma }}}_{1}(t)=D({\hat{\rho }}_{0}(t)| | \hat{\rho }(t))$ is calculated from the improved Gibbs–Bogoliubov inequality, which can be derived from the improved Jensen inequality. The detail is described in appendix B and we obtain

$\begin{eqnarray}&&{{\rm{e}}}^{{{\rm{\Sigma }}}_{1}(t)}-1\geqslant \displaystyle \frac{{\beta }^{2}}{2}{C}_{1}(t)\langle {\hat{H}}_{1}^{2}(t){\rangle }_{t}^{(0)},\end{eqnarray} \tag{ 55 }$

where

$\begin{eqnarray}&&{C}_{1}(t)=\exp \left(-\displaystyle \frac{\beta }{3}\displaystyle \frac{\langle {\hat{H}}_{1}^{3}(t){\rangle }_{t}^{(0)}-\tfrac{1}{2}\langle [{\hat{H}}_{1}(t),[{\hat{H}}_{1}(t),{\hat{H}}_{0}(t)]]{\rangle }_{t}^{(0)}}{\langle {\hat{H}}_{1}^{2}(t){\rangle }_{t}^{(0)}}\right),\end{eqnarray} \tag{ 56 }$

and the average is with respect to ${\hat{H}}_{0}(t)$ as

$\begin{eqnarray}&&\langle \cdots {\rangle }_{t}^{(0)}=\displaystyle \frac{1}{{Z}_{t}^{(0)}}\mathrm{Tr}(\cdots ){{\rm{e}}}^{-\beta {\hat{H}}_{0}(t)}.\end{eqnarray} \tag{ 57 }$

In this case, the second order fluctuation in the right-hand side of (55) is written as

$\begin{eqnarray}&&\langle {\hat{H}}_{1}^{2}(t){\rangle }_{t}^{(0)}=\displaystyle \sum _{n}{p}_{n}(t)\langle \dot{n}(t)| (1-| n(t)\rangle \langle n(t)| )| \dot{n}(t)\rangle ,\end{eqnarray} \tag{ 58 }$

where ${p}_{n}(t)={{\rm{e}}}^{-\beta {\epsilon }_{n}(t)}/{Z}_{t}^{(0)}$ . which is slightly different from the fluctuation in the right-hand side of (50). It is not clear whether this quantity is further bounded from below by a geometric distance.

From a practical point of view, the sum of the lower bounds for ${{\rm{\Sigma }}}_{0}(t)$ and ${{\rm{\Sigma }}}_{1}(t)$ is expected to be larger than the bound for ${\rm{\Sigma }}(t)$ and can be a good approximation of the entropy production. We study simple examples in the next section.

4.2. Examples

First, we consider the two-level system. As we mentioned in section 2, the Hamiltonian $\hat{H}(t)={\hat{H}}_{0}(t)+{\hat{H}}_{1}(t)$ is given by

$\begin{eqnarray}&&\hat{H}(t)=\displaystyle \frac{1}{2}h(t){\boldsymbol{n}}(t)\cdot \hat{{\boldsymbol{\sigma }}}+\displaystyle \frac{1}{2}({\boldsymbol{n}}(t)\times \dot{{\boldsymbol{n}}}(t))\cdot \hat{{\boldsymbol{\sigma }}},\end{eqnarray} \tag{ 59 }$

where h(t) is positive and ${\boldsymbol{n}}(t)$ is a unit vector. We note that $h(t){\boldsymbol{n}}(t)$ corresponds to ${{\boldsymbol{h}}}_{0}(t)$ in (12). In the present example, ${\rm{\Sigma }}(t)$ is calculated exactly:

$\begin{eqnarray}&&{{\rm{\Sigma }}}_{0}(t)=-\displaystyle \frac{\beta (h(t)-h(0))}{2}\tanh \left(\displaystyle \frac{\beta h(0)}{2}\right)+\mathrm{ln}\left[\displaystyle \frac{\cosh \left(\tfrac{\beta h(t)}{2}\right)}{\cosh \left(\tfrac{\beta h(0)}{2}\right)}\right],\end{eqnarray} \tag{ 60 }$

$\begin{eqnarray}&&{{\rm{\Sigma }}}_{1}(t)=\mathrm{ln}\left[\displaystyle \frac{\cosh \left(\tfrac{\beta \tilde{h}(t)}{2}\right)}{\cosh \left(\tfrac{\beta h(t)}{2}\right)}\right],\end{eqnarray} \tag{ 61 }$

where $\tilde{h}(t)=\sqrt{{h}^{2}(t)+{({\boldsymbol{n}}(t)\times \dot{{\boldsymbol{n}}}(t))}^{2}}$ . ${{\rm{\Sigma }}}_{0}(t)$ and ${{\rm{\Sigma }}}_{1}(t)$ are respectively bounded below in the form ${{\rm{\Sigma }}}_{0}(t)\geqslant \mathrm{ln}(1+{\delta }_{0}(t))$ and ${{\rm{\Sigma }}}_{1}(t)\geqslant \mathrm{ln}(1+{\delta }_{1}(t))$ . We note that the calculation of the bound for ${\rm{\Sigma }}(t)$ in (49) is cumbersome compared to those of ${{\rm{\Sigma }}}_{0}(t)$ and ${{\rm{\Sigma }}}_{1}(t)$ . This is because we need to calculate the fluctuations of the total Hamiltonian. Using the decomposition, we can calculate a bound more easily.

We consider an example

$\begin{eqnarray}&&h(t)=1+\displaystyle \frac{1}{2}\cos (2t),\end{eqnarray} \tag{ 62 }$

$\begin{eqnarray}&&{({\boldsymbol{n}}(t)\times \dot{{\boldsymbol{n}}}(t))}^{2}={\sin }^{2}t,\end{eqnarray} \tag{ 63 }$

with $\beta =2$ . The result is plotted in figure 2. We see that the lower bound gives a good approximation of ${\rm{\Sigma }}(t)$ in this example. We also plot ${{\rm{\Sigma }}}_{0}(t)$ in figure 3 and ${{\rm{\Sigma }}}_{1}(t)$ in figure 4 to compare the result with the different bound in (45). The result shows that our bound becomes better than the bound from (45) in some ranges of the parameters, and becomes worse in the other cases.

**Figure 2.** ${\rm{\Sigma }}(t)={{\rm{\Sigma }}}_{0}(t)+{{\rm{\Sigma }}}_{1}(t)$ for a two-level system. The result is periodic and is plotted for one period π.
Download figure:
Standard image High-resolution image

**Figure 3.** ${{\rm{\Sigma }}}_{0}(t)$ for a two-level system.
Download figure:
Standard image High-resolution image

**Figure 3.** ${{\rm{\Sigma }}}_{0}(t)$ for a two-level system.
Download figure:
Standard image High-resolution image

**Figure 4.** ${{\rm{\Sigma }}}_{1}(t)$ for a two-level system.
Download figure:
Standard image High-resolution image

**Figure 4.** ${{\rm{\Sigma }}}_{1}(t)$ for a two-level system.
Download figure:
Standard image High-resolution image

The second example is the harmonic oscillator where the Hamiltonian is written as

$\begin{eqnarray}&&\hat{H}(t)=\displaystyle \frac{1}{2m}{\hat{p}}^{2}+\displaystyle \frac{m{\omega }^{2}(t)}{2}{\hat{x}}^{2}-\displaystyle \frac{\dot{\omega }(t)}{4\omega (t)}(\hat{x}\hat{p}+\hat{p}\hat{x}),\end{eqnarray} \tag{ 64 }$

where the last term represent the counterdiabatic term [19]. ${{\rm{\Sigma }}}_{0}(t)$ and ${{\rm{\Sigma }}}_{1}(t)$ are respectively calculated as

$\begin{eqnarray}&&{{\rm{\Sigma }}}_{0}(t)=\displaystyle \frac{\beta (\omega (t)-\omega (0))}{2\tanh \left(\tfrac{\beta \omega (0)}{2}\right)}-\mathrm{ln}\left[\displaystyle \frac{\sinh \left(\tfrac{\beta \omega (t)}{2}\right)}{\sinh \left(\tfrac{\beta \omega (0)}{2}\right)}\right],\end{eqnarray} \tag{ 65 }$

$\begin{eqnarray}&&{{\rm{\Sigma }}}_{1}(t)=-\mathrm{ln}\left[\displaystyle \frac{\sinh \left(\tfrac{\beta {\rm{\Omega }}(t)}{2}\right)}{\sinh \left(\tfrac{\beta \omega (t)}{2}\right)}\right],\end{eqnarray} \tag{ 66 }$

where ${\rm{\Omega }}(t)=\sqrt{{\omega }^{2}(t)-{[\dot{\omega }(t)/(2\omega (t))]}^{2}}$ . We note that the condition ${\omega }^{2}(t)\geqslant | \dot{\omega }(t)| /2$ is required to make ${\rm{\Omega }}(t)$ real. We parametrize $\omega (t)$ as

$\begin{eqnarray}&&\omega (t)=1-\displaystyle \frac{1}{2}\cos t,\end{eqnarray} \tag{ 67 }$

and set $\beta =2$ . The result is plotted in figure 5. Again, we can obtain a good approximation of ${\rm{\Sigma }}(t)$ using the lower bound. We see that ${{\rm{\Sigma }}}_{1}(t)$ is negligible at $t\sim \pi$ . This is because $\dot{\omega }(t)$ takes a small value around the point.

5. Conclusions

In conclusion, we have discussed nonequilibrium properties of thermally isolated systems by using STA. The main conclusions in this paper are (i) STA is applicable to any dynamical systems, (ii) the entropy production is separated into two parts and the information-geometric interpretation is possible and (iii) the entropy production has a lower limit which is used to derive a trade-off relation.

We have stressed that the idea of STA is applicable to any nonequilibrium processes. The property that the Hamiltonian is separated into two parts is directly reflected to the nonequilibrium entropy production. The Pythagorean theorem opens up a novel perspective in studying the nonequilibrium systems. Separation of the Hamiltonian can be used not only to solve the dynamical problems but also to characterize the nonequilibrium properties. It will also be useful to find an efficient algorithm for a dynamical system.

The lower bound of the entropy production gives a new type of trade-off relation between time and a notion of distance to equilibrium. To derive the lower bound, we used the improved Jensen inequality. Although there is no physical meaning of this inequality, the lower limit is represented by work fluctuations and is related to the geometric distance of two states. It is interesting to note that the entropy plays a role of velocity. This can be understood intuitively since the entropy becomes small for a quasistatic process where a large time is required to change the state to a different one.

Although it is still a difficult problem to find the proper separation of a given general Hamiltonian, we can invent, for example, a new approximation method from an information-geometric point of view. Actually, the projection theorem is utilized to find an optimized solution in the problem of information processing. The present work is only the beginning for applications of the concept of the information geometry to nonequilibrium dynamics. We expect that we can find a new efficient algorithm based on a picture that we discussed in this paper.

Acknowledgments

The author is grateful to Ken Funo, Tomoyuki Obuchi and Keiji Saito for useful discussions and comments. This work was supported by JSPS KAKENHI Grant No. 26400385.

Appendix A.: Jensen inequality

For a convex function f of random variables X, the average satisfies the inequality

$\begin{eqnarray}&&\langle f(X)\rangle \geqslant f(\langle X\rangle ),\end{eqnarray} \tag{ A.1 }$

where $\langle \rangle$ denotes the average with respect to X. The standard Jensen inequality is obtained by setting $f(X)={{\rm{e}}}^{X}$ . To improve the inequality, Decoster used the convex function [49]

$\begin{eqnarray}&&{f}_{N}(X)={{\rm{e}}}^{X}-\left(1+X+\displaystyle \frac{{X}^{2}}{2!}\,+\cdots +\,\displaystyle \frac{{X}^{2N-1}}{(2N-1)!}\right),\end{eqnarray} \tag{ A.2 }$

where N is integer. The case N = 1 gives the standard inequality $\langle {{\rm{e}}}^{X}\rangle \geqslant {{\rm{e}}}^{\langle X\rangle }$ . Here we take N = 2 to improve the inequality. By using the replacement $X\to X-\langle X\rangle +\alpha$ where α is real, we have

$\begin{eqnarray}&&\langle {{\rm{e}}}^{X-\langle X\rangle }\rangle \geqslant 1+{{\rm{e}}}^{-\alpha }\left[\displaystyle \frac{\langle {(X-\langle X\rangle )}^{2}\rangle }{2!}+\displaystyle \frac{\langle {(X-\langle X\rangle )}^{3}\rangle +3\alpha \langle {(X-\langle X\rangle )}^{2}\rangle }{3!}\right].\end{eqnarray} \tag{ A.3 }$

To find the tightest inequality, we choose α so that the right-hand side of this equation is maximized. We find $\alpha =-\langle {(X-\langle X\rangle )}^{3}\rangle /(3\langle {(X-\langle X\rangle )}^{2}\rangle )$ and obtain

$\begin{eqnarray}&&\langle {{\rm{e}}}^{X-\langle X\rangle }\rangle \geqslant 1+\displaystyle \frac{\langle {(X-\langle X\rangle )}^{2}\rangle }{2}\exp \left(\displaystyle \frac{\langle {(X-\langle X\rangle )}^{3}\rangle }{3\langle {(X-\langle X\rangle )}^{2}\rangle }\right).\end{eqnarray} \tag{ A.4 }$

Appendix B.: Gibbs–Bogoliubov inequality

As an application of the Jensen inequality, we consider the free energy calculated from the partition function

$\begin{eqnarray}&&Z={\mathrm{Tre}}^{-\beta H}={{\rm{e}}}^{-\beta F}.\end{eqnarray} \tag{ B.1 }$

We write the Hamiltonian $H={H}_{0}+{H}_{1}$ . Then, using the standard Jensen inequality, we can obtain

$\begin{eqnarray}&&Z\geqslant {Z}_{0}{{\rm{e}}}^{-\beta \langle {H}_{1}{\rangle }_{0}}\end{eqnarray} \tag{ B.2 }$

where ${Z}_{0}={\mathrm{Tre}}^{-\beta {H}_{0}}={{\rm{e}}}^{-\beta {F}_{0}}$ and $\langle \cdots \rangle =\mathrm{Tr}(\cdots ){{\rm{e}}}^{-\beta {H}_{0}}/{Z}_{0}$ . Thus, we have

$\begin{eqnarray}&&F\leqslant {F}_{0}+\langle {H}_{1}{\rangle }_{0}.\end{eqnarray} \tag{ B.3 }$

This is the Gibbs–Bogoliubov inequality [50, 51]. This result holds for arbitrary separations of H. We also note that the formula holds even when ${\hat{H}}_{0}$ and ${\hat{H}}_{1}$ do not commute with each other. When they do not commute, we use the Peierls inequality [52]

$\begin{eqnarray}&&\displaystyle \sum _{n}\langle n| {{\rm{e}}}^{\hat{X}}| n\rangle \geqslant \displaystyle \sum _{n}{{\rm{e}}}^{\langle n| \hat{X}| n\rangle },\end{eqnarray} \tag{ B.4 }$

where $\hat{X}$ is a hermitian operator and $\{| n\rangle \}$ represents a complete basis.

The noncommutativity of the operators becomes important when we consider the improved inequality. The improved Gibbs–Bogoliubov inequality corresponding to the improved Jensen inequality in (A.4) is calculated in a similar way and we obtain

$\begin{eqnarray}F & \leqslant & {F}_{0}+\langle {H}_{1}{\rangle }_{0}-\displaystyle \frac{1}{\beta }\mathrm{ln}\left[1+\displaystyle \frac{{\beta }^{2}}{2}\langle {({H}_{1}-\langle {H}_{1}{\rangle }_{0})}^{2}{\rangle }_{0}\right.\\ & & \left.\times \exp \left(-\displaystyle \frac{\beta }{3}\displaystyle \frac{\langle {({H}_{1}-\langle {H}_{1}{\rangle }_{0})}^{3}{\rangle }_{0}-\tfrac{1}{2}\langle [{\hat{H}}_{1},[{\hat{H}}_{1},{\hat{H}}_{0}]]{\rangle }_{0}}{\langle {({H}_{1}-\langle {H}_{1}{\rangle }_{0})}^{2}{\rangle }_{0}}\right)\right].\end{eqnarray} \tag{ B.5 }$

Since the third term of the right-hand side is negative, this inequality becomes an improvement of the standard inequality (B.3).

Shortcuts to adiabaticity applied to nonequilibrium entropy production: an information geometry viewpoint

Article metrics

Author e-mails

Author affiliations

Dates

Abstract

1. Introduction

2. STA for general dynamical systems

2.1. General formula

2.2. Quantum quench

2.3. Example

3. Nonequilibrium entropy production

3.1. Entropy production and Pythagorean theorem

3.2. Information-geometric interpretation

4. Lower bounds of entropy production

4.1. Improved Jensen inequality

4.2. Examples

5. Conclusions

Acknowledgments

Appendix A.: Jensen inequality

Appendix B.: Gibbs–Bogoliubov inequality

Shortcuts to adiabaticity applied to nonequilibrium entropy production: an information geometry viewpoint

Article metrics

Share this article

Author e-mails

Author affiliations

Dates

Abstract

1. Introduction

2. STA for general dynamical systems

2.1. General formula

2.2. Quantum quench

2.3. Example

3. Nonequilibrium entropy production

3.1. Entropy production and Pythagorean theorem

3.2. Information-geometric interpretation

4. Lower bounds of entropy production

4.1. Improved Jensen inequality

4.2. Examples

5. Conclusions

Acknowledgments

Appendix A.: Jensen inequality

Appendix B.: Gibbs–Bogoliubov inequality