Partition function loop series for a general graphical model: free-energy corrections and message-passing equations

Jing-Qing Xiao and Haijun Zhou

Published 28 September 2011 © 2011 IOP Publishing Ltd
Citation: Jing-Qing Xiao and Haijun Zhou 2011 J. Phys. A: Math. Theor. 44 425001, DOI 10.1088/1751-8113/44/42/425001

Abstract

A loop series expansion for the partition function of a general statistical model on a graph is carried out. If the auxiliary probability distributions of the expansion are chosen to be a fixed point of the belief-propagation equation, the first term of the loop series gives the Bethe–Peierls (BP) free-energy functional at the replica-symmetric level of the mean-field spin-glass theory, and corrections are contributed only by subgraphs that are free of dangling edges. This result generalizes the early work of Chertkov and Chernyak on binary statistical models. If the belief-propagation equation has multiple fixed points, a loop series expansion is performed for the grand partition function. The first term of this series gives the BP free-energy functional at the first-step replica-symmetry-breaking (RSB) level of the mean-field spin-glass theory, and corrections again come only from subgraphs that are free of dangling edges, provided that the auxiliary probability distributions of the expansion are chosen to be a fixed point of the survey-propagation equation. The same loop series expansion can be performed for higher-level partition functions, obtaining the higher-level RSB BP free-energy functionals (and the correction terms) and message-passing equations without using the BP approximation.

1. Introduction

Graph expansion methods for statistical models have been discussed extensively in the literature. They have been very helpful in studying the high-temperature behaviors and the phase transition properties of discrete models such as the Ising model of ferromagnetism. Some of the early efforts were carried out by Brout and others [1–3]. More recently, Georges and Yedidia [4] found that the high-temperature expansion of the Ising spin-glass free energy can also be carried out under the constraints of fixed mean spin values. This later work was extended by Sessak and Monasson [5] to include pair correlations of spin variables as another set of expansion constraints. The constrained graph expansion method was applied to the inverse Ising problem [5], with the aim of inferring the microscopic interactions of an Ising system based on the observed mean spin values and spin-pair correlations.

For finite-connectivity binary statistical models, Chertkov and Chernyak [6, 7] showed that the partition function can be expressed as a sum of contributions from subgraphs. The first term of this expansion is identical to the partition function obtained using the Bethe–Peierls (BP) approximation. Loop corrections to the BP approximation were also calculated by Montanari, Rizzo and collaborators [8, 9] and by Parisi and Slanina [10]. The derivation of the partition function expansion by Chertkov and Chernyak relied on a special property of Ising spin variables (see equation (17) of [7]). This special property is not valid for more general statistical models, whose microscopic variables are not necessarily binary or discrete. Whether the conclusion of [6, 7] holds for general statistical models is still an open issue (for models whose discrete variables take Q > 2 values, a complicated loop-tower expansion was presented in [11]).

In the present contribution, we first extend the results of [6, 7, 11] by carrying out a very simple derivation of the partition function loop series for a general statistical model defined on a graph. We do not make any assumptions on the nature of the microscopic state variable of each edge (or vertex) of the graph. This state variable can be discrete or real valued, or be a vector, or even be a function itself. We show that the first term of this expansion is also identical to the BP partition function, and corrections to the BP partition function come only from looped subgraphs without any dangling edges. The auxiliary probability distributions of this loop series expansion are chosen to be a fixed point of the belief-propagation equation. With this particular choice, every subgraph with at least one dangling edge contributes zero to the correction terms.

As the second main result, we present the loop series expressions for the grand partition function and higher-level partition functions. The belief-propagation equation of a statistical model may have multiple fixed points, each of which is referred to as a macroscopic state of the configuration space. If this happens, we define a grand partition function at the level of macroscopic states and perform a loop series expansion for this grand partition function. When the auxiliary probability distributions of this expansion are chosen to be a fixed point of the survey-propagation equation (first derived through the first-step replica-symmetry-breaking (1RSB) mean-field theory of spin glasses [12]), the first term of this loop expansion gives the BP free-energy functional at the level of macroscopic states. Corrections again come only from subgraphs that are free of dangling edges. In case the survey-propagation equation has multiple fixed points, the same loop series expansion can be performed for higher-level partition functions. As a result, we obtain the higher-level BP free-energy functionals and the correction contributions, and the associated message-passing equations.

This work is a mathematical approach to the theory of spin glasses from the framework of partition function loop expansion. It is clear that at each replica-symmetry-breaking (RSB) level of the mean-field theory [12], the corrections to the free energy due to looped nontrivial subgraphs are neglected. This neglected correction contribution is explicitly expressed as a logarithm over a finite series in this paper. At a given level of macroscopic states, we anticipate that the magnitude of the total loop correction contribution to the free energy will be sub-linear in N (with N being the total number of vertices) if there is only one fixed point for the corresponding message-passing equation, but it will be linear in N if there exist multiple fixed points. This statement needs to be checked by numerical calculations on single graphical systems.

Section 2 introduces the general statistical model. We work out the loop series of the partition function in section 3 and derive the belief-propagation equation. In section 4, we extend the discussion to the case that the belief-propagation equation has multiple fixed points and perform a loop series expansion for the grand partition function. The 1RSB survey-propagation equation is derived here. We conclude this work in section 5 and discuss some possible extensions. The appendix contains graph expansion results for a one-dimensional ring.

2. General statistical models on graphs

We consider a graph G composed of N vertices (i = 1, 2, ..., N) and M edges. An edge (i, j) between two vertices i and j has a state variable xij. This variable may be a binary spin for some systems, xij = ±1. For other systems, xij may be real valued or be a vector, or be even more complicated. In this paper, we make no assumptions on the nature of the microscopic state variable xij of each edge (i, j). Each vertex i has an energy $E_i(x_{i \partial i})$, where $x_{i \partial i}\equiv \lbrace x_{i j_1}, x_{i j_2}, \ldots , x_{i j_k}\rbrace$, with j1, j2, ..., jk being the k other vertices with which i forms an edge. The number k of nearest neighbors of a vertex might be different for different vertices. The vertex energy is a function of the state variables of its connected edges. Note that xij and xji both denote the state of edge (i, j); therefore $x_{i j} \equiv x_{j i}$. The total vertex energy for an edge configuration {xij} is

$$E(\lbrace x_{i j}\rbrace ) = \sum_{i=1}^{N} E_i(x_{i \partial i}) . \qquad (1)$$

The partition function of the system is defined as

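$$Z(\beta ) = \int \prod_{(i, j)} {\rm d} x_{i j}\, \rho_0(x_{i j}) \exp \Bigl [-\beta \sum_{i=1}^{N} E_i(x_{i \partial i})\Bigr ] . \qquad (2)$$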

In the above equation, β is the inverse temperature, ρ0(xij) is the probability of microscopic state xij for an isolated edge (i, j) and ∏(i, j) means the product over all the edges of graph G. For simplicity, we assume that the a priori probabilities ρ0(xij) are identical for all edges. This assumption is of course nonessential.
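As a concrete illustration of definition (2), the following minimal Python sketch evaluates the partition function by brute-force enumeration for a small discrete model. The function name `partition_function`, the triangle graph, the binary edge states and the vertex energy are hypothetical choices made only for this example; the sum over discrete states stands in for the integral over xij.

```python
import itertools
import math

def partition_function(edges, vertex_energy, rho0, beta):
    """Sum over edge configurations of prod rho0(x_ij) * exp(-beta * sum_i E_i)."""
    vertices = sorted({v for e in edges for v in e})
    # indices of the edges attached to each vertex
    incident = {v: [k for k, e in enumerate(edges) if v in e] for v in vertices}
    z = 0.0
    for xs in itertools.product(rho0, repeat=len(edges)):
        prior = math.prod(rho0[x] for x in xs)
        energy = sum(vertex_energy(v, [xs[k] for k in incident[v]]) for v in vertices)
        z += prior * math.exp(-beta * energy)
    return z

# Hypothetical example: binary edge spins on a triangle; every vertex has two
# attached edges and favours equal states on them.
edges = [(0, 1), (1, 2), (2, 0)]
rho0 = {+1: 0.5, -1: 0.5}          # a priori probability of each edge state

def vertex_energy(v, xs):
    return -xs[0] * xs[1]

print(partition_function(edges, vertex_energy, rho0, beta=1.0))
```

Because ρ0 is normalized, the sketch returns Z = 1 at β = 0, which is a quick sanity check of the normalization convention.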

The partition function (2) also applies to graphical models whose microscopic states are defined on vertices rather than on edges [6, 7]. For example, consider a graph G with the property that its vertices can be divided into two subsets, denoted by {i} (the variable nodes) and {a} (the check nodes), such that all the edges of G are between a variable node i and a check node a. For each variable node i, we assume that

Equation (3)

where ∂i denotes the set of nearest-neighboring check nodes of i. Equation (2) then simplifies to

Equation (4)

where ∂a denotes the set of nearest-neighboring variable nodes of a. Equation (4) is the partition function of a system defined on a factor graph, with each variable node i having a microscopic state xi and each check node a having an energy Ea. The check energy Ea depends on the microscopic states $x_{\partial a}$ of the variable nodes in ∂a.

In some graphical models, the state xij of each edge (i, j) is a collection of two microscopic states $y^{i}_{i j}$ and $y^{j}_{i j}$, $x_{i j} \equiv \lbrace y^{i}_{i j}, y^{j}_{i j}\rbrace$. We assume that the a priori probability distribution of the edge state xij equals $\rho_0(y^{i}_{i j}) \rho_0(y^{j}_{i j})$, and that the energy Ei of a vertex i can be expressed as

Equation (5)

Under these assumptions, the partition function (2) becomes

Equation (6)

which describes a graphical model whose vertex energy $\tilde{E}_i$ depends on the microscopic state yi of vertex i and the microscopic states $y_{\partial i}$ of its nearest neighbors. An example of such statistical models is the palette-coloring problem [13–15].

3. Graph expansion for the general statistical model

To find a loop series expression for the partition function (2), we introduce for each edge (i, j) two auxiliary probability distributions $q_{j\rightarrow i}(x_{i j})$ and $q_{i\rightarrow j}(x_{j i})$, and rewrite Z(β) as

Equation (7)

Equation (8)

In (8), C(i, j) is an edge constant with the value

Equation (9)

and Δ(i, j)(xij, xji) is expressed as

Equation (10)

From (8) we know that the partition function can be expressed as the sum of contributions from all the possible non-empty subgraphs of G:

Equation (11)

In the above equation, ZBP is calculated by

Equation (12)

A non-empty subgraph g of graph G contains a partial set of the edges of G and all the vertices that are attached to these edges. The correction Lg is expressed as

Equation (13)

Consider a subgraph g which has a vertex i that is linked to the other parts of g only through a single edge (i, j). The neighborhood of such a leaf vertex i is shown schematically in figure 1. We find that after integrating over the variable xij, the correction Lg is expressed as

Equation (14)

where $\hat{q}_{i\rightarrow j}(x_{i j})$ is determined by the set of probability functions $q_{\partial i\backslash j} \equiv \lbrace q_{k\rightarrow i}, k \in \partial i\backslash j\rbrace$ through

Equation (15)

The function $B_{i\rightarrow j}(q_{\partial i\backslash j})$ as defined by (15) is called the belief-propagation equation. It takes as input a set of probability distributions $q_{k\rightarrow i}(x_{i k})$ (k ∈ ∂i\j) and outputs a new probability distribution $\hat{q}_{i\rightarrow j}(x_{i j})$.

Figure 1. The neighborhood of a vertex i. A solid line between any two vertices i and j means that the edge (i, j) is present both in the graph G and in the subgraph g. A dashed line between two vertices i and k means that the edge (i, k) is present in G but not in g. A solid circle denotes a vertex that belongs to the subgraph g, and a dashed circle denotes a vertex that does not belong to g. A solid circle is attached to at least one solid edge. Vertex i is connected to only one solid edge; it is a leaf vertex of subgraph g, and the edge (i, j) is a dangling edge.

Since we are free to choose the auxiliary probability functions $\lbrace q_{i\rightarrow j}(x_{j i})\rbrace$, we can choose this set of auxiliary functions to be a fixed point of the belief-propagation equation (15). In other words, we require the auxiliary probability functions to satisfy

$$q_{i\rightarrow j} = B_{i\rightarrow j}(q_{\partial i\backslash j}) . \qquad (16)$$

Then for each edge (i, j) we have $\hat{q}_{i\rightarrow j}(x_{i j}) = {q}_{i\rightarrow j}(x_{i j})$, and the expression inside the curly brackets of (14) is identically zero. Under this special choice, only those subgraphs of graph G with each vertex i having at least two attached edges have non-zero contributions to the correction of the partition function. The total free energy F(β) is then expressed as

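$$F(\beta ) \equiv -\frac{1}{\beta } \ln Z(\beta ) = F_{BP}(\beta ) - \frac{1}{\beta } \ln \Bigl [ 1 + \sum_{g^{\prime }} L_{g^{\prime }} \Bigr ] , \qquad (17)$$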

where g' denotes a looped subgraph that contains no dangling edges. The free energy FBP(β) corresponds to the partition function ZBP and is expressed as

$$F_{BP}(\beta ) = \sum_{i=1}^{N} f_i - \sum_{(i, j)} f_{(i, j)} , \qquad (18)$$

with

Equation (19)

Equation (20)

We emphasize that FBP(β) is identical in form to the mean-field free energy as obtained by the replica-symmetric (RS) spin-glass theory [12]. Expression (18) was first derived in the mean-field theory by using the BP approximation. The free energy FBP can also be viewed as a functional of the 2M probability distributions {pij(xij)} on the M edges (i, j) of graph G. In this paper, we refer to FBP as the BP free-energy functional. It is easy to check that the variation of FBP with respect to any probability distribution pij(xij) is zero at a fixed point of (16). Equation (18) is expressed as the difference between the total vertex contribution ($\sum_i f_i$) and the total edge contribution ($\sum_{(i, j)} f_{(i, j)}$). An intuitive understanding of this is as follows: each edge participates in two vertex interactions and its effect is counted twice when calculating the total vertex contribution; this over-counting should be subtracted from the total vertex contribution.

From the viewpoint of the partition function loop expansion, the belief-propagation fixed-point condition (16) is a requirement for ensuring that all the corrections Lg from subgraphs g with dangling edges are identically zero. For a loopy subgraph g without dangling edges, its correction contribution Lg is obtained through (13). The correction to the total free energy is expressed as the logarithm of the sum of all these loop correction contributions (see (17)). In the appendix, we report the free-energy correction contribution of a one-dimensional ring of N edges. The correction is found to be positive when this ring is energetically frustrated. For more complicated model systems, the sign of the free-energy correction contribution is still an open issue.
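The statements above can be checked numerically on a very small graph. The Python sketch below is only an illustration under stated assumptions: equations (7)–(15) are not reproduced in this text, so the edge constant, the function Δ and the belief-propagation map are written in one self-consistent parametrization, namely ρ0(x)δ(x, x') = C q_{j→i}(x) q_{i→j}(x') [1 + Δ(x, x')] with 1/C = Σ_x q_{i→j}(x) q_{j→i}(x)/ρ0(x), which may differ in detail from the paper's own. The graph (a triangle with one pendant edge), the Potts-like vertex energy, the fields and all function names are hypothetical.

```python
import itertools
import math

Q, BETA, J = 3, 0.5, 1.0
EDGES = [(0, 1), (1, 2), (2, 0), (2, 3)]   # hypothetical graph: triangle plus a pendant edge
FIELD = {0: 0.3, 1: -0.2, 2: 0.5, 3: 0.4}  # hypothetical vertex fields
RHO0 = [1.0 / Q] * Q                       # uniform prior over the Q edge states
NBRS = {}
for a, b in EDGES:
    NBRS.setdefault(a, []).append(b)
    NBRS.setdefault(b, []).append(a)

def psi(i, xs):
    """exp(-beta * E_i); xs holds the states of the edges (i, j), j in NBRS[i]."""
    pairs = sum(x == y for x, y in itertools.combinations(xs, 2))
    energy = -J * pairs - FIELD[i] * sum(x == 0 for x in xs)
    return math.exp(-BETA * energy)

def exact_z():
    """Brute-force partition function with one state per edge."""
    total = 0.0
    for xs in itertools.product(range(Q), repeat=len(EDGES)):
        state = {}
        for (a, b), x in zip(EDGES, xs):
            state[a, b] = state[b, a] = x
        w = math.prod(RHO0[x] for x in xs)
        for i in NBRS:
            w *= psi(i, [state[i, j] for j in NBRS[i]])
        total += w
    return total

def bp_update(i, j, q):
    """Assumed BP map: q_{i->j}(x) ~ rho0(x) * sum_others psi_i * prod_{k != j} q_{k->i}."""
    others = [k for k in NBRS[i] if k != j]
    new = [0.0] * Q
    for x in range(Q):
        for ys in itertools.product(range(Q), repeat=len(others)):
            st = dict(zip(others, ys))
            st[j] = x
            w = RHO0[x] * psi(i, [st[k] for k in NBRS[i]])
            for k, y in zip(others, ys):
                w *= q[k, i][y]
            new[x] += w
    norm = sum(new)
    return [v / norm for v in new]

def run_bp(sweeps=200):
    """Iterate the BP map from uniform messages; returns {(i, j): q_{i->j}}."""
    q = {(i, j): [1.0 / Q] * Q for i in NBRS for j in NBRS[i]}
    for _ in range(sweeps):
        for i, j in list(q):
            q[i, j] = bp_update(i, j, q)
    return q

def loop_series(q):
    """Return Z_BP and the normalized correction L_g of every non-empty subgraph g."""
    dirs = [(i, j) for i in NBRS for j in NBRS[i]]   # x[i, j]: copy of edge (i, j) held by vertex i
    c = {e: 1.0 / sum(q[e][x] * q[e[::-1]][x] / RHO0[x] for x in range(Q)) for e in EDGES}
    corr = {g: 0.0 for r in range(len(EDGES) + 1) for g in itertools.combinations(EDGES, r)}
    for xs in itertools.product(range(Q), repeat=len(dirs)):
        x = dict(zip(dirs, xs))
        w = 1.0
        for i in NBRS:
            w *= psi(i, [x[i, j] for j in NBRS[i]])
            for j in NBRS[i]:
                w *= q[j, i][x[i, j]]
        delta = {}
        for i, j in EDGES:
            same = RHO0[x[i, j]] if x[i, j] == x[j, i] else 0.0
            delta[i, j] = same / (c[i, j] * q[j, i][x[i, j]] * q[i, j][x[j, i]]) - 1.0
        for g in corr:
            corr[g] += w * math.prod(delta[e] for e in g)
    z0 = corr[()]                                   # empty subgraph: product of vertex terms
    z_bp = z0 * math.prod(c.values())
    return z_bp, {g: v / z0 for g, v in corr.items() if g}

z_exact = exact_z()
z_bp, L = loop_series(run_bp())
triangle = ((0, 1), (1, 2), (2, 0))
print(z_exact, z_bp * (1.0 + L[triangle]))                 # the two values should agree
print(max(abs(v) for g, v in L.items() if g != triangle))  # dangling-edge subgraphs: ~0
```

In this example the triangle is the only subgraph without dangling edges, so the single correction L for the triangle should account for the full difference between the exact partition function and ZBP. With auxiliary distributions that are not a fixed point of the belief-propagation map, the sum over all 15 subgraphs still reproduces Z in this parametrization, but the dangling-edge terms no longer vanish individually.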

For a discrete model whose edge states can take Q > 2 different values, Chernyak and Chertkov [11] derived a loop-tower expansion for the partition function by exploiting the gauge symmetry of the microscopic states. The belief-propagation equation derived by their approach does not fix the gauge freedom completely, and therefore high-order gauge fixing was introduced, making the loop-tower expansion scheme very complicated. Gauge fixing is not needed in the present loop series expansion scheme. In the light of this work, it might be possible to simplify the scheme of [11] and obtain an alternative derivation of the free-energy expression (17). We are currently working on this mathematical issue.

4. Graph expansion for the grand partition function

For the general statistical model defined by the partition function (2), the belief-propagation equation (16) may have multiple fixed points. If this happens, then the BP free energy FBP as a functional of {pij(xij)} has more than one minimal value. In the following, we will refer to a fixed-point solution {pij(xij)} of (16) with a minimal value of FBP as a macroscopic state of the configuration space. Each macroscopic state has a corresponding BP free-energy value FBP. To account for the existence of multiple macroscopic states, in analogy with (2), we define a grand partition function Ξ as

Equation (21)

In the above equation, $\int {\rm D} q_{i\rightarrow j}$ means summing over all different possibilities of the distribution $q_{i\rightarrow j}$, and the Dirac delta functions $\delta \bigl (q_{i\rightarrow j} - B_{i\rightarrow j}(q_{\partial i\backslash j})\bigr )$ ensure that only fixed-point solutions of the belief-propagation equation (16) contribute to Ξ. The parameter y is an introduced inverse temperature at the level of macroscopic states.

In analogy with (7), we can rewrite (21) as

Equation (22)

In the above equation, $P_{i\rightarrow j}(q_{i\rightarrow j})$ is an introduced auxiliary probability distribution function for the probability distribution $q_{i\rightarrow j}(x_{j i})$, fi is the free-energy contribution of vertex i as expressed by (19), $C^{(1)}_{(i, j)}$ is an edge constant,

Equation (23)

with f(i, j) being the free-energy contribution of an edge (i, j) as given by (20), and $\Delta^{(1)}_{(i, j)}$ is expressed as

Equation (24)

The grand partition function can therefore be expanded as

Equation (25)

where ΞSP is expressed as

Equation (26)

and the correction term $L^{(1)}_{g}$ of a subgraph g is expressed as

Equation (27)

Consider a subgraph g which has a leaf vertex i and a dangling edge (i, j). After integrating over the probabilities around vertex i, the correction contribution of this subgraph can be expressed as

Equation (28)

where the probability distribution $\hat{P}_{i\rightarrow j}(q_{i\rightarrow j})$ is calculated by

Equation (29)

with

Equation (30)

In accordance with the spin-glass literature, we refer to (29) as the survey-propagation equation.

The expression inside the curly brackets of (28) is identically zero if $\hat{P}_{i\rightarrow j}(q_{i\rightarrow j}) = P_{i\rightarrow j}(q_{i\rightarrow j})$. Since the auxiliary probability distributions $\lbrace P_{i\rightarrow j}(q_{i\rightarrow j})\rbrace$ are free to choose, we can choose them appropriately to ensure that the correction contribution $L^{(1)}_{g} = 0$ for any subgraph g with at least one dangling edge. In other words, $\lbrace P_{i\rightarrow j}(q_{i\rightarrow j})\rbrace$ should be a fixed-point solution of the survey-propagation equation:

Equation (31)

This equation was first derived in [12] under physical considerations (the BP approximation was again used).

At a fixed point of (31), the expression of the total grand free energy is

Equation (32)

where g' again denotes a looped subgraph that contains no dangling edges. From the framework of the partition function loop expansion, (31) is a requirement to ensure that subgraphs with dangling edges do not have correction contributions to the grand free energy.

In (32), the grand free energy GSP(y; β) is expressed as

Equation (33)

with

Equation (34)

Equation (35)

being, respectively, the contribution to the grand free energy from a vertex i and an edge (i, j). GSP(y; β) is identical in form to the 1RSB free energy of the mean-field spin-glass theory [12], which was derived previously by applying the BP approximation. GSP can also be regarded as a functional of the 2M probabilities $\lbrace P_{i\rightarrow j}(q_{i\rightarrow j})\rbrace$, and its variation with respect to any $P_{i\rightarrow j}(q_{i\rightarrow j})$ is zero at a fixed point of the survey-propagation equation. We refer to GSP as the survey-propagation free-energy functional (it is the BP free-energy functional at the 1RSB mean-field level).

We conclude this section with a discussion on the reweighting parameter y of (21). Denoting a fixed-point solution of the belief-propagation equation (a macroscopic state) as α and its associated BP free energy as $F^{(\alpha )}_{BP}$, the grand partition function Ξ can be rewritten as

Equation (36)

where $\exp \Bigl (N \Sigma (f)\Bigr )$ is the total number of macroscopic states with a given BP free energy FBP = Nf (the quantity f is called the free-energy density). The function Σ(f) is called the complexity in the spin-glass literature (it is the entropy density at the level of macroscopic states). For N ≫ 1, the integration in (36) is dominated by the value of $f=\overline{f}$ which satisfies $\frac{{\rm d}\Sigma (f)}{{\rm d} f}|_{f=\overline{f}}=y$. The value $\overline{f}$ is the mean BP free-energy density at a given value of y. If we neglect the loop correction to the grand free energy in (32), then

Equation (37)

where $\overline{f}_{i}$ and $\overline{f}_{(i,j)}$ are, respectively, the mean free-energy contribution of a vertex i and an edge (i, j), with the expression

Equation (38)

Equation (39)

The value of the complexity Σ is expressed as

Equation (40)

The smallest mean free-energy density $\overline{f}$ corresponds to the value of y at which the complexity vanishes, Σ = 0. Another special value of y is y = β. If the complexity calculated at y = β is positive, the corresponding mean free-energy density value $\overline{f}$ is the typical value of BP free-energy densities of the macroscopic states sampled at the inverse temperature β [12].
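The role of the reweighting parameter y can be illustrated with a toy calculation. The Python sketch below assumes that the integrand of (36) has the saddle-point form exp[N(Σ(f) − y f)], which is consistent with the stated condition dΣ/df = y but is not copied from the paper, and it uses a purely hypothetical parabolic complexity function; the constants `F0`, `S0`, `A` and the helper names are invented for this example.

```python
import math

# Hypothetical concave complexity: Sigma(f) = S0 - (f - F0)^2 / (2 A)
F0, S0, A = -1.0, 0.02, 0.5

def sigma(f):
    return S0 - (f - F0) ** 2 / (2 * A)

def dominant_f(y, lo=-2.0, hi=0.0, n_grid=200001):
    """Grid search for the f that maximizes Sigma(f) - y * f (dominant term for N >> 1)."""
    best_f, best_val = lo, -math.inf
    for k in range(n_grid):
        f = lo + (hi - lo) * k / (n_grid - 1)
        val = sigma(f) - y * f
        if val > best_val:
            best_f, best_val = f, val
    return best_f

y = 0.1
f_bar = dominant_f(y)               # for this parabola, analytically F0 - A * y
y_star = math.sqrt(2 * S0 / A)      # value of y at which Sigma(f_bar) reaches zero
f_min = dominant_f(y_star)          # smallest mean free-energy density of the toy model
print(f_bar, sigma(f_bar), y_star, f_min, sigma(f_min))
```

For this toy Σ(f), increasing y shifts the dominant free-energy density downward until the complexity reaches zero at `y_star`; beyond that point no macroscopic states are left to select.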

5. Conclusion and discussion

The main results of this paper are the free-energy expression (17) and the grand free-energy expression (32), and the corresponding belief-propagation equation (16) and survey-propagation equation (31). From the viewpoint of the partition function loop expansion, the belief-propagation and survey-propagation equations are, respectively, conditions needed to ensure that subgraphs with dangling edges have zero correction contributions to the free energy and the grand free energy.

This work helps to place the mean-field RSB theory of spin glasses on a firmer mathematical ground. There are many unsolved problems ahead, for example, the relationship between the free-energy functionals FBP and GSP and the free-energy landscape of the system, the link between the defined grand partition function Ξ and the original partition function Z, the issue of sampling microscopic configurations {xij} giving a fixed-point solution $\lbrace q_{j\rightarrow i}(x_{i j})\rbrace$ of the belief-propagation equation and so on.

The discussion in section 4 can be readily extended to the case when the survey-propagation equation (31) has multiple fixed-point solutions. As a result, the mean-field second-step RSB free energy and its loop correction expression will be derived, as well as the corresponding message-passing equation. This expansion hierarchy can be continued to produce the mean-field results and the corresponding loop correction expressions and message-passing equations at even higher levels of replica-symmetry breaking.

For statistical models defined on a factor graph with partition functions expressed in the form of (4), the method of this paper can also be directly applied without the need of first turning the partition function into the form of (2).

This paper also points to some other important open issues. One question is as follows: how to express the mean value of a local observable in terms of a finite loop series? Examples of local observables are the state variable xij on an edge (i, j) of the system, the correlation between the two edge variables xij and xkl, the energy of a single interaction and so on. Loop series expressions for these local observables should be very useful in improving the predictions of the mean-field cavity theory. For a graphical model with many short loops, it is desirable to represent the system as a collection of many basic clusters in the framework of Kikuchi's cluster variation method (for a review, see [16] and [17]). These basic clusters are connected to each other by joint clusters [18]. The joint clusters can be regarded as generalized edges. The present partition function loop expansion method is probably also applicable to these more complex graphical systems.

Acknowledgments

This work was finished while the authors were participating in the 'Interdisciplinary Applications of Statistical Physics and Complex Networks' program of KITPC (March, 2011). JQX thanks Professor Xiang-Mao Ding for encouragement and support. This work was partially supported by the NSFC grant 10834014, the 973-Program grant 2007CB935903 and the PKIP grant KJCX2.YW.W10 of CAS.

Appendix. Discrete models on a one-dimensional ring

As a simple application of the graph expansion method discussed in the main text, we calculate the loop correction contribution for a model defined on a one-dimensional ring with N vertices and N edges. We assume that the edge state $x_{i, i+1}$ between two vertices i and (i + 1) can take Q different discrete values, $x_{i, i+1} \in \lbrace 1, 2, \ldots , Q\rbrace$. The energy of the ring is

Equation (A.1)

where J is a coupling constant and $\delta_{x}^{y}$ is the Kronecker symbol. The prior distribution $\rho_0(x_{i, i+1})$ is uniform over the Q states.

For this simple system, the fixed point of the belief-propagation equation (16) is $q_{i\rightarrow (i+1)}(x) = 1/Q$ and $q_{(i+1)\rightarrow i}(x) = 1/Q$ for x ∈ {1, 2, ..., Q}. Then we have $\Delta_{(i, i+1)}(x_{i, i+1}, x_{i+1, i}) = Q \delta_{x_{i, i+1}}^{x_{i+1, i}} - 1$, and by a straightforward summation along the ring, the loop correction expression (13) is simplified as

Equation (A.2)

We note that for N being even, Lg ⩾ 0 and therefore the loop correction to the free energy (see (17)) is negative (the BP free energy FBP is higher than the true free energy). However, if N is odd and J < 0, then Lg < 0 and the loop correction to the free energy becomes positive (FBP is lower than the true free energy). This different behavior is related to the fact that the one-dimensional ring with an odd number of interactions is frustrated when J < 0. This simple example also shows that the loop correction Lg decays exponentially with loop length.
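The sign behavior described above can be reproduced with a short numerical check. Equations (A.1) and (A.2) are not reproduced in this text, so the Python sketch below assumes that the ring energy is E = −J Σ_i δ(x_{i−1,i}, x_{i,i+1}) and uses the closed form L = (Q − 1)[(e^{βJ} − 1)/(e^{βJ} + Q − 1)]^N, obtained here from a transfer-matrix argument. Both expressions are assumptions that are consistent with the statements of this appendix rather than quotations from it, and the function names are hypothetical.

```python
import itertools
import math

def ring_exact(n, q, beta, j):
    """Brute-force Z for the assumed energy E = -J * sum_i delta(x_{i-1,i}, x_{i,i+1})."""
    z = 0.0
    for xs in itertools.product(range(q), repeat=n):
        # number of vertices whose two attached edges are in the same state
        matches = sum(xs[k] == xs[(k + 1) % n] for k in range(n))
        z += math.exp(beta * j * matches) / q ** n   # uniform prior 1/Q on each edge
    return z

def ring_loop_series(n, q, beta, j):
    """Z_BP at the uniform fixed point and the single loop correction L of the ring."""
    w = math.exp(beta * j)
    z_bp = ((w + q - 1) / q) ** n
    loop = (q - 1) * ((w - 1) / (w + q - 1)) ** n
    return z_bp, loop

for n, q, beta, j in [(5, 3, 0.7, -1.0), (4, 3, 0.7, -1.0), (5, 2, 1.0, 1.0)]:
    z_bp, loop = ring_loop_series(n, q, beta, j)
    print(n, q, j, ring_exact(n, q, beta, j), z_bp * (1 + loop), loop)
```

Under these assumptions, the correction is negative only for odd N with J < 0, and its magnitude shrinks geometrically as N grows, in line with the exponential decay noted above.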
