Brought to you by:
Letter

Reconstructing multi-mode networks from multivariate time series

, , , , , , and

Published 24 November 2017 Copyright © EPLA, 2017
, , Citation Zhong-Ke Gao et al 2017 EPL 119 50008 DOI 10.1209/0295-5075/119/50008

0295-5075/119/5/50008

Abstract

Unveiling the dynamics hidden in multivariate time series is a task of the utmost importance in a broad variety of areas in physics. We here propose a method that leads to the construction of a novel functional network, a multi-mode weighted graph combined with an empirical mode decomposition, and to the realization of multi-information fusion of multivariate time series. The method is illustrated in a couple of successful applications (a multi-phase flow and an epileptic electro-encephalogram), which demonstrate its powerfulness in revealing the dynamical behaviors underlying the transitions of different flow patterns, and enabling to differentiate brain states of seizure and non-seizure.

Export citation and abstract BibTeX RIS

Introduction

Characterizing dynamical processes in complex systems from observed time series is a vitally important, yet intractable, issue in a broad range of research areas. Various solutions sprung up (such as fractal analysis [1], power spectra [2], recurrence plot [3], de-trended cross-correlation analysis [4] and decomposed transfer entropy [5]), but the increase of system complexity makes it largely difficult to depict the dynamical behavior from time series, and conventional methods encountered resistance to produce reliable results.

Regarding each system component as a node and determining the edges in terms of units' interactions pave a way for mapping from a system to a graph [611]. Thereafter, complex network theory can be used to investigate the qualities of the system. Several effective methodologies have been proposed to map a univariate/multivariate time series into a complex graph [1221] and have been applied to various areas like climate dynamics [22,23], epilepsy diagnosis [24], rainfall prediction [25], thermo-acoustic instability detection [26] and two-phase flow analysis [2731].

In this letter, we put forward a novel powerful functional network, namely a multi-mode weighted network, by combining multi-mode conversion with recurrence networks to focus on the analysis of high-dimensional time series. The key lies in delineating pair-wise system element interrelationships by searching for the times when both of their trajectories recur simultaneously, i.e., the occurrence of joint recurrences. In particular, we first perform a multivariate empirical mode decomposition (MEMD) [32] on each high-dimensional time series to obtain several multivariate intrinsic mode functions (IMFs) corresponding to different modes. Next, for each targeted IMFs (at each targeted mode), we infer a weighted recurrence network to form a multi-mode weighted graph. The method is then applied to measured data from fluid dynamics and neuroscience, which validate its effectiveness. Finally, comparisons with traditional functional networks will also be discussed.

Methodology

We start by introducing the MEMD method. The past decades have witnessed a rapid development of traditional time-frequency analysis methods, from Fourier transform to wavelet approaches. Such traditional methods are affected by intrinsic limitations: standard Fourier methods project data onto fixed functions and are then unable to properly analyze short and intermittent real-world data, while the employment of integral transforms in typical time-frequency analysis harms the analytic signal representation resulting from the blurring of the notion of time. These and other limitations triggered the appearance of the empirical mode decomposition (EMD) method, which holds the advantages of adaptivity, enhanced accuracy and integrity due to the fact that its bases are data-driven rather than fixed. However, channel-wise EMD analysis may generate IMFs with different numbers, and same-index IMFs are not necessarily containing equal scales. The latter problem led to the emergence of MEMD [32]. MEMD is capable of containing mode-aligned intrinsic joint rotational modes, which is achieved by analyzing the data in a p-dimensional domain, and performing uniform sampling for the projection vector directions. Therefore, the p channels of data have the same number of mode-aligned IMFs, including the same rotational modes. In addition, MEMD allows the occurrence of mixing modes, which makes it applicable for coherent multivariate time-frequency analysis. The MEMD procedure is as follows: given a p-dimensional time series x(t), 1) we first obtain a uniform sampling of V points on a p-dimensional sphere to form direction angles $\theta_{v}$ , where $v=1,2,\ldots,V$ , which are used to generate projection vectors $s_{\theta_{v}}$ (for this a low-discrepancy Hammersley sequence namely quasi-Monte-Carlo sampling is performed) [33]); 2) we project the p-dimensional time series x(t) along the direction vector $s_{\theta_{v}}$ , where $v=1,2,\ldots,V$ , so that the set of projections $\{q_{\theta_{v}}(t)\}^{V}_{v=1}$ can be achieved; 3) we find the time instants $\{t_{\theta _{v}}^{i}\}_{v=1}^{V}$ at which the set of the projected signals $\{q_{\theta_{v}}(t)\}^{V}_{v=1}$ reach the local extremum, and then interpolates $[t_{\theta _{v}}^{i},x(t_{\theta _{v}}^{i})]$ via cubic splines to acquire the multivariate envelope curves $\{e_{\theta _{v}}(t)\}_{v=1}^{V}$ ; 4) we subtract the mean of the V multidimensional envelopes $\text{m}(t)$ from x(t) to gain the detail $\text{d}(t)$ , where $\text{m}(t)=\frac{1}{V}\sum_{v=1}^{V}e_{\theta _{v}}(t)$ ; 5) if $\text{d}(t)$ satisfies the stopping criteria described in [32] for a multivariate IMFs, then $x(t)=x(t)-\text{d}(t)$ otherwise $x(t)=\text{d}(t)$ . With the new x(t), the procedure goes back to step 1). Eventually, a multivariate time series is converted into several multivariate IMFs corresponding to different frequency bands. The quasi-dyadic filterbank property of MEMD and the ability in producing mode-aligned IMFs greatly facilitate us to pick out the targeted frequency bands and focus on the analysis of those targeted modes of highest significance.

The weighted recurrence network at each targeted multivariate IMFs is then constructed. In order to capture synchronization, here the joint recurrence plot (JRP) [34] technique is considered. The construction of the weighted recurrence network can be described as follows. For a multi-channel signal $\{x_{k,l}\}_{l=1}^{L},k=1,2,\ldots,p$ which contains p sub-signals of equal length L, the phase-space reconstruction is first conducted channel-wise (by choosing a suitable embedded dimension m and time delay τ by the False Nearest Neighbors algorithm [35] and C-C method [36], respectively) as $\vec{x}_{k}(t)=(x_{k,t},x_{k,t+\tau },\ldots,x_{k,t+(m-1)\tau }),t=1,2,\ldots,N$ , where N is the number of vector points in the reconstructed trajectory. For any generated trajectory $\vec{x}_{m}(i)$ , a $N\times N$ recurrence plot (RP) can be obtained

Equation (1)

where ε is a pre-chosen threshold resulting from fixing the recurrence rate (a measure of the density of recurrence points in the recurrence plot (RP)) to 0.1, and $\| \cdot \|$ is the maximum norm. Thus, for a multi-channel signal $\{x_{k,l}\}_{l=1}^{L},k=1,2,\ldots,p$ containing a p sub-signal, the number of obtained recurrence plots is p. For pair-wise RPs, the joint recurrence plot (JRP) is

Equation (2)

In what follows, the joint recurrence rate (JRR) [37] is used to quantify the density of recurrence points in each JRP, i.e., $JRR(\vec{x}_{m},\vec{x}_{n})=\frac{1}{N^{2}}\sum_{i,j=1}^{N}JRP_{i,j}^{\vec{x}_{m},\vec{x}_{n}}$ . Then, we characterize the synchronization of pair-wise time series by $S(\vec{x}_{m},\vec{x}_{n})=\frac{JRR(\vec{x}_{m},\vec{x}_{n})}{RR}$ , where RR represents the recurrence rate of each recurrence plot, which is 0.1. Thus, for a multi-channel signal $\{x_{k,l}\}_{l=1}^{L},k=1,2,\ldots,p$ , we eventually obtain a synchronization matrix $S(\vec{x}_{m},\vec{x}_{n})$ of size $p\times p$ . By means of these calculations, we infer a weighted recurrence network by regarding each sub-signal of the multivariate IMFs as a node and deeming the value of the synchronization index $S(\vec{x}_{m},\vec{x}_{n})$ as the weight of the link between nodes m and n. Thus, the novel functional network (namely the multi-mode weighted network) is formed, and consists of several weighted networks calculated at each targeted mode. The schematic diagram of our method is described in fig. 1.

Fig. 1:

Fig. 1: (Color online) The schematic diagram of the method leading to a multi-mode weighted network representation of the original data set.

Standard image

The application in a two-phase flow system

Validation of the performance and effectiveness of the method is provided with two applications in real systems. The first is an oil-water two-phase flow, which is frequently encountered in many industrial fields. Particularly, the combination of different total flow velocities and water cut Kw (identified as different flow conditions) gives rise to the formation of three typical flow patterns, i.e., oil-in-water slug flow (D OS/W), oil-in-water bubble flow (D O/W) and very fine dispersed oil-in-water bubble flow (VFD O/W) [38]. Recognition of the transition between flow patterns, and characterization of the underlying dynamical behaviors are extremely important, especially for the enhancement of oil recovery. Our method identifies the flow patterns in the complicated oil-water two-phase flow system and detect the underlying dynamical behaviors during the transitions. We use tap-water and No. 3 white oil, with a density of 856 kg/m3 and a viscosity of 11.984 mPa · s, to conduct a small-diameter vertical oil-water two-phase flow experiment, where a high-speed cycle motivation conductance sensor is designed to record the conductivity of the oil-water mixture resulting from the evolution of different flow patterns. A series of 48-channel time series is then obtained. MEMD conversion is first performed on each 48-channel time series. According to the energy distribution of each mode, we select as targeted modes those whose energy have only little difference with the original signal and whose frequency distribution is in accordance with the typical frequency bands of the three flow patterns. For each generated weighted recurrence network, only those edges whose weight is the highest (in the range 10%–35%) among all the links are retained, aiming at inspecting the topology of relatively sparse networks [39]. In addition, the integrals (over the sparsity range) of two network metrics are considered for two metrics: the weighted global efficiency which is defined as $E^{w}=\frac{1}{n}\sum_{i\in N}\frac{\sum _{j\in N,j\neq i}(d_{ij}^{w})^{-1}}{n-1}$ , and the weighted characteristic path length which can be calculated as $L^{w}=\frac{1}{n}\sum_{i\in N}\frac{\sum _{j\in N,j\neq i}d_{ij}^{w}}{n-1}$ (both indicating the functional integration of networks). Note that, $d_{ij}^{w}$ is the shortest weighted path length between nodes i and j and is decided by $d_{ij}^{w}=\sum_{a_{uv}\in gi\mathop{\leftrightarrow} \limits^w j}f(w_{uv})$ , where $g_{i\mathop{\leftrightarrow} \limits^w j}$ is the shortest weighted between i and j and f is a map from weight to length [40].

The final values of Lw and Ew of each flow condition are obtained by averaging the integrated Ew and Lw from each generated recurrence network. The combined distribution graph of these two metrics is reported in fig. 2, together with images of the different flow conditions. In addition, fig. 3(a) and (b) presents some detailed results of these two metrics, aiming at portraying the evolution of different flow patterns. Notice that sparse networks are here weighted, while traditional functional networks are commonly un-weighted. As a result, we can compare our weighted sparse networks with multi-mode un-weighted networks and calculating the integrated un-weighted global efficiency and characteristic path length (denoted as E and L, respectively) for the same flow conditions as in fig. 3(a) and (b) (the detailed results are shown in fig. 3(c) and (d)). The definitions of E and L are similar to that of Ew and Lw, but $d_{ij}^{w}$ should be substituted for dij which represents the shortest path length (distance) between nodes i and j. Furthermore, the Pearson correlation coefficient (a widely used metric for functional networks) is here also used on the original data to infer a traditional functional network for the flow conditions described in fig. 3, and again the integrated Ew and Lw are calculated (the results are reported in fig. 4).

Fig. 2:

Fig. 2: (Color online) The combined graph of Lw and Ew (see text for definition). Insets are photos of the corresponding flow patterns.

Standard image
Fig. 3:

Fig. 3: (Color online) The evolutional process of the two metrics during the transition among different patterns, while the water cut Kw is 86% and 97%, respectively: (a) Ew; (b) Lw; (c) E; (b) L (see text for definitions of all quantities).

Standard image
Fig. 4:

Fig. 4: (Color online) The evolutional process of the two metrics, calculated using a traditional (Pearson correlation coefficient based) functional network, during the transition among different flow patterns while the water cut Kw is 86% and 97%, respectively: (a) Ew; (b) Lw.

Standard image

In fig. 3(a) and (b) the distribution of the two measures is significantly different in the transition among the three typical oil-water flow patterns. While the total flow velocity is low and the oil cut is relatively high, the oil-in-water slug flow appears as an aggregation of cup-shaped oil slugs and small oil droplets moving in a water continuum quasi-periodically. The increase of the flow velocity and the turbulent kinetic energy destroys the joint movement conditions, and makes it impossible to form oil slugs which separate into relatively uniformly distributed small oil bubbles. At that moment, the movement of oil bubbles becomes stochastic, resulting in a gradual weakening of the synchronization of the system and further leading to the increase of Lw and the decrease of Ew. Figure 3(a) and (b) provides the concrete evolutional process of the two metrics, related to the transition of flow patterns, and the water cut Kw is 86% and 97%, respectively. When the oil cut is pretty low but the turbulent kinetic energy is huge, the giant crushing force triggers oil bubbles scattering into the whole tube and transforming them into tiny droplets, indicating that the very fine dispersed oil-in-water bubble flow shows up. Thus, the synchronization of the system nearly dies away and Lw possesses a growth trend while Ew changes in the opposite trend simultaneously. On the other side, as shown in fig. 3(c) and (d), the reconstruction of un-weighted network is infeasible to identify the three different flow patterns, let alone characterize the complicated evolution process. In addition, fig. 4(a) and (b) shows that the traditional approach, based on Pearson correlation analysis, is also unable to uncover the flow behavior in the transitions between different flow conditions. All these findings render our multi-mode weighted complex network extraordinarily effective for identifying flow patterns, and for characterizing the dynamical behaviors in the evolution of different flow patterns.

The application in an epileptic electro-encephalogram

As a paroxysmal neurological disorder, epilepsy occurs with recurrent and abrupt seizures closely linked to neural synchronization [41]. Most of the studies on epilepsy which use complex networks analysis [4244] aim at differentiating epilepsy patients from healthy controls. However, the identification of pre-seizure and seizure periods (i.e., the seizure prediction task) can further improve the precaution and even the treatment of epilepsy [45,46]. We here apply our method to the identification of pre-seizure and seizure in the scalp electroencephalogram (EEG) recordings [47] of three pediatric patients (labeled as ch01, ch03, ch08), who feature 7, 7 and 5 seizure events respectively. For each one of them, we select a 20-s-long non-seizure recording immediately prior to the seizure onset (the pre-seizure recording), and a 20-s-long seizure recording. After the MEMD procedure, we choose mode 2 to 8 as our targeted modes (corresponding to the 1–48 Hz broad frequency band), and construct the multi-mode weighted complex networks. Thereafter, we calculate the integrated weighted local efficiency and clustering coefficient (representing the efficiency of local communication and the clustered connectivity in networks, respectively) of each weighted recurrence network and then average the values of all of them as the final results. Besides, the weighted local efficiency is defined as $E_{loc}^{w}=\frac{1}{2n}\sum _{i\in N}\frac{\sum _{j,h\in N,j\neq i}(w_{ij}w_{ih} [d_{jh}^{w}(N_{i})]^{-1})^{1/3}}{k_{i}(k_{i}-1)}$ , where $d_{jh}^{w}(N_{i})$ is the length of the shortest path between nodes j and h that contains only neighbors of node i, and ki is the degree of node i [40]. And the weighted clustering coefficient is defined as $C^{w}=\frac{1}{n}\sum _{i\in N}\frac{2t_{i}^{w}}{k_{i}(k_{i}-1)}$ , where $t_{i}^{w}=\frac{1}{2}\sum _{j,h\in N}(w_{ij}w_{ih}w_{jh})^{1/3}$ is the geometric mean of triangles around node i and wij, wih and wjh represent the weight between node i and j, i and h, j and h, respectively [40].

A t-test is performed between non-seizure and seizure results for all events of each patient. The p-value of each t-test is shown in fig. 5, where each p-value is much less than 0.05, reflecting the excellent performance of the two metrics on discriminating between non-seizure and seizure periods. Cw presents a growth trend, namely, the clustering of synchronization among different brain regions experiences a sharp change due to the onset of seizure. In addition, the increase of the integrated $E_{loc}^{w}$ implies that during the seizure period a well-pronounced local synchronization behavior appears and compared to the non-seizure period, its efficiency among different brain regions is stronger. Once again, all these findings validate that our method possesses the capacity of exploring the dynamical properties underlying seizure processes, and is able to provide a more accurate seizure prediction.

Fig. 5:

Fig. 5: (Color online) The p-value for each t-test: (a) Cw; (b) $E_{loc}^{w}$ (see text for definitions of both quantities).

Standard image

Conclusions

In summary, our method allows realizing multi-information fusion. Successful utilization in vertical oil-water two-phase flow experiments demonstrates that the reconstructed network enables to distinguish between typical flow patterns, and further detects the inherent dynamical property during the transition among the different flow patterns. Furthermore, the results also illustrate that our technique performs much better than un-weighted networks and traditional approaches based on Pearson correlation analysis. On the other hand, a strong performance in patient-specific classification of non-seizure and seizure periods paves the way for the utilization of our method in the prediction task of seizure onset. Our method is however not limited to the above applications, but can make contributions in all cases in which addressing multi-information fusion challenges is needed, which is generally encountered in many data-intensive research areas like smart power grids and cities, as well as intelligent transport and health care.

Acknowledgments

This work was supported by National Natural Science Foundation of China under Grant No. 61473203, and the Natural Science Foundation of Tianjin, China under Grant No. 16JCYBJC18200.

Please wait… references are loading.
10.1209/0295-5075/119/50008