
Optimizing vetoes for gravitational-wave transient searches


Published 25 June 2013 © 2013 IOP Publishing Ltd
Citation: R Essick et al 2013 Class. Quantum Grav. 30 155010. DOI: 10.1088/0264-9381/30/15/155010


Abstract

Interferometric gravitational-wave detectors like LIGO, GEO600 and Virgo record a surplus of information above and beyond possible gravitational-wave events. These auxiliary channels capture information about the state of the detector and its surroundings which can be used to infer potential terrestrial noise sources of some gravitational-wave-like events. We present an algorithm addressing the ordering (or equivalently optimizing) of such information from auxiliary systems in gravitational-wave detectors to establish veto conditions in searches for gravitational-wave transients. The procedure was used to identify vetoes for searches for unmodeled transients by the LIGO and Virgo collaborations during their science runs from 2005 through 2007. In this work we present the details of the algorithm; we also use a limited amount of data from LIGO's past runs in order to examine the method, compare it with other methods, and identify its potential to characterize the instruments themselves. We examine the dependence of receiver operating characteristic curves on the various parameters of the veto method and the implementation on real data. We find that the method robustly determines important auxiliary channels, ordering them by the apparent strength of their correlations to the gravitational-wave channel. This list can substantially reduce the background of noise events in the gravitational-wave data. In this way it can identify the source of glitches in the detector as well as assist in establishing confidence in the detection of gravitational-wave transients.


1. Introduction

The Laser Interferometer Gravitational-wave Observatory (LIGO) [1] together with Virgo [2] and GEO600 [3] form a network of detectors employing kilometer-scale interferometers to search for gravitational waves (GWs) from astrophysical and cosmological sources. One such class of sources is expected to result in short-lived signals lasting from milliseconds to several seconds within the sensitive frequency band of the instruments. They may correspond to core-collapse supernovae, neutron star glitches, cosmic string cusps and kinks, magnetars or some binary compact systems (made up of neutron stars/black holes) [4]. Environmental and instrumental noise sources may generate similar short-lived signals through their coupling to the GW sensing channel. These signals are colloquially referred to as 'glitches'. During the first-generation instruments' operation (2002–2010), such glitches were non-Gaussian and non-stationary and presented challenges when searching for transients of astrophysical origin. For well-modeled GW transient sources (like most of the binary compact star coalescences), knowledge of the expected signal waveform significantly helps reject such glitches. These signal-based vetoes have been developed and invoked in existing searches [5, 6]. However, for unmodeled (or poorly modeled) transient searches, such glitches may drive detection thresholds to significantly higher values than one would expect for Gaussian backgrounds [7, 8].

Interferometric detectors record their physical environment and detailed interferometry status through thousands of auxiliary channels that present no or negligible coupling to GWs. Information from these channels presents an important handle for understanding (and fixing) the sources of noise in the instruments, reducing the background and ultimately establishing confidence in detections. The problem of identifying and 'mechanizing' the use of information from auxiliary channels is long standing within the GW data analysis community [9–11].

A number of statistical quantities have been developed [12] in order to help characterize the performance of a particular auxiliary channel or veto strategy, such as veto efficiency (the fraction of GW-channel glitches removed), use percentage (the fraction of auxiliary-channel glitches which can be associated with a GW-channel glitch) [9, 13], dead-time (the effective fraction of analysis live-time removed when applying the veto strategy), and veto significance (the statistical significance of a measured correlation between auxiliary and GW-channel glitches assuming random coincidence) [14]. These veto metrics are most appropriate for a simple veto strategy, such as a coincidence between the auxiliary and GW-channel glitch within a short specified time window. Expansions on this approach include making use of our knowledge of the instrument to anticipate when noise coupling between the auxiliary and GW channels is strongest and/or consistent with observation [15, 16]. The use of machine learning algorithms to digest the large amounts of auxiliary channel information and predict GW-channel glitches is also an active area of study [17]. Aside from auxiliary information, signal consistency of the event across multiple instruments, or between the observed signal and a theoretical waveform, provides an extremely powerful way to reject noise transients. However, accidental transient noise coincidence across multiple instruments is still a dominant source of background in astrophysical searches, especially in searches for unmodeled transients.

In this paper we present an algorithm which we will call ordered veto list (OVL). It exploits the information from the auxiliary channels in GW detectors and uses a unified ranking metric, veto efficiency divided by dead-time (described earlier), in order to make inferences about the source of GW-channel events recorded at each detector. The OVL algorithm operates by generating a small time window surrounding a glitch in an auxiliary channel. If a GW-channel transient is present within this window, it is assumed to be noise and removed. An earlier version of the method, described in [18], was used to identify noise transients in a search for GW bursts during LIGO's fifth and Virgo's first science runs [7, 19].

OVL addresses several problems with the simple strategy of removing all live-time associated with a disturbance in an auxiliary channel. First, OVL identifies only those channels with noise that couples in a statistically significant way to the GW channel, thus avoiding the unnecessary removal of live-time. Second, when multiple auxiliary channels cover the same transient disturbances in the instrument, OVL selects the channel which optimally removes background with minimal loss of live-time. Ultimately, OVL provides an ordered list of rules to follow for successively removing time from the analysis based on auxiliary channel information, with the most effective channels at the top of the list [14]. The ordered list also provides a metric representing our confidence that any particular event is instrumental in origin.

Another hierarchical veto selection method, h-veto [20], has been developed. Both methods have a similar strategy for ranking veto channels. The major difference is in the figure-of-merit used in the ordering process. The ranking for h-veto depends on the statistical significance of the correlation between the auxiliary and GW channels under the assumption of a Poisson process. This choice favors large statistics, and generally results in a short list of highly efficient vetoes, where each entry represents the most relevant physical parameters (e.g. time-scale) of the coupling. OVL, on the other hand, ranks vetoes by the ratio of efficiency divided by the fractional dead-time. This favors vetoes which have the highest rate of GW-channel transients within their chosen exclusion windows. In this ranking, it is common for the same auxiliary channel to appear multiple times in the list with different thresholds on glitch strength or different exclusion window durations. Typically a channel is chosen first at the highest thresholds (strongest auxiliary channel disturbances) and smallest exclusion windows, and only appears later with more relaxed parameters. This choice maximizes the overall efficiency obtained using the best veto parameters at a fixed dead-time threshold at the expense of a longer, interlaced list which can be more difficult to interpret. Finally, in OVL, the veto windows are explicitly calculated as a set of non-overlapping time intervals (segments), and overlaps between different veto conditions are calculated exactly. This can be important when considering highly correlated auxiliary channels with significant overlap.

This paper is organized as follows. In section 2, we describe the OVL algorithm, including the construction of veto configurations and their application to GW-channel data. In section 3, we discuss OVL's performance when applied to two samples of LIGO data: one month from LIGO's fourth science run (S4) and one week from LIGO's sixth science run (S6). We evaluate the method's performance using receiver operating characteristic (ROC) curves and examine some features of the ordering generated. In section 4, we describe applications of the method to instrument characterization. We conclude in section 5.

2. Description of the algorithm

OVL is based on a simple process that includes constructing veto configurations, applying those configurations to GW-channel data, and iterating. It uses glitches identified by a generic transient-finding algorithm called KleineWelle (KW) [21]. Each of these glitches is characterized by a time, duration, frequency content, and significance (footnote 3). While all our investigations are based on KW, any glitch-finding method able to provide at least a time and a significance for each event can be used. In this way, OVL can be adapted to specific searches for different types of GW signals, or to more general detection problems beyond GW astronomy.
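As a concrete illustration, a minimal sketch of the per-glitch record that OVL consumes is given below; the field names are ours and do not reflect the actual KleineWelle output format.

```python
# A minimal sketch of the glitch record OVL consumes; the field names are
# illustrative, not the actual KleineWelle output format.
from dataclasses import dataclass

@dataclass
class Glitch:
    time: float          # central time of the event (s)
    duration: float      # duration of the event (s)
    frequency: float     # central frequency (Hz)
    significance: float  # KW significance (see footnote 3)
```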

2.1. Construction of veto configurations

The OVL algorithm systematically searches for coincident signals in the GW and auxiliary channels by identifying glitches in the GW channel that fall within a pre-defined time window surrounding glitches in an auxiliary channel. In order to accommodate the variety of possible couplings between auxiliary channels and the GW channel, the algorithm tries a variety of window sizes and thresholds on auxiliary glitch significance. Each configuration is labeled by a set of three parameters: an auxiliary channel name, a threshold on the significance of auxiliary glitches, and a time window. OVL uses all permutations of auxiliary channels, significance thresholds and time windows to create a diverse set of configurations. In the specified auxiliary channel, OVL selects only those glitches with significance above the threshold, and then time windows are constructed around the central time of each remaining glitch. The union of these time windows forms a list of (possibly) disjoint segments that are used to remove live-time. Figure 1 demonstrates this procedure using some artificial data.
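The segment construction for a single configuration can be sketched as follows; this is an illustrative Python fragment, not the code used in our analysis.

```python
# A minimal sketch of veto-segment construction for one configuration
# (auxiliary channel, significance threshold, time window).
def veto_segments(aux_times, aux_significances, threshold, window):
    """Return disjoint (start, end) veto segments around auxiliary glitches."""
    # Keep only auxiliary glitches above the significance threshold.
    centers = sorted(t for t, s in zip(aux_times, aux_significances) if s >= threshold)
    segments = []
    for t in centers:
        start, end = t - window, t + window
        # Merge with the previous segment if the windows overlap.
        if segments and start <= segments[-1][1]:
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            segments.append((start, end))
    return segments
```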

Figure 1. A cartoon showing how veto segments are constructed for a given configuration. Time runs on the x-axis while the y-axis is an arbitrary quantity that may represent the amplitude of the signal in the time-domain or some scalar quantity reflecting its significance in a time-frequency decomposition. In an auxiliary channel (labeled 'vchan' here), a threshold is applied to the glitches (shown as 'vthr') and then windows (indicated as 'vwin' above) are created surrounding those glitches. The union of these windows is then used to remove time from the GW channel (shown as 'DARM_ERR' above), thereby vetoing some GW-channel glitches.


The algorithm creates these configurations to separate and categorize different types of auxiliary glitches. It then applies the configurations to the GW-channel data and searches for an optimal order. This allows the method to find highly correlated configurations while removing spurious information from uncorrelated channels, thereby identifying troublesome auxiliary channels or glitches.

There is no reason the parameters describing each configuration must be limited to channel name, time window, and significance threshold. By including other degrees of freedom such as frequency bands, glitch duration, data epoch, or time of day (e.g. mornings versus evenings), the algorithm will be able to find more specific correlations between configurations and GW-channel data, giving better overall performance. However, the total number of configurations must also be balanced against the number of auxiliary glitches available in order to maintain sufficient statistics.
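For illustration, the configuration set can be enumerated as a simple Cartesian product of the parameter values; the channel names and values below are placeholders, and additional degrees of freedom would enter as further factors.

```python
# A sketch of how the configuration set could be enumerated; extra degrees of
# freedom (e.g. frequency band) would be added as further factors in the product.
from itertools import product

channels = ["L1_LSC-POB_Q_1024_4096", "L0_PEM-LVEA_SEISZ_8_128"]  # placeholders
thresholds = [15, 25, 30, 50, 100, 200, 400, 800, 1600]           # KW significance
windows = [0.025, 0.050, 0.100, 0.150, 0.200]                     # seconds

configurations = list(product(channels, thresholds, windows))
```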

2.2. Initial application of veto configurations to GW-channel data

OVL is based on the idea of ordering veto configurations by their correlations with GW-channel data. We achieve this by associating a figure-of-merit with each veto configuration. Initially, we assume there is no way to know which configuration will perform well, and we give each an equal opportunity (footnote 4). Each veto configuration is applied to the entire GW-channel data set, which we represent as a list of times corresponding to GW-channel glitches. For each veto configuration, veto segments are generated as prescribed in section 2.1. We then count the number of GW-channel glitches that fall within these segments, and determine the configuration's efficiency, the fraction of events removed. Likewise, we determine the fractional dead-time for the configuration by measuring the fraction of live-time contained in the veto segments using explicit segment arithmetic. The ratio of these two quantities defines OVL's figure-of-merit, which we will refer to as efficiency-over-dead-time. This 'base-line' efficiency-over-dead-time measurement is then used to find an initial ordering scheme for all veto configurations. Figure 1 shows how this looks when comparing two time series.
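In code, the figure-of-merit evaluation for a single configuration might look like the following sketch (assuming the veto_segments helper above); it is illustrative only.

```python
# A sketch of the efficiency-over-dead-time evaluation for one configuration.
def efficiency_over_deadtime(gw_glitch_times, segments, livetime):
    """Return (efficiency, fractional dead-time, efficiency/dead-time)."""
    n_vetoed = sum(any(a <= t <= b for a, b in segments) for t in gw_glitch_times)
    efficiency = n_vetoed / len(gw_glitch_times)
    dead_time = sum(b - a for a, b in segments) / livetime
    rank = efficiency / dead_time if dead_time > 0 else 0.0
    return efficiency, dead_time, rank
```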

We should note that the efficiency-over-dead-time has a straightforward interpretation in terms of Poisson processes. We can write the efficiency as ε = nc/NGW and fractional dead-time as f = t/T where nc is the observed number of coincident events, NGW is the total number of GW-channel glitches, t is the amount of time contained in the veto segments and T is the total amount of live-time. We then have

ε/f = (nc/NGW)/(t/T) = nc/(tλGW),    (1)

where λGW = NGW/T is an estimate of the rate of GW-channel glitches. This means that tλGW ≈ 〈nc〉, the expected number of GW-channel coincidences that will randomly fall within the veto segments. We then see that the efficiency-over-dead-time is nothing but the ratio of the observed number of coincident events to the expected number of coincident events. As we will see in the results and analysis section, we observe efficiency-over-dead-times as high as 10⁴ for some highly correlated configurations.

Besides associating an efficiency-over-dead-time with each configuration, OVL also records the Poisson significance of finding the measured number of coincidences. We compute the Poisson significance as the cumulative probability of observing as many or more coincident GW-channel glitches given the expected number of coincident glitches. OVL uses this significance threshold to avoid over-training by requiring the observed correlations to be reasonably unlikely to occur by chance given the large number of configurations tested.
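The Poisson significance can be computed directly from the survival function of the Poisson distribution; the sketch below (using scipy, not necessarily the implementation used here) returns both the cumulative probability and its negative logarithm, the latter being the quantity quoted as vsignif in table 2.

```python
# A sketch of the Poisson significance: the probability of observing
# n_coinc or more coincidences by chance, and its negative logarithm.
import numpy as np
from scipy.stats import poisson

def poisson_significance(n_coinc, expected):
    # poisson.sf(k, mu) = P(n > k), so P(n >= n_coinc) = sf(n_coinc - 1, mu).
    p = poisson.sf(n_coinc - 1, expected)
    return p, (-np.log(p) if p > 0 else np.inf)
```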

We should note that rankings based on efficiency-over-dead-time and rankings based on Poisson significance (as is done with h-veto) are not equivalent, and the two metrics may generate different lists. We can see this by noting that the statistical significance of observing a given number of coincidences can be written as

P_Poisson = ∑_{n ⩾ nc} e^(−〈nc〉) 〈nc〉^n / n!,   where   〈nc〉 = tλGW = nc/(ε/f).    (2)

We then see that the statistical significance is a function of both the number of coincident events and efficiency-over-dead-time. Generally, for a given ε/f, the Poisson statistical significance will favor veto configurations with large nc, and therefore favors configurations with large number statistics versus a ranking based only on ε/f.

2.3. Iteration and ordering

Once a 'base-line' efficiency-over-dead-time has been determined for each veto configuration, they are ordered accordingly. At this point, veto configurations are applied hierarchically. When a configuration is applied, OVL counts the number of coincident GW-channel glitches and evaluates the associated efficiency-over-dead-time and Poisson significance. The GW-channel glitches and veto segments associated with this configuration are removed from the rest of the analysis. In this way, OVL prevents redundant vetoes. Later configurations do not count GW-channel events or live-time already removed by previous configurations. This data-reduction scheme is applied after each configuration, and therefore each veto configuration in the list sees a slightly different set of GW-channel glitches. Furthermore, the statistics computed for each configuration do not represent global fractions, but rather are fractions of a subset of data. This also allows OVL to measure the additional information contained in later configurations. If they are completely redundant with an earlier configuration, then they will not find any new GW-channel glitches and their efficiency-over-dead-time will vanish. In this way, OVL can remove spurious, extraneous, or redundant veto configurations.
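A sketch of one such hierarchical pass, with the data-reduction step made explicit, is given below; the helper names (eval_config and its return values) are ours, not part of the actual implementation.

```python
# A sketch of one hierarchical pass through an ordered list of configurations.
# Each configuration is evaluated only on the glitches and live-time left over
# by the configurations above it; eval_config is a placeholder callback.
def hierarchical_pass(configs, gw_times, livetime, eval_config):
    remaining = list(gw_times)
    results = []
    for cfg in configs:
        eff, fdt, rank, vetoed, dead = eval_config(cfg, remaining, livetime)
        results.append((cfg, eff, fdt, rank))
        # Data reduction: later configurations never see glitches or
        # live-time already removed by earlier configurations.
        remaining = [t for t in remaining if t not in vetoed]
        livetime -= dead
    return results
```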

This raises the question of what determines an important configuration. We define these as configurations that yield both a sufficiently high efficiency-over-dead-time and a sufficiently high Poisson significance. Because these are functionally dependent, we can think of this as requiring a sufficient strength of correlation and a sufficiently large number of observed coincidences. By including the Poisson significance threshold, OVL tends to reject configurations with extremely low number statistics and low correlation strength, which makes the ordered list's performance more robust when applying it to different data sets. If configurations do not perform above these two thresholds, they are removed from the list. This process helps OVL to converge rapidly, and prevents under-performing channels, which would be removed in the next iteration anyway, from affecting the performance of later channels. For the final evaluation of a list's performance after the ordering process is complete, all channels in the list are applied.

In practice, the entire method is repeated several times, which allows for repeated evaluation and ranking of the configurations. After each iteration, the veto configurations are re-ordered according to their most recent efficiency-over-dead-time measurements. The next iteration re-applies the configurations in this new order. If that order is optimal, it won't change between iterations. After about two iterations, the list gains the bulk of its performance, with a nearly monotonic decrease in efficiency-over-dead-time as one reads down the ordered list. The near monotonicity is caused by the finite number of iterations, as well as phenomena analogous to cycles in Markov processes. Once a sufficiently monotonic list is created, further iterations do not have a large impact on performance, but they can help simplify the list by removing unnecessary configurations, and are relatively cheap to calculate because the final lists are comparatively short.
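The outer iteration can then be sketched as repeated re-ordering and re-application, building on the hierarchical_pass fragment above; the numerical threshold is illustrative only, and a Poisson-significance cut would be applied in the same way.

```python
# A sketch of the outer iteration: re-order by the latest epsilon/f and drop
# under-performing configurations (threshold value illustrative only).
def iterate(configs, gw_times, livetime, eval_config, n_iter=9, min_rank=3.0):
    for _ in range(n_iter):
        results = hierarchical_pass(configs, gw_times, livetime, eval_config)
        kept = [(cfg, rank) for cfg, _, _, rank in results if rank >= min_rank]
        configs = [cfg for cfg, _ in sorted(kept, key=lambda kr: -kr[1])]
    return configs
```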

3. Results and analysis

An earlier version of this method [18] was used to reject transient noise artifacts in searches for unmodeled bursts in LIGO's fifth science run (S5) and Virgo's first science run (VSR1) data [7, 19]. In those searches, the method was able to reject 13–45% of single-detector glitches and 5–10% of coincident background (which is generally weak) at under 1% dead-time. In order to further study the performance and systematics of the OVL procedure, we analyzed the entirety of LIGO S4 data (22 February 2005–25 March 2005) from the LIGO Hanford Observatory (H1) and one week of LIGO S6 data (28 May 2010–4 June 2010) from the LIGO Livingston Observatory (L1) using KW glitches [21]. These two data sets were collected from geographically distant detectors and are separated by several years. They therefore represent very different noise environments. Furthermore, because the LIGO detectors underwent significant commissioning between S4 and S6, these data sets may capture different couplings between auxiliary channels and the GW-channel data. Therefore, the similarities in OVL's performance on these very different data sets allow us to draw conclusions about the method itself rather than about peculiarities of either data set.

In this analysis, because there may be causal relations between the GW-channel and auxiliary channels, we restrict ourselves to a subset of auxiliary channels shown to be minimally coupled to the GW channel and thus safe to be used as vetoes. These safety relations are determined through hardware injections at the sites, in which the test masses are driven with a known GW-like transient waveform. Searches for these transients in auxiliary channels during hardware injections at the instruments have been used to establish the exact conditions on their strength (either absolute or related to the GW-channel signal) that make them 'safe', i.e., ensuring that they will not systematically reject an astrophysical GW transient [8, 19, 22].

In what follows, we find similar performance between the two data sets, and describe the characteristics of the method. In both cases, OVL identifies a subset of channels that appear to be correlated with the GW-channel data, and this subset is significantly smaller than the set of all auxiliary channels.

Our implementation of this algorithm used Python. An AMD 2.7 GHz 32-bit processor processed 3.7 × 10⁴ s of data containing 2.8 × 10³ GW-channel glitches with 250 auxiliary channels from S6 in approximately 9 h. Nearly half the computational effort is spent on the first two iterations, with the following iterations processing much faster.

3.1. ROC curves and bulk statistics

The ROC curve is the basic diagnostic for how the OVL algorithm performs. It shows the fraction of GW-channel glitches removed as a function of the fraction of live-time removed. As a first experiment, we compared two ranking figures-of-merit: efficiency-over-dead-time and the Poisson significance, both described in section 2.2. Figure 2 shows the ROC curves for each ranking scheme. As expected, ranking by incremental efficiency-over-dead-time results in a higher-performing ROC when measuring overall cumulative efficiency as a function of dead-time, though the two rankings begin to converge near the end, where all effective veto conditions have been used up. Furthermore, the Poisson significance selects many fewer configurations, each covering a larger number of events. This is seen clearly in the discreteness of the curve. A short veto list is often convenient for simplicity and ease of interpretation. A smooth curve with many configurations may also be preferable in cases where one requires a continuous ranking parameter.
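As an illustration, the ROC curve can be traced by accumulating the number of glitches and the dead seconds removed by each configuration, read in list order, and normalizing by the global totals; the sketch below assumes these per-configuration counts are available.

```python
# A sketch of ROC-curve construction from an ordered list: cumulative
# efficiency versus cumulative fractional dead-time down the list.
import numpy as np

def roc_curve(n_removed, sec_removed, n_gw_total, livetime_total):
    """n_removed, sec_removed: per-configuration counts in list order."""
    eff = np.cumsum(n_removed) / n_gw_total
    fdt = np.cumsum(sec_removed) / livetime_total
    return fdt, eff
```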

Figure 2. Comparison of ranking by two figures-of-merit. Both curves are generated over the S6 data set, using the same code, set of configurations, and performance thresholds. These curves represent the ninth iteration through OVL, when both ordered lists have settled down to a near optimal order. We see that the efficiency-over-dead-time (ε/f: blue) curve is better than the Poisson significance curve (Ppoisson: green). Efficiency-over-dead-time also produces a much smoother curve, indicating that Poisson significance favors larger number statistics.


As stated above, OVL's performance improves upon iteration, but the majority of the efficiency gains are achieved after two iterations. Figure 3 shows OVL's performance after several different iterations. The main advantage of running to nine iterations is a reduction in the number of veto configurations identified. Table 1 shows that the number of important channels stabilizes rather early, but the number of important configurations continues to decrease. This means that further iteration helps to compress the information stored in the important channels into a shorter list and determine exactly which glitches in which channels are most troublesome.

Figure 3. Left: receiver operating characteristic (ROC) curves. These plots show OVL's performance after different numbers of iterations. Importantly, iteration 1 represents the ordering generated by applying individual veto configurations independently to the entire data set. We would expect the algorithm to improve with further iteration because it can better order its list of vetoes, but we see that the bulk efficiency gains are achieved after only two iterations. There are marginal gains with further iterations, but these are well within the expected errors. The shaded regions represent 68% confidence intervals. Right: the relation between OVL's figure-of-merit (ε/f = efficiency-over-dead-time) and the veto configuration's order in the list for S4 and S6 data. We expect a nearly monotonic decrease for well-trained lists. The blue curve denotes the point-estimate, with the shaded regions representing 68% and 95% confidence intervals. We see that the errors are relatively small in these plots, in contrast to figure 4, where the same analysis is repeated for uncorrelated data. There is an additional feature in the S6 data: a distinct knee in these plots at high fractional rank. This suggests a large special population of easily-vetoed glitches. The projected histograms show the grouping of ε/f, with the peak clearly above the Poisson-coincidence prediction of 1. The bin with ε/f = 0 corresponds to configurations that do not find any coincident GW-channel events.


Table 1. Number of auxiliary channels and veto configurations present in OVL as a function of iteration number. We see that the number of important channels stabilizes quickly and corresponds to the bulk efficiency gains seen in figure 3. The information in these channels is then compressed into a small number of veto configurations (corresponding to different values of auxiliary channel glitch significance and coincidence time window) upon further iterations.

Iteration          Initial     1    2    3    4    5    6    7    8    9
S4 H1  No. chan        161   106   99   52   47   47   47   47   47   47
       No. config     7245  3117  294  196  183  178  176  176  176  175
S6 L1  No. chan        202    44   37   35   35   35   35   35   35   35
       No. config    11250  4361  209  140  128  119  118  118  117  117

We also note that the ordering improves upon further iteration. Figure 3 shows the relation between a configuration's efficiency-over-dead-time and its position in the list for different iterations. We observe a general smoothing with further iterations. Even though the bulk of OVL's performance is gained after two iterations, the list still contains many irrelevant and under-performing configurations. By iteration 9, these configurations have been removed and the efficiency-over-dead-time decreases much more smoothly, although there are still some fluctuations. Furthermore, the histograms of efficiency-over-dead-time show the distribution of performance over the list. We see that the peak of the distribution is well away from the Poisson coincidence prediction of 1, and iteration does not damage this distribution.

3.2. Performance on uncorrelated data

In order to further check the overall implementation and performance of our veto algorithm, we examined its output when applied to uncorrelated data. For this purpose we constructed artificial data sets from S6 data by randomly shifting the auxiliary glitches in every channel by a different amount of time. In this way, we break all temporal correlations between the GW channel and auxiliary channels, as well as between the auxiliary channels themselves. We therefore expect OVL to be subject only to correlations due to statistical fluctuations in the coincidence of two Poisson processes. We processed this data through the OVL pipeline and examined the ROC curve as well as the dependence of the algorithm's figure-of-merit on a configuration's fractional rank. This is shown in figure 4.
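The time-shift construction can be sketched as follows; each channel receives its own random offset, applied modulo the analysed span so that the per-channel glitch rate is preserved (this is an illustrative fragment, not our production code).

```python
# A sketch of the per-channel time-shift used to build an uncorrelated data set.
import random

def time_shift(aux_times_by_channel, t_start, t_end, seed=0):
    rng = random.Random(seed)
    span = t_end - t_start
    shifted = {}
    for chan, times in aux_times_by_channel.items():
        dt = rng.uniform(0.0, span)  # a different shift for every channel
        shifted[chan] = [t_start + (t - t_start + dt) % span for t in times]
    return shifted
```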

Figure 4. Left: ROC curves for uncorrelated data. We see that the ROC curves are much worse than those for correlated data. Right: the relation between OVL's figure-of-merit and the veto configuration's order in the list for time-shifted S6 data. The bin with ε/f = 0 corresponds to configurations that do not find any coincident GW-channel events. We see that the general trend of figure 3 is still present; however, the errors are somewhat larger here. This is most noticeable in the upper error estimates, which are significantly higher in this figure. This is expected, because all coincident GW-channel glitches are due to pure chance in the time-shifted data, and the errors are dominated by small number statistics and the lower bound nc ⩾ 0.


When compared with the plots in figure 3 we see that the ROC curve is much worse for the uncorrelated data, as expected. However, there are common features in the lists, namely the comparable decay in efficiency-over-dead-time as one moves to higher ranks.

Figure 4 shows the relation between efficiency-over-dead-time and a configuration's rank. This ordering consists entirely of over-training, because it represents OVL's performance when evaluated on the same data used to generate the ordered list. We should note that when we apply the threshold on Poisson significance used for the correlated data to this uncorrelated data, only 1% of these configurations survive. However, by lowering the Poisson significance threshold, we are able to examine the underlying distribution of noise in our analysis. From the projected histograms, we can interpret our threshold on efficiency-over-dead-time as being designed to separate the distribution for correlated data (figure 3) from that for uncorrelated data (figure 4).

3.3. Round robin algorithm

There exists a potential danger of over-training OVL with a single data set, which may produce artificially high performance that will not generalize to other data. To address this, we performed a simple cross-validation procedure that shows this effect is not significant for our results. We divided the data into separate sets, which were arranged like a one-dimensional checker-board in time, meaning the kth bin contains every kth minute of the live-time. Training was then performed on all sets except one, and that one set was used for evaluation. This was carried out with each set, so we have a measure of the algorithm's performance over all our data. Figure 5 shows a comparison between the round robin and non-round robin procedures. We see that there is no significant difference between the two curves at the 68% confidence level, although a small systematic error is introduced. The consistency between round robin and non-round robin performance is due to OVL's threshold on Poisson significance, described in section 2.2. By requiring the number of observed coincidences to be statistically significant relative to the total number of veto configurations tested, OVL can reject configurations with low number statistics that may pass the performance threshold due to random coincidence alone.
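The checker-board binning can be sketched as follows; the one-minute stride is the only assumption beyond what is described above.

```python
# A sketch of the 1-d checker-board split used for the round robin test:
# the kth bin contains every kth minute of live-time.
def round_robin_bins(t_start, t_end, n_bins, stride=60.0):
    bins = [[] for _ in range(n_bins)]
    t, i = t_start, 0
    while t < t_end:
        bins[i % n_bins].append((t, min(t + stride, t_end)))
        t += stride
        i += 1
    return bins  # train on n_bins - 1 of these, evaluate on the remaining one
```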

Figure 5. A comparison of round robin and non-round robin analysis techniques. Shaded regions represent 68% confidence intervals. The round robin algorithm ensures that lists are developed and then evaluated using disjoint data sets. However, we see that the application to disjoint data has a very small effect on the list's performance. This suggests that OVL finds overwhelmingly true correlations in the data. The observed decrease in performance could be due to a combination of small over-training errors and statistical fluctuations caused by evaluating a list's performance on a smaller data set (implicit in the division associated with round robin analysis).


We also examine the performance of individual configurations when round robin binning is used. Figure 6 shows the efficiency-over-dead-time for correlated and uncorrelated S6 data (discussed further in section 3.2), and should be compared with figures 3 and 4. It is clear that both correlated and uncorrelated data contain some configurations that do not perform well on disjoint data. These are likely due to statistical fluctuations and appear as the sharp spikes in these plots. However, we also see that consistent performance is the norm in correlated data, whereas under-performance is the norm for uncorrelated data. The few outliers in the uncorrelated data set are due to single auxiliary glitches that happen to coincide with GW-channel events, and are statistical in nature. We also note that the distributions over efficiency-over-dead-time are very different for the two data sets. Encouragingly, the uncorrelated data appears to have a peak in likelihood near ε/f = 1, which agrees with our Poissonian interpretation. We should also note that the ordering in the uncorrelated data is generated entirely by statistical errors in the estimation of efficiency and dead-time, which is clearly visible in the error estimates for uncorrelated data shown in figure 6.

Figure 6. Comparison of OVL figure-of-merit upon evaluation using round robin binning and correlated/uncorrelated data. We see that there are many configurations that perform poorly upon round robin evaluation in both the correlated and uncorrelated data. However, we also see that the histograms of the two data sets are quite different, with clearly separated points of maximum likelihood. Furthermore, the uncorrelated data appears to have a peak in likelihood near ε/f ≈ 1, which is exactly what we expect from Poisson coincidence. The bin with ε/f = 0 corresponds to configurations that do not find any coincident GW-channel events.


3.4. Effects of algorithmic parameters

OVL constructs a set of configurations based on a list of channels, thresholds, and time windows, using all possible permutations of these parameters. In the bulk of this paper, we used time windows from the set {25 ms, 50 ms, 100 ms, 150 ms, 200 ms} and KW significance thresholds from the set {15, 25, 30, 50, 100, 200, 400, 800, 1600}. These thresholds and windows were chosen because they reflect the types of glitch significance and correlation time-scales observed in the past. We would expect that increasing the dimensionality of these configurations would allow the algorithm to better separate classes of glitches (characterized by the elements of the configuration).

Generally, we see that high auxiliary significance thresholds congregate at the top of the ordered list. This is because these events are relatively rare and are likely to be strongly correlated with GW-channel glitches. We contrast this with low-threshold events, which are relatively common and may be more strongly influenced by a statistical background. This also means that these high-threshold auxiliary glitches are used first and dominate OVL's performance at low dead-time. However, at sufficiently high dead-time, low-significance auxiliary glitches begin to dominate performance. Consider a set of veto configurations with the same auxiliary channel and significance threshold. Then, for a configuration from this set to remove more GW-channel glitches, it must widen the time window applied around auxiliary glitches. If the GW-channel glitches are clustered, this can improve efficiency, but if the GW-channel glitches are sparse, it will mainly increase fractional dead-time without removing more glitches. However, lower threshold events may be able to better select small, widely separated GW-channel glitches without increasing the time window used. Even if the correlation is weaker, this can increase overall performance. Similar effects are observed with the time windows. Small time windows typically show up at the top of the list with larger time windows appearing later and at decreased efficiency-over-dead-time.

4. Applications in instrument characterization

The OVL method we just described generates a wealth of information beyond the list of times that should be used as vetoes in a search for GW transients. This information can be mined and used for general instrument monitoring and characterization. The algorithm is also simple and efficient enough to analyze data within a few seconds of real time, without any significant impact on computational resources.

To give a flavor of the capacity of our analysis, we focus on quantities derived immediately from our statistical analysis (footnote 5). The typical output of our algorithm in the training phase is captured in table 2, where the most relevant auxiliary channels (and their corresponding parameters) are ranked using our unified criterion.

Table 2. This is an example OVL output from iteration 9 of the S6 data. We observe a nearly monotonic decrease in efficiency-over-dead-time as we proceed down the list. We also notice that several channels appear multiple times with different sets of significance thresholds and time windows. By way of definition, livetime is the number of seconds in the analysis; #gwtrg is the number of GW-channel glitches in the analysis; vchn is the auxiliary channel name; vthr is the threshold on auxiliary KW significance; vwin is the time window applied around each auxiliary glitch; deadsec is the number of seconds removed by this configuration; vexp is the expected number of coincident GW-channel glitches contained in deadsec; vact is the actual number of coincident GW-channel glitches; vsignif is the negative logarithm of the Poisson significance of observing vact coincident GW-channel glitches when we expected vexp; eff is the efficiency; fdt is the fractional dead-time; eff/fdt is efficiency-over-dead-time, OVL's ranking metric.

livetime    #gwtrg  vchn                                vthr  vwin   deadsec  vexp  vact  vsignif  eff/fdt
374014.000    2826  L1_LSC-POB_Q_1024_4096               400  0.025    0.238  0.00     6    44.50  3334.29
374013.762    2820  L1_OMC-PZT_LSC_OUT_DAQ_8_1024       1600  0.025    1.500  0.01    32   225.00  2829.42
374012.262    2788  L1_OMC-PZT_LSC_OUT_DAQ_8_1024        400  0.025    0.550  0.00    11    77.97  2683.02
374011.712    2777  L1_ISI-OMC_GEOPF_H2_IN1_DAQ_8_1024  1600  0.100    0.150  0.00     3    22.19  2693.64
374011.562    2774  L0_PEM-LVEA_SEISZ_8_128              200  0.050    0.100  0.00     2    15.11  2696.55
374011.462    2772  L1_ISI-OMC_GEOPF_H1_IN1_DAQ_8_1024   400  0.025    0.266  0.00     5    35.94  2539.06
374011.196    2767  L1_OMC-PZT_LSC_OUT_DAQ_8_1024        200  0.025    0.541  0.00     9    62.50  2250.26
374010.656    2758  L1_OMC-PZT_LSC_OUT_DAQ_8_1024        100  0.025    0.908  0.01    14    95.28  2090.54
374009.747    2744  L1_ASC-ITMY_P_8_256                  400  0.025    3.700  0.03    56   374.35  2062.93
374006.047    2688  L0_PEM-LVEA_BAYMIC_8_1024            400  0.025    1.835  0.01    25   166.23  1895.94

The identification and ranking of channels that contribute to GW-like glitches allows tracking such contributions over time, thus promptly signaling changes in the couplings between instrument channels and their environments. In figure 7 we show an example of how auxiliary channels appear in OVL's final training table (like the one in table 2). The colors reflect the significance of each channel in the corresponding veto configuration list, with dark red corresponding to high efficiency-over-dead-time and blue corresponding to low efficiency-over-dead-time. Applying simple thresholds either in the absolute significance or in the stride-by-stride changes in significance can provide useful handles in identifying and investigating changes in instrument performance.

Figure 7. Appearance of auxiliary channels shown over the single week of S6 data. Channels' relative performance is encoded in the color scheme, with dark red implying high ε/f and blue low ε/f. Simple thresholding on absolute significance or relative changes in significance can provide handles for initiating and assisting investigations on instrument performance. We should note that the 'L1' prefix on the channel names corresponds to the LIGO Livingston detector, and the last two numbers at the end of the channel name correspond to the bandwidth of that channel (in hertz).


Zooming out to more macroscopic quantities, the key summary statistics encoded in our ROC curves, efficiency and dead-time, can readily serve as prompt figures-of-merit when making data-acquisition decisions as data come in. For example, by fixing a tolerable dead-time one can monitor the efficiency of the veto algorithm as a function of time, or, symmetrically, monitor the dead-time corresponding to a veto configuration that results in a fixed efficiency. We show one such example in figure 8. In this plot, the efficiency for removing GW-like glitches from the GW channel over seven days of S6 data is shown at two fixed dead-times, 0.1% and 0.01%.
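For illustration, reading the efficiency at a fixed fractional dead-time off a day's ROC curve can be sketched as a simple interpolation; the target value below corresponds to one of the dead-times quoted above.

```python
# A sketch of monitoring: interpolate a day's cumulative ROC points
# (fdt, eff) at a fixed fractional dead-time such as 0.1% or 0.01%.
import numpy as np

def efficiency_at_deadtime(fdt, eff, target_fdt=1e-3):
    return float(np.interp(target_fdt, fdt, eff))
```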

Figure 8. Efficiency at fixed fractional dead-time over one week of S6 data. This plot shows the change in performance for a single ordered list when it is applied on different days. The abscissa shows the time difference between the training and evaluation sets and we see that there is significant fluctuation in performance, although those fluctuations appear to be bounded.


The examples we have used above mostly target identification and studies of what has been a commonly encountered feature of the first-generation GW detectors, namely, non-stationarity. Our algorithm assists the study of such variability at the single-channel level as well as at the macroscopic level, in terms of quantities that will end up being relevant in an end-to-end search (like residual singles rates before and after veto application).

We should also note that OVL can be used to conduct safety studies, in which we determine which auxiliary channels are coupled to possible GW signals. The results presented in this paper are predicated on previous safety studies, conducted via hardware injections. However, OVL need not resort to these external studies. By running the OVL algorithm over a set of events corresponding only to hardware injections, OVL will naturally select the auxiliary channels that are highly correlated with these injections. By removing these auxiliary channels and re-processing the hardware injections, we can iteratively determine which auxiliary channels are unsafe and remove them from the analysis. Clearly, the procedure should terminate when OVL's performance becomes consistent with uncorrelated data, an example of which is shown in section 3.2. These safety studies can be repeated for various types of injections in order to determine which auxiliary channels are unsafe for certain classes of expected waveforms.
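Such an iterative safety study might be sketched as follows; ovl_rank is a hypothetical stand-in for a full OVL run over the hardware-injection times, and background_level represents the ε/f expected from uncorrelated data.

```python
# A sketch of the iterative safety study described above.
def find_unsafe_channels(channels, injection_times, ovl_rank, background_level):
    unsafe = []
    while channels:
        # ovl_rank returns [(channel, eps_over_f), ...] sorted best-first.
        ranked = ovl_rank(channels, injection_times)
        best_chan, best_rank = ranked[0]
        if best_rank <= background_level:  # performance consistent with chance
            break
        unsafe.append(best_chan)
        channels = [c for c in channels if c != best_chan]
    return unsafe
```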

5. Conclusions

A variety of astrophysical sources of GWs are expected to produce short-duration signals within the sensitive frequency band of kilometer-scale ground-based interferometers, such as those operated by the LIGO, GEO600 and Virgo collaborations. Multiple terrestrial noise sources can produce transient artifacts (glitches) that resemble these short-duration signals, and their contribution to the background for searches of these types of signals is a limiting factor for search sensitivity. An active area of research involves developing procedures to remove this transient noise during analysis of GW-channel data by using information derived from the many auxiliary channels which measure the local environment and other instrumental non-GW degrees of freedom. We presented the ordered veto list (OVL) algorithm, which aims to identify the unique and most relevant correlations between GW-channel glitches and similar signals seen in auxiliary channels by use of an iterative application and ranking of auxiliary channel veto conditions.

OVL uses many veto configurations to describe the types of couplings between auxiliary channels and the GW channel. This implementation used the auxiliary channel name, a threshold on auxiliary glitch significance, and a time window surrounding auxiliary glitches. However, the algorithm can be easily extended to include other parameters such as frequency information or time of day. OVL then constructs a list of segments using these configuration parameters. The union of time windows surrounding auxiliary glitches with a sufficiently high significance is used to remove data from the GW channel. OVL then computes the configuration's efficiency and dead-time, the fractions of GW-channel glitches and live-time removed, respectively. The configurations are then ranked according to the ratio of these numbers, efficiency-over-dead-time, which has the natural interpretation as the ratio of the number of removed GW-channel glitches to the number expected from random coincidence. Furthermore, these configurations are applied hierarchically, in that once a configuration is applied, later configurations do not see any data removed by earlier configurations. Upon iteration, this repeated ranking and application finds highly correlated auxiliary channels and constructs an ordered list based on the efficiency-over-dead-times observed. It also removes any redundant or uninformative channels.

This paper presented OVL results based on the entirety of LIGO S4 data (22 February 2005–25 March 2005) from the LIGO Hanford Observatory (H1) and one week of LIGO S6 data (28 May 2010–4 June 2010) from the LIGO Livingston Observatory (L1) using KleineWelle glitches. These two data sets represent very different noise sources and couplings between auxiliary channels and the GW channel. Therefore, the common trends we observe in the two data sets are indicative of OVL itself, rather than any peculiarities within a data set.

We examined OVL's performance using standard receiver operating characteristic (ROC) curves, which show the fraction of noise signals we are able to remove as a function of the fractional loss in live-time incurred. We see that the bulk performance is gained after two iterations, and this improves over the application of configurations based solely on their individual performance when applied to the entire data set. Further iteration helps to compress the useful information in the ordered list into a small number of configurations. This helps isolate exactly which correlations are significant and makes the list easier to read. We find that while the ROC curve only marginally improves between two and nine iterations, the ordered list is only 1–2% as long after nine iterations.

Finally, we point out a few possible applications of OVL to instrument characterization, including tracking and accounting for instrument non-stationarity and identification of subsystems containing channels that are highly correlated with GW-channel glitches.

Acknowledgments

RE and EK gratefully acknowledge the support of the National Science Foundation and the LIGO Laboratory. LIGO was constructed by the California Institute of Technology and Massachusetts Institute of Technology with funding from the National Science Foundation and operates under cooperative agreement PHY-0757058. LB is supported by an appointment to the NASA Postdoctoral Program at Goddard Space Flight Center, administered by Oak Ridge Associated Universities through a contract with NASA. The authors would also like to acknowledge comments and feedback received from members of the Bursts, Compact Binary Coalescences and Detector Characterization working groups of the LIGO Scientific Collaboration and the Virgo Collaboration, as well as the entire LIGO Scientific Collaboration and Virgo Collaboration for access to these data. This paper has been assigned LIGO document number LIGO-P1300038.

Footnotes

3. KW significance is a measure of the likelihood of observing a signal with greater or equal signal energy assuming a Gaussian noise distribution.

4. Knowledge of the couplings between instrument channels may provide guidance in defining lists of channels that could make better or worse vetoes.

5. The individual raw glitches used as input, as well as the time series of vetoes produced, can also be mined for even richer statistical information for instrument characterization. This includes analysis of Poissonianity, frequency content, and the specific role they may play in glitches in the GW channel of certain strength, frequency or waveform morphology.
