Abstract
A low-frequency gravitational-wave background (GWB) from the cosmic merger history of supermassive black holes is expected to be detected in the next few years by pulsar timing arrays. A GWB induces distinctive correlations in the pulsar residuals—the expected arrival time of the pulse less its actual arrival time. Simplifying assumptions are made in order to write an analytic expression for this correlation function, called the Hellings and Downs curve for an isotropic GWB, which depends on the angular separation of the pulsar pairs, the gravitational-wave frequency considered, and the distance to the pulsars. This is called the short-wavelength approximation, which we prove here rigorously and analytically for the first time.
Export citation and abstract BibTeX RIS
Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
1. Introduction
Gravitational waves (GWs) are ripples in the fabric of space-time, originating from some of the most violent events in the Universe, including the mergers of supermassive black holes. High frequency GWs from the merger of stellar-mass black holes were first detected by the Laser Interferometer Gravitational-wave Observatory (LIGO) in September 2015 [1], hailing the dawn of gravitational-wave astronomy. However, LIGO can only detect high frequency GWs, in the 100–1000Hz range. Similarly to electromagnetic radiation, different GW detectors are needed to probe different GW frequencies. Currently there are plans to launch a space-based GW detector in 2034—the Laser Interferometer Space Antenna (LISA) [2]—which will probe the millihertz GW frequency regime, thought to be populated primarily by merging supermassive black holes (SMBHs) in the 105–106 M⊙ range. At the very low-frequency end of the GW spectrum, one expects to find nanohertz GWs from very massive inspiraling SMBHs, in the range. These can be detected by timing millisecond pulsars, called a Pulsar Timing Array (PTA) [3–6]. Millisecond pulsars are excellent clocks, and delays or advances in their arrival times—inducing a timing residual—could signal the presence of GWs. PTA experiments are very active, and have been taking data for over a decade [7–9]. With a PTA, one can detect not only GWs from inspiraling SMBH binaries (SMBHBs), see e.g. [10, 11], but the GW background (GWB) from the cosmic merger history of SMBHBs [12–14]. This GWB is expected to be detected in the next few years [15, 16], with the details depending on the underlying astrophysics of the SMBH mergers. More details on PTAs can be found in recent review articles, e.g. [17–20], and an outline GW astrophysics covering nanohertz to kilohertz frequencies can be found in [21].
Indeed, a rigorous exploration and examination of the tools which will be used to make the first detection of a GWB is crucial. An isotropic GWB will induce characteristic correlations in the pulsar timing residuals. By cross-correlating these residuals, one expects to see a characteristic correlation called the Hellings and Downs curve [5]. Deviations from an isotropic GWB can be induced by nearby and/or particularly loud SMBHBs, inducing anisotropy in the GWB. Anisotropic GWBs will induce different correlations patterns, and have been explored by [22–26].
Here we prove analytically, and for the first time, that the Hellings and Downs curve can be extracted from the cross-correlated pulsar residuals, without making assumptions that the pulsars are all at the same distance L from the Earth. Part of this proof is a consequence of the application of the Riemann-Lebesgue lemma and the Lebesgue Dominated Convergence Theorem—well-known in the mathematics community, but less well-known in the field of GWs. We emphasize that no previous work has been able to do this analytically, though computer-aided integration has been used to verify one's intuition numerically.
2. The characteristic strain
The International PTA (IPTA) published combined data on 49 millisecond pulsars in their first data release [14]. These millisecond pulsars are the most stable natural astrophysical clocks known [27], and are regularly monitored by 8 radio telescopes: 5 in Europe [8], 2 in North America [7] and one in Australia [9]. PTAs take advantage of the precise arrival times of millisecond pulsars to enable GW detection.
The GWB is described in terms of its characteristic strain, hc(f), with amplitude A at a reference frequency of 1/yr (e.g. [28]):
The current upper limits on A are difficult to compare, since it was recently discovered that errors in planetary masses and positions (called the solar system ephemeris model) can directly affect the limit on A [12, 29], and in some cases mimic a GWB signal.
While the current upper limit on A from NANOGrav can take this into account, and limit A < 1.35 × 10−15, other PTAs have not yet published updates to their limits. Projections the characteristic strain accessible with future IPTA and Square Kilometer Array (SKA) [30–32] detectors are shown in figure 1.
The observed residuals due to the presence of a GWB with characteristic strain hc(f) is described by the cross-power spectral density of pulsar 1 and pulsar 2 by
see e.g. [34], where Γ1,2 is the so-called overlap reduction function, which describes the GWB-induced correlation signature in the pulsar residuals. This is a function of the frequency of the GWB, the distance to the pulsars L1,2, and the angular separation of the pulsars, ζ. PTA geometry is explored in detail in figure 2. For an isotropic GWB, this is called the Hellings and Downs curve [5], and for anisotropic GWBs see [22–25].
Download figure:
Standard image High-resolution image3. The Hellings and Downs curve
In analogy with [22, 24], we present an overview of how one arrives to the Hellings and Downs curve. A source of GWs in direction , see figure 2, generates a metric perturbation , which we describe as a plane wave:
This can be decomposed over two polarization tensors , and two independent polarization amplitudes [35, 36]:
We note that General Relativity predicts only two independent polarizations, plus +, and cross ×, while other theories predict additional polarizations, such as breathing modes [37–39]. He we restrict ourselves to the well-known tensor polarizations, A = +, ×.
The polarization tensors are uniquely defined by specifying and —the GW principal axes, illustrated in figure 2:
For a stationary, Gaussian, and unpolarized GWB, the polarization amplitudes satisfy (see e.g. [40]):
The metric perturbation will change the proper distance between the Earth and the pulsars, inducing an advance or delay in the pulsar pulse's arrival time at the Earth. Consider for example a millisecond pulsar with frequency ν0 whose location in the sky is described by , at a distance L from the Earth. The metric perturbation affects the frequency of the radio pulses, ν, received at the radio telescope. This frequency shift is given by
where
is the difference between the GW-induced metric perturbation at the Earth , the Earth term, with coordinates , and at the pulsar , the pulsar term, with coordinates :
The indices 'e' and 'p' refer to the Earth and the pulsar, however, it is standard write te = t, see e.g. [22, 24, 41, 42].
We can now write (3.6), using (3.1) and (3.2) as
The fractional frequency shift, z(t), produced by a stochastic GWB is simply given by integrating equation (3.5) over all directions. Using (3.1) and (3.8), we obtain:
where are the antenna beam patterns for each polarization A, which we write as
Searches for the GWB rely on looking for correlations induced by GWs in the timing residuals of pulsar pairs. Indeed, the observed quantity in PTA experiments is the timing residual , which is simply the integral of equation (3.9) in time:
The expected value of the correlation between a residual from pulsar 1 at time tj, with that from a different pulsar, say pulsar 2 at time tk, depends on terms of the form:
where H(f) contains the information of the spectrum of radiation. In analogy with [22, 23, 36], we define the quantity above that depends on the angular separation of the pulsars, ζ, their distances from the Earth, L1, L2, and the GW frequency f, as the overlap reduction function
where
In order to write a closed-form, analytic solution to (3.15), we choose a reference frame where one pulsar is placed along the z-axis and the other in the x-z plane as seen in figure 2. Specifically, we write
We remind the reader that and are the unit vectors pointing to pulsars 1 and 2, respectively, is the direction of GW propagation and and are the GW principal axes, see figure 2. Note that in this reference frame by (3.11), making it a convenient choice.
For an isotropic GWB one is free to choose whichever coordinate system is most convenient, as was done here. However, one must be more careful when considering reference frames which are used to describe pulsar locations in an anisotropic GWB, as was done by [22].
4. Main results
We choose the coordinate system defined in equation (3.17), and apply it to equation (3.15). The result is equations (4.1) and (4.2).
Claim. Let , be real positive constants. Then, for each , as and , we have
except when and , a case covered in [24]. Here,
Note that the above integrals are now written in terms of the coordinate system constructed in equation (3.17), and illustrated in figure 1, which was applied to (3.15) and (3.16).
Until now, one was only able to show this result by picking some values of pulsar distance 1, L1, and pulsar distance 2, L2 and solve (4.1) numerically assuming some GW frequency f. In the literature, e.g. [22, 24], the authors invoke the reader's physical intuition to support the numerical result—that if the exponents in (4.1) are large, , these oscillatory pieces rapidly converge to zero. This is often referred to as the 'short wavelength approximation', and has been used without proof, which we will now provide.
4.1. Proof of claim
To prove this result, we estimate each of the four integrals (4.4)–(4.8) below which make up (4.1) separately. We apply the Lebesgue Dominated Convergence theorem (see appendix), Fubini's Theorem, and the two-dimensional Divergence theorem to get the required limiting value (4.1). A key result used in the proofs which follow is a variant of the Riemann-Lebesgue Lemma in harmonic analysis (see [43], p. 277 and [44], p.2): let a, b be finite (though this is not necessary). Then for a Lebesgue integrable function f (a comprehensive definition and examples of this are given in appendix) over [a, b],
The aforementioned Dominated Convergence Theorem, equation (A.1), basically gives us conditions under which we can interchange the operation of taking the limit of an integral with the integral of the limit.
First, we show that K2(ζ, ϕ, θ) can be made continuous—and so absolutely integrable over its domain of definition—for all values of θ ∈ [0, π], ζ ∈ (0, π], and ϕ ∈ [0, 2π].
We use the identity
to show that the denominator of (4.2), i.e.,
for all θ ∈ [0, π], ζ ∈ (0, π], and ϕ ∈ [0, 2π]. It follows that the only singularities of K2 must occur when the denominator vanishes, and this occurs precisely when and , since both these quantities are necessarily non-negative. This, in turn, implies that for given ζ, θ = π − ζ and ϕ = π or ζ = 0, ϕ any, or ζ = π, ϕ any. Each of these cases is handled by limiting arguments.
For example, we note that
and
The previous equation gives a zero limit as ζ → 0+ and is otherwise finite. It follows from this that K2 can be defined to be a continuous function for any given value of ζ ∈ (0, π] and all values of θ ∈ [0, π] and ϕ ∈ [0, 2π]. Thus K2 is Lebesgue integrable over the region [0, 2π] × [0, π]. Next,
the last of which is identical to the required integral, (4.16). Note that each of the previous four integrals is necessarily finite since the region of integration is finite and K2 is absolutely integrable over it.
Write . Now we treat each of the previous three integrals (4.5)–(4.7) separately, and fix ζ ∈ (0, π].
4.2. Equation (4.5) tends to zero
Here we show that the first of the three equations with the exponential pulsar terms, (4.5), tends to zero. Using the above notation, and using the fact that K2 is absolutely integrable over , Fubini's theorem on the interchange of iterated integrals yields the equality,
which, after the change of variable , gives us,
where in the new variables is still absolutely integrable. Next, since is absolutely integrable over its domain, the ordinary two-dimensional version of the Riemann-Lebesgue Lemma, equation (4.3), implies that
Since the previous integral is itself , the Lebesgue Dominated Convergence theorem (A.1) can be used to interchange the order of the limit and the integral. We find:
Thus, (4.5) tends to zero as .
4.3. Equation (4.6) tends to zero
Preamble. In order to extend the previous idea to more general exponents, we apply integration by parts to double integrals via the Divergence Theorem. In order to prove either (4.6) or (4.7) it suffices that we obtain the decay estimates or as or . What follows is the general idea which we then apply to the various cases. We need to estimate limits of the form
as . (Note that we used Fubini's theorem to justify the interchange of the order of integration in the iterated integral (4.6)). Here along with its boundary (or perimeter), , are completely contained in and are chosen so that on and inside (which necessarily has no points in common with ). By construction, the gradient of , does not vanish on and therefore the quantity
is well-defined on .
We need to estimate the integral in (4.9) for large ω. First, observe that (suppressing the variables for clarity of exposition)
where we assume, in addition, that f is sufficiently smooth so that is defined. Since we have which when inserted into the previous display and integrated over yields,
An application of the divergence theorem to the integral on the left gives us,
where is the unit normal to , itself oriented in the positive direction, and σ is arc length. Combining the two previous displays we get,
Once we know that both integrands are absolutely integrable over and respectively, we get or as , over . The results (4.6) or (4.7) are obtained by a careful limiting analysis of the case where approaches which then gives us the desired decay estimate.
Proof, Case 1. . Set , in (4.9). Then so that if and only if and . For this yields the eight (8) critical (or stationary) points
all of which are located on the perimeter of . Since we want to be critical-point-free, for given , choose
and its perimeter,
Then, by construction, in as well as on its perimeter, Defining as in (4.10) we then obtain (4.11) for suitably smooth functions , i.e., on , for every . Now set and note that both integrals in (4.11) are finite on their respective region of integration. Thus, for every ,
i.e., so taking the limit as ε approaches zero, we must have
All that remains to be shown is that the interchange of the limits in the next expression is justified, i.e.,
as the right hand side of (4.14) is necessarily equal to (4.6) and so must vanish as well by (4.13), which is what we set out to prove. To this end, we note that, by continuity of the integrals,
and that, in fact, the convergence is uniform in ω, for .
Indeed, observe that
and this last integral may be made arbitrarily small, independently of ω, if ε is sufficiently restricted. Hence the convergence in (4.15) is uniform in ω (actually for any but we only require this for ω on the half axis, ).
We are now in a position to apply a fundamental theorem on the interchange of such limits (see citepf, p. 395) to validate the equality in (4.14) and complete the proof in the case where
Case 2. . In this case is independent of ϕ and the resulting double integral can be handled in a similar way as (4.5), the only difference being the presence of a negative sign in the exponent. This, however, causes no difficulty with the argument in that section and so we omit the details.
4.4. Equation (4.7) tends to zero
We use the same basic technique as in the proof of (4.6). The proof of the limiting result for (4.7) can be obtained by reducing it to the case of (4.6) just proved. For example, (4.7) may be rewritten in the form
where now it is that is absolutely integrable over , since K2 is and the exponential term has modulus equal to one. So, it follows from the methods above leading to (4.6) approaching zero as that (4.7) also tends to zero as . Similarly, interchanging the μ and λ terms in the preceding integral we obtain that (4.7) tends to zero as as well.
4.5. Final result: the Hellings and Downs curve
We have shown that for an isotropic GWB, and for , that the pulsar terms tend to zero. We can now write down the final form of the overlap reduction function: the 'Hellings and Downs' curve [5]:
for by [5, 22, 41]. Several comments should be made about equation (4.16) regarding a choice of normalization for the Hellings and Downs curve, the failure of the short-wavelength approximation, and the subsequent approximation of the pulsar term by a delta function for the autocorrelation.
When evaluating the autocorrelation, one can easily see that the value of the overlap reduction function is . Indeed, it is a choice of normalization to set the autocorrelation equal to one when , which requires that equation (4.16) be multiplied by .
Next, one will note the term: this takes into account the failure of the short-wavelength approximation when evaluating the autocorrelation term. We approximate this is a delta function, however, in appendix C of [25], equation C4, it is shown that the exact solution is
where we adopt natural units (c = 1). Here is a spherical Bessel function of the first kind. Since the oscillatory piece is suppressed by a factor of , approximating the pulsar term by a multiplicative factor of 2 for the autocorrelation case is justified.
Mingarelli and Sidery 2014 [24] also explored analytic expressions for the autocorrelation, and found that the term, equation (3.16), can be well approximated as , where . They showed numerically that the term contributed very little for large values of fL.
We note that for small values of ζ and fL, there is an intermediate regime where one requires more than to approximate . This case was explored in detail by Mingarelli and Sidery 2014 [24], who found that there are strong additional correlations from the pulsar term when the pulsars are separated by less than one gravitational wavelength. The authors also gave first and second order corrections for these cases.
5. Discussion and conclusion
We have shown analytically that the Hellings and Downs curve approaches the Earth-term only solution, even when the pulsars are arbitrarily distant from the Earth, and not themselves at the same distance L from the Earth. Of course, the case when is easily recovered, since in this case, and all terms (4.5)–(4.7) approach zero as the common value of this parameter fL approaches infinity.
The proofs indicate that the asymptotic estimate, equation (4.1), holds for sufficiently smooth kernels that are absolutely integrable over the region , and not just kernels of the form as considered here.
The astrophysical interpretation of this result is that if one monitors any galactic millisecond pulsar, and cross-correlates it with a pulsar in e.g. the Large Magellanic Cloud, the Hellings and Downs curve would still be correct correlation function to use, under the assumption that the GWB is isotropic. Anisotropic GWBs can be handled similarly, but care is required when evaluating the new kernel.
To summarize, we have shown that for pulsars at distances L1 and L2 from the Earth, that the pulsar terms tend to zero as the . The asymptotic estimate (4.1) is false if is fixed as there is no reason for the integral (4.6) to tend to zero as , since it is independent of . While this result is consistent with the previous intuition developed in the field of nanohertz GW astronomy, and indeed verified numerically for a few values of fL in [24], it has never before been proven analytically or generally for any . This result is an important validation of a fundamental result in the field, and lends credibility and rigor to current GWB searches as we enter detection era in nanohertz GW astronomy.
Acknowledgments
The authors thank Yacine Ali-Haïmoud for useful comments and a thorough reading of this manuscript. Figure 1 was generated in part with the online tool 'gwplotter'[32]. The Flatiron Institute is supported by the Simons Foundation.
: Appendix A. Overview of Lebesgue integration
Here we give a brief overview of the Lebesgue integral on the real line, , see ([45]), Chapter 10. Readers familiar with this concept may proceed immediately to the Main Results in section 3 and their proofs.
One of the great advantages of the Lebesgue integral over the classical (Riemann) integral is in the handling of limiting processes such as limits of integrals of sequences of functions which, in the classical case, usually requires uniform convergence of the sequence in question—this is not the case for the Lebesgue integral.
Fundamental to this now-standard integral in mathematics is the notion of Lebesgue measure. One intuitive notion of measure assigns the value b − a for the length of an interval or, more generally, , for the area of a plane rectangle formed by the Cartesian product of the two intervals and . The notion of measure allows one to extend this property (length, area, volume, etc) to Lebesgue measurable sets. Aside traditional examples such as rectangles, unions of disjoint intervals, etc one can consider the set of rational numbers on the real line, or its complement, the set of irrational numbers there. Both the latter two sets are Lebesgue measurable, the former has Lebesgue measure zero while the latter has positive Lebesgue measure.
More precisely, a set is said to have Lebesgue measure zero if for any given there is a sequence of intervals , n = 1, 2, ... such that E is contained within the union of all these intervals where, in addition, . For example, the set of all rational numbers has measure zero.
One of the connections between the Lebesgue theory and the ordinary Riemann theory of the integral is the following result: If a function f is bounded on an interval then f is Riemann integrable over if and only if the set of its discontinuities is a set of measure zero. By its very definition, the Lebesgue integrability of f forces that the absolute value of the function, , in question be Lebesgue integrable, which is not so in the case of the Riemann integral. Still, the advantage in using Lebesgue integrals is huge in that we can extend the class of functions and sets over which we are integrating and still get meaningful results.
Now we offer an extremely brief introduction to the Lebesgue integral, with a few key definitions and results. Briefly, with the Lebesgue integral we subdivide the range of a given function f, look for those parts where horizontal lines intersect the graph of f and then drop rectangles onto the domain of f. (Recall that we subdivide the domain of f in the case of the Riemann integral).
More generally, the measure of a set is the greatest lower bound (see e.g. [45], p. 11) of the set of all numbers of the form where the union of all the intervals contains E. A set E is said to be Lebesgue measurable if it has finite measure. A function f is said to be measurable if the special set is measurable for every real number a. These are precisely the functions that we can integrate in the Lebesgue sense, i.e., the Lebesgue integrable functions over E. First, the integral is defined for simple functions (i.e., those functions whose range is a finite set of points). Then, using the fact that for a given measurable function there is a monotonically increasing sequence of simple functions that converges to and whose integrals, , are bounded by a constant C (that depends on f), we define the Lebesgue integral of f over E by
Finally, when f is a general measurable function (i.e., not necessarily non-negative) we define its integral using its decomposition into positive and negative parts, that is, we know that where . Thus, the Lebesgue integral of a general measurable function f over E is by definition,
It is then an easy matter to see that
The Lebesgue Dominated Convergence theorem states that if is a sequence of measurable functions with almost everywhere on E. (This means that at all points x except for a set of measure zero.) In addition, let be Lebesgue integrable on E with then
This theorem is the main one being used in this paper in order to guarantee convergence of the various integrals. A similar theory and similar results hold in 2 or more dimensions.
The space is by definition the space of all complex valued integrable functions f such that
In addition, the space is the space of all such functions f for which there exists a constant, C, depending on f, such that almost everywhere on .