Brought to you by:

Articles

ALL-SKY INTERFEROMETRY WITH SPHERICAL HARMONIC TRANSIT TELESCOPES

, , , , and

Published 2014 January 8 © 2014. The American Astronomical Society. All rights reserved.
, , Citation J. Richard Shaw et al 2014 ApJ 781 57 DOI 10.1088/0004-637X/781/2/57

0004-637X/781/2/57

ABSTRACT

In this paper, we describe the spherical harmonic transit telescope through the use of a novel formalism for the analysis of transit radio telescopes. This all-sky approach bypasses the curved-sky complications of traditional interferometry and so is particularly well-suited to the analysis of wide-field radio interferometers. It enables compact and computationally efficient representations of the data and its statistics, which allow new ways of approaching important problems like map-making and foreground removal. In particular, we show how it enables the use of the Karhunen-Loève transform as a highly effective foreground filter, suppressing realistic foreground residuals for our fiducial example by at least a factor 20 below the 21 cm signal, even in highly contaminated regions of the sky. This is despite the presence of the mode-mixing inherent in real-world instruments with frequency-dependent beams. We show, using Fisher forecasting, that foreground cleaning has little effect on power spectrum constraints compared to hypothetical foreground-free measurements. Beyond providing a natural real-world data analysis framework for 21 cm telescopes now under construction and future experiments, this formalism allows accurate power spectrum forecasts to be made that include the interplay of design constraints and realistic experimental systematics with 21st century 21 cm science.

Export citation and abstract BibTeX RIS

1. INTRODUCTION

Mapping the universe with the 21 cm line of neutral hydrogen will revolutionize our view of the universe. It holds the promise of unraveling the mysteries of dark energy (Chang et al. 2008; Loeb & Wyithe 2008), unveiling the epoch of reionization (Furlanetto et al. 2006), and perhaps even extending our view of the cosmos out far enough to shine light on the primordial dark ages (Loeb & Zaldarriaga 2004). Rapidly probing large volumes of the universe requires new large wide-field telescopes along with powerful new digital processing hardware such as GMRT,4 LOFAR,5 MWA,6 MITEoR (Zheng et al. 2013), PAPER,7 BAOBAB,8 BAORadio,9 BINGO (Battye et al. 2013), CHIME,10 EMBRACE/EMMA (Kant et al. 2011), and Tianlai,11 In recent years it has become increasingly clear that new methods of interpreting and analyzing the data from these revolutionary new instruments will be necessary to realize their scientific potential (Myers et al. 2003; Tegmark & Zaldarriaga 2009; Parsons & Backer 2009; Liu et al. 2010; Liu & Tegmark 2011; Parsons et al. 2012; Dillon et al. 2013).

We describe here the spherical harmonic transit telescope, a new paradigm for analyzing wide-field transit telescopes in the spherical harmonic domain, which is naturally suited to mapping the 21 cm universe. Any telescope with fixed pointing observes the sky transit through its field of view. The periodic rotation about the poles over the course of a sidereal day creates a linear correspondence between time, t, and azimuthal angle, ϕ. We obtain a simple mapping between the observed data and a linear combination of the spherical harmonic coefficients alm of the sky at fixed angular wavenumber m, mediated by the angular response of each element. In the following, we elaborate and make precise this basic idea in the context of wide-field interferometers, including the radial (frequency) direction.

This formalism diverges sharply from traditional characterizations of radio interferometry, which are better suited to observations with a narrow field of view, often assuming tracking of a particular source of interest, and exploit Fourier transform mapping between the sky and the uv-plane. What we describe here is an all-sky formalism for describing interferometry that naturally incorporates the observable modes on the sky—the spherical harmonics. Previous work has described interferometry on the full sky (Kim 2007; McEwen & Scaife 2008), however, in the domain of transit telescopes there are many computational advantages we can exploit.

The foremost challenge for any 21 cm mapping experiment is separating the cosmological signal from astrophysical contaminants which are 103–105 times larger (Furlanetto et al. 2006; Morales & Wyithe 2010). Conceptually this is simple—the primary foreground sources (diffuse synchrotron emission from the Galaxy and emission from extragalactic point sources) are smooth as a function of frequency, while the 21 cm signal decorrelates quickly as each frequency corresponds to a different radial slice of the universe. To remove foregrounds one just needs to model and remove the smooth frequency component from their observations. Unfortunately, in practice, the large dynamic range between the amplitude of the foregrounds and the 21 cm signal makes several real-world effects extremely problematic. While the properties of the cosmological 21 cm signal are thought to be well understood, the astrophysical foregrounds are poorly constrained at the small angular and frequency scales that will be probed by forthcoming 21 cm experiments—a successful technique should be robust to uncertainties in foreground modeling. Of course, these experiments will help characterize the properties of real-world foregrounds. More troublesome is the phenomenon of angular-frequency mode-mixing: in any real experiment the shape of the beam on the sky will vary with the observed frequency (Liu et al. 2009). This mode-mixing makes simple frequency-only foreground methods ineffective in practice. We show below that the spherical harmonic transit telescope formalism can naturally address the issues of model uncertainty and mode-mixing, and enable efficient and effective discrimination of the 21 cm signal from obscuring foregrounds.

Any foreground removal method aims to find a subset of the data within which there is significantly more 21 cm signal than astrophysical foregrounds. However, in the presence of mode-mixing, it is not obvious how to select a basis which separates the two components—what we would like is a method which can automatically generate this. Just such a technique exists in the form of the Karhunen-Loève (KL) transform. In this paper, we show how the m-mode formalism, described below, makes the use of the KL transform computationally feasible. The result is a remarkably effective and robust filter for rejecting bright foregrounds and we demonstrate its effectiveness using realistic simulations of the radio sky and a simple fiducial interferometer configuration.

In Section 2, we introduce the all-sky formalism that is the basis for this technique. In Section 3, we discuss the map-making process in the spherical harmonic transit telescope paradigm. In Section 4, we discuss how to best represent statistics of the cosmological 21 cm signal and foregrounds in the measurement basis, and in Section 5 we discuss how the KL transform can be used to detect faint signals in the presence of bright foregrounds. In Section 6, we quantify the information lost due to foreground removal using Fisher Analysis, and estimate errors on power spectra. We present our conclusions in Section 7. In Appendix A, we discuss the signal and foreground models we employ. In Appendix B, we describe how we create realistic simulations of radio emission.

2. FORMALISM

In this section, we introduce the m-mode formalism, a new description of the measurement process for transit interferometers.

In radio interferometry, a visibility Vij is the instantaneous correlation between two feeds Fi and Fj. We will assume that we can take a linear combination of the signal from a dual polarization antenna, with no cross-polarization or polarization leakage, such that we are sensitive only to the total intensity (Stokes I) part of the sky. The fully polarized extension to this work is also a tractable problem, we address this in a subsequent paper, J. R. Shaw et al. (in preparation; for a treatment for a non-transit telescope, see Kim 2007). At any instant, a visibility is given by

Equation (1)

where $\boldsymbol{u}_{ij} = (\boldsymbol{r}_i - \boldsymbol{r}_j) / \lambda$ is the spatial separation between the two feeds divided by the observed wavelength (that is the separation in the uv-plane), $\hat{\boldsymbol{n}}$ is the position on the celestial sphere, and $A_i(\hat{\boldsymbol{n}})$ gives the primary beam of feed i. In the above we have normalized our visibilities such that they are temperature-like, and we have defined them in terms of the brightness temperature T = λ2I/2kb instead of the total intensity I. The quantity $\Omega _{ij} = \sqrt{\Omega _i \Omega _j}$ is the geometric mean of the individual beam solid angles

Equation (2)

which also gives the effective antenna area AeffΩ = λ2. This ensures that for a sky with uniform brightness temperature T, the auto-correlation of an antenna Vii = T with our definition.

As the Earth turns, both the primary beams and the baseline separations rotate relative to the celestial sphere. This means that the measured visibilities change periodically with the sidereal day. We take this into account by explicitly including the dependence on the azimuthal angle ϕ and by averaging over each sidereal day.

The measured visibilities are also corrupted by instrumental noise for which we add a noise term nij(ϕ). We assume the noise is stationary such that its statistics are independent of ϕ. Rewriting Equation (1) in terms of a transfer function Bij leaves the measured visibility as

Equation (3)

where the transfer function is

Equation (4)

Taking advantage of the periodicity in ϕ, we Fourier transform the system

Equation (5)

Equation (6)

where to proceed to the second line we have inserted the spherical harmonic expansions of both the sky and the beam transfer function

Equation (7)

Equation (8)

Note that we have defined $B^{ij}_{lm}$ relative to the conjugate spherical harmonic in order to simplify later notation. As the ϕ dependence simply rotates the functions about the Earth's polar axis, the transfer function at any ϕ is trivially $B^{ij}_{lm}(\phi) = B^{{ij}}_{lm}(\phi \!=\!0) e^{i m \phi }$. Combined with the exponential factor in the integral, this simply generates the Kronecker delta $\delta _{mm^{\prime }}$, and we find

Equation (9)

This gives a simple description of how the observed sky maps into the measured data given a telescope design (which is contained in the beam transfer matrices $B^{ij}_{lm}$). This transformation does not mix m-modes on the sky, and can therefore be performed on an m-by-m basis—for any particular m and frequency ν, the measured visibilities are simply a projection of the l-modes on the sky for the measured m. As the optical system is of a finite size, this limits both the l and m to which the telescope is sensitive, ensuring we only need to consider a finite number of degrees of freedom, both measured (Vm) and on the sky (alm).

In fact, while the positive and negative m-modes may be independent measurements, they are still observations of the same sky—by transforming the conjugate $V_{-m}^*$ and using that $a_{lm} = a^*_{l,-m}$ for a real field we see that

Equation (10)

In light of this, we will change our notation such that we are considering only the actual degrees of freedom on the sky. Let us separate out the positive and negative m parts by defining

Equation (11)

Equation (12)

which is valid for m ⩾ 0. Additionally, to prevent double counting the m = 0 measurement, we need to set $B_{l 0}^{ij -} = n_0^{ij -} = 0$. This gives a modified version of Equation (14)

Equation (13)

For brevity of notation, we will introduce a label α which indexes both the positive and negative m parts of all of the included feed pairs ij, such that any particular α specifies exactly the values of ij, ± (exactly how α is packed is unimportant). This gives

Equation (14)

The beam transfer matrices above can be written in an explicit matrix notation

Equation (15)

where the row index labels all combinations of baseline (α) and frequency (ν), whereas the column index is over all multipoles (l) and frequencies (ν'). Similarly we can define vectors for the visibilities and harmonic coefficients

Equation (16)

From here onward we will drop the subscript m denoting the spherical harmonic order, all the equations below are valid for any m. This allows us to rewrite Equation (14) as

Equation (17)

This simple linear description of the measurement process of a transit telescope is extremely powerful. By reducing it down to a linear mapping between a finite number of degrees of freedom, it allows us to apply the standard tools of signal processing. In the subsequent sections we apply it to solve two challenging problems in 21 cm radio astronomy.

3. MAP-MAKING

In astronomy, being able to transform our measured signal into an accurate map of the sky is essential. Whilst in this paper we explicitly avoid this process for our analysis, preferring to carry it out directly in the data space, maps are still needed for visualization and cross-checking. Map-making with interferometric data is generally a complicated process performed by algorithms such as CLEAN (Högbom 1974) and its derivatives. This is especially true with wide fields of view where mosaicking and w-projection are generally required. However, the m-mode formalism makes the map-making process on the full sky conceptually simple.

First, we assume that the instrumental noise n follows a complex Gaussian distribution with covariance ${{\bf N}}= \langle \boldsymbol{n}\boldsymbol{n}^\dagger \rangle$, and the different frequency channels are independent.12 For stationary noise, the m-modes are uncorrelated and the likelihood function of the observed sky for a single m and frequency ν is

Equation (18)

where the vector a contains all of the harmonic coefficients for the given m.

To estimate the sky corresponding to a given set of visibilities, we will look for a maximum likelihood solution $d p / d \boldsymbol{a}= 0$. In particular, we want to find the value of a that minimizes

Equation (19)

The matrix N−1/2 represents any factorization such that (N−1/2)†N−1/2 = N−1. Provided N contains no noiseless modes, it is positive-definite and so this factorization must exist. The maximum likelihood solution is given by the Moore–Penrose pseudo-inverse 13

Equation (20)

where the superscript + denotes the pseudo-inverse.

Depending on the number of baselines measured and the maximum l we are sensitive to, the problem may either be over- or under-constrained. In either regime, the Moore–Penrose pseudo-inverse gives a solution; in the former case this reduces to the standard map-making equation $\hat{\boldsymbol{a}} = ({{\bf B}}^\dagger {{\bf N}}^{-1} {{\bf B}})^{-1} {{\bf B}}^\dagger {{\bf N}}^{-1} \boldsymbol{v}$, and in the latter case selects the solution which also minimizes $\left|\hat{\boldsymbol{a}}\right|^2$, effectively setting the unconstrained degrees of freedom to zero.

As both distinct frequencies and m-modes are independent, map-making for a set of full sky observations is a case of collating the estimates for each individual ν and m.

4. TWO-POINT STATISTICS

For intensity mapping experiments, our data has three components: the 21 cm signal which we are ultimately trying to extract, the foregrounds, and instrumental noise. Understanding the two-point statistics of the data is of paramount importance to our analysis—not only do the correlations of the signal encode most of the cosmological information that we are interested in (see Appendix A), but to efficiently extract this we require knowledge of the two-point statistics of all three components. Here we write the linear relationship between these two-point statistics of the data, and how they are related to the underlying physical correlations.

The statistics of instrumental noise live in the visibility space, the basis of our measurements. However, the other components are naturally represented on the sky, and must be projected into this space using Equation (14). The lowest non-zero moment of the visibilities is their covariance

Equation (21)

This is the covariance between all measured degrees of freedom: baselines, frequencies, and m-modes. For the experiments listed in Section 1, we expect ≳ 103, ∼102, and 103, respectively. This gives matrices of dimension ≳ 108, too large to be tackled with current technology, both in terms of computation and storage.

Instead, let us make an approximation that will dramatically reduce this complexity. If we think of the sky as a statistically isotropic random field, its two-point statistics become dramatically simpler

Equation (22)

and importantly, they are automatically uncorrelated in the m index. This means that the full signal covariance Equation (21) is block diagonal and thus allows us to calculate all statistics on an m-by-m basis. For a specific m-mode

Equation (23)

where we have dropped all the m-indices, and N is the power spectrum on the instrumental noise. As the number of m-modes we are sensitive to is usually ≳ 103, assuming statistical isotropy saves at least a factor of a million in computation and a thousand in storage. This ability to efficiently perform calculations incorporating the full statistics opens up new avenues for the data analysis of transit instruments. Synchrotron emission from our Galaxy clearly violates this assumption of statistical isotropy though, as we will demonstrate, this does not appear to affect our analysis and in particular our ability to clean foregrounds.

In matrix notation

Equation (24)

where we will split Csky into independent 21 cm signal and foreground parts Csky = C21 + Cf.

The statistical models used for each component are chosen to be appropriate for the frequency ranges of interest. In the fiducial example that follows this is 400–600 MHz, corresponding to z ∼ 1–2 for the cosmological signal. These are described in Appendix A.

5. FOREGROUND REMOVAL WITH THE KARHUNEN-LOÈVE TRANSFORM

To clean our data we simply aim to find a subset within which there is significantly more 21 cm signal than astrophysical foregrounds. However, in the presence of mode-mixing there is no immediately apparent representation in which to perform this. This basis can be found using the KL transform (often called the signal-to-noise eigendecomposition), which has a long history in cosmology (e.g., Bond 1995; Vogeley & Szalay 1996; Tegmark et al. 1997) and has been used for the analogous problem of E/B mode separation for polarization of the cosmic microwave background (Lewis et al. 2002; Bunn et al. 2003). This transform simultaneously diagonalizes both the signal and foreground covariance matrices, generating a set of modes with no foreground or signal correlations. This makes comparing the amount of signal and foreground power in each mode trivial.

To reduce the risk of foreground uncertainties biasing our analysis, we will prioritize the removal of foreground contaminated modes at the expense of cosmological signal. In contrast, the instrumental noise is well understood, and we should be able to dig deeper into this contaminant to extract useful cosmological information with little risk. In practice, this means we will start with a filter which aggressively removes foregrounds only; subsequently we will add the instrumental noise back in, which will allow us to compress the data by completely removing noise-dominated modes, while retaining those with a small fraction of signal.

This requires models for the statistics of both the signal and the foregrounds. The signal is modeled as a simple Gaussian random field for the 21 cm emission, whereas the foreground model includes both the synchrotron emission from our Galaxy, and the contribution from a background of extragalactic point sources. The details of both are discussed in Appendix A.

The KL transform seeks to find a linear transformation of the data $\boldsymbol{v}^{\prime } = {{\bf P}}\boldsymbol{v}$ such that the 21 cm signal S = BC21B† and foreground F = BCfB† covariance matrices are jointly diagonalized. That is

Equation (25)

and

Equation (26)

where Λ is a diagonal matrix, and I is the identity. In this diagonal basis we can simply compare the amount of power expected in each mode by the ratio of the diagonal elements (this is given by the corresponding entries of Λ), and identify the regions of the space with low foreground contamination (large entries in Λ).

This transformation can be found by solving the generalized eigenvalue problem ${{\bf S}}\boldsymbol{x}= \lambda {{\bf F}}\boldsymbol{x}$ (for our purposes the routine ZHEGVD from LAPACK is sufficient). This gives a set of eigenvectors x, and corresponding eigenvalues λ. Writing the eigenvectors in a matrix P, row-wise, gives the transformation matrix to diagonalize the covariances. The eigenvalues λ corresponding to each eigenvector give the diagonal matrix Λ.

To isolate the 21 cm signal, we want to select modes with eigenvalue (signal-to-foreground power; S/F) greater than some threshold (see Figure 1). To project into this basis we define the matrix Ps which contains only the rows from P corresponding to eigenvalues greater than the threshold s.

Figure 1.

Figure 1. S/F spectrum for all m-modes. We have plotted log10λim, where for each m the eigenvalues have been sorted in ascending order (thus there is no physical interpretation to the vertical direction). The contours are drawn at −4, −2, 0, 2, and 4.

Standard image High-resolution image

For most of the analysis we can work directly in the eigenbasis. However, for visualizing our results, we want to be able to transform back to the sky (by way of the measured visibilities). To project back into the higher dimensional space we simply generate the full inverse P−1 and remove columns corresponding to the rejected modes (we denote this matrix $\bar{{{\bf P}}}_{s}$). This is equivalent to projecting into the full eigenbasis, zeroing the foreground contaminated modes, and then using the full-inverse P−1.

For further analysis, we must include all noise terms, both foregrounds and instrumental. Writing the total noise contribution as Nall = F + N, the matrix in the truncated basis is

Equation (27)

As the transformed instrumental noise matrix will not remain diagonal, this gives a correlated component between all our modes. However, as it is useful if our modes are uncorrelated, we make a further KL transformation on the foreground removed signal ${{\bf S}}_s = { {\boldsymbol \Lambda }}_s$, and total noise ${{\bf N}}^{\rm all}_s$ covariance matrices. For computational and storage efficiency we apply a further cut-off to include only modes with a signal-to-total-noise ratio greater than the cut-off value t. We denote this as projection matrix $\tilde{{{\bf Q}}}$. For notational convenience we will write the total transformation in terms of a single matrix R = QtPs, having chosen suitable values for the two cut-offs s and t. As above, we will define an inverse $\bar{{{\bf R}}} = \bar{{{\bf P}}}_s \bar{{{\bf Q}}}_t$, which remains orthogonal to the removed space. Quantities in this final basis we denote with tildes, for example a visibility mapped into this basis is $\tilde{\boldsymbol{v}} = {{\bf R}}\boldsymbol{v}$, and a covariance is $\tilde{{{\bf C}}}= {{\bf R}}{{\bf C}}{{\bf R}}^\dagger$.

5.1. Cylinder Example

While this method works with any transit telescope, to illustrate the foreground removal process we will simulate a cylinder telescope, such as CHIME or the Pittsburgh Cylinder Telescope (Bandura 2011). These are transit interferometers composed of multiple parabolic cylinders where each only focuses in the East–West direction. This gives a long and thin primary beam on the sky, extending nearly from horizon to horizon in the North–South direction but which is only around 1°wide East–West.

Feeds are spaced along the axis of each cylinder—when correlated these provide resolution in the N–S direction. Correlations between cylinders enhance the E–W resolution. The telescope operates as a transit telescope such that the entire visible sky is observed once per sidereal day.

For a cylinder uniformly illuminated by a particular feed, near the axis the beam pattern is a sinc function in the E–W direction, and uniform in the N–S direction (Wilson et al. 2009, chapter 6). To extend this off-axis, we modulate by projected area of the telescope giving

Equation (29)

where W is the cylinder width and $\hat{\boldsymbol{z}}$ is a unit vector pointing to the zenith and $\hat{\boldsymbol{u}}$ is a unit vector pointing East in the ground-plane. The step function Θ masks out the regions where the sky is below the horizon, and the final factor $\hat{\boldsymbol{n}}\cdot \hat{\boldsymbol{z}}$ accounts for the projected area of the telescope.

We model a two cylinder telescope observing the sky with 64 frequency channels from 400–600 MHz. Each cylinder is 15 m wide and has 60 feeds regularly spaced by 0.25 m (with the feeds lining up E–W between cylinders). The telescope is located at a latitude of 45°. These specifications correspond to a slightly smaller half-bandwidth version of the CHIME pathfinder telescope being constructed at the Dominion Radio Astrophysical Observatory.

The noise covariance is diagonal for all m, frequencies, and baselines. For small m-modes with m ≪ 1/(2πΔϕ) (where Δϕ is the angular integration time), the noise variance is

Equation (30)

where Tsys, i is the system temperature of a single polarization of feed i, Nday is the number of sidereal days observed, tsid is the length of a sidereal day, and Δν is the width of the frequency channel. As we combine the two polarizations into a single unpolarized signal, this reduces the noise power spectrum by a factor of two. For this example Tsys = 50 K, and we assume two full years of observation (that is, 730 complete sidereal days).

In Figure 1 we show the spectrum of S/F eigenvalues for the telescope. The KL mode distribution of S/F has an extremely rapidly rising spectrum so that the information retained (approximately the number of modes) is quite insensitive to the cut threshold s for values between 10−2 and 102.

To demonstrate the foreground removal process, we simulate time-streams from separate realizations of the signal and foregrounds using Equation (17), and project them through the filtering process to make maps. The visibilities are filtered using $\boldsymbol{v}_{\rm clean} = \bar{{{\bf R}}} {{\bf R}}\boldsymbol{v}$, and then are turned into a three-dimensional (3D) map using Equation (20). In Figure 2 we show the original simulation, the map made from the unfiltered visibilities, and the map made from the foreground filtered visibilities. The simulated signal and foreground maps are described in Appendix B. Note that the foreground maps are not simply realizations of the model used to generate the foreground filter—unlike the input model they are both non-Gaussian and anisotropic. Figure 2 clearly illustrates how the foreground amplitude is dramatically reduced by the process, whilst the signal retains its overall character. Though the foreground residuals are clearly highest in the Galactic center, even these are significantly lower than the filtered signal.

Figure 2.

Figure 2. This plot illustrates the foreground removal process in action on simulations of the foregrounds-only (top row) and signal-only (bottom row). Each plot has two elements, an image of the 400 MHz frequency slice on top, and beneath, a cut through the celestial equator (from 270° to 300°) showing the frequency axis. The leftmost column shows the original simulations on the sky. The band appearing in the foreground frequency slice is the Galactic plane. The middle column shows the maximum likelihood map that we would produce from the measured visibilities without subtracting the low S/F modes. The maps are blank below δ = −45° because this area is always below the horizon for the telescope at a latitude of 45°. The final column shows the maps produced after the foreground removal process (in this case we have discarded modes with S/F < 10). This leaves a clear correspondence between the original signal simulation and the foreground subtracted signal, whilst leaving the foreground residuals over 20 times smaller in amplitude.

Standard image High-resolution image

6. FISHER ANALYSIS

In the previous section, we have demonstrated that the KL transform gives an effective method for removing foregrounds. Though a visual inspection of Figure 2 suggests that the 21 cm signal is largely untouched, we would like to be able to quantify how much useful information remains. In this section, we will make use of the Fisher Information matrix, a measure of the information contributed to our knowledge of a set of parameters for a given observation, by telling us how much the covariance of those parameters is improved (see Dodelson 2003, chapter 11 for an overview). We will use this to forecast power spectrum errors, for the same telescope, with and without foreground removal.

After projection into the reduced eigenbasis, let us assume that the remaining modes follow a complex Gaussian distribution with zero mean. This assumption should be reasonable provided we have successfully removed the modes containing any significant foreground contribution. In this case the contribution of the mth mode to the ab-element of the Fisher Information matrix for a set of parameters pa is

Equation (31)

where $\tilde{{{\bf C}}}_a = \partial \tilde{{{\bf C}}}/ \partial p_a$, and the common factor of 1/2 does not appear as we are dealing with a complex Gaussian distribution. Though in the constructed eigenbasis $\tilde{{{\bf C}}}= \tilde{{{\bf \Lambda }}}+ {{\bf I}}$ is diagonal, $\tilde{{{\bf C}}}_a$ can have off-diagonal elements. Again this process is performed on a per-m basis. As there is no coupling between them, the total Fisher information is simply the sum over all m-modes

Equation (32)

For a set of parameters pa that we are trying to determine, the inverse of the Fisher matrix is the lowest order approximation to their covariance.

In this work, we will focus on forecasting the errors on the shape of the matter power spectrum P(k) whilst keeping all other cosmological parameters fixed. Such forecasting has been performed using the uv-plane in Seo et al. (2010) and Ansari et al. (2012).

We parameterize the power spectrum in terms of a linear summation of different basis functions

Equation (33)

In Appendix A, we describe how to project this quantity into the angular power spectrum of 21 cm fluctuations that we use to calculate the visibility correlations. For simplicity, each of our bands is simply equal to the input power spectrum within a fixed k-band, and zero outside, such that the fiducial model is pa = 1.

For the band-powers pa that we are trying to estimate, the matrices $\tilde{{{\bf C}}}_a$ are simply the projection of the basis functions $P_a(\boldsymbol{k})$ into the eigenbasis. Starting from the angular power spectra Ca; l(ν, ν'), corresponding to each of the basis functions $P_a(\boldsymbol{k})$ (using Equation (A2))

Equation (34)

In practice, explicitly calculating the $\tilde{{{\bf C}}}_a$ in this way is computationally expensive, we instead use a Monte Carlo technique. We can form the estimator $\hat{q}_a = \tilde{\boldsymbol{x}}^\dagger \tilde{{{\bf C}}}^{-1} \tilde{{{\bf C}}}_a \tilde{{{\bf C}}}^{-1} \tilde{\boldsymbol{x}}$, which has the property that its covariance $\langle \hat{q}_a \hat{q}_b \rangle - \left\langle \hat{q}_a \right\rangle \left\langle \hat{q}_b \right\rangle = F_{ab}$ (Padmanabhan et al. 2003). This means we can estimate the Fab by averaging over realizations of $\tilde{\boldsymbol{x}}$. For details, see Dillon et al. (2013).

In Figure 3 we plot the power spectrum errors for two cases: in the presence of foregrounds that have been cleaned using our method and without foregrounds at all. In the case without foregrounds, F = 0 and we only perform the final KL transform to diagonalize the signal and instrumental noise. For the foregrounds we have cleaned modes with S/F < 10 and additionally have removed modes with a small ratio of signal-to-total-power. This corresponds to setting s = 10 and t = 0.01. This is a clear demonstration of the effectiveness of the technique—it reduces our sensitivity on large scales as we would expect (as the removed foreground are smooth on large scales), while only slightly reducing our ability to constrain the small scale power spectrum.

Figure 3.

Figure 3. 21 cm intensity mapping provides a powerful technique for measuring the shape of the matter power spectrum. In the plots above we illustrate the power spectrum constraints that could be achieved with the large cylinder telescope. The top plot shows the constraints on the whole power spectrum, the lower plot zooms in on the region with BAO, dividing with a smoothed spectrum to remove the general trend. The dark shaded bands are the errors we would find without foregrounds, where the only noise is instrumental, the light bands include both contributions.

Standard image High-resolution image

7. CONCLUSION

In this paper, we have introduced a powerful formalism for describing the measurement process of transit telescopes (either interferometric or otherwise). It is a natural formalism to describe interferometry on the full sky—sidestepping the standard complications that arise when dealing with wide-field interferometric data such as mosaicking and w-projection. A spherical harmonic transit telescope allows for compact and computationally efficient representations of the data and its statistics, which enable new ways of approaching important problems like map-making and foreground removal.

Using the m-mode formalism and approximating the foregrounds as statistically isotropic, allows the powerful KL transformation to be used, automatically finding the basis in which the astrophysical foregrounds and 21 cm signal are maximally separated. The KL approach would be computationally impossible otherwise and is a key advantage of the m-mode formalism. Using this technique we can take the full 3D data set into account and overcome the mode-mixing problem. The filters we construct are highly effective and robust, a fact we have demonstrated by propagating through realistically simulated 21 cm and foreground timestreams. In our fiducial example, shown in Figure 2, peak-to-peak foreground amplitude was reduce by a factor of ∼2 × 107 leaving the peak-to-peak amplitude of the 21 cm signal around 20 times brighter that the foreground residuals.

We have also used this formalism to produce realistic forecasts for the power spectrum constraints from a fiducial 21 cm cylinder interferometer. We have demonstrated that foreground cleaning does not significantly degrade 21 cm power spectrum estimates on baryon acoustic oscillation (BAO) scales and below compared to a hypothetical foreground-free measurement. We anticipate that the spherical harmonic transit telescope formalism will be a powerful tool that can be applied to inform experimental design and test the interplay between real-world systematics and design constraints on 21st century 21 cm science. We will explore this further in J. R. Shaw et al. (2013, in preparation).

We thank the CHIME team for stimulating discussions, and Matt Dobbs and Keith Vanderlinde for comments on an earlier version of this manuscript. K.S., U.P., and M.S. are supported in part by the Natural Sciences and Engineering Research Council (NSERC) of Canada. The work of A.S. was supported by the DOE at Fermilab under Contract No. DE-AC02-07CH11359. K.S. thanks the Perimeter Institute for Theoretical Physics for its hospitality. Some of the results in this paper have been derived using the HEALPix14 package (Górski et al. 2005). Computations were performed on the GPC supercomputer at the SciNet HPC Consortium. SciNet is funded by the Canada Foundation for Innovation under the auspices of Compute Canada, the Government of Ontario, Ontario Research Fund–Research Excellence, and the University of Toronto.

APPENDIX A: SIGNAL AND FOREGROUND MODELS

We model the 21 cm signal and foregrounds as isotropic fields described by an angular power spectrum Cl(ν, ν'). We base our models on Santos et al. (2005), although we will only include the Galactic synchrotron and extragalactic point source contributions. Both these contributions are assumed to take the form

Equation (A1)

As the models are calibrated for observations of the reionization epoch, we need to transform them into the higher frequencies we are concerned with. We list the parameters for both these models in Table 1.

Table 1. Power Spectrum Model Parameters

Component A (K2) α β ζ
Galaxy 6.6 × 10−3 2.80 2.8 4.0
Point sources 3.55 × 10−4 2.10 1.1 1.0

Note. Our model for the angular power spectrum Cl(ν, ν') is adapted from Santos et al. (2005) to better suit the full-sky intensity mapping regime.

Download table as:  ASCIITypeset image

For the point source model, which is based on the results of Di Matteo et al. (2002), we change the pivot frequency ν0 from 150 MHz to 408 MHz and also rescale the amplitude in order to raise the maximum flux of unsubtracted sources from 0.1 mJy to 0.1 Jy.

The galactic synchrotron model we use is not only calibrated for low frequencies, but also high Galactic latitudes. As we will measure large fractions of the sky, we take this into account by changing the A and angular power law index β to be consistent with the angular power spectrum of the 408 MHz Haslam map for Galactic latitudes |b| > 5° from La Porta et al. (2008).

We model the 21 cm brightness temperature as being a biased tracer of the underlying matter fluctuations. These fluctuations are naturally characterized by the angular power spectrum (Lewis & Challinor 2007; Datta et al. 2007). However exact calculation of this quantity requires double-integration over highly oscillatory functions, instead we use the flat-sky approximation from Datta et al. (2007)

Equation (A2)

where χ and χ' are the comoving distances to redshift z and z'. Their difference is denoted by Δχ = χ − χ'. The vector k has the components k and $l / \bar{\chi }$ in the directions parallel and perpendicular to the line of sight ($\bar{\chi }$ is the mean of χ and χ'). This approximation is accurate to the 1% level for l > 10 (Datta et al. 2007).

We model the 21 cm brightness temperature power spectrum $P_{T_b}$ as

Equation (A3)

where Pm(k; z, z') = P(k)D+(z)D+(z') is the real-space matter power spectrum, D+(z) is the growth factor normalized such that D+(0) = 1, b is the bias, and the growth rate f = dln D+/dln a, the logarithmic derivative of the growth factor D+. We assume that the bias b = 1 at all redshifts. The mean brightness temperature is assumed to take the form

Equation (A4)

given in Chang et al. (2008). We assume that the neutral hydrogen fraction takes a value $\Omega _{{\rm H}\,\scriptsize {\rm I}} ={5\times 10^{-4}}$ (Masui et al. 2013).

APPENDIX B: SIMULATING ALL-SKY RADIO EMISSION

B.1. Galactic Synchrotron

In order to test our methods we require simulated maps of the Galactic emission from our own Galaxy in the range 400–1400 MHz with 1 MHz resolution. Though there are maps at both 800 MHz and 1420 MHz, the only public all-sky radio survey in this range is the 408 MHz Haslam map (Haslam et al. 1982). However, the Global Sky Model (de Oliveira-Costa et al. 2008) is based on a compilation of maps from 10 MHz to 94 GHz. We use the Global Sky Model to generate maps at both 400 MHz and 1420 MHz, and use these to estimate an effective spectral index at each location on the sky

Equation (B1)

By combining this with the Haslam map15 we can extrapolate to simulate a map of the sky at any desired set of frequencies

Equation (B2)

Unfortunately this map lacks both the small scale angular fluctuations (because of the limited resolution of the Haslam map) and any spectral variations (because of the power law extrapolation) that would be present on the real sky. It is essential to include these to accurately test any foreground removal method.

To include these fluctuations we could add Gaussian realizations of Equation (A1) (with the galactic synchrotron parameters, see Table 1) to the base map, which contain frequency and angular fluctuations at arbitrary resolutions. However the Haslam map already constrains what the sky looks like on scales ≳ 1°, and the extrapolation with the spectral index map is a constraint on the sky at 1420 MHz (on scales larger than 5fdg1, the resolution of the Global Sky Model). Therefore, we would like the combined simulated map to be consistent with these observations. We can do this by constraining the realizations to ensure there are no fluctuations on the scales constrained. In practice, we do this by manipulating the amplitudes of the two highest valued eigenmodes of Cl(ν, ν') (from Equation A1) in each realization, to ensure that the 408 MHz and 1420 MHz slices are zero when smoothed on 1° and 5fdg1 scales, respectively.

A further problem is that we know the amplitude of small-scale fluctuations varies over the sky, however our realizations are statistically isotropic. This is clearly demonstrated in the analysis of La Porta et al. (2008), which shows that the amplitude of the angular power spectrum traces the galactic structure. To reproduce this we use the rms amplitude of fluctuations across the Haslam map in ∼4° patches (corresponding to Healpix pixels with NSIDE = 16, giving 3072 pixels covering the sky), to rescale the fluctuations. In particular this generates an angular power spectrum on the whole sky which is consistent with a single power-law even when crossing through the beam-scale of the Haslam map into the simulated fluctuations. We do not include variations of the power law index of the angular power spectrum as there appears to be no structure to the small variations found in La Porta et al. (2008).

B.2. Extra-galactic Point Sources

Our point source maps are constructed from two components, a population of bright point sources (S > 0.1 Jy at 151 MHz) is simulated directly, and a background of dimmer unresolved point sources (S < 0.1 Jy) is modeled as a Gaussian random field.

The former is constructed directly by drawing from the point source distribution of Di Matteo et al. (2002), each sourced is modeled as having pure power law emission with a random spectral index. The sources are distributed randomly over the sky. Very bright sources (S > 10 Jy) are assumed to have been subtracted such that their residuals are less than this threshold.

The unresolved background is simulated by drawing a Gaussian realization from Equation (A1) with the point source model detailed in Table 1.

B.3. 21 cm Signal

Simulations of the cosmological 21 cm emission are performed by drawing Gaussian realizations from the flat-sky angular power spectrum (calculated using Equation (A2)).

Footnotes

Please wait… references are loading.
10.1088/0004-637X/781/2/57