WIDE-FIELD WIDE-BAND INTERFEROMETRIC IMAGING: THE WB A-PROJECTION AND HYBRID ALGORITHMS

S. Bhatnagar; U. Rau; K. Golap

doi:10.1088/0004-637X/770/2/91

1. INTRODUCTION

Observations in the radio band offer distinct, and often times unique, scientific advantages in probing certain areas of astrophysical research (e.g., in the detection of the EoR signal, studies of the high-redshift universe in general, large-scale structure formation, early galaxies, etc.).

All next-generation radio telescopes, many in operation now, offer at least an order of magnitude improvement in the sensitivity and angular resolution compared to the telescopes operated in the past decades. The two key instrumental parameters that afford such high sensitivities, impact the imaging performance, and are significantly different from previous-generation telescopes are: (1) the wide instantaneous fractional bandwidths, and (2) larger collecting area. The effects of wide instantaneous fractional bandwidths that classical calibration and imaging algorithms ignore lead to errors higher than the sensitivity that these new telescopes offer. Examples, relevant for some of the telescopes already in operation include the effects of time and frequency variant primary beams (PB), frequency dependence of the emission from the sky, and antenna pointing errors. The effects of wide fractional bandwidth and ionospheric phase screen limit the imaging performance below ∼1 GHz. Additionally, significant variations in the shape of the wide-band PB for aperture array telescopes lead to errors of similar magnitude. All of these effects form the general class of problems referred to in the literature as "direction-dependent effects" or DD effects.

Both wide fractional bandwidths and larger collecting area lead to many orders of magnitude increase in the data volume, putting severe constraints on the run-time performance of the algorithms for calibration and imaging. Furthermore, the cost of software development and maintenance also scales with algorithm complexity. Efficient algorithms to simultaneously account for all time-, frequency-, and polarization-dependent DD effects, which can also process large data volumes without significantly increasing algorithmic and software complexity are required.

In the following sections, we discuss various possible approaches to full-beam, wide-band continuum imaging. We present a modification of the A-Projection algorithm, which we call the wide-band A-Projection, or WB A-Projection algorithm, where a modified A-term also compensates, to a large extent, the frequency dependence of the PB. We also discuss the use of the unmodified A-Projection algorithm (Bhatnagar et al. 2008, henceforth referred to as Paper I), which we call the narrow-band A-Projection, or NB A-Projection, along with various forms of image-plane normalizations for wide-band continuum imaging and the resulting issues and limitations.

2. THEORY

Using the notation developed by Hamaker et al. (1996), full polarimetric measurements from a single baseline calibrated for the effects of direction-independent (DI) gains, can be described by the following measurement equation:

$\begin{equation} \boldsymbol {V}^{{\rm Obs}}_{ij}(\nu ,t) = W_{ij}(\nu ,t) \int P^{{\rm Sky}}_{ij}(\boldsymbol {s},\nu ,t) \boldsymbol {I}(\boldsymbol {s},\nu) e^{\iota \boldsymbol {b}_{ij} \cdot \boldsymbol {s}}d\boldsymbol {s}, \end{equation} \tag{ 1 }$

where $\boldsymbol {V}^{{\rm Obs}}_{ij}$ are the observed visibility samples measured by the pair of antennas designated by the subscript i and j, separated by the vector $\boldsymbol {b}_{ij}$ and weighted by the measurement weights W_ij. $P^{{\rm Sky}}_{ij}$ is the radio-Mueller matrix¹ in the image domain representing the full polarization description of the antenna PB as a function of the direction $\boldsymbol {s}$ , frequencyν, and time t, and $\boldsymbol {I}$ is the image vector. The vectors $\boldsymbol {V}$ and $\boldsymbol {I}$ are full polarization vectors in the data and image domain, respectively. $P^{{\rm Sky}}_{ij}$ and $\boldsymbol {I}$ are the unknowns in this equation.

Equation (1) cannot be directly inverted as, in general, it is not a Fourier transform relation. It is also sampled only at a limited number of points, and therefore the data has insufficient information to allow an exact solution. Estimation of $\boldsymbol {I}$ is therefore typically done via iterative nonlinearχ²-minimization (Cornwell 1995; Rau et al. 2009). Below we briefly review the theory of imaging with A-Projection to correct for the time and polarization dependence of $P^{{\rm Sky}}_{ij}$ in narrow-band imaging and motivate the need for a WB A-Projection algorithm to also correct for frequency dependence of $P^{{\rm Sky}}_{ij}$ in wide-band imaging.

2.1. Imaging with A-Projection

To clarify the full-polarization nature of the A-Projection algorithm, we define the outer-convolution operator and denote it by the symbol⊛. The outer-convolution operator is similar to the outer-product operation used in the DI description of Hamaker et al. (1996) with a minor difference. The element-by-element algebra of the outer-convolution operator is the same as that of the outer-product operator, except that the complex multiplications in outer-product operator are replaced by convolutions. Using the outer-convolution operator and the subscripts i and j to explicitly denote the antenna pair for baseline i − j, the A-matrix used in A-Projection at a frequencyν and time t can be written in terms of antenna based quantities as

$\begin{equation} {\mathsf {A}}_{ij} = {\mathsf {J}}_i {\circledast }{{\mathsf {J}}}^{*}_j \end{equation} \tag{ 2 }$

where

$\begin{equation} {\mathsf {J}}_i = \left[\begin{array}{ll}{\mathsf {E}}^p_i & {{\mathsf {E}}}_i^{p\rightarrow q} \\ \ms {{\mathsf {E}}}_i^{q\rightarrow p} & {{\mathsf {E}}}^{q}_i \\ \end{array} \right]. \end{equation} \tag{ 3 }$

${\mathsf {E}}^p$ and ${\mathsf {E}}^q$ are the polarized antenna aperture illumination patterns for the two polarization states. The off-diagonal terms are the leakage patterns. ${\mathsf {A}}_{ij}$ is the DD equivalent of the4 × 4 DI Muëller matrix for a given antenna pair. The elements of ${\mathsf {A}}_{ij}$ are the complex convolution of the two antenna aperture illumination patterns $({{\mathsf {E}}}^p_i \star {{\mathsf {E}}}^{p^*}_j)$ , $({{\mathsf {E}}}^p_i \star {{\mathsf {E}}}^{{p\rightarrow q}^*}_j)$ , etc. For comparison, the elements of ${\mathsf {J}}_i \otimes {{\mathsf {J}}}^{*}_j$ would be $({{\mathsf {E}}}^p_i \cdot {{\mathsf {E}}}^{p^*}_j)$ , $({{\mathsf {E}}}^p_i \cdot {{\mathsf {E}}}^{{p\rightarrow q}^*}_j)$ , etc.

To keep the notation simple, in the following description we use a single subscriptk ≡ (ij, ν, t) to refer to a measurement from a single baselineij, at a spot frequencyν and an instant in time t. The vectors $\boldsymbol {V}$ andδ are full polarization vectors whose elements are two-dimensional (2D) functions in the visibility plane (the uv-plane). Elements of $\boldsymbol {V}$ are the 2D visibility data and elements ofδ are 2D Delta functions representing the uv-sampling function for the data sample k. The superscriptsobs, M, and○ refer to the observed, model, and true values, respectively.

Using the notation described above, theχ² can be written as

$\begin{equation} \chi ^2 =\sum _k \boldsymbol {V}^{{R^{\dag} }}_k \Lambda _k \boldsymbol {V}^R_k, \end{equation} \tag{ 4 }$

where $\boldsymbol {V}^R_k = \boldsymbol {V}^{{\rm Obs}}_k - \boldsymbol {V}^M_k$ andΛ_k is inverse of the noise covariance matrix. The vector ${{\boldsymbol V}^{{\rm Obs}}}$ can be expressed in terms of ${\mathsf {A}}$ as

$\begin{equation} \boldsymbol {V}^{{\rm Obs}}_k = \left({\mathsf {A}}^\circ _k \star \boldsymbol {V}^\circ \right) \delta _k. \end{equation} \tag{ 5 }$

Note that, as mentioned before, the elements of ${{\mathsf {A}}^\circ }$ , $\boldsymbol {V}^\circ$ andδ_k are 2D functions. The symbol "⋆" represents the element-by-element convolution. $\boldsymbol {V}^\circ$ —without a subscript—represents the true continuous Coherence function. $\boldsymbol {V}^{{\rm obs}}_k$ represents a sample of this Coherence function measured at the parameters represented by subscript k.

The calibration matrix for Equation (5) to correct for the effects of ${{\mathsf {A}}}^\circ _k$ is ${{\mathsf {A}}}^{\circ ^{-1}}_k$ given by

$\begin{equation} {\mathsf {A}}^{\circ ^{-1}}_k=\frac{{{\rm adj}\,({{\mathsf {A}}}^\circ _k)}}{{{\rm det}\,({{\mathsf {A}}}^\circ _k)}}. \end{equation} \tag{ 6 }$

The equivalence between Equation (6) as a generalized DD calibration and standard DI calibration is discussed in more detail in Section 2.1.2.

As in DI calibration where calibration is done by the application of the inverse of the appropriate Mueller matrix, correction for the effects of ${\mathsf {A}}$ requires the application of ${\mathsf {A}}^{-1}$ . The difference between DI and DD calibration is that while the operator for the application of the DI calibration matrix to the data is the matrix multiplication operator, for DD calibration this operator is the element-by-element convolution operator (the⋆ operator in Equation (5)).

Since DD calibration fundamentally cannot be separated from imaging, the application of the ${\mathsf {A}}^{-1}$ matrix is done via the A-Projection algorithm. This is achieved in two steps. The term in the numerator of Equation (6), ${{\rm adj}\left({{\mathsf {A}}}_k\right)}$ , is applied during resampling of the observed data (the right-hand side of Equation (5)) on a regular grid using convolutional gridding with ${{\mathsf {A}}}^{{M{\dag} }}_k$ , a model of ${\mathsf {A}}^{{\circ{{\dag} }}}_k$ , as the convolution function. The resulting gridded data is accumulated in the data domain and then Fourier transformed to compute the continuum image. The scaling by the denominator of Equation (6) is done by also accumulating ${\mathsf {A}}^{M}_k$ and diving the image by its Fourier transform. The resulting image using A-Projection is given by

$\begin{equation} I^R = \frac{{\mathsf {F}}\sum _k {{\rm adj}\left({\mathsf {A}}^{M}_k\right)} \star \left({\mathsf {A}}^\circ _k \star V^\circ \right) \delta _k}{{\rm det}\left({\mathsf {F}}\left[\sum _k {{\mathsf {A}}}^M_k\right]\right)}. \end{equation} \tag{ 7 }$

This effectively applies the DD calibration operator ${{\mathsf {A}}}^{-1}$ and corrects for its effects, provided ${{\mathsf {A}}}^M$ is a close enough approximation of ${{\mathsf {A}}}^\circ$ . Further details, results, and discussion on the imaging performance of A-Projection are in Paper I.

For continuum imaging, the accumulation for all k in Equation (7) can be done in either domain (data or image domain). Continuum imaging of wide-band data using the multi-frequency synthesis (MFS) approach is done by accumulation in the data domain. Since the A-Projection algorithm does not explicitly account for the frequency dependence of ${\mathsf {A}}$ , an algorithm to project out this frequency dependence before accumulation is required. The WB A-Projection algorithm for this is described in Section 3. Various hybrid imaging algorithms using the NB A-Projection algorithm and accumulation in the image domain are also possible. These are discussed in Section 4.

2.1.1. Algorithmic Steps for A-Projection

For completeness and as a reference for later discussions, the algorithmic steps for MFS imaging using the A-Projection algorithm described in Paper I are repeated below:

1.
Initialize the model and the residual images I^M and I^R
2.
Major cycle:
- (a)
  Predict the model data and accumulate ${\mathsf {A}}_k$ as
  $\begin{eqnarray*} \boldsymbol {V}^{M}_{k} &=& \sum _k {{\rm adj}\left({\mathsf {A}}_{k}\right)}\star ({\mathsf {F}}^{-1}\boldsymbol {I}^{M})\nonumber \\ \overline{{\mathsf {A}}^M} &=& \sum _k {{\mathsf {A}}}^M_k. \nonumber \end{eqnarray*}$
- (b)
  Compute the residual data $\boldsymbol {V}^{R}_{k}=\boldsymbol {V}^{{\rm Obs}}_{k} - \boldsymbol {V}^{M}_{k}$ .
- (c)
  Use Equation (7) to compute the continuum residual image as $\boldsymbol {I}^R = ({\mathsf {F}}\sum _k \boldsymbol {V}^R_k )/ {{\rm det}({\mathsf {F}}\overline{{\mathsf {A}}^M})}$ .
3.
Minor cycle: Invoke the appropriate minor-cycle algorithm using $\boldsymbol {I}^R$ to solve for image-plane parameters and update the model image $\boldsymbol {I}^M$ .
4.
If not converged, go to Step 2.

Since ${\mathsf {A}}_k$ does not change from one major cycle to another, accumulation of $\overline{{\mathsf {A}}^M}$ is done in the first major cycle and cached for use in subsequent major cycles.

2.1.2. A-Projection: A Direction-dependent Gain Correction Algorithm

The antenna illumination pattern is essentially a DD description of antenna based complex gains in the data domain. ${\mathsf {A}}_k$ is a DD generalization of the G-Jones matrix–the DI Muëller matrix for antenna gains in the Hamaker et al. (1996) formulation and since the A-Projection algorithm corrects for the effects of ${\mathsf {A}}_k$ , it can be thought of as an algorithm for DD calibration.

To establish the equivalence between DD corrections via A-Projection and DI antenna-based complex gain correction, we note that Equation (5) is the DD equivalent of DI measurement equation given by

$\begin{equation} \boldsymbol {V}^{{\rm obs}}_{ij} = G_{ij} \cdot \boldsymbol {V}^\circ _{ij}. \end{equation} \tag{ 8 }$

${{\mathsf {A}}}_k$ in Equations (2) and (5) is the DD equivalent of G_ij and the outer-convolution ("⊛") and the "⋆" operators are the DD equivalent of the outer-product ("⊗") and element multiplication. Calibration for G_ij is done by multiplying Equation (8) by $G^{-1}_{ij}$ given by

$\begin{equation} G^{-1}_{ij}=\frac{G_{ij}^{*}}{|G_{ij}|^2}. \end{equation} \tag{ 9 }$

For calibration, Equation (6) is the DD equivalent of the above equation for the DD calibration.

For an intuitive understanding, we note that for the simpler case where the off-diagonal elements of ${\mathsf {J}}$ are negligible, ${{\rm adj}\left({{\mathsf {A}}}_k\right)} = {{\mathsf {A}}}^{\dag}$ and ${{\rm det}\left({{\mathsf {A}}}_k\right)}={\rm trace}\left({\mathsf {A}}_k\right)={{\mathsf {E}}}^{pp}_k{{\mathsf {E}}}^{{qq^{*}}}_k$ . For this simpler case, examination of Equations (6) and (9) shows the equivalence between DI gain calibration and the A-Projection algorithm more clearly. The process of imaging using the ${{\rm adj}\left({\mathsf {A}}_k\right)}$ and normalizing the resulting image by ${{\rm det}\,(\sum _k {\mathsf {A}}_k)}$ therefore, even for the general case, is the DD generalization of the antenna-based complex gain calibration.

3. THE WB A-PROJECTION ALGORITHM

${{\rm adj}\left({\mathsf {A}}_k\right)}$ , when used in the A-Projection algorithm, corrects for the polarization and other DD effects that can be encoded in the phase of ${\mathsf {A}}^\circ$ (e.g., time-varying gains due to polarization squint, ionospheric phase-screen, etc.). It is, however, not a conjugate operator for variations along the time or frequency axis.

The image domain effects of the time varying gains are largely in the amplitude scaling only (i.e., they do not disperse the flux in the image domain). Since the minor cycle algorithms typically assume that the sky brightness distribution is time-invariant and do not parameterize the model image in time, the effects of such variations can be ignored in the transform from the data to image domain. If the deviations from the average value are small (e.g., for antenna arrays and long integrations where time variability is cyclic, or antennas with three-axis mounts where beams do not rotate on the sky) the model prediction stage, which properly includes these effects, corrects for time variability in an the iterative deconvolution scheme.

Frequency dependence of ${\mathsf {A}}^\circ$ also varies with time (and direction). This variation is not cyclic and its maximum deviation from the average value increases with fractional bandwidth. Due to this, as for time varying gains, while the frequency dependence of ${\mathsf {A}}^\circ$ in the inner part of the main lobe of the PB can also be corrected via the model prediction stage, the convergence is significantly slower (requiring more major cycles and hence higher computing). Alternatively, this frequency dependence can be absorbed, to some extent, in the multi-term MFS (MT-MFS) minor cycle algorithm, which solves for the time-invariant frequency dependence in the image plane. But this requires more Taylor-terms, which also is inadvisable (see Section 4.3.3).

Correction for the time-variable frequency dependent effects of ${\mathsf {A}}^\circ$ requires a wide-band version of the ${\mathsf {A}}^M$ matrix such that ${{\rm adj}\,({{\mathsf {A}}}^M_k)} \star {{\mathsf {A}}}^\circ _k$ results in a function which does not vary with frequency (i.e., ${{\rm adj}\,({\mathsf {A}}^M)}$ is also a conjugate operator for frequency). The frequency dependence will then be projected out prior to accumulation, resulting in an image corrected for the frequency dependent effects of ${\mathsf {A}}$ .

3.1. The Wide-band ${\mathsf {A}}$ Operator

For the reasons given above, as well as to keep the parameters for modeling the instrumental effects in the data domain separate from parameters for modeling the sky brightness distribution, we need to construct the ${\mathsf {A_*}}(\nu)$ matrix, which projects out the dependence on frequency during imaging. One such possibility is the following:

$\begin{equation} {\mathsf {A_*}}(\nu) = {\mathsf {F}}^{-1} \left[ \frac{P_{{\rm eff}}(\nu) P^*_{{\rm eff}}(\nu)}{P(\nu)}\right], \end{equation} \tag{ 10 }$

where $P(\nu) = {\mathsf {F}}^{\dag } {\mathsf {A}}(\nu)$ , andP_eff(ν) is the desired effective PB, which varies minimally with frequency. In the data domain, ${{\rm adj}\left({\mathsf {A}}_*(\nu)\right)}\star {\mathsf {A}}(\nu)$ can be shown to be frequency-independent to high orders, and this use of ${\mathsf {A_*}}(\nu)$ as a model for ${\mathsf {A}}(\nu)$ in the reverse transform can correct for the frequency dependence of ${\mathsf {A}}(\nu)$ . However, it also has a large support size, and therefore, in itself, is not an efficient reverse transform operator.

3.2. The Conjugate Frequency

Equation (10) is valid for any frequency dependence. For the special case of PB scaling with frequency, we explore usable approximations for ${\mathsf {A_*}}$ , we define the conjugate frequencyν_*, given by²

$\begin{equation} \nu _* = \sqrt{2\nu _{{\rm ref}}^2 - \nu ^2 } \end{equation} \tag{ 11 }$

whereν_ref is the reference frequency of the continuum image, and examine the effects of choosing ${\mathsf {A}}_*(\nu) \equiv {\mathsf {A}}(\nu _*)$ .

Using the same model for ${\mathsf {A}}$ as used in Paper I (i.e., a model for the Very Large Array, VLA, antenna PB), one-dimensional cuts through the model PB given by $P_M(\nu)={\mathsf {F}}A(\nu)$ , the effective PB given by $P_{{\rm eff}}(\nu)={\mathsf {F}}\left[A(\nu _*)\star A(\nu)\right]/\left[{\mathsf {F}}A(\nu _{{\rm ref}})\right]$ , and the first and second derivatives ofP_eff(ν) with respect to frequency are shown in Figure 1. A comparison of the derivatives ofP(ν) andP_eff(ν) with frequency, shown in the two panels as blue dashed lines, shows that the effective PB is frequency independent to the first order. While it changes in structure, the maximum second derivative remains almost the same in magnitude. These figures show that the approximation in Equation (11) and use of ${\mathsf {A}}(\nu _*)$ is good enough for imaging data which are not sensitive to the higher order frequency dependent effects. This approximation is useful since it can be easily implemented, is appropriate for the sensitivity of current telescopes and covers a large fraction of scientific observations for simultaneous Stokes-I and spectral-index mapping. The frequency dependence in ${\mathsf {A}}(\nu)$ is reduced overall by an order of magnitude. When used in the A-Projection algorithmic steps in Section 2.1.2, it effectively corrects for the frequency dependence of the PB prior to the accumulation along the frequency axis in Equation (7). For future studies, more sensitive telescopes that will be sensitive to second order frequency dependent effects also, ${\mathsf {A_*}}$ as in Equation (10) may be required. However, when software implementation itself requires partitioning along the time and/or frequency axis, hybrid approaches discussed in Section 4 may be more efficient.

**Figure 1.** Plot in the left panels shows the one-dimensional cuts through the PB model at a reference frequency (continuous red line) for the VLA and its first (dashed blue line) and second derivatives (dash-dotted green line) with respect to frequency. The plot on the right shows the same cuts through theP_eff(ν)—the effective frequency-independent PB.
Download figure:
Standard image High-resolution image

In practice, there might be multiple terms that make up the A-term, some of which may scale with frequency while others may not. For example, aperture illumination will scale with frequency, but the pointing-offset term or resonance effects will not. The terms that do not scale with frequency can be applied as multiplicative terms to ${\mathsf {A}}(\nu _*)$ during gridding. We have tested this approach to work for parallel-hand polarization effects. While we have not tested it, we think this may also work for cross-hand polarizations.

To avoid confusion and for brevity, we will refer to the algorithm in Paper I as the NB A-Projection algorithm, and refer to the use of ${\mathsf {A}}(\nu _*)$ (instead of ${\mathsf {A}}(\nu)$ ) for the data-to-image domain transform as the WB A-Projection algorithm.

3.3. Algorithm Validation

The image deconvolution algorithm described in Section 3 was tested using simulated wide-band data with 66% fractional bandwidth. The VLA C-array was used for antenna configuration and the observations covered an hour angle range of ±3^h. The model for the PB used in Paper I was scaled by frequency and rotated with parallactic angle to simulate time-varying frequency-dependent effects. To clearly highlight the effects of time and frequency dependence of ${\mathsf {A}}$ , we used a model of the sky consisting of five point sources located at 0.99, 0.83, 0.60, and 0.11 levels of the PB within the main lobe and one source located in the first side lobe (PB gain of 0.025). All the point sources were assigned a flux of 1 Jy with flat spectra. The effective spectral-indices due to the PB at the five locations are −0.026, −0.38, −1.0, −5.32, and +0.47, respectively. No noise was added to these simulations, and all imaging and deconvolution runs were with a loop-gain of 0.2.

Figure 2 shows deconvolved images produced without (first panel) and with (second panel) WB A-Projection gridding. This comparison demonstrates that with an accurate model of the PB, it is possible to correct for its time and frequency variability down to numerical precision levels in wide-band, wide-field imaging.

4. APPLYING NB A-PROJECTION FOR WIDE-BAND IMAGING

NB A-Projection was designed for a single reference frequency and does not automatically account for the frequency dependence of PB during gridding. However, it can still be used for wide-band imaging, as long as the frequency dependence of the far-field pattern is known and characterized byP_ν. Several algorithmic options exist, all with different numerical approximations and computing load. This flexibility allows the implementation to be tuned according to the available computing resources, architecture of the hardware platform, and the desired imaging accuracy.

4.1. Cube Imaging + Cube Deconvolution

The simplest approach is spectral-cube imaging where each frequency channel (with its limiteduv coverage) is treated separately. NB A-Projection is applied as-is per channel, and the minor cycle run independently per channel. A continuum image is later constructed by adding together deconvolved and restored images from all channels.

The residual image per channel can be approximately rewritten from Equation (7) in the image domain, for the case where the aperture illumination functions are identical for all baselines and times, and only one polarization-pair is being imaged:

$\begin{equation} I^{R}_{\nu } = {{ {{P_{\nu }}\times \left(I^{{\rm psf}}_{\nu } \star \left({P_{\nu }}\times I^{{\rm sky}}_{\nu }\right) \right)}} \over {{P_{\nu }}^2} }, \end{equation} \tag{ 12 }$

where $I^{{\rm psf}}_{\nu } = F^{-1}\sum _k \delta _{k,\nu }$ (k ≡ ij, t) is the point-spread function (PSF) for one channel andP_ν = F⁻¹A_ν. In this expression, the division byP_ν² implies a flat-sky³ normalization, but a flat-noise⁴ normalization may be used instead.

This method is straightforward and will suffice for modest imaging dynamic ranges and uncomplicated spatial structure (point sources). However, the angular resolution of the continuum image and any estimate of the sky spectrum will be limited to that of the lowest frequency in the band. Also, reconstruction uncertainties may be inconsistent across frequency when there is insufficientuv-coverage per channel or complicated spatial structure, leading to spurious spectral structure.

In this paper, we are focusing on imaging problems that require more accuracy and dynamic range than what the above offers. In the next two sections, we discuss A-Projection in the context of MFS where the combineduv-coverage is used for model reconstruction (minor cycle).

4.2. Cube Imaging + MFS Deconvolution

The simplest extension of NB A-Projection for MFS is to grid, Fourier transform, and normalize each frequency channel (or sets of channels) independently, and then produce a continuum image by an image-domain accumulation, before the minor cycle. When the sky spectrum is not flat, or when flat-noise normalization is used per channel, a wide-band minor-cycle algorithm such as MT-MFS can be applied to simultaneously solve for the sky intensity and spectrum. Taylor-weighted residual images are constructed as follows, before proceeding to the minor cycle:

$\begin{eqnarray} &&I^{R}_t = \sum _{\nu } {w_{\nu }^t}I^{R}_{\nu } = \sum _{\nu } {w_{\nu }^t}\left\lbrace {{P_{\nu }}\times \left({I_{\nu }}^{{\rm psf}} \star \left({P_{\nu }}\times {I_{\nu }}^{{\rm sky}}\right) \right)} \over {{P_{\nu }}^2} \right\rbrace , \nonumber \\ \end{eqnarray} \tag{ 13 }$

where ${w_{\nu }^t}= { ({\nu -\nu _0}/ {\nu _0} ) }^t$ are weights that represent Taylor polynomial basis functions (Rau & Cornwell 2011). The interpretation of the output Taylor coefficients depends on the choice of normalization as follows.

1.
With flat-sky normalization per channel, the output Taylor coefficients will represent $I^{{\rm sky}}_{\nu }$ .
2.
For flat-sky normalization per channel, but a regularized flat noise minor cycle, the Taylor-weighted residual images can be multiplied by aP_ref after the frequency summation and before deconvolution. The output Taylor coefficients will represent $P_{{\rm ref}} I^{{\rm sky}}_{\nu }$ and a post-deconvolution division byP_ref will be required for the intensity image. No corrections are needed for the spectral index map.
3.
With flat-noise normalization per channel before Taylor-weighted averaging, the output Taylor coefficients will represent $P_{\nu } I^{{\rm sky}}_{\nu }$ , and a post-deconvolution polynomial division of the PB spectrum will be required to correct both the intensity and the spectral index.

Qualitatively, the reverse transform with flat-sky normalization per channel followed by frequency-averaging and a flat-noise regularization byP_ref before the minor cycle, is equivalent to the use of WB A-Projection during gridding. It requires multiple images (one per channel) and fast Fourier transforms (FFTs) and has repeated beam divisions and multiplications that can increase numerical errors. However, it is naturally parallelizable, making it an attractive option for extremely large data sets and imaging goals where inaccuracy in low gain regions of the PB can be tolerated.

4.2.1. MFS Imaging + MFS Deconvolution

To optimize on memory use and FFT costs (especially in a non-parallel imaging run), MFS gridding can be done, where averages over baseline, time, and frequency are accumulated onto a single grid, followed by a single FFT and normalization by an average PB (or its square). Here, attention must be paid to the consequences of averaging over frequency before normalization. The use of ${{\rm adj}\left({\mathsf {A}}(\nu)\right)}$ as the gridding convolution function (the NB A-Projection algorithm), introduces an additional frequency dependenceP_ν that gets averaged over before it can be removed. Once gridded, this extra frequency dependence is locked in, and can be accounted for only as an artificially steeper spectrum in the minor-cycle of the MT-MFS wide-band imaging algorithm.

Multi-term Taylor-weighted residual images must be constructed as follows:

$\begin{equation} I^{R}_t = { { \sum _{\nu } {w_{\nu }^t}\left\lbrace {{P_{\nu }}\times \left({I_{\nu }}^{{\rm psf}} \star \left({P_{\nu }}\times {I_{\nu }}^{{\rm sky}}\right) \right)} \right\rbrace } \over {\sum _{\nu } {P_{\nu }}^2} }, \end{equation} \tag{ 14 }$

and the output Taylor coefficients will depend on the choice of normalization as follows:

1.
For flat-sky normalization, the model intensity represents $I^{{\rm sky}}_{{\rm ref}}$ , but the spectrum represents the product of the sky spectrum and the square of the PB spectrum.
2.
For flat-noise normalization, the model intensity represents $I^{{\rm sky}}_{{\rm ref}} P_{{\rm avg}}$ and the spectrum represents the product of the sky spectrum and the square of the PB spectrum.

In both cases, appropriate wide-band post-deconvolution corrections for the average PB and two instances of the PB spectrum must be applied.

Such corrections are inelegant, and are susceptible to numerical instabilities in low-gain regions of the PB. Section 4.3 shows a comparison of some of these methods with WB A-Projection, for a simulation with source spectra that are not flat and therefore require MT-MFS imaging and deconvolution.

4.3. Comparison of Hybrids with WB A-Projection

To test the algorithm described in Sections 3 and 3.1 with non-flat source spectra, we used the sky brightness distribution as in Figure 2, but assigned a spectral index ofα = −0.5 to all sources such thatI(ν)∝(ν/ν_o)^α.

Figure 3 shows deconvolved images produced with and without time-dependent and frequency-dependent PB corrections during gridding, emphasizing the different types of error patterns that arise when one or more effects are ignored. An image formed from the hybrid method described in Section 4.2.1 to absorb all frequency dependence into the minor cycle solver is also shown for comparison. Figure 4 shows Stokes-I and spectral index values for these point sources after PB correction, to illustrate the accuracy to which different methods are able to recover the true-sky spectral index at various locations in the PB. The various methods tested and results obtained are described below. The rms and peak-residuals from using these algorithms are tabulated in Table 1.

Table 1. Comparison of the Imaging Performance of the Algorithms Described in Section 4.3

Panel in Figure 3	Algorithm	Description	rms	Peak Residual	Comments
Panel in Figure 3	Algorithm	Description	(Jy/beam)	(Jy/beam)	Comments
Top Left	MFS + SI	Standard wide-band imaging	6 × 10⁻⁴	2.3 × 10⁻³	Ignore time & frequency dependence. Artifacts due to time and frequency variations of the PB.
Top Right	MT-MFS + SI	Multi-term imaging with standard gridding	1 × 10⁻⁴	5 × 10⁻⁴	Ignore time dependence. Absorb time-averaged frequency dependence in MT-MFS. Artifacts due to time-variability of the PB.
Lower Left	MT-MFS + A-Projection	Multi-term imaging with NB A-Projection gridding	4 × 10⁻⁵	8 × 10⁻⁴	Account for time variability of PB, and absorb the resulting PB² frequency dependence in MT-MFS. Artifacts due to stronger spectral structure.
Lower Right	MT-MFS + WB A-Projection	Multi-term imaging with wide-band A-Projection gridding	3.5 × 10⁻⁵	2 × 10⁻⁴	Account for PB time- & frequency-dependence in WB A-Projection. Account for static sky-frequency dependence in MT-MFS. Minimal artifacts.

Download table as: ASCII Typeset image

**Figure 4.** Comparison of the accuracy of the PB-corrected intensity (LEFT) and spectral-index (RIGHT) for the five simulated point sources, using the four methods whose results are shown in Figure 3. The labels "`AWP`" and "`WBAWP`" are used for A-Projection and WB A-Projection in the figure. The algorithms compared are MFS+SI, MT-MFS+SI, MT-MFS+A-Projection and MT-MFS+WB A-Projection. Spectral-indices are shown only for methods using MT-MFS, with post-deconvolution (average) spectral-index corrections done for the SI and A-Projection runs. Results for the five sources are shown from left to right with increasing distance from the pointing-center. The reference-PB gain and effective PB-spectral-index at the locations of the five sources are listed on the x-axis. These plots show that outside the HPBW at the reference-frequency, methods that do not account for time-variable PB-spectra have considerably higher errors, and the combination of MT-MFS+WB A-Projection delivers accurate corrections even out in the sidelobe.
Download figure:
Standard image High-resolution image

4.3.1. MFS + SI (Standard Imaging)

The image in the top left panel of Figure 3 is the result of standard Cotton–Schwab CLEAN with MFS gridding using prolate-spheroidal functions as gridding-convolution functions, and a flat-spectrum assumption during the minor cycle:

$\begin{equation} I^{R} = \sum _{\nu } \left\lbrace { \left({I_{\nu }}^{{\rm psf}} \star \left({P_{\nu }}\times {I_{\nu }}^{{\rm sky}}\right) \right)} \right\rbrace. \end{equation} \tag{ 15 }$

Time and frequency variability of both the sky and the instrument are ignored, and for a 66% bandwidth, imaging artifacts around all sources away from the pointing center are dominated by spectral effects due toP_ν present in the data. A post-deconvolution division by an average PB can recover the true source intensity to within a few percent, out to the half-power point of the PB, but errors increase with distance from the pointing center.

4.3.2. MT-MFS+ SI

The image in the top right panel of Figure 3 is the result of the MT-MFS algorithm in the minor cycle, with standard gridding (prolate-spheroidal functions). The minor cycle solves for the average intensity and spectrum ofI(ν)P(ν) using a two-term Taylor-polynomial approximation:

$\begin{equation} I^{R}_t = \sum _{\nu }{w_{\nu }^t}\left\lbrace { \left({I_{\nu }}^{{\rm psf}} \star \left({P_{\nu }}\times {I_{\nu }}^{{\rm sky}}\right) \right)} \right\rbrace. \end{equation} \tag{ 16 }$

Average PB-spectral effects are absorbed into the sky model, and the dominant remaining error is due to the time variability of the PB. A post-deconvolution correction of the continuum intensity and spectral index are accurate to within a few percent in intensity and ±0.1 in spectral index out to approximately the half-power point. Beyond this field of view (FoV), errors increase (to ±0.4 or more in spectral index) primarily because a time-averaged PB spectrum is not a good estimate in regions of the image whereP_ν changes by 100% with time as the beams rotate on the sky.

4.3.3. MT-MFS+ A-Projection

The image in the bottom left panel of Figure 3 is the result of MT-MFS in the minor cycle (two terms), but with the NB A-Projection gridding as described in Section 4.2.1 with a flat-noise normalization before the minor cycle, followed by a post-deconvolution correction of the intensity byP_ref (average PB) and the spectral index by twice that of the PB. Artifacts due to frequency-independent time variability (antenna rotation) no longer exist within the HPBW (PB gain of 0.5), but new spectral artifacts appear away from the pointing center (beginning around the 10% level).

These errors are partly due to the increased nonlinearity of theP²(ν) spectrum away from the pointing center, for which a two-term Taylor-polynomial approximation is insufficient. A run with three terms partially reduces this problem, indicating that errors in approximating the combined spectrum with a low-order polynomial dominates the errors, but higher order polynomials are inadvisable because of instability in low-S/N regions.

Errors also arise from the high time variability of the PB-spectrum, which is ignored because only time-averaged spectra are used for spectral-correction. A post-deconvolution correction of the spectral-index map forP²(ν) results in errors at the ±0.3 level beyond the ∼50% point.

4.3.4. MT-MFS + WB A-Projection

The image in the bottom right panel of Figure 3 is the result of MT-MFS in the minor cycle (two terms), and WB A-Projection gridding (Section 3). Artifacts around all sources are gone, and the dominant errors are numerical (at the floating-point precision level). The spectral-index map produced by MT-MFS is accurate to within 0.01 in the main lobe, and 0.05 out in the sidelobe. Such accuracies allows the recovery of source-spectra further-out in the PB than previously possible. The main difference between this method and all others, is that time and frequency variability of the PB has been corrected for in the data domain, before any averaging is done to construct a continuum image to send to the minor cycle. The minor cycle sees a flat-noise normalization, preserving the convolution-equation and allowing for deeper "cleaning" before triggering the next major cycle (i.e., faster convergence).

This method shows the lowest errors in Figure 4, indicating that if the PB can be accurately modeled, its time and frequency variability can be corrected for during gridding, resulting in an accurate reconstruction in the minor cycle.

4.4. Imaging Results with VLA L-band Data

Figure 5 shows the continuum intensity and spectral index distribution of the G55.7+3.4 Galactic supernova remnant (SNR), using the VLA in L-band and D-configuration and made with and without the WB A-Projection algorithm. The peak brightness is 6 mJy, and with an off-source rms of 11μJy, this is a modest dynamic range. The peak brightness comes from a background pulsar with a known spectral index of −2.3, and the brighter synchrotron-emission filaments are at the 1 mJy level. The half-power beam width (HPBW) of the PB is 30 arcmin, and extended emission from the SNR fills the PB at the reference frequency. The spurious spectral index at the HPBW due to the PB variation between 1 and 2 GHz is approximately −1.4.

The left column of panels in Figure 5 shows continuum intensity, and the right column shows corresponding spectral index maps.

1.
The top row shows flat-noise results with MTMFS+SI, where A-Projection was not used, and PB correction was not done. There is considerable artificial steepening (darkening on the plot) of the observed source spectrum as distance from the pointing center increases. The spectral index of the bright background pulsar is −3.05.
2.
The middle row shows flat-sky results from the same run as above, where MTMFS+SI was followed by a post-deconvolution wide-band PB correction of both the intensity and the spectrum. The spectral index map shows that this post-deconvolution correction has restored the spectral indices of the outer part of the SNR as well as the background sources to more realistic values. The spectral index of the bright background pulsar is −2.61 (after correction for an average estimated PB spectral index of −0.44 at the 0.8 gain level).
3.
The bottom row shows flat-sky results from an MTMFS+WBAWP run where the intensity image has been corrected for the average PB gain, but the spectral index image is just what came out of the imaging run. The noise properties of the intensity image are slightly better than the middle row, and the spectral index map shows slightly more coherent and less noisy structure across the SNR. The spectral index of the bright background pulsar is −2.29, which is the closest so far to the expected value.

5. CONCLUSIONS

In this paper we describe the WB A-Projection algorithm, which extends the NB A-Projection algorithm (Bhatnagar et al. 2008) to correct for the wide-band effects of the PB prior to integration in time and frequency for continuum imaging. We demonstrate that the combined WB A-Projection and MT-MFS algorithm for simultaneous intensity and spectral index mapping performs as expected.

Theoretical analysis in Section 2.1.2 draws equivalence between standard antenna complex gain and bandpass calibration and the NB A-Projection and WB A-Projection algorithms, respectively, and show that the latter two algorithms are the DD generalization of the former DI algorithms. The A-term of the A-Projection algorithm represents the DD complex antenna gain pattern in the data domain. Since it is DD, corrections for it fundamentally cannot be decoupled from imaging and must be corrected for during imaging (see Rau et al. 2009). The A-Projection algorithm, which corrects for the DD antenna gains during imaging, therefore can be thought of as an algorithm for DD calibration. Similarly, the WB A-Projection algorithm, which includes corrections for the frequency dependence of the A-term, can be thought of as the DD generalization of the standard bandpass calibration algorithm. We feel that making these connections with simpler, intuitively better understood and widely used algorithms in the community makes it easier to understand the newer more general techniques.

We also analyzed hybrid schemes for wide-band imaging using the NB A-Projection and image-plane correction for the effects of PB. Our conclusion is that while for non-parallel implementation, WB A-Projection is required for wide-band imaging, for implementations that may require partitioning the data along time and/or frequency axis, hybrid approaches are also sufficient. However the use of WB A-Projection in implementations on parallel processing platforms allows the freedom to tune the distribution of the data to suite the available hardware and computing resources (e.g., this allows imaging smaller chunks of the total bandwidth in parallel, even if these smaller chunks need wide-band PB corrections. Without WB A-Projection, the data distribution is restricted to be partitioned in frequency such that each chunk can be imaged using NB A-Projection).

Comparisons show that the WB A-Projection plus MT-MFS enables simultaneous intensity and spectral index imaging throughout the PB in wide-field imaging. Moderate dynamic range imaging within the half-power point of the PB is possible where all frequency dependence in the image is absorbed in the solution of the MT-MFS algorithm. Beyond this FoV, errors increase because a time-averaged PB spectrum is not a good estimate in regions of the image where PB changes by 100% with time as the beams rotate on the sky. The FoV can be increased until the ∼10% point of the PB by combining MT-MFS with NB A-Projection. The time-dependence of the PB is accounted for via A-Projection and its frequency dependence is absorbed in MT-MFS. Using larger number of Taylor-terms in MT-MFS improves the imaging performance for simpler fields, but is inadvisable because of instability in low-S/N regions.

Finally, we would like to note that while only the effects of the antenna PB were included in the A-term used in this paper, other antenna-based DD effect can also be easily included. The effect of non-isoplanatic ionospheric/atmospheric phases is comparable to the effect of PB for wide-band, wide-field imaging at low frequencies, particularly with aperture-array antenna elements. Similar effects come from the irregularities in the water vapor content in the lower atmosphere for imaging at high frequencies. The effects due to ionosphere/atmosphere and PB need to be corrected simultaneously, often for wide-band data in full polarization. It may be possible to extend the WB A-Projection algorithm presented here to include corrections for ionospheric effects. Work to test these extensions is underway and will be reported in future publications.

This work was done using the R&D branch of the CASA code base. We thank the CASA Group for the underlying libraries. We thank T. J. Cornwell for his useful and detailed comments/suggestions as a referee. Part of this work was funded by the ALBiUS work package of the European Commission Radionet FP7 program. The National Radio Astronomy Observatory is a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc.

WIDE-FIELD WIDE-BAND INTERFEROMETRIC IMAGING: THE WB A-PROJECTION AND HYBRID ALGORITHMS

Article metrics

Permissions

Author e-mails

Author affiliations

Dates

ABSTRACT

1. INTRODUCTION

2. THEORY

2.1. Imaging with A-Projection

2.1.1. Algorithmic Steps for A-Projection

2.1.2. A-Projection: A Direction-dependent Gain Correction Algorithm

3. THE WB A-PROJECTION ALGORITHM

3.1. The Wide-band ${\mathsf {A}}$ Operator

3.2. The Conjugate Frequency

3.3. Algorithm Validation

4. APPLYING NB A-PROJECTION FOR WIDE-BAND IMAGING

4.1. Cube Imaging + Cube Deconvolution

4.2. Cube Imaging + MFS Deconvolution

4.2.1. MFS Imaging + MFS Deconvolution

4.3. Comparison of Hybrids with WB A-Projection

4.3.1. MFS + SI (Standard Imaging)

4.3.2. MT-MFS+ SI

4.3.3. MT-MFS+ A-Projection

4.3.4. MT-MFS + WB A-Projection

4.4. Imaging Results with VLA L-band Data

5. CONCLUSIONS

Footnotes

WIDE-FIELD WIDE-BAND INTERFEROMETRIC IMAGING: THE WB A-PROJECTION AND HYBRID ALGORITHMS

Article metrics

Permissions

Share this article

Author e-mails

Author affiliations

Dates

ABSTRACT

1. INTRODUCTION

2. THEORY

2.1. Imaging with A-Projection

2.1.1. Algorithmic Steps for A-Projection

2.1.2. A-Projection: A Direction-dependent Gain Correction Algorithm

3. THE WB A-PROJECTION ALGORITHM

3.1. The Wide-band {\mathsf {A}} Operator

3.2. The Conjugate Frequency

3.3. Algorithm Validation

4. APPLYING NB A-PROJECTION FOR WIDE-BAND IMAGING

4.1. Cube Imaging + Cube Deconvolution

4.2. Cube Imaging + MFS Deconvolution

4.2.1. MFS Imaging + MFS Deconvolution

4.3. Comparison of Hybrids with WB A-Projection

4.3.1. MFS + SI (Standard Imaging)

4.3.2. MT-MFS+ SI

4.3.3. MT-MFS+ A-Projection

4.3.4. MT-MFS + WB A-Projection

4.4. Imaging Results with VLA L-band Data

5. CONCLUSIONS

Footnotes

3.1. The Wide-band ${\mathsf {A}}$ Operator