Action-based Dynamical Modeling for the Milky Way Disk: The Influence of Spiral Arms

Wilma H. Trick; Jo Bovy; Elena D’Onghia; Hans-Walter Rix

doi:10.3847/1538-4357/aa67db

1. Introduction

An important basis for learning more about the Milky Way's (MW) overall gravitational potential and orbit distribution function (DF) is to find the "best possible" axisymmetric model for the Galaxy. Given such a model, the identification and characterization of non-axisymmetries like spiral arms or stellar streams in stellar phase-space (and chemical abundance) data would then become more straightforward.

Several approaches to constrain an axisymmetric potential and/or orbit DF have recently been put forward: Bovy & Rix (2013) and Piffl et al. (2014) fitted potential and DF simultaneously to stellar kinematics in the disk and got precise constraints on the overall potential; Sanders & Binney (2015) and Das & Binney (2016) investigated extended DFs for the disk and halo, respectively (given a fiducial potential), that included the metallicity of each star, in addition to the distribution in orbit space.

In this work, we will continue our investigation of the RoadMapping approach ("Recovery of the Orbit Action Distribution of Mono-Abundance Populations and Potential INference for our Galaxy"). The first application of RoadMapping was performed by Bovy & Rix (2013). Trick et al. (2016), hereafter Paper I, subsequently performed a detailed analysis of the strengths and limitations of the approach. RoadMapping presumes that simple stellar populations in the MW disk—be it mono-abundance populations (MAPs), i.e., stars with the same $[\mathrm{Fe}/{\rm{H}}]$ and $[\alpha /\mathrm{Fe}]$ (Bovy et al. 2012b, 2012c, 2012d, 2016), or maybe also mono-age populations (Bird et al. 2013; Martig et al. 2014, 2016; Minchev et al. 2014, 2017; Ness et al. 2016)—follow simple orbit DFs, like, e.g., the quasi-isothermal DF (qDF) by Binney & McMillan (2011). MAPs in the MW are well described by the qDF, which was first shown by Ting et al. (2013). The qDF is expressed in terms of the orbital actions ${\boldsymbol{J}}=({J}_{R},{J}_{\phi }={L}_{z},{J}_{z})$ , which are integrals of motion, quantify the amount of the orbit's oscillation in each of the coordinate directions $(R,\phi ,z)$ , and are therefore excellent orbit labels. Given an assumed gravitational potential, one can calculate the orbital actions from the stars' current phase-space positions $({\boldsymbol{x}},{\boldsymbol{v}})$ (Binney 2012; Sanders & Binney 2016; see also Bovy 2014 for a general method to compute $({\boldsymbol{x}},{\boldsymbol{v}})\longrightarrow ({\boldsymbol{J}},{\boldsymbol{\theta }})$ ). Only if this assumed gravitational potential is close to the true potential, the action distribution of the stellar MAP in question will follow an orbit DF of qDF-shape. This is the idea on which RoadMapping builds, and which allows us to simultaneously fit potential and orbit DF to observations.

Bovy & Rix (2013) employed this approach to measure the MW's surface density profile within $1.1\,\mathrm{kpc}$ using 43 MAPs in the Galactic disk from the SDSS/SEGUE survey (Yanny et al. 2009). To avoid spiral arm effects, they did not use in-plane motions. Their potential model had only two free parameters (disk scale length and relative halo-to-disk contribution to the radial force at the solar radius). To account for missing model flexibility, they constrained the surface density for each MAP only at one best radius. The profile they derived in this fashion had a scale length of ${R}_{{\rm{s}}}=2.5\,\mathrm{kpc}$ and was—in the regime of $R\gt 6.6\,\mathrm{kpc}$ —later confirmed by Piffl et al. (2014) using a different action-based procedure.

Given the success of this first application and in anticipation of the upcoming data releases (DRs) from Gaia in 2016–2022 (Eyer et al. 2013), Paper I improved the RoadMapping machinery and studied its strengths and breakdowns in detail, by investigating a large suite of mock data sets. Under the prerequisite of axisymmetric data and model, we found that RoadMapping's modeling success is stable against minor misjudgments of DF or selection function, and that—if the true potential is not contained in the proposed family of model potentials—one can still find a good fit that returns the correct forces, given the limitations of the model. Paper I also found that measurement uncertainties of the order of those by the final Gaia DR should be good enough (within $3\,\mathrm{kpc}$ from the Sun) to allow for precise and unbiased modeling results.

The MW is, however, not axisymmetric. The bulge contains a strong bar (Liszt & Burton 1980; Binney et al. 1991, 1997; Blitz & Spergel 1991; Hammersley et al. 2000; Wegg & Gerhard 2013) and the disk itself is threaded by spiral arms (Oort et al. 1958; Georgelin & Georgelin 1976; Churchwell et al. 2009; Reid et al. 2014), and (ring-like) overdensities (Newberg et al. 2002; Jurić et al. 2008; Xu et al. 2015), which induce non-circular motions and asymmetries in stellar number counts. There is also kinematic evidence in the disk for moving groups (Dehnen 1998; Famaey et al. 2005; Bovy et al. 2009; Bovy & Hogg 2010) and streaming motions (in 21 cm or velocities; Siebert et al. 2012; Williams et al. 2013; Bovy et al. 2015), both of which are likely caused by non-axisymmetric perturbations to the gravitational potential.

As RoadMapping and related approaches can only build axisymmetric models, this is an important breakdown of modeling assumptions, which was not investigated in Paper I. In this paper, we want to understand in which respects RoadMapping will still give reliable constraints on the MW's gravitational potential in the presence of spiral arms.

Our investigation makes use of an N-body simulation snapshot of a spiral galaxy with strong spiral arms presented in D'Onghia et al. (2013; see their Figure 8, top left panel). From this snapshot, we draw mock data in regions with different spiral arm strengths. We then apply the RoadMapping machinery to these data sets and test how well we recover the local and overall gravitational potential.

In Paper I, we confirmed and tested separately the robustness of RoadMapping in the case that the data came from a different model family—for either the potential or DF—than assumed in the dynamical modeling. What would happen if both potential and DF model families were slightly wrong at the same time? The setup of this study will automatically cover this important test case. The potential and orbit DF model that we are using were picked as a pragmatic compromise between (1) being a reasonable choice given the initial axisymmetric setup of the galaxy simulation and (2) because of their simplicity, computational advantages, and—in case of the qDF—because that is what we are planning to use in the MW. Given that the simulation has evolved away from its axisymmetric beginnings, we expect our chosen model to be reasonable, but not particularly well-suited to model this galaxy.

Spiral arms introduce another—but possibly minor—breakdown of the modeling assumptions: in a non-axisymmetric gravitational potential, the three actions will not be strict integrals of motions anymore (Binney & Tremaine 2008; Minchev et al. 2011, 2012; Grand et al. 2012; Solway et al. 2012; Vera-Ciro & D'Onghia 2016). It will be interesting to see if the action-based DF in RoadMapping modeling is still informative, even if it only uses the approximate actions estimated in an axisymmetric potential.

In the MW, we expect the central bar to introduce additional non-axisymmetries in the Galactic disk—but presumably not stronger ones than the spiral arms. Because the galaxy simulation in this work does not have a central bar, we do not investigate specific bar effects here.

Though non-axisymmetry could be a severe problem for RoadMapping, we show in this paper that RoadMapping potential estimates are still surprisingly accurate, which makes us optimistic that they will also be so for the MW.

This paper is organized as follows. Section 2 describes the N-body simulation snapshot of a spiral galaxy that we are going to model in this study, explains how we extract 6D stellar phase-space data from it, and how we quantify the spiral arm strength. There we also review similarities and differences between the simulation in this work and what we know about the MW. Section 3 summarizes the RoadMapping dynamical modeling framework, and introduces the DF and potential model that we will fit to the data. Section 4 is dedicated to presenting the results: in Section 4.1, we discuss in detail the RoadMapping modeling results derived from one data set within a survey volume with radius ${r}_{\max }=4\,\mathrm{kpc}$ around the Sun. Section 4.2 then investigates a whole suite of RoadMapping analyses, corresponding to survey volumes of different sizes and different positions within the galaxy and with respect to the spiral arms. In Section 5, we discuss the results and give an outlook to the application of RoadMapping to Gaia data. We conclude in Section 6.

2. Data from a Galaxy Simulation

RoadMapping requires 6D phase-space coordinates $({{\boldsymbol{x}}}_{i},{{\boldsymbol{v}}}_{i})$ for a large set of stars that move independently in a collisionless galactic potential. If we want to test RoadMapping on a simulated galaxy, it is most convenient to apply it to an N-body simulation with a huge number of low-mass "star" particles. In that way, we can directly take the positions and the motions of individual particles as independent tracers of the potential, just as with the stars in the MW, without having to use an error-prone prescription to turn a single particle into many stars. The high-resolution simulations with its millions of particles by D'Onghia et al. (2013) satisfy this requirement.

2.1. Description of the Galaxy Simulation Snapshot

The high-resolution N-body simulation snapshot of a disk galaxy by D'Onghia et al. (2013), which we use in this work, was carried out with the GADGET-3 code, and set up in the manner described in Springel et al. (2005). In this simulation, overdensities with properties similar to giant molecular clouds induced four prominent spiral arms—and therefore a non-axisymmetric substructure—via the swing amplification mechanism. This galaxy simulation was also investigated by D'Onghia et al. (2013; their Figure 8, top left panel) and D'Onghia (2015; their Figure 2, top left panel). For details, see D'Onghia et al. (2013). Here, we summarize the essential characteristics.

The simulation has a gravitationally evolving stellar disk within a static/rigid analytic dark matter (DM) halo.

The analytic halo follows a Hernquist (1990) profile

$\begin{eqnarray}&&{\rho }_{\mathrm{dm}}(r)=\displaystyle \frac{{M}_{\mathrm{dm}}}{2\pi }\displaystyle \frac{{a}_{\mathrm{dm}}}{r{(r+{a}_{\mathrm{dm}})}^{3}}\end{eqnarray} \tag{ 1 }$

with total halo mass ${M}_{\mathrm{dm}}=9.5\times {10}^{11}\,{M}_{\odot }$ and scale length ${a}_{\mathrm{dm}}=29\,\mathrm{kpc}$ .

The disk consists of 10⁸ "disk star" particles, each having a mass of $\sim 370\,{M}_{\odot }$ , and 1000 "giant molecular cloud" particles with masses of $\sim 9.5\times {10}^{5}\,{M}_{\odot }$ . The initial vertical mass distribution of the stars in the disk is specified by the profile of an isothermal sheet with a radially constant scale height ${z}_{{\rm{s}},\mathrm{init}}$ , i.e.,

$\begin{eqnarray}&&{\rho }_{* }(R,z)=\displaystyle \frac{{M}_{* }}{4\pi {z}_{{\rm{s}},\mathrm{init}}{R}_{{\rm{s}}}^{2}}{\rm sech }^{2}\left(\displaystyle \frac{z}{{z}_{{\rm{s}},\mathrm{init}}}\right)\exp \left(-\displaystyle \frac{R}{{R}_{{\rm{s}}}}\right),\end{eqnarray} \tag{ 2 }$

with a total disk mass of ${M}_{* }=0.04{M}_{\mathrm{dm}}=3.8\times {10}^{10}\,{M}_{\odot }$ . The scale length ${R}_{{\rm{s}}}$ is assumed to be $2.5\,\mathrm{kpc}$ and ${z}_{{\rm{s}},\mathrm{init}}=0.1{R}_{{\rm{s}}}$ . In this model, the disk fraction within 2.2 scale lengths is 50% of the total mass, leading to a formation of approximately four arms (D'Onghia 2015) (see Figure 2(b)).

The bulge consists of 10⁷ "bulge star" particles with masses of $\sim 950\,{M}_{\odot }$ and they are distributed following a spherical Hernquist profile analogous to Equation (1), with a total mass of ${M}_{\mathrm{bulge}}\,=0.01{M}_{\mathrm{dm}}=9.5\times {10}^{9}\,{M}_{\odot }$ and a scale length of ${a}_{\mathrm{bulge}}\,=0.1{R}_{{\rm{s}}}=0.25\,\mathrm{kpc}$ .

The initial velocity setup of the "disk star" particles assumes for simplicity Gaussian velocity dispersion profiles (Springel et al. 2005).

The simulation snapshot, which we are using in this work, has evolved from these initial conditions in isolation for $\sim 250\,\mathrm{Myr}$ , which corresponds to approximately one orbital period at $R\sim 8\,\mathrm{kpc}$ . The mass density of simulation particles (without the DM halo) at this snapshot time is shown in Figure 1. The "molecular cloud perturbers," which caused the formation of the four pronounced spiral arms, can be seen in Figure 1 as small overdensities in the disk. The spherical bulge and very flattened disk are shown in Figure 1(b).

**Figure 1.** Simulation snapshot by D'Onghia et al. (2013). Shown are the surface mass density (in the (x, y)-plane, panel (a)) and mass density (in the (R, z)-plane, panel (b)) of the "star" particles belonging to disk, bulge, and giant molecular clouds (the dark matter halo in this simulation is static and analytic and not shown here). Overplotted are the disk's scale length ${R}_{{\rm{s}}}=2.5\,\mathrm{kpc}$ (see Section 2.1) and the radii at which we center our test survey volumes in this investigation, ${R}_{0}=8\,\mathrm{kpc}$ and ${R}_{0}=5\,\mathrm{kpc}$ (see Section 2.2). The centers of the different survey volumes are marked with a square, if the survey volume is centered on a spiral arm (`S8` and `S5`), or with a circle, if the volume is centered on an inter-arm region (`I8` and `I5`). The coordinates are summarized in Table 1. The orange circle with a radius of ${r}_{\max }=4\,\mathrm{kpc}$ marks the survey volume in which we conduct the analysis discussed in detail in Section 4.1.
Download figure:
Standard image High-resolution image

**Figure 1.** Simulation snapshot by D'Onghia et al. (2013). Shown are the surface mass density (in the (x, y)-plane, panel (a)) and mass density (in the (R, z)-plane, panel (b)) of the "star" particles belonging to disk, bulge, and giant molecular clouds (the dark matter halo in this simulation is static and analytic and not shown here). Overplotted are the disk's scale length ${R}_{{\rm{s}}}=2.5\,\mathrm{kpc}$ (see Section 2.1) and the radii at which we center our test survey volumes in this investigation, ${R}_{0}=8\,\mathrm{kpc}$ and ${R}_{0}=5\,\mathrm{kpc}$ (see Section 2.2). The centers of the different survey volumes are marked with a square, if the survey volume is centered on a spiral arm (`S8` and `S5`), or with a circle, if the volume is centered on an inter-arm region (`I8` and `I5`). The coordinates are summarized in Table 1. The orange circle with a radius of ${r}_{\max }=4\,\mathrm{kpc}$ marks the survey volume in which we conduct the analysis discussed in detail in Section 4.1.
Download figure:
Standard image High-resolution image

We have confirmed that the gravitational center of the particles corresponds to the coordinate origin.

2.2. Data Selection and Survey Volume

The selection function of all-sky surveys like Gaia, that are only limited by the brightness of the tracers, are contiguous and—when ignoring anisotropic effects like dust obscuration—spherical in shape. For simplicity, we will use spherical survey volumes centered on different vantage points, and with sharp edges at a distance ${r}_{\max }$ around it (see Equation (8)), which corresponds to a magnitude cut for stellar tracers all having the same luminosity.

Figure 1 illustrates the different survey volume positions analyzed in this study. We selected volumes with ${r}_{\max }=[0.5,1,2,3,4,5]\,\mathrm{kpc}$ centered on a spiral arm (S) and on an inter-arm region (I) at both the equivalent of the solar radius, ${R}_{0}=8\,\mathrm{kpc}$ , and at ${R}_{0}=5\,\mathrm{kpc}$ , where the disk strongly dominates (see Figure 3), and the spiral arms are more pronounced than at ${R}_{0}=8\,\mathrm{kpc}$ (see Figure 2). The exact positions of the vantage points S8, I8, S5, and I5 are summarized in Table 1.

**Figure 2.** Demonstrating the spiral arm strength at different radii. Panel (a) shows the surface density along the azimuth angle at the radii R₀ on which we center our survey volumes. We also mark the corresponding ${\phi }_{0}$ from Table 1. The difference between the surface density at `S5` () and `I5` () is 200% of the mean surface density at R₀; for `S8` () and `I8` () the difference is 130%. Panel (b) shows the Fourier model amplitudes for $m=2,4,6,8$ calculated as ${A}_{m}/{A}_{0}=| {\sum }_{l}{M}_{l}\exp ({im}{\phi }_{l})| /{\sum }_{l}{M}_{l}$ for all disk particles at a given radius with mass M_l and azimuth position ${\phi }_{l}$ . As can be seen, the simulation has overall four strong spiral arms, dominating between $R=4\,\mathrm{kpc}$ and $R=7\,\mathrm{kpc}$ . Inside of that there are two, and outside of that six or more arms. As the particle density increases with smaller radius, most tracers in the analysis will come from regions with only a few strong spiral arms.
Download figure:
Standard image High-resolution image

**Figure 2.** Demonstrating the spiral arm strength at different radii. Panel (a) shows the surface density along the azimuth angle at the radii R₀ on which we center our survey volumes. We also mark the corresponding ${\phi }_{0}$ from Table 1. The difference between the surface density at `S5` () and `I5` () is 200% of the mean surface density at R₀; for `S8` () and `I8` () the difference is 130%. Panel (b) shows the Fourier model amplitudes for $m=2,4,6,8$ calculated as ${A}_{m}/{A}_{0}=| {\sum }_{l}{M}_{l}\exp ({im}{\phi }_{l})| /{\sum }_{l}{M}_{l}$ for all disk particles at a given radius with mass M_l and azimuth position ${\phi }_{l}$ . As can be seen, the simulation has overall four strong spiral arms, dominating between $R=4\,\mathrm{kpc}$ and $R=7\,\mathrm{kpc}$ . Inside of that there are two, and outside of that six or more arms. As the particle density increases with smaller radius, most tracers in the analysis will come from regions with only a few strong spiral arms.
Download figure:
Standard image High-resolution image

Table 1. Vantage Points for Survey Volumes within the Galaxy Simulation Snapshot

Name	Position	R₀ (kpc)	${\phi }_{0}$ (°)
`S8`	on spiral arm	8	5
`I8`	in inter-arm region	8	−15
`S5`	on spiral arm	5	60
`I5`	in inter-arm region	5	0

Note. All spherical survey volumes have a radius r_max and are centered on ${z}_{0}=0$ in the plane of the disk at the position given in this table. ${\phi }_{0}$ is measured counter-clockwise from the positive x-coordinate axis.

Download table as: ASCII Typeset image

From within each volume, we drew ${N}_{* }\,=$ 20,000 random "disk star" particles, and used their phase-space positions $({{\boldsymbol{x}}}_{i},{{\boldsymbol{v}}}_{i})$ within the simulated galaxy's rest-frame as data.

To make the data sample more realistic, one would actually have to add measurement uncertainties, especially to the distances from the survey volume's central vantage point and the proper motions measured from there. We decided not to include measurement uncertainties. First, their effect on RoadMapping modeling has already been investigated in Paper I, and we found that the measurement uncertainties of the last DR of Gaia should be small enough to not significantly disturb the modeling. Second, in this study, we want to isolate and investigate the deviations of the data from axisymmetry and the assumed potential and DF model independently of other effects.

2.3. True Symmetrized Potential

For a galaxy with pronounced spiral arms, an axisymmetric model matter distribution per se cannot reproduce the true matter distribution globally. We therefore obtain an "overall best-fit symmetrized" potential model from the distribution of particles to be able (1) to quantify the non-axisymmetries in the simulation snapshot better and (2) to compare how close our axisymmetric RoadMapping results can get to it.

We derive this model by fitting axisymmetric analytical functions to the density distribution of each of the galaxy components' particles. The bulge and halo follow Hernquist profiles by construction (see Section 2.1).

The disk in this simulation snapshot deviates from its initial conditions in Equation (2): after $250\,\mathrm{Myr}$ pronounced spiral arms have formed, also causing some in-plane heating. Except of an overdensity around $R\sim 6\,\mathrm{kpc}$ (see Figure 6(d) in Section 4.1.2), the overall radial surface density profile (i.e., the azimuthal average) did not change by much, and appears smooth and exponential. We therefore chose a double exponential disk model to fit the particle distribution in the disk. The fit assumes the total disk mass to be known and to be equal to the total mass of all disk particles. The best-fit double exponential disk profile is found by maximizing the likelihood for all disk particles to be drawn from this axisymmetric density profile. In this way, the fit does not depend on binning choices and is driven by the number and location of the stars—analogous to our RoadMapping procedure.

The best-fit parameters for this reference potential, which we will refer to as the DEHH-Pot (Double Exponential disk + Hernquist halo + Hernquist bulge) in the remainder of this work, are given in Table 2. As can be seen in Figure 6 in Section 4.1.2 below, the DEHH-Pot fits the overall true density distribution very well, with the exception of $z\sim 0$ , where the particle distribution is not as cuspy as the exponential disk. Figure 3 shows the circular velocity curve of the DEHH-Pot, and its decomposition into disk, halo, and bulge contribution. The disk clearly dominates between $R\sim 2\,\mathrm{kpc}$ and $R\sim 7\,\mathrm{kpc}$ .

**Figure 3.** Circular velocity curve of the `DEHH-Pot`, i.e., the symmetrized best fit to the N-body simulation, and its disk, halo, and bulge components. The rotational support at 2.2 scale lengths is ${({v}_{\mathrm{circ},\mathrm{disk}}/{v}_{\mathrm{circ},\mathrm{total}})}^{2}\sim 47 \%$ . This demonstrates that the simulation is a disk-dominated spiral galaxy.
Download figure:
Standard image High-resolution image

${({v}_{\mathrm{circ},\mathrm{disk}}/{v}_{\mathrm{circ},\mathrm{total}})}^{2}\sim 47 \% $ — **Figure 3.** Circular velocity curve of the `DEHH-Pot`, i.e., the symmetrized best fit to the N-body simulation, and its disk, halo, and bulge components. The rotational support at 2.2 scale lengths is ${({v}_{\mathrm{circ},\mathrm{disk}}/{v}_{\mathrm{circ},\mathrm{total}})}^{2}\sim 47 \%$ . This demonstrates that the simulation is a disk-dominated spiral galaxy.
Download figure:
Standard image High-resolution image

Table 2. Best-fit Parameters of the DEHH-Pot

Circular velocity	${v}_{\mathrm{circ}}({R}_{\odot })$	$222\,\mathrm{km}\ {{\rm{s}}}^{-1}$
Disk scale length	${R}_{{\rm{s}}}$	$2.5\,\mathrm{kpc}$
Disk scale height	${z}_{{\rm{s}}}$	$0.17\,\mathrm{kpc}$
Halo fraction	${f}_{\mathrm{halo}}$	0.54
Halo scale length	${a}_{\mathrm{halo}}$	$29\,\mathrm{kpc}$
Bulge mass	${M}_{\mathrm{bulge}}$	$0.95\times {10}^{10}\,{M}_{\odot }$
Bulge scale length	${a}_{\mathrm{bulge}}$	$0.25\,\mathrm{kpc}$

Note. The DEHH-Pot is introduced in Section 2.3, and we use it as the global best-fit symmetrized potential model for the simulated galaxy. The halo fraction, ${f}_{\mathrm{halo}}$ , and circular velocity at the "solar" radius, ${v}_{\mathrm{circ}}({R}_{\odot })$ , which scales the total mass of the model, are defined in Equations (18) and (19), with ${R}_{\odot }=8\,\mathrm{kpc}$ .

Download table as: ASCII Typeset image

2.4. Quantifying the Strength of Spiral Arms

Depending on the size and position of the survey volume, spiral arms and inter-arm regions dominate the stellar distribution within the volume to different degrees. To quantify the strength of the spiral arms, we introduce the quantity

$\begin{eqnarray}&&{{\rm{\Delta }}}_{\mathrm{Spiral}}({x}_{k},{y}_{k})\equiv \displaystyle \frac{{{\rm{\Sigma }}}_{1.5\mathrm{kpc},\mathrm{disk},T}({x}_{k},{y}_{k})}{{{\rm{\Sigma }}}_{1.5\mathrm{kpc},\mathrm{disk},S}({x}_{k},{y}_{k})}-1\end{eqnarray} \tag{ 3 }$

where ${{\rm{\Sigma }}}_{1.5\mathrm{kpc},\mathrm{disk},\alpha }$ is the true surface density of the disk component of the simulation snapshot ( $\alpha =T$ for "true"), or of the symmetrized snapshot model DEHH-Pot in Section 2.3 ( $\alpha =S$ for "symmetrized"),

$\begin{eqnarray}&&{{\rm{\Sigma }}}_{1.5\mathrm{kpc},\mathrm{disk},\alpha }({x}_{k},{y}_{k})\equiv {\int }_{-1.5\,\mathrm{kpc}}^{1.5\,\mathrm{kpc}}{\rho }_{\mathrm{disk},\alpha }({x}_{k},{y}_{k},z)\ {dz}.\end{eqnarray} \tag{ 4 }$

$({x}_{k},{y}_{k})$ are the coordinates of regular grid points with a spacing⁵ of $\delta =0.25\,\mathrm{kpc}$ . $({x}_{c}={R}_{0,c}\cdot \cos {\phi }_{0,c},$ ${y}_{c}={R}_{0,c}\,\cdot \sin {\phi }_{0,c},{z}_{c}=0)$ is the position of the survey volume's center within the simulation, with $c\in \{{\mathtt{S}}{\mathtt{8}},{\mathtt{I}}{\mathtt{8}},{\mathtt{S}}{\mathtt{5}},{\mathtt{I}}{\mathtt{5}}\}$ and $({R}_{0,c},{\phi }_{0,c})$ given in Table 1. We consider all $n\simeq \pi {r}_{\max }^{2}/{\delta }^{2}$ values of ${{\rm{\Delta }}}_{\mathrm{Spiral}}({x}_{k},{y}_{k})$ inside a given survey volume of radius ${r}_{\max }$ around position c and calculate the mean and standard deviation,

$\begin{eqnarray}&&\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle \equiv \displaystyle \frac{1}{n}\displaystyle \sum _{k=1}^{n}{{\rm{\Delta }}}_{\mathrm{Spiral}}({x}_{k},{y}_{k}),\end{eqnarray} \tag{ 5 }$

$\begin{eqnarray}&&{\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}\equiv \sqrt{\displaystyle \frac{1}{n}\displaystyle \sum _{k=1}^{n}{[{{\rm{\Delta }}}_{\mathrm{Spiral}}({x}_{k},{y}_{k})-\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle ]}^{2}},\end{eqnarray} \tag{ 6 }$

$\begin{eqnarray}&&\mathrm{with}\ {({x}_{k}-{x}_{c})}^{2}+{({y}_{k}-{y}_{c})}^{2}\leqslant {r}_{\max }^{2}.\end{eqnarray} \tag{ 7 }$

These quantities tell us if and how much a spiral arm or an inter-arm region dominates the survey volume ( $\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle \gt 0$ for spiral arms, $\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle \lt 0$ for inter-arm regions) and how large the relative contrast between spiral arms and inter-arm regions is ( ${\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}$ ). For example, volumes will have a smaller relative spiral contrast ${\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}$ , if they are either small and sitting completely within an inter-arm region, or if they are large volumes that contain—in addition to some spiral arms and depleted inter-arm regions—large areas of unperturbed disk.

Figure 4 shows ${{\rm{\Delta }}}_{\mathrm{Spiral}}$ as a function of $({x}_{k},{y}_{k})$ , a histogram over all ${{\rm{\Delta }}}_{\mathrm{Spiral},k}$ within the galaxy, and $\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle$ and ${\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}$ calculated for all test survey volumes in this work (see Section 2.2), depending on position and size.⁶

2.5. Comparison of the N-body Simulation to the MW

The N-body simulation in this work is not a perfect match to the MW. However, to be suitable for this study it is most important that it satisfies our main requirements: it is a disk-dominated spiral galaxy with a very high number of star particles and strong spiral arms. To set the context, we discuss in the following the differences between the simulation at hand and what we know about the MW.

Bar—The MW has a central bar; our simulation does not. The Galactic bar can introduce non-axisymmetries around the co-rotation and Lindblad resonances (Dehnen 2000; Fux 2001; Quillen 2003; Quillen & Minchev 2005; Minchev & Famaey 2010; Sellwood 2010; Monari et al. 2017; see also review by Gerhard 2011). Bar effects are consequently not part of this study and remain a possible source of uncertainty in RoadMapping. We suspect, however, that its influence is modest outside of the bar region, except close to the outer Lindblad resonance.

Mass components—The MW's dark halo is estimated to have a mass of ${M}_{\mathrm{dm},200,\mathrm{MW}}\approx {10}^{12}\,{M}_{\odot }$ (Bland-Hawthorn & Gerhard 2016); the simulation's halo is only slightly less massive with ${M}_{\mathrm{dm},200}\approx 7\times {10}^{11}\,{M}_{\odot }$ . The stellar bulge mass of ${M}_{\mathrm{bulge}}=9.5\times {10}^{9}\,{M}_{\odot }$ in this simulation is a bit smaller than the estimated stellar mass of the MW bulge with ${M}_{\mathrm{bulge},\mathrm{MW}}=(1.4-1.7)\times {10}^{10}\,{M}_{\odot }$ (Portail et al. 2015). The total stellar mass of the MW is estimated to be ${M}_{\mathrm{stars},\mathrm{MW}}=(5\pm 1)\times {10}^{10}\,{M}_{\odot }$ (Bovy & Rix 2013; Bland-Hawthorn & Gerhard 2016), consistent with the total baryonic mass in the simulation of ${M}_{\mathrm{stars}}=4.75\times {10}^{10}\,{M}_{\odot }$ . The fraction of bulge mass to total stellar mass of the MW is ${M}_{\mathrm{bulge},\mathrm{MW}}/{M}_{\mathrm{stars},\mathrm{MW}}=0.3\pm 0.06;$ in our simulation it is 0.25.

Disk—The best estimate for the MW's thin disk scale length from combining several measurements in the literature is ${R}_{{\rm{s}}}=2.6\pm 0.5\,\mathrm{kpc}$ (Bland-Hawthorn & Gerhard 2016). Bovy & Rix (2013), for example, found a stellar disk scale length of ${R}_{{\rm{s}}}=2.15\pm 0.14\,\mathrm{kpc}$ . The disk of this N-body simulation has a similar scale length, ${R}_{{\rm{s}}}=2.5\,\mathrm{kpc}$ . The stellar disk, however, is thinner than in the MW. Jurić et al. (2008) found scale heights of $300\,\mathrm{pc}$ and $900\,\mathrm{pc}$ (with 20% uncertainty) for the thin and thick disks of the MW, respectively; Bovy et al. (2012d), who considered the disk to be a superposition of many exponential MAPs, measured scale heights from $\approx 200\,\mathrm{pc}$ up to $1\,\mathrm{kpc}$ , continuously increasing with the age of the sub-population. In our simulation, there is only a very thin stellar disk component with a scale height of ${z}_{{\rm{s}}}=170\,\mathrm{pc}$ , and no gas and thick disk component as compared to the MW. This discrepancy does, however, not affect the objective of this study. The thick disk has a much higher velocity dispersion and is less prone to spiral perturbations. When we will apply RoadMapping to real MW data, the gas disk can be included as an additional component in the mass model, analogous to Bovy & Rix (2013), and if needed also the thick disk. By slicing data according to MAPs, the thick disk will be accounted for implicitly in the tracer selection.

Disk fraction—The strength of the perturbations in the disk depends on the disk fraction (e.g., D'Onghia 2015). The total disk mass in the simulation might be only 4% of the halo mass, but within $2.2{R}_{{\rm{s}}}$ it is already 50% of the total mass. It is still under debate if the MW disk is maximal (i.e., rotational support of the disk at $2.2{R}_{{\rm{s}}}$ is ∼55%–90%; Sackett 1997; Binney & Tremaine 2008, Section 6.3.3). Bovy & Rix (2013) found, for example, a maximum disk with rotational support ${({v}_{\mathrm{circ},\mathrm{disk}}/{v}_{\mathrm{circ},\mathrm{total}})}^{2}=(69\pm 6) \%$ at $2.2{R}_{{\rm{s}}}$ with ${R}_{{\rm{s}}}=2.15\,\pm 0.14\,\mathrm{kpc}$ . The DEHH-Pot (and therefore our N-body model) has ${({v}_{\mathrm{circ},\mathrm{disk}}/{v}_{\mathrm{circ},\mathrm{total}})}^{2}\approx 47 \%$ at $2.2{R}_{{\rm{s}}}$ and is therefore slightly sub-maximal (see Figure 3). The decomposed rotation curve in Figure 3 is qualitatively similar to the MW models by McMillan (2011) (their Figure 5) and by Barros et al. (2016; their Figure 5, left panels, model MI). Their disks dominate between $R\approx 2.5\mbox{--}8.5\,\mathrm{kpc}$ and $R\approx 5\mbox{--}9.5\,\mathrm{kpc}$ , respectively, while our simulation's disk dominates between $R\approx 2\mbox{--}7\,\mathrm{kpc}$ . We account for this slight difference by also drawing mock data sets from regions around ${R}_{0}=5\,\mathrm{kpc}$ (see Section 2.2).

Number of spiral arms—The exact number of spiral arms in the MW is still under debate. The distribution of star-forming regions traced, e.g., by H ii regions (Georgelin & Georgelin 1976), maser sources (Reid et al. 2009, 2014), and young massive stars (Urquhart et al. 2014), suggest that the MW has four major spiral arms (see also Vallée 2008, 2014 and references therein). Observations in the infrared, e.g., of old red-clump giant stars in the Spitzer/GLIMPSE survey (Churchwell et al. 2009) or of stellar NIR-emission by the COBE satellite (Drimmel & Spergel 2001) indicate that the MW is a grand-design two-armed spiral. One hypothesis is that the four-armed spiral observed in young stars and gas is due to the response of the gas to the two-armed spiral in old stars (Drimmel 2000; Martos et al. 2004). Our simulation is overall a four-armed spiral galaxy, with the mode m = 4 dominating between $R=4\mbox{--}7\,\mathrm{kpc}$ , m = 2 at smaller and $m\geqslant 6$ at larger radii (see Figure 2(b)).

Strength of spiral arms—As demonstrated in Figure 2(a) the spiral arms introduce strong peak-to-peak differences in the stellar surface density (e.g., $\sim 200 \%$ at $R=5\,\mathrm{kpc}$ ). The disk dominates inside $R=8\,\mathrm{kpc}$ (see Figure 3), and in Section 4.1.2, Figure 7, we will see that the spiral arms introduce relative perturbations in the total gravitational forces of up to 30%. Figure 5(c), which we will discuss in Section 4.1.1, suggests an excess of stars with radial velocities of up to $50\,\mathrm{km}\ {{\rm{s}}}^{-1}$ in our simulated galaxy as compared to an axisymmetric model. This can be compared to Reid et al. (2014), who measured, for example, that typical peculiar non-circular motions in the MW spiral arms were around $10\mbox{--}20\,\mathrm{km}\ {{\rm{s}}}^{-1}$ for $R\gtrsim 4\,\mathrm{kpc}$ . Siebert et al. (2012) and Bovy et al. (2015) found velocity fluctuations of the same order in the solar neighborhood. We therefore expect the strength of the perturbation to the total potential due to the spiral arms in this simulation—especially inside $R=8\,\mathrm{kpc}$ , where most of our mock data is drawn from—to be similar or even stronger than those in the MW.

**Figure 5.** Comparison of the true and best-fit stellar DF $({\boldsymbol{x}},{\boldsymbol{v}})$ in position–velocity space. The true DF $({\boldsymbol{x}},{\boldsymbol{v}})$ (i.e., the data D) is the distribution of all the stars in the data set drawn from the simulation snapshot (with ${N}_{* }=$ 20,000 and ${r}_{\max }=4\,\mathrm{kpc}$ centered on position `S8`). The best-fit DF $({\boldsymbol{x}},{\boldsymbol{v}})$ (i.e., the model M) is generated by MC sampling of the best-fit qDF $({\boldsymbol{J}})$ in the best-fit potential model from *RoadMapping* in Table 3, given the known selection function. Panel (a) shows the spatial density residual significance $(D-M)/\sqrt{M}$ of the projection to the (x, y) and (R, z) plane (the MC sampled M is the expected number of stars per bin, and $\sqrt{M}$ is the expected error due to Poisson statistics). In the (x, y) panel, the following regions are marked: $R\lt 8\,\mathrm{kpc},\phi \gt 5^\circ$ (blue), $R\lt 8\,\mathrm{kpc},\phi \lt 5^\circ$ (red), $R\gt 8\,\mathrm{kpc},\phi \gt 5^\circ$ (green), and $R\gt 8\,\mathrm{kpc},\phi \lt 5^\circ$ (yellow). Panels (b) and (c) show the density profiles and velocity residual significance along each of the 6D phase-space coordinates separately for each of the four spatial regions. The blue region is very much dominated by the non-axisymmetric spiral arm. For the yellow region, the axisymmetric single-qDF model is a good description. Overall, the qDF is a good average axisymmetric model for the data. (In panel (a), we overplot the radii ${R}_{\mathrm{spiral}}\in [5.6,6.8]\,\mathrm{kpc}$ as black dotted lines to mark the approximate extent of the stronger spiral arm, to compare it with Figure 8.).
Download figure:
Standard image High-resolution image

**Figure 5.** Comparison of the true and best-fit stellar DF $({\boldsymbol{x}},{\boldsymbol{v}})$ in position–velocity space. The true DF $({\boldsymbol{x}},{\boldsymbol{v}})$ (i.e., the data D) is the distribution of all the stars in the data set drawn from the simulation snapshot (with ${N}_{* }=$ 20,000 and ${r}_{\max }=4\,\mathrm{kpc}$ centered on position `S8`). The best-fit DF $({\boldsymbol{x}},{\boldsymbol{v}})$ (i.e., the model M) is generated by MC sampling of the best-fit qDF $({\boldsymbol{J}})$ in the best-fit potential model from *RoadMapping* in Table 3, given the known selection function. Panel (a) shows the spatial density residual significance $(D-M)/\sqrt{M}$ of the projection to the (x, y) and (R, z) plane (the MC sampled M is the expected number of stars per bin, and $\sqrt{M}$ is the expected error due to Poisson statistics). In the (x, y) panel, the following regions are marked: $R\lt 8\,\mathrm{kpc},\phi \gt 5^\circ$ (blue), $R\lt 8\,\mathrm{kpc},\phi \lt 5^\circ$ (red), $R\gt 8\,\mathrm{kpc},\phi \gt 5^\circ$ (green), and $R\gt 8\,\mathrm{kpc},\phi \lt 5^\circ$ (yellow). Panels (b) and (c) show the density profiles and velocity residual significance along each of the 6D phase-space coordinates separately for each of the four spatial regions. The blue region is very much dominated by the non-axisymmetric spiral arm. For the yellow region, the axisymmetric single-qDF model is a good description. Overall, the qDF is a good average axisymmetric model for the data. (In panel (a), we overplot the radii ${R}_{\mathrm{spiral}}\in [5.6,6.8]\,\mathrm{kpc}$ as black dotted lines to mark the approximate extent of the stronger spiral arm, to compare it with Figure 8.).
Download figure:
Standard image High-resolution image

3. RoadMapping Modeling

In this section, we summarize the mathematical ingredients of RoadMapping, and motivate the DF and potential model that we are going to fit to the data. RoadMapping makes extensive use of the galpy python library by Bovy (2015).⁷ For full details on the RoadMapping machinery, see also Paper I.

3.1. Likelihood

As already laid out in Section 2.2, we use as data the 6D $({{\boldsymbol{x}}}_{i},{{\boldsymbol{v}}}_{i})$ coordinates of N_* stars within a spherical survey volume. The corresponding, purely spatial selection function $\mathrm{SF}({\boldsymbol{x}})$ is

$\begin{eqnarray}\mathrm{SF}({\boldsymbol{x}})\equiv \left\{\begin{array}{ll}1 & {\rm{if}}\,{\rm{}}| {\boldsymbol{x}}-{{\boldsymbol{x}}}_{0}| \leqslant {r}_{\max }\\ 0 & {\rm{otherwise}}\end{array},\right.\end{eqnarray} \tag{ 8 }$

with ${{\boldsymbol{x}}}_{0}=({R}_{0},{\phi }_{0},{z}_{0}=0)$ from Table 1.

Given a parametrized axisymmetric potential model ${\rm{\Phi }}(R,z)$ with parameters p_Φ, the probability that the ith star is on an orbit with the actions

$\begin{eqnarray}&&{{\boldsymbol{J}}}_{i}\equiv {\boldsymbol{J}}[{{\boldsymbol{x}}}_{i},{{\boldsymbol{v}}}_{i}\,| \,{p}_{{\rm{\Phi }}}],\end{eqnarray} \tag{ 9 }$

is proportional to the given orbit distribution function $\mathrm{DF}({\boldsymbol{J}})$ with parameters ${p}_{\mathrm{DF}}$ ,

$\begin{eqnarray}&&\mathrm{DF}({{\boldsymbol{J}}}_{i}\,| \,{p}_{\mathrm{DF}})\equiv \mathrm{DF}({{\boldsymbol{x}}}_{i},{{\boldsymbol{v}}}_{i}\,| \,{p}_{{\rm{\Phi }}},{p}_{\mathrm{DF}}).\end{eqnarray} \tag{ 10 }$

The joint likelihood of a star being within the survey volume and on a given orbit is therefore

$\begin{eqnarray}{{\mathscr{L}}}_{i} & \equiv & {\mathscr{L}}({{\boldsymbol{x}}}_{i},{{\boldsymbol{v}}}_{i}\,| \,{p}_{{\rm{\Phi }}},{p}_{\mathrm{DF}})\\ & = & \displaystyle \frac{\mathrm{DF}({{\boldsymbol{x}}}_{i},{{\boldsymbol{v}}}_{i}\,| \,{p}_{\phi },{p}_{\mathrm{DF}})\cdot \mathrm{SF}({{\boldsymbol{x}}}_{{\boldsymbol{i}}})}{\displaystyle \int \mathrm{DF}({\boldsymbol{x}},{\boldsymbol{v}}\,| \,{p}_{{\rm{\Phi }}},{p}_{\mathrm{DF}})\cdot \mathrm{SF}({\boldsymbol{x}}){d}^{3}{{xd}}^{3}v}.\end{eqnarray} \tag{ 11 }$

The details of how we numerically evaluate the likelihood normalization to a sufficiently high precision are discussed in Paper I.⁸

In the scenario considered in this paper, it can happen that there are a few (∼1 in 20,000) stars entering the catalog that are for some reason on rather extreme orbits, e.g., moving radially directly toward the center. These kinds of orbits do not belong to the set of orbits that we classically expect to make up an overall smooth galactic disk. To avoid such single stars with very low likelihoods having a strong impact on the modeling, we employ here a simple strategy to ensure a robust likelihood,

$\begin{eqnarray}&&{{\mathscr{L}}}_{i}\longrightarrow \max ({{\mathscr{L}}}_{i},\epsilon \times \mathrm{median}({\mathscr{L}})),\end{eqnarray} \tag{ 12 }$

where $\epsilon =0.001$ for ${N}_{* }\,=$ 20,000 stars and $\mathrm{median}({\mathscr{L}})$ is the median of all the N_* stellar likelihoods ${{\mathscr{L}}}_{i}$ with the given p_Φ and ${p}_{\mathrm{DF}}$ . This robust likelihood method was not used in Paper I. For future applications to real MW data, an outlier model similar to the one in Bovy & Rix (2013) could be added: a stellar outlier distribution with constant spatial number density and velocities following a broad Gaussian.

Following Paper I, we assume for now uninformative flat priors on the model parameters p_Φ and ${p}_{\mathrm{DF}}$ and find the maximum and width of the posterior probability function $\mathrm{pdf}({p}_{{\rm{\Phi }}},{p}_{\mathrm{DF}}\,| \,\mathrm{data})\ \propto {\prod }_{i=1}^{{N}_{* }}{{\mathscr{L}}}_{i}\times {\rm{prior}}({p}_{{\rm{\Phi }}},{p}_{\mathrm{DF}})$ using a nested-grid approach and then explore the full shape of the pdf using a Monte Carlo Markov Chain (MCMC).⁹ Full details on this procedure are given in Paper I.

3.2. DF Model

The most simple action-based orbit DF is the qDF introduced by Binney (2010) and Binney & McMillan (2011), which has been a successful ingredient in Paper I and many disk modeling approaches (Bovy & Rix 2013; Piffl et al. 2014; Sanders & Binney 2015). The exact functional form of the qDF $({J}_{R},{L}_{z},{J}_{z}\,| \,{p}_{\mathrm{DF}})$ is given, for example, in Binney & McMillan (2011), or in Equations (2)–(4) of Paper I.

The qDF is expressed in terms of actions, frequencies, and scaling profiles for the radial stellar tracer density $n({R}_{g})$ , and velocity dispersion profiles ${\sigma }_{z}({R}_{g})$ and ${\sigma }_{R}({R}_{g})$ . The latter are functions of the guiding-center radius R_g, i.e., the radius of a circular orbit with given angular momentum L_z in a given potential. We set the scaling profiles to

$\begin{eqnarray}&&n({R}_{g}\,| \,{p}_{\mathrm{DF}})\propto \exp \left(-\displaystyle \frac{{R}_{g}}{{h}_{R}}\right),\end{eqnarray} \tag{ 13 }$

$\begin{eqnarray}&&{\sigma }_{R}({R}_{g}\,| \,{p}_{\mathrm{DF}})={\sigma }_{R,0}\times \exp \left(-\displaystyle \frac{{R}_{g}-{R}_{\odot }}{{h}_{\sigma ,R}}\right),\end{eqnarray} \tag{ 14 }$

$\begin{eqnarray}&&{\sigma }_{z}({R}_{g}\,| \,{p}_{\mathrm{DF}})={\sigma }_{z,0}\times \exp \left(-\displaystyle \frac{{R}_{g}-{R}_{\odot }}{{h}_{\sigma ,z}}\right).\end{eqnarray} \tag{ 15 }$

The free model parameters of the qDF are

$\begin{eqnarray}&&{p}_{\mathrm{DF}}\equiv \{\mathrm{ln}{h}_{R},\mathrm{ln}{\sigma }_{R,0},\mathrm{ln}{\sigma }_{z,0},\mathrm{ln}{h}_{\sigma ,R},\mathrm{ln}{h}_{\sigma ,z}\}.\end{eqnarray} \tag{ 16 }$

In an axisymmetric potential superimposed with non-axisymmetric perturbations, the actions are not exact integrals of motion. However, if one simply considers action-angles as phase-space coordinates, a DF, which is a function of the actions only, might still be a good model for $\mathrm{DF}({\boldsymbol{x}},{\boldsymbol{v}},t)\,=\mathrm{DF}({\boldsymbol{J}},{\boldsymbol{\theta }},t)$ at a given point in time and if the system is well-mixed in phase.

We motivate the use of the qDF as specific action-based DF for the simulation snapshot in this work as follows. There is no stellar abundance or age information in the simulation. We therefore cannot define stellar sub-populations, as we normally would for the MW (see Paper I and Bovy & Rix 2013). However, the disk of the galaxy simulation was originally set up as a single axisymmetric flattened particle population whose density decreases exponentially with radius (see Section 2.1). This is actually very similar to the stellar distribution generated by a single qDF (see, e.g., Ting et al. 2013). Because all particles in the disk have evolved for the same $\sim 250\,\mathrm{Myr}$ since its axisymmetric setup, we can consider the disk essentially as a mono-age population. All of this motivates us therefore to use one single qDF to model the whole disk.

Locally, the current particle distribution in the snapshot at hand might be dominated by non-axisymmetries, which evolved later in the simulation. We have no indication if for small survey volumes the qDF is still a good model for the data. We will use it anyway—to see how far we can get with the simplest model possible and to test if actions are still informative in this case.

3.3. Potential Model

In all RoadMapping analyses in this work, we will fit an axisymmetric gravitational potential model consisting of a (fixed and known) Hernquist bulge, a free Hernquist halo and a free Miyamoto–Nagai disk (Miyamoto & Nagai 1975),

$\begin{eqnarray}&&{{\rm{\Phi }}}_{\mathrm{disk}}(R,z)=-\displaystyle \frac{{GM}}{\sqrt{{R}^{2}+{({a}_{\mathrm{disk}}+\sqrt{{z}^{2}+{b}_{\mathrm{disk}}^{2}})}^{2}}},\end{eqnarray} \tag{ 17 }$

where ${a}_{\mathrm{disk}}$ and ${b}_{\mathrm{disk}}$ are the equivalents of a disk scale length and scale height. Using Hernquist profiles for halo and bulge is motivated by our knowledge of the snapshot galaxy, and we fix the bulge's total mass and scale length to the true values (see Section 2.1 and Table 3). As the bulge contribution to the total radial force at ${R}_{\odot }\equiv 8\,\mathrm{kpc}$ is only ∼9%–10%, this will not give the modeling an unfair advantage. The free model parameters of the halo are the halo scale length ${a}_{\mathrm{halo}}$ and the halo fraction, i.e., the relative halo-to-disk contribution to the radial force at ${R}_{\odot }$ , defined as

$\begin{eqnarray}&&{f}_{\mathrm{halo}}\equiv {\left.\displaystyle \frac{{F}_{R,\mathrm{halo}}}{{F}_{R,\mathrm{disk}}+{F}_{R,\mathrm{halo}}}\right|}_{\mathop{z=0}\limits^{R={R}_{\odot }}}.\end{eqnarray} \tag{ 18 }$

As a parameter that scales the total mass of the galaxy model, we use the circular velocity at the "solar" radius ${R}_{\odot }$ ,

$\begin{eqnarray}&&{v}_{\mathrm{circ}}({R}_{\odot }=8\,\mathrm{kpc})\equiv {\left.\sqrt{R\displaystyle \frac{\partial {\rm{\Phi }}}{\partial R}}\right|}_{\mathop{z=0}\limits^{R={R}_{\odot }}}.\end{eqnarray} \tag{ 19 }$

The total set of free potential model parameters is therefore

$\begin{eqnarray}&&{p}_{{\rm{\Phi }}}\equiv \{{v}_{\mathrm{circ}}({R}_{\odot }),{a}_{\mathrm{disk}},{b}_{\mathrm{disk}},{a}_{\mathrm{halo}},{f}_{\mathrm{halo}}\}.\end{eqnarray} \tag{ 20 }$

We will call this potential model the MNHH-Pot (Miyamoto–Nagai disk + Hernquist halo + Hernquist bulge) in the remainder of this work.

Table 3. Best-fit MNHH-Pot and qDF Parameters as Recovered from the RoadMapping Analysis of a Survey Volume with ${r}_{\max }=4\,\mathrm{kpc}$ Centered on a Spiral Arm at ${R}_{0}=8\,\mathrm{kpc}$ (Position S8)

Circular velocity at ${R}_{\odot }=8\,\mathrm{kpc}$	${v}_{\mathrm{circ}}({R}_{\odot })$	$(223.0\pm 0.1)\,\mathrm{km}\ {{\rm{s}}}^{-1}$
Miyamoto–Nagai disk scale length	${a}_{\mathrm{disk}}$	$({3.62}_{-0.05}^{+0.06})\,\mathrm{kpc}$
Miyamoto–Nagai disk scale height	${b}_{\mathrm{disk}}$	$(0.26\pm 0.02)\,\mathrm{kpc}$
Halo fraction at ${R}_{\odot }=8\,\mathrm{kpc}$	${f}_{\mathrm{halo}}$	$(0.53\pm 0.02)$
Halo scale length	${a}_{\mathrm{halo}}$	$(21\pm 2)\,\mathrm{kpc}$
Bulge mass	${M}_{\mathrm{bulge}}$	$0.95\times {10}^{10}\,{M}_{\odot }$ (fixed)
Bulge scale length	${a}_{\mathrm{bulge}}$	$0.25\,\mathrm{kpc}$ (fixed)
qDF tracer scale length	h_R	$({3.34}_{-0.04}^{+0.05})\,\mathrm{kpc}$
qDF radial velocity dispersion	${\sigma }_{R,0}$	$(15.91\pm 0.08)\,\mathrm{km}\ {{\rm{s}}}^{-1}$
qDF vertical velocity dispersion	${\sigma }_{z,0}$	$({14.0}_{-0.1}^{+0.2})\,\mathrm{km}\ {{\rm{s}}}^{-1}$
qDF radial velocity dispersion scale length	${h}_{\sigma ,R}$	$(4.6\pm 0.5)\,\mathrm{kpc}$
qDF vertical velocity dispersion scale length	${h}_{\sigma ,z}$	$({5.65}_{-0.07}^{+0.06})\,\mathrm{kpc}$

Note. The bulge mass and scale length were fixed in the analysis to their true values, see Sections 2.1 and 3.3.

Download table as: ASCII Typeset image

To estimate the stellar actions ${\boldsymbol{J}}=({J}_{R},{L}_{z},{J}_{z})$ in the axisymmetric MNHH-Pot, we use the Stäckel fudge algorithm by Binney (2012) with fixed focal length ${\rm{\Delta }}=0.45$ , and interpolate the actions on a grid (Binney 2012; Bovy 2015). We made sure that the accuracy of the parameter estimates are not degraded by interpolation errors.¹⁰

Galaxy disks, in general, as well as the simulated disk in this work, have exponential radial density profiles. A single Miyamoto–Nagai disk is more massive at large radii than an exponential disk (see, e.g., Smith et al. 2015). By construction, the DEHH-Pot introduced in Section 2.3 is therefore better suited to reproduce the overall density distribution in the simulation than the MNHH-Pot (as we will see in Figure 6 in the next section). However, the closed form expression of the Miyamoto–Nagai potential in Equation (17) has the crucial advantage of allowing much faster force and therefore action calculations. In addition, by using a potential model, where we already know that it is not be the optimal model for the galaxy's disk, we challenge RoadMapping even further.

**Figure 6.** Comparison of the true and recovered gravitational potential Φ. Panel (a) compares in the (R, z) plane equidensity contours of the overall matter density distribution ${\rho }_{{\rm{\Phi }}}$ , generating the potential. Panels (c) and (d) show the potential's circular velocity curve and radial surface density profile within $| z| =1.1\,\mathrm{kpc}$ , respectively. We compare the true ${\rho }_{{\rm{\Phi }}}$ , ${v}_{\mathrm{circ}}$ , and ${{\rm{\Sigma }}}_{1.1\mathrm{kpc}}$ , of the galaxy simulation (azimuthally averaged over the whole galaxy; black solid lines), with 100 `MNHH-Pot` potentials drawn from the pdf of the best-fit *RoadMapping* model (blue lines). This model was derived from ${N}_{* }=\mathrm{20,000}$ stars in the spherical survey volume at `S8` with ${r}_{\max }=4\,\mathrm{kpc}$ . (The extent of the survey volume is marked in orange.) The best-fit parameters are given in Table 3, and Panel (b) shows the residuals between the matter density corresponding to the median values in this table, ${\rho }_{{\rm{\Phi }},{\mathtt{M}}}$ , and the true density ${\rho }_{{\rm{\Phi }},{\mathtt{T}}}$ . Overplotted in Panels (a), (c), and (d) is also the reference `DEHH-Pot` (see Section 2.3; black dotted line). Over wide areas even outside of the survey volume the relative difference between true and recovered density is less than 15%. At $R\gtrsim 8\,\mathrm{kpc}$ and $z\sim 0$ it becomes apparent that the chosen potential model cannot perfectly capture the structure of the disk. However, in the plane of the disk and at smaller radii within the survey volume, where most of the stars are located, the model gives good constraints on the density. The circular velocity curve is recovered to less than 5%.
Download figure:
Standard image High-resolution image

**Figure 6.** Comparison of the true and recovered gravitational potential Φ. Panel (a) compares in the (R, z) plane equidensity contours of the overall matter density distribution ${\rho }_{{\rm{\Phi }}}$ , generating the potential. Panels (c) and (d) show the potential's circular velocity curve and radial surface density profile within $| z| =1.1\,\mathrm{kpc}$ , respectively. We compare the true ${\rho }_{{\rm{\Phi }}}$ , ${v}_{\mathrm{circ}}$ , and ${{\rm{\Sigma }}}_{1.1\mathrm{kpc}}$ , of the galaxy simulation (azimuthally averaged over the whole galaxy; black solid lines), with 100 `MNHH-Pot` potentials drawn from the pdf of the best-fit *RoadMapping* model (blue lines). This model was derived from ${N}_{* }=\mathrm{20,000}$ stars in the spherical survey volume at `S8` with ${r}_{\max }=4\,\mathrm{kpc}$ . (The extent of the survey volume is marked in orange.) The best-fit parameters are given in Table 3, and Panel (b) shows the residuals between the matter density corresponding to the median values in this table, ${\rho }_{{\rm{\Phi }},{\mathtt{M}}}$ , and the true density ${\rho }_{{\rm{\Phi }},{\mathtt{T}}}$ . Overplotted in Panels (a), (c), and (d) is also the reference `DEHH-Pot` (see Section 2.3; black dotted line). Over wide areas even outside of the survey volume the relative difference between true and recovered density is less than 15%. At $R\gtrsim 8\,\mathrm{kpc}$ and $z\sim 0$ it becomes apparent that the chosen potential model cannot perfectly capture the structure of the disk. However, in the plane of the disk and at smaller radii within the survey volume, where most of the stars are located, the model gives good constraints on the density. The circular velocity curve is recovered to less than 5%.
Download figure:
Standard image High-resolution image

4. Results

At the core of this work is a suite of 22 data sets consisting of the phase-space coordinates of stellar tracer particles, drawn from the spiral galaxy simulation snapshot introduced in Section 2.1. Each data set comes from a different survey volume within the galaxy's disk (see Section 2.2). We modeled all data sets with RoadMapping as described in Section 3, by fitting to it a single qDF (see Section 3.2) and the potential model MNHH-Pot (introduced in Section 3.3). This resulted in 22 independent measurements of the simulated galaxy's potential and DF.

We present our results in two steps. In Section 4.1, we look at one of these RoadMapping models in detail. In Section 4.2, we then compare all 22 RoadMapping results, and discuss their differences in the context of spiral arms.

4.1. An Axisymmetric Galaxy Model from RoadMapping

In this section, we will discuss all aspects of a RoadMapping model for one single data set. This data set has ${N}_{* }=$ 20,000 stars that were drawn from the spherical volume with ${r}_{\max }=4\,\mathrm{kpc}$ centered on a spiral arm at the "solar" radius ${R}_{0}=8\,\mathrm{kpc}$ . This volume is shown in orange in Figure 1 (position S8). We chose this volume because of its position centered on a smaller spiral arm, similar to our Sun being located in the Orion spiral arm. It is a bit larger than our conservative guess for the survey volume size for which we currently expect unbiased potential estimates from the final Gaia DR, ${r}_{\max }=3\,\mathrm{kpc}$ from the Sun (see Paper I and Section 5.3). However, improvements in RoadMapping with respect to the treatment of measurement errors might ultimately allow us to also model larger volumes. This example survey volume ranging from $R=4\mbox{--}12\,\mathrm{kpc}$ also has the advantage of being crossed by several spiral arms of different spiral strength (see Figures 1(a) and 2). In this section, our goal is now to investigate the ability of the best-fit RoadMapping model to serve as an overall axisymmetric model for the galaxy.

The parameters of the best-fit MNHH-Pot and qDF recovered with RoadMapping from this data set are summarized in Table 3. The circular velocity ${v}_{\mathrm{circ}}({R}_{\odot })$ and halo fraction ${f}_{\mathrm{halo}}$ are especially well recovered (compare to Table 2).

4.1.1. Recovering the Stellar Distribution

The RoadMapping fit itself takes place in action space. However, an important sanity check to decide if the fit was successful, is to test if the best-fit RoadMapping model (i.e., best-fit action-based DF in best-fit potential and in given selection function) generates a stellar distribution that reproduces the distribution of data points in observable phase-space, $({\boldsymbol{x}},{\boldsymbol{v}})$ . This comparison is shown in Figure 5.

We note that the spiral arms introduce very strong non-axisymmetries in the data, both in the spatial and the velocity distribution (especially in v_R, where a significantly larger number of stars move outward than inward as compared to an axisymmetric model). We therefore compare the data and fit separately for different spatial regions, $R\gt 8\,\mathrm{kpc}$ and $R\lt 8\,\mathrm{kpc}$ , and $\phi \lt 5^\circ$ and $\phi \gt 5^\circ$ . In the region where the spiral arm dominates (blue in Figure 5), the best-fit RoadMapping model is actually a very poor model. However, what the model underestimates in the spiral arm, it slightly overestimates in the other regions and is therefore indeed something like a good average model for the overall distribution. The region at $R\gt 8\,\mathrm{kpc}$ and $\phi \lt 5^\circ$ (yellow in Figure 5), where neither the spiral arm nor the inter-arm regions dominate strongly, is especially well described by the model.

Overall, the qDF appears to be a good model for unperturbed regions of the disk and averages over spiral arms.

4.1.2. Recovering the Gravitational Potential

As shown in the previous section, the best-fit RoadMapping model seems to reproduce the average stellar phase-space distribution quite well. But is the corresponding potential close to the true potential?

Figure 6 compares the true potential from the simulation snapshot (symmetrized by averaging over the whole ${\rm{\Delta }}\phi =2\pi$ ) and the axisymmetric reference DEHH-Pot from Table 2 with the best-fit MNHH-Pot from the RoadMapping analysis. In particular, Figure 6 illustrates the overall matter density distribution, the rotation curve and the surface density profile. Figure 7 compares the true and recovered (median) gravitational forces at the position of each star in the data set.

**Figure 7.** Recovery of the gravitational forces with *RoadMapping*. We compare the true gravitational forces with the forces estimated from the *RoadMapping* best-fit potential in Table 3 at the (x, y) positions (upper panels) and (R, z) positions (lower panels) of the stars that entered the analysis. In particular, we color-code the positions of the stars according to the radial (panel (a)) and vertical (panel (b)) force residuals scaled by a typical force, i.e., we show ${\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{* ,i})$ and ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{* ,i})$ in Equations (23) and (24). The overplotted gray contours correspond to ${{\rm{\Sigma }}}_{1.5\mathrm{kpc},\mathrm{disk},T}/{{\rm{\Sigma }}}_{1.5\mathrm{kpc},\mathrm{disk},S}=1.15$ , i.e., the true vs. the symmetric disk surface density, and mark the position of the spiral arms. The red dots mark stars for which the best-fit model underestimates the (absolute value of the) force. This is the case for the radial force in the leading sides of the spiral arms and the vertical force within the spiral arms, which cannot be reproduced. Blue marks correspond to stars for which the force is overestimated. Overall the radial forces are very well recovered, which is related to the good recovery of the circular velocity curve in Figure 6(c). There are more problems with the vertical force, which is related to the higher surface densities in spiral arms. This slightly biases the overall *RoadMapping* model.
Download figure:
Standard image High-resolution image

**Figure 7.** Recovery of the gravitational forces with *RoadMapping*. We compare the true gravitational forces with the forces estimated from the *RoadMapping* best-fit potential in Table 3 at the (x, y) positions (upper panels) and (R, z) positions (lower panels) of the stars that entered the analysis. In particular, we color-code the positions of the stars according to the radial (panel (a)) and vertical (panel (b)) force residuals scaled by a typical force, i.e., we show ${\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{* ,i})$ and ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{* ,i})$ in Equations (23) and (24). The overplotted gray contours correspond to ${{\rm{\Sigma }}}_{1.5\mathrm{kpc},\mathrm{disk},T}/{{\rm{\Sigma }}}_{1.5\mathrm{kpc},\mathrm{disk},S}=1.15$ , i.e., the true vs. the symmetric disk surface density, and mark the position of the spiral arms. The red dots mark stars for which the best-fit model underestimates the (absolute value of the) force. This is the case for the radial force in the leading sides of the spiral arms and the vertical force within the spiral arms, which cannot be reproduced. Blue marks correspond to stars for which the force is overestimated. Overall the radial forces are very well recovered, which is related to the good recovery of the circular velocity curve in Figure 6(c). There are more problems with the vertical force, which is related to the higher surface densities in spiral arms. This slightly biases the overall *RoadMapping* model.
Download figure:
Standard image High-resolution image

The recovery of density, surface density, and circular velocity curve is especially good in the region where most of the stars are located, around $R\sim 6\,\mathrm{kpc}$ and in the plane of the disk. In large regions inside the survey volume, and even outside, the density is recovered to within 15%. The circular velocity curve is recovered to within 5%, which is also approximately the extent of perturbation that the spiral arms cause with respect to a smooth rotation curve. There is, however, a very small ( $\lt 1.5 \%$ ) underestimation of ${v}_{\mathrm{circ}}$ at larger radii. We suspect that this bias is introduced by the spiral arms (see the discussions in Sections 4.2.6 and 4.1.4). The overall surface density profile is a bit overestimated ( $\sim 15 \%$ ) at smaller radii; this is clearly due to the local spiral arm at $R\sim 6\,\mathrm{kpc}$ with its higher surface density and many stars entering the analysis, which bias the result and which was to be expected. At larger radii ( $R\sim 10\mbox{--}12\,\mathrm{kpc}$ ), the fit of the local density in the disk and surface density profile starts to flare due to the choice of the Miyamoto–Nagai disk family with its shallow profile (see Section 4.1.4). But again, where most of the stars are located, our RoadMapping model is a very good average model for the true galaxy.

The aspect of the potential to which the stellar orbits are actually sensitive is the gravitational forces. In Figure 7, we therefore compare the true force that each star in the data set feels (i.e., the radial force, ${F}_{R,T}({{\boldsymbol{x}}}_{* ,i})$ , and vertical force, ${F}_{z,T}({{\boldsymbol{x}}}_{* ,i})$ , calculated as the sum of the individual contributions by each particle in the simulation and the analytic DM halo at the position of each star ${{\boldsymbol{x}}}_{* ,i}\equiv ({x}_{i},{y}_{i},{z}_{i})$ ) with the force that the RoadMapping median model predicts for each star ( ${F}_{R,M}({{\boldsymbol{x}}}_{* ,i})$ and ${F}_{z,M}({{\boldsymbol{x}}}_{* ,i});$ M for "median model"). We scale the difference between truth and model by a typical radial or vertical force at the given radius for which we use

$\begin{eqnarray}&&{F}_{R,\mathrm{typ}}(R)\equiv {v}_{\mathrm{circ},S}^{2}(R)/R,\end{eqnarray} \tag{ 21 }$

$\begin{eqnarray}&&{F}_{z,\mathrm{typ}}(R)\equiv {F}_{z,S}(R,z={z}_{{\rm{s}}}),\end{eqnarray} \tag{ 22 }$

where ${v}_{\mathrm{circ},S}$ and ${F}_{z,S}$ are the circular velocity and vertical force evaluated in the "true symmetric" reference potential DEHH-Pot in Table 2. As the typical vertical force at a given radius, we use ${F}_{z,S}$ evaluated at the scale height of the disk, ${z}_{{\rm{s}}}=0.17\,\mathrm{kpc}$ . This is of the same order as the true vertical force averaged over all stars at this radius. Figure 7 shows therefore

$\begin{eqnarray}&&{\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{* ,i})\equiv \displaystyle \frac{| {F}_{R,M}({R}_{i},{z}_{i})| -| {F}_{R,T}({x}_{i},{y}_{i},{z}_{i})| }{| {F}_{R,\mathrm{typ}}({R}_{i})| },\end{eqnarray} \tag{ 23 }$

$\begin{eqnarray}&&{\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{* ,i})\equiv \displaystyle \frac{| {F}_{z,M}({R}_{i},{z}_{i})| -| {F}_{z,T}({x}_{i},{y}_{i},{z}_{i})| }{| {F}_{z,\mathrm{typ}}({R}_{i})| }\end{eqnarray} \tag{ 24 }$

for each star ( ${R}_{i}=\sqrt{{x}_{i}^{2}+{y}_{i}^{2}}$ ). The recovery is as expected. The true vertical force is stronger in the spiral arms (due to the higher surface density) and weaker in the inter-arm regions as compared to the axisymmetric best-fit model. The radial force is well recovered where the majority of the stars are located, i.e., in the wide inter-arm regions and in the peaks of the spiral arms. Misjudgments happen in the wings of the spiral arms: the true radial force (i.e., the pull toward the galactic center) is stronger at the outer edge/leading side of the spiral arm because of the additional gravitational pull toward the massive spiral arm, and for the same reason weaker at the inner edge/trailing side. Overall, the recovered RoadMapping model appears to be a good mean model, averaging over spiral arms and inter-arm regions.

4.1.3. Recovering the Action Distribution

In Sections 4.1.1 and 4.1.2, we have demonstrated the goodness of the fit in the configuration space of the data, and of the recovered gravitational potential. What RoadMapping is actually fitting, however, is the distribution in action space. Figure 8 compares the data and the model action distribution (generated by the best-fit qDF) given the best-fit median MNHH-Pot in Table 3. (We use this axisymmetric potential to calculate the actions that lead to the best-fit model, and do not attempt to estimate the true actions in the true potential.)

**Figure 8.** Comparison of the stellar action distribution of the data set D used in the analysis and the recovered axisymmetric action distribution M (see Figure 5 for the comparison in configuration space). All actions of the data set and best-fit distribution were calculated in the best-fit `MNHH-Pot` in Table 3. The upper panel in each column shows one-dimensional histograms of the D and M distribution of angular momentum, L_z, the radial action, J_R, and the vertical action, J_z. The other panels display the residual significance $(D-M)/\sqrt{M}$ , as both one-dimensional and two-dimensional distribution (the model M was constructed by MC sampling the best-fit qDF, $\sqrt{M}$ is the expected noise due to Poisson statistics, and $(D-M)/\sqrt{M}$ therefore describes how significant any difference between D and M is). The two-dimensional residuals are overplotted with equidensity contours of the data D's two-dimensional action distribution (gray solid lines). In Figure 5(a), we have marked the approximate radial extent of the stronger spiral arm with black dotted lines ( ${R}_{\mathrm{spiral}}\in [5.6,6.8]\,\mathrm{kpc}$ ); the dotted lines in the L_z distributions in this figure correspond to ${L}_{z}={R}_{\mathrm{spiral}}\times {v}_{\mathrm{circ}}({R}_{\mathrm{spiral}})$ . This comparison gives a first impression of how the approximate action distribution in spiral arms might look.
Download figure:
Standard image High-resolution image

**Figure 8.** Comparison of the stellar action distribution of the data set D used in the analysis and the recovered axisymmetric action distribution M (see Figure 5 for the comparison in configuration space). All actions of the data set and best-fit distribution were calculated in the best-fit `MNHH-Pot` in Table 3. The upper panel in each column shows one-dimensional histograms of the D and M distribution of angular momentum, L_z, the radial action, J_R, and the vertical action, J_z. The other panels display the residual significance $(D-M)/\sqrt{M}$ , as both one-dimensional and two-dimensional distribution (the model M was constructed by MC sampling the best-fit qDF, $\sqrt{M}$ is the expected noise due to Poisson statistics, and $(D-M)/\sqrt{M}$ therefore describes how significant any difference between D and M is). The two-dimensional residuals are overplotted with equidensity contours of the data D's two-dimensional action distribution (gray solid lines). In Figure 5(a), we have marked the approximate radial extent of the stronger spiral arm with black dotted lines ( ${R}_{\mathrm{spiral}}\in [5.6,6.8]\,\mathrm{kpc}$ ); the dotted lines in the L_z distributions in this figure correspond to ${L}_{z}={R}_{\mathrm{spiral}}\times {v}_{\mathrm{circ}}({R}_{\mathrm{spiral}})$ . This comparison gives a first impression of how the approximate action distribution in spiral arms might look.
Download figure:
Standard image High-resolution image

We note that the radial and vertical action distribution fits quite well; the axisymmetric model, however, contains many more stars on close-to-circular orbits $({J}_{R}\sim 0,{J}_{z}\sim 0)$ than the simulation. In the data set, there is an excess of stars in the galactic plane $({J}_{z}\sim 0)$ that have more eccentric orbits than the axisymmetric model would predict. In Figure 5(a), we have marked the radial extent of the stronger spiral arm with dotted lines ( ${R}_{\mathrm{spiral}}\in [5.6,6.8]\,\mathrm{kpc}$ ), and overplotted the corresponding angular momenta ${L}_{z}={R}_{\mathrm{spiral}}\times {v}_{\mathrm{circ}}({R}_{\mathrm{spiral}})$ in Figure 8. This serves as a rough estimate for the region in action space, where we expect the stars of this spiral arm to be located. It is again obvious that this spiral arm contains (1) more stars in general and (2) more stars with eccentric orbits $({J}_{R}\gt 0)$ , which are (3) mostly located close to the plane $({J}_{z}\sim 0)$ , as compared to the axisymmetric model. All of this confirms our expectations for orbits in a spiral arm.

One of the open tasks that the Galactic dynamical modeling community faces is the description of the orbit distribution of spiral arms. The above exercise of comparing the data and the model actions in a best-fit axisymmetric potential should therefore be performed for any future application to data in the MW as well. It could help to learn more about the approximate orbits that stars move on in real spiral arms, and how spiral arms perturb axisymmetric action DFs.

4.1.4. Calibrating the Method by Modeling a Snapshot without Spiral Arms

To better understand the effect of spiral arms on the RoadMapping modeling in the previous sections, we also performed a calibration test run. For that, we applied RoadMapping to a mock data set drawn from the same spherical survey volume as in Section 4.1 (centered at S8 with a maximum radius of ${r}_{\max }=4\,\mathrm{kpc}$ ), but from the initial axisymmetric snapshot of the galaxy simulation ( $t=0\,\mathrm{Myr}$ ), in which no spiral arms had evolved yet.

The spatial tracer distribution was very well recovered by RoadMapping, as well as the distribution of tracers in v_R and v_z. There were some deviations in the v_T distribution between data and best-fit model, however, with our model predicting more asymmetric drift. We attribute this to the simple initial setup of the disk's stellar velocities, which does not follow a physical DF but only assumes a triaxial Gaussian velocity distribution (Springel et al. 2005). We expect the modeling in this calibration run to be slightly biased by this. Other than that, this confirms our expectation that the qDF is indeed a good model for the unperturbed disk, and that the deviations between data and model in Figure 5 were purely due to the spiral arms.

The surface density profile generated by the best-fit potential was a perfect fit inside of $R\sim 7\,\mathrm{kpc}$ , but it started flaring further out. This was expected, because the Miyamoto–Nagai disk is known to be more massive at larger radii than an exponential disk (Smith et al. 2015). Also, if the fit is driven by the majority of stars, the fit is expected to be better at smaller radii with its higher tracer number density. The same flaring in the surface density also showed up in Figure 6. Overall, our chosen potential model will therefore systematically bias vertical forces at large radii to be too strong as compared to the truth.

The recovered potential parameters for the initial snapshot, ${a}_{\mathrm{disk}}=(3.73\pm 0.04)\,\mathrm{kpc}$ , ${b}_{\mathrm{disk}}=(0.34\pm 0.03)\,\mathrm{kpc}$ , ${f}_{\mathrm{halo}}\,=(0.52\pm 0.03)$ , and ${a}_{\mathrm{halo}}=(24\pm 2)\,\mathrm{kpc}$ , are consistent with the spiral arm-affected measurements in Table 3 to within 2, 4, 1, and 1.5 times the statistical error, respectively. While the halo scale length in this calibration run is also consistent with the truth within three times the error, we expect it to be underestimated to at least partly account for the Miyamoto–Nagai disk being too massive at large radii. As we will see later (in Section 4.2.3 and Figure 12), we seem to need an even larger survey volume to have enough radial coverage to constrain the halo scale length properly.

Compared to the truth, the circular velocity curve derived from the initial snapshot was underestimated by $\sim 2.5 \% ;$ ${v}_{\mathrm{circ}}({R}_{\odot })=(219.36\pm 0.08)\,\mathrm{km}\ {{\rm{s}}}^{-1}$ . For the snapshot with spiral arms (see Figure 6), we observe an underestimation of only $\sim 1.5 \%$ , so the bias in the initial snapshot might be related to its unphysical setup of the v_T velocities. On the one hand, because this underestimation is present for both snapshots, it could also be a systematic bias due to the chosen Miyamoto–Nagai disk model. On the other hand, the bias also shows up in the DEHH-Pot's circular velocity curve in Figure 6. The DEHH-Pot has a more realistic shape and was found as a direct fit to the spiral-arm-affected disk particle distribution. The $\sim 1.5 \%$ underestimate is therefore most likely introduced by the spiral arms (see Section 4.2.6).

Because the simulation immediately develops spiral arms after the initial snapshot, there was no snapshot that was still axisymmetric, yet already in a dynamical steady state.

4.2. The Influence of Spiral Arms in RoadMapping Modeling

In the previous section, we showed that for a large survey volume ( ${r}_{\max }=4\,\mathrm{kpc}$ ) RoadMapping can construct a good average axisymmetric potential (and DF) model for a galaxy with spiral arms. In the following, we want to investigate how this modeling success depends on the position and the size of the survey volume within the galaxy and with respect to the spiral arms.

4.2.1. A Suite of Data Sets Drawn from Spiral Arms and Inter-arm Regions

To investigate a range of data sets affected in different proportions by spiral arms, we center our test survey volumes at the positions marked in Figures 1 and 4 (see also Table 1) and consider volume sizes with ${r}_{\max }\in [0.5,1,2,3,4,5]\,\mathrm{kpc}$ for ${R}_{0}=8\,\mathrm{kpc}$ and ${r}_{\max }\in [0.5,1,2,3,4]\,\mathrm{kpc}$ for ${R}_{0}=5\,\mathrm{kpc}$ (to avoid the galactic center). As demonstrated in Figure 4, the spiral arm strength is very different in these test volumes. Each data set that we draw from the simulation contains a random selection of ${N}_{* }\,=$ 20,000 stars inside the given spherical volume and we fit a single qDF and MNHH-Pot to it.

4.2.2. Recovering the Circular Velocity Curves and Surface Density Profiles

It turns out that RoadMapping is successful in finding reasonable and even very good best-fit potential models for each one of the 22 test data sets independent of size and location—given the data and limitations of the model. To illustrate this and to make this encouraging result immediately obvious, we show the circular velocity curves and surface density profiles of all analyses in Figures 9–11.

**Figure 9.** Comparison of the true local circular velocity curve and surface density within $| z| \leqslant 2{z}_{{\rm{s}}}=0.34\,\mathrm{kpc}$ with the recovered *RoadMapping* models from survey volumes of size ${r}_{\max }=2\,\mathrm{kpc}$ and ${r}_{\max }=3\,\mathrm{kpc}$ . The blue lines show the *RoadMapping* potential models recovered from these data sets (each line is one of 100 potentials drawn from the full pdf sampled with the MCMC). The gray curves show the true profiles as derived from the galaxy simulation snapshot, averaged over the angular wedge ${\phi }_{0}\pm \arcsin ({r}_{\max }/{R}_{0})$ that encloses the corresponding survey volume (see Table 1 for all R₀ and ${\phi }_{0}$ values). In other words, we show the true profiles only for the region of the spiral galaxy that was actually probed by the data. The orange and green colored regions mark the radial extent of the survey volumes. We sorted the stars of each data set into radial bins of size ${\rm{\Delta }}R=200\,\mathrm{kpc}$ . The radial bins with a higher than average number of stars are marked with a lighter shade of the corresponding color, and the bins with a lower than average number with a darker shade. It turns out that the constraints of highest accuracy and precision are always where most of the stars are located—within the survey volume and, in particular, at the peak of the distribution.
Download figure:
Standard image High-resolution image

**Figure 9.** Comparison of the true local circular velocity curve and surface density within $| z| \leqslant 2{z}_{{\rm{s}}}=0.34\,\mathrm{kpc}$ with the recovered *RoadMapping* models from survey volumes of size ${r}_{\max }=2\,\mathrm{kpc}$ and ${r}_{\max }=3\,\mathrm{kpc}$ . The blue lines show the *RoadMapping* potential models recovered from these data sets (each line is one of 100 potentials drawn from the full pdf sampled with the MCMC). The gray curves show the true profiles as derived from the galaxy simulation snapshot, averaged over the angular wedge ${\phi }_{0}\pm \arcsin ({r}_{\max }/{R}_{0})$ that encloses the corresponding survey volume (see Table 1 for all R₀ and ${\phi }_{0}$ values). In other words, we show the true profiles only for the region of the spiral galaxy that was actually probed by the data. The orange and green colored regions mark the radial extent of the survey volumes. We sorted the stars of each data set into radial bins of size ${\rm{\Delta }}R=200\,\mathrm{kpc}$ . The radial bins with a higher than average number of stars are marked with a lighter shade of the corresponding color, and the bins with a lower than average number with a darker shade. It turns out that the constraints of highest accuracy and precision are always where most of the stars are located—within the survey volume and, in particular, at the peak of the distribution.
Download figure:
Standard image High-resolution image

**Figure 10.** Same as Figure 9, but for all small survey volumes with ${r}_{\max }=500\,\mathrm{pc}$ and ${r}_{\max }=1\,\mathrm{kpc}$ .
Download figure:
Standard image High-resolution image

**Figure 10.** Same as Figure 9, but for all small survey volumes with ${r}_{\max }=500\,\mathrm{pc}$ and ${r}_{\max }=1\,\mathrm{kpc}$ .
Download figure:
Standard image High-resolution image

**Figure 11.** Same as Figure 9, but for all big survey volumes with ${r}_{\max }=4\,\mathrm{kpc}$ and ${r}_{\max }=5\,\mathrm{kpc}$ .
Download figure:
Standard image High-resolution image

**Figure 11.** Same as Figure 9, but for all big survey volumes with ${r}_{\max }=4\,\mathrm{kpc}$ and ${r}_{\max }=5\,\mathrm{kpc}$ .
Download figure:
Standard image High-resolution image

In contrast to Figure 6, these figures only show the true profiles for the region within the galaxy where the data comes from: averaged over the angular wedge covering the radial extent of the survey volume, ${\rm{\Delta }}\phi ={\phi }_{0}\pm \arcsin ({r}_{\max }/{R}_{0})$ , and within $| z| \leqslant 2{z}_{{\rm{s}}}=0.34\,\mathrm{kpc}$ , i.e., twice the scale height of the disk, which contains most of the disk mass. This is the matter distribution in which the stars are currently moving, and therefore the potential to which the modeling should be sensitive. In Figures 9–11, we also mark the survey volume and the radial bins of size ${\rm{\Delta }}R=200\,\mathrm{pc}$ with the highest number of stars.

Even though the curves vary extremely between the individual data sets, it becomes very obvious that it is indeed the regions in which the majority of stars is located that drives the RoadMapping fit. Furthermore, whether this region is dominated by a spiral arm or an inter-arm region, and even if this region is only as small as ${r}_{\max }=500\,\mathrm{pc}$ , RoadMapping indeed constrains the local potential where most of the stars of the data set are located. Also, the constraints are not only most accurate but also most precise in these regions.

Only in the two volumes with ${r}_{\max }=[0.5,1]\,\mathrm{kpc}$ at position S8 does RoadMapping have some difficulties fitting the circular velocity curve; the model expects a flat or falling rotation curve and is presented with a steeply rising rotation curve due to the spiral arm dominating the region. However, given that RoadMapping recovers a good average surface density profile and the circular velocity at least at the center of the small volume, the fit is still quite successful.

In an application to real data in the MW, we would also have the possibility to impose some informative prior information on the potential shape (e.g., on the rotation curve), to avoid very unrealistic results (see also discussion in Section 5.4).

4.2.3. Discussion of the Model Parameter Recovery

Figures 9–11 in the previous section have illustrated how well the potential is recovered by RoadMapping. Figure 12 compares the potential and qDF parameters found with RoadMapping to the parameters of the reference DEHH-Pot from Table 2. At first glance, there appear to be several discrepancies. In the following, we will discuss the deviations and explain why each set of parameters still corresponds to a good fit to the data.

**Figure 12.** Overview of the model parameter estimates (`MNHH-Pot` parameters on the left, qDF parameters on the right) recovered with *RoadMapping* from 22 different data sets. All data sets were drawn from the same simulation snapshot, but from survey volumes at different positions in the galaxy (color-coded) and of different sizes ( ${r}_{\max }$ as indicated on the x-axis). Note that all five qDF parameters are shown here on a logarithmic scale, because *RoadMapping* uses a logarithmically flat prior for them in the fit (see Equation (16)). The black dotted line shows the known model parameters from the reference potential `DEHH-Pot` in Table 2 (the Miyamoto–Nagai disk parameters ${a}_{\mathrm{disk}}$ and ${b}_{\mathrm{disk}}$ are related but not directly comparable to an exponential disk scale length and height). The black squares denote the qDF parameters we recovered by fixing the potential to the `DEHH-Pot`, centering a survey volume with ${r}_{\max }=5\,\mathrm{kpc}$ on the spiral arm at ${R}_{0}=8\,\mathrm{kpc}$ (position `S8`), and fitting the qDF only. A survey volume with a radial coverage as large as ${r}_{\max }=5\,\mathrm{kpc}$ is required to properly recover all "true" model parameters. For smaller volumes, there seem to be quite large deviations between truth and model; the fact that these recovered parameters still all correspond to successful *RoadMapping* fits to the data is discussed in Section 4.2.3.
Download figure:
Standard image High-resolution image

Overall, the statistical random errors on the parameter recovery are very small for ${N}_{* }$ = 20,000 and possible systematic errors dominate. There are only a few exceptions ( ${r}_{\max }=500\,\mathrm{pc}$ , ${r}_{\max }=1\,\mathrm{kpc}$ at I5, ${r}_{\max }=2\,\mathrm{kpc}$ at I8), which we will discuss later.

We will first consider the parameters of the gravitational potential (left column in Figure 12). All volumes recover ${v}_{\mathrm{circ}}({R}_{\odot })$ within a few $\mathrm{km}\ {{\rm{s}}}^{-1};$ in the largest volumes—where the circular velocity curve is probed over several kiloparsecs—the estimate is the most accurate. The halo fraction ${f}_{\mathrm{halo}}$ of the radial force at the "solar" radius ${R}_{\odot }$ is very well recovered, especially for ${r}_{\max }\gtrsim 2\,\mathrm{kpc}$ . The estimate that we get for the best-fit Miyamoto–Nagai disk scale height ${b}_{\mathrm{disk}}$ seems to be also approximately independent of the size of the volume. We can even recover the true halo scale length ${a}_{\mathrm{halo}}$ ; however, only for a volume as large as ${r}_{\max }=5\,\mathrm{kpc}$ . The models at ${r}_{\max }=500\,\mathrm{pc}$ appear to be too small to constrain the halo at all, and the MCMCs diverged completely for this parameter. Smaller volumes that underestimate ${a}_{\mathrm{halo}}$ get slightly larger estimates for the disk scale length ${a}_{\mathrm{disk}}$ and the overall radial density slope is then probably closer to the truth, even if the individual parameters are not. Outliers can often be explained by having a look at the data. The large disk scale length recovered from the ${r}_{\max }=2\,\mathrm{kpc}$ volume at S5, for example, mirrors the comparably flat matter distribution caused by two spiral arms close together and dominating the volume (see Figure 9(a) and the large ${\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}$ for this analysis in Figure 4(d)).

The right column of Figure 12 compares the recovered qDF parameters for the different survey volumes with the qDF parameters we obtained from fixing the potential model to the DEHH-Pot and fitting the qDF only in a ${r}_{\max }=5\,\mathrm{kpc}$ volume at S8. Even though the qDF parameters for small volumes are widely different for different positions within the galaxy, they all approach the values recovered with the DEHH-Pot for larger volumes. There seems, therefore, to be an overall best-fit qDF describing the average tracer distribution in the galaxy's disk. The only difference is in the ${h}_{\sigma ,z}$ parameter, where the models fitting an MNHH-Pot recover a slightly larger value than the models using the known DEHH-Pot. The suspected reason is that the Miyamoto–Nagai disk flares at larger radii as compared to the double exponential disk (see Figure 6), which leads to a less-steep radial decline in the vertical forces, and therefore mean vertical orbital energies $\langle {E}_{z}\rangle \sim \nu \times {J}_{z}$ , and therefore to a slightly longer ${h}_{\sigma ,z}$ scale length. In general, volumes centered on spiral arms have larger velocity dispersion parameters ${\sigma }_{R,0}$ and ${\sigma }_{z,0}$ as compared to volumes at the same radius R₀ but centered on an inter-arm region. Furthermore, the volumes at ${R}_{0}=5\,\mathrm{kpc}$ with their stronger spiral arms have larger velocity dispersions than those at ${R}_{0}=8\,\mathrm{kpc}$ —which is what we expect. Most volumes recover similar tracer scale lengths ${h}_{R}\sim 2.5\pm 0.5\,\mathrm{kpc}$ close to the known disk scale length ${R}_{{\rm{s}}}$ . Only the volumes centered on the inter-arm region at ${R}_{0}=8\,\mathrm{kpc}$ (position I8) recover much longer h_R. This might be related to the fact that volumes at I8 are dominated by an especially extended inter-arm region. The volumes at I5 with ${r}_{\max }=[0.5,1]\,\mathrm{kpc}$ were not able to constrain the tracer scale length at all because of the unfortunate position between the rising density wings of two strong spiral arms (see Figure 10).

There are a few survey volumes for which the recovered parameters show some peculiarities. The models from volumes with ${r}_{\max }=1\,\mathrm{kpc}$ at I5 and ${r}_{\max }=2\,\mathrm{kpc}$ at I8 reject the DM halo completely, i.e., ${f}_{\mathrm{halo}}=0$ . The corresponding halo scale lengths ${a}_{\mathrm{halo}}$ are therefore unconstrained,¹¹ while the corresponding disk scale heights ${b}_{\mathrm{disk}}$ are grossly overestimated to account for the missing contribution of the spherical halo. We have investigated the reason for this fitting result and found that for the way in which the spiral arms affect the circular velocity curve in these volumes, the recovered models with unusual radial profiles are indeed a better description for the data (see Figures 9(a) and 10(b)). Also, while most analyses average the vertical forces radially over the spiral arms (see Figures 7, lower right panel), for these analyses the averaging happens vertically, i.e., at approximately one scale height above the plane where the model's vertical forces are equally well recovered at all radii (in spiral arms and between), while at small and large $| z|$ the model is bad. RoadMapping therefore also found a good average fit model for the stars in these volumes.

Overall, we find that if the volume is large enough to average over several spiral arms and inter-arm regions, an unlucky positioning with respect to the spiral arms does not lead to strong biases in the parameter recovery. We stress again that for particularly large volumes, ${r}_{\max }=5\,\mathrm{kpc}$ , we were able to recover all model parameters, including the halo scale length ${a}_{\mathrm{halo}}$ .

4.2.4. Recovering the Local Gravitational Forces

In the previous section, we found that the potential and qDF parameters recovered from different survey volumes can be quite different. While the differences can be explained qualitatively, it is not yet clear how good the corresponding potential constraints actually are in a quantitative sense. To test this, we calculate again ${\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{* ,i})$ and ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{* ,i})$ from Equations (23) and (24) at the position of each star ${{\boldsymbol{x}}}_{* ,i}$ in each data set (analogous to Figure 7). From the corresponding histograms of number of stars versus ${\rm{\Delta }}F$ , we derive the median and the $16\mathrm{th}$ and $84\mathrm{th}$ percentiles ( $1\sigma$ range) and show them in Figure 13. We chose this diagnostic because the forces at the positions of the stars are the quantities of the potential to which our modeling is sensitive.

The important key result from Figure 13 is that we get very close to recovering the true forces ${\rm{\Delta }}F({{\boldsymbol{x}}}_{* ,i})\lesssim 10 \%$ at the positions of the majority of stars in the survey volume, no matter how large or small the survey volume is. On average, the force recovery is also unbiased¹² for the ensemble of stars.

Figure 13 also contains some subtle clues that suggest that the quality of the force recovery could be correlated with the position of the data set with respect to the spiral arms. We investigate this further by relating in Figure 14 the local force recovery, i.e., the distribution of ${\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{* ,i})$ and ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{* ,i})$ for each data set, to the relative spiral contrast within the respective survey volume, ${\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}$ (Equation (6); see also Figure 4(d)). Figure 14 shows that the average fraction of stars for which the recovery of the radial or vertical force is bad (i.e., larger than 10%) increases with increasing spiral contrast ${\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}$ . This is as expected. Volumes in which the steep gradient in surface density around a strong spiral arm ( $| {{\rm{\Delta }}}_{\mathrm{Spiral},k}| \gt 0$ ) is not balanced by larger areas with less perturbations ( ${{\rm{\Delta }}}_{\mathrm{Spiral},k}\sim 0$ ) have (1) a large relative spiral contrast ${\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}$ , and (2) a large relative number of stars affected by the non-axisymmetric kinematics of the spiral arms. For these individual stars, the axisymmetric RoadMapping model is less successful in recovering the correct forces. (However, as we saw in Figure 13, the ensemble average is unbiased even in these cases.)

**Figure 14.** Influence of the spiral arm contrast on the recovery of the gravitational forces at the positions of the stars ${x}_{* ,i}$ that entered the *RoadMapping* analysis. Each circle/cross pair corresponds to one of our 22 data sets. The relative spiral contrast on the x-axis is quantified as ${\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}$ calculated within each survey volume according to Equation (6) in Section 2.4. On the y-axis, the fraction of stars ( ${f}_{\mathrm{stars}}$ ) in each data set is shown for which the radial (red circles) and vertical (blue crosses) force residual calculated from Equations (23) and (24) is larger than 10% (i.e., at 0 all stars have good force measurements, at 1 everything went wrong). The red and blue lines are linear fits to the radial and vertical force residual fraction, respectively, and are guides to the eye that show the clear and expected trend that in volumes with smaller spiral arm contrast, where comparably fewer stars are located in spiral arms, the axisymmetric best-fit model can recover the true gravitational forces also for more stars. On average, the radial and vertical forces are equally well recovered at a given spiral contrast.
Download figure:
Standard image High-resolution image

**Figure 14.** Influence of the spiral arm contrast on the recovery of the gravitational forces at the positions of the stars ${x}_{* ,i}$ that entered the *RoadMapping* analysis. Each circle/cross pair corresponds to one of our 22 data sets. The relative spiral contrast on the x-axis is quantified as ${\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}$ calculated within each survey volume according to Equation (6) in Section 2.4. On the y-axis, the fraction of stars ( ${f}_{\mathrm{stars}}$ ) in each data set is shown for which the radial (red circles) and vertical (blue crosses) force residual calculated from Equations (23) and (24) is larger than 10% (i.e., at 0 all stars have good force measurements, at 1 everything went wrong). The red and blue lines are linear fits to the radial and vertical force residual fraction, respectively, and are guides to the eye that show the clear and expected trend that in volumes with smaller spiral arm contrast, where comparably fewer stars are located in spiral arms, the axisymmetric best-fit model can recover the true gravitational forces also for more stars. On average, the radial and vertical forces are equally well recovered at a given spiral contrast.
Download figure:
Standard image High-resolution image

Interestingly, and even though there is some scatter, the force recovery at a given ${\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}$ is on average very similar for the radial and vertical forces (compare the linear fits in Figure 14). This means that RoadMapping attempts to fit both the radial and vertical forces at the positions of the stars, and is not particularly sensitive to just one of them.

As we saw in Figure 4(d), the spiral contrast ${\sigma }_{{\rm{\Delta }}\mathrm{Spiral}}$ increases for the different test volume positions approximately in this order: ${\mathtt{I}}{\mathtt{8}}\longrightarrow {\mathtt{S}}{\mathtt{8}}\longrightarrow {\mathtt{I}}{\mathtt{5}}\longrightarrow {\mathtt{S}}{\mathtt{5}}$ . From Figure 14, it follows that this is also the order in which the accuracy of the force recovery decreases. (We did not include this piece of additional information in Figure 14, but it can be seen in Figure 13, especially for the smaller volumes.)

4.2.5. Extrapolating the Gravitational Potential Model

It is also interesting to see how well the extrapolation of a recovered potential describes the overall gravitational potential of the galaxy. To investigate the extrapolability, we introduce another diagnostic that uses a cylindrical grid centered on the respective positions in Table 1, always having a radius of ${r}_{\max }=5\,\mathrm{kpc}$ and a height of $z=1.5\,\mathrm{kpc}$ both above and below the plane. In the (x, y) plane, the regular grid points have a distance of $0.25\,\mathrm{kpc}$ and in z they have a distance of $0.125\,\mathrm{kpc}$ to better sample the thin disk. (We throw out grid points close to the galactic center with $R\lt 0.125\,\mathrm{kpc}$ .) We then evaluate at the position ${{\boldsymbol{x}}}_{g,j}\equiv ({x}_{j},{y}_{j},{z}_{j})$ of each regular grid point the force residuals

$\begin{eqnarray}&&{\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{g,j})\equiv \displaystyle \frac{| {F}_{R,M}({R}_{j},{z}_{j})| -| {F}_{R,T}({x}_{j},{y}_{j},{z}_{j})| }{| {F}_{R,\mathrm{typ}}({R}_{j})| },\end{eqnarray} \tag{ 25 }$

$\begin{eqnarray}&&{\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{g,j})\equiv \displaystyle \frac{| {F}_{z,M}({R}_{j},{z}_{j})| -| {F}_{z,T}({x}_{j},{y}_{j},{z}_{j})| }{| {F}_{z,\mathrm{typ}}({R}_{j})| },\end{eqnarray} \tag{ 26 }$

analogous to Equations (21)–(24). The two panels in Figure 15 show the $[16\mathrm{th}$ , $84\mathrm{th}]$ percentile range and the median of the grid points' distribution in ${\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{g,j})$ and ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{g,j})$ .

**Figure 15.** Extrapolability of a *RoadMapping* potential model. This figure is similar to Figure 13 and shows the radial (upper panel) and vertical (lower panel) gravitational force residuals. However, instead of calculating the residuals at the positions of the stars, we evaluated them here at regular grid points ${{\boldsymbol{x}}}_{g,j}$ in a large cylinder of ${r}_{\max }=5\,\mathrm{kpc}$ and height $| z| =1.5\,\mathrm{kpc}$ around each survey volumes' center: ${\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{g,j})$ and ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{g,j})$ in Equations (25) and (26). This demonstrates how well the model can be extrapolated out to $5\,\mathrm{kpc}$ , i.e., how close to the truth the corresponding volume-averaged model is. The x-axis shows the radial size ${r}_{\max }$ of the survey volume from which the model was derived. The dot and error bars denote the median and $[16\mathrm{th}$ , $84\mathrm{th}]$ percentile range for each distribution of ${\rm{\Delta }}F({{\boldsymbol{x}}}_{g,j})$ . The extrapolability works better for the radial forces than for the vertical forces. We account the systematic overestimation of the vertical forces to the spiral arms and the flaring of the disk model at large radii (see Section 4.2.6). We find that we need at least a survey volume of ${r}_{\max }=3\,\mathrm{kpc}$ to get a potential with a reasonable extrapolability.
Download figure:
Standard image High-resolution image

First, we find that the radial forces are overall very well predicted, especially when derived from large survey volumes. There is, however, an overestimation of ∼5%–20% in the vertical forces (depending on volume size and position), which is induced by the spiral arms and is partly also due to a systematic error caused by the choice of potential model (see the explanation below in Section 4.2.6).

Second, the constraints we get on the spatially averaged forces inside $r\lt 5\,\mathrm{kpc}$ are almost as good when derived from a survey volume of ${r}_{\max }=3\,\mathrm{kpc}$ as compared to survey volumes of ${r}_{\max }=4$ or $5\,\mathrm{kpc}$ . If we had to decide between a ${r}_{\max }=3\,\mathrm{kpc}$ volume with good data quality and a larger volume with worse data quality, we would lose nothing in terms of extrapolability when using the smaller volume (only the halo scale length might not be as well constrained, see Figure 12).

Third, there are further indications in Figure 15 that the position of the survey volume with respect to the spiral arms matters for the force recovery.

In Figure 16, we stress that even more by relating the extrapolability, i.e., the volume-averaged distribution of force residuals, ${\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{g,j})$ and ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{g,j})$ , to the dominance of the spiral arm $\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle$ in the survey volume (Equation (5); see also Figure 4(c)). We derive the fraction of grid points ${{\boldsymbol{x}}}_{g,j}$ in the reference cylinder for each data set/RoadMapping model with forces that are misjudged by more than 10%, and plot it against $\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle$ . Positive or negative $\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle$ quantify how much the spiral arms or the inter-arm regions dominate the corresponding survey volume, respectively. A low fraction of grid points with $| {\rm{\Delta }}F| \gt 10 \%$ means that the extrapolation of the potential model works well.

$\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle $ — **Figure 16.** Influence of the spiral arm dominance on the extrapolability of the *RoadMapping* potential models. Each circle/cross pair corresponds to one of our 22 *RoadMapping* analyses. How much a spiral arm dominates within a given survey volume is quantified by $\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle$ from Equation (5) and shown on the x-axis. $\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle \sim 0$ means spiral arms and inter-arm regions dominate equally in the survey volume. The more negative $\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle$ is, the more dominated by an inter-arm region the survey volume is. The y-axis shows the volume fraction within which the radial (red circles) and vertical (blue crosses) gravitational force residuals are larger than 10%. In particular, we extrapolate each potential model within a cylindrical volume with ${r}_{\max }=5\,\mathrm{kpc}$ and $| z| \leqslant 1.5\,\mathrm{kpc}$ , centered on the respective survey volume positions given in Table 1. ${f}_{5\mathrm{kpc}}(| {\rm{\Delta }}F| \gt 10 \% )$ , therefore, quantifies how well a *RoadMapping* potential model can be extrapolated out to $5\,\mathrm{kpc}$ from the center of the survey volume; at ${f}_{5\mathrm{kpc}}(| {\rm{\Delta }}F| \gt 10 \% )\sim 0$ , the extrapolation works best. The red and blue lines are linear fits and only serve as a guide to the eye. This figure demonstrates that the extrapolability of the model gets better the less a spiral arm dominates the data set. For the dominance of inter-arm regions, this trend is less pronounced. Overall, the radial force can be much better extrapolated than the vertical force, and models from data sets centered on inter-arm regions can be more reliably extrapolated than those centered on spiral arms.
Download figure:
Standard image High-resolution image

First, we note that there is a clear trend that the extrapolability gets worse if a spiral arm strongly dominated the survey volume from which the potential constraint was derived ( $\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle \gg 0$ ). The same trend can be seen for the dominance of inter-arm regions ( $\langle {{\rm{\Delta }}}_{\mathrm{Spiral}}\rangle \ll 0$ ), but it is weaker and less clear.

Second, we note again that ${F}_{z}({{\boldsymbol{x}}}_{g,j})$ is not predicted as well as ${F}_{R}({{\boldsymbol{x}}}_{g,j})$ . The reason for this is laid out in Section 4.2.6.

The main result of Figure 16 is, however, the following. The extrapolability of models derived from data sets drawn from survey volumes centered on inter-arm regions appears to be in general better than that of data sets centered on spiral arms. We suspect that the reason for this is that the stellar distribution between spiral arms is smoother, more extended, and closer to the overall axisymmetric average model, such that the potentials recovered from these volumes have real predictive power for a much larger volume.

4.2.6. Biases in the Potential Recovery Caused by the Spiral Arms

What are the reasons for the biases that we observe in Figure 15?

The peak of the distribution in ${\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{g,j})$ (and also in ${\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{* ,i})$ in Figure 13) is slightly ( $\sim 1.5 \%$ ) biased toward an underestimation of $| {F}_{R}|$ in our RoadMapping models. This bias already showed up in the circular velocity curve in Figure 6 and also in the reference potential model DEHH-Pot. We therefore suspect that this bias is caused by the spiral arms. The line of argument goes as follows. Spiral arms are very thin. If a spiral arm crosses the observation volume, both its leading side (at large radii) and its trailing side (at small radii) are also in the volume. Stars on the trailing side feel a lower gravitational pull toward the galaxy center than they would if there was no spiral arm. Because there are, in general, more stars at smaller radii, the RoadMapping fit is slightly biased to reproduce slightly weaker radial forces.¹³

The peak in the distribution of ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{g,j})$ is strongly biased toward an overestimation of $| {F}_{z}|$ by the RoadMapping model. We illustrate the reason for this in Figure 17, where we show how ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{g,j})$ varies as a function of R for three example data sets with ${r}_{\max }=2\,\mathrm{kpc}$ . One reason for this bias is a relic of the assumed MNHH-Pot disk model family, which flares outside of $R\sim 8\,\mathrm{kpc}$ (see Section 3.3). The vertical forces in this region are therefore much higher in the model than in the true galaxy with its exponential disk. The main reason for this overestimation of F_z, however, comes directly from the spiral arms: there are much more stars in the spiral arms than in the inter-arm regions, and the stars in the spiral arm feel stronger vertical forces because of the higher surface mass density. The RoadMapping fit is driven by the majority of stars and the best-fit model therefore predicts, in general, higher vertical forces. As expected, the overestimation of F_z in Figure 15 is especially strong ( $\sim 20 \%$ ) for small survey volumes dominated by spiral arms, while small volumes dominated by an inter-arm region result in much better estimates for the spatially averaged ${F}_{z}({{\boldsymbol{x}}}_{g,j})$ ( $\sim 5 \%$ bias). Large volumes lie somewhere in between (bias of $\sim 10 \%$ ).

**Figure 17.** Examples for the extrapolability of a *RoadMapping* gravitational potential model as a function of galactocentric radius, R. In particular, we show the vertical force residuals ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{g,j})$ from Equation (26) for the *RoadMapping* potential models derived from three data sets with ${r}_{\max }=2\,\mathrm{kpc}$ at positions `S8`, `S5`, and `I5` (color-coded; see also Figure 1). The colored bands show the distribution (median with the central 50% percentiles) of ${\rm{\Delta }}{F}_{z}$ 's of all regular grid points ${{\boldsymbol{x}}}_{g,j}$ within a distance of $r\leqslant 5\,\mathrm{kpc}$ and $| z| \leqslant 1.5\,\mathrm{kpc}$ from the survey volumes center at a given R. This figure demonstrates the origin of the bias in the vertical force prediction, which we found in Figure 15. Because the fit is driven by the excess of stars that feel stronger vertical forces in the spiral arms, e.g., at $R\sim 3\,\mathrm{kpc}$ or $R\sim 6\,\mathrm{kpc}$ , we get the vertical force right at these radii. We consequently overestimate it in the inter-arm regions. In addition, the assumed gravitational potential model family, `MNHH-Pot` with a Miyamoto–Nagai disk, flares outside of $R\sim 8\,\mathrm{kpc}$ as compared to the true exponential disk.
Download figure:
Standard image High-resolution image

Why do the biases show up in different strengths in Figures 13 and 15?

The stellar number asymmetry in the trailing versus leading sides of spiral arms is much smaller than the stellar number asymmetry in the spiral arm versus the inter-arm region. The bias is therefore visible in the distribution of ${\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{* ,i})$ (because the F_R recovery is biased only by a few stars, which leads to a bias that is visible for the majority of stars) and not in ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{* ,i})$ (because the majority of stars bias the fit and we therefore also recover F_z for the majority of stars). The bias becomes particularly pronounced for ${\rm{\Delta }}{F}_{z}({{\boldsymbol{x}}}_{g,j})$ (because the inter-arm regions dominate when averaging spatially, which leads to a large average overestimation of F_z) and stays small for ${\rm{\Delta }}{F}_{R}({{\boldsymbol{x}}}_{g,j})$ (because trailing and leading sides of spiral arms are similarly important when averaging spatially; it therefore becomes visible that the bias is actually not that big).

5. Discussion and Outlook

5.1. On the Informativeness of an Orbit DF

The qDF appears to be very informative. We did expect it to be at least a reasonable model for the overall symmetrized disk of the galaxy simulation, considering its initial setup as an axisymmetric, exponentially decreasing particle distribution that subsequently evolved as a mono-age population (see Sections 2.1 and 3.2). In Sections 4.1.1, 4.1.3, and 4.1.4, we demonstrated that the qDF is indeed a good average model for the tracer distribution in a large survey volume—even though the spiral arms did introduce considerable deviations.

We had, however, no indications beforehand of how well the axisymmetric qDF would perform in a small survey volume completely dominated by non-axisymmetric spiral arms. It would not have been surprising if RoadMapping had failed. However, in Section 4.2.2, it turned out that the potential measurements were reliable even in most of the small volumes with ${r}_{\max }=[0.5,1]\,\mathrm{kpc}$ . Furthermore, the corresponding qDF parameters were tightly constrained and reasonable as well.

We deduce that the qDF is indeed flexible and robust enough to work with data affected by non-axisymmetries.

That the corresponding potential constraints were reliable as well leads to the following conclusion: a potential model that does not fit the gravitational forces acting on the stars appears to lead to such an unrealistic orbit/action distribution, that a fit with even such a simple orbit DF as the qDF is impossible. This demonstrates once more how powerful the concept of an orbit DF is.

5.2. On the Restrictiveness of the Parametrized Potential Model

How much does the choice of potential model matter for the success of the modeling?

We used, on the one hand, a bulge and halo model that reproduces the true bulge and halo better than we can hope to use in reality for the MW. The fact that for all except the largest volumes the true halo scale length is not remotely recovered (and some small volumes even have ${f}_{\mathrm{halo}}=0$ ), and that the contribution of the bulge to the overall potential is small (i.e., the bulge contribution to the total radial force at $R=8\,\mathrm{kpc}$ is only ∼9%–10%) remedies this apparent advantage.

On the other hand, we used a disk model, the Miyamoto–Nagai disk, that we chose purely for its convenient parametric form and of which we know that it is not a good model for the simulation snapshot; especially not for the radial density profile at large radii (Smith et al. 2015). As we saw in most figures in this paper this lead to biases in predicting the potential at radii where we have only a few or no stars. However, because the spiral arms are such strong perturbations in the overall potential, a better disk model would not give much better results.

It appears that a potential model with a reasonable shape and flexibility (here, disk+bulge+halo structure with five free parameters) can do well enough in finding a good fit, both locally for small volumes and overall for large volumes.

This is in agreement with one of our key results of Paper I, where we managed to successfully fit data from an MW-like (but axisymmetric) galaxy model with a bulgeless potential of a restrictive Stäckel form. This was illustrated in Figure 16 of Paper I. Considering that we used there the same number of stars, ${N}_{* }\,=$ 20,000, the potential uncertainties were much greater than in the analogous figure of this work, Figure 6(a). We believe that this is how RoadMapping accounts for an inconvenient potential parametrization—by increasing the uncertainty of the model estimate—which is exactly as it should be.

5.3. Gaia Measurement Errors and Choosing the Survey Volume Size

Considering measurement uncertainties of distances and proper motions, we found in Paper I that for a survey volume with ${r}_{\max }=3\,\mathrm{kpc}$ , distance uncertainties of $\lt 10 \%$ and proper motion uncertainties of less than $3\,\mathrm{mas}\ {\mathrm{yr}}^{-1}$ RoadMapping still gives unbiased parameter results. Even if the proper motion errors are not perfectly known.

The measurement uncertainties of Gaia in proper motions (already in the first DR $\delta \mu \sim 1\,\mathrm{mas}\ {\mathrm{yr}}^{-1};$ Lindegren et al. 2016) and in distances (at least for the final DR within $3\,\mathrm{kpc}$ and for bright stars; de Bruijne et al. 2014) lie below these limits.

In Paper I, we focused on recovering completely unbiased model parameters and found that RoadMapping is robust against moderate deviations of the model assumptions. In this work, we released the condition that the model parameters themselves had to be recovered accurately, but allowed RoadMapping to simply find an overall best fit for the data strongly affected by spiral arms—which was surprisingly successful in recovering the local potential even if the model parameters were not recovered.

We therefore presume that, in reality, we probably have an even larger margin of error than we found in Paper I, and before the measurement uncertainties noticeably muddle the constraints.

In addition, we found in this work that a volume of ${r}_{\max }=3\,\mathrm{kpc}$ should already be big enough to find an overall best-fit axisymmetric model for the Galaxy. At larger distances, dust starts affecting the measurements. Furthermore, inside of $R=3\mbox{--}4\,\mathrm{kpc}$ , the stellar motions become increasingly non-axisymmetric, possibly because of the Galactic bar (e.g., Reid et al. 2014; Bovy et al. 2015, and others, see the Introduction in Section 1).

Overall, we should therefore be very well-off by applying RoadMapping to the final Gaia data set within ${r}_{\max }=3\,\mathrm{kpc}$ only.

How well we can do with the first few Gaia DRs remains to be seen. The Gaia DR1 from September 2016 has parallax measurements of $\sim 16 \%$ for red-clump giants at a distance of $\sim 500\,\mathrm{pc}$ from the Sun (de Bruijne et al. 2014; Michalik et al. 2015), which is not precise enough for RoadMapping. We could, however, use photometric distances, which should be precise enough for red-clump stars and even extend over a larger volume. The Gaia DR2 in April 2018 might, however, already cover $\sim 2\,\mathrm{kpc}$ with good parallaxes for fainter stars as well.

5.4. Spiral Arms in the Solar Neighborhood

The Sun is located in one of the smaller spiral arms of the MW, the local Orion spur/arm (Morgan et al. 1953). Two of the MW's major spiral arms pass by the Sun within a few kiloparsecs: the Perseus arm is $\sim 2\,\mathrm{kpc}$ from the Sun (toward the outer MW; Xu et al. 2006), and the Sagittarius arm at $\sim 1\,\mathrm{kpc}$ (toward the Galactic center; Sato et al. 2010). It is, however, still under dispute which arms are actually major arms of the MW (Blaauw 1985; Xu et al. 2013; Zhang et al. 2013).

How reliable the RoadMapping results from the Gaia DR1 will be depends on the strength of the local Orion arm, which will dominate the survey volume. RoadMapping had, for example, some difficulties recovering the circular velocity curve for small volumes (see Figure 10).

However, recent measurements of the MW's rotation curve (Bovy et al. 2012a; Reid et al. 2014) confirm again that it is flat. We could impose this condition as a prior constraint in RoadMapping (or fix the rotation curve slope as Bovy & Rix 2013 did in their RoadMapping analysis). Based on independent information on the location and strength of the MW spiral arms, we could also use the present approach to estimate systematic uncertainties.

For later Gaia DRs, where the tracers extend further into the Galaxy ( $\sim 2\,\mathrm{kpc}$ from the Sun), the Sagittarius and Perseus arms could also play a role. In general, RoadMapping should do much better with larger volumes, and if several spiral arms and inter-arm regions within the survey volume average each other out. This averaging should work especially well if the MW is—like the simulation in this study—a four-armed spiral. The exact number of spiral arms, however, is still disputed (see Section 2.5). If Gaia should show that the MW is a two-armed grand-design spiral instead, having a very large survey volume would become even more important to achieve a good axisymmetric average RoadMapping model for the MW. Local measurements of the potential as in Figure 10 should, however, still be possible.

5.5. Absence of a Central Bar in the Simulation

The simulation we analyzed in this work does not have a prominent bar, and so we have not explicitly explored the impact of such a feature. Bars can play an important role in the dynamics when very small volumes near a resonance are considered (see, e.g., Dehnen 2000, and references in Section 2.5). When considering volumes of $\gtrsim 1\,\mathrm{kpc}$ , we have no reason to believe that this should severely affect the robustness of such an analysis.

5.6. Interpreting RoadMapping Results

The two main findings of this work give a clear directive regarding how we should deal with any future result about the MW's potential derived with RoadMapping. (1) If the data spans a large volume, a significant proportion of the disk (at least $R\sim 5\mbox{--}11\,\mathrm{kpc}$ ), and averages over several spiral arms and inter-arm regions, we can trust and use the resulting model as an overall axisymmetric potential for the MW. For $R\sim 3\mbox{--}13\,\mathrm{kpc}$ , we should even be able to make definite statements about the DM distribution. (2) If the data do not span such a large volume, we can still believe the local constraints. In particular, the surface density within one to two disk scale heights, the circular velocity within the survey volume, and the average gravitational forces where the majority of stars is located.

Fortunately, this is consistent with the procedure by Bovy & Rix (2013), who used RoadMapping to constrain the vertical force F_z for each MAP at only one radius that corresponds to a typical radius. It will be interesting to see whether RoadMapping potential constraints from Gaia data will agree with their findings.

In addition, one should always compare the distribution of the data and the recovered model in configuration space $({\boldsymbol{x}},{\boldsymbol{v}})$ and in action space, as we did in Section 4.1. This is not only a sanity check to confirm the goodness of the fit, but it might also reveal some substructure in the data that only becomes visible when comparing it to an axisymmetric smooth model. The overall axisymmetric best-fit MW potential and disk DF from RoadMapping could subsequently be also used as zeroth order potential and DF for a perturbation-theory analysis of the response of the disk to spiral structure (Monari et al. 2016).

5.7. Recovering Non-axisymmetric Structures in the Potential from Modeling Small Volumes

As we saw in Sections 4.2.2 and 4.2.4, RoadMapping also makes a very good attempt at constraining the local potential for volumes as small as ${r}_{\max }=500\,\mathrm{pc}$ or $1\,\mathrm{kpc}$ . It recovers, for example, the higher surface density in volumes dominated by spiral arms, or correctly estimates the circular velocity at the median radial position of the stars (see Figure 10).

At a time after the final Gaia DR, when we have data of high accuracy covering a large proportion of the disk, we could make use of this interesting property of RoadMapping modeling. We could split the data set not only into different MAPs analogous to Bovy & Rix (2013), but also into different spatial bins in the (x, y) plane of the MW and model each of the smaller volumes separately. This approach would only probe the local potential, even when using an axisymmetric potential model, and should be sensitive to the overdensities induced by the spiral arms. In this way, it should be possible to build up a non-axisymmetric map of the MW potential—albeit with very large spatial pixels—with constraints from dynamical modeling only.

6. Conclusion

RoadMapping is a well-tested axisymmetric dynamical modeling machinery that simultaneously fits an action-based DF and gravitational potential to the individual 6D phase-space coordinates of stellar populations in the MW disk. RoadMapping builds on previous work by Binney & McMillan (2011), Binney (2012), and Bovy (2015), and was first applied by Bovy & Rix (2013). RoadMapping was improved and tested in detail against the breakdowns of its modeling assumptions by Trick et al. (2016).

In this paper, we investigated the robustness of RoadMapping when modeling a non-axisymmetric system. We explore this for the first time explicitly, by modeling a simulated spiral galaxy from D'Onghia et al. (2013), and by comparing the results to the true potential. This simulation has stronger spiral arms than we expect in the MW, and—except for the absence of a bar and thick and gas disk components—it has matter components similar to the MW, as discussed in Section 2.5. We find that RoadMapping-like action-based dynamical modeling is very robust against perturbations of spiral arms in this simulation, especially if the survey volume is large enough to encompass both spiral and inter-arm regions. In Section 4.1, we demonstrated this in detail for a single RoadMapping analysis of a data set with a spatial coverage of radius ${r}_{\max }=4\,\mathrm{kpc}$ around a position equivalent to that of the Sun.

In Section 4.2, we have investigated the role of survey volumes differing in size and position with respect to the spiral arms and inter-arm regions within the simulated galaxy. We find that the gravitational forces are mostly well recovered at the locations of the stars that entered the analysis. For survey volume sizes ${r}_{\max }\geqslant 3\,\mathrm{kpc}$ , the recovered potential model already becomes a good average potential model for a large portion of the galaxy. For some positions of the survey volume center, e.g., in a smooth and not-too-depleted inter-arm region, smaller volumes can also give good overall constraints. If a small volume is dominated by a very strong spiral arm the constraints become less reliable, as expected. The correct DM halo scale length was, however, only recovered for a survey volume as large as ${r}_{\max }=5\,\mathrm{kpc}$ .

This overall robustness of RoadMapping is particularly notable because the breakdown of the assumption of axisymmetry implies a breakdown of several model assumptions simultaneously: (1) orbital actions are not fully conserved anymore, (2) the true potential is not spanned by the family of model potentials, (3) the qDF need not, or will not, describe the orbit distribution within spiral arms. However, the qDF seems to be informative enough to guide the fit to potential shapes that correctly measure the average surface density (within $\sim 2\times$ the disk scale height) and the circular velocity where most of the stars that entered the analysis are located—even for small volumes with ${r}_{\max }=500\,\mathrm{pc}$ dominated by spiral arms.

The results of this paper imply that RoadMapping should be well-suited to making new measurements of the MW's gravitational potential with the upcoming Gaia DRs. It might even potentially work with the Gaia DR1 with its smaller coverage of the disk ( ${r}_{\max }\sim 1\,\mathrm{kpc}$ ) because the local Orion arm, in which the Sun is located, is thought to be only a minor spiral arm in the MW and should not significantly disturb the modeling.

W.H.T. thanks Stephen Pardy for helpful advice on handling N-body simulations. W.H.T. and H.-W.R. acknowledge funding from the European Research Council under the European Unions Seventh Framework Programme (FP 7) ERC Grant Agreement n. $[321035]$ . J.B. received support from the Natural Sciences and Engineering Research Council of Canada and the Alfred P. Sloan Foundation. E.D. gratefully acknowledges the support of the Alfred P. Sloan Foundation and ATP NASA Grant No NNX144AP53G, and support by Sonderforschungsbereich SFB 881 "The Milky Way System" (A3) of the German Research Foundation (DFG).