Three-dimensional continued fractions and Kloosterman sums

A. V. Ustinov

doi:10.1070/RM2015v070n03ABEH004953

§ 1. Introduction

1.1. Linnik's problem

Many number-theoretic problems can be reduced to the study of a Diophantine equation

$\begin{equation} F(x_1,\dots,x_n)=P, \end{equation} \tag{ 1.1 }$

where $F$ is a homogeneous polynomial and $P$ is an integer. The only general method that makes it possible to describe the asymptotic properties of solutions of equation (1.1) is the Hardy–Littlewood circle method, which, however, requires additional properties of the polynomial $F$ (see [1]). In certain situations the distribution of the solutions of (1.1) can be investigated by methods of algebraic number theory and algebraic geometry (see [2], [3]), but in the general case one does not even know estimates of the correct order for the number of solutions.

If equation (1.1) defines a homogeneous variety with an action of a linear algebraic group, then the possibility arises of applying methods of harmonic analysis on this group (see [4]–[7]). An important special case of such a situation is the determinant equation

$\begin{equation} \det X=P, \end{equation} \tag{ 1.2 }$

where $X$ is a square matrix with independent coefficients. For $3\times 3$ matrices Linnik and Skubenko [8] (see also [9], Chap. VIII) proved that as $P\to\infty$ the integer solutions of equation (1.2) are uniformly distributed with respect to the Haar measure. They solved the problem under the assumption that the normalized matrix $\widetilde{X}=XP^{-1/3}$ is contained in some fixed domain $\Omega\subset SL_3(\mathbb{R})$ of finite measure, and proved an asymptotic formula for the number of solutions of (1.2) without an explicit power saving in the remainder term. (An explicit estimate for the remainder term for the domain $\|\widetilde{X}\|\leqslant 1$ , where $\|\,\cdot\,\|$ is the Euclidean norm, was given in [10].) In the general case the problem of the distribution of the integer solutions of equation (1.1) is known as Linnik's problem and is also usually considered under the assumption that $\widetilde{X}=XP^{-1/d}\ll 1$ , where $d$ is the degree of the polynomial $F$ (see [4], [5], [10]). See [11]–[14] for development of the method of Linnik and Skubenko.

1.2. Linnik–Skubenko reduction

By Weyl's criterion (see, for example, [15]) a necessary and sufficient condition for the uniform distribution modulo $1$ of a system of functions $(f_1(x),\dots,f_n(x))$ is that

$\begin{equation*} \sum_{x=1}^{P}e(m_1f_1(x)+\dots+m_nf_n(x))=o(P)\qquad (P\to\infty), \end{equation*}$

where $m_1,\dots,m_n$ are arbitrary integers that are not simultaneously equal to zero, and $e(t)=e^{2\pi\sqrt{-1}\,t}$ . This condition makes it possible to reduce the study of the uniform distribution of systems of functions to estimates of the corresponding trigonometric sums.
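As a minimal numerical sketch of this criterion in the one-function case $n=1$, the following Python snippet evaluates the normalized trigonometric sum for $f(x)=x\sqrt{2}$. The choice of $f$ and the helper names `e`, `weyl_sum` and `f` are illustrative assumptions and are not taken from the text.

```python
import cmath
import math

def e(t):
    """e(t) = exp(2*pi*i*t), the notation used in the text."""
    return cmath.exp(2j * math.pi * t)

def weyl_sum(f, m, P):
    """Sum of e(m*f(x)) over x = 1..P (one-function case of Weyl's criterion)."""
    return sum(e(m * f(x)) for x in range(1, P + 1))

def f(x):
    # Illustrative choice: x*sqrt(2) is a standard uniformly distributed sequence mod 1.
    return x * math.sqrt(2)

if __name__ == "__main__":
    for P in (10, 100, 1000, 10000):
        # The ratio |sum| / P should tend to zero as P grows.
        print(P, abs(weyl_sum(f, 1, P)) / P)
```

For this particular $f$ the sum is a geometric progression, so its absolute value stays bounded and the ratio decays like $1/P$.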

Many problems related to planar integer lattices (see details in §2.2) can be reduced to the investigation of solutions of the determinant equation

$\begin{equation} \det \begin{pmatrix} a_1&a_2 \\ b_1&b_2 \end{pmatrix}=P. \end{equation} \tag{ 1.3 }$

This equation can be replaced by the equivalent congruence

$\begin{equation} a_2b_1+P\equiv 0\pmod{a}, \end{equation} \tag{ 1.4 }$

assuming that $a_1=a> 0$ is fixed and that the value of $b_2$ is determined from the equation $b_2=(a_2b_1+P)a^{-1}$ .
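The passage from (1.3) to (1.4) can be checked on a small example. The sketch below lists the residue pairs $(a_2,b_1)$ modulo $a$ that satisfy (1.4) and recovers $b_2$ from them; the function names `solutions_mod_a`, `lift` and `det2` are hypothetical and serve only this illustration.

```python
def solutions_mod_a(a, P):
    """Pairs (a2, b1) modulo a satisfying the congruence (1.4)."""
    return [(a2, b1) for a2 in range(a) for b1 in range(a)
            if (a2 * b1 + P) % a == 0]

def lift(a, a2, b1, P):
    """Recover b2 = (a2*b1 + P)/a and return the 2x2 matrix of equation (1.3)."""
    b2, remainder = divmod(a2 * b1 + P, a)
    if remainder:
        raise ValueError("(a2, b1) does not satisfy the congruence (1.4)")
    return ((a, a2), (b1, b2))

def det2(M):
    """Determinant of a 2x2 matrix given as a pair of rows."""
    (a1, a2), (b1, b2) = M
    return a1 * b2 - a2 * b1

if __name__ == "__main__":
    a, P = 5, 7
    for a2, b1 in solutions_mod_a(a, P):
        M = lift(a, a2, b1, P)
        print(M, det2(M))
```

Every pair congruent modulo $a$ to one of the listed residues lifts in the same way, which is why the distribution of solutions of (1.3) with fixed $a_1=a$ is governed by the congruence alone.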

By using Weyl's criterion the problem of the uniform distribution of solutions of the congruence (1.4) can be reduced to estimates of the sums

$\begin{equation} K_a(m,n,q)=\sum_{x,y=1}^{a}\delta_a(xy+q)e \biggl(\frac{mx+ny}{a}\biggr), \end{equation} \tag{ 1.5 }$

where $m,n,q$ are arbitrary integers, $a$ is a positive integer, and $\delta_a$ is the characteristic function of divisibility by $a$ :

$\begin{equation*} \delta_a(x)=\begin{cases} 1& \text{if} \ x\equiv 0\!\!\!\pmod{a}, \\ 0& \text{if} \ x\not\equiv 0\!\!\!\pmod{a}. \end{cases} \end{equation*}$

For $q=-1$ the sums (1.5) coincide with the classical Kloosterman sums

$\begin{equation} K_a(m,n)=\sum_{x,y=1}^{a}\delta_a(xy-1)e\biggl(\frac{mx+ny}{a}\biggr). \end{equation} \tag{ 1.6 }$

Non-trivial estimates are known for the sums (1.6) and (1.5), and this makes it possible to find asymptotic formulae for sums of the form

$\begin{equation} \sum_{a_1b_2-a_2b_1=P}f(a_1,a_2,b_1,b_2) \end{equation} \tag{ 1.7 }$

by replacing them with the corresponding integrals. Problems in the geometry of numbers, the theory of continued fractions, and so on, can be reduced to the calculation of similar sums (see the survey [16]).
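For small moduli the sums (1.5) and (1.6) can be evaluated directly from the definition. The following brute-force sketch does this and, for prime moduli, compares the result with the classical Weil bound $|K_p(m,n)|\leqslant 2\sqrt{p}$ for $p\nmid mn$ (one example of the non-trivial estimates mentioned above). The function names are assumptions made for this illustration.

```python
import cmath
import math

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)

def kloosterman(a, m, n, q=-1):
    """Brute-force K_a(m, n, q) from (1.5); q = -1 gives the classical sum (1.6)."""
    total = 0j
    for x in range(1, a + 1):
        for y in range(1, a + 1):
            if (x * y + q) % a == 0:          # delta_a(xy + q) = 1
                total += e((m * x + n * y) / a)
    return total

if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        worst = max(abs(kloosterman(p, m, n))
                    for m in range(1, p) for n in range(1, p))
        print(p, worst, 2 * math.sqrt(p))     # worst case against the Weil bound
```

The double loop costs $O(a^2)$ operations per sum, so this is meant only as a check of the definitions, not as a practical algorithm.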

In the three-dimensional case, bases of lattices with determinant $P$ are parametrized by solutions of the determinant equation (1.2), where $X$ is a matrix of the form

$\begin{equation} X=\begin{pmatrix} a_1&a_2&a_3 \\ b_1&b_2&b_3 \\ c_1&c_2&c_3 \end{pmatrix} \end{equation} \tag{ 1.8 }$

in which the coordinates of the basis vectors are written in columns. In the study of Voronoi–Minkowski three-dimensional continued fractions (see the original publications [17], [18], and [19], as well as their exposition in [20], [21], and [22]), the necessity arises of counting the solutions of (1.2) for which the normalized matrix $\widetilde{X}=XP^{-1/3}$ can vary in a domain $\Omega\subset SL_3(\mathbb{R})$ of infinite measure.

The Linnik–Skubenko method was based on reduction to the preceding dimension — to the determinant equation (1.3). The main idea was that if the matrix $\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$ is fixed and has non-zero determinant $q$ , and $(q,P)=1$ , then for each solution (1.8) of equation (1.2) it is possible to construct a series of solutions

$\begin{equation} \begin{pmatrix} a_1 & a_2 & za_3+sa_1+ta_2 \\ b_1 & b_2 & zb_3+sb_1+tb_2 \\ z^{-1}c_1+ua_1+vb_1&z^{-1}c_2+ua_2+vb_2&* \end{pmatrix}, \end{equation} \tag{ 1.9 }$

where $z\in\mathbb{Z}_q^*$ , $zz^{-1}\equiv 1\pmod{q}$ , and $s,t,u,v$ are arbitrary integers. The presence of the parameter $z$ , which is non-linearly involved in the parametrization (1.9), makes it possible to use Kloosterman sums for reducing the problem of the distribution of the solutions of (1.2) to sums of the form (1.7), the methods of calculation of which are well known.

In [23] a more precise version of the Linnik–Skubenko reduction was proposed which is applicable, in particular, for domains $\Omega$ of infinite volume. The auxiliary two-dimensional problems arising after the reduction were solved in [24]. In the present paper the results in [23] and [24] are applied to the study of statistical properties of Voronoi–Minkowski three-dimensional continued fractions. One can expect that the proposed approach will turn out to be useful also for solution of other problems related to three-dimensional lattices.

1.3. Theorems of Heilbronn and Porter

For a rational $r$ , let $l(r)$ denote the length of the expansion of $r$ into a finite continued fraction

$\begin{equation*} r=[a_0;a_1,\dots,a_l]=a_0+ \frac{1}{a_1+{\atop\ddots\,\displaystyle{+\cfrac{1}{a_l}}}}\,, \end{equation*}$

where $a_0=\lfloor r\rfloor$ (the integer part of $r$ ), $a_1,\dots,a_l$ are positive integers, and $a_l\geqslant 2$ for $l\geqslant1$ .

Heilbronn [25] proved an asymptotic formula for the average value of $l(r)$ taken over rational numbers $r$ with a fixed denominator:

$\begin{equation} \frac{1}{\varphi(P)}\sideset{}{^*}\sum_{a=1}^{P}l\biggl(\frac{a}{P}\biggr)= \frac{2\log 2}{\zeta(2)}\log P+O((\log\log P)^4) \end{equation} \tag{ 1.10 }$

(henceforth an asterisk means that the summation is carried out over the reduced system of residues). Porter [26] later refined this result by isolating the next significant term, which is an absolute constant:

$\begin{equation} \frac{1}{\varphi(P)}\sideset{}{^*}\sum_{a=1}^{P}l\biggl(\frac{a}{P}\biggr)= \mathscr{Q}_1(\log P)+O(P^{-1/6+\varepsilon}) \end{equation} \tag{ 1.11 }$

(we denote by $\mathscr{Q}_m(x)$ a polynomial of degree $m$ in a variable $x$ ; the constants in the symbols $O$ are always assumed to depend on an arbitrarily small positive number $\varepsilon$ ). Heilbronn's proof is elementary. Porter used estimates for Kloosterman sums and estimates for trigonometric sums according to van der Corput.
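The left-hand side of (1.10) and (1.11) is easy to compute for a single modulus and to compare with the leading term $\dfrac{2\log 2}{\zeta(2)}\log P$. The sketch below does this with the Euclidean algorithm; the names `cf_length`, `average_length` and `leading_term` are hypothetical, and the constant term of (1.11) is deliberately not hard-coded.

```python
import math

def cf_length(a, P):
    """l(a/P): the number of partial quotients after a_0 in a/P = [a_0; a_1, ..., a_l]."""
    num, den = a, P
    divisions = 0
    while den:
        num, den = den, num % den
        divisions += 1
    return divisions - 1          # the first division produces a_0

def average_length(P):
    """Average of l(a/P) over 1 <= a <= P with gcd(a, P) = 1."""
    lengths = [cf_length(a, P) for a in range(1, P + 1) if math.gcd(a, P) == 1]
    return sum(lengths) / len(lengths)

def leading_term(P):
    """(2 log 2 / zeta(2)) * log P, with zeta(2) = pi^2 / 6."""
    return 2 * math.log(2) / (math.pi ** 2 / 6) * math.log(P)

if __name__ == "__main__":
    for P in (101, 1009, 10007):
        print(P, average_length(P), leading_term(P))
```

The difference between the two printed columns stays bounded as $P$ grows, in line with the constant term in Porter's formula.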

1.4. Brief statement of the main result

In the present paper the refined version in [23] of Linnik–Skubenko reduction is used to construct a method of analysis of minimal bases in three-dimensional lattices. In particular, this method enables us to prove a three-dimensional analogue of Porter's result (1.11) for them.

Theorem 1.1. The average number of Minkowski bases over totally primitive lattices with determinant $P$ has the asymptotics

$\begin{equation} \mathscr{Q}_2(\log P)+O(P^{-1/34+\varepsilon}). \end{equation} \tag{ 1.12 }$

A more precise statement of this result will be given below after the definitions of all the requisite notions (see Theorem 3.3 below). From the viewpoint of Linnik's problem, the presence in the asymptotics of a polynomial of second degree in the logarithm of $P$ is explained by the fact that when counting solutions of the equation $\det X=P$ one has to deal with a domain of infinite volume on the variety defined by the equation $\det \widetilde{X}=1$ .

A three-dimensional analogue of Heilbronn's theorem (1.10) was proved by Illarionov [27] (see [28]–[30] concerning other multidimensional generalizations). Illarionov's arguments make it possible to determine the leading term in the asymptotic formula (1.12) with remainder $O(\log P\log\log P)$ .

Equation (1.11) can be interpreted as a formula for the average length of the Euclidean algorithm applied to a pair of numbers $(a,P)$ such that $1\leqslant a\leqslant P$ and $(a,P)=1$ . From the geometric viewpoint, the left-hand side of equation (1.11) can be understood as the average number of minimal bases in lattices with bases from the pair of vectors $(1,a)$ and $(0,P)$ . The formula (1.12) describes the average number of minimal bases in three-dimensional lattices generated by the vectors $(1,0,a)$ , $(0,1,b)$ , and $(0,0,P)$ (see §2.3). The same quantity can be interpreted as the average number of all possible bases that can appear in the Euclidean algorithm applied to a triple $(a,b,P)$ in which $1\leqslant a,b\leqslant P$ and $(a,P)=(b,P)=1$ .

1.5. Plan of the paper

In §2 we give a brief survey of results connected with the metric theory of infinite and finite continued fractions.

In §3 we discuss three-dimensional continued fractions according to Voronoi and Minkowski. The precise statement of the main result is given.

In order to illustrate the scheme of the proof of the main theorem, we briefly describe in §4 the main steps needed to solve a model problem — the proof of a simplified version of equation (1.11). The general scheme of arguments in the proof of the main Theorem 3.3 is the same.

In §5 the solutions of equation (1.2) are divided into groups, in each of which the solutions will be counted independently (until a certain moment). At the end of §5.3 we describe the detailed scheme of proof of the main result.

In §§6–8 asymptotic formulae are proved for the number of solutions of equation (1.2) with fixed corner element $a_1$ and with corner minor $q=\begin{vmatrix} a_1 & a_2\\ b_1 & b_2 \end{vmatrix}$ , by three different methods.

In §9 we complete the proof of the main Theorem 3.3.

§ 2. Metric properties of continued fractions

2.1. Gauss measure

The metric theory of continued fractions goes back to Gauss' problem on the typical behaviour of numbers of the form

$\begin{equation*} \alpha_n=T^n(\alpha)=[0;a_{n+1},a_{n+2},\dots], \end{equation*}$

where $\alpha=[0;a_{1},a_{2},\dots]$ is a random number in the interval $[0,1)$ and $T$ is the Gauss map:

$\begin{equation*} T(\alpha)=\biggl\{\frac{1}{\alpha}\biggr\} \quad\text{for } \alpha\ne 0,\qquad T(0)=0. \end{equation*}$

For a real number $\xi\in[0,1]$ , let $F_n(\xi)$ denote the measure of the set of numbers $\alpha\in[0,1)$ for which $\alpha_n \leqslant \xi$ . In studying iterations of the map $T$ , Gauss arrived at the conjecture that

$\begin{equation} \lim_{n \to \infty} F_n (\xi)=\log_2 (1+\xi)=\frac{\log(1+\xi)}{\log 2} \end{equation} \tag{ 2.1 }$

(this is known from the correspondence of Gauss with Laplace; see [31], Chap. 3). Kuz'min [32] obtained the asymptotic formula

$\begin{equation*} F_n(\xi)=\log_2(1+\xi)+O(e^{-\lambda\sqrt n})\qquad (\lambda>0), \end{equation*}$

from which Gauss' conjecture follows. Kuz'min's result was refined by Lévy [33] and Wirsing [34]. The definitive solution of Gauss' problem is due to Babenko [35]. He proved the existence of an infinite sequence of numbers $\lambda_k$ tending to zero,

$\begin{equation*} 1>|\lambda_1|>|\lambda_2| \geqslant \cdots\geqslant|\lambda_k|\geqslant |\lambda_{k+1}| \geqslant \cdots, \end{equation*}$

and a corresponding sequence of analytic functions $\psi_k(\xi)$ such that

$\begin{equation} F_n (\xi)=\log_2(1+\xi)+\sum_{k=1}^\infty \psi_k(\xi)\lambda^n_k. \end{equation} \tag{ 2.2 }$

Equation (2.1) means that the typical behaviour of the numbers $\alpha_n=T^n(\alpha)$ is described by the Gauss measure

$\begin{equation*} d\nu(\xi)=\frac{1}{\log2}\,\frac{d\xi}{1+\xi}\,,\qquad \xi\in[0,1], \end{equation*}$

which is invariant under the map $T$ . In particular, this implies that the probabilities of the appearance of positive integers $k$ as partial quotients of real numbers are described by the Gauss–Kuz'min distribution

$\begin{equation} p_k=-\log_2\biggl(1-\frac{1}{(k+1)^2}\biggr),\qquad k\geqslant 1. \end{equation} \tag{ 2.3 }$
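Both claims can be checked numerically. In the sketch below (function names are illustrative assumptions) the probability $p_k$ is compared with the Gauss measure of the interval $[1/(k+1),1/k]$, on which the first partial quotient equals $k$, and the invariance of $\nu$ is tested by summing the measures of the branches of $T^{-1}[0,\xi]$.

```python
import math

def gauss_measure(x0, x1):
    """Gauss measure of the interval [x0, x1] contained in [0, 1]."""
    return math.log2((1 + x1) / (1 + x0))

def p(k):
    """Gauss-Kuz'min probability (2.3) of the partial quotient k."""
    return -math.log2(1 - 1 / (k + 1) ** 2)

def preimage_measure(xi, terms=10 ** 5):
    """Gauss measure of T^{-1}[0, xi], truncated to the first `terms` branches.

    On the branch floor(1/alpha) = k the condition T(alpha) <= xi
    means 1/(k + xi) <= alpha <= 1/k.
    """
    return sum(gauss_measure(1 / (k + xi), 1 / k) for k in range(1, terms + 1))

if __name__ == "__main__":
    print([round(p(k), 4) for k in range(1, 6)])
    print(preimage_measure(0.3), gauss_measure(0.0, 0.3))
```

The partial sums of $p_k$ telescope to $\log_2\bigl(2(N+1)/(N+2)\bigr)$, which tends to $1$ as $N\to\infty$.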

In many cases it is more convenient to consider the extended Gauss measure

$\begin{equation} d\overline\nu(\alpha,\beta)=\frac{1}{\log2}\, \frac{d\alpha\,d\beta}{(1+\alpha\beta)^2}\,,\qquad (\alpha,\beta)\in[0,1]^2, \end{equation} \tag{ 2.4 }$

which is invariant under the map

$\begin{equation*} \overline T\colon(\alpha,\beta)\to\begin{cases} \biggl(\biggl\{\dfrac{1}{\alpha}\biggr\}, \biggl(\biggl[\dfrac{1}{\alpha}\biggr]+\beta\biggr)^{-1}\biggr)&\text{for} \ \alpha\ne 0, \\ (\alpha,\beta)&\text{for} \ \alpha= 0 \end{cases} \end{equation*}$

(see [36]), which is almost everywhere invertible. If we expand the coordinates of the initial point $(\alpha_0,\beta_0)$ into continued fractions

$\begin{equation} \alpha_0=[0;a_0,a_1,\dots],\qquad \beta_0=[0;a_{-1},a_{-2},\dots], \end{equation} \tag{ 2.5 }$

then on the doubly infinite sequence $(\dots,a_{-2},a_{-1},a_0, a_1,a_2,\dots)$ obtained by concatenation of the expansions (2.5) written in opposite directions the map $\overline T$ is equivalent to a shift: $\overline T^{n}(\alpha_0,\beta_0)=(\alpha_{n},\beta_{n})$ , where $n$ is an arbitrary integer, $\alpha_n=[0;a_n,a_{n+1},\dots]$ , and $\beta_n=[0;a_{n-1},a_{n-2},\dots]$ .
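The shift property can be verified exactly on rational points, where both continued fractions are finite. The following sketch uses exact rational arithmetic; the helper names `cf_value` and `extended_gauss_map` are assumptions, and the last partial quotient of each finite expansion is taken to be at least $2$ so that the expansions are unambiguous.

```python
from fractions import Fraction
from math import floor

def cf_value(quotients):
    """Value of the finite continued fraction [0; q_1, ..., q_k]; the empty list gives 0."""
    value = Fraction(0)
    for q in reversed(quotients):
        value = 1 / (q + value)
    return value

def extended_gauss_map(alpha, beta):
    """One step of the map T-bar on exact rational coordinates."""
    if alpha == 0:
        return alpha, beta
    inverse = 1 / alpha
    k = floor(inverse)                 # integer part [1/alpha]
    return inverse - k, 1 / (k + beta)

if __name__ == "__main__":
    alpha, beta = cf_value([2, 3, 4]), cf_value([5, 6])
    # Expect the quotient 2 to move from the expansion of alpha to that of beta.
    print(extended_gauss_map(alpha, beta))
    print(cf_value([3, 4]), cf_value([2, 5, 6]))
```

Each application moves the leading partial quotient of $\alpha$ to the front of the expansion of $\beta$, which is the shift described above.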

2.2. Statistical properties of finite continued fractions

A discrete version of Gauss' problem is to study the statistical properties of finite continued fractions. For rational numbers (as well as for real numbers) it is convenient to interpret the Gauss–Kuz'min statistics in a wider sense. For real $\xi,\eta\in[0,1]$ and rational $r$ , we define the Gauss–Kuz'min statistics by the equation

$\begin{equation} l_{\xi,\eta}(r)=\sum_{j=1}^{s+1}\bigl[[0;a_j,\dots,a_s]\leqslant \xi,[0;a_{j-1},\dots,a_1]\leqslant \eta\bigr]. \end{equation} \tag{ 2.6 }$

We assume that an empty continued fraction is equal to zero by definition. If $A$ is some condition, then $[A]$ is the characteristic function of the set defined by this condition: $[A]=1$ if the condition holds, and $[A]=0$ otherwise. In particular, $l_{1,1}(r)=l(r)$ .

The distribution of the partial quotients in the expansion of numbers $a/b$ in the case where $1\leqslant a\leqslant b\leqslant P$ and $P\to\infty$ was first studied in 1961 by Lochs (see [37], as well as [25], [38]). Later this problem was posed in a more general setting by Arnold as Problem 1993-11 in [39] (the papers [40]–[45] were devoted to Arnold's problem). Lochs' result can be interpreted as follows: the asymptotic formula

$\begin{equation} \frac{2}{P^2}\sum_{1\leqslant a\leqslant b\leqslant P} l_{\xi,\eta}\biggl(\frac{a}{b}\biggr)= \frac{2}{\zeta(2)}\int_{0}^{\xi}\,\int_{0}^{\eta} \frac{d\alpha\,d\beta}{(1+\alpha\beta)^2}\log P+C(\xi,\eta)+ O(P^{-1/2+\varepsilon}) \end{equation} \tag{ 2.7 }$

holds for the average value of the Gauss–Kuz'min statistics (2.6), and its leading coefficient is proportional to the extended Gauss measure (2.4) of the rectangle $[0,\xi]\times [0,\eta]$ .

The function $C(\xi,\eta)$ , as well as the left-hand side of equation (2.7), is discontinuous at all the points that have at least one rational coordinate. This function is defined by a singular series, that is, a series consisting of the remainder terms of asymptotic formulae (see [37], [44], [46], [47]).

Equation (1.11) can also be generalized to the case of the Gauss–Kuz'min statistics:

$\begin{equation} \frac{1}{\varphi(P)}\sideset{}{^*}\sum_{a=1}^{P} l_{\xi,\eta}\biggl(\frac{a}{P}\biggr)= \frac{2\log P}{\zeta(2)}\int_{0}^{\xi}\,\int_{0}^{\eta} \frac{d\alpha\,d\beta}{(1+\alpha\beta)^2}+ \widetilde{C}(\xi,\eta)+O(P^{-1/6+\varepsilon}) \end{equation} \tag{ 2.8 }$

(for $\eta=1$ the proof is given in [43]; the formula for the principal term follows from Heilbronn's result [25]; the functions $C(\xi,\eta)$ and $\widetilde{C}(\xi,\eta)$ can be expressed in terms of each other). By comparing (2.2) with (2.7) (or with (2.8)) we can conclude that the principal terms in the continuous and discrete problems are proportional and are determined by an invariant measure, while the next significant terms differ and are of a fundamentally different nature.

The analytic apparatus used in the study of statistical properties of finite continued fractions makes it possible to solve also other problems in which continued fractions are used as an auxiliary tool.

In particular, this makes it possible

$\bullet$ to analyse the typical behaviour of the Euclidean algorithms with rounding off to the nearest integer [48], [49], with even and odd partial quotients [50]–[52], with by-excess division [53], [54], and with subtractive division [55], [56], and to analyse the typical behaviour of Minkowski diagonal fractions [57] and continued fractions of more general form [58], as well as to study continued fraction expansions of quadratic irrationals [45];

$\bullet$ to find the distribution density in Sinai billiards (circular scatterers are placed at nodes of the lattice $\mathbb{Z}^2$ ) of the random variable equal to the length of the free path of a particle [59]–[62], and (in an equivalent problem) to calculate the joint distribution density of the lengths of neighbouring segments connecting the origin with primitive points of the integer lattice [63], [64];

$\bullet$ to describe the limit distribution of Frobenius numbers with three arguments [65]–[69];

$\bullet$ to prove the existence of a limit distribution density for partial Gauss sums and partial theta-series [70]–[72], and also to find these densities [73].

It should be pointed out that for all the problems listed above there exist other approaches based on ergodic theory and methods of the geometry of numbers. Ergodic methods usually turn out to be applicable in a more general situation, but in comparison with analytic methods they require averaging over a greater number of parameters and produce less accurate remainder terms.

In particular, by using ergodic theory it is possible

$\bullet$ to analyse a wide class of Euclidean algorithms [74]–[76];

$\bullet$ to study Sinai billiards in spaces of arbitrary dimension [77], [78], and to study the behaviour of the free path of particles in quasicrystalline structures [79];

$\bullet$ to prove the existence of a limit distribution density for Frobenius numbers with an arbitrary number of arguments [80], and to describe the properties of this density [81]–[83];

$\bullet$ to obtain results on the behaviour of partial Gauss sums and partial theta-series [84], [85].

2.3. Totally primitive lattices

Let $M(v_1,\dots,v_s)$ denote the matrix in which the coordinates of the vectors $v_1,\dots,v_s$ are written in the columns. A full lattice $\Lambda\subset\mathbb{Z}^s$ with basis $(v_1,\dots,v_s)$ is said to be totally primitive if for every row of the matrix $M(v_1,\dots,v_s)$ the minors corresponding to the elements of this row are setwise coprime. For example, for a matrix $\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$ this condition means that $(a_1,a_2)=(b_1,b_2)=1$ , that is, for $s=2$ the notions of primitive and totally primitive lattice coincide. If $s=3$ , then for a basis matrix of the form (1.8) the condition of being totally primitive is written in the form

$\begin{equation} \begin{gathered}\biggl(\begin{vmatrix} a_1&a_2 \\ b_1&b_2 \end{vmatrix}, \begin{vmatrix} a_2&a_3 \\ b_2&b_3 \end{vmatrix}, \begin{vmatrix} a_1&a_3 \\ b_1&b_3 \end{vmatrix}\biggr)=1,\end{gathered} \end{equation} \tag{ 2.9 }$

$\begin{equation} \begin{gathered}\biggl(\begin{vmatrix} a_1&a_2 \\ c_1&c_2 \end{vmatrix}, \begin{vmatrix} a_1&a_3 \\ c_1&c_3 \end{vmatrix}, \begin{vmatrix} a_2&a_3 \\ c_2&c_3 \end{vmatrix}\biggr)=1,\end{gathered} \end{equation} \tag{ 2.10 }$

$\begin{equation} \begin{gathered}\biggl(\begin{vmatrix} b_1&b_2 \\ c_1&c_2 \end{vmatrix}, \begin{vmatrix} b_1&b_3 \\ c_1&c_3 \end{vmatrix}, \begin{vmatrix} b_2&b_3 \\ c_2&c_3 \end{vmatrix}\biggr)=1.\end{gathered} \end{equation} \tag{ 2.11 }$

An equivalent definition of a totally primitive lattice is obtained if the lattice is required to have a basis with a matrix

$\begin{equation} \begin{pmatrix} 1 & 0& 0 \\ 0 & 1& 0 \\ a & b & P \end{pmatrix}, \end{equation} \tag{ 2.12 }$

where $0\leqslant a, b<P$ and $(a,P)=(b,P)=1$ . In particular, this implies that there exist $\varphi^{2}(P)$ three-dimensional totally primitive lattices with determinant $P$ .
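As a small sanity check, conditions (2.9)–(2.11) can be tested directly on matrices of the form (2.12). The sketch below counts the pairs $(a,b)$ with $0\leqslant a,b<P$ for which all three conditions hold and compares the count with $\varphi^2(P)$. The function names are hypothetical, and the code only counts matrices of this special form; it does not verify that different pairs give different lattices.

```python
from itertools import combinations
from math import gcd

def minor(r, s, i, j):
    """2x2 minor formed by rows r, s and columns i, j."""
    return r[i] * s[j] - r[j] * s[i]

def is_totally_primitive(M):
    """Conditions (2.9)-(2.11): for each pair of rows the three 2x2 minors are coprime."""
    for r, s in combinations(M, 2):
        g = 0
        for i, j in combinations(range(3), 2):
            g = gcd(g, minor(r, s, i, j))
        if g != 1:
            return False
    return True

def count_totally_primitive(P):
    """Number of pairs (a, b), 0 <= a, b < P, whose matrix (2.12) passes the test."""
    return sum(is_totally_primitive([(1, 0, 0), (0, 1, 0), (a, b, P)])
               for a in range(P) for b in range(P))

def phi(P):
    """Euler's totient function, computed by direct counting."""
    return sum(1 for k in range(1, P + 1) if gcd(k, P) == 1)

if __name__ == "__main__":
    for P in (7, 12, 30):
        print(P, count_totally_primitive(P), phi(P) ** 2)
```

For the matrix (2.12) the three gcd conditions reduce to $1$, $(b,P)$ and $(a,P)$ respectively, which explains the agreement of the two columns.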

From the viewpoint of the theory of Diophantine approximations, the local minima of a lattice with basis matrix (2.12) are the best approximations of the linear form $x_1\dfrac{a}{P}+x_{2}\dfrac{b}{P}+x_3$ . All that was said above about totally primitive three-dimensional lattices can be extended in obvious fashion to the case of arbitrary dimension $s\geqslant 3$ .

2.4. Multidimensional analogues of the Gauss measure

The Gauss measure is a special case of a more general construction. The set of bases in $s$ -dimensional lattices can be identified with the set of matrices $GL_s(\mathbb{R})$ . The definitions of local minima and minimal systems of vectors (see §3.1) are independent of the choice of scales on the coordinate axes, and therefore for studying the properties of minimal bases it is natural to consider the quotient space $\mathscr{X}_s= D_s(\mathbb{R}) \setminus GL_s(\mathbb{R})$ , where $D_s(\mathbb{R})$ is the group of diagonal invertible $s\times s$ matrices with real coefficients. The bi-invariant Haar measure on $GL_s(\mathbb{R})$

$\begin{equation*} d\lambda(g)=\frac{dg}{(\det g)^{s}}\,, \end{equation*}$

where $g\in GL_s(\mathbb{R})$ and $dg=dg_{11}\,dg_{12}\cdots dg_{ss}$ is the Lebesgue measure, induces on $\mathscr{X}_s$ the quotient measure $\mu$ , which is a right Haar measure and which remains invariant under the left action of $D_s(\mathbb{R})$ :

$\begin{equation*} d\mu(hG)=d\mu(hGg) \end{equation*}$

for any $h\in D_s(\mathbb{R})$ and $g\in G=GL_s(\mathbb{R})$ .

Setting $\overline g=(\overline g_{ij})\in \mathscr{X}_s$ , where $\overline g_{ij}=g_{ij}g_{ii}^{-1}$ for $1\leqslant i,j\leqslant s$ , we can define the measure $\mu$ in the chart $\overline g_{11}=\overline g_{22}=\cdots=\overline g_{ss}=1$ by

$\begin{equation} d\mu(\overline g)=\frac{d\overline g}{(\det\overline g)^s}\,, \quad\text{where}\quad d\overline g=d\overline g_{12}\cdots d\overline g_{s\,s-1}. \end{equation} \tag{ 2.13 }$

Then the right invariance of $\mu$ follows from the formula

$\begin{equation*} \frac{dg}{(\det g)^s}=\frac{dg_{11}}{g_{11}}\, \frac{dg_{22}}{g_{22}}\cdots \frac{dg_{ss}}{g_{ss}}\, \frac{d\overline g_{12}\cdots d\overline g_{s\,s-1}}{(\det\overline g)^s}, \end{equation*}$

in which $\dfrac{dg_{11}}{g_{11}}\,\dfrac{dg_{22}}{g_{22}}\cdots \dfrac{dg_{ss}}{g_{ss}}$ is the Haar measure on $D_s(\mathbb{R})$ .

For $s=2$ the measure $\mu$ , taken on the space of matrices of the form $\begin{pmatrix} 1 & -\alpha\\ \beta & 1\end{pmatrix}$ (normalized Voronoi matrices), coincides up to the normalizing factor $1/\log 2$ with the extended Gauss measure (2.4). For $s=3$ the measure $\mu$ arises in the study of the statistical properties of Klein polyhedra (see [86], [87]) and Voronoi–Minkowski continued fractions (see [27]). An essential difference between these two objects is the fact that Klein polyhedra are parametrized by the points of the whole space $\mathscr{X}_s$ , the measure of which is infinite, while to non-degenerate minimal systems of vectors (that is, systems whose matrices have non-zero determinant; see §3) in the space $\mathscr{X}_s$ there corresponds a domain of finite measure $\mu$ : if a matrix $g\in GL_s(\mathbb{R})$ with diagonal dominance (in each row the absolute values of the non-diagonal elements do not exceed the absolute value of the diagonal element) defines a non-degenerate minimal system of vectors of a lattice $\Lambda$ , then by Minkowski's convex body theorem, $g_{11}g_{22}\cdots g_{ss}\leqslant\det\Lambda$ and

$\begin{equation*} |\det \overline g|=\frac{|\det g|}{g_{11}g_{22}\cdots g_{ss}}\geqslant \frac{|\det g|}{\det\Lambda}\geqslant1. \end{equation*}$

§ 3. Continued fractions and lattices

3.1. Geometry of continued fractions

There are two geometric interpretations of classical continued fractions admitting a natural generalization to the multidimensional case. In the first, due to Klein (see [88], [89], and also an earlier remark of Smith [90], pp. 146–147), a continued fraction is identified with the convex hull (the Klein polygon) of the points of the integer lattice that lie in two adjacent angles. The second interpretation, proposed independently by Voronoi and Minkowski (see [17], [18] and [19], [91], as well as the reiteration of the original results in [20], [21], and [22]), is based on the use of local minima of lattices, minimal systems, and extremal parallelepipeds (see the definitions below). In planar lattices the vertices of Klein polygons (after a linear transformation taking the sides of the angles to coordinate axes) can be identified with Voronoi local minima. But the geometric constructions of Klein and Voronoi–Minkowski become different starting from dimension $3$ (see [92], [93]).

We recall the requisite definitions going back to Voronoi and Minkowski. A lattice $\Lambda\subset\mathbb{R}^s$ is said to be irreducible (or a lattice of general position) if the coordinate hyperplanes do not contain nodes of the lattice other than the origin; in the opposite case the lattice is said to be reducible. The set of full $s$ -dimensional lattices (that is, lattices of dimension coinciding with the dimension of the space) is denoted by $\mathscr{L}_s(\mathbb{R})$ , and the subset of it consisting of irreducible lattices by $\mathscr{L}^*_s(\mathbb{R})$ .

For a non-empty finite set $A\subset\mathbb{R}^s$ , we put

$\begin{equation*} \begin{gathered} |A|_i=\max\{|x_i|: x=(x_1,\dots,x_s)\in A\}\qquad (i=1,\dots,s), \\ \operatorname{Box}(A)=(-|A|_1,|A|_1)\times\cdots\times(-|A|_s,|A|_s), \\ \overline{\operatorname{Box}}(A)= [-|A|_1,|A|_1]\times\cdots\times[-|A|_s,|A|_s]. \end{gathered} \end{equation*}$

In other words, $\operatorname{Box}(A)$ is the smallest parallelepiped circumscribed around the set $A$ (we consider only parallelepipeds with centre at the origin and with faces parallel to the coordinate planes).

A system of nodes of order $r$ of a lattice $\Lambda$ (not necessarily a full lattice) is defined to be any finite $r$ -tuple $(v_1,\dots,v_r)$ of non-zero nodes of $\Lambda$ in which $v_i\ne\pm v_j$ ( $1\leqslant i<j\leqslant r$ ). With an arbitrary system $S=(v_1,\dots,v_r)$ we associate the matrix $M(v_1,\dots,v_r)$ by writing the coordinates of the vectors $v_1,\dots,v_r$ in the columns.

A node $\gamma$ of a lattice $\Lambda\in\mathscr{L}_s(\mathbb{R})$ is called a Voronoi relative (local) minimum of $\Lambda$ (henceforth, simply a minimum) if the parallelepiped $\overline{\operatorname{Box}}(\gamma)$ does not contain nodes of $\Lambda$ other than its own vertices and the origin (see [18]). The set of all local minima of $\Lambda$ is denoted by $\mathfrak{M}(\Lambda)$ . If $\Lambda$ has several minimal vectors $v_1,\dots,v_k$ such that $|v_i|=|v_j|$ ( $1\leqslant i<j\leqslant k$ ), then we agree to include in $\mathfrak{M}(\Lambda)$ only one of these vectors.

A system $S$ of vectors of the lattice $\Lambda$ is said to be minimal if the parallelepiped $\operatorname{Box}(S)$ does not contain nodes of $\Lambda$ other than the origin. In particular, for irreducible lattices the notion of a minimal system of order $1$ coincides with the notion of a local minimum. For reducible lattices the definition of a minimal system has to be made more precise (see [94]).

In the two-dimensional case we introduce on the set $\mathfrak{M}(\Lambda)$ of local minima the structure of a sequence

$\begin{equation} \mathfrak{M}(\Lambda)=(\dots,v_{-2},v_{-1},v_{0},v_{1},v_{2},\dots) \end{equation} \tag{ 3.1 }$

in which the vectors $v_n=((-1)^nx_n,y_n)$ ( $x_n,y_n>0$ ) are ordered by decrease of the absolute value of the first coordinate and increase of the second: $y_{n+1}>y_n$ , $x_{n+1}<x_n$ . Here every minimal pair of vectors has the form $(v_n,v_{n+1})$ (that is, consists of neighbouring local minima) and is a basis of the lattice $\Lambda$ (see [18]); such pairs are called Voronoi bases.

By considering the normalized matrices

$\begin{equation} \begin{pmatrix} 1 & \mp \alpha_n \\ \pm \beta_n & 1 \end{pmatrix}=\begin{pmatrix} x_{n-1}^{-1} & 0 \\ 0 & y_n^{-1} \end{pmatrix}\begin{pmatrix} x_{n-1} & \mp x_n \\ \pm y_{n-1} & y_n \end{pmatrix}, \end{equation} \tag{ 3.2 }$

we deduce that from the geometric viewpoint the extended Gauss map $\overline T(\alpha_n,\beta_n)=(\alpha_{n+1},\beta_{n+1})$ means transition from the Voronoi basis $(v_{n-1},v_{n})$ to the adjacent basis $(v_n,v_{n+1})$ . Therefore, we can say that the extended Gauss measure (2.4) describes the typical behaviour of normalized Voronoi bases.

With a rational number $r=a/P$ such that $0\leqslant a< P$ and $(a,P)=1$ it is natural to associate the lattice $\Lambda(r)$ with basis matrix $\begin{pmatrix} 1&0\\ a&P \end{pmatrix}$ . Obviously, the map $r\to\Lambda(r)$ establishes a one-to-one correspondence between fractions of the form $a/P$ with $0\leqslant a< P$ and $(a,P)=1$ , and the two-dimensional primitive lattices with determinant $P$ . For $a\leqslant P/2$ the set of all local minima of the lattice $\Lambda(a/P)$ coincides with the set of vertices of convex hulls of points of the lattice $\Lambda(r)$ that lie in coordinate quadrants I and II. For $a/P=[0;a_1,\dots,a_l]\leqslant 1/2$ the local minima have the form

$\begin{equation*} v_0=(-P,0),\qquad v_1=(a,1),\quad\dots,\quad v_j=((-1)^{j+1}x_j,y_j),\quad\dots,\quad v_{l+1}=(0, P), \end{equation*}$

where the sequences $\{x_j\}$ and $\{y_j\}$ are defined by

$\begin{equation*} \frac{x_{j}}{x_{j-1}}=[0;a_{j},\dots,a_l],\quad \frac{y_{j-1}}{y_{j}}=[0;a_{j-1},\dots,a_1]\qquad (1\leqslant j\leqslant l+1). \end{equation*}$

(For $l=-1$ the fraction $[0;a_1,\dots,a_l]$ is assumed to be equal to $1/0$ by definition.) Here the set of Voronoi bases coincides with the set of pairs $(v_j,v_{j+1})$ , where $0\leqslant j\leqslant l$ . For $a>P/2$ the coordinates of the local minima are determined in similar fashion from the continued fraction expansion of the number $(P-a)/P$ .
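For small $P$ the local minima of $\Lambda(a/P)$ can be found directly from the definition. The brute-force sketch below enumerates them for the lattice spanned by the columns $(1,a)$ and $(0,P)$ of the basis matrix, keeping one representative of each pair $\pm\gamma$. The function names are assumptions, ties between minima of equal size are not treated, and since the enumeration follows the basis matrix literally, the coordinates may appear in a different order or with different signs than in the displayed formulas.

```python
def lattice_nodes(a, P):
    """Nodes (x, y) of the lattice with basis columns (1, a), (0, P) in the square |x|, |y| <= P."""
    return [(x, y) for x in range(-P, P + 1) for y in range(-P, P + 1)
            if (y - a * x) % P == 0]

def is_local_minimum(node, nodes):
    """The closed box of `node` contains no nodes except the origin and the box vertices."""
    bx, by = abs(node[0]), abs(node[1])
    for u, v in nodes:
        inside = abs(u) <= bx and abs(v) <= by
        allowed = (u, v) == (0, 0) or (abs(u), abs(v)) == (bx, by)
        if inside and not allowed:
            return False
    return True

def local_minima(a, P):
    """Local minima up to sign, sorted by the second coordinate."""
    nodes = lattice_nodes(a, P)
    representatives = [(x, y) for x, y in nodes if y > 0 or (y == 0 and x > 0)]
    minima = [v for v in representatives if is_local_minimum(v, nodes)]
    return sorted(minima, key=lambda v: v[1])

if __name__ == "__main__":
    print(local_minima(5, 13))
```

For $a/P=5/13=[0;2,1,1,2]$ the enumeration returns six minima, in agreement with the count $l+2$ of the vectors $v_0,\dots,v_{l+1}$; the number of Voronoi bases is one less.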

3.2. Minkowski bases

Let $\Pi=[0,\xi]\times [0,\eta]\subset[0,1]^2$ , and let $N_\Pi(P)$ denote the number of Voronoi bases $(v_{n-1},v_n)$ (in all the primitive lattices with determinant $P$ ) for which the coefficients of the normalized matrix (3.2) satisfy the condition $(\alpha_n,\beta_n)\in\Pi$ . Then (2.8) can be rewritten in the form

$\begin{equation*} \frac{N_\Pi(P)}{\varphi(P)}=\frac{\overline\nu(\Pi)}{\zeta(2)} \log P+C(\Pi)+O(P^{-1/6+\varepsilon}), \end{equation*}$

where

$\begin{equation*} \overline\nu(\Pi)=\frac{1}{\log 2}\int_{\Pi} \frac{d\alpha\,d\beta}{(1+\alpha\beta)^2}=\log_2(1+\xi\eta). \end{equation*}$

Let $G_s$ denote the group generated by the following elementary transformations acting on the set of $s\times s$ matrices:

(i) permutations of columns and multiplication of columns by $-1$ (renumbering of the basis vectors and changing their orientation);

(ii) permutations of rows and multiplication of rows by $-1$ (renaming coordinate axes and changing their directions).

Two $s\times s$ matrices are regarded as equivalent if they are taken one to the other by the action of the group $G_s$ .

As noted above, the purpose of this paper is to develop analytic methods that enable us to prove a three-dimensional analogue of equation (2.8). The classification of minimal triples of vectors becomes somewhat more difficult in the three-dimensional case. A complete description of minimal triples in lattices of general position is given by the following result of Minkowski.

Theorem 3.1. (Minkowski) Let $S=(v_1,v_2,v_3)$ be a minimal system of a lattice $\Lambda\in\mathscr{L}_3^*$ . If the system $S$ is non-degenerate, then it is a basis of $\Lambda$ , and the matrix $M(v_1,v_2,v_3)$ is equivalent to one of the two canonical forms

$\begin{equation} \begin{gathered}\begin{pmatrix} x_1 & x_2 & -x_3 \\ -y_1 & y_2 & y_3 \\ z_1 & -z_2 & z_3 \end{pmatrix},\quad\text{where} \ x_2\geqslant x_3,\ \text{or}\ y_3\geqslant y_1,\ \text{or}\ z_1\geqslant z_2,\end{gathered} \end{equation} \tag{ 3.3 }$

$\begin{equation} \begin{gathered}\begin{pmatrix} x_1 & x_2 & x_3 \\ -y_1 & y_2 & y_3 \\ z_1 & -z_2 & z_3 \end{pmatrix},\quad\text{where}\ y_1+y_3\geqslant y_2\ \text{and}\ (x_2\geqslant x_3 \ \text{or} \ z_2\geqslant z_1).\end{gathered} \end{equation} \tag{ 3.4 }$

But if the system $S$ is degenerate, then $v_1\pm v_2\pm v_3=0$ for some combination of signs, and the matrix $M(v_1,v_2,v_3)$ can be reduced by the action of the group $G_3$ to the form

$\begin{equation} \begin{pmatrix} x_1 & -x_2 & -x_3 \\ -y_1 & y_2 & -y_3 \\ -z_1 & -z_2 & z_3 \end{pmatrix},\quad\text{where}\ x_2+x_3=x_1, \ y_1+y_3= y_2, \ z_1+z_2=z_3. \end{equation} \tag{ 3.5 }$

(In all three cases it is assumed that $x_i,y_i,z_i\geqslant 0$ and the basis matrix has diagonal dominance: $x_2,x_3\leqslant x_1$ , $y_1,y_3\leqslant y_2$ , and $z_1,z_2\leqslant z_3$ .)

The converse is also true: a system of three vectors $(v_1,v_2,v_3)$ with matrix equivalent to one of the matrices of the form (3.3) or (3.4) is a minimal system of the full lattice $\Lambda=\langle v_1,v_2,v_3\rangle$ ; a system of vectors $(v_1,v_2,v_3)$ with matrix of the form (3.5) is a minimal system of the rank- $2$ lattice $\Lambda=\langle v_1,v_2,v_3\rangle=\langle v_1,v_2\rangle$ .

Minkowski stated this theorem without proof (see [19], [91]). A detailed proof can be found in [22], papers 109–110 (see also [95]–[98]).

In accordance with Theorem 3.1, minimal systems with matrices equivalent to (3.3) or (3.4) are called Minkowski bases of type I or II, respectively.

For reducible lattices the classification of minimal systems becomes more difficult (see [94]). But it is Minkowski bases that are of main interest, since any local minimum $v$ can be supplemented to form a Minkowski basis by extending (in a lattice of general position that is close to the given lattice $\Lambda$ ) the parallelepiped $\operatorname{Box}(v)$ along the coordinate axes.

3.3. Three-dimensional continued fractions

For an arbitrary $T\subset \mathbb{R}^3$ let $T'=T$ if $0\notin T$ and $T'=T\setminus\{0\}$ if $0\in T$ , and let

$\begin{equation*} |T|=\{(|x|,|y|,|z|)\in\mathbb{R}^3\colon (x,y,z)\in T\}. \end{equation*}$

With each discrete set $T\subset \mathbb{R}^3$ ( $T\ne \{0\}$ ) we associate the orthogonal surface $\mathscr{P}(T)$ that is defined as the boundary of the set

$\begin{equation*} |T'|\oplus\mathbb{R}_{\geqslant 0}^3= \{t+r\colon t\in|T'|, \ r\in\mathbb{R}_{\geqslant 0}^3\}. \end{equation*}$

The Voronoi–Minkowski three-dimensional continued fraction associated with a lattice $\Lambda$ is defined as the orthogonal surface $\mathscr{P}(\Lambda)$ .

The fact that $S$ is discrete implies that the surface $\mathscr{P}(S)$ has only finitely many vertices inside any bounded set. The set of concave vertices of $\mathscr{P}(S)$ coincides with $\mathfrak{M}(S)$ — the set of local minima. Corresponding to each convex vertex of $\mathscr{P}(S)$ is an extremal parallelepiped — a parallelepiped of the form $\operatorname{Box}(v_1,v_2,v_3)$ , where $v_1,v_2,v_3$ are local minima that lie strictly inside three mutually perpendicular faces of $\operatorname{Box}(v_1,v_2,v_3)$ . In other words, the extremal parallelepiped is characterized by the fact that its dimensions cannot be increased in such a way that it remains free of points in the set $S$ .

Example 3.2. Let $S$ consist of non-zero nodes of the lattice $\Lambda=\langle e_1,e_2,e_3\rangle$ , where $e_1=(0,5,0)$ , $e_2=(0,0,5)$ , and $e_3=(1,1,2)$ . Then $\mathfrak{M}(S)=\{e_1,e_2,e_3,e_4,e_5\}$ , where $e_4=(2,2,-1)$ and $e_5=(5,0,0)$ . The surface $\mathscr{P}(S)$ is depicted in Fig. 1.
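The list of local minima in Example 3.2 can be checked by brute force. The sketch below rests on an assumption about the definition given earlier in the paper: a non-zero node $v$ is treated as a local minimum if no non-zero node $u$ satisfies $|u_i|\leqslant|v_i|$ for all $i$ together with $|u_1|+|u_2|+|u_3|<|v_1|+|v_2|+|v_3|$. The helper names are hypothetical. Since $(5,0,0)$, $(0,5,0)$, $(0,0,5)$ are nodes, it is enough to search the cube $[-5,5]^3$.

```python
from itertools import product


def in_lattice(p):
    """Membership in <(0,5,0), (0,0,5), (1,1,2)>: y = x and z = 2x modulo 5."""
    x, y, z = p
    return (y - x) % 5 == 0 and (z - 2 * x) % 5 == 0


def dominates(u, v):
    """True if u lies in Box(v) and has strictly smaller sum of absolute coordinates."""
    inside = all(abs(a) <= abs(b) for a, b in zip(u, v))
    return inside and sum(map(abs, u)) < sum(map(abs, v))


R = 5
nodes = [p for p in product(range(-R, R + 1), repeat=3) if any(p) and in_lattice(p)]
minima = {v for v in nodes if not any(dominates(u, v) for u in nodes)}

# Keep one representative of each pair {v, -v}.
representatives = {max(v, tuple(-c for c in v)) for v in minima}
print(sorted(representatives))
```

Under the assumed definition the representatives are exactly $e_1,\dots,e_5$.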

**Figure 1.**

For describing the polyhedron $\mathscr{P}(S)$ we define the Minkowski–Voronoi complex $\operatorname{MV}(S)$ as the two-dimensional complex whose vertices are the extremal parallelepipeds, whose edges are the pairs of the form $(\operatorname{Box}(\gamma_1,\gamma_2,\gamma_3), \operatorname{Box}(\gamma_2,\gamma_3,\gamma_4))$ , and whose faces are the local minima $\gamma_0$ surrounded by chains of edges of the form

$\begin{equation*} \bigl(\operatorname{Box}(\gamma_0,\gamma_1,\gamma_2), \operatorname{Box}(\gamma_0,\gamma_2,\gamma_3)\bigr),\dots, \bigl(\operatorname{Box}(\gamma_0,\gamma_n,\gamma_1), \operatorname{Box}(\gamma_0,\gamma_1,\gamma_2)\bigr) \end{equation*}$

(see Fig. 2).

**Figure 2.** Example of the surface $\mathscr{P}(S)$ and the corresponding complex $\operatorname{MV}(S)$

If $S$ is a set of general position, then $\mathscr{P}(S)$ has a more regular structure. In this case it is natural to define two mutually dual planar graphs — the Voronoi graph $\operatorname{Vor}(S)$ and the Minkowski graph $\operatorname{Min}(S)$ . The vertices, edges, and faces of the Voronoi (Minkowski) graph are assumed to be, respectively, the vertices (faces), edges, and faces (vertices) of the complex $\operatorname{MV}(S)$ .

**Figure 3.** Example of a Voronoi graph

The graphs $\operatorname{Vor}(S)$ and $\operatorname{Min}(S)$ can be depicted on the surface $\mathscr{P}(S)$ by using the following rules: the vertices of $\operatorname{Vor}(S)$ are peaks (convex vertices) of $\mathscr{P}(S)$ , the edges are pairs of convex edges of $\mathscr{P}(S)$ (all the vertices of $\operatorname{Vor}(S)$ have degree $3$ ), and the faces are domains that are formed after erasing local minima and the edges going out from them (see Fig. 3); the vertices of $\operatorname{Min}(S)$ are the local minima (concave vertices of $\mathscr{P}(S)$ ), each face is a triangle whose edges connect three local minima on the surface of some extremal parallelepiped ( $\operatorname{Min}(S)$ is a triangulation of the plane; the concave edges of $\mathscr{P}(S)$ can be regarded as part of the edges of $\operatorname{Min}(S)$ ; see Fig. 4).

**Figure 4.** Example of a Minkowski graph

**Figure 5.** The Voronoi graph and its canonical diagram

**Figure 6.** Geometric meaning of the directions on the canonical diagram

**Figure 7.**

The edges of each of the graphs $\operatorname{Min}(S)$ and $\operatorname{Vor}(S)$ are in a one-to-one correspondence with the saddle vertices of $\mathscr{P}(S)$ .

It is convenient to depict the Voronoi graph on the plane $x+y+z=0$ in the form of the canonical diagram — a graph whose edges are segments of the three directions (see Fig. 5).

The canonical diagram preserves information about the mutual disposition of extremal parallelepipeds. Let $\gamma_i=(\pm x_i,\pm y_i,\pm z_i)$ ( $i=1,2,3$ ) and suppose that the matrix $M(\gamma_1,\gamma_2,\gamma_3)$ has diagonal dominance. Suppose that we pass from the extremal parallelepiped $\operatorname{Box}(\gamma_1,\gamma_2,\gamma_3)$ to the adjacent parallelepiped by moving along the canonical diagram in the `Eastern' direction (the direction $1$ in Fig. 6). Then the movement takes place along the edge with label $(x_2>x_3)$ , and the adjacent sector (see Fig. 6 on the right) is denoted by $(\gamma_2,\gamma_3)$ . This means that such a passage is possible only if $x_2>x_3$ , and the adjacent parallelepiped has the form $\operatorname{Box}(\gamma_1',\gamma_2,\gamma_3)$ . In particular, this implies that there exist $8$ types of local structure of vertices of the canonical diagram (of the two radii that cut out each of the three grey sectors, exactly one is chosen) (see Fig. 7).

The adjacent sectors in Fig. 6 on the left have labels $x^\downarrow$ and $y^\uparrow$ ; therefore, the linear dimensions of the parallelepiped $\operatorname{Box}(\gamma_1',\gamma_2,\gamma_3)$ compared with $\operatorname{Box}(\gamma_1,\gamma_2,\gamma_3)$ are smaller in the first coordinate, larger in the second, and the same in the third.

By choosing in a special way the orientation and colouring of the edges, one can introduce the structure of Schnyder trees on the Voronoi graph (see [99]). This makes it possible to depict any finite subgraph of $\operatorname{Vor}(S)$ in the form of a canonical diagram with preservation of the mutual disposition of the vertices (for any two convex vertices of the surface $\mathscr{P}(S)$ , their coordinates in space are connected by the same inequalities as the coordinates of the corresponding vertices of the canonical diagram on the plane $x+y+z=0$ ). More detailed information about Voronoi and Minkowski graphs can be found in [100].

An interesting problem is to find necessary and sufficient conditions for a graph satisfying the obvious properties of a canonical diagram to really be the canonical diagram of the Voronoi graph of some lattice. It is also unknown whether an infinite Voronoi graph can always be depicted in the form of a canonical diagram with preservation of the mutual disposition of the vertices in such a way that no limit points appear. One can conjecture that this is always possible at least for the periodic Voronoi graphs corresponding to totally real cubic fields.

For a given lattice $\Lambda$ , the process of constructing the three-dimensional continued fraction of the surface $\mathscr{P}(\Lambda)$ consists in successively constructing elements of $\mathscr{P}(\Lambda)$ — the minimal triples of vectors that form the faces of $\operatorname{Min}(\Lambda)$ . Some triples can be degenerate (see Theorem 3.1), but such a situation cannot arise too often: faces of $\operatorname{Min}(\Lambda)$ corresponding to degenerate triples cannot be adjacent (see [22], paper 11, Theorem 3.3). To find an initial Minkowski basis, we use methods of Voronoi (see [20], §59). At each of the next steps, we calculate the adjacent triples for a given Minkowski basis. If a triple turns out to be degenerate, then the adjacent triples are already bases (all the transition formulae can be written out explicitly; see [19], and also [22], pp. 402–405). In view of Theorem 3.1, three-dimensional continued fractions can be interpreted as a dynamical system with two-dimensional time, an invariant measure (2.13), and a phase space consisting of the matrices of $D_3(\mathbb{R})\setminus GL_3(\mathbb{R})$ equivalent to the Minkowski matrices (3.3) or (3.4). The algorithm for successively finding the Minkowski bases can also be applied for integer lattices, by passing to infinitesimally close lattices of general position when coordinates coincide.

The construction of three-dimensional continued fractions described above was initially proposed (in different forms) by Voronoi and Minkowski as a tool for finding fundamental units in totally real cubic fields. Corresponding to the rings of integers in such fields are lattices $\Lambda$ for which $\mathscr{P}(\Lambda)$ has a doubly periodic structure; thus, the problem of finding fundamental units reduces to finding minimal periods of $\mathscr{P}(\Lambda)$ (see [20]). For example, let $\theta=2\cos(2\pi/7)$ , $\theta'=2\cos(4\pi/7)$ , and $\theta''=2\cos(6\pi/7)$ be the roots of the cubic equation $\theta^3+\theta^2-2\theta-1=0$ . We can choose the triple $\{1,\theta,\theta^2\}$ as a basis of the ring of integers of the field $\mathbb{Q}(\theta)$ . Then to the fundamental units $\theta$ , $-1-\theta$ there correspond two independent periods of $\mathscr{P}(\Lambda)$ , where $\Lambda$ is the algebraic lattice generated by the vectors $e_j=(\theta^j,(\theta')^j,(\theta'')^j)$ ( $j=0,1,2$ ) (see Fig. 8; the bold dots in the picture denote degenerate minimal triples of vectors).

**Figure 8.** The Voronoi graph for the field $\mathbb{Q}(\theta)$ , $\theta=2\cos(2\pi/7)$

Figure 9 depicts the canonical diagram constructed from the triple of numbers $\theta=2\cos(2\pi/9)$ , $\theta'=2\cos(4\pi/9)$ , $\theta''=2\cos(8\pi/9)$ , which are the roots of the cubic equation $\theta^3-3\theta+1=0$ . A basis of the ring of integers of the field is given by $\{1,\theta,\theta^2\}$ , and $\theta$ , $1-\theta$ are fundamental units.
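The algebraic data in the two examples are easy to check numerically. The sketch below uses the standard minimal polynomials $x^3+x^2-2x-1$ of $2\cos(2\pi/7)$ and $x^3-3x+1$ of $2\cos(2\pi/9)$, and verifies in floating point that the listed numbers are roots and that $\theta$, $-1-\theta$ (respectively $\theta$, $1-\theta$) have norm $\pm1$. It confirms only that they are units, not that they are fundamental; the helper names are hypothetical.

```python
import math


def poly7(x):
    return x**3 + x**2 - 2 * x - 1


def poly9(x):
    return x**3 - 3 * x + 1


# Conjugates of 2cos(2*pi/7) and of 2cos(2*pi/9).
roots7 = [2 * math.cos(2 * math.pi * k / 7) for k in (1, 2, 3)]
roots9 = [2 * math.cos(2 * math.pi * k / 9) for k in (1, 2, 4)]


def norm(values):
    """Product over the three conjugates (the field norm, numerically)."""
    return math.prod(values)


print([round(poly7(t), 12) for t in roots7])
print([round(poly9(t), 12) for t in roots9])
print(norm(roots7), norm([-1 - t for t in roots7]))
print(norm(roots9), norm([1 - t for t in roots9]))
```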

The two examples above are interesting in that the numbers $2\cos(2\pi/7)$ and $2\cos(2\pi/9)$ are the beginning of a three-dimensional analogue of the Markov spectrum (see [101], [102], and also the isolation theorems in [103], [104]); in the classical theory of continued fractions the corresponding numbers are

$\begin{equation*} \frac{\sqrt{5}-1}{2}=2\cos\frac{2\pi}{5}\quad\text{and}\quad \sqrt{2}=2\cos\frac{2\pi}{8}\,. \end{equation*}$

For further development of the Voronoi and Minkowski algorithms see [105]–[112].

We also mention that for Voronoi–Minkowski three-dimensional continued fractions it is possible to prove an analogue of Vahlen's theorem (see [94], [95], [113]–[115]). Concerning applications of the theory of local minima see [29], [30], [116]–[122].

Information about other multidimensional generalizations of continued fractions can be found in [123].

3.4. Statement of the main result

A three-dimensional analogue of the problem of Gauss statistics for finite continued fractions is the question of the statistical properties of the Minkowski bases described in Theorem 3.1. The question of the behaviour on average of elements of Klein polyhedra reduces to the calculation of Minkowski matrices with certain additional restrictions or to the calculation of matrices with similar properties (see [27]–[30]). We confine ourselves to the consideration of Minkowski bases on totally primitive lattices (see the definition in §2.3), since this leads to a simpler and more natural answer.

For a matrix $X$ of the form (1.8) let $a$ , $b$ , and $c$ denote the maximal absolute values of elements in the rows of $X$ :

$\begin{equation} a=\max\{|a_1|,|a_2|,|a_3|\},\quad b=\max\{|b_1|,|b_2|,|b_3|\},\quad c=\max\{|c_1|,|c_2|,|c_3|\}. \end{equation} \tag{ 3.6 }$

Let $X'$ , $X''$ , $X'''$ denote the following matrices:

$\begin{equation} \begin{gathered} X'=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0&0&c^{-1} \end{pmatrix}X,\qquad X''=\begin{pmatrix} 1 & 0 & 0 \\ 0 & b^{-1} & 0 \\ 0&0&c^{-1} \end{pmatrix}X, \\ X'''=\begin{pmatrix} a^{-1} & 0 & 0 \\ 0 & b^{-1} & 0 \\ 0 & 0 & c^{-1} \end{pmatrix}X. \end{gathered} \end{equation} \tag{ 3.7 }$

In particular, if $X$ is a basis matrix of the form (3.3) or (3.4), then the matrices $X'$ , $X''$ , $X'''$ have the form

$\begin{equation*} \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ \gamma_1&\gamma_2&1 \end{pmatrix},\quad \begin{pmatrix} a_1 & a_2 & a_3 \\ \beta_1&1& \beta_3 \\ \gamma_1&\gamma_2&1 \end{pmatrix},\quad \begin{pmatrix} 1 & \alpha_2 & \alpha_3 \\ \beta_1 & 1 & \beta_3 \\ \gamma_1&\gamma_2&1 \end{pmatrix}, \end{equation*}$

respectively, where $\alpha_i=a_i/a$ , $\beta_i=b_i/b$ , and $\gamma_i=c_i/c$ .

For an arbitrary matrix set $M$ let $M(P)$ , $M'$ , $M''$ , and $M'''$ denote the following sets:

$\begin{equation*} \begin{alignedat}{2} M(P)&=\{X\in M:\det X=P\},&\qquad M''&=\{X'':X\in M\}, \\ M'&=\{X':X\in M\},&\qquad M'''&=\{X''':X\in M\}. \end{alignedat} \end{equation*}$

Let $\mathscr{M}$ be the set of Minkowski basis matrices of fixed type (I or II) with fixed signature and integer coefficients. It follows from Theorem 3.1 that in the calculation of matrices in the set $\mathscr{M}$ it is sufficient to confine oneself to the cases when

$\begin{equation} X=\begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1&c_2&c_3 \end{pmatrix}=\begin{pmatrix} x_1 & x_2 & \pm x_3 \\ -y_1 & y_2 & y_3 \\ z_1&-z_2&z_3 \end{pmatrix}, \end{equation} \tag{ 3.8 }$

where $x_i=|a_i|$ , $y_i=|b_i|$ , $z_i=|c_i|$ satisfy the inequalities of Theorem 3.1.

As noted above, the main result of the paper is a three-dimensional generalization of equation (2.8), which is an asymptotic formula for the mean value of the Gauss–Kuz'min statistics of finite continued fractions. We define three-dimensional Gauss–Kuz'min statistics as follows. Let us fix a tuple of real numbers $(\xi_2,\xi_3,\eta_1,\eta_3, \zeta_1,\zeta_2)\in(0,1]^6$ and consider the parallelepiped

$\begin{equation} \Pi=[0,\xi_2]\times I(\xi_3)\times [-\eta_1,0]\times [0,\eta_3]\times [0,\zeta_1]\times [-\zeta_2,0], \end{equation} \tag{ 3.9 }$

where $I(\xi_3)=[-\xi_3,0]$ for matrices of type I, and $I(\xi_3)=[0,\xi_3]$ for matrices of type II. Then the three-dimensional Gauss–Kuz'min statistics (corresponding to the matrices in $\mathscr{M}$ of given type and signature) for a lattice $\Lambda\subset\mathbb{Z}^3$ are defined to be sums of the form

$\begin{equation*} \sum_{X\in\mathscr{M}}[(\boldsymbol{\alpha},\boldsymbol{\beta}, \boldsymbol{\gamma})\in\Pi,\ X \ \text{is a basis matrix of the lattice $\Lambda$}], \end{equation*}$

where $\boldsymbol{\alpha}=(\alpha_2,\alpha_3)$ , $\boldsymbol{\beta}=(\beta_1,\beta_3)$ , $\boldsymbol{\gamma}=(\gamma_1,\gamma_2)$ , and $\alpha_i=\alpha_i(X)$ , $\beta_i=\beta_i(X)$ , $\gamma_i=\gamma_i(X)$ are found from (3.7) and (3.8). Under this approach, a three-dimensional analogue of the sum on the left-hand side of equation (2.8) is the quantity

$\begin{equation} \mathscr{N}_\Pi(P)=\sideset{}{^\#}\sum_{X\in\mathscr{M}(P)} [(\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{\gamma})\in\Pi]. \end{equation} \tag{ 3.10 }$

Henceforth the symbol $\#$ means that the sum is taken over totally primitive matrices $X$ , that is, over matrices satisfying the conditions (2.9)–(2.11).
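To make the normalization (3.7) and the box condition (3.9) concrete, here is a minimal sketch in exact rational arithmetic. The matrix `X` is a hypothetical example chosen only to have the sign pattern and diagonal dominance of (3.3); the code does not test whether `X` belongs to $\mathscr{M}$ or is totally primitive, and the function names are not from the paper.

```python
from fractions import Fraction


def normalized(X):
    """X''' from (3.7): divide each row by its largest absolute entry."""
    return [[Fraction(e, max(abs(t) for t in row)) for e in row] for row in X]


def in_box_type_one(X, xi2, xi3, eta1, eta3, zeta1, zeta2):
    """Check (alpha, beta, gamma) in Pi from (3.9) with I(xi3) = [-xi3, 0]."""
    N = normalized(X)
    alpha2, alpha3 = N[0][1], N[0][2]
    beta1, beta3 = N[1][0], N[1][2]
    gamma1, gamma2 = N[2][0], N[2][1]
    return (0 <= alpha2 <= xi2 and -xi3 <= alpha3 <= 0
            and -eta1 <= beta1 <= 0 and 0 <= beta3 <= eta3
            and 0 <= gamma1 <= zeta1 and -zeta2 <= gamma2 <= 0)


# Hypothetical integer matrix with the signs of (3.3).
X = [[5, 2, -1],
     [-1, 4, 2],
     [2, -1, 6]]

print(normalized(X))
print(in_box_type_one(X, 1, 1, 1, 1, 1, 1))
```

For this `X` the normalized entries are $\boldsymbol{\alpha}=(2/5,-1/5)$, $\boldsymbol{\beta}=(-1/4,1/2)$, $\boldsymbol{\gamma}=(1/3,-1/6)$, so the matrix is counted for the full box and dropped as soon as, say, $\xi_2<2/5$.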

Theorem 3.3. For any positive integer $P$ and any real $\varepsilon>0$

$\begin{equation} \frac{\mathscr{N}_\Pi(P)}{\varphi^2(P)}= \mathscr{Q}_2(\log P)+O(P^{-1/34+\varepsilon}), \end{equation} \tag{ 3.11 }$

where $\mathscr{Q}_2(x)$ is a polynomial of second degree with leading coefficient

$\begin{equation*} \frac{\mu(\mathscr{M}'''\cap \Pi)}{2\zeta(2)\zeta(3)}= \frac{1}{2\zeta(2)\zeta(3)}\int_{\Pi} \frac{[X'''\in \mathscr{M}''']}{(\det X''')^3}\, d\boldsymbol{\alpha}\,d\boldsymbol{\beta}\,d\boldsymbol{\gamma}. \end{equation*}$

A detailed scheme of the proof of Theorem 3.3 is given in §5.3.

§ 4. Two-dimensional case as a model problem

4.1. Statement of the problem

We write Voronoi matrices in the form

$\begin{equation*} A=\begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix}=\begin{pmatrix} x_1 & -x_2 \\ y_1 & y_2 \end{pmatrix},\quad\text{where}\quad x_1,x_2,y_1,y_2\geqslant 0. \end{equation*}$

Let $\mathscr{V}$ denote the set of all primitive Voronoi matrices:

$\begin{equation*} \begin{gathered} \mathscr{V}=\biggl\{A=\begin{pmatrix} x_1 & -x_2 \\ y_1 & y_2 \end{pmatrix}\!\colon (x_1,x_2)=(y_1,y_2)=1, \ 0\leqslant x_2\leqslant x_1, \ 0\leqslant y_1\leqslant y_2\biggr\}. \end{gathered} \end{equation*}$

Let $a$ and $b$ (within §4) denote the maximal absolute values of the elements in the rows of the matrix $A$ :

$\begin{equation*} a=\max\{|a_1|,|a_2|\}=x_1,\qquad b=\max\{|b_1|,|b_2|\}=y_2. \end{equation*}$

Here $ab\leqslant P\leqslant 2ab$ . For a matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2\end{pmatrix}$ , let $A'$ and $A''$ denote the matrices

$\begin{equation} A'=\begin{pmatrix} 1 & 0 \\ 0 & b^{-1} \end{pmatrix}A=\begin{pmatrix} x_1 & -x_2 \\ \beta & 1 \end{pmatrix},\qquad A''=\begin{pmatrix} a^{-1} & 0 \\ 0 & b^{-1} \end{pmatrix}A=\begin{pmatrix} 1 & -\alpha \\ \beta & 1 \end{pmatrix}. \end{equation} \tag{ 4.1 }$

The corresponding sets are denoted by $\mathscr{V}'$ and $\mathscr{V}''$ :

$\begin{equation*} \begin{aligned} \mathscr{V}'&=\biggl\{A'=\begin{pmatrix} x_1 & -x_2\\ \beta & 1 \end{pmatrix}\!\colon (x_1,x_2)=1, \ 0\leqslant x_2\leqslant x_1, \ \beta\in[0,1]\biggr\}, \\ \mathscr{V}''&=\biggl\{A''=\begin{pmatrix} 1 & -\alpha\\ \beta & 1 \end{pmatrix}\!\colon \alpha,\beta\in[0,1]\biggr\}. \end{aligned} \end{equation*}$

Let us fix a pair of real numbers $\xi$ , $\eta\in[0,1]$ and define the rectangle $\Pi=[0,\xi]\times [0,\eta]$ . We consider the problem of calculating the quantity

$\begin{equation*} N(P)=N_\Pi(P)=\biggl|\biggl\{A\in\mathscr{V}(P)\colon \biggl(\frac{x_2}{x_1}\,,\frac{y_1}{y_2}\biggr)\in\Pi\biggr\}\biggr| \end{equation*}$

equal to the number of primitive Voronoi matrices with determinant $P$ for which the coefficients of the normalized matrix belong to $\Pi$ . A solution to this problem is given by the following theorem.

Theorem 4.1. Let $P$ be a positive integer and $\varepsilon>0$ a real number. Then

$\begin{equation} \frac{N(P)}{\varphi(P)}=c_1(\Pi)\log P+c_0(\Pi)+O(P^{-1/8+\varepsilon}), \end{equation} \tag{ 4.2 }$

where

$\begin{equation*} c_1(\Pi)=\frac{\log(1+\xi\eta)}{\zeta(2)}= \frac{\overline \nu(\mathscr{V}''\cap\Pi)}{\zeta(2)}= \frac{1}{\zeta(2)}\int_{\Pi} \frac{[A''\in\mathscr{V}'']\,d\alpha\,d\beta}{(\det A'')^2}\,. \end{equation*}$
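The closed form of the integral is a one-line computation, since $\det A''=1+\alpha\beta$ and $\int_0^\eta(1+\alpha\beta)^{-2}\,d\beta=\eta/(1+\alpha\eta)$. As a sanity check, the following sketch compares a midpoint-rule approximation of the double integral over $\Pi=[0,\xi]\times[0,\eta]$ with $\log(1+\xi\eta)$; the function name and the grid size are arbitrary choices.

```python
import math


def integral_over_box(xi, eta, n=200):
    """Midpoint rule for the integral of (1 + alpha*beta)^(-2) over [0,xi] x [0,eta]."""
    h1, h2 = xi / n, eta / n
    total = 0.0
    for i in range(n):
        alpha = (i + 0.5) * h1
        for j in range(n):
            beta = (j + 0.5) * h2
            total += 1.0 / (1.0 + alpha * beta) ** 2
    return total * h1 * h2


for xi, eta in [(1.0, 1.0), (0.5, 0.25)]:
    print(xi, eta, integral_over_box(xi, eta), math.log(1 + xi * eta))
```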

The remainder term in Theorem 4.1 is worse than the remainder term in Porter's result (1.11). This is due to the fact that the proof of Theorem 4.1 is simpler: instead of estimates of trigonometric sums by van der Corput's method, this proof uses the idea of approximating the boundaries of domains by step-functions. The proof of Theorem 3.3 (a three-dimensional analogue of Theorem 4.1) is based on the same approach. Below, all the main steps are briefly described in order to sketch the scheme of proof of the main result.
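For small $P$ the quantity $N_\Pi(P)$ can be computed directly from the definition of $\mathscr{V}$ in §4.1, which is useful for experimenting with Theorem 4.1. The sketch below is a naive enumeration (the function names are hypothetical) and is not meant to be efficient.

```python
from fractions import Fraction
from math import gcd


def voronoi_matrices(P):
    """All (x1, x2, y1, y2) with x1*y2 + x2*y1 = P satisfying the conditions of V."""
    found = []
    for x1 in range(1, P + 1):
        for y2 in range(1, P // x1 + 1):
            rest = P - x1 * y2          # must equal x2 * y1
            for x2 in range(0, x1 + 1):
                if gcd(x1, x2) != 1:
                    continue
                for y1 in range(0, y2 + 1):
                    if x2 * y1 == rest and gcd(y1, y2) == 1:
                        found.append((x1, x2, y1, y2))
    return found


def count_in_box(P, xi, eta):
    """N_Pi(P) for Pi = [0, xi] x [0, eta], compared exactly via fractions."""
    return sum(1 for x1, x2, y1, y2 in voronoi_matrices(P)
               if Fraction(x2, x1) <= xi and Fraction(y1, y2) <= eta)


print(voronoi_matrices(2))
print(count_in_box(2, 1, 1), count_in_box(2, Fraction(1, 2), Fraction(1, 2)))
```

Every matrix produced this way satisfies $ab\leqslant P\leqslant 2ab$ with $a=x_1$ and $b=y_2$, in agreement with the remark after the definition of $a$ and $b$.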

4.2. Division into cases

We divide the set of all Voronoi matrices into two parts:

$\begin{equation*} \mathscr{V}=\mathscr{V}_1\sqcup\mathscr{V}_2,\qquad \mathscr{V}_1=\{A\in\mathscr{V}\colon x_1\leqslant y_2\},\quad \mathscr{V}_2=\{A\in\mathscr{V}\colon x_1>y_2\}. \end{equation*}$

Correspondingly, the quantity $N(P)$ can be represented in the form $N(P)=N_1(P)+N_2(P)$ , where the definition of $N_\ell(P)$ ( $\ell=1,2$ ) is obtained from the definition of $N(P)$ by imposing the additional condition $A\in\mathscr{V}_\ell$ . To prove Theorem 4.1 it suffices to verify the asymptotic formula

$\begin{equation} \frac{N_\ell(P)}{\varphi(P)}=\frac{c_1(\Pi)}{2}\log P+c_0^{(\ell)}(\Pi)+ O(P^{-1/8+\varepsilon})\qquad (\ell=1,2). \end{equation} \tag{ 4.3 }$

The proof of (4.3) for $\ell=1$ will imply that if the non-strict inequality $x_1\leqslant y_2$ in the definition of $N_1(P)$ is replaced by the strict inequality $x_1<y_2$ , then only the form of the constant $c_0^{(1)}(\Pi)$ changes in (4.3). The map

$\begin{equation*} \begin{pmatrix} x_1 & -x_2 \\ y_1 & y_2 \end{pmatrix}\to \begin{pmatrix} y_2 & -y_1 \\ x_2 & x_1 \end{pmatrix} \end{equation*}$

establishes a one-to-one correspondence between the matrices in the set $\mathscr{V}_1$ for which $x_1<y_2$ and the matrices in $\mathscr{V}_2$ . Therefore, to prove (4.3) for $\ell=2$ (and thus Theorem 4.1) it suffices to verify this equation for $\ell=1$ . In what follows we assume that $\ell=1$ .

Let $\mathscr{V}_\ell(a,P)$ denote the set of matrices $A\in\mathscr{V}_\ell(P)$ for which $x_1=a$ . The sets $\mathscr{V}'_\ell(a,P)$ and $\mathscr{V}''_\ell(a,P)$ are defined by analogy with $\mathscr{V}'$ and $\mathscr{V}''$ .

To verify (4.3) we first prove an asymptotic formula for $N_\ell(a,P)=|\mathscr{V}_\ell(a,P)|$ . We do this in two different ways, first by elementary considerations, and second by using estimates of Kloosterman sums.

4.3. Linear parametrization of solutions

If we fix numbers $a_1$ and $a_2$ with $(a_1,a_2)=1$ , then we can find integers $\widetilde{x}_1$ and $\widetilde{x}_2$ such that

$\begin{equation} \begin{vmatrix} a_1&a_2 \\ \widetilde{x}_1&\widetilde{x}_2 \end{vmatrix}=1,\qquad \begin{vmatrix} a_1&a_2 \\ P\widetilde{x}_1&P\widetilde{x}_2 \end{vmatrix}=P. \end{equation} \tag{ 4.4 }$

Thus, all the solutions of the equation

$\begin{equation*} \begin{gathered} \begin{vmatrix} a_1&a_2 \\ b_1&b_2 \end{vmatrix}=P \end{gathered} \end{equation*}$

with respect to the unknowns $b_1$ , $b_2$ admit the linear parametrization

$\begin{equation} \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}=\begin{pmatrix} \widetilde{b}_1 \\ \widetilde{b}_2 \end{pmatrix}+u\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}, \end{equation} \tag{ 4.5 }$

where $u\in\mathbb{Z}$ and $\widetilde{b}_{i}=P\widetilde{x}_{i}$ . It follows from the equalities

$\begin{equation*} (b_1,b_2)=(b_1,b_2,P)=(ua_1,ua_2,P)=(u,P) \end{equation*}$

that a solution obtained by the formula (4.5) defines a primitive matrix $A$ if and only if $(u,P)=1$ .
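The parametrization (4.5) and the primitivity criterion $(u,P)=1$ can be illustrated as follows. This is a sketch with hypothetical helper names: the auxiliary pair $(\widetilde{x}_1,\widetilde{x}_2)$ from (4.4) is obtained from the extended Euclidean algorithm, which is one possible choice among many.

```python
from math import gcd


def ext_gcd(a, b):
    """Return (g, s, t) with a*s + b*t = g = gcd(a, b) >= 0."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t


def second_rows(a1, a2, P, us):
    """Triples (u, b1, b2) given by (4.5) for the values u in us."""
    g, s, t = ext_gcd(a1, a2)
    assert g == 1, "a1 and a2 must be coprime"
    xt1, xt2 = -t, s                 # a1*xt2 - a2*xt1 = 1, as in (4.4)
    return [(u, P * xt1 + u * a1, P * xt2 + u * a2) for u in us]


for u, b1, b2 in second_rows(5, -2, 12, range(-3, 4)):
    print(u, (b1, b2), 5 * b2 - (-2) * b1, gcd(b1, b2), gcd(u, 12))
```

In each printed row the determinant equals $P=12$ and the two gcd columns agree.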

4.4. First variant of estimation of the remainder term

We obtain an asymptotic formula for $N_\ell(a,P)$ based on elementary considerations. It follows from the equality $y_2=(P-x_2y_1)/x_1$ that the conditions $y_1\leqslant y_2$ and $x_1\leqslant y_2$ characterizing the set $\mathscr{V}_\ell(a,P)$ (recall that we consider only the case $\ell=1$ ) can be written in the form $y_1\leqslant f(x_2)$ , where

$\begin{equation} f(t)=\min\{f_1(t),f_2(t)\},\qquad f_1(t)=\frac{P}{a+t}\,,\quad f_2(t)=\frac{P-a^2}{t}\,. \end{equation} \tag{ 4.6 }$

Furthermore,

$\begin{equation} f(t)\leqslant f_1(t)\ll\frac{P}{a}\,,\quad |f_1'(t)|=\frac{P}{(a+t)^2}\ll\frac{P}{a^2}, \end{equation} \tag{ 4.7 }$

and (under the condition that $f(t)=f_2(t)$ )

$\begin{equation} |f_2'(t)|=\frac{|f_2(t)|}{t}\ll\frac{P}{at}\,. \end{equation} \tag{ 4.8 }$

Using the linear parametrization (4.5), we find that

$\begin{equation} \sideset{}{^\#}\sum_{0\leqslant y_1\leqslant f(x_2)}\delta_a(P-x_2y_1)= \frac{\varphi(P)}{aP}\int_{0}^{f(x_2)}\,dy_1+O(P^\varepsilon). \end{equation} \tag{ 4.9 }$

Passing to the variable $\beta=y_1/y_2$ and using the equalities

$\begin{equation*} y_1=\frac{\beta P}{\det A'}\,,\qquad dy_1=\frac{aP\,d\beta}{(\det A')^2}\,, \end{equation*}$

we rewrite (4.9) in the form

$\begin{equation*} \sum_{y_1}[A\in\mathscr{V}_\ell(a,P)]=\varphi(P)\int_{\mathbb{R}} \frac{[A'\in\mathscr{V}'_\ell(a,P)]\,d\beta}{(\det A')^2}+O(P^\varepsilon). \end{equation*}$

Summing the last equation over $x_2$ and passing to the variable $\alpha=x_2/x_1$ , we find that

$\begin{equation} \frac{N_\ell(a,P)}{\varphi(P)}=\frac{\varphi(a)}{a^2}\int_{\mathbb{R}^2} \frac{[A''\in\mathscr{V}''_\ell(a,P)]\,d\alpha\,d\beta}{(\det A'')^2}+ \rho_0(a)+O(aP^{-1+\varepsilon}), \end{equation} \tag{ 4.10 }$

where $\rho_0(a)\ll a^{-2+\varepsilon}$ .

Remark 4.2. Both sides of (4.9) are estimated as $O(Pa^{-2})$ . This enables us to obtain, in particular, the following asymptotic formula with a trivial estimate of the remainder:

$\begin{equation} \begin{aligned} &\sum_{Y_1\leqslant x_2<Y_1+Z_1}\, \sideset{\,\,\,}{^\#}\sum_{0\leqslant y_1\leqslant f(x_2)}\delta_a(P-x_2y_1) \\ &\qquad=\frac{\varphi(P)\varphi(a)}{Pa^2}\int_{Y_1}^{Y_1+Z_1}dx_2 \int_{0}^{f(x_2)}dy_1+O(Z_1P^{1+\varepsilon}a^{-2}). \end{aligned} \end{equation} \tag{ 4.11 }$

4.5. Second variant of estimation of the remainder term

The second approach to the calculation of $N_\ell(a,P)$ consists in counting the number of solutions of the congruence $xy+P\equiv 0\pmod{a}$ that lie below the graph of some monotonic function. We approximate this domain by rectangles, and in every rectangle we reduce the problem to estimates of Kloosterman sums.
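For readers who want to experiment, here is a small numerical sketch of the classical Kloosterman sum $K(m,n;q)=\sum_{(x,q)=1}e^{2\pi i(mx+n\overline{x})/q}$, where $x\overline{x}\equiv1\pmod q$. It is assumed here that this is the normalization meant in the text (the paper's own definition appears in an earlier section). The code checks Weil's bound $|K(m,n;p)|\leqslant2\sqrt{p}$ for a few small primes $p\nmid mn$.

```python
import cmath
import math


def kloosterman(m, n, q):
    """Classical Kloosterman sum K(m, n; q), computed directly (needs Python 3.8+)."""
    total = 0j
    for x in range(1, q + 1):
        if math.gcd(x, q) == 1:
            x_inv = pow(x, -1, q)    # modular inverse of x modulo q
            total += cmath.exp(2j * cmath.pi * (m * x + n * x_inv) / q)
    return total


for p in (5, 7, 11, 13):
    worst = max(abs(kloosterman(m, n, p)) for m in range(1, p) for n in range(1, p))
    print(p, round(worst, 6), round(2 * math.sqrt(p), 6))
```

The square-root saving visible here is the source of the term $a^{1/2}$ in the remainder $R(Z_2)$ of Proposition 4.3.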

Proposition 4.3. Let $a> 0$ and let

$\begin{equation*} I=[Y_1,Y_1+Z_1)\times[Y_2,Y_2+Z_2)\subset[0,a)\times [0,Pa). \end{equation*}$

Then the asymptotic formula

$\begin{equation} \sideset{}{^\#}\sum_{(x,y)\in I}\delta_a(xy+P)= \frac{\varphi(P)}{P}\,\frac{\varphi(a)}{a^2}Z_1Z_2+O(R(Z_2)) \end{equation} \tag{ 4.12 }$

holds, where

$\begin{equation*} R(Z_2)\ll\biggl(\frac{Z_2}{a}+a^{1/2}\biggr)(a,P)(aP)^\varepsilon. \end{equation*}$

Proposition 4.3 is proved by standard methods (see, for example, Theorem 3 in [24]; a generalization to the case of an arbitrary linear function can be found in [124]). It follows from the formula (4.12) that for an arbitrary non-negative function $f$ such that $Z_2\leqslant f(x)\leqslant Z_2+V$ ( $V\ll Z_2$ ) for $x\in [Y_1,Y_1+Z_1)$ the following asymptotic formula holds:

$\begin{equation} \begin{aligned} &\sum_{Y_1\leqslant x_2<Y_1+Z_1}\ \ \sideset{}{^\#}\sum_{0\leqslant y_1\leqslant f(x_2)}\delta_a(P-x_2y_1) \\ &\qquad=\frac{\varphi(P)}{P}\,\frac{\varphi(a)}{a^2}\int_{Y_1}^{Y_1+Z_1}dx \int_{0}^{f(x)}dy+O\biggl(\frac{Z_1V}{a}\biggr)+O(R(Z_2)). \end{aligned} \end{equation} \tag{ 4.13 }$

We apply this formula to the function $f$ defined by (4.6). For this we choose a positive integer $r\leqslant a$ and represent the interval $[0,a]$ in which $x_2$ varies in the form

$\begin{equation*} [0,a]=\bigsqcup_{j=0}^{r-1}I(j)=\bigsqcup_{k=0}^{m}W_k, \end{equation*}$

where $m\ll \log a$ , $I(0)=W_0=[0,a/r]$ ,

$\begin{equation*} I(j)=\biggl(\frac{j}{r}a,\frac{j+1}{r}a\biggr]\quad (1\leqslant j<r),\qquad W_k=\bigsqcup_{\substack{j=1\\ 2^{k-1}\leqslant j<2^k}}^{r-1}I(j)\quad (k>0). \end{equation*}$
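The grouping of the intervals $I(j)$ into dyadic blocks $W_k$ is purely combinatorial and can be sketched as follows (the function name is hypothetical). For $j\geqslant1$ the condition $2^{k-1}\leqslant j<2^k$ says that $k$ is the bit length of $j$.

```python
def dyadic_blocks(r):
    """Map k -> list of indices j with I(j) inside W_k, for 0 <= j < r."""
    blocks = {0: [0]}                       # W_0 = I(0)
    for j in range(1, r):
        blocks.setdefault(j.bit_length(), []).append(j)
    return blocks


blocks = dyadic_blocks(10)
for k, js in blocks.items():
    print(k, js)
```

For $r=10$ this gives $m=4$ blocks beyond $W_0$, and in general $m$ is the bit length of $r-1$, in line with $m\ll\log a$; the block $W_k$ contains at most $2^{k-1}$ intervals for $k\geqslant 1$.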

We represent the quantity $N_\ell(a,P)$ in the form

$\begin{equation*} N_\ell(a,P)=\sum_{k=0}^{m}N_{\ell,k}(a,P), \end{equation*}$

where the definition of $N_{\ell,k}(a,P)$ is obtained from the definition of $N_\ell(a,P)$ by imposing the additional condition $x_2\in W_k$ . By approximating $f$ on each interval $I(j)$ by constant functions one can prove that

$\begin{equation} N_{\ell,k}(a,P)=\frac{\varphi(P)}{P}\,\frac{\varphi(a)}{a^2}\int_{W_k}dx_2 \int_{0}^{f(x_2)}dy_1+O\biggl(\frac{P}{ar}\biggr)+ O\bigl(rR(b)P^{\varepsilon}\bigr). \end{equation} \tag{ 4.14 }$

For this it is sufficient to use the formula (4.11) for $k=0$ and to sum equation (4.13) over the intervals $I(j)\subset W_k$ for $k> 0$ . By (4.7) and (4.8) we have $|f'(x)|\ll rP\times 2^{-k}a^{-2}$ on each of these intervals. The values of $f$ vary within an interval of length $V=O(P\cdot 2^{-k}a^{-1})$ , and this leads to a remainder $O(Z_1V/a)=O(P\cdot 2^{-k}a^{-1}r^{-1})$ . It remains to take into account that the number of intervals $I(j)\subset W_k$ is $O(2^{k})$ .

We sum (4.14) over $k$ and pass to the variables $\beta=y_1/y_2$ and $\alpha=x_2/x_1$ :

$\begin{equation*} \frac{N_\ell(a,P)}{\varphi(P)}=\frac{\varphi(a)}{a^2}\int_{\mathbb{R}^2} \frac{[A''\in\mathscr{V}''_\ell(a,P)]\,d\alpha\,d\beta}{(\det A'')^2}+ O\biggl(\frac{P^{\varepsilon}}{ra}\biggr)+ O\bigl(rR(b)P^{-1+\varepsilon}\bigr). \end{equation*}$

The value $r=[P^{1/2}a^{-3/4}]$ is chosen based on the relation $P/(ra)\asymp ra^{1/2}$ . The requirement $r\leqslant a$ holds under the condition that $a\geqslant P^{2/7}$ . As a result we obtain an asymptotic formula for $N_\ell(a,P)$ with a second version of the remainder:

$\begin{equation} \frac{N_\ell(a,P)}{\varphi(P)}=\frac{\varphi(a)}{a^2}\int_{\mathbb{R}^2} \frac{[A''\in\mathscr{V}''_\ell(a,P)]\,d\alpha\,d\beta}{(\det A'')^2}+ O\bigl((a,P)a^{-1/4}P^{-1/2+\varepsilon}\bigr). \end{equation} \tag{ 4.15 }$

4.6. Estimation of the total remainder

Comparing the formulae (4.10) and (4.15), we find that it makes sense to use the first one for $a\leqslant P^{2/5}$ and the second for $a> P^{2/5}$ . Thus, for the sum of all remainder terms we obtain the estimate

$\begin{equation} \sum_{a\leqslant P^{2/5}}aP^\varepsilon+\sum_{P^{2/5}<a\leqslant P^{1/2}} (a,P)a^{-1/4}P^{1/2+\varepsilon}\ll P^{7/8+\varepsilon}. \end{equation} \tag{ 4.16 }$
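The crossover point $P^{2/5}$ and the exponent $7/8$ in (4.16) follow from elementary exponent arithmetic. The sketch below ignores the factors $P^{\varepsilon}$ and $(a,P)$ (an assumption that affects only the $\varepsilon$ in the exponent) and uses the fact that a sum of a power function has the order of its upper limit times its largest term.

```python
from fractions import Fraction

def crossover_exponent():
    """kappa with a = a^{-1/4} P^{1/2} for a = P^kappa, i.e. (5/4) kappa = 1/2."""
    return Fraction(1, 2) / Fraction(5, 4)

def first_sum_exponent(kappa_max):
    """Order of the sum of a over a <= P^kappa_max, which is about a^2."""
    return 2 * kappa_max

def second_sum_exponent(kappa_max):
    """Order of the sum of a^{-1/4} P^{1/2} over a <= P^kappa_max."""
    return Fraction(3, 4) * kappa_max + Fraction(1, 2)

kappa0 = crossover_exponent()
total = max(first_sum_exponent(kappa0), second_sum_exponent(Fraction(1, 2)))
print(kappa0, total)
```

The first sum contributes $P^{4/5}$ and the second $P^{7/8}$, so the second one dominates.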

4.7. Calculation of the principal term

If $a>\sqrt{P}$ , then the set $\mathscr{V}_\ell(a,P)$ becomes empty. Nevertheless, we can impart a meaning to the formula (4.10) by setting $\rho_0(a)=0$ . Under this convention, it follows from the estimate $\rho_0(a)\ll a^{-2+\varepsilon}$ that

$\begin{equation*} \sum_{a\leqslant P^{2/5}}\rho_0(a)=\sum_{a=1}^\infty\rho_0(a)+ O(P^{-2/5+\varepsilon}). \end{equation*}$

As noted above, the condition $\alpha_1\leqslant \beta_2$ , which is satisfied by the matrices in the set $\mathscr{V}''_\ell(a,P)$ , is equivalent to the inequality $a^2\leqslant P/\det A''$ . Therefore, the relations (4.10), (4.15), and (4.16) imply that

$\begin{equation*} \frac{N_\ell(P)}{\varphi(P)}=\sum_{a=1}^\infty \frac{\varphi(a)}{a^2}\int_{\mathbb{R}^2} \frac{[A''\in\mathscr{V}''_\ell(a,P)]\,d\alpha\,d\beta}{(\det A'')^2}+ \sum_{a=1}^\infty\rho_0(a)+O(P^{-1/8+\varepsilon}). \end{equation*}$

The condition $a\leqslant y_2$ , which must be satisfied by the matrices in $\mathscr{V}_\ell(a,P)$ , is not invariant under the left action of $D_3(\mathbb{R})$ . When we pass to the set $\mathscr{V}_\ell''(a,P)$ , this condition takes the form $a^2\leqslant P/\det A''$ , that is,

$\begin{equation*} \mathscr{V}''_\ell(a,P)=\biggl\{A''\in \mathscr{V}''_\ell\colon a^2\leqslant \frac{P}{\det A''}\biggr\}. \end{equation*}$

Thus,

$\begin{equation*} \begin{aligned} \frac{N_\ell(P)}{\varphi(P)}&=\int_{\mathbb{R}^2} \frac{[A''\in\mathscr{V}''_\ell]\,d\alpha\,d\beta}{(\det A'')^2} \sum_{a=1}^\infty\frac{\varphi(a)}{a^2} \biggl[a^2\leqslant\frac{P}{\det A''}\biggr] \\ &\qquad+\sum_{a=1}^\infty\rho_0(a)+O(P^{-1/8+\varepsilon}). \end{aligned} \end{equation*}$

Applying the relation

$\begin{equation*} \sum_{a=1}^N\frac{\varphi(a)}{a^2}=\frac{1}{\zeta(2)} \biggl(\log N+\gamma-\frac{\zeta'(2)}{\zeta(2)}\biggr)+O(N^{-1+\varepsilon}) \end{equation*}$

to the inner sum, we arrive at the required formula (4.3).
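The asymptotic relation for $\sum\varphi(a)/a^2$ is easy to test numerically. In the sketch below the numerical values of $\gamma$ and $\zeta'(2)$ are hard-coded constants, and the function names are illustrative.

```python
import math

def totients(limit):
    """Euler's phi for 0..limit via a sieve."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:               # p is prime: its entry was never reduced
            for k in range(p, limit + 1, p):
                phi[k] -= phi[k] // p
    return phi

EULER_GAMMA = 0.5772156649015329
ZETA_2 = math.pi ** 2 / 6
ZETA_PRIME_2 = -0.9375482543158438    # numerical value of zeta'(2)

def phi_sum(n):
    """Left-hand side: sum of phi(a) / a^2 over a <= n."""
    phi = totients(n)
    return sum(phi[a] / (a * a) for a in range(1, n + 1))

def main_term(n):
    """Right-hand side without the O-term."""
    return (math.log(n) + EULER_GAMMA - ZETA_PRIME_2 / ZETA_2) / ZETA_2

for n in (10, 100, 1000, 10000):
    print(n, phi_sum(n) - main_term(n))
```

The printed differences should shrink roughly like $N^{-1}$ up to logarithmic factors, in agreement with the remainder $O(N^{-1+\varepsilon})$.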

§ 5. Division into cases

In the two-dimensional case the set $\mathscr{V}$ was divided into two subsets $\mathscr{V}_1$ and $\mathscr{V}_2$ which were almost the same (see §4.2). In the three-dimensional case the division has a more complicated structure and essentially uses geometric properties of Minkowski bases.

5.1. Reduced matrices

Let $X$ be a Minkowski basis matrix of the form (1.8) and let $a,b,c$ be defined by (3.6). Then $P\asymp abc$ . Indeed, on the one hand, the obvious inequality $P\leqslant 6 abc$ holds. On the other hand, $P\geqslant abc$ by Minkowski's convex body theorem. In particular, the inequality $P\geqslant abc$ means that among the minors corresponding to the elements of any row of $X$ , there always is at least one that has maximal possible order. For example,

$\begin{equation*} \max\{|a_1b_2-a_2b_1|,|a_1b_3-a_3b_1|,|a_2b_3-a_3b_2|\}\geqslant \frac{ab}{3}\,. \end{equation*}$
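The two elementary steps behind these inequalities can be illustrated numerically. The sketch below takes $a,b,c$ to be the largest absolute values in the rows of $X$ (an assumption standing in for the definition (3.6), which lies outside this section) and checks, for an arbitrary integer matrix, that $|\det X|\leqslant 6abc$ and that expansion along the third row forces one of the minors of the first two rows to be at least $|\det X|/(3c)$. The lower bound $P\geqslant abc$ is specific to Minkowski bases and is not tested here.

```python
def det3(x):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = x
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))

def top_minors(x):
    """Absolute values of the 2x2 minors built from the first two rows."""
    (a1, a2, a3), (b1, b2, b3), _ = x
    return [abs(a1 * b2 - a2 * b1), abs(a1 * b3 - a3 * b1), abs(a2 * b3 - a3 * b2)]

def row_sizes(x):
    """Largest absolute value in each row (stand-in for a, b, c)."""
    return [max(abs(v) for v in row) for row in x]

def check(x):
    """Return the truth values of the two inequalities for the matrix x."""
    a, b, c = row_sizes(x)
    p = abs(det3(x))
    upper_ok = p <= 6 * a * b * c                 # six products in the Leibniz formula
    minor_ok = 3 * c * max(top_minors(x)) >= p    # Laplace expansion along row 3
    return upper_ok, minor_ok

print(check([[5, 1, -2], [-1, 7, 3], [2, 3, 11]]))
```

Combined with $P\geqslant abc$, the second inequality gives the bound $ab/3$ for the largest minor.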

An important role in the arguments below will be played by division of the set of all Minkowski basis matrices into charts — subsets on which the Linnik–Skubenko reduction will be conducted. In every chart the position of the minor of maximal order (for every row) will be fixed.

Definition 5.1. Let $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2\end{pmatrix}$ , let $q=\det A\ne 0$ , and let $X$ be the matrix of a Minkowski basis of the form (1.8). The matrix $X$ is said to be reduced if it satisfies the following conditions.

(1)
$a_1=a$ .
(2)
$q\geqslant ab/4$ .
(3)
The basis $(e_1,e_2)$ of the lattice $\Lambda=\langle e_1,e_2\rangle$ is close to a minimal basis in the following sense: one of the bases $(e_1,e_2)$ , $(e_1,e_1+e_2)$ , $(e_2,e_1\pm e_2)$ is a Voronoi basis of $\Lambda$ .

In other words, the properties 1 and 2 mean that in a reduced matrix the corner minor $q$ and the corner element $a_1$ have greatest possible values with respect to the order. The property 3 means that the matrix $A$ can also be reconstructed from the lattice $\Lambda$ with basis $(a_1,b_1)$ , $(a_2,b_2)$ almost uniquely: the number of Voronoi bases in a lattice with determinant $q$ , like the length of the continued fraction expansion of the number $d/q$ , can be estimated as $O(\log(q+1))$ , and therefore the number of possible matrices $A$ for the given lattice $\Lambda$ is bounded by a quantity $O(\log(q+1))$ .
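The logarithmic bound for the length of a continued fraction expansion comes from the Euclidean algorithm, in which every two division steps at least halve the denominator. A minimal sketch, with hypothetical helper names and an arbitrary numerator $d$ (the text does not fix $d$ at this point):

```python
import math

def continued_fraction(d, q):
    """Partial quotients of d/q (q > 0) produced by the Euclidean algorithm."""
    quotients = []
    while q != 0:
        quotients.append(d // q)
        d, q = q, d % q
    return quotients

def length_bound(q):
    """Crude bound 2*log2(q) + 2 on the number of partial quotients for 0 <= d < q."""
    return 2 * math.log2(q) + 2

q = 89
lengths = [len(continued_fraction(d, q)) for d in range(q)]
print(max(lengths), length_bound(q))
```

This only illustrates the $O(\log(q+1))$ order of the length; it does not enumerate the Voronoi bases themselves.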

We denote by $g_1,\dots,g_6$ the elements of the group $G_3$ permuting the coefficients $a,b,c$ of the matrix $X$ (and preserving the diagonal dominance of $X$ ).

Lemma 5.2. The set $\mathscr{M}$ can be partitioned into finitely many subsets in such a way that:

(i) every set of the partition is defined by a finite set of inequalities that are invariant under the left action of $D_3(\mathbb{R})$ ;

(ii) if $\widetilde{\mathscr{M}}$ is one of the sets of the partition, then for any $i=1,\dots,6$ at least one of the sets $g_i\bigl(\widetilde{\mathscr{M}}\,\bigr)$ or $g_i\bigl(\widetilde{\mathscr{M}}\,\bigr)\begin{pmatrix} 1 & 0 & 0\\ 0 & 0 &1\\0 & 1 & 0\end{pmatrix}$ consists of reduced matrices.

Proof. By Theorem 3.1, it is sufficient to construct a required partition for the sets (3.3) and (3.4). In each of them we add the additional partition by all hyperplanes of the form $x_i=x_j$ , $y_i=y_j$ , $z_i=z_j$ ( $i\ne j$ ). Consider an arbitrary set in the resulting partition. If it consists of matrices of type I, then all its elements satisfy the conditions 1–3 (the matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2\end{pmatrix}$ defines a Voronoi basis, since $b_1\leqslant 0$ ).

Suppose that the set under consideration consists of matrices of type II. After the rows of the matrix $X$ are ordered by increase of maximal elements, there arise three variants of sign arrangements. Besides the case (3.4), there are also the following two possible cases:

$\begin{equation} \begin{gathered}\begin{pmatrix} x_1 & x_2 & -x_3 \\ -y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{pmatrix},\quad\text{where}\ x_2+x_3\geqslant x_1 \ \text{and} \ (y_1\geqslant y_3 \ \text{or} \ z_1\geqslant z_2),\end{gathered} \end{equation} \tag{ 5.1 }$

$\begin{equation} \begin{gathered}\begin{pmatrix} x_1 & x_2 & -x_3 \\ y_1 & y_2 & y_3 \\ z_1 & -z_2 & z_3 \end{pmatrix},\quad\text{where}\ z_1+z_2\geqslant z_3 \ \text{and} \ (x_3\geqslant x_2\ \text{or} \ y_3\geqslant y_1).\end{gathered} \end{equation} \tag{ 5.2 }$

In the matrices (3.4) and (5.1), the element $b_1$ is negative. Hence, as for matrices of type I, we have $q=x_1y_2+x_2y_1\geqslant x_1y_2=ab$ and the basis $(e_1,e_2)$ is a Voronoi basis. For the matrices (5.2) we implement in addition the subpartition by the planes $a_2=a_1/2$ and $b_1=b_2/2$ and consider two cases:

1)
$b_1> 0$ and ( $a_2\leqslant a_1/2$ or ( $a_2>a_1/2$ , $b_1\leqslant b_2/2$ ));
2)
$b_1> 0$ , $a_2>a_1/2$ , $b_1> b_2/2$ .

In the first case, $q=x_1y_2-x_2y_1\geqslant x_1y_2/2=ab/2$ , and for $a_2\leqslant a_1/2$ we can choose as a Voronoi basis the pair $(e_1-e_2,e_2)$ with the matrix $\begin{pmatrix} x_1-x_2 & x_2\\ y_1-y_2 & y_2\end{pmatrix}$ , while for $b_1\leqslant b_2/2$ we can choose the pair $(e_1,e_2-e_1)$ with the matrix $\begin{pmatrix} x_1 & -(x_1-x_2)\\ y_1 & y_2-y_1 \end{pmatrix}$ .

In the second case, we transpose the second and third columns in $X$ :

$\begin{equation*} X=\begin{pmatrix} x_1 & x_2 & -x_3 \\ y_1 & y_2 & y_3 \\ z_1&-z_2&z_3 \end{pmatrix}\to \begin{pmatrix} x_1 & -x_3 & x_2 \\ y_1 & y_3 & y_2 \\ z_1&z_3&-z_2 \end{pmatrix}\sim \begin{pmatrix} x_1 & x_3 & -x_2 \\ -y_1 & y_3 & y_2 \\ z_1&-z_3&z_2 \end{pmatrix}. \end{equation*}$

If the condition $y_3\geqslant y_1$ holds in the matrix $\begin{pmatrix} x_1 & x_3\\ -y_1 & y_3\end{pmatrix}$ , then the basis of $\Lambda=\langle e_1,e_2\rangle$ consisting of the vectors $e_1=(x_1,-y_1)$ and $e_2=(x_3,y_3)$ is a Voronoi basis and $q=x_1y_3+x_3y_1\geqslant x_1y_1\geqslant ab/2$ . In the remaining case ( $y_3<y_1$ , $x_3\geqslant x_2$ ) we can choose as a Voronoi basis the pair $(e_2,e_2-e_1)$ with the matrix $\begin{pmatrix} x_3 & -(x_1-x_3)\\ y_3 & y_1+y_3 \end{pmatrix}$ . Here $q=x_1y_3+x_3y_1\geqslant y_1x_2\geqslant ab/4$ . $\square$

Thus, with each matrix $A$ we can associate a Voronoi basis of the lattice $\Lambda$ , and to each Voronoi basis there correspond at most four matrices $A$ .

Remark 5.3. It suffices to prove Theorem 3.3 after replacing in its statement (and in the definition (3.10) of the quantity $\mathscr{N}_\Pi(P)$ ) the set $\mathscr{M}$ by an arbitrary set $\widetilde{\mathscr{M}}$ of the partition constructed in Lemma 5.2. Then the set $\widetilde{\mathscr{M}}$ can be represented in the form

$\begin{equation} \widetilde{\mathscr{M}}=\bigsqcup_{\ell=1}^6g_\ell(\mathscr{M}_\ell), \end{equation} \tag{ 5.3 }$

where $g_\ell\in G_3$ , and each of the sets $\mathscr{M}_\ell$ consists of reduced matrices satisfying the conditions $a\leqslant b\leqslant c$ (in the definition of $\mathscr{M}_\ell$ , the non-strict inequalities between $a$ , $b$ , and $c$ can be replaced by strict inequalities, so that the sets $g_\ell(\mathscr{M}_\ell)$ are pairwise disjoint).

Remark 5.4. The three-dimensional Gauss measure is invariant under the left action of $D_3(\mathbb{R})$ . In particular, for $\beta_1'=\beta_1/\beta_3$ , $\beta_2'=1/\beta_3$ , $\gamma_1'=\gamma_1/\gamma_2$ , and $\gamma_3'=1/\gamma_2$ we have

$\begin{equation*} \begin{vmatrix} 1 & \alpha_2 & \alpha_3 \\ \beta_1 & 1 & \beta_3 \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}\,d\beta_1\,d\beta_3\,d\gamma_1\,d\gamma_2=\begin{vmatrix} 1 & \alpha_2 & \alpha_3 \\ \beta_1' & \beta_2' & 1 \\ \gamma_1'&1&\gamma_3' \end{vmatrix}^{-3}\,d\beta_1'\,d\beta_2'\,d\gamma_1'\,d\gamma_3'. \end{equation*}$

Thus, the measure of the set $D_3(\mathbb{R})\setminus\mathscr{M}$ is independent of whether the second and third columns of the matrix $X$ were transposed or not.

Remarks 5.3 and 5.4 imply that to prove Theorem 3.3 it suffices to verify the following assertion.

Theorem 5.5. Let $\mathscr{M}_\ell$ be one of the sets of the partition (5.3) and let $\Pi$ be the parallelepiped defined by equation (3.9). Then

$\begin{equation} \frac{1}{\varphi^2(P)}\ \sideset{}{^\#}\sum_{X\in\mathscr{M}_\ell(P)} [(\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{\gamma})\in\Pi]= \mathscr{Q}_2^{(\ell)}(\log P)+O(P^{-1/34+\varepsilon}), \end{equation} \tag{ 5.4 }$

where $\mathscr{Q}_2^{(\ell)}$ is a polynomial of second degree with leading coefficient

$\begin{equation} \frac{\mu(\widetilde{\mathscr{M}}'''\cap \Pi)}{12\zeta(2)\zeta(3)}= \frac{1}{12\zeta(2)\zeta(3)}\int_{\Pi} \frac{[X'''\in \widetilde{\mathscr{M}}''']}{(\det X''')^3}\, d\boldsymbol{\alpha}\,d\boldsymbol{\beta}\,d\boldsymbol{\gamma}. \end{equation} \tag{ 5.5 }$

To simplify the exposition, we conduct the proof of Theorem 5.5 under the assumption that all the parameters $\xi_2,\xi_3,\eta_1,\eta_3,\zeta_1,\zeta_2$ defining the dimensions of $\Pi$ are equal to 1. In the general case the arguments will be the same.

5.2. Properties of the constructed partition

Lemma 5.6. Suppose that $X$ is a reduced matrix. Then

$\begin{equation*} \frac{q}{2a}\leqslant b\leqslant \frac{4q}{a}\,,\qquad \frac{P}{24q}\leqslant c\leqslant \frac{2P}{q}\,. \end{equation*}$

Proof. The assertion of the lemma follows from the inequalities $q\leqslant 2ab$ and $abc\leqslant P\leqslant 6abc$ and the property 2 of reduced matrices. $\square$

Lemma 5.7. The partition constructed in Lemma 5.2 has the following additional properties: any of the sets $\mathscr{M}_\ell(a,q,P)$ is defined by finitely many inequalities of the form $\pm c_2\leqslant f_i(a_2,a_3,b_1,b_3,c_1)$ ( $1\leqslant i\leqslant i_0$ ) each of which acts over the corresponding domain $\Omega_i=\Omega_i(a_2,a_3,b_1,b_3,c_1)$ . Furthermore, $f_i\ll P/q$ and

$\begin{equation} \frac{\partial f_i}{\partial a_{2,3}}\ll\frac{P}{aU}\,,\quad \frac{\partial f_i}{\partial b_{1,3}}\ll\frac{P}{bU}\,,\quad \frac{\partial f_i}{\partial c_{1}}\ll\frac{P}{cU}, \qquad (a_2,a_3,b_1,b_3,c_1)\in\Omega_i, \end{equation} \tag{ 5.6 }$

where $U=|a_1b_3-a_3b_1|$ .

Proof. Consider Minkowski matrices of type I. Obviously, the estimate $f_i\ll P/q$ always holds, since $c_2\leqslant c\asymp P/q$ for reduced matrices (see Lemma 5.6). We consider successively all the functions that can define the limits of variation of $z_2$ for the matrices $X$ of the form (3.3) with $\det X=P$ . These are the functions $f_{i}$ ( $i=1,\dots,4$ ) that are defined, respectively, by the conditions $z_1=z_2$ , $z_1=z_3$ , $z_2=z_3$ (arising in the initial partition of the set $\mathscr{M}$ ), and $z_3=y_2$ (the part of the boundary appearing because of the inequality $b\leqslant c$ ). For the first function $f_1=z_1$ we have

$\begin{equation*} \frac{\partial f_1}{\partial z_1}=1\ll\frac{P}{cq}\,,\quad \frac{\partial f_1}{\partial x_{2,3}}=0,\quad \frac{\partial f_1}{\partial y_{1,3}}=0. \end{equation*}$

The other functions are found from the equation $\det X=P$ :

$\begin{equation*} \begin{gathered} f_2=\frac{P-z_1\biggl(q+\begin{vmatrix} x_2 & -x_3 \\ y_2 & y_3 \end{vmatrix}\biggr)}{\begin{vmatrix} x_1 & x_3 \\ y_1 & y_3 \end{vmatrix}},\qquad f_3=\frac{P-z_1\begin{vmatrix} x_2 & -x_3 \\ y_2 & y_3 \end{vmatrix}}{q+\begin{vmatrix} x_1 & x_3 \\ y_1 & y_3 \end{vmatrix}}, \\ f_4=\frac{P-y_2q-z_1\begin{vmatrix} x_2 & -x_3 \\ y_2 & y_3 \end{vmatrix}}{\begin{vmatrix} x_1 & x_3 \\ y_1 & y_3 \end{vmatrix}}\,. \end{gathered} \end{equation*}$

(If a function $f_i$ , where $i=1,\dots,4$ , defines the boundary of the domain of variation of $c_2$ , then, as noted above, $f_i\ll P/q$ , and therefore the denominator $U=|a_1b_3- a_3b_1|$ in such cases is non-zero.) If $f_i=F_i/G_i\ll P/q$ , then

$\begin{equation*} f_i'=\frac{F_i'}{G_i}-\frac{F_iG_i'}{G_i^2}\ll\biggl|\frac{F_i'}{G_i}\biggr| +\frac{P}{q}\biggl|\frac{G_i'}{G_i}\biggr|. \end{equation*}$

For all the functions under consideration we have

$\begin{equation*} \begin{alignedat}{3} \frac{\partial F_i}{\partial x_j}&\ll\frac{P}{a}\,,&\qquad \frac{\partial F_i}{\partial y_j}&\ll\frac{P}{b}\,,&\qquad \frac{\partial F_i}{\partial z_1}&\ll\frac{P}{c}\,, \\ \frac{\partial G_i}{\partial x_j}&\ll\frac{q}{a}\,,&\qquad \frac{\partial G_i}{\partial y_j}&\ll\frac{q}{b}\,,&\qquad \frac{\partial G_i}{\partial z_1}&=0. \end{alignedat} \end{equation*}$

Therefore, to verify the assertion of the lemma it is sufficient to show that $|G_i|\gg U$ . For $f_2$ and $f_4$ this is obvious, and for $f_3$ it follows from the inequalities on the elements of the Minkowski matrix:

$\begin{equation*} q+\begin{vmatrix} x_1 & x_3 \\ y_1 & y_3 \end{vmatrix}=x_1y_2+x_2y_1+x_1y_3-x_3y_1\geqslant x_1y_2\geqslant \frac{q}{2} \gg U \end{equation*}$

(for $x_2\geqslant x_3$ this follows from the inequality $x_2y_1-x_3y_1\geqslant0$ , for $y_3\geqslant y_1$ it follows from the inequality $x_1y_3-x_3y_1\geqslant 0$ , but for $z_1\geqslant z_2$ the function $f_3$ cannot define the boundary of the domain of variation of $c_2$ , since then the inequality $z_1>z_2$ would hold inside this domain, which contradicts the condition $z_2=z_3$ defining $f_3$ ).

For matrices of type II, the only difference in the proof of the estimates (5.6) is the need to consider the function $f_5$ defined by the condition $z_1+z_2=z_3$ . For this function the conditions (5.6) are verified in the same way as for the other functions. $\square$

5.3. Scheme of proof of the main result

To prove (5.4) we represent the set $\mathscr{M}_\ell(P)$ in the form

$\begin{equation*} \mathscr{M}_\ell(P)=\bigsqcup_{a,q}\mathscr{M}_\ell(a,q,P), \end{equation*}$

where $\mathscr{M}_\ell(a,q,P)$ is the set of matrices $X\in\mathscr{M}_\ell(P)$ in which the values of the corner element $a_1=a$ and the corner minor $\det A=q$ are fixed. Then

$\begin{equation} \mathscr{N}_\ell(a,q,P)=|\mathscr{M}_\ell(a,q,P)|= \ \sideset{}{^\#}\sum_{a_2,a_3,b_1,b_3,c_1,c_2}[X\in\mathscr{M}_\ell(a,q,P)]. \end{equation} \tag{ 5.7 }$

In the last equation it is assumed that in the summation over $a_2$ , $a_3$ , $b_1$ , $b_3$ , $c_1$ , $c_2$ the values of $b_2$ and $c_3$ are determined by the equations $\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}=q$ and $\det X=P$ , respectively.

In the process of the proof we will pass successively (in various orders) from summation over the variables $a_2$ , $a_3$ , $b_1$ , $b_3$ , $c_1$ , $c_2$ to integration over the variables $\alpha_i=a_ia_1^{-1}$ , $\beta_i=b_ib_2^{-1}$ , $\gamma_i=c_ic_3^{-1}$ . In the end this will enable us to transform the sum in (5.7) into the integral in (5.5).

We define the parameters $\varkappa$ and $\lambda$ by $a=P^\varkappa$ and $q=P^\lambda$ . In addition we divide the domain

$\begin{equation*} \Omega=\{(a,q)\colon a^2\ll q\ll a^{1/2}P^{1/2}\}= \biggl\{(a,q)\colon 0\leqslant\varkappa\leqslant\frac{1}{3}\,, \ 2\varkappa\leqslant\lambda\leqslant\frac{1+\varkappa}{2}\biggr\} \end{equation*}$

in which the pairs $(a,q)$ vary into three parts depending on the values of $\varkappa$ and $\lambda$ (see Fig. 10).

**Figure 10.** Partition of the domain of variation of the parameters $\varkappa$ and $\lambda$

We prove the asymptotic formulae for $\mathscr{N}_\ell(a,q,P)$ in three different ways, depending on which of the domains $\Omega_i$ ( $i=1,2,3$ ) the pair $(a,q)$ belongs to.

1. For small values of $a$ and $b$ (that is, for $(a,q)\in\Omega_1$ ), equation (1.2) is first solved as a linear equation with respect to $c_1$ , $c_2$ , $c_3$ (see §6). The solutions are counted by elementary considerations. A reduction is performed to a two-dimensional problem, which is solved by the methods of §4.

2. If the values of $a$ , $b$ , $c$ are commensurable (that is, for $(a,q)\in\Omega_2$ ), then the matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$ is fixed and equation (1.2) can be solved with respect to the unknowns $a_3$ , $b_3$ , $c_1$ , $c_2$ , $c_3$ (see §7). The solutions are counted by using a non-linear Linnik–Skubenko parametrization and estimates of Kloosterman sums. Again a reduction is performed to a two-dimensional problem, which is also solved by using the methods of §4.

3. In the remaining case, when $a$ is small while $b$ and $c$ are large (that is, $(a,q)\in\Omega_3$ ), equation (1.2) is solved with respect to the unknowns $b_3$ , $c_1$ , $c_2$ , $c_3$ (see §8). The reduction to a two-dimensional problem is effected by using a special version of the Linnik–Skubenko reduction which preserves the value of $a_3$ , and this two-dimensional problem is solved by elementary methods.

Correspondingly, the quantity $\mathscr{N}_\ell(P)$ is represented in the form

$\begin{equation} \mathscr{N}_\ell(P)=\mathscr{N}_\ell^{(1)}(P)+ \mathscr{N}_\ell^{(2)}(P)+\mathscr{N}_\ell^{(3)}(P), \end{equation} \tag{ 5.8 }$

where

$\begin{equation} \mathscr{N}_\ell^{(i)}(P)=\sum_{(a,q)\in\Omega_i}\mathscr{N}_\ell(a,q,P). \end{equation} \tag{ 5.9 }$

For each case we indicate the order of the transitions from summation to integration and the assertions in which these transitions are effected.

1. $(c_1,c_2)$ , Lemmas 6.2, 6.3, and 6.4; $b_3$ , Lemma 6.7; $b_1$ , Lemma 6.9; $(a_2,a_3)$ , Lemma 9.13; $q$ , Proposition 9.14.

2. $(c_1,c_2,a_3,b_3)$ , Theorem 7.1 and Corollary 7.3; $(a_2,b_1)$ , Proposition 7.7; $q$ , Proposition 9.12.

3. $(c_1,c_2,b_3)$ , Theorem 8.1 and Corollary 8.2; $b_1$ , Proposition 8.3; $(a_2,a_3)$ , Lemma 9.13; $q$ , Proposition 9.14.

After this, the three results obtained are substituted into equation (5.8). The last transition, from summation to integration with respect to the variable $a_1$ , is implemented in the proof of Theorem 5.5.

After the transition to matrices in which some coefficients become real numbers, the symbol $\#$ will mean that instead of the total primitivity conditions (2.9)–(2.11) only those necessary restrictions hold that make sense (that is, those in which only integers occur):

$\begin{equation*} \begin{alignedat}{2} &\begin{pmatrix} \mathbb{Z} & \mathbb{Z} & \mathbb{Z} \\ \mathbb{Z} & \mathbb{Z} & \mathbb{Z} \\ \mathbb{R} & \mathbb{R} & \mathbb{R} \end{pmatrix}:&&\qquad \biggl(\begin{vmatrix} a_1&a_2 \\ b_1&b_2 \end{vmatrix}, \begin{vmatrix} a_2&a_3 \\ b_2&b_3 \end{vmatrix}, \begin{vmatrix} a_1&a_3 \\ b_1&b_3 \end{vmatrix}\biggr)=1; \\ &\begin{pmatrix} \mathbb{Z} & \mathbb{Z} & \mathbb{Z} \\ \mathbb{Z} & \mathbb{Z} & \mathbb{R} \\ \mathbb{R} & \mathbb{R} & \mathbb{R} \end{pmatrix}:&&\qquad (a_1,a_2,a_3)=1,\quad (a_1,a_2,b_1,b_2)=1; \\ &\begin{pmatrix} \mathbb{Z} & \mathbb{Z} & \mathbb{Z} \\ \mathbb{R} & \mathbb{R} & \mathbb{R} \\ \mathbb{R} & \mathbb{R} & \mathbb{R} \end{pmatrix}: &&\qquad (a_1,a_2,a_3)=1. \end{alignedat} \end{equation*}$

5.4. Different versions of Kloosterman sums

For the Kloosterman sums (1.6) we know the estimate

$\begin{equation} |K_a(m,n)|\leqslant \tau(a)(m,n,a)^{1/2} a^{1/2} \end{equation} \tag{ 5.10 }$

( $\tau(q)$ is the number of divisors of $q$ ), proved by Weil [125] for prime $a$ and extended to arbitrary $a$ by Estermann [126].
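As an illustration of (5.10), the following sketch evaluates the classical Kloosterman sum $\sum^{*}_{x} e\bigl((mx+n\overline{x})/a\bigr)$ by brute force and compares it with the right-hand side. It assumes that the definition (1.6), which lies outside this section, is this classical one; the function names are not taken from the paper.

```python
import cmath
import math

def kloosterman(a, m, n):
    """Classical Kloosterman sum modulo a, computed directly from the definition."""
    if a == 1:
        return complex(1.0)
    total = 0j
    for x in range(1, a + 1):
        if math.gcd(x, a) == 1:
            x_inv = pow(x, -1, a)                 # modular inverse (Python 3.8+)
            total += cmath.exp(2j * cmath.pi * (m * x + n * x_inv) / a)
    return total

def num_divisors(a):
    """tau(a), the number of positive divisors of a."""
    return sum(1 for d in range(1, a + 1) if a % d == 0)

def weil_estermann_bound(a, m, n):
    """Right-hand side of (5.10)."""
    g = math.gcd(math.gcd(m, n), a)
    return num_divisors(a) * math.sqrt(g) * math.sqrt(a)

for a in (7, 12, 35):
    print(a, abs(kloosterman(a, 1, 1)), weil_estermann_bound(a, 1, 1))
```

For $a=35$ the bound is slightly smaller than the trivial estimate $\varphi(35)=24$, so the inequality is not vacuous even for such small moduli.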

The trigonometric sums (1.5) are responsible for the distribution of the solutions of equation (1.3) (and the equivalent congruence (1.4)). For these sums it is convenient to use the estimate

$\begin{equation} |K_a(m,n,q)|\leqslant \tau(a)\tau((m,n,q,a))(mn,mq,nq,a)^{1/2}a^{1/2} \end{equation} \tag{ 5.11 }$

(see [44], Lemma 1), which generalizes the inequality (5.10).

As noted above, in the proof of the main result of this paper the reduction to the two-dimensional case is performed in three different ways. In the first two cases it is necessary to study solutions of the equation (1.3) (the congruence (1.4)) under the additional conditions $(a,x,y,z)=1$ or $(a,x)=1$ . The corresponding trigonometric sums are defined by

$\begin{equation} K^{\times}_a(m,n,q)=\sum_{x,y=1}^{a}\sum_{z}[az-xy=q,(a,x,y,z)=1] e\biggl(\frac{mx+ny}{a}\biggr), \end{equation} \tag{ 5.12 }$

$\begin{equation} K^*_a(m,n,q)=\sideset{}{^*}\sum_{x=1}^{a}\sum_{y=1}^{a}\delta_a(xy+q) e\biggl(\frac{mx+ny}{a}\biggr). \end{equation} \tag{ 5.13 }$

Estimation of the sums (5.12) and (5.13) reduces to the estimate (5.11) (see [24], Lemmas 1–3). In both cases this makes it possible to prove the uniform distribution of the solutions of the congruence $xy+q\equiv 0\pmod{a}$ with the corresponding restrictions (see Propositions 6.8 and 7.4 below). For brevity we use the notation $K^{\times}_a(q)=K^{\times}_a(0,0,q).$
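A direct evaluation of the sum (5.13) can serve as a sanity check. The sketch below assumes that $\delta_a(n)$ is the indicator of $n\equiv 0\pmod{a}$ and that the starred sum runs over $x$ coprime to $a$ (both conventions are fixed outside this section). For $m=n=0$ the sum counts the solutions of $xy+q\equiv 0\pmod{a}$ with $(a,x)=1$; every admissible $x$ determines $y$ uniquely, so the count is $\varphi(a)$.

```python
import cmath
import math

def k_star(a, m, n, q):
    """Brute-force value of K*_a(m, n, q) as written in (5.13)."""
    total = 0j
    for x in range(1, a + 1):
        if math.gcd(x, a) != 1:
            continue
        for y in range(1, a + 1):
            if (x * y + q) % a == 0:              # delta_a(xy + q) = 1
                total += cmath.exp(2j * cmath.pi * (m * x + n * y) / a)
    return total

def euler_phi(a):
    """Euler's totient by direct counting."""
    return sum(1 for x in range(1, a + 1) if math.gcd(x, a) == 1)

print(k_star(12, 0, 0, 5), euler_phi(12))
print(abs(k_star(12, 1, 1, 5)))
```

The second printed value shows the cancellation in a non-trivial sum, which is what the estimate (5.11) quantifies.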

§ 6. First variant of estimation of the remainder

In this section the reduction to the two-dimensional case is performed by elementary considerations. For the two-dimensional case similar arguments were described in §§4.3 and 4.4. In the transition to integration in the second row of the matrix $X$ , standard 'two-dimensional' methods are used, based on estimates of Kloosterman sums (see §4.5).

6.1. Linear parametrization of solutions

Lemma 6.1. A matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$ with determinant $q\ne0$ can be supplemented to form a matrix

$\begin{equation} \overline A =\begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{pmatrix} \end{equation} \tag{ 6.1 }$

satisfying the condition

$\begin{equation} \biggl(q,\begin{vmatrix} a_2&a_3 \\ b_2&b_3 \end{vmatrix}, \begin{vmatrix} a_1&a_3 \\ b_1&b_3 \end{vmatrix}\biggr)=1 \end{equation} \tag{ 6.2 }$

if and only if $(a_1,a_2,b_1,b_2)=1$ .

For an arbitrary matrix $\overline A$ satisfying the condition (6.2) there are integers $c_1$ , $c_2$ , $c_3$ such that

$\begin{equation} \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1&c_2&c_3 \end{vmatrix}=1. \end{equation} \tag{ 6.3 }$

See the proof in [23].
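The first assertion of Lemma 6.1 can be explored by brute force for small matrices. Since the two minors in (6.2) matter only modulo $q$, it is enough to search for $a_3$, $b_3$ in $[0,|q|)$. The helper below is a hypothetical illustration and not the construction from [23].

```python
import math

def complete_rows(a1, a2, b1, b2):
    """Search for (a3, b3) with gcd(q, a2*b3 - a3*b2, a1*b3 - a3*b1) = 1.

    Returns None when no such pair exists.
    """
    q = a1 * b2 - a2 * b1
    if q == 0:
        raise ValueError("the matrix A must be non-singular")
    for a3 in range(abs(q)):
        for b3 in range(abs(q)):
            m23 = a2 * b3 - a3 * b2
            m13 = a1 * b3 - a3 * b1
            if math.gcd(math.gcd(q, m23), m13) == 1:
                return a3, b3
    return None

print(complete_rows(2, 3, 1, 5))    # entries coprime: a completion exists
print(complete_rows(2, 4, 6, 2))    # all entries even: no completion
```

When all entries of $A$ have a common divisor, that divisor also divides every minor in (6.2), which explains why the search fails in the second call.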

Lemma 6.2. If the matrix (6.1) satisfies the condition (6.2) and $(\widetilde{c}_1,\widetilde{c}_2,\widetilde{c}_3)$ is a particular solution of equation (6.3), then the following assertions hold.

(i) All the solutions of the equation

$\begin{equation} \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1&c_2&c_3 \end{vmatrix}=P \end{equation} \tag{ 6.4 }$

with respect to the unknowns $c_1,c_2,c_3$ have the form

$\begin{equation} \begin{pmatrix} c_1 \\ c_2 \\c_3 \end{pmatrix}= P\begin{pmatrix} \widetilde{c}_1 \\ \widetilde{c}_2 \\ \widetilde{c}_3 \end{pmatrix}+ u\begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}+v\begin{pmatrix} b_1 \\ b_2 \\ b_3\end{pmatrix} \qquad (u,v\in\mathbb{Z}). \end{equation} \tag{ 6.5 }$

(ii) The formula (6.5) produces different solutions $(c_1,c_2,c_3)$ for different pairs $(u,v)$ .

(iii) A solution $(c_1,c_2,c_3)$ obtained by the formula (6.5) defines a totally primitive matrix (1.8) if and only if $(u,P)=(v,P)=1$ .

Proof. (i) An arbitrary integer vector $c=(c_1,c_2,c_3)$ can be represented as a linear combination of the vectors $a=(a_1,a_2,a_3)$ , $b=(b_1,b_2,b_3)$ , and $\widetilde{c}=(\widetilde{c}_1,\widetilde{c}_2,\widetilde{c}_3)$ with integer coefficients. It follows from (6.4) that the coefficient of $\widetilde{c}$ must be equal to $P$ . This proves that representation in the form (6.5) is possible.

(ii) It follows from condition (6.2) that the vectors $a$ and $b$ are linearly independent. Therefore, different solutions $(c_1,c_2,c_3)$ correspond to different pairs $(u,v)$ .

(iii) The last assertion of the lemma is verified using the formulae (6.5) and (6.2):

$\begin{equation} \begin{gathered}\biggl(\begin{vmatrix} a_1&a_2 \\ c_1&c_2\end{vmatrix}, \begin{vmatrix} a_1&a_3 \\ c_1&c_3\end{vmatrix}, \begin{vmatrix} a_2&a_3 \\ c_2&c_3 \end{vmatrix}\biggr)= \biggl(\begin{vmatrix} a_1&a_2 \\ c_1&c_2\end{vmatrix}, \begin{vmatrix} a_1&a_3 \\ c_1&c_3\end{vmatrix}, \begin{vmatrix} a_2&a_3 \\ c_2&c_3 \end{vmatrix},P\biggr) \\ \qquad=\biggl(v\begin{vmatrix}a_1 & a_2\\ b_1 & b_2 \end{vmatrix}, v\begin{vmatrix} a_1&{a}_3 \\ b_1& {b}_3 \end{vmatrix}, v\begin{vmatrix} a_2&{a}_3 \\ b_2&{b}_3 \end{vmatrix},P\biggr)=(v,P),\end{gathered} \end{equation} \tag{ 6.6 }$

$\begin{equation} \begin{gathered}\biggl(\begin{vmatrix} b_1&b_2 \\ c_1&c_2 \end{vmatrix},\begin{vmatrix} b_1&b_3 \\ c_1&c_3 \end{vmatrix}, \begin{vmatrix} b_2&b_3 \\ c_2&c_3 \end{vmatrix}\biggr)= \biggl(\begin{vmatrix} b_1&b_2 \\ c_1&c_2 \end{vmatrix},\begin{vmatrix} b_1&b_3 \\ c_1&c_3 \end{vmatrix}, \begin{vmatrix} b_2&b_3 \\ c_2&c_3 \end{vmatrix},P\biggr) \\ \qquad=\biggl(u\begin{vmatrix} a_1 & a_2\\ b_1 & b_2 \end{vmatrix}, u\begin{vmatrix} a_1&{a}_3 \\ b_1&{b}_3 \end{vmatrix}, u\begin{vmatrix} a_2&{a}_3 \\ b_2&{b}_3 \end{vmatrix},P\biggr)=(u,P).\end{gathered} \end{equation} \tag{ 6.7 }$

Thus, the matrix $X$ is totally primitive if and only if $(u,P)=(v,P)=1$ . $\square$
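The parametrization (6.5) and the identities (6.6) and (6.7) can be verified on a small example. In the sketch below the rows $a=(2,1,1)$, $b=(1,3,2)$ and the particular solution $\widetilde{c}=(-1,0,0)$ are arbitrary illustrative choices satisfying (6.2) and (6.3); they do not come from the text.

```python
import math

A_ROW = (2, 1, 1)
B_ROW = (1, 3, 2)
C_PARTICULAR = (-1, 0, 0)     # the rows A_ROW, B_ROW, C_PARTICULAR have determinant 1

def cross(r, s):
    """The three 2x2 minors of the rows r and s, arranged as a cross product."""
    return (r[1] * s[2] - r[2] * s[1],
            r[2] * s[0] - r[0] * s[2],
            r[0] * s[1] - r[1] * s[0])

def det_rows(r, s, t):
    """Determinant of the 3x3 matrix with rows r, s, t."""
    return sum(x * y for x, y in zip(cross(r, s), t))

def gcd3(v):
    """Greatest common divisor of three integers."""
    return math.gcd(math.gcd(v[0], v[1]), v[2])

def solution(p, u, v):
    """Third row given by the formula (6.5)."""
    return tuple(p * ct + u * a + v * b
                 for ct, a, b in zip(C_PARTICULAR, A_ROW, B_ROW))

c = solution(12, 1, 2)
print(c, det_rows(A_ROW, B_ROW, c))
print(gcd3(cross(A_ROW, c)), gcd3(cross(B_ROW, c)))
```

For $P=12$, $u=1$, $v=2$ the two greatest common divisors are $(v,P)=2$ and $(u,P)=1$, so this particular matrix is not totally primitive.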

Suppose that $\Omega$ is a planar domain with rectifiable boundary. Let $\operatorname{Area}(\Omega)$ denote the area of this domain, let $\mathscr{P}(\Omega)$ be its perimeter, and let $N(\Omega)$ be the number of points of the lattice $\mathbb{Z}^2$ lying inside $\Omega$ . For a convex domain Jarník's inequality is known:

$\begin{equation*} \left|\operatorname{Area}(\Omega)-N(\Omega)\right|<\mathscr{P}(\Omega)+1. \end{equation*}$

For an arbitrary simply connected planar domain $\Omega$ with rectifiable boundary we have

$\begin{equation} \left|\operatorname{Area}(\Omega)-N(\Omega)\right|<4(\mathscr{P}(\Omega)+1) \end{equation} \tag{ 6.8 }$

(see, for example, [67], Lemma 1).
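For a concrete feel for these inequalities one can count lattice points in a disc, which is a convex domain. The sketch below counts the points of $\mathbb{Z}^2$ in the closed disc of radius $R$ centred at the origin (whether boundary points are counted is an assumption that does not affect the order of the error) and compares the discrepancy with the perimeter.

```python
import math

def lattice_points_in_disc(radius):
    """Number of (x, y) in Z^2 with x^2 + y^2 <= radius^2."""
    r2 = radius * radius
    bound = math.floor(radius)
    count = 0
    for x in range(-bound, bound + 1):
        y_max = math.isqrt(math.floor(r2 - x * x))    # largest |y| allowed for this x
        count += 2 * y_max + 1
    return count

def discrepancy(radius):
    """|Area - N| for the disc of the given radius."""
    return abs(math.pi * radius * radius - lattice_points_in_disc(radius))

def perimeter(radius):
    """Length of the boundary circle."""
    return 2 * math.pi * radius

for radius in (5, 20, 80):
    print(radius, discrepancy(radius), perimeter(radius) + 1)
```

For discs the true discrepancy is much smaller than the perimeter; the inequalities above are used in the text only because they hold for very general domains.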

Lemma 6.3. Let $P$ be a positive integer, let $\Omega$ be a simply connected planar domain with rectifiable boundary such that $\mathscr{P}(\Omega)\gg 1$ , and let

$\begin{equation*} N^*(P;\Omega)=\sum_{(u,v)\in\Omega}[(u,P)=(v,P)=1]. \end{equation*}$

Then

$\begin{equation*} N^*(P;\Omega)=\frac{\varphi^2(P)}{P^2}\operatorname{Area}(\Omega)+ O(\mathscr{P}(\Omega)P^\varepsilon). \end{equation*}$

Proof. By the Möbius inversion formula,

$\begin{equation*} N^*(P;\Omega)=\sum_{d_1,d_2\mid P}\mu(d_1)\,\mu(d_2) \sum_{(u,v)\in\Omega}[d_1| u,d_2| v]. \end{equation*}$

The required asymptotic formula is obtained if in the inner sum we perform the change of variables $u=d_1u'$ , $v=d_2v'$ , use the inequality (6.8) in the new variables $u'$ , $v'$ , and estimate the perimeter of the diminished copy of the domain $\Omega$ as $O(\mathscr{P}(\Omega))$ :

$\begin{equation*} \begin{aligned} N^*(P;\Omega)&=\sum_{d_1,d_2\mid P}\mu(d_1)\,\mu(d_2) \biggl(\frac{\operatorname{Area}(\Omega)}{d_1d_2}+ O(\mathscr{P}(\Omega))\biggr) \\ &=\frac{\varphi^2(P)}{P^2}\operatorname{Area}(\Omega)+ O(\mathscr{P}(\Omega)P^\varepsilon). \end{aligned} \end{equation*}$
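The Möbius step in this proof is an exact identity, which the following sketch verifies on a rectangle $1\leqslant u\leqslant U$, $1\leqslant v\leqslant V$ (a hypothetical choice of $\Omega$, made so that the inner sums are simply $\lfloor U/d_1\rfloor\lfloor V/d_2\rfloor$). It also evaluates the main term $\varphi^2(P)P^{-2}\operatorname{Area}(\Omega)$ for comparison.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0              # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def count_direct(p, u_max, v_max):
    """Pairs (u, v) in the rectangle with gcd(u, P) = gcd(v, P) = 1."""
    us = sum(1 for u in range(1, u_max + 1) if math.gcd(u, p) == 1)
    vs = sum(1 for v in range(1, v_max + 1) if math.gcd(v, p) == 1)
    return us * vs

def count_mobius(p, u_max, v_max):
    """The same count via the double sum over d1, d2 dividing P."""
    return sum(mobius(d1) * mobius(d2) * (u_max // d1) * (v_max // d2)
               for d1 in divisors(p) for d2 in divisors(p))

def main_term(p, u_max, v_max):
    """phi(P)^2 / P^2 times the area of the rectangle."""
    phi = sum(1 for x in range(1, p + 1) if math.gcd(x, p) == 1)
    return phi * phi * u_max * v_max / (p * p)

print(count_direct(30, 47, 101), count_mobius(30, 47, 101), main_term(30, 47, 101))
```

The difference between the exact count and the main term comes only from the rounding in $\lfloor U/d_1\rfloor$ and $\lfloor V/d_2\rfloor$, summed over the divisors of $P$.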

6.2. Transition to integration in the third row

We consider the set of matrices $X\in\mathscr{M}_\ell(a,q,P)$ for which the matrix $\overline A=\begin{pmatrix} a_1 & a_2 & a_3\\ b_1 & b_2 & b_3\end{pmatrix}$ is fixed. Since $q=\begin{vmatrix} a_1 & a_2\\ b_1 & b_2\end{vmatrix}\ne 0$ , the value of $c_3$ is uniquely expressible in terms of $c_1$ and $c_2$ . Therefore, any conditions imposed on the variables $c_1,c_2,c_3$ can be written in the form $(c_1,c_2)\in\Omega(\overline A)$ .

Lemma 6.4. Suppose that $\overline A$ satisfies the condition (6.2), $\Omega$ is a convex domain, and

$\begin{equation*} S(\overline A,\Omega)=\sideset{}{^\#}\sum_{(c_1,c_2)\in\Omega} [X\in\mathscr{M}_\ell(a,q,P)]. \end{equation*}$

Then

$\begin{equation*} S(\overline A,\Omega)=\frac{\varphi^2(P)}{P^2q} \int_\Omega[X\in\mathscr{M}_\ell(a,q,P)]\,dc_1\,dc_2+ O\biggl(\frac{P^{1+\varepsilon}}{aq}\biggr). \end{equation*}$

Proof. By construction the set $\mathscr{M}_\ell(a,q,P)$ consists of matrices of the form (3.3), (3.4), (5.1), or (5.2). Hence, the variables $c_1,c_2,c_3$ must satisfy some conditions in the following list:

$\begin{equation*} 0\leqslant c_1\leqslant|c_2|\leqslant c_3,\quad 0\leqslant |c_2|\leqslant c_1\leqslant c_3,\quad c_1+|c_2|\geqslant c_3. \end{equation*}$

Thus, the domain of variation of the variables $c_1,c_2$ is contained in a convex polygon with linear dimensions estimated as $O(c)=O(P/q)$ by Lemma 5.6. It follows from equation (6.5) that the linear dimensions of the corresponding polygon on the plane $Ouv$ are $O(bP/q^2)$ . By Lemma 6.3,

$\begin{equation*} \begin{aligned} S(\overline A,\Omega)&=\sideset{}{^\#}\sum_{\substack{u,v\\ (u,P)=(v,P)=1}} [X\in\mathscr{M}_\ell(a,q,P),(c_1,c_2)\in\Omega] \\ &=\frac{\varphi^2(P)}{P^2}\int_{\mathbb{R}^2} [X\in\mathscr{M}_\ell(a,q,P),(c_1,c_2)\in\Omega]\,du\,dv+ O\biggl(\frac{P^{1+\varepsilon}}{aq}\biggr) \\ &=\frac{\varphi^2(P)}{P^2q}\int_\Omega[X\in\mathscr{M}_\ell(a,q,P)]\, dc_1\,dc_2+O\biggl(\frac{P^{1+\varepsilon}}{aq}\biggr).\qquad\qquad\square \end{aligned} \end{equation*}$

Corollary 6.5. Assume the hypotheses of Lemma 6.4. Then

$\begin{equation} S(\overline A,\Omega)\ll \frac{P^{2+\varepsilon}}{q^3}\,. \end{equation} \tag{ 6.9 }$

Furthermore, for fixed $\widetilde{a}_3$ and $\widetilde{b}_3$

$\begin{equation} \begin{aligned} &\sideset{}{^\#}\sum_{(c_1,c_2)\in\Omega(\widetilde{a}_3,\widetilde{b}_3)} [X\in\mathscr{M}_\ell(a,q,P)] \\ &\qquad=\frac{\varphi^2(P)}{P^2q} \int_{\widetilde{a}_3}^{\widetilde{a}_3+1}\,da_3 \int_{\widetilde{b}_3}^{\widetilde{b}_3+1}\,db_3 \sum_{(c_1,c_2)\in\Omega(a_3,b_3)}[X\in\mathscr{M}_\ell(a,q,P)]+ O\biggl(\frac{P^{2+\varepsilon}}{q^3}\biggr). \end{aligned} \end{equation} \tag{ 6.10 }$

Proof. To verify the estimate (6.9) it suffices to use a trivial estimate for the integral in Lemma 6.4. Equation (6.10) follows from the fact that every term in it is $O(P^{2+\varepsilon}q^{-3})$ . $\square$

6.3. Transition to integration in the second row

By the norm of a function we always mean the $L^\infty$ -norm.

Lemma 6.6. Let $a$ and $D$ be positive integers with $D\mid a$ . Suppose that a function $f$ on an interval $I$ has finitely many intervals of monotonicity. Then the following asymptotic formulae hold:

$\begin{equation} \begin{gathered}\sum_{\substack{x\in I\\ x\equiv x_0\!\!\!\!\pmod{D}}}f(x)= \frac{1}{D}\int_{I}f(x)\,dx+O(\|f\|),\end{gathered} \end{equation} \tag{ 6.11 }$

$\begin{equation} \begin{gathered}\sum_{\substack{x\in I\\ (x,a)=D}}f(x)=\frac{\varphi(a/D)}{a} \int_{I}f(x)\,dx+O(\|f\| a^\varepsilon),\end{gathered} \end{equation} \tag{ 6.12 }$

$\begin{equation} \begin{gathered}\sum_{\substack{x\in I\\ (x,a)=1}}f(x)=\frac{\varphi(a)}{a} \int_{I}f(x)\,dx+O(\|f\|a^\varepsilon).\end{gathered} \end{equation} \tag{ 6.13 }$

See the proof in [24], Lemma 4.
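The formula (6.13) can be sanity-checked numerically. The following is a minimal sketch, not part of the cited proof: the test function $f$, the modulus $a=30$, and the interval $[0,3000]$ are arbitrary illustrative choices, and all helper names are hypothetical.

```python
from math import gcd, log

def euler_phi(n):
    """Euler's totient by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def coprime_sum(f, lo, hi, a):
    """Sum of f(x) over integers lo <= x <= hi with gcd(x, a) == 1."""
    return sum(f(x) for x in range(lo, hi + 1) if gcd(x, a) == 1)

def f(x):
    # Illustrative monotone test function with sup-norm 1 on [0, 3000].
    return 1000.0 / (1000.0 + x)

def integral_f(lo, hi):
    # Exact integral of f over [lo, hi].
    return 1000.0 * log((1000.0 + hi) / (1000.0 + lo))

a, lo, hi = 30, 0, 3000
lhs = coprime_sum(f, lo, hi, a)                 # left-hand side of (6.13)
main = euler_phi(a) / a * integral_f(lo, hi)    # principal term of (6.13)
print(lhs, main, lhs - main)
```

For a monotone $f$ the difference stays bounded by a constant multiple of $\|f\|$ times the number of square-free divisors of $a$, which is consistent with the remainder $O(\|f\|a^\varepsilon)$.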

Lemma 6.7. Let $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$ , $q=\det A>0$ , $(a_1,a_2,b_1,b_2)= 1$ , $D=(a_1,a_2)$ , $(D,a_3)=1$ , and

$\begin{equation*} f(n)=\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & n \\ z_1&z_2&1 \end{vmatrix}^{-3}\ll\frac{1}{q^3} \end{equation*}$

for all $n$ in an interval $I$ . Then for the sum

$\begin{equation*} S^\#(I)=\sideset{}{^\#}\sum_{n\in I}f(n) \end{equation*}$

the following asymptotic formula holds:

$\begin{equation*} S^\#(I)=\frac{\varphi(q)}{q}\,\frac{D}{\varphi(D)} \int_{I}f(t)\,dt+O(q^{-3+\varepsilon}). \end{equation*}$

Proof. Suppose that the matrix $A$ can be reduced to the form $\begin{pmatrix} D & 0\\ \alpha & qD^{-1}\end{pmatrix}$ by elementary transformations of columns. Then the condition (6.2) is equivalent to the equality

$\begin{equation} \biggl(q,\frac{q}{D}a_3,\begin{vmatrix} D &a_3 \\ \alpha &n \end{vmatrix}\biggr)=1. \end{equation} \tag{ 6.14 }$

Since $(D,a_3)=1$ , we have $\bigl(q,(q/D)a_3\bigr)=(q/D)(D,a_3)=q/D$ , and (6.14) means that $\biggl(\dfrac{q}{D}\,, \begin{vmatrix} D &a_3 \\ \alpha &n\end{vmatrix}\biggr)=1$ . Therefore,

$\begin{equation*} S^\#(I)=\sum_{n\in I}\biggl[\biggl(\frac{q}{D}\,, \begin{vmatrix} D &a_3 \\ \alpha &n \end{vmatrix}\biggr)=1\biggr] f(n)=\sum_{\delta\mid q/D}\mu(\delta)\sum_{n\in I} [\delta\mid(Dn-\alpha a_3)]f(n). \end{equation*}$

It follows from the condition $\delta\mid(Dn-\alpha a_3)$ that $(\delta,D)\mid\alpha a_3$ . By the hypothesis of the lemma, $(D,a_3)=1$ , and therefore $(\delta,D)\mid \alpha$ . Furthermore, $(D,q/D,\alpha)=(a_1,a_2,b_1,b_2)=1$ . Thus, it follows from the relations $(\delta,D)\mid \alpha$ and $\delta\mid q/D$ that $(\delta,D)=1$ . For $(\delta,D)=1$ the congruence $Dn-\alpha a_3\equiv 0\pmod{\delta}$ is equivalent to the condition $n\equiv n_0\pmod{\delta}$ , where $n_0\equiv \alpha a_3 D^{-1}\pmod{\delta}$ . To complete the proof of the lemma it remains to use (6.11):

$\begin{equation*} \begin{aligned} S^\#(I)&=\sum_{\substack{\delta \mid q/D\\ (\delta,D)=1}}\mu(\delta) \sum_{\substack{n\in I\\ n\equiv n_0\!\!\!\!\pmod{\delta}}}f(n)= \sum_{\substack{\delta \mid q/D\\ (\delta,D)=1}}\mu(\delta) \biggl(\frac{1}{\delta}\int_{I}f(t)\,dt+O(q^{-3})\biggr) \\ &=\sum_{\substack{\delta \mid q/D\\ (\delta,D)=1}}\!\!\! \frac{\mu(\delta)}{\delta}\int_{I}f(t)\,dt+O(q^{-3+\varepsilon})= \frac{\varphi(q)}{q}\,\frac{D}{\varphi(D)}\int_{I}f(t)\,dt+ O(q^{-3+\varepsilon}).\quad\square \end{aligned} \end{equation*}$
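The last equality in the proof rests on the arithmetic identity $\sum_{\delta\mid q/D,\ (\delta,D)=1}\mu(\delta)/\delta=\varphi(q)D/(q\varphi(D))$ for $D\mid q$. A minimal sketch that checks it in exact rational arithmetic is given below; the function names are hypothetical and the brute-force helpers are intended only for small arguments.

```python
from fractions import Fraction
from math import gcd

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # square factor
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result

def euler_phi(n):
    """Euler's totient by direct count."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def mobius_side(q, D):
    """Sum of mu(delta)/delta over delta | q/D with gcd(delta, D) = 1."""
    m = q // D
    return sum(Fraction(mobius(d), d)
               for d in range(1, m + 1)
               if m % d == 0 and gcd(d, D) == 1)

def totient_side(q, D):
    """phi(q)/q * D/phi(D)."""
    return Fraction(euler_phi(q), q) * Fraction(D, euler_phi(D))

print(mobius_side(360, 6), totient_side(360, 6))
```

Both sides equal the product of $(1-1/p)$ over the primes $p$ dividing $q$ but not $D$, which is why the identity holds for every divisor $D$ of $q$.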

If $(x,a)=1$ , then every interval $[Y,Y+a)$ contains exactly one solution of the congruence $xy+q\equiv 0\pmod{a}$ with respect to the unknown $y$ . Hence, for an arbitrary function $G$ defined on the rectangle $[Y_1,Y_1+Z_1)\times [Y_2,Y_2+Z_2)$ , a natural approximation for the sum

$\begin{equation*} \Phi_{a,q}^*[G]=\sideset{}{^*}\sum_{Y_1\leqslant x< Y_1+Z_1} \ \sum_{Y_2\leqslant y< Y_2+Z_2}\delta_a(xy+q)G(x,y) \end{equation*}$

is given by the sum

$\begin{equation*} S_a^*[G]=\frac{1}{a}\ \sideset{}{^*}\sum_{Y_1\leqslant x< Y_1+Z_1} \int_{Y_2}^{Y_2+Z_2}G(x,y)\,dy. \end{equation*}$
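The heuristic behind this approximation can be illustrated by brute force. The sketch below assumes that the star on the sum means the restriction $(x,a)=1$ and that $\delta_a(n)$ is the indicator of $a\mid n$ (these conventions are fixed earlier in the paper and are taken here as assumptions); it treats only the constant function $G\equiv 1$ and integer endpoints, and the function names are hypothetical.

```python
from math import gcd

def solutions_in_window(x, q, a, Y):
    """All y in [Y, Y + a) with x*y + q divisible by a."""
    return [y for y in range(Y, Y + a) if (x * y + q) % a == 0]

def phi_star_const(a, q, Y1, Z1, Y2, Z2):
    """Phi*_{a,q}[G] for G = 1: count pairs (x, y) with gcd(x, a) = 1
    and a | (x*y + q) in the rectangle [Y1, Y1+Z1) x [Y2, Y2+Z2)."""
    return sum(1
               for x in range(Y1, Y1 + Z1) if gcd(x, a) == 1
               for y in range(Y2, Y2 + Z2) if (x * y + q) % a == 0)

def s_star_const(a, Y1, Z1, Z2):
    """S*_a[G] for G = 1: (1/a) * #{x coprime to a} * Z2."""
    coprime_x = sum(1 for x in range(Y1, Y1 + Z1) if gcd(x, a) == 1)
    return coprime_x * Z2 / a

a, q = 12, 5
print(solutions_in_window(7, q, a, 0))
print(phi_star_const(a, q, 0, 20, 3, 3 * a), s_star_const(a, 0, 20, 3 * a))
```

When $Z_2$ is a multiple of $a$ the two quantities coincide for $G\equiv 1$; the content of Proposition 6.8 is the size of the discrepancy for general $G$ and shorter ranges.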

Proposition 6.8. Let $G$ be a non-negative function and suppose that for any $z$ with $0\leqslant z\leqslant\|G\|$ the inequality $G(x,y)\leqslant z$ defines in a rectangle $I=[0,a]\times [0,Z_2]$ (where $Z_2\ll q/a$ ) the domain $\Omega_z=\{(x,y)\in I\colon y\leqslant f_z(x)\}$ , and the number of monotonicity parts for all the functions $f_z$ is bounded by an absolute constant. Then

$\begin{equation} \Phi_{a,q}^*[G]-S_a^*[G]\ll\|G\|(a,q)a^{-1/4}q^{1/2+\varepsilon}. \end{equation} \tag{ 6.15 }$

See the proof in [24], Theorem 4.

Lemma 6.9. Suppose that $a_1=a$ and $D\mid (a,q)$ . Then the sum

$\begin{equation*} S(a,q,D)=\sideset{}{^\#}\sum_{\substack{a_2,a_3,b_1\\ (a_1,a_2)=D}} \int_{\mathbb{R}}[X'\in\mathscr{M}_\ell'(a,q,P)](\det X')^{-3}\,db_3 \end{equation*}$

satisfies the asymptotic formula

$\begin{equation*} \begin{aligned} S(a,q,D)&=\frac{D\varphi((D,q/D))}{q(D,q/D)}\sideset{}{^\#} \sum_{\substack{a_2,a_3\\ (a,a_2)=D}}\int_{\mathbb{R}^2} [X''\in\mathscr{M}_\ell''(a,q,P)](\det X'')^{-3}\,d\beta_1\,d\beta_3 \\ &\qquad+O\bigl((a,q)a^{-1/4}q^{-3/2+\varepsilon}\bigr). \end{aligned} \end{equation*}$

Proof. We transform the indicated sum, introducing the variables $a_1'=a_1D^{-1}$ , $a_2'=a_2D^{-1}$ , and $a_3'=a_3D^{-1}$ (where $a_3'$ is not necessarily an integer):

$\begin{equation*} S(a,q,D)=\frac{1}{D^3}\int_{\mathbb{R}}\,db_3 \sum_{\substack{a_3\\ (a_3,D)=1}}\, \sideset{}{^*}\sum_{a_2'}\,\sum_{\substack{b_1\\ (D,b_1,b_2)=1}} \begin{vmatrix} a_1' & a_2' & a_3'\\ b_1 & b_2 & b_3\\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}. \end{equation*}$

We get rid of the condition $(D,b_1,b_2)=1$ by using the Möbius function:

$\begin{equation*} \begin{aligned} S(a_1,q,D)&=\frac{1}{D^3}\sum_{\delta\mid(D,q/D)}\mu(\delta) \int_{\mathbb{R}}\,db_3\sum_{\substack{a_3\\ (a_3,D)=1}} \sideset{}{^*}\sum_{a_2'}\,\sum_{b_1\!\colon\!\delta\mid (b_1,b_2)} \begin{vmatrix} a_1' & a_2' & a_3' \\ b_1 & b_2 & b_3 \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3} \\ &=\frac{1}{D^3}\sum_{\delta\mid(D,q/D)}\frac{\mu(\delta)}{\delta^2} \int_{\mathbb{R}}\,db_3'\sum_{\substack{a_3\\ (a_3,D)=1}}\, \sideset{}{^*}\sum_{a_2'}\,\sum_{b_1'} \begin{vmatrix} a_1' & a_2' & a_3' \\ b_1' & b_2' & b_3' \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}, \end{aligned} \end{equation*}$

where $b_1'=b_1\delta^{-1}$ , $b_2'=b_2\delta^{-1}$ , and $b_3'=b_3\delta^{-1}$ . We apply Proposition 6.8 to the inner double sum. By the hypothesis of the lemma,

$\begin{equation*} \begin{vmatrix} a_1' & a_2' & a_3' \\ b_1' & b_2' & b_3' \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}\ll \biggl(\frac{D\delta}{q}\biggr)^{3}. \end{equation*}$

Furthermore, the function $b_1=f(a_2)$ implicitly defined by the equations

$\begin{equation*} \begin{vmatrix} a_1' & a_2' & a_3' \\ b_1' & b_2' & b_3' \\ \gamma_1&\gamma_2&1 \end{vmatrix}=Q,\qquad \begin{vmatrix} a_1' & a_2' \\ b_1' & b_2' \end{vmatrix}=q'=\frac{q}{D\delta} \end{equation*}$

can be reduced to the form

$\begin{equation*} (a_3'b_1'-b_3'a_1')(a_1'z_2-a_2'z_1)=a_1'Q-q'\begin{vmatrix} a_1' & a_3' \\ z_1 & 1 \end{vmatrix}. \end{equation*}$

Therefore, the graph of $f$ consists of finitely many monotonicity parts, and Proposition 6.8 is indeed applicable. Thus,

$\begin{equation*} \begin{aligned} S(a_1,q,D)&=\frac{1}{D^3}\sum_{\delta\mid(D,q/D)}\frac{\mu(\delta)}{\delta^2} \int_{\mathbb{R}}db_3'\sum_{\substack{a_3\\ (a_3,D)=1}} \Biggl(\frac{1}{a_1'}\sideset{}{^*}\sum_{a_2'}\int_{\mathbb{R}} \begin{vmatrix} a_1' & a_2' & a_3' \\ b_1' & b_2' & b_3' \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}\,db_1' \\ &\qquad+O\biggl(\!\biggl(\frac{D\delta}{q}\biggr)^3 \biggl(\frac{a}{D}\,,\frac{q}{D\delta}\biggr) \biggl(\frac{a}{D}\biggr)^{-1/4} \biggl(\frac{q}{D\delta}\biggr)^{1/2+\varepsilon}\biggr)\!\Biggr). \end{aligned} \end{equation*}$

We sum the remainders that have appeared. The variable $b_3'$ varies in an interval of length $O\bigl(q/(a\delta)\bigr)$ , and the variable $a_3$ in an interval of length $O(a)$ . Thus, the sum of the remainders can be estimated by the sum

$\begin{equation*} \sum_{\delta\mid(D,q/D)}(a,q)a^{-1/4}D^{-1/4}\delta^{-1/2} q^{-3/2+\varepsilon}\ll(a,q)a^{-1/4}q^{-3/2+\varepsilon}. \end{equation*}$

We transform the sum of the principal terms, passing to the variables $a_2=Da_2'$ and $a_3=Da_3'$ :

$\begin{equation*} \begin{aligned} &\sum_{\delta\mid(D,q/D)}\frac{\mu(\delta)}{\delta^2}\int_{\mathbb{R}} \frac{db_3}{\delta}\sum_{\substack{a_3\\ (a_3,D)=1}}\frac{D}{a_1} \sum_{{a_2\atop(a,a_2)=D}}\int_{\mathbb{R}}\delta^2 \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}\,db_1 \\ &\qquad=\frac{D}{a}\sum_{\delta\mid(D,q/D)}\frac{\mu(\delta)}{\delta} \sideset{}{^\#}\sum_{\substack{a_2,a_3\\ (a,a_2)=D}}\int_{\mathbb{R}^2} \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}\,db_1\,db_3. \end{aligned} \end{equation*}$

After this it remains to pass to the variables $\beta_1=b_1/b_2$ and $\beta_3=b_3/b_2$ , where $b_2=(q+a_2b_1)a_1^{-1}$ . Since

$\begin{equation} \frac{\partial(b_1,b_3)}{\partial(\beta_1,\beta_3)}=\frac{ab^3}{q}\,,\qquad \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}db_1\,db_3=\frac{a}{q} \begin{vmatrix} a_1 & a_2 & a_3 \\ \beta_1&1& \beta_3 \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}\,d\beta_1\,d\beta_3, \end{equation} \tag{ 6.16 }$

we obtain the required principal term

$\begin{equation*} \frac{D\varphi((D,q/D))}{q(D,q/D)}\, \sideset{}{^\#}\sum_{\substack{a_2,a_3\\ (a,a_2)=D}}\int_{\mathbb{R}^2} \begin{vmatrix} a_1 & a_2 & a_3 \\ \beta_1&1& \beta_3 \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}\,d\beta_1\,d\beta_3.\qquad\qquad\square \end{equation*}$
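The Jacobian in (6.16) can be checked by finite differences. The following is a minimal sketch under the substitution used above, $b_2=q/(a_1-a_2\beta_1)$, $b_1=\beta_1b_2$, $b_3=\beta_3b_2$; the numerical values of $a$, $a_2$, $q$, $\beta_1$, $\beta_3$ are arbitrary and the function names are hypothetical.

```python
def b_from_beta(beta1, beta3, a, a2, q):
    """Recover (b1, b3, b2) from (beta1, beta3) using a*b2 - a2*b1 = q."""
    b2 = q / (a - a2 * beta1)
    return beta1 * b2, beta3 * b2, b2

def numeric_jacobian(beta1, beta3, a, a2, q, h=1e-6):
    """Central-difference approximation of d(b1, b3)/d(beta1, beta3)."""
    b1p, b3p, _ = b_from_beta(beta1 + h, beta3, a, a2, q)
    b1m, b3m, _ = b_from_beta(beta1 - h, beta3, a, a2, q)
    db1_d1 = (b1p - b1m) / (2 * h)
    db3_d1 = (b3p - b3m) / (2 * h)
    b1p, b3p, _ = b_from_beta(beta1, beta3 + h, a, a2, q)
    b1m, b3m, _ = b_from_beta(beta1, beta3 - h, a, a2, q)
    db1_d3 = (b1p - b1m) / (2 * h)
    db3_d3 = (b3p - b3m) / (2 * h)
    return db1_d1 * db3_d3 - db1_d3 * db3_d1

a, a2, q = 7.0, 3.0, 50.0
beta1, beta3 = 0.4, 0.25
_, _, b2 = b_from_beta(beta1, beta3, a, a2, q)
closed_form = a * b2 ** 3 / q          # a * b^3 / q from (6.16), with b = b2
numeric = numeric_jacobian(beta1, beta3, a, a2, q)
print(closed_form, numeric)
```

Since the second row of the determinant is $b_2(\beta_1,1,\beta_3)$, the factor $b_2^{-3}$ coming from the determinant cancels $b_2^3$ in the Jacobian, leaving the factor $a/q$ in (6.16).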

Proposition 6.10. Suppose that $q\gg a^2$ . Then for the quantity $\mathscr{N}_\ell(a,q,P)$ defined by (5.7) the following asymptotic formula holds:

$\begin{equation} \begin{aligned} \mathscr{N}_\ell(a,q,P)&=\varphi^2(P)\biggl(\frac{\varphi(q)}{q^2} \int_{\mathbb{R}^4}d\boldsymbol{\beta}\,d\boldsymbol{\gamma} \sideset{}{^\#}\sum_{a_2,a_3}c(q,D) \frac{[X''\in\mathscr{M}_\ell''(a,q,P)]}{(\det X'')^3} +\rho_1(a,q)\biggr) \\ &\qquad+O\bigl(R_1(a,q,P)\bigr), \end{aligned} \end{equation} \tag{ 6.17 }$

where $D=(a,a_2)$ , $R_1(a,q,P)=a^{-2}qP^{1+\varepsilon}$ ,

$\begin{equation} c(q,D)=\frac{D^2\varphi((D,q/D))}{\varphi(D)(D,q/D)}\,, \end{equation} \tag{ 6.18 }$

$\begin{equation} \rho_1(a,q)\ll(a,q)a^{-1/4}q^{-3/2+\varepsilon}. \end{equation} \tag{ 6.19 }$

Proof. It follows from the inequality (5.11) that

$\begin{equation} K_a^{\times}(q)\ll a^{1+\varepsilon}. \end{equation} \tag{ 6.20 }$

We transform the sum $\mathscr{N}_\ell(a,q,P)$ , using Lemma 6.4 and the estimate (6.20):

$\begin{equation*} \begin{aligned} \mathscr{N}_\ell(a,q,P)&=\sideset{}{^\#} \sum_{\substack{a_2,a_3,b_1,b_3\\ b_1a_2+q\equiv 0\ (\operatorname{mod}{a})}} \biggl(\frac{\varphi^2(P)}{P^2q}\int_{\mathbb{R}^2} [X\in\mathscr{M}_\ell(a,q,P)]\,dc_1\,dc_2+ O\biggl(\frac{P^{1+\varepsilon}}{aq}\biggr)\biggr) \\ &=\frac{\varphi^2(P)}{P^2q}\,\sideset{}{^\#}\sum_{a_2,a_3,b_1,b_3}\, \int_{\mathbb{R}^2}[X\in\mathscr{M}_\ell(a,q,P)]\,dc_1\,dc_2+ O\biggl(\frac{qP^{1+\varepsilon}}{a^2}\biggr). \end{aligned} \end{equation*}$

In the double integral we pass to the variables $\gamma_1=c_1c_3^{-1}$ and $\gamma_2=c_2c_3^{-1}$ . Since $c_3=P(\det X')^{-1}$ , we have

$\begin{equation} \begin{aligned} \frac{\partial(c_1,c_2)}{\partial(\gamma_1,\gamma_2)}&= c_3\biggl(c_3+\frac{\partial c_3}{\partial\gamma_1}\gamma_1+ \frac{\partial c_3}{\partial\gamma_2}\gamma_2\biggr)= \frac{qP^2}{(\det X')^{3}}\,, \\ dc_1\,dc_2&=\frac{qP^2}{(\det X')^{3}}\,d\gamma_1\,d\gamma_2. \end{aligned} \end{equation} \tag{ 6.21 }$

Therefore,

$\begin{equation*} \frac{\mathscr{N}_\ell(a,q,P)}{\varphi^2(P)}= \int_{\mathbb{R}^2}d\boldsymbol{\gamma}\;\,\sideset{}{^\#}\sum_{a_2,a_3,b_1,b_3} \frac{[X'\in\mathscr{M}'_\ell(a,q,P)]}{(\det X')^3}+ O\biggl(\frac{qP^{-1+\varepsilon}}{a^2}\biggr). \end{equation*}$

Using Lemma 6.7, we replace the summation over the variable $b_3$ by an integration:

$\begin{equation*} \begin{aligned} \frac{\mathscr{N}_\ell(a,q,P)}{\varphi^2(P)}&= \int_{\mathbb{R}^2}d\boldsymbol{\gamma}\;\,\sideset{}{^\#}\sum_{a_2,a_3,b_1} \biggl(\frac{\varphi(q)D}{q\varphi(D)}\int_{\mathbb{R}} \frac{[X'\in\mathscr{M}'_\ell(a,q,P)]}{(\det X')^3}\,db_3+ O(q^{-3+\varepsilon})\biggr) \\ &\qquad+O\biggl(\frac{qP^{-1+\varepsilon}}{a^2}\biggr) \\ &=\int_{\mathbb{R}^2}d\boldsymbol{\gamma}\;\,\sideset{}{^\#}\sum_{a_2,a_3,b_1} \frac{\varphi(q)D}{q\varphi(D)}\int_{\mathbb{R}} \frac{[X'\in\mathscr{M}'_\ell(a,q,P)]}{(\det X')^3}\,db_3 \\ &\qquad+O(q^{-2+\varepsilon})+O\biggl(\frac{qP^{-1+\varepsilon}}{a^2}\biggr). \end{aligned} \end{equation*}$

By Lemma 6.9 we arrive at the following equation, which is equivalent to the formula (6.17):

$\begin{equation*} \begin{aligned} \frac{\mathscr{N}_\ell(a,q,P)}{\varphi^2(P)}&= \int_{\mathbb{R}^2}d\boldsymbol{\gamma}\sum_{D\mid(a,q)} \frac{\varphi(q)D}{q\varphi(D)}\biggl(\frac{D\varphi((D,q/D))}{q(D,q/D)}\\ &\qquad\times \sideset{}{^\#}\sum_{\substack{a_2,a_3\\ (a,a_2)=D}}\int_{\mathbb{R}^2} \frac{[X''\in\mathscr{M}''_\ell(a,q,P)]\,d\boldsymbol{\beta}} {(\det X'')^3}+O((a,q)a^{-1/4}q^{-3/2+\varepsilon})\biggr)\\ &\qquad+O(q^{-2+\varepsilon})+O\biggl(\frac{qP^{-1+\varepsilon}}{a^2}\biggr) \\ &=\frac{\varphi(q)}{q^2}\int_{\mathbb{R}^4}d\boldsymbol{\beta}\, d\boldsymbol{\gamma}\;\sideset{}{^\#}\sum_{a_2,a_3}c(q,D) \frac{[X''\in\mathscr{M}''_\ell(a,q,P)]}{(\det X'')^3} \\ &\qquad+O((a,q)a^{-1/4}q^{-3/2+\varepsilon})+ O\biggl(\frac{qP^{-1+\varepsilon}}{a^2}\biggr).\qquad\qquad\square \end{aligned} \end{equation*}$

§ 7. Second variant of estimation of the remainder

Let $\lambda_1(\Lambda)$ and $\lambda_2(\Lambda)$ denote successive minima of a two-dimensional lattice $\Lambda$ , that is, $\lambda_1(\Lambda)$ is the length of a shortest non-zero vector $e_1\in\Lambda$ , and $\lambda_2(\Lambda)$ is the length of a shortest vector $e_2\in\Lambda$ that is linearly independent of $e_1$ . Let $\Lambda(A)$ denote the lattice consisting of solutions of the system of congruences

$\begin{equation*} a_1x+a_2y\equiv 0\pmod{q},\qquad b_1x+b_2y\equiv 0\pmod{q} \end{equation*}$

and let

$\begin{equation} \lambda_2(A)=\lambda_2(\Lambda(A)),\qquad \lambda_2(A^T)=\lambda_2(\Lambda(A^T)). \end{equation} \tag{ 7.1 }$
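For small $q$ the successive minima of $\Lambda(A)$ can be found by exhaustive search. The sketch below is illustrative only and assumes $q>0$; it uses the fact that $q\mathbb{Z}^2\subset\Lambda(A)$, so both minima are at most $q$ and it suffices to search the box $|x|,|y|\leqslant q$. The function name is hypothetical.

```python
from math import sqrt

def successive_minima(A, q):
    """Return (lambda_1, lambda_2) of the lattice of integer (x, y) with
    a1*x + a2*y = 0 (mod q) and b1*x + b2*y = 0 (mod q), by brute force."""
    (a1, a2), (b1, b2) = A
    pts = [(x * x + y * y, x, y)
           for x in range(-q, q + 1)
           for y in range(-q, q + 1)
           if (x, y) != (0, 0)
           and (a1 * x + a2 * y) % q == 0
           and (b1 * x + b2 * y) % q == 0]
    pts.sort()                      # shortest vectors first
    n1, x1, y1 = pts[0]
    for n2, x2, y2 in pts:
        if x1 * y2 - x2 * y1 != 0:  # first vector independent of e1
            return sqrt(n1), sqrt(n2)

# Example: A has determinant q = 11 and coprime entries.
print(successive_minima(((3, 1), (1, 4)), 11))
```

In this example the shortest vectors are $(1,-3)$ and $(3,2)$, and the product $\lambda_1\lambda_2$ is close to $q$, the determinant of the lattice.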

Theorem 7.1. Suppose that the matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$ is fixed, $q=\det A\ne 0$ , $(a_1,a_2,b_1,b_2)=1$ , and the matrix $X$ has the form (1.8). Then for any parallelepiped

$\begin{equation*} I=[Y_1,Y_1+Z_1)\times[Y_2,Y_2+Z_2)\times[Y_3,Y_3+Z_3)\times [Y_4,Y_4+Z_4) \end{equation*}$

with dimensions satisfying the conditions $1\leqslant Z_1, Z_2\leqslant q$ and $1\leqslant Z_3, Z_4\leqslant Pq$ ,

$\begin{equation} \sum_{(a_3,b_3,c_1,c_2)\in I}\sideset{}{^\#}\sum_{c_3}[\det X=P]= \frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2}\, Z_1Z_2Z_3Z_4+O(R_A(Z_1,Z_2,Z_3,Z_4)), \end{equation} \tag{ 7.2 }$

where

$\begin{equation} \begin{aligned} R_A(Z_1,Z_2,Z_3,Z_4)&=(P,q)(Pq)^{\varepsilon}\biggl(\frac{Z_1Z_2Z_3Z_4}{q} \biggl(\frac{1}{Z_1}+\frac{1}{Z_2}+\frac{1}{Z_3}+\frac{1}{Z_4}\biggr) \\ &\qquad+q^{1/2}+\frac{\lambda_2(A^T)}{q^{1/2}}(Z_1+Z_2)+ \frac{\lambda_2(A)}{q^{1/2}}(Z_3+Z_4) \\ &\qquad+\frac{\lambda_2(A)\lambda_2^{1/2}(A^T)}{q}(Z_1+Z_2)(Z_3+Z_4)\biggr). \end{aligned} \end{equation} \tag{ 7.3 }$

See the proof in [23].

Remark 7.2. Theorem 7.1 is a variant of Lemma 7 in [8]. But in [8] the contribution of the summands containing $\lambda_2(A)$ and $\lambda_2(A^T)$ in the remainder term was not taken into account. Further arguments show that it is these summands that make the main contribution to the final remainder (see the choice of the parameters $r_1$ and $r$ in the proof of Proposition 9.8). Here the estimate $\lambda_2(A)\ll q^{1/2+\varepsilon}$ is used, which is valid only on average with respect to $A$ .

Corollary 7.3. Suppose that the hypotheses of Theorem 7.1 hold, the dimensions of the parallelepiped

$\begin{equation*} I'=[Y_1,Y_1+Z_1)\times[Y_2,Y_2+Z_2)\times[Y_3,Y_3+Z_3) \end{equation*}$

satisfy the inequalities $1\leqslant Z_1,Z_2\leqslant q$ and $1\leqslant Z_3\leqslant Pq$ , and for $(a_3,b_3,c_1)\in I'$ the values of a non-negative function $f(a_3,b_3,c_1)$ vary within the interval $[Z_4,Z_4+V)$ , where $1\leqslant V\ll Z_4\leqslant Pq$ . Then

$\begin{equation*} \begin{aligned} &\sum_{(a_3,b_3,c_1)\in I'}\,\sum_{0<c_2\leqslant f(a_3,b_3,c_1)}\, \sideset{}{^\#}\sum_{c_3}[\det X=P] \\ &\qquad=\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2} \int_{I'}f(a_3,b_3,c_1)\,da_3\,db_3\,dc_1 \\ &\qquad\qquad+O(R_A(Z_1,Z_2,Z_3,Z_4))+O\biggl(\frac{Z_1Z_2Z_3V}{q}\biggr). \end{aligned} \end{equation*}$

Proof. Replacing $f(a_3,b_3,c_1)$ in the second sum by $Z_4$ and ${Z_4+V}$ , we obtain, respectively, a lower and an upper estimate of the sum being calculated. Applying Theorem 7.1 and using the estimate

$\begin{equation*} \int_{I'}dx_1\,dx_2\,dx_3\bigl(f(x_1,x_2,x_3)-(Z_4+\theta V)\bigr) \ll Z_1Z_2Z_3V \qquad (\theta=0,1), \end{equation*}$

we arrive at the assertion of the corollary. $\square$

Proposition 7.4. Let $a$ be a positive integer, let $q$ be an integer, let $I=[Y_1,Y_1+ Z_1)\times [Y_2,Y_2+Z_2)$ , where $Z_1,Z_2>0$ , and let

$\begin{equation*} \Phi_{a,q}^{{\times}}(I)=\sum_{(x,y)\in I}\,\sum_{z}[az-xy=q,(a,x,y,z)=1]. \end{equation*}$

Then

$\begin{equation} \Phi_{a,q}^{{\times}}(I)=\frac{K_a^{\times}(q)}{a^2}\,Z_1Z_2+ O\bigl(R(Z_1,Z_2)\bigr), \end{equation} \tag{ 7.4 }$

where

$\begin{equation} R(Z_1,Z_2)= a^{1/2+\varepsilon}+ \biggl(\frac{Z_1}{a}+\frac{Z_2}{a}+1\biggr)(a,q)a^{\varepsilon}. \end{equation} \tag{ 7.5 }$

See the proof in [24], Theorem 2.
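The counting function $\Phi_{a,q}^{\times}(I)$ is easy to evaluate directly for small parameters. The following is a minimal brute-force sketch with integer endpoints $Y_i$ and integer lengths $Z_i$ (an assumption made for simplicity); it does not evaluate $K_a^{\times}(q)$, whose definition lies outside this section, and the function name is hypothetical.

```python
from math import gcd

def phi_cross(a, q, Y1, Z1, Y2, Z2):
    """Count (x, y) in [Y1, Y1+Z1) x [Y2, Y2+Z2) for which some integer z
    satisfies a*z - x*y = q and gcd(a, x, y, z) = 1."""
    count = 0
    for x in range(Y1, Y1 + Z1):
        for y in range(Y2, Y2 + Z2):
            if (q + x * y) % a == 0:
                z = (q + x * y) // a          # z is determined by (x, y)
                if gcd(gcd(a, x), gcd(y, z)) == 1:
                    count += 1
    return count

print(phi_cross(5, 1, 0, 5, 0, 5))
```

The summand is periodic modulo $a$ in both $x$ and $y$, so over rectangles whose sides are multiples of $a$ the count is exactly proportional to the area; the remainder $R(Z_1,Z_2)$ measures the deviation for incomplete periods.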

Corollary 7.5. Suppose that the hypotheses of Proposition 7.4 hold and that for $(x,y)\in I$ the values of a function $f(x,y)$ are contained in the segment $[Z_3,Z_3+V]$ , where $Z_3>0$ and $V \ll Z_3$ . Then

$\begin{equation*} \begin{aligned} &\sum_{(x,y)\in I}\,\sum_{z}[az-xy=q,(a,x,y,z)=1]f(x,y) \\ &\qquad=\frac{K_a^{\times}(q)}{a^2}\,\int_{I}f(x,y)\,dx\,dy+ O(a^{-1+\varepsilon}Z_1Z_2V)+O\bigl(Z_3R(Z_1,Z_2)\bigr), \end{aligned} \end{equation*}$

where $R(Z_1,Z_2)$ is defined by (7.5).

Proof. Replacing $f(x,y)$ in this sum by $Z_3$ and $Z_3+V$ , we obtain, respectively, a lower and an upper estimate for this sum. Using Proposition 7.4, the estimate (6.20), and

$\begin{equation*} \int_{Y_1}^{Y_1+Z_1}dx\int_{Y_2}^{Y_2+Z_2}dy\bigl(f(x,y)- (Z_3+\theta V)\bigr)\ll Z_1Z_2V \qquad (\theta=0,1), \end{equation*}$

we arrive at the assertion of the corollary. $\square$

Corollary 7.6. Suppose that the hypotheses of Proposition 7.4 hold and for $(x,y)\in I$ the values of a function $f(x,y)$ are contained in a segment $[0,Z_3]$ . Then

$\begin{equation*} \begin{aligned} &\sum_{(x,y)\in I}\,\sum_{z}[az-xy=q,\ (a,x,y,z)=1]f(x,y) \\ &\qquad\ll a^{-1+\varepsilon}Z_1Z_2Z_3+Z_3R(Z_1,Z_2). \end{aligned} \end{equation*}$

Proposition 7.7. For any positive integers $r_1,r_2\leqslant a$

$\begin{equation} \begin{aligned} \mathscr{N}_\ell(a,q,P)&=\varphi^2(P)\frac{\varphi(q)}{q^2}\, \frac{K_a^\times(q)}{a^2}\int_{\mathbb{R}^6} \frac{[X'''\in\mathscr{M}_\ell'''(a,q,P)]}{(\det X''')^3}\, d\boldsymbol{\alpha}\,d\boldsymbol{\beta}\,d\boldsymbol{\gamma} \\ &\qquad+O\bigl(R_2(a,q,P)\bigr), \end{aligned} \end{equation} \tag{ 7.6 }$

where

$\begin{equation*} R_2(a,q,P)=r_1^3\sum_{a_2,b_1}R_A\biggl(\frac{a}{r_1}\,, \frac{b}{r_1}\,,\frac{c}{r_1}\,,c\biggr)+\frac{P^{2+\varepsilon}}{r_1aq}+ r_2^2\frac{P^2}{q^2}R\biggl(\frac{a}{r_2}\,,\frac{b}{r_2}\biggr)+ \frac{P^{2+\varepsilon}}{r_2aq} \end{equation*}$

and the remainders $R(Z_1,Z_2)$ and $R_A(Z_1,Z_2,Z_3,Z_4)$ are determined by (7.5) and (7.3), respectively.

Remark 7.8. For the parameter $\lambda_2(A^T)$ defined in (7.1), we use the trivial estimate $\lambda_2(A^T)\ll q/a$ . Using the analogous estimate for $\lambda_2(A)$ does not enable us to single out the principal term in Theorem 3.3. In what follows we will prove (see §9.1) that, on average with respect to $A$ , the upper estimate $\lambda_2(A)\leqslant q^{1/2+\varepsilon}$ holds and is close to the lower estimate. This is the reason for keeping the summation over $a_2$ and $b_1$ in the remainder term $R_2(a,q,P)$ . A similar situation arises in Proposition 8.3.

Proof of Proposition 7.7. The transition from (7.10) to (7.11) will be realized in two stages. First, by using Corollary 7.3 we pass from summation to integration with respect to the variables $a_3$ , $b_3$ , $c_1$ , $c_2$ , and then by using Corollary 7.5 we pass from summation to integration with respect to the variables $a_2$ , $b_1$ .

The intervals

$\begin{equation} I_\nu=[Y_\nu,Y_\nu+Z_\nu)\quad \biggl(\nu=1,2,3; \ Z_1=a+1, \ Z_2=\frac{4q}{a}+1, \ Z_3=\frac{4P}{q}+1\biggr) \end{equation} \tag{ 7.7 }$

(their lengths were estimated in Lemma 5.6) in which the respective variables $a_3$ , $b_3$ , $c_1$ vary are divided into $r_1$ equal intervals, and the intervals $I_\nu=[Y_\nu,Y_\nu+Z_\nu)$ ( $\nu=4,5$ ; $Z_4=a+1$ , $Z_5=4q/a+1$ ) in which $a_2$ , $b_1$ vary are divided into $r_2$ equal intervals:

$\begin{equation*} \begin{alignedat}{3} I_\nu&=\bigsqcup_{j=0}^{r_1-1}I_{\nu}(j),&\quad I_{\nu}(j)&=\biggl[Y_\nu+\frac{j}{r_1}Z_{\nu}, Y_\nu+\frac{j+1}{r_1}Z_{\nu}\biggr)&&\qquad (\nu=1,2,3); \\ I_\nu&=\bigsqcup_{j=0}^{r_2-1}I_{\nu}(j),&\quad I_{\nu}(j)&=\biggl[Y_\nu+\frac{j}{r_2}Z_{\nu}, Y_\nu+\frac{j+1}{r_2}Z_{\nu}\biggr)&&\qquad (\nu=4,5). \end{alignedat} \end{equation*}$

We define a function

$\begin{equation} H(j_1,j_2;b_1)=\frac{r_1}{q}\inf_{(a_3,b_3)\in I_1(j_1)\times I_2(j_2)}|a_1b_3-a_3b_1| \qquad (0\leqslant j_1,j_2<r_1). \end{equation} \tag{ 7.8 }$

It follows from the estimates $0\leqslant H(j_1,j_2;b_1)\ll a$ that the rectangle $I_{1}\times I_{2}$ can be represented in the form $I_{1}\times I_{2}=\bigsqcup_{k=0}^m W_k(b_1)$ , where $m\ll \log a$ and

$\begin{equation*} \begin{aligned} W_k(b_1)&=\bigsqcup_{\substack{j_1,j_2=0\\ H(j_1,j_2;b_1)\in (2^{k-1},2^k]}}^{r_1-1} I_1(j_1)\times I_2(j_2)\qquad (k>0), \\ W_0(b_1)&=\bigsqcup_{\substack{j_1,j_2=0\\ H(j_1,j_2;b_1)\in[0,1]}}^{r_1-1} I_1(j_1)\times I_2(j_2). \end{aligned} \end{equation*}$

Correspondingly, we write the sum $\mathscr{N}_\ell(a,q,P)$ in the form

$\begin{equation} \mathscr{N}_\ell(a,q,P)=\sum_{k=0}^{m}\mathscr{N}_{\ell,k}(a,q,P), \end{equation} \tag{ 7.9 }$

where

$\begin{equation} \mathscr{N}_{\ell,k}(a,q,P)=\sum_{a_2,b_1}\,\sum_{(a_3,b_3)\in W_k(b_1)}\, \sideset{}{^\#}\sum_{c_1,c_2}[X\in\mathscr{M}_\ell(a,q,P)]. \end{equation} \tag{ 7.10 }$

Then Proposition 7.7 will be proved if we show that

$\begin{equation} \mathscr{N}_{\ell,k}(a,q,P)=\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2}\, \frac{K_a^\times(q)}{a^2}\int_{\mathbb{R}^2}F_k(a_2,b_1)\,da_2\,db_1+ O\bigl(R_2(a,q,P)\bigr), \end{equation} \tag{ 7.11 }$

where

$\begin{equation} F_k(a_2,b_1)=\int_{W_k(b_1)} da_3\,db_3 \int_{\mathbb{R}^2}[X\in\mathscr{M}_\ell(a,q,P)]\,dc_1\,dc_2. \end{equation} \tag{ 7.12 }$

Indeed, it follows from (7.11) and (7.9) that

$\begin{equation*} \begin{aligned} \mathscr{N}_{\ell}(a,q,P)&=\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2}\, \frac{K_a^\times(q)}{a^2}\int_{\mathbb{R}^6}[X\in\mathscr{M}_\ell(a,q,P)]\, da_2\,da_3\,db_1\,db_3\,dc_1\,dc_2 \\ &\qquad+O\bigl(R_2(a,q,P)\bigr). \end{aligned} \end{equation*}$

Thus, to obtain (7.6) it remains to pass from the variables $a_2,a_3,b_1,b_3, c_1,c_2$ to the variables $\alpha_2,\alpha_3,\beta_1,\beta_3,\gamma_1,\gamma_2$ using the relations (6.21), (6.16), and

$\begin{equation*} \frac{da_2\,da_3}{(\det X'')^3}= \frac{d\alpha_2\,d\alpha_3}{a(\det X''')^3}\,. \end{equation*}$

We now estimate the number of rectangles of the partition

$\begin{equation} I_{1}\times I_{2}=\bigsqcup_{j_1,j_2=0}^{r_1-1}I_{1}(j_1)\times I_{2}(j_2) \end{equation} \tag{ 7.13 }$

in each of the sets $W_k(b_1)$ . If $I_{1}(j_1)\times I_{2}(j_2)\subset W_k(b_1)$ , then there is a point $(a_3,b_3)\in I_{1}(j_1)\times I_{2}(j_2)$ for which $|a_1b_3-a_3b_1|\leqslant 2^{k+1}q/r_1$ . Since $a_1=a$ , this inequality defines a strip of width $2^{k+2}q/(ar_1)\ll 2^{k}b/r_1$ on the plane $Oa_3b_3$ (in the direction of the $Ob_3$ -axis). Therefore, above each point of the $Oa_3$ -axis there are $O(2^k)$ rectangles of the partition (7.13) that have common points with this strip. Consequently, the set $W_k(b_1)$ consists of $O(r_1\cdot 2^k)$ rectangles of the partition.
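The dyadic classification of the rectangles of (7.13) by the value of $H$ from (7.8), together with the count $O(r_1\cdot 2^k)$, can be illustrated numerically. The sketch below is a toy computation, not part of the proof: it takes $Y_1=Y_2=0$, uses the lengths $Z_1=a+1$ and $Z_2=4q/a+1$ from (7.7), and the parameter values and function names are hypothetical choices.

```python
def inf_abs_linear(a, b1, a3_lo, a3_hi, b3_lo, b3_hi):
    """Infimum of |a*b3 - a3*b1| over a rectangle: the form is linear,
    so its extreme values are attained at the corners."""
    vals = [a * b3 - a3 * b1
            for a3 in (a3_lo, a3_hi)
            for b3 in (b3_lo, b3_hi)]
    lo, hi = min(vals), max(vals)
    return 0.0 if lo <= 0 <= hi else min(abs(lo), abs(hi))

def dyadic_counts(a, q, b1, r1):
    """Number of rectangles I_1(j1) x I_2(j2) in each set W_k(b1)."""
    Z1, Z2 = a + 1, 4 * q / a + 1      # lengths from (7.7); Y_1 = Y_2 = 0
    counts = {}
    for j1 in range(r1):
        for j2 in range(r1):
            H = (r1 / q) * inf_abs_linear(
                a, b1,
                j1 * Z1 / r1, (j1 + 1) * Z1 / r1,
                j2 * Z2 / r1, (j2 + 1) * Z2 / r1)
            k = 0
            while H > 2 ** k:          # k = 0 for H in [0, 1], else (2^(k-1), 2^k]
                k += 1
            counts[k] = counts.get(k, 0) + 1
    return counts

counts = dyadic_counts(a=10, q=200, b1=7, r1=8)
print(sorted(counts.items()))
```

In this example every rectangle falls into exactly one class, and the size of each class stays below a small constant multiple of $r_1\cdot 2^k$, in line with the strip argument above.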

We represent the sum $\mathscr{N}_{\ell,k}(a,q,P)$ in the form

$\begin{equation} \mathscr{N}_{\ell,k}(a,q,P)=\sum_{a_2,b_1}G_k(a_2,b_1), \end{equation} \tag{ 7.14 }$

where

$\begin{equation*} G_k(a_2,b_1)=\sum_{(a_3,b_3)\in W_k(b_1)}\, \sideset{}{^\#}\sum_{c_1,c_2}[X\in\mathscr{M}_\ell(a,q,P)]. \end{equation*}$

We now prove the asymptotic formula

$\begin{equation} \begin{aligned} G_k(a_2,b_1)&=\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2} F_k(a_2,b_1) \\ &\qquad+O\biggl(r_1^3R_A\biggl(\frac{a}{r_1}\,,\frac{b}{r_1}\,, \frac{c}{r_1}\,,c\biggr)\biggr)+ O\biggl(\frac{P^{2+\varepsilon}}{r_1q^2}\biggr), \end{aligned} \end{equation} \tag{ 7.15 }$

where $F_k(a_2,b_1)$ is defined by (7.12).

We note that the number of points in each of the rectangles of the partition (7.13) can be estimated as $O(qr_1^{-2})$ . Thus, it follows from equation (6.10) (see Corollary 6.5) that for any rectangle $I_{1}(j_1)\times I_{2}(j_2)$ we can use the following formula with a trivial estimate for the remainder term:

$\begin{equation} \begin{aligned} &\sum_{(a_3,b_3)\in I_{1}(j_1)\times I_{2}(j_2)}\, \sideset{}{^\#}\sum_{c_1,c_2}[X\in\mathscr{M}_\ell(a,q,P)] =\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2} \\ &\qquad\times\int_{I_{1}(j_1)\times I_{2}(j_2)}da_3\,db_3 \int_{\mathbb{R}^2}[X\in\mathscr{M}_\ell(a,q,P)]\,dc_1\,dc_2+ O\biggl(\frac{P^{2}}{q^2r_1^2}\biggr). \end{aligned} \end{equation} \tag{ 7.16 }$

For $k=0$ equation (7.15) follows from (7.16). Hence, we assume that $k>0$ in what follows.

For fixed $a_2$ and $b_1$ the pair of variables $(a_3,b_3)$ can vary inside some rectangle $\Pi(a_2,b_1)$ , by the definition of the Minkowski matrices (3.3) and (3.4). Let $\Omega_{1,2}(a_2,b_1)$ denote the domain consisting of those rectangles of the partition (7.13) that are completely contained in $\Pi(a_2,b_1)$ , and let $\Omega_{1,2}'(a_2,b_1)$ be the domain consisting of the rectangles of (7.13) that have common points with the boundary of $\Pi(a_2,b_1)$ . Obviously, $\Omega_{1,2}'(a_2,b_1)$ consists of $O(r_1)$ rectangles of (7.13), and by (7.16) we have

$\begin{equation*} \begin{aligned} &\sum_{(a_3,b_3)\in \Omega_{1,2}'(a_2,b_1)}\ \sideset{}{^\#}\sum_{c_1,c_2}[X\in\mathscr{M}_\ell(a,q,P)] \\ &\qquad=\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2} \int_{\Omega_{1,2}'(a_2,b_1)}da_3\,db_3\int_{\mathbb{R}^2} [X\in\mathscr{M}_\ell(a,q,P)]\,dc_1\,dc_2+ O\biggl(\frac{P^{2+\varepsilon}}{q^2r_1}\biggr). \end{aligned} \end{equation*}$

Thus, to prove (7.15) it suffices to verify that

$\begin{equation} \begin{aligned} &\sum_{(a_3,b_3)\in \widetilde{W}_{k}(a_2,b_1)}\, \sideset{}{^\#}\sum_{c_1,c_2}[X\in\mathscr{M}_\ell(a,q,P)] \\ &\qquad=\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2} \int_{\widetilde{W}_{k}(a_2,b_1)}da_3\,db_3\int_{\mathbb{R}^2} [X\in\mathscr{M}_\ell(a,q,P)]\,dc_1\,dc_2 \\ &\qquad\qquad+O\biggl(r_1^3R_A\biggl(\frac{a}{r_1}\,,\frac{b}{r_1}\,, \frac{c}{r_1}\,,c\biggr)\biggr)+ O\biggl(\frac{P^{2+\varepsilon}}{q^2r_1}\biggr), \end{aligned} \end{equation} \tag{ 7.17 }$

where $\widetilde{W}_{k}(a_2,b_1)=W_k(b_1)\cap\Omega_{1,2}(a_2,b_1)$ .

Let us fix an arbitrary rectangle $I_{1}(j_1)\times I_{2}(j_2)\subset \widetilde{W}_{k}(a_2,b_1)$ . We apply Corollary 7.3 on each of the parallelepipeds $I_{1}(j_1)\times I_{2}(j_2)\times I_3(j_3)$ . Since $(a_3,b_3)\in W_k(b_1)$ ( $k>0$ ), we have $|a_1b_3-a_3b_1|\gg U=2^kq/r_1$ , and by Lemma 5.7 each of the functions $f_i$ defining the boundary of the set $\Omega(\overline A)$ satisfies the estimates

$\begin{equation} \frac{\partial f_i}{\partial a_3}\ll\frac{r_1}{2^k}\,\frac{P}{aq}\,,\quad \frac{\partial f_i}{\partial b_3}\ll\frac{r_1}{2^k}\,\frac{P}{bq}\,,\quad \frac{\partial f_i}{\partial c_1}\ll\frac{r_1}{2^k}\,\frac{P}{cq}\,. \end{equation} \tag{ 7.18 }$

Therefore, on each of the parallelepipeds $I_{1}(j_1)\times I_{2}(j_2)\times I_3(j_3)$ the values of the function $f_i$ (under the condition that its graph defines the boundary) are contained in an interval of length $V=O(P\cdot 2^{-k}q^{-1})$ . By Corollary 7.3,

$\begin{equation*} \begin{aligned} &\sum_{(a_3,b_3,c_1)\in I_{1}(j_1)\times I_{2}(j_2)\times I_3(j_3)}\, \sum_{c_2}[X\in\mathscr{M}_\ell(a,q,P)] \\ &\qquad=\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2} \int_{I_{1}(j_1)\times I_{2}(j_2)\times I_3(j_3)}da_3\,db_3\,dc_1 \int_{\mathbb{R}}dc_2\,[X\in\mathscr{M}_\ell(a,q,P)] \\ &\qquad\qquad+O\biggl(R_A\biggl(\frac{a}{r_1}\,,\frac{b}{r_1}\,, \frac{c}{r_1}\,,c\biggr)\biggr)+ O\biggl(\frac{P^{2}}{r_1^3q^2\cdot 2^k}\biggr). \end{aligned} \end{equation*}$

We arrive at (7.17) by summing the last equality over $j_3$ and over the rectangles $I_{1}(j_1)\times I_{2}(j_2)\subset \widetilde{W}_{k}(a_2,b_1)$ (the number of which is $O(r_1\cdot 2^k)$ ). Thus, (7.15) is also proved.

Substituting (7.15) into (7.14), we get that to prove the main formula (7.6) it suffices to verify that

$\begin{equation} \begin{aligned} \sideset{}{^\times}\sum_{a_2,b_1}F_k(a_2,b_1)&= \frac{K_a^\times(q)}{a^2}\int_{\mathbb{R}^2}F_k(a_2,b_1)\,da_2\,db_1+ O\biggl(\frac{P^{2+\varepsilon}}{ar_1}\biggr) \\ &\qquad+O\biggl(\frac{P^{2+\varepsilon}}{ar_2}\biggr)+ O\biggl(r_2^2\frac{P^{2}}{q}R\biggl(\frac{a}{r_2}\,, \frac{b}{r_2}\biggr)\biggr). \end{aligned} \end{equation} \tag{ 7.19 }$

We note that

$\begin{equation} \int_{\mathbb{R}^2}[X\in\mathscr{M}_\ell(a,q,P)]\,dc_1\,dc_2\ll \frac{P^2}{q^2}\,, \end{equation} \tag{ 7.20 }$

and therefore $F_k$ always satisfies the trivial estimate

$\begin{equation} F_k(a_2,b_1)\ll\frac{2^k}{r_1}\,\frac{P^2}{q}\,. \end{equation} \tag{ 7.21 }$

Hence, by Corollary 7.6, in each of the rectangles of the partition

$\begin{equation} I_{4}\times I_{5}=\bigsqcup_{j_4,j_5=0}^{r_2-1}I_{4}(j_4)\times I_{5}(j_5) \end{equation} \tag{ 7.22 }$

we can apply the formula

$\begin{equation} \begin{aligned} \sideset{}{^\times}\sum_{(a_2,b_1)\in I_{4}(j_4)\times I_{5}(j_5)} F_k(a_2,b_1)&=\frac{K_a^\times(q)}{a^2}\int_{I_{4}(j_4)\times I_{5}(j_5)} F_k(a_2,b_1)\,da_2\,db_1 \\ &\qquad+O\biggl(\frac{P^{2+\varepsilon}}{ar_2^2}\biggr)+ O\biggl(\frac{P^{2}}{q}R\biggl(\frac{a}{r_2}\,,\frac{b}{r_2}\biggr)\biggr). \end{aligned} \end{equation} \tag{ 7.23 }$

The boundary of the domain of variation of the pair of variables $(a_2,b_1)$ is determined by the conditions $b_2\geqslant a_1$ and $b_2\geqslant |b_1|$ (where $b_2=(q+a_2b_1)/a_1$ ) and therefore consists of parts of the graphs of the functions

$\begin{equation} b_1(a_2)=\frac{q-a_1^2}{a_2}\,,\qquad b_1(a_2)=\frac{q}{\pm a_1-a_2}\,. \end{equation} \tag{ 7.24 }$

Let $\Omega_{4,5}$ denote the domain consisting of those rectangles of the partition (7.22) that are completely contained in the domain of variation of $(a_2,b_1)$ , and let $\Omega_{4,5}'$ be the domain consisting of those rectangles of (7.22) that intersect the boundary of the domain of variation of $(a_2,b_1)$ . It follows from the monotonicity of the functions (7.24) that $\Omega_{4,5}'$ consists of $O(r_2)$ rectangles of (7.22). Using the formula (7.23) in each of them, we get that

$\begin{equation*} \begin{aligned} \sideset{}{^\times}\sum_{(a_2,b_1)\in \Omega_{4,5}'}F_k(a_2,b_1)&= \frac{K_a^\times(q)}{a^2}\int_{\Omega_{4,5}'}F_k(a_2,b_1)\,da_2\,db_1 \\ &\qquad+O\biggl(\frac{P^{2+\varepsilon}}{ar_2}\biggr)+ O\biggl(r_2\frac{P^{2}}{q}R\biggl(\frac{a}{r_2}\,,\frac{b}{r_2}\biggr)\biggr). \end{aligned} \end{equation*}$

Therefore, to prove (7.19) it is sufficient to verify the asymptotic formula

$\begin{equation*} \begin{aligned} \sideset{}{^\times}\sum_{(a_2,b_1)\in \Omega_{4,5}}F_k(a_2,b_1)&= \frac{K_a^\times(q)}{a^2}\int_{\Omega_{4,5}}F_k(a_2,b_1)\,da_2\,db_1 \\ &\qquad+O\biggl(\frac{P^{2+\varepsilon}}{ar_1}\biggr)+ O\biggl(\frac{P^{2+\varepsilon}}{ar_2}\biggr)+ O\biggl(r_2^2\frac{P^{2}}{q}R\biggl(\frac{a}{r_2}\,, \frac{b}{r_2}\biggr)\biggr). \end{aligned} \end{equation*}$

This equation in turn is a consequence of the fact that for any rectangle $I_{4}(j_4)\times I_{5}(j_5)\subset \Omega_{4,5}$ we have

$\begin{equation*} \begin{aligned} &\sideset{}{^\times}\sum_{(a_2,b_1)\in I_{4}(j_4)\times I_{5}(j_5)} F_k(a_2,b_1)=\frac{K_a^\times(q)}{a^2}\int_{I_{4}(j_4)\times I_{5}(j_5)} F_k(a_2,b_1)\,da_2\,db_1 \\ &\qquad+O\biggl(\frac{P^{2+\varepsilon}}{ar_1r_2^2}\biggr)+ O\biggl(\frac{P^{2+\varepsilon}}{ar_2^3}\biggr)+ O\biggl(\frac{P^{2}}{q}R\biggl(\frac{a}{r_2}\,,\frac{b}{r_2}\biggr)\biggr). \end{aligned} \end{equation*}$

The last relation follows from Corollary 7.5 under the condition that the values of $F_k(a_2,b_1)$ vary within an interval of length

$\begin{equation} V\ll\frac{P^2}{q}\biggl(\frac{1}{r_1}+\frac{1}{r_2}\biggr). \end{equation} \tag{ 7.25 }$

We now verify this condition.

Since $(a_3,b_3)\in W_k(b_1)$ , it follows from Lemma 5.7 that each of the functions $f_i$ defining the boundary of the domain $\Omega(\overline A)$ (for $b_2=(q+a_2b_1)a_1^{-1}$ ) satisfies estimates analogous to (7.18):

$\begin{equation*} \frac{\partial f_i}{\partial a_2}\ll\frac{r_1}{2^k}\,\frac{P}{aq}\,,\qquad \frac{\partial f_i}{\partial b_1}\ll\frac{r_1}{2^k}\,\frac{P}{bq}\,. \end{equation*}$

Hence, for $(a_2,b_1)\in I_{4}(j_4)\times I_{5}(j_5)$ the area of the domain $\Omega(\overline A)$ can vary within an interval of length $O\biggl(\dfrac{r_1}{2^k}\,\dfrac{P^2}{r_2q^2}\biggr)$ . This implies that the values of $F_k(a_2,b_1)$ can vary in an interval of length

$\begin{equation*} O\biggl(\operatorname{Area}(W_k(b_1))\,\frac{r_1}{2^k}\, \frac{P^2}{r_2q^2}\biggr)=O\biggl(\frac{P^2}{r_2q}\biggr), \end{equation*}$

which agrees with the estimate (7.25).

The values of $F_k(a_2,b_1)$ can also vary because the domain $W_k$ changes: different domains $W_k(b_1)$ may correspond to different values of $b_1$ . Assume that $b_1,b_1'\in I_5(j_5)$ and $W_k(b_1)\ne W_k(b_1')$ . For example, suppose that $I_{1}(j_1)\times I_{2}(j_2)\subset W_k(b_1)\setminus W_k(b_1')$ , that is,

$\begin{equation} H(j_1,j_2;b_1)\in(2^{k-1},2^k],\qquad H(j_1,j_2;b_1')\notin(2^{k-1},2^k], \end{equation} \tag{ 7.26 }$

where the function $H$ is defined by (7.8). The values of $H$ for fixed $j_1,j_2$ and different $b_1,b_1'$ can differ by $O(r_1/r_2)$ . Moreover, $H(j_1,j_2+1;b_1)-H(j_1,j_2;b_1)\gg 1$ . Therefore, for a fixed $j_1$ the number of indices $j_2$ for which the conditions (7.26) can hold can be estimated as $O(r_1/r_2+1)$ . Thus, the areas of the figures $W_k(b_1)$ and $W_k(b_1')$ differ by at most

$\begin{equation*} O\biggl(\!\biggl(\frac{r_1}{r_2}+1\biggr)r_1\,\frac{q}{r_1^2}\biggr)= O\biggl(q\biggl(\frac{1}{r_1}+\frac{1}{r_2}\biggr)\!\biggr). \end{equation*}$

This fact and the estimate (7.20) imply that the values of $F_k(a_2,b_1)$ can then vary in an interval of length

$\begin{equation*} O\biggl(\frac{P^2}{q^2}q\biggl(\frac{1}{r_1}+\frac{1}{r_2}\biggr)\!\biggr)= O\biggl(\frac{P^2}{q}\biggl(\frac{1}{r_1}+\frac{1}{r_2}\biggr)\!\biggr), \end{equation*}$

which again agrees with the estimate (7.25). $\square$

§ 8. Third variant of estimation of the remainder

Theorem 8.1. Suppose that a matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2\end{pmatrix}$ and a coefficient $a_3$ are fixed, $q=\det A\ne 0$ , $(a_1,a_2,b_1,b_2)=1$ , $D=(a_1,a_2)$ , $(D,a_3)=1$ , and a matrix $X$ has the form (1.8). Then any parallelepiped

$\begin{equation*} I=[Y_2,Y_2+Z_2)\times [Y_3,Y_3+Z_3)\times[Y_4,Y_4+Z_4) \end{equation*}$

with $1\leqslant Z_2\leqslant q$ and $1\leqslant Z_3, Z_4\leqslant Pq$ satisfies

$\begin{equation} \sum_{(b_3,c_1,c_2)\in I}\sideset{}{^\#}\sum_{c_3}[\det X=P]= \frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2}\,\frac{D}{\varphi(D)}\, Z_2Z_3Z_4+O(R_A(Z_2,Z_3,Z_4)), \end{equation} \tag{ 8.1 }$

where

$\begin{equation*} \begin{aligned} R_A(Z_2,Z_3,Z_4)&=(P,q)(Pq)^{\varepsilon}\biggl(\frac{Z_2Z_3Z_4}{q} \biggl(\frac{1}{Z_2}+\frac{1}{Z_3}+\frac{1}{Z_4}\biggr)D \\ &\qquad+q^{1/2}+\lambda_2^{1/2}(A)(Z_3+Z_4)\biggr) \end{aligned} \end{equation*}$

and $\lambda_2(A)$ is defined in (7.1).

For the proof, see [23].

Corollary 8.2. Suppose that the hypotheses of Theorem 8.1 hold, the dimensions of the rectangle

$\begin{equation*} I'=[Y_2,Y_2+Z_2)\times[Y_3,Y_3+Z_3) \end{equation*}$

satisfy the inequalities $1\leqslant Z_2\leqslant q$ and $1\leqslant Z_3\leqslant Pq$ , and for $(b_3,c_1)\in I'$ the values of a non-negative function $f(b_3,c_1)$ vary within an interval $[Z_4,Z_4+V)$ , where $1\leqslant V\ll Z_4\leqslant Pq$ . Then

$\begin{equation*} \begin{aligned} \sum_{(b_3,c_1)\in I'}\,\sum_{0<c_2\leqslant f(b_3,c_1)}\, \sideset{}{^\#}\sum_{c_3}[\det X=P]&=\frac{\varphi^2(P)}{P^2}\, \frac{\varphi(q)}{q^2}\,\frac{D}{\varphi(D)}\int_{I'}f(b_3,c_1)\,db_3\,dc_1 \\ &\qquad+O(R_A(Z_2,Z_3,Z_4))+O\biggl(\frac{Z_2Z_3V}{q}\biggr). \end{aligned} \end{equation*}$

The proof is similar to the proof of Corollary 7.3.

Proposition 8.3. Let $r$ be a positive integer such that $1\leqslant r\leqslant a$ . Then the sum $\mathscr{N}_\ell(a,q,P)$ in (5.7) satisfies the asymptotic formula

$\begin{equation*} \begin{aligned} \mathscr{N}_\ell(a,q,P)&=\varphi^2(P)\biggl(\frac{\varphi(q)}{q^2} \int_{\mathbb{R}^4}d\boldsymbol{\beta}\,d\boldsymbol{\gamma} \sideset{}{^\#}\sum_{a_2,a_3}c(q,D) \frac{[X''\in\mathscr{M}_\ell''(a,q,P)]}{(\det X'')^3}+ \rho_3(a,q)\biggr) \\ &\qquad+O\bigl(R_3(a,q,P)\bigr), \end{aligned} \end{equation*}$

where $c(q,D)$ is defined by (6.18), $\rho_3(a,q)\ll aq^{-2+\varepsilon}$ ,

$\begin{equation*} R_3(a,q,P)=r^2a\sum_{a_2,b_1} R_A\biggl(\frac{b}{r}\,,\frac{c}{r}\,,c\biggr) +\frac{P^{2+\varepsilon}}{raq}, \end{equation*}$

and $R_A(Z_2,Z_3,Z_4)$ is the remainder in Theorem 8.1.

Proof. The first part of the proof in essence repeats the proof of Proposition 7.7.

The intervals $I_2=[Y_2,Y_2+Z_2)$ and $I_3=[Y_3,Y_3+Z_3)$ (see (7.7)) in which the respective variables $b_3$ and $c_1$ vary are divided into $r$ equal intervals:

$\begin{equation} I_\nu=\bigsqcup_{j=0}^{r-1}I_{\nu}(j),\quad I_{\nu}(j)=\biggl[Y_\nu+\frac{j}{r}Z_{\nu},Y_\nu+ \frac{j+1}{r}Z_{\nu}\biggr)\qquad (\nu=2,3). \end{equation} \tag{ 8.2 }$

We define a function

$\begin{equation*} H(j;b_1)=\frac{r}{q}\inf_{b_3\in I_2(j)}|a_1b_3-a_3b_1| \qquad (0\leqslant j<r). \end{equation*}$

Let the interval $I_{2}$ be represented as $I_{2}=\bigsqcup_{k=0}^m W_k(b_1)$ , where $m\ll \log a$ ,

$\begin{equation*} W_k(b_1)=\bigsqcup_{\substack{j=0\\ H(j;b_1)\in(2^{k-1},2^k]}}^{r-1} I_{2}(j)\quad (k>0),\qquad W_0(b_1)=\bigsqcup_{\substack{j=0\\ H(j;b_1)\in[0,1]}}^{r-1}I_{2}(j). \end{equation*}$

We write the sum $\mathscr{N}_{\ell}(a,q,P)$ in the form

$\begin{equation} \mathscr{N}_{\ell}(a,q,P)=\sum_{k=0}^{m}\mathscr{N}_{\ell,k}(a,q,P), \end{equation} \tag{ 8.3 }$

where

$\begin{equation} \begin{aligned} \mathscr{N}_{\ell,k}(a,q,P)&=\sum_{a_2,a_3,b_1}G_k(a_2,a_3,b_1), \\ G_k(a_2,a_3,b_1)&=\sum_{b_3\in W_k(b_1)}\ \sideset{}{^\#}\sum_{c_1,c_2}[X\in\mathscr{M}_\ell(a,q,P)]. \end{aligned} \end{equation} \tag{ 8.4 }$

Let us now verify the asymptotic formula

$\begin{equation} \begin{aligned} G_k(a_2,a_3,b_1)&=\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2}\, \frac{D}{\varphi(D)}\int_{W_k(b_1)}db_3\int_{\mathbb{R}^2} [X\in\mathscr{M}_\ell(a,q,P)]\,dc_1\,dc_2 \\ &\qquad+O\biggl(r\cdot 2^kR\biggl(\frac{b}{r}\,,\frac{c}{r}\,,c\biggr)\biggr) +O\biggl(\frac{P^{2+\varepsilon}}{raq^2}\biggr). \end{aligned} \end{equation} \tag{ 8.5 }$

To this end we first estimate the number of intervals $I_{2}(j)$ contained in each of the sets $W_k(b_1)$ . If $I_{2}(j)\subset W_k(b_1)$ , then there exists a point $b_3\in I_{2}(j)$ for which the inequality

$\begin{equation} |a_1b_3-a_3b_1|\leqslant\frac{2^{k+1}q}{r} \end{equation} \tag{ 8.6 }$

holds. Since $a_1=a$ , this inequality defines a segment of length $2^{k+2}q/(ar)$ on the $Ob_3$ -axis. Therefore, there exist $O(2^k)$ intervals $I_{2}(j)$ which have common points with this segment, that is, the set $W_k(b_1)$ consists of $O(2^k)$ intervals of the partition.
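The dyadic grouping of the intervals $I_2(j)$ can be illustrated numerically. The following Python sketch is not part of the proof: the helper names and all numerical values are assumptions chosen for illustration. It computes $H(j;b_1)$ exactly in rational arithmetic, using the fact that $|a_1b_3-a_3b_1|$ is piecewise linear in $b_3$ , and counts how many intervals fall into each class $W_k(b_1)$ .

```python
from collections import Counter
from fractions import Fraction


def dyadic_class(h):
    """Return k with h in (2^(k-1), 2^k] for h > 1, and k = 0 for h in [0, 1]."""
    k = 0
    while h > 2 ** k:
        k += 1
    return k


def group_intervals(a1, a3, b1, q, r, Y2, Z2):
    """Count, for each k, the intervals I_2(j) whose H(j; b1) lies in the k-th range."""
    counts = Counter()
    zero = Fraction(a3 * b1, a1)  # point where a1*b3 - a3*b1 vanishes
    for j in range(r):
        left = Y2 + Fraction(j, r) * Z2
        right = Y2 + Fraction(j + 1, r) * Z2
        if left <= zero <= right:
            inf_val = Fraction(0)
        else:
            # the linear form has constant sign on the interval, so the
            # infimum of its absolute value is approached at an endpoint
            inf_val = min(abs(a1 * left - a3 * b1), abs(a1 * right - a3 * b1))
        counts[dyadic_class(Fraction(r, q) * inf_val)] += 1
    return counts
```

For the hypothetical values `group_intervals(50, 17, 23, 4000, 40, 0, 80)` every class $k$ contains at most $2^{k+2}q/(a_1Z_2)+2$ intervals, which is consistent with the $O(2^k)$ count obtained above.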

It follows from (6.10) that for any interval $I_{2}(j)$ we have the following asymptotic formula with a trivial estimate of the remainder term:

$\begin{equation} \begin{aligned} &\sum_{b_3\in I_{2}(j)}\ \sideset{}{^\#}\sum_{c_1,c_2} [X\in\mathscr{M}_\ell(a,q,P)]=\frac{\varphi^2(P)}{P^2}\, \frac{\varphi(q)}{q^2}\,\frac{D}{\varphi(D)} \\ &\qquad\times\int_{I_{2}(j)}db_3\int_{\mathbb{R}^2} [X\in\mathscr{M}_\ell(a,q,P)]\,dc_1\,dc_2+ O\biggl(\frac{P^{2+\varepsilon}}{raq^2}\biggr). \end{aligned} \end{equation} \tag{ 8.7 }$

For $k=0$ equation (8.5) follows from (8.7). Hence, in what follows we assume that $k>0$ .

Equation (8.5) will be proved if for any interval $I_{2}(j)\subset W_k(b_1)$ we show that

$\begin{equation} \begin{aligned} &\sum_{b_3\in I_{2}(j)}\ \sideset{}{^\#}\sum_{c_1,c_2} [X\in\mathscr{M}_\ell(a,q,P)] \\ &\qquad=\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2}\, \frac{D}{\varphi(D)}\int_{I_{2}(j)}db_3\int_{\mathbb{R}^2} [X\in\mathscr{M}_\ell(a,q,P)]\,dc_1\,dc_2 \\ &\qquad\qquad+O\biggl(rR\biggl(\frac{b}{r}\,,\frac{c}{r}\,,c\biggr)\biggr)+ O\biggl(\frac{P^{2+\varepsilon}}{r\cdot 2^kaq^2}\biggr). \end{aligned} \end{equation} \tag{ 8.8 }$

(For the intervals $I_{2}(j)$ that intersect $W_k(b_1)$ only partially, it suffices to use (8.7).)

If $b_3\in W_k(b_1)$ , then $|a_1b_3-a_3b_1|\gg U={2^kq}/{r}$ . Therefore, it follows from the conditions (5.6) that for each of the functions $f_i$ whose graphs define the boundary of $\Omega(\overline A)$ we have the estimates

$\begin{equation*} \frac{\partial f_i}{\partial b_3}\ll\frac{r}{2^k}\,\frac{P}{bq}\,,\qquad \frac{\partial f_i}{\partial c_1}\ll\frac{r}{2^k}\,. \end{equation*}$

Hence, on each of the rectangles $I_2(j_2)\times I_3(j_3)$ the values of the functions $f_i$ (under the condition that the corresponding part of the graph defines the boundary) are contained in an interval of length $V=O(Pq^{-1}\cdot 2^{-k})$ . On each of the rectangles $I(j_2,j_3)=I_{2}(j_2)\times I_{3}(j_3)$ with $I_{2}(j_2)\subset W_k(b_1)$ we use Corollary 8.2:

$\begin{equation*} \begin{aligned} &\sum_{(b_3,c_1)\in I_2(j_2)\times I_3(j_3)}\,\sideset{}{^\#}\sum_{c_2} [X\in\mathscr{M}_\ell(a,q,P)] \\ &\qquad=\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2}\,\frac{D}{\varphi(D)} \int_{I_2(j_2)\times I_3(j_3)}db_3\,dc_1\int_{\mathbb{R}}dc_2\, [X\in\mathscr{M}_\ell(a,q,P)] \\ &\qquad\qquad+O\biggl(R\biggl(\frac{b}{r}\,,\frac{c}{r}\,,c\biggr)\biggr)+ O\biggl(\frac{P^{2}}{2^kr^2aq^2}\biggr). \end{aligned} \end{equation*}$

Summing the resulting equation over $j_3$ , we arrive at (8.8). Thus, (8.5) is proved.

The remaining steps differ from those in the proof of Proposition 7.7.

We get from (8.3)–(8.5) that

$\begin{equation*} \begin{aligned} \mathscr{N}_\ell(a,q,P)&=\frac{\varphi^2(P)}{P^2}\,\frac{\varphi(q)}{q^2}\, \sideset{}{^\#}\sum_{a_2,a_3,b_1}\frac{D}{\varphi(D)}\int_{\mathbb{R}^3} [X\in\mathscr{M}_\ell(a,q,P)]\,db_3\,dc_1\,dc_2 \\ &\qquad+O(R_3(a,q,P)). \end{aligned} \end{equation*}$

Using (6.21), we pass to the variables $\gamma_1,\gamma_2$ :

$\begin{equation*} \begin{aligned} \mathscr{N}_\ell(a,q,P)&=\varphi^2(P)\frac{\varphi(q)}{q}\, \sideset{}{^\#}\sum_{a_2,a_3,b_1}\frac{D}{\varphi(D)} \int_{\mathbb{R}^2}d\boldsymbol{\gamma}\int_{\mathbb{R}} \frac{db_3[X\in\mathscr{M}_\ell(a,q,P)]}{(\det X')^3} \\ &\qquad+O(R_3(a,q,P)) \\ &=\varphi^2(P)\frac{\varphi(q)}{q}\int_{\mathbb{R}^2} S(a,q,\boldsymbol{\gamma})\,d\boldsymbol{\gamma}+O(R_3(a,q,P)), \end{aligned} \end{equation*}$

where

$\begin{equation} S(a,q,\boldsymbol{\gamma})=\,\sideset{}{^\#}\sum_{a_2,a_3,b_1} \frac{D}{\varphi(D)}\int_{\mathbb{R}} \frac{db_3\,[X\in\mathscr{M}_\ell(a,q,P)]}{(\det X')^3}\,. \end{equation} \tag{ 8.9 }$

To prove the proposition it remains to verify that the sum $S(a,q,\boldsymbol{\gamma})$ satisfies the asymptotic formula

$\begin{equation} S(a,q,\boldsymbol{\gamma})=\frac{1}{q}\int_{\mathbb{R}^2}d\boldsymbol{\beta} \sideset{}{^\#}\sum_{a_2,a_3}c(q,D) \frac{[X''\in\mathscr{M}_\ell''(a,q,P)]}{(\det X'')^3}+ O(aq^{-2+\varepsilon})\,, \end{equation} \tag{ 8.10 }$

where $c(q,D)$ is defined by (6.18).

Suppose that the values of $a_2$ , $a_3$ , $b_3$ , $\gamma_1$ , and $\gamma_2$ are fixed and consider the sum

$\begin{equation*} \sigma(I)=\sum_{b_1\in I}\frac{[(a_1,a_2,b_1,b_2)=1]}{(\det X')^3}\,. \end{equation*}$

Let $(u_0,v_0)$ be a particular solution of the equation $\begin{vmatrix} a_1 & a_2 \\ u & v \end{vmatrix}=D$ . Then

$\begin{equation*} \begin{vmatrix} a_1 & a_2 \\ u_0q/D & v_0q/D \end{vmatrix}=q, \end{equation*}$

and all the solutions of the equation $\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}=q$ with respect to the unknowns $b_1,b_2$ can be written in the form

$\begin{equation*} \begin{pmatrix} b_1(t) \\ b_2(t) \end{pmatrix}=\begin{pmatrix} u_0 q/D+t a_1/D \\ v_0 q/D+t a_2/D \end{pmatrix}=\begin{pmatrix} a_1/D& u_0 \\ a_2/D&v_0 \\ \end{pmatrix}\begin{pmatrix} t \\ q/D \end{pmatrix}\qquad (t\in\mathbb{Z}). \end{equation*}$

Since $\begin{pmatrix} a_1/D& u_0\\ a_2/D&v_0 \end{pmatrix} \in SL_2(\mathbb{Z})$ , we have

$\begin{equation*} (b_1(t),b_2(t))=(t,q/D)\quad\text{and}\quad (a_1,a_2,b_1(t),b_2(t))=(D,t,q/D). \end{equation*}$

Therefore,

$\begin{equation*} \begin{aligned} \sigma(I)&=\sum_{t:\,b_1(t)\in I}\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1(t) & b_2(t) & b_3 \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}\biggl[\biggl(D,t,\frac{q}{D}\biggr)=1\biggr] \\ &=\sum_{\delta\mid(D,q/D)}\mu(\delta)\sum_{t:\,b_1(\delta t)\in I}F(t), \end{aligned} \end{equation*}$

where

$\begin{equation*} F(t)=\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1(\delta t) & b_2(\delta t) & b_3 \\ \gamma_1&\gamma_2&1 \end{vmatrix}^{-3}. \end{equation*}$

This function has finitely many intervals of monotonicity and satisfies the estimate $F(t)\ll q^{-3}$ . By Lemma 6.6,

$\begin{equation*} \begin{aligned} \sigma(I)&=\sum_{\delta\mid(D,q/D)}\mu(\delta)\biggl(\frac{D}{\delta a} \int_{I}\frac{db_1}{(\det X')^3}+O\biggl(\frac{1}{q^3}\biggr)\biggr) \\ &=\frac{D}{a}\,\frac{\varphi((D,q/D))}{(D,q/D)}\int_{I} \frac{db_1}{(\det X')^3}+O(q^{-3+\varepsilon}). \end{aligned} \end{equation*}$
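Two elementary facts used in this computation can be checked mechanically: the parametrization $(b_1(t),b_2(t))$ with the identity $(b_1(t),b_2(t))=(t,q/D)$ , and the Möbius sum $\sum_{\delta\mid n}\mu(\delta)/\delta=\varphi(n)/n$ used in the last step. The Python sketch below is an illustration only; the function names are assumptions, and the particular solution $(u_0,v_0)$ is obtained from the extended Euclidean algorithm.

```python
from fractions import Fraction
from math import gcd


def ext_gcd(x, y):
    """Return (g, s, t) with s*x + t*y = g = gcd(x, y) for non-negative x, y."""
    if y == 0:
        return x, 1, 0
    g, s, t = ext_gcd(y, x % y)
    return g, t, s - (x // y) * t


def solutions(a1, a2, q):
    """Return D = gcd(a1, a2) and the map t -> (b1(t), b2(t)) with a1*b2 - a2*b1 = q."""
    D, s, t0 = ext_gcd(a1, a2)
    if q % D:
        raise ValueError("q must be divisible by D = gcd(a1, a2)")
    # a1*s + a2*t0 = D, so (u0, v0) = (-t0, s) satisfies a1*v0 - a2*u0 = D
    u0, v0 = -t0, s

    def b(t):
        return u0 * (q // D) + t * (a1 // D), v0 * (q // D) + t * (a2 // D)

    return D, b


def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime divides the argument
            result = -result
        p += 1
    return -result if n > 1 else result


def phi(n):
    """Euler's function by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def mobius_sum(n):
    """Exact value of the sum of mu(delta)/delta over the divisors delta of n."""
    return sum(Fraction(mobius(d), d) for d in range(1, n + 1) if n % d == 0)
```

With the hypothetical values $a_1=12$ , $a_2=18$ , $q=30$ one gets $D=6$ , and for every integer $t$ the pair returned by `b(t)` has determinant $30$ against $(a_1,a_2)$ and greatest common divisor $(t,5)$ .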

The formula for $\sigma(I)$ thus obtained lets us replace the summation over the variable $b_1$ in (8.9) by integration. Using the definition (6.18) of $c(q,D)$ , we find that

$\begin{equation*} S(a,q,\boldsymbol{\gamma})=\sideset{}{^\#}\sum_{a_2,a_3}\frac{c(q,D)}{a} \int_{\mathbb{R}^2}\frac{db_1\,db_3\,[X'\in\mathscr{M}'_\ell(a,q,P)]} {(\det X')^3}+O(aq^{-2+\varepsilon}). \end{equation*}$

Passing from the variables $b_1,b_3$ to the variables $\beta_1,\beta_3$ and using (6.16), we obtain the required formula (8.10). Proposition 8.3 is proved. $\square$

§ 9. Completion of the proof of the main result

9.1. On reduced bases in planar lattices

Lemma 9.1. Let $Q$ be a positive integer, let $a_1,\dots,a_Q$ and $\lambda_1\leqslant\cdots\leqslant\lambda_Q$ be real numbers, and let $g$ be a continuously differentiable function defined on the segment $[\lambda_1,\lambda_Q]$ . Then

$\begin{equation} \sum_{j=1}^{Q}a_j\,g(\lambda_j)=g(\lambda_1)\sum_{j=1}^{Q}a_j+ \int_{\lambda_1}^{\lambda_Q}g'(\lambda)\sum_{j=1}^{Q} a_j[\lambda_j\geqslant\lambda]\,d\lambda. \end{equation} \tag{ 9.1 }$

Proof. To verify the assertion of the lemma it suffices to transform the integral on the right-hand side of (9.1):

$\begin{equation*} \begin{aligned} &\int_{\lambda_1}^{\lambda_Q}g'(\lambda)\sum_{j=1}^{Q} a_j[\lambda_j\geqslant\lambda]\,d\lambda=\sum_{j=1}^{Q}a_j \int_{\lambda_1}^{\lambda_Q}[\lambda\leqslant\lambda_j]g'(\lambda)\,d\lambda \\ &\qquad=\sum_{j=1}^{Q}a_j\int_{\lambda_1}^{\lambda_j}g'(\lambda)\,d\lambda= \sum_{j=1}^{Q}a_j\bigl(g(\lambda_j)-g(\lambda_1)\bigr).\qquad\qquad\square \end{aligned} \end{equation*}$

Corollary 9.2. Suppose that the hypotheses of Lemma 9.1 hold and $a_1=\dots=a_Q= 1$ . Then

$\begin{equation*} \sum_{j=1}^{Q}g(\lambda_j)=Qg(\lambda_1)+\int_{\lambda_1}^{\lambda_Q} g'(\lambda)\sum_{j=1}^{Q}[\lambda_j\geqslant\lambda]\,d\lambda. \end{equation*}$
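The identity (9.1) lends itself to an exact check. The sketch below is an illustration and not part of the original text; the names `lhs` and `rhs` are assumptions. It evaluates the integral on the right-hand side of (9.1) piece by piece: between two consecutive points $\lambda_i$ and $\lambda_{i+1}$ the weight $\sum_j a_j[\lambda_j\geqslant\lambda]$ is constant, so each piece equals an increment of $g$ times a tail sum of the $a_j$ .

```python
from fractions import Fraction


def lhs(a, lam, g):
    """Left-hand side of (9.1): the sum of a_j * g(lambda_j)."""
    return sum(aj * g(lj) for aj, lj in zip(a, lam))


def rhs(a, lam, g):
    """Right-hand side of (9.1) for sorted lam, with the integral of g' done exactly.

    On (lam[i], lam[i+1]] the weight sum_j a_j [lam_j >= lambda] equals
    a[i+1] + ... + a[-1], so that piece contributes
    (g(lam[i+1]) - g(lam[i])) times this tail sum.
    """
    total = g(lam[0]) * sum(a)
    for i in range(len(lam) - 1):
        tail = sum(a[i + 1:])
        total += (g(lam[i + 1]) - g(lam[i])) * tail
    return total
```

Taking all $a_j=1$ gives the special case of Corollary 9.2; repeated values among the $\lambda_j$ are allowed, since the corresponding pieces have zero length.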

Lemma 9.3. Let $t\geqslant 1$ be a real number. Then the number of lattices $\Lambda\subset\mathbb{Z}^2$ such that $\det\Lambda=q$ and $\lambda_2(\Lambda)\geqslant t\sqrt{q}$ can be estimated as $O(q^{1+\varepsilon}t^{-2})$ .

Proof. Since $\lambda_2(\Lambda)\leqslant q$ , the assertion of the lemma obviously holds for $t>\sqrt{q}$ . Thus, in what follows we assume that $t\leqslant\sqrt{q}$ . Let $(e_1,e_2)$ be a reduced basis of the lattice $\Lambda$ with determinant $q$ (that is, $e_1$ is a shortest vector of the lattice, and the projection of $e_2$ onto $e_1$ lies between $-e_1/2$ and $e_1/2$ ). Then it follows from the inequality $\|e_2\|\geqslant t\sqrt{q}$ that $\|e_1\|\leqslant 2\sqrt{q}/t$ . For a fixed vector $e_1=(x_1,y_1)$ the endpoint of the vector $e_2=(x_2,y_2)$ must lie on the straight line $\{(x,y)\colon xy_1-yx_1= q\}$ and satisfy the condition $-\|e_1\|^2/2<(e_1,e_2)\leqslant\|e_1\|^2/2$ , where $(e_1,e_2)$ here denotes the inner product. This condition cuts out a half-open segment of length $\|e_1\|$ on the line, and consecutive integer points of the line differ by the vector $e_1/(x_1,y_1)$ , where $(x_1,y_1)$ now denotes the greatest common divisor of $x_1$ and $y_1$ . Therefore, the vector $e_2$ can be chosen in at most $(x_1,y_1)$ ways. Thus,

$\begin{equation*} \begin{aligned} &\sum_{\Lambda:\det\Lambda=q}[\lambda_2(\Lambda)\geqslant t\sqrt{q}\,]\ll \sum_{0\leqslant x_1,y_1\leqslant2\sqrt{q}/t}(x_1,y_1) \\ &\qquad\ll \sum_{d\mid q,d\leqslant 2\sqrt{q}/t}\,d\, \sum_{0\leqslant x_1,y_1\leqslant2\sqrt{q}/t}[d\mid(x_1,y_1)] \\ &\qquad\ll \sum_{d\mid q,d\leqslant 2\sqrt{q}/t}d \biggl(\frac{\sqrt{q}}{dt}+1\biggr)^2\ll q^{1+\varepsilon}t^{-2}.\qquad\qquad\square \end{aligned} \end{equation*}$

Lemma 9.4. Let $\beta\leqslant 2$ . Then

$\begin{equation*} \sum_{\Lambda:\det\Lambda=q}\lambda_2^\beta(\Lambda)\ll q^{1+\beta/2+\varepsilon}. \end{equation*}$

Proof. We use Corollary 9.2, setting $g(\lambda)=\lambda^\beta$ . As $\lambda_1,\dots,\lambda_Q$ we choose the numbers $\lambda_2(\Lambda)$ taken over all lattices $\Lambda$ with determinant $q$ and arranged in ascending order. Furthermore, $Q=\sigma(q)\ll q^{1+\varepsilon}$ , where $\sigma(q)$ is the sum of the divisors of $q$ , and $\lambda_1\ll\sqrt{q}$ (we can always choose a vector $e_1$ with length of order $\sqrt{q}$ and complete it to a reduced basis with determinant $q$ ). Hence,

$\begin{equation*} \begin{aligned} \sum_{\Lambda:\det\Lambda=q}\lambda_2^\beta(\Lambda)&\ll q^{1+\beta/2+\varepsilon}+\int_{\sqrt{q}}^{q}\lambda^{\beta-1} \sum_{\Lambda:\det\Lambda=q}[\lambda_2(\Lambda)\geqslant\lambda]\,d\lambda \\ &\ll q^{1+\beta/2+\varepsilon}+\sqrt{q} \int_{1}^{\sqrt{q}} (t\sqrt{q}\,)^{\beta-1}\sum_{\Lambda:\det\Lambda=q} \biggl[\frac{\lambda_2(\Lambda)}{\sqrt{q}}\geqslant t\biggr]\,dt. \end{aligned} \end{equation*}$

Substituting the estimate in Lemma 9.3 into the last integral, we arrive at the assertion of the lemma. $\square$
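Lemmas 9.3 and 9.4 can be explored by brute force for small $q$ . The following Python sketch is an illustration under stated assumptions, not the method of the paper: it enumerates the sublattices of $\mathbb{Z}^2$ of determinant $q$ through bases $(a,b),(0,c)$ with $ac=q$ and $0\leqslant b<c$ (Hermite normal form), and computes $\lambda_2(\Lambda)$ by Lagrange–Gauss reduction. All function names are hypothetical.

```python
def gauss_reduce(u, v):
    """Lagrange-Gauss reduction of a basis of a planar lattice.

    Returns the squared lengths (|e1|^2, |e2|^2) of a reduced basis,
    that is, the squares of the two successive minima.
    """
    def n2(w):
        return w[0] * w[0] + w[1] * w[1]

    if n2(u) > n2(v):
        u, v = v, u
    while True:
        dot = u[0] * v[0] + u[1] * v[1]
        m = (2 * dot + n2(u)) // (2 * n2(u))  # nearest integer to dot / |u|^2
        v = (v[0] - m * u[0], v[1] - m * u[1])
        if n2(v) >= n2(u):
            return n2(u), n2(v)
        u, v = v, u


def lambda2_squared(q):
    """Values of lambda_2(Lambda)^2 over all sublattices of Z^2 with determinant q."""
    values = []
    for c in range(1, q + 1):
        if q % c == 0:
            a = q // c
            for b in range(c):
                values.append(gauss_reduce((a, b), (0, c))[1])
    return values


def count_large(q, t):
    """Number of lattices with lambda_2 >= t * sqrt(q), as counted in Lemma 9.3."""
    return sum(1 for v in lambda2_squared(q) if v >= t * t * q)


def moment(q, beta):
    """Sum of lambda_2^beta over all lattices of determinant q, as in Lemma 9.4."""
    return sum(v ** (beta / 2) for v in lambda2_squared(q))
```

The list returned by `lambda2_squared(q)` has $\sigma(q)$ entries, and comparing `count_large(q, t)` with $q/t^2$ , or `moment(q, beta)` with $q^{1+\beta/2}$ , for a range of small $q$ gives a feeling for the two estimates; such experiments of course say nothing about the implied constants.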

Corollary 9.5. Let $\beta\leqslant 2$ and let

$\begin{equation*} \mathscr{A}_\ell(a,q)=\left\{\begin{pmatrix} a_1&a_2 \\ b_1&b_2 \end{pmatrix}\!\colon \begin{pmatrix} a_1&a_2&a_3 \\ b_1&b_2&b_3 \\ c_1&c_2&c_3 \end{pmatrix}\in\mathscr{M}_\ell(a,q,P)\right\}. \end{equation*}$

Then

$\begin{equation*} \sum_{a,a_2,b_1}\lambda_2^\beta(A)[A\in\mathscr{A}_\ell(a,q)]\ll q^{1+\beta/2+\varepsilon}. \end{equation*}$

Proof. We transform this sum:

$\begin{equation*} \begin{aligned} &\sum_{a,a_2,b_1}\lambda_2^\beta(A)[A\in\mathscr{A}_\ell(a,q)] \\ &\qquad\leqslant\sum_{\Lambda:\det\Lambda=q}\lambda_2^\beta(\Lambda) \sum_{a,a_2,b_1}[A\in\mathscr{A}_\ell(a,q),\Lambda(A)=\Lambda]. \end{aligned} \end{equation*}$

By property 3 of reduced matrices, to each term in the inner sum there corresponds some Voronoi basis of the lattice $\Lambda$ , and to each Voronoi basis there correspond at most four matrices $A$ . Consequently, the inner sum can be estimated up to a constant by the number of minimal bases, that is,

$\begin{equation*} \sum_{a,a_2,b_1}[A\in\mathscr{A}_\ell(a,q),\Lambda(A)=\Lambda]\ll l\biggl(\frac{a}{q}\biggr)\ll\log(q+1). \end{equation*}$

Estimating the remaining sum by Lemma 9.4, we arrive at the assertion of the corollary. $\square$

Corollary 9.6. Suppose that $\beta\leqslant 2$ , $0< A_1\leqslant A_2$ , and $\alpha$ is an arbitrary real number. Then

$\begin{equation*} \sum_{A_1\leqslant a<A_2}a^\alpha\sum_{a_2,b_1}\lambda_2^\beta(A) [A\in\mathscr{A}_\ell(a,q)]\ll (A_1^\alpha+A_2^\alpha) q^{1+\beta/2+\varepsilon}. \end{equation*}$

Proof. The required estimate is obtained from Corollary 9.5 by using the Abel transformation

$\begin{equation*} \sum_{A_1\leqslant a<A_2}f(a)g(a)=f(A_2)\sum_{A_1\leqslant a<A_2}g(a)- \sum_{A_1\leqslant x<A_2}\Delta f(x)\sum_{A_1\leqslant a\leqslant x}g(a). \end{equation*}$

Setting $f(a)=a^\alpha$ and $g(a)=\sum_{a_2,b_1}\lambda_2^\beta(A)[A\in\mathscr{A}_\ell(a,q)]$ , we get that

$\begin{equation*} \begin{aligned} &\sum_{A_1\leqslant a<A_2}a^\alpha\sum_{a_2,b_1}\lambda_2^\beta(A) [A\in\mathscr{A}_\ell(a,q)] \\ &\qquad\ll \biggl(A_2^\alpha+\sum_{A_1\leqslant x<A_2}x^{\alpha-1}\biggr) \sum_{a,a_2,b_1}\lambda_2^\beta(A)[A\in\mathscr{A}_\ell(a,q)] \\ &\qquad\ll (A_1^\alpha+A_2^\alpha) q^{1+\beta/2+\varepsilon}.\qquad\qquad\square \end{aligned} \end{equation*}$
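The Abel transformation used in this proof can be verified on examples. The sketch below is illustrative; it assumes integer limits $A_1<A_2$ and the forward difference $\Delta f(x)=f(x+1)-f(x)$ , which is the reading under which the stated identity holds exactly. The function names are hypothetical.

```python
def abel_lhs(f, g, A1, A2):
    """Sum of f(a) * g(a) over integers A1 <= a < A2."""
    return sum(f(a) * g(a) for a in range(A1, A2))


def abel_rhs(f, g, A1, A2):
    """Right-hand side of the Abel transformation with Delta f(x) = f(x+1) - f(x)."""
    total_g = sum(g(a) for a in range(A1, A2))
    partial = 0      # running value of the sum of g(a) over A1 <= a <= x
    correction = 0
    for x in range(A1, A2):
        partial += g(x)
        correction += (f(x + 1) - f(x)) * partial
    return f(A2) * total_g - correction
```

In the application above $f(a)=a^\alpha$ , so that $\Delta f(x)\ll x^{\alpha-1}$ , and the partial sums of $g$ are bounded by Corollary 9.5.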

9.2. Estimation of the remainder term

Lemma 9.7. Let $a$ be a positive integer, and let $\alpha,Q_1,Q_2$ be real numbers with $0<Q_1<Q_2$ . Then

$\begin{equation*} \sum_{Q_1<q\leqslant Q_2}(a,q)q^\alpha\ll(Q_1^{\alpha+1+\varepsilon}+ Q_2^{\alpha+1+\varepsilon})a^\varepsilon. \end{equation*}$

Proof. Setting $d=(a,q)$ , we have

$\begin{equation*} \begin{aligned} \sum_{Q_1<q\leqslant Q_2}(a,q)q^\alpha&\leqslant\sum_{d\mid a}\,d\, \sum_{\substack{Q_1<q\leqslant Q_2\\ d\mid q}}q^\alpha\leqslant \sum_{d\mid a}d^{1+\alpha}\sum_{Q_1/d<q_1\leqslant Q_2/d}q_1^\alpha \\ &\ll\sum_{d\mid a}d^{1+\alpha} \biggl(\!\biggl(\frac{Q_1}{d}\biggr)^{\alpha+1+\varepsilon}+ \biggl(\frac{Q_2}{d}\biggr)^{\alpha+1+\varepsilon}\biggr) \\ &\ll(Q_1^{\alpha+1+\varepsilon}+Q_2^{\alpha+1+\varepsilon})a^\varepsilon.\qquad\qquad\square \end{aligned} \end{equation*}$
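The first two steps of this chain are elementary and can be confirmed exactly for integer exponents. The following Python sketch is an illustration only; the names are assumptions and $\alpha$ is restricted to integers so that rational arithmetic applies.

```python
from fractions import Fraction
from math import gcd


def gcd_sum(a, alpha, Q1, Q2):
    """Sum of gcd(a, q) * q^alpha over integers Q1 < q <= Q2 (alpha an integer)."""
    return sum(gcd(a, q) * Fraction(q) ** alpha for q in range(Q1 + 1, Q2 + 1))


def divisor_bound(a, alpha, Q1, Q2):
    """Sum over d | a of d^(1+alpha) times the sum of q1^alpha, Q1/d < q1 <= Q2/d."""
    total = Fraction(0)
    for d in range(1, a + 1):
        if a % d == 0:
            inner = sum(Fraction(q1) ** alpha
                        for q1 in range(Q1 // d + 1, Q2 // d + 1))
            total += Fraction(d) ** (1 + alpha) * inner
    return total
```

The inequality `gcd_sum(...) <= divisor_bound(...)` holds because each $q$ is counted on the right with weight $\sum_{d\mid(a,q)}d\geqslant(a,q)$ .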

Proposition 9.8. The following estimates hold:

$\begin{equation*} \sum_{(a,q)\in\Omega_{i}}R_{i}(a,q,P)\ll P^{2-{1}/{34}+\varepsilon}\qquad (i=1,2,3). \end{equation*}$

Proof. We sum the remainder $R_1(a,q,P)=\displaystyle\frac{P^{1+\varepsilon}q}{a^{2}}$ (see Proposition 6.10) over the points $(a,q)\in\Omega_1$ :

$\begin{equation*} \begin{aligned} \sum_{(a,q)\in\Omega_1}\frac{P^{1+\varepsilon}q}{a^2}&\ll P^{1+\varepsilon}\sum_{a\leqslant P^{1/3}}\frac{1}{a^2} \sum_{q\leqslant P^{33/68}a^{1/2}}q \\ &\ll P^{2-1/34+\varepsilon}\sum_{a\leqslant P^{1/3}}\frac{1}{a}\ll P^{2-{1}/{34}+\varepsilon}. \end{aligned} \end{equation*}$

Let us estimate the sum of the remainders $R_2(a,q,P)$ . By Proposition 7.7,

$\begin{equation*} \begin{aligned} R_2(a,q,P)&\ll\frac{(P,q)P^{1+\varepsilon}}{q} \biggl(\,\sum_{a_2,b_1}\biggl(\frac{r_1^2\lambda_2(A)q^{1/2}}{a^{3/2}} +\frac{r_1^3\lambda_2(A)}{q^{1/2}}\biggr)+\frac{r_1P}{a^2}+ \frac{P}{r_1a}\biggr) \\ &\qquad+\frac{P^2}{q^2}\biggl(r_2\frac{q}{a^2}+r_2^2a^{1/2}+ \frac{q}{r_2a}\biggr)(a,q)a^\varepsilon. \end{aligned} \end{equation*}$

We choose the values of the parameters $r_1,r_2$ based on the equations

$\begin{equation*} \begin{aligned} r_1^2\frac{P}{a^{3/2}q^{1/2}}\lambda_2(A)&=\frac{P^2}{r_1q^2}, \\ r_2^2a^{1/2}&=\frac{q}{r_2a}\quad \text{(for}\ q\leqslant a^3), \\ r_2\frac{q}{a^2}&=\frac{q}{r_2a}\quad \text{(for} \ q> a^3), \end{aligned} \end{equation*}$

assuming that $\lambda_2(A)\asymp q^{1/2}$ :

$\begin{equation*} r_1=\lfloor P^{1/3}a^{1/2}q^{-2/3}\rfloor,\qquad r_2=\begin{cases} \lfloor q^{1/3}a^{-1/2}\rfloor & \text{for} \ q\leqslant a^3; \\ \lfloor a^{1/2}\rfloor &\text{for} \ q> a^3. \end{cases} \end{equation*}$

(Note that the estimates $r_1,r_2\leqslant a$ hold in the domain $\Omega_2$ , and therefore Proposition 7.7 is indeed applicable.) Then it follows from the inequality $\lambda_2(A)\geqslant q^{1/2}$ that for $(a,q)\in\Omega_2$

$\begin{equation*} \begin{aligned} R_2(a,q,P)&\ll(P,q)P^{5/3+\varepsilon}a^{-1/2}q^{-11/6} \sum_{a_2,b_1}\lambda_2(A) \\ &\qquad+(a,q)P^{2+\varepsilon}(a^{-1/2}q^{-4/3}+a^{-3/2}q^{-1}). \end{aligned} \end{equation*}$
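The choice of $r_1$ and $r_2$ amounts to arithmetic with exponents and can be checked mechanically. In the sketch below, which is an illustration with hypothetical helper names, a monomial $P^xa^yq^z$ is represented by the vector $(x,y,z)$ ; integer parts are ignored and $\lambda_2(A)$ is replaced by $q^{1/2}$ , as assumed in the text.

```python
from fractions import Fraction as Fr


def mono(P=0, a=0, q=0):
    """Exponent vector of the monomial P^P * a^a * q^q."""
    return (Fr(P), Fr(a), Fr(q))


def mul(*monomials):
    """Multiply monomials by adding their exponent vectors."""
    return tuple(sum(column) for column in zip(*monomials))


def power(m, e):
    """Raise a monomial to the power e."""
    return tuple(Fr(e) * c for c in m)


lam2 = mono(q=Fr(1, 2))                          # assumption: lambda_2(A) ~ q^(1/2)
r1 = mono(P=Fr(1, 3), a=Fr(1, 2), q=Fr(-2, 3))
r2_small = mono(a=Fr(-1, 2), q=Fr(1, 3))         # case q <= a^3
r2_large = mono(a=Fr(1, 2))                      # case q > a^3


def balance_r1():
    """Both sides of r1^2 * P * a^(-3/2) * q^(-1/2) * lambda_2 = P^2 / (r1 * q^2)."""
    left = mul(power(r1, 2), mono(P=1, a=Fr(-3, 2), q=Fr(-1, 2)), lam2)
    right = mul(mono(P=2, q=-2), power(r1, -1))
    return left, right


def balance_r2(r2, large):
    """Both sides of the equation defining r2 in the corresponding case."""
    left = mul(r2, mono(a=-2, q=1)) if large else mul(power(r2, 2), mono(a=Fr(1, 2)))
    right = mul(mono(q=1), power(r2, -1), mono(a=-1))
    return left, right


def first_term_coefficient():
    """Coefficient of sum lambda_2(A) in R_2: (P/q) * r1^2 * q^(1/2) * a^(-3/2)."""
    return mul(mono(P=1, q=-1), power(r1, 2), mono(a=Fr(-3, 2), q=Fr(1, 2)))
```

The value of `first_term_coefficient()` is the exponent vector of $P^{5/3}a^{-1/2}q^{-11/6}$ , in agreement with the first term of the estimate for $R_2(a,q,P)$ above.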

We estimate the sum of each of the terms over the pairs $(a,q)\in\Omega_2$ . For the first term,

$\begin{equation*} \begin{aligned} &P^{5/3+\varepsilon}\sum_{(a,q)\in\Omega_2}(P,q) a^{-1/2}q^{-11/6}\sum_{a_2,b_1}\lambda_2(A) \\ &\qquad\leqslant P^{5/3+\varepsilon} \biggl(\,\sum_{q\leqslant P^{10/17}}(P,q)q^{-11/6} \sum_{a\geqslant P^{3/17}}a^{-1/2}\sum_{a_2,b_1}\lambda_2(A) \\ &\qquad\qquad+\sum_{q>P^{10/17}}(P,q)q^{-11/6} \sum_{a\gg q^{2}P^{-1}}a^{-1/2}\sum_{a_2,b_1}\lambda_2(A)\biggr). \end{aligned} \end{equation*}$

Applying Corollary 9.6 and Lemma 9.7 to the inner sums, we arrive at the required estimate:

$\begin{equation*} P^{5/3+\varepsilon}\biggl(P^{-3/34}\sum_{q\leqslant P^{10/17}}(P,q)q^{-1/3}+ P^{1/2}\sum_{q> P^{10/17}}(P,q)q^{-4/3}\biggr)\ll P^{2-{1}/{34}+\varepsilon}. \end{equation*}$

We sum the remaining part of the remainder $R_2(a,q,P)$ . The required estimate is also verified by using Lemma 9.7:

$\begin{equation*} \begin{aligned} \sum_{a\geqslant P^{3/17}}(a,q)a^{-3/2}\sum_{q\leqslant P}q^{-1}&\ll P^\varepsilon\sum_{a>P^{3/17}}a^{-3/2}\ll P^{-{3}/{34}+\varepsilon}, \\ \sum_{a\geqslant P^{3/17}}(a,q)a^{-1/2}\sum_{q\geqslant a^{2}}q^{-4/3}&\ll P^{\varepsilon}\sum_{a\geqslant P^{3/17}}a^{-7/6}\ll P^{-{1}/{34}+\varepsilon}. \end{aligned} \end{equation*}$

Consider the third remainder (see Proposition 8.3):

$\begin{equation*} R_3(a,q,P)\ll(P,q)P^\varepsilon\biggl(\frac{rDP^2}{q^2}+\frac{P^2}{raq}+ \frac{r^2aP}{q}\sum_{a_2,b_1}\lambda_2^{1/2}(A)\biggr). \end{equation*}$

We choose the value $r=\lfloor P^{1/3}a^{-1/3}q^{-5/12}\rfloor$ based on the equation

$\begin{equation*} r^2\frac{aP}{q}\lambda_2^{1/2}(A)=\frac{P^2}{rq^2}, \end{equation*}$

again assuming that $\lambda_2(A)\asymp q^{1/2}$ . For this choice of $r$ and for $(a,q)\in\Omega_3$ we have

$\begin{equation*} R_3(a,q,P)\ll(P,q)P^{5/3+\varepsilon}a^{1/3}q^{-11/6} \sum_{a_2,b_1}\lambda_2^{1/2}(A). \end{equation*}$

Applying Corollary 9.6 and Lemma 9.7 successively, we find that

$\begin{equation*} \begin{aligned} \sum_{(a,q)\in\Omega_3}R_3(a,q,P)&\ll P^{5/3+\varepsilon} \sum_{q\leqslant P^{10/17}}(P,q)q^{-11/6} \sum_{a\leqslant P^{3/17}}a^{1/3}\sum_{a_2,b_1}\lambda_2^{1/2}(A) \\ &\ll P^{5/3+\varepsilon}\sum_{q\leqslant P^{10/17}}(P,q)P^{1/17}q^{-7/12} \ll P^{2-{1}/{34}+\varepsilon}.\qquad\qquad\square \end{aligned} \end{equation*}$
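The exponent $2-1/34$ arises in each of the three cases from a short computation with fractions. The Python sketch below records this bookkeeping; it is an illustration with hypothetical names, it ignores all factors $P^\varepsilon$ , and it assumes that a sum $\sum(P,q)q^\alpha$ over $q\leqslant Q$ with $\alpha>-1$ , or over $q>Q$ with $\alpha<-1$ , contributes $Q^{\alpha+1}$ , as Lemma 9.7 suggests.

```python
from fractions import Fraction as Fr

TARGET = 2 - Fr(1, 34)
Q0 = Fr(10, 17)  # threshold q = P^(10/17)
A0 = Fr(3, 17)   # threshold a = P^(3/17)


def exponents():
    """Exponent of P in each bound from the proof of Proposition 9.8."""
    return {
        # R_1: P * sum_a a^(-2) * (P^(33/68) a^(1/2))^2; the sum of 1/a is logarithmic
        "R1": 1 + 2 * Fr(33, 68),
        # R_2, q <= P^(10/17): P^(5/3) * P^(-3/34) * sum_q q^(-1/3)
        "R2, small q": Fr(5, 3) - A0 / 2 + Q0 * (1 - Fr(1, 3)),
        # R_2, q > P^(10/17): P^(5/3) * P^(1/2) * sum_q q^(-4/3)
        "R2, large q": Fr(5, 3) + Fr(1, 2) + Q0 * (1 - Fr(4, 3)),
        # R_3: P^(5/3) * P^(1/17) * sum_q q^(-7/12)
        "R3": Fr(5, 3) + A0 / 3 + Q0 * (1 - Fr(7, 12)),
    }
```

Each entry of `exponents()` equals `TARGET`, that is, $67/34=2-1/34$ ; the thresholds $P^{10/17}$ and $P^{3/17}$ are exactly the ones that make all four contributions equal.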

9.3. Auxiliary assertions

Along with the Euler function $\varphi(q)$ , we use the functions

$\begin{equation*} \varphi_+(q)=q\prod_{p\mid q}\biggl(1+\frac{1}{p}\biggr)\quad\text{and}\quad \varphi_2(q)=\frac{\varphi(q)\varphi_+(q)}{q}= q\prod_{p\mid q}\biggl(1-\frac{1}{p^2}\biggr). \end{equation*}$
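For experiments with the formulae of this subsection the three functions can be computed directly from their Euler products. The sketch below is an illustration (the helper names are assumptions); it uses exact rational arithmetic because $\varphi_2(q)$ is in general not an integer.

```python
from fractions import Fraction


def prime_divisors(q):
    """Distinct prime divisors of q by trial division."""
    primes, p = [], 2
    while p * p <= q:
        if q % p == 0:
            primes.append(p)
            while q % p == 0:
                q //= p
        p += 1
    if q > 1:
        primes.append(q)
    return primes


def euler_product(q, factor):
    """q times the product of factor(p) over the primes p dividing q."""
    result = Fraction(q)
    for p in prime_divisors(q):
        result *= factor(p)
    return result


def phi(q):
    return euler_product(q, lambda p: 1 - Fraction(1, p))


def phi_plus(q):
    return euler_product(q, lambda p: 1 + Fraction(1, p))


def phi_2(q):
    return euler_product(q, lambda p: 1 - Fraction(1, p * p))
```

For example, $\varphi(12)=4$ , $\varphi_+(12)=24$ , and $\varphi_2(12)=8=4\cdot 24/12$ , while $\varphi_2(2)=3/2$ .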

Lemma 9.9. Let $Q\geqslant 2$ . Then

$\begin{equation} \sum_{q\leqslant Q}\frac{\varphi_2(q)}{q^2}=\frac{1}{\zeta(3)} \biggl(\log Q+\gamma-\frac{\zeta'(3)}{\zeta(3)}\biggr)+ O\biggl(\frac{1}{Q}\biggr), \end{equation} \tag{ 9.2 }$

$\begin{equation} \sum_{q\leqslant Q}\frac{\varphi_2(q)}{q^2}\log q=\frac{1}{2\zeta(3)} \log^2Q+c_0+O\biggl(\frac{\log Q}{Q}\biggr), \end{equation} \tag{ 9.3 }$

$\begin{equation} \begin{gathered}\sum_{\substack{q\leqslant Q\\ \Delta\mid q}}\frac{\varphi(q)}{q^2}= \frac{\log Q}{\zeta(2)\varphi_+(\Delta)}+\psi(\Delta)+ O\biggl(\frac{\log Q}{Q}\biggr),\end{gathered} \end{equation} \tag{ 9.4 }$

where $c_0$ is an absolute constant,

$\begin{equation} \psi(\Delta)=\frac{1}{\Delta}\sum_{d=1}^{\infty}\frac{\mu(d)(d,\Delta)}{d^2} \biggl(\log\frac{(d,\Delta)}{d\Delta}+\gamma\biggr), \end{equation} \tag{ 9.5 }$

and $\gamma$ is the Euler constant.

Proof. We verify equation (9.2). Expressing $\varphi_2(q)$ in terms of the Möbius function, we arrive at the relations

$\begin{equation*} \begin{aligned} \sum_{q\leqslant Q}\frac{\varphi_2(q)}{q^2}&=\sum_{q\leqslant Q} \frac{1}{q}\sum_{d\mid q}\frac{\mu(d)}{d^2}= \sum_{d\leqslant Q}\frac{\mu(d)}{d^2} \sum_{\substack{q\leqslant Q\\ d\mid q}}\frac{1}{q} =\sum_{d\leqslant Q}\frac{\mu(d)}{d^3}\biggl(\log\frac{Q}{d}+\gamma+ O\biggl(\frac{d}{Q}\biggr)\biggr) \\ &=(\log Q+\gamma)\biggl(\frac{1}{\zeta(3)}+ O\biggl(\frac{1}{Q^2}\biggr)\biggr)-\sum_{d\leqslant Q} \frac{\mu(d)\log d}{d^3}+O\biggl(\frac{1}{Q}\biggr). \end{aligned} \end{equation*}$

To calculate the resulting sum it is sufficient to use the fact that

$\begin{equation*} \biggl(\frac{1}{\zeta(s)}\biggr)'=-\sum_{d=1}^\infty\frac{\mu(d)\log d}{d^s} =-\frac{\zeta'(s)}{\zeta^2(s)}, \end{equation*}$

and in particular,

$\begin{equation*} \sum_{d=1}^\infty\frac{\mu(d)\log d}{d^3}=\frac{\zeta'(3)}{\zeta^2(3)}\,. \end{equation*}$

We transform the second sum:

$\begin{equation*} \begin{aligned} \sum_{q\leqslant Q}\frac{\varphi_2(q)}{q^2}\log q&= \sum_{q\leqslant Q}\frac{\log q}{q}\sum_{d\mid q}\frac{\mu(d)}{d^2}= \sum_{d\leqslant Q}\frac{\mu(d)}{d^2}\sum_{\substack{q\leqslant Q\\ d\mid q}} \frac{\log q}{q} \\ &=\sum_{d\leqslant Q}\frac{\mu(d)}{d^3}\sum_{q\leqslant Q/d} \frac{\log q+\log d}{q}\,. \end{aligned} \end{equation*}$

Using the equation

$\begin{equation} \sum_{k\leqslant T}\frac{\log k}{k}=\dfrac{\log ^{2}T}{2}+ \gamma_1+ O\biggl(\frac{\log T}{T}\biggr)\qquad (T\geqslant2), \end{equation} \tag{ 9.6 }$

where $\gamma_1$ is the first Stieltjes constant (see [127], §2.21), we find that

$\begin{equation*} \sum_{q\leqslant Q}\frac{\varphi_2(q)}{q^2}\log q= \sum_{d\leqslant Q}\frac{\mu(d)}{d^3}\biggl(\frac{\log^2Q}{2}+ \gamma_1-\frac{\log^2d}{2}+\gamma\log d\biggr)+ O\biggl(\frac{\log Q}{Q}\biggr). \end{equation*}$

Setting

$\begin{equation*} c_0=\sum_{d=1}^{\infty}\frac{\mu(d)}{d^3} \biggl(\gamma_1-\frac{\log^2d}{2}+\gamma\log d\biggr), \end{equation*}$

we obtain the second formula of the lemma.

See the proof of (9.4) in [24], Lemma 9. $\square$
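Formula (9.2) can be compared with a direct computation. The following Python sketch is a numerical illustration only: the function names are assumptions, $\zeta(3)$ and $\zeta'(3)$ are approximated by partial sums with an integral correction for the tail, and floating-point arithmetic is used throughout.

```python
import math

EULER_GAMMA = 0.5772156649015329


def zeta3_and_derivative(N=20000):
    """Approximate zeta(3) and zeta'(3) by partial sums plus integral tail terms."""
    z = sum(1.0 / n ** 3 for n in range(1, N + 1)) + 1.0 / (2 * N ** 2)
    dz = -sum(math.log(n) / n ** 3 for n in range(2, N + 1))
    dz -= math.log(N) / (2 * N ** 2) + 1.0 / (4 * N ** 2)
    return z, dz


def phi2_over_q2_sum(Q):
    """Sum of phi_2(q)/q^2 over q <= Q, with phi_2(q)/q computed by a sieve."""
    ratio = [1.0] * (Q + 1)          # ratio[q] = product of (1 - p^-2) over p | q
    is_prime = [True] * (Q + 1)
    for p in range(2, Q + 1):
        if is_prime[p]:
            for m in range(p, Q + 1, p):
                if m > p:
                    is_prime[m] = False
                ratio[m] *= 1.0 - 1.0 / p ** 2
    return sum(ratio[q] / q for q in range(1, Q + 1))


def main_term(Q):
    """Right-hand side of (9.2) without the O(1/Q) term."""
    z, dz = zeta3_and_derivative()
    return (math.log(Q) + EULER_GAMMA - dz / z) / z
```

Evaluating `phi2_over_q2_sum(Q) - main_term(Q)` for increasing $Q$ shows a difference that stays within a small multiple of $1/Q$ , in agreement with the remainder term in (9.2).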

Lemma 9.10. Let $D$ be a positive integer and let $c(q,D)$ be defined by (6.18). Then the sum

$\begin{equation*} S_D(Q_1,Q_2)=\sum_{\substack{Q_1\leqslant q\leqslant Q_2\\ D\mid q}} \frac{\varphi(q)}{q^2}c(q,D) \end{equation*}$

satisfies the asymptotic formula

$\begin{equation*} S_D(Q_1,Q_2)=\frac{\log (Q_2/Q_1)}{\zeta(2)}+ O(D^{1+\varepsilon}Q_1^{-1+\varepsilon}). \end{equation*}$

Proof. Setting $q=Dq_1$ , we obtain for this sum the representation

$\begin{equation*} S_D(Q_1,Q_2)=\frac{1}{\varphi(D)}\sum_{Q_1\leqslant Dq_1\leqslant Q_2} \frac{\varphi(Dq_1)\varphi((D,q_1))}{(D,q_1)q_1^2}\,. \end{equation*}$

We transform the sum obtained, introducing the parameters $\Delta=(D,q_1)$ , $q_2=q_1\Delta^{-1}$ , and $q_3=q_2\delta^{-1}$ :

$\begin{equation*} \begin{aligned} S_D(Q_1,Q_2)&=\frac{1}{\varphi(D)}\sum_{\Delta\mid D} \frac{\varphi(\Delta)}{\Delta^3} \sum_{\substack{Q_1\leqslant D\Delta q_2\leqslant Q_2\\ (D/\Delta,q_2)=1}} \frac{\varphi(D\Delta q_2)}{q_2^2} \\ &=\frac{1}{\varphi(D)}\sum_{\Delta\mid D}\frac{\varphi(\Delta)}{\Delta^3} \sum_{Q_1\leqslant D\Delta q_2\leqslant Q_2} \frac{\varphi(D\Delta q_2)}{q_2^2}\sum_{\delta\mid(D/\Delta,q_2)}\mu(\delta) \\ &=\frac{1}{\varphi(D)}\sum_{\Delta\mid D}\frac{\varphi(\Delta)}{\Delta^3} \sum_{\delta\mid D/\Delta}\frac{\mu(\delta)}{\delta^2} \sum_{Q_1\leqslant D\Delta\delta q_3\leqslant Q_2} \frac{\varphi(D\Delta\delta q_3)}{q_3^2} \\ &=\frac{D^2}{\varphi(D)}\sum_{\delta\Delta\mid D} \frac{\varphi(\Delta)\mu(\delta)}{\Delta} \sum_{\substack{Q_1\leqslant q\leqslant Q_2\\ D\Delta\delta\mid q}} \frac{\varphi(q)}{q^2}\,. \end{aligned} \end{equation*}$

We apply equation (9.4) to the inner sum:

$\begin{equation*} \begin{aligned} S_D(Q_1,Q_2)&=\frac{D^2}{\varphi(D)}\sum_{\delta\Delta\mid D} \frac{\varphi(\Delta)\mu(\delta)}{\Delta} \biggl(\frac{\log(Q_2/Q_1)}{\zeta(2)\varphi_+(D\Delta\delta)}+ O(Q_1^{-1+\varepsilon})\biggr) \\ &=\frac{\log(Q_2/Q_1)}{\zeta(2)}\,\frac{D^2}{\varphi(D)} \sum_{\delta\Delta\mid D}\frac{\varphi(\Delta)}{\Delta}\, \frac{\mu(\delta)}{\Delta\delta\varphi_+(D)}+ O(D^{1+\varepsilon}Q_1^{-1+\varepsilon}) \\ &=\frac{\log(Q_2/Q_1)}{\zeta(2)}\,\frac{D}{\varphi_2(D)} \sum_{\delta\Delta\mid D}\frac{\varphi(\Delta)}{\Delta^2}\, \frac{\mu(\delta)}{\delta}+O(D^{1+\varepsilon}Q_1^{-1+\varepsilon}) \\ &=\frac{\log(Q_2/Q_1)}{\zeta(2)}\,\frac{D}{\varphi_2(D)} \sum_{\Delta\mid D}\frac{\varphi(\Delta)}{\Delta^2}\, \frac{\varphi(D/\Delta)}{D/\Delta}+O(D^{1+\varepsilon}Q_1^{-1+\varepsilon}). \end{aligned} \end{equation*}$

Substituting the equation (see [24], Lemma 8)

$\begin{equation*} \sum_{\Delta\mid D}\frac{\varphi(\Delta)\varphi(D/\Delta)}{\Delta}= \varphi_2(D) \end{equation*}$

into the last formula, we arrive at the assertion of the lemma. $\square$
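The divisor identity quoted from [24] and the resulting cancellation of the factor $D/\varphi_2(D)$ can be confirmed exactly for small $D$ . The sketch below is an illustration with hypothetical function names and uses rational arithmetic.

```python
from fractions import Fraction
from math import gcd


def phi(n):
    """Euler's function by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def phi_2(n):
    """phi_2(n) = n * product over primes p | n of (1 - p^-2)."""
    result, m, p = Fraction(n), n, 2
    while p * p <= m:
        if m % p == 0:
            result *= 1 - Fraction(1, p * p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        result *= 1 - Fraction(1, m * m)
    return result


def divisor_sum(D):
    """Sum of phi(Delta) * phi(D/Delta) / Delta over the divisors Delta of D."""
    return sum(Fraction(phi(d) * phi(D // d), d)
               for d in range(1, D + 1) if D % d == 0)


def leading_coefficient(D):
    """(D/phi_2(D)) * sum over Delta | D of phi(Delta)/Delta^2 * phi(D/Delta)/(D/Delta)."""
    total = sum(Fraction(phi(d), d * d) * Fraction(phi(D // d) * d, D)
                for d in range(1, D + 1) if D % d == 0)
    return D / phi_2(D) * total
```

For every $D$ tested, `divisor_sum(D)` equals `phi_2(D)` and `leading_coefficient(D)` equals $1$ , so that the principal term of $S_D(Q_1,Q_2)$ reduces to $\log(Q_2/Q_1)/\zeta(2)$ .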

9.4. Calculation of the principal term

To complete the proof of Theorem 5.5 it remains to pass to integration with respect to the variables in the first row of the matrix $X$ . First we need to sum over the variable $q$ .

The conditions $a\leqslant b\leqslant c$ satisfied by the matrices $X\in\mathscr{M}_\ell(a,q,P)$ are not invariant under the left action of $D_3(\mathbb{R})$ . We express $c$ and $b$ from the equalities

$\begin{equation*} abc=\frac{P}{\det X'''}\,,\qquad ab=\frac{q}{\det A''}\,, \end{equation*}$

where $A''=\begin{pmatrix} 1 & \alpha_2\\ \beta_1 & 1 \end{pmatrix}$ . Then after passing to the set $\mathscr{M}_\ell'''(a,q,P)$ , the inequalities $a\leqslant b\leqslant c$ take the form

$\begin{equation} a\leqslant \frac{q}{a\det A''}\leqslant \frac{P\det A''}{q \det X'''}\,. \end{equation} \tag{ 9.7 }$

Therefore, for given coefficients of the matrix $X'''$ , for each $i=1,2,3$ the domain $\Omega_i(X''')$ in which $a$ and $q$ can vary is defined by

$\begin{equation*} \Omega_i(X''')=\biggl\{(a,q)\in\Omega_i:a^2\det A''\leqslant q\leqslant \biggl(\frac{aP}{\det X'''}\biggr)^{1/2}\det A''\biggr\}\qquad (i=1,2,3). \end{equation*}$
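The passage from (9.7) to the bounds defining $\Omega_i(X''')$ can be sketched as follows. This is an illustration under assumed sample values: the function names are hypothetical, and $\det A''$ and $\det X'''$ are treated as given positive numbers. Solving the left inequality in (9.7) for $q$ gives the lower bound, solving the right one gives the upper bound, and at the two endpoints one gets $a=b$ and $b=c$ respectively.

```python
from math import sqrt

def q_interval(a, P, det_A, det_X):
    """Range of q allowed by (9.7): a^2 det A'' <= q <= (aP / det X''')^(1/2) det A''."""
    lower = a * a * det_A
    upper = sqrt(a * P / det_X) * det_A
    return lower, upper

def b_and_c(a, q, P, det_A, det_X):
    """Recover b and c from abc = P / det X''' and ab = q / det A''."""
    b = q / (a * det_A)
    c = P * det_A / (q * det_X)
    return b, c

def satisfies_ordering(a, q, P, det_A, det_X):
    """Check the original conditions a <= b <= c for a given q."""
    b, c = b_and_c(a, q, P, det_A, det_X)
    return a <= b <= c

# Sample (hypothetical) values: a = 3, P = 10^6, det A'' = 0.8, det X''' = 1.5.
lower, upper = q_interval(3, 10**6, 0.8, 1.5)
print(lower, upper, satisfies_ordering(3, 100.0, 10**6, 0.8, 1.5))
```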

Remark 9.11. As in the two-dimensional case (see §4.7), it is natural to extend by zero the values of the functions $\rho_{2,3}(a,q)$ for those pairs $(a,q)$ for which the sets $\mathscr{M}_\ell(a,q,P)$ are empty. Thus, we assume that the values $\rho_{2,3}(a,q)$ are defined for all positive integers $a$ and $q$ .

Proposition 9.12. The quantity $\mathscr{N}_\ell^{(2)}(P)$ defined by (5.9) satisfies the asymptotic formula

$\begin{equation*} \begin{aligned} \frac{\mathscr{N}_\ell^{(2)}(P)}{\varphi^2(P)}&= \frac{1}{\zeta(2)}\int_{\mathbb{R}^6} \frac{d\boldsymbol{\alpha}\,d\boldsymbol{\beta}\, d\boldsymbol{\gamma}\,[X'''\in\mathscr{M}_\ell''']}{(\det X''')^3} \\ &\qquad\times\sum_{a}\biggl(\frac{\varphi_2(a)}{a^2} \int_{\mathbb{R}}\frac{dq}{q}[(a,q)\in\Omega_{2}(X''')] +\rho_{2}(a)\biggr)+O(P^{-{1}/{34}+\varepsilon}), \end{aligned} \end{equation*}$

where $\rho_{2}(a)\ll a^{-2+\varepsilon}$ .

Proof. By Propositions 7.7 and 9.8 we have

$\begin{equation} \mathscr{N}_\ell^{(2)}(P)=\varphi^2(P)\int_{\mathbb{R}^6} \frac{d\boldsymbol{\alpha}\,d\boldsymbol{\beta}\, d\boldsymbol{\gamma}}{(\det X''')^3}S_\ell^{(2)} (P,\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{\gamma})+ O(P^{-{1}/{34}+\varepsilon}), \end{equation} \tag{ 9.8 }$

where

$\begin{equation*} S_\ell^{(2)}(P,\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{\gamma})= \sum_{(a,q)\in\Omega_2}\frac{\varphi(q)}{q^2}\,\frac{K_a^\times(q)}{a^2} [X'''\in\mathscr{M}_\ell'''(a,q,P)]. \end{equation*}$

The set $\mathscr{M}_\ell'''$ is obtained from $\mathscr{M}_\ell'''(a,q,P)$ by discarding the conditions (9.7) while keeping the remaining (invariant) inequalities. Therefore,

$\begin{equation} [X'''\in\mathscr{M}_\ell'''(a,q,P),(a,q)\in\Omega_i]= [X'''\in\mathscr{M}_\ell''',(a,q)\in\Omega_i(X''')] \end{equation} \tag{ 9.9 }$

and

$\begin{equation*} \begin{aligned} S_\ell^{(2)}(P,\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{\gamma})= [X'''\in\mathscr{M}_\ell''']\sum_{a,q}\frac{\varphi(q)}{q^2}\, \frac{K_a^\times(q)}{a^2}[(a,q)\in\Omega_2(X''')]. \end{aligned} \end{equation*}$

We apply the relation

$\begin{equation*} \sum_{Q_1\leqslant q\leqslant Q_2}\frac{\varphi(q)}{q^2}\, \frac{K_a^{\times}(q)}{a^2}=\frac{1}{\zeta(2)}\,\frac{\varphi_2(a)}{a^2} \int_{Q_1}^{Q_2}\frac{dq}{q}+O(a^{-1+\varepsilon}Q_1^{-1+\varepsilon}) \end{equation*}$

(see [24], Corollary 3) to the sum obtained above. The lower limit of summation $Q_1$ satisfies the estimate $Q_1\gg a^2$ , so the remainder term is $O(a^{-1+\varepsilon}Q_1^{-1+\varepsilon})=O(a^{-3+\varepsilon})$ . Thus,

$\begin{equation*} S_\ell^{(2)}(P,\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{\gamma})= \frac{[X'''\in\mathscr{M}_\ell''']}{\zeta(2)}\sum_{a} \biggl(\frac{\varphi_2(a)}{a^2}\int_{\mathbb{R}}\frac{dq}{q} [(a,q)\in\Omega_2(X''')]+O(a^{-3+\varepsilon})\biggr). \end{equation*}$

Substituting the last formula into (9.8), we arrive at the required equality. $\square$

Lemma 9.13. Suppose that $\Omega\subset[-a,a]^2$ and the boundary of the domain $\Omega$ has length $O(a)$ . Then the sum

$\begin{equation*} S(\Omega)=\sum_{\substack{(a_2,a_3)\in\Omega\\ (a,a_2,a_3)=1}} \frac{1}{(\det X'')^3} \end{equation*}$

satisfies the asymptotic formula

$\begin{equation*} S(\Omega)=\frac{\varphi_2(a)}{a^2}\int_{\mathbb{R}^2} \frac{[(\alpha_2,\alpha_3)\in a^{-1}\Omega]\,d\boldsymbol{\alpha}} {(\det X''')^3}+O(a^{-2+\varepsilon}). \end{equation*}$

Proof. We get rid of the coprimeness condition by using the Möbius function (here $a'=a/d$ , $a_2'=a_2/d$ , and $a_3'=a_3/d$ ):

$\begin{equation*} S(\Omega)=\sum_{d\mid a}\mu(d) \sum_{\substack{(a_2,a_3)\in\Omega \\ d\mid a_2,d\mid a_3}}\begin{vmatrix} a & a_2 & a_3 \\ \beta_1 & \beta_2 & \beta_3 \\ \gamma_1&\gamma_2&\gamma_3 \end{vmatrix}^{-3}=\sum_{d\mid a}\frac{\mu(d)}{d^3} \sum_{(a_2',a_3')\in d^{-1}\Omega}\begin{vmatrix} a' & a_2' & a_3' \\ \beta_1 & \beta_2 & \beta_3 \\ \gamma_1&\gamma_2&\gamma_3 \end{vmatrix}^{-3}. \end{equation*}$

Let

$\begin{equation*} G(u,v)=\begin{vmatrix} a' & u & v \\ \beta_1 & \beta_2 & \beta_3 \\ \gamma_1&\gamma_2&\gamma_3 \end{vmatrix}^{-3}. \end{equation*}$

Then

$\begin{equation*} G(u,v)\asymp (a')^{-3},\qquad \frac{\partial G}{\partial u}\ll\frac{1}{(a')^4}\,,\qquad \frac{\partial G}{\partial v}\ll\frac{1}{(a')^4}\,. \end{equation*}$

Consequently,

$\begin{equation*} G(u,v)=\int_{[0,1]^2}G(u+t_1,v+t_2)\,dt_1\,dt_2+ O\biggl(\frac{1}{(a')^4}\biggr). \end{equation*}$

The number of unit squares intersected by the boundary of the domain $d^{-1}\Omega$ is $O(a')$ . In each of these squares, we can use the following formula with a trivial estimate of the remainder term:

$\begin{equation*} G(u,v)=\int_{[0,1]^2}G(u+t_1,v+t_2)[(u+t_1,v+t_2)\in d^{-1}\Omega]\,dt_1\,dt_2 +O\biggl(\frac{1}{(a')^3}\biggr). \end{equation*}$

Thus,

$\begin{equation*} \begin{aligned} S(\Omega)&=\sum_{d\mid a}\frac{\mu(d)}{d^3}\left(\int_{d^{-1}\Omega} \begin{vmatrix} a' & a_2' & a_3' \\ \beta_1 & \beta_2 & \beta_3 \\ \gamma_1&\gamma_2&\gamma_3 \end{vmatrix}^{-3}da_2'\,da_3'+O(d^2a^{-2})\right) \\ &=\frac{1}{a}\sum_{d\mid a}\frac{\mu(d)}{d^2}\int_{a^{-1}\Omega} \begin{vmatrix} 1 & \alpha_2 & \alpha_3 \\ \beta_1 & \beta_2 & \beta_3 \\ \gamma_1&\gamma_2&\gamma_3 \end{vmatrix}^{-3}\,d\boldsymbol{\alpha}+O(a^{-2+\varepsilon}) \\ &=\frac{\varphi_2(a)}{a^2} \int_{a^{-1}\Omega}\begin{vmatrix} 1 & \alpha_2 & \alpha_3 \\ \beta_1 & \beta_2 & \beta_3 \\ \gamma_1&\gamma_2&\gamma_3 \end{vmatrix}^{-3}\,d\boldsymbol{\alpha}+O(a^{-2+\varepsilon}).\qquad\qquad\square \end{aligned} \end{equation*}$
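Two elementary ingredients of this proof can be checked directly. The sketch below is illustrative only: it replaces the general domain $\Omega$ by a square box $[1,B]^2$ (an assumption made for simplicity), and the function names are hypothetical. It verifies that the Möbius sum over $d\mid a$ removes the condition $(a,a_2,a_3)=1$, and evaluates the coefficient $\frac{1}{a}\sum_{d\mid a}\mu(d)/d^2$, which equals $\varphi_2(a)/a^2$ in the last step.

```python
from fractions import Fraction
from math import gcd

def mobius(n):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p^2 divides the original n
                return 0
            result = -result
        p += 1
    if n > 1:  # one remaining prime factor
        result = -result
    return result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def coprime_count_direct(a, box):
    """Count pairs (a2, a3) in [1, box]^2 with gcd(a, a2, a3) = 1."""
    return sum(1 for a2 in range(1, box + 1) for a3 in range(1, box + 1)
               if gcd(gcd(a, a2), a3) == 1)

def coprime_count_mobius(a, box):
    """Same count: sum over d | a of mu(d) times the number of pairs divisible by d."""
    return sum(mobius(d) * (box // d) ** 2 for d in divisors(a))

def main_term_coefficient(a):
    """(1/a) * sum_{d | a} mu(d)/d^2, the coefficient phi_2(a)/a^2 of the integral."""
    return Fraction(1, a) * sum(Fraction(mobius(d), d * d) for d in divisors(a))

print(coprime_count_direct(12, 30), coprime_count_mobius(12, 30))
print(main_term_coefficient(12))
```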

Proposition 9.14.

$\begin{equation*} \begin{aligned} \frac{\mathscr{N}_\ell^{(1,3)}(P)}{\varphi^2(P)}&=\frac{1}{\zeta(2)} \int_{\mathbb{R}^6}\frac{d\boldsymbol{\alpha}\,d\boldsymbol{\beta}\, d\boldsymbol{\gamma}\,[X'''\in\mathscr{M}_\ell''']}{(\det X''')^3} \\ &\qquad\times\sum_{a}\frac{\varphi_2(a)}{a^2}\biggl(\int_{\mathbb{R}} \frac{dq}{q}[(a,q)\in\Omega_{1,3}(X''')]+\rho_{1,3}(a)\biggr) \\ &\qquad+O(P^{-{1}/{34}+\varepsilon}), \end{aligned} \end{equation*}$

where $\rho_{1,3}(a)\ll a^{-5/4+\varepsilon}$ .

Proof. By Propositions 6.10, 8.3, and 9.8 we have

$\begin{equation} \mathscr{N}_\ell^{(1,3)}(P)=\varphi^2(P)\biggl(\int_{\mathbb{R}^4} d\boldsymbol{\beta}\,d\boldsymbol{\gamma}\,S_\ell^{(1,3)} (P,\boldsymbol{\beta},\boldsymbol{\gamma}) +\sum_{(a,q)\in\Omega_{1,3}}\rho_{1,3}(a,q)\biggr)+ O(P^{-{1}/{34}+\varepsilon}), \end{equation} \tag{ 9.10 }$

where

$\begin{equation*} S_\ell^{(1,3)}(P,\boldsymbol{\beta},\boldsymbol{\gamma})= \sum_{(a,q)\in\Omega_{1,3}}\ \sideset{}{^\#}\sum_{a_2,a_3} \frac{\varphi(q)}{q^2}c(q,D) \frac{[X''\in\mathscr{M}_\ell''(a,q,P)]}{(\det X'')^3}. \end{equation*}$

It follows from equation (9.9) that

$\begin{equation*} S_\ell^{(1,3)}(P,\boldsymbol{\beta},\boldsymbol{\gamma})= \sideset{}{^\#}\sum_{a,a_2,a_3} \frac{[X'''\in\mathscr{M}_\ell''']}{(a\det X''')^3}\sum_{q} \frac{\varphi(q)}{q^2}c(q,D)[(a,q)\in\Omega_{1,3}(X''')]. \end{equation*}$

We apply Lemma 9.10 to the sum over $q$ :

$\begin{equation*} \begin{aligned} S_\ell^{(1,3)}(P,\boldsymbol{\beta},\boldsymbol{\gamma})&= \frac{1}{\zeta(2)} \ \sideset{}{^\#}\sum_{a,a_2,a_3} \frac{[X'''\in\mathscr{M}_\ell''']}{(a\det X''')^3} \\ &\qquad\times\biggl(\int_{\mathbb{R}} \frac{dq}{q}[(a,q)\in\Omega_{1,3}(X''')]+O(Da^{-2+\varepsilon})\biggr). \end{aligned} \end{equation*}$

By Lemma 9.7,

$\begin{equation*} \sum_{a_2,a_3\leqslant a}Da^{-5+\varepsilon}= \sum_{a_2,a_3\leqslant a}(a_2,a)a^{-5+\varepsilon}\ll a^{-3+\varepsilon}. \end{equation*}$

Therefore,

$\begin{equation*} \begin{aligned} S_\ell^{(1,3)}(P,\boldsymbol{\beta},\boldsymbol{\gamma})&= \frac{1}{\zeta(2)}\sum_{a}\biggl(\frac{1}{a^3}\int_{\mathbb{R}}\frac{dq}{q}\, \sideset{}{^\#}\sum_{a_2,a_3} \frac{[X'''\in\mathscr{M}_\ell''']}{(\det X''')^3} \\ &\qquad\times[(a,q)\in\Omega_{1,3}(X''')]+ O(a^{-3+\varepsilon})\biggr). \end{aligned} \end{equation*}$

Using Lemma 9.13, we arrive at the asymptotic formula

$\begin{equation} \begin{aligned} S_\ell^{(1,3)}(P,\boldsymbol{\beta},\boldsymbol{\gamma})&=\frac{1}{\zeta(2)} \sum_{a}\biggl(\frac{\varphi_2(a)}{a^2} \\ &\qquad\times\int_{\mathbb{R}^2}d\boldsymbol{\alpha}\, \frac{[X'''\in\mathscr{M}_\ell''']}{(\det X''')^3}\int_{\mathbb{R}} \frac{dq}{q}[(a,q)\in\Omega_{1,3}(X''')]+O(a^{-2+\varepsilon})\biggr). \end{aligned} \end{equation} \tag{ 9.11 }$

Furthermore, by Lemma 9.7 we have

$\begin{equation} \sum_{q}\rho_1(a,q)[(a,q)\in\Omega_1]\ll\sum_{q\geqslant a^2}(a,q) a^{-1/4}q^{-3/2+\varepsilon}\ll a^{-5/4+\varepsilon}, \end{equation} \tag{ 9.12 }$

$\begin{equation} \sum_{q}\rho_3(a,q)[(a,q)\in\Omega_3]\ll \sum_{q\geqslant a^4} aq^{-2+\varepsilon}\ll a^{-3+\varepsilon}. \end{equation} \tag{ 9.13 }$

Substituting (9.11), (9.12), and (9.13) into (9.10), we arrive at the required equality. $\square$
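The estimate of the sum over $a_2,a_3\leqslant a$ in this proof rests on the fact that $(a_2,a)$ is small on average. A minimal numerical sketch, assuming $D=(a_2,a)$ as in the displayed equality and using the elementary bound $\sum_{a_2\leqslant a}(a_2,a)=\sum_{d\mid a}d\,\varphi(a/d)\leqslant a\,\tau(a)$ (this is a standard bound and not necessarily the form in which Lemma 9.7 is stated; the function names are hypothetical):

```python
from math import gcd

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def phi(n):
    """Euler's totient by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def gcd_sum(a):
    """Sum of gcd(a2, a) over 1 <= a2 <= a."""
    return sum(gcd(k, a) for k in range(1, a + 1))

def gcd_sum_by_divisors(a):
    """The same sum grouped by d = gcd(a2, a): sum over d | a of d * phi(a/d)."""
    return sum(d * phi(a // d) for d in divisors(a))

def weighted_double_sum(a):
    """Sum over a2, a3 <= a of gcd(a2, a) * a^(-5); the a3-sum contributes a factor a."""
    return a * gcd_sum(a) / a**5

for a in (12, 60, 97):
    tau = len(divisors(a))
    print(a, gcd_sum(a), a * tau, weighted_double_sum(a), tau / a**3)
```

Since $\tau(a)\ll a^{\varepsilon}$, the last two columns illustrate the bound $O(a^{-3+\varepsilon})$.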

Corollary 9.15. Asymptotically,

$\begin{equation*} \begin{aligned} \frac{\mathscr{N}_\ell(P)}{\varphi^2(P)}&=\frac{1}{\zeta(2)} \int_{\mathbb{R}^6}\frac{d\boldsymbol{\alpha}\,d\boldsymbol{\beta}\, d\boldsymbol{\gamma}\,[X'''\in\mathscr{M}_\ell''']}{(\det X''')^3} \\ &\qquad\times\sum_{a}\frac{\varphi_2(a)}{a^2}\biggl(\int_{\mathbb{R}} \frac{dq}{q}[(a,q)\in\Omega(X''')]+\rho(a)\biggr)+ O(P^{-{1}/{34}+\varepsilon}), \end{aligned} \end{equation*}$

where $\rho(a)\ll a^{-5/4+\varepsilon}$ .

For the proof it is sufficient to substitute the asymptotic formulae in Propositions 9.12 and 9.14 into (5.8).

Proof of Theorem 5.5. Since

$\begin{equation*} \sum_{a\ll P^{1/3}}\rho(a)=\sum_{a=1}^{\infty}\rho(a)+ O(P^{-1/12+\varepsilon}), \end{equation*}$

it follows by Corollary 9.15 that

$\begin{equation*} \begin{aligned} \frac{\mathscr{N}_\ell(P)}{\varphi^2(P)}&=\frac{1}{\zeta(2)} \int_{\mathbb{R}^6}\frac{d\boldsymbol{\alpha}\,d\boldsymbol{\beta}\, d\boldsymbol{\gamma}\,[X'''\in\mathscr{M}_\ell''']}{(\det X''')^3} \sum_{a}\frac{\varphi_2(a)}{a^2}\int_{\mathbb{R}}\frac{dq}{q} [(a,q)\in\Omega(X''')] \\ &\qquad+\sum_{a=1}^{\infty}\rho(a)+O(P^{-{1}/{34}+\varepsilon}). \end{aligned} \end{equation*}$

We transform the integral with respect to the variable $q$ :

$\begin{equation*} \begin{aligned} &\int_{\mathbb{R}}\frac{dq}{q}[(a,q)\in\Omega(X''')]=\int_{\mathbb{R}} \frac{dq}{q}\biggl[a^2\det A''\leqslant q\leqslant \biggl(\frac{aP}{\det X'''}\biggr)^{1/2}\det A''\biggr] \\ &\qquad=\int_{\mathbb{R}}\frac{dq}{q}\biggl[a^2\leqslant q\leqslant \biggl(\frac{aP}{\det X'''}\biggr)^{1/2}\biggr]=\frac{1}{2}\log P- \frac{3}{2}\log a-\frac{1}{2}\log\det X'''. \end{aligned} \end{equation*}$

Substituting the resulting expression into the formula for $\mathscr{N}_\ell(P)$ and using (9.2) and (9.3), we arrive at the assertion of the theorem. $\square$
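The evaluation of the integral over $q$ uses only the scale invariance of the measure $dq/q$ : the common factor $\det A''$ in both limits cancels. A short numerical sketch of this step, with hypothetical function names and sample values, and assuming the upper limit is not smaller than the lower one:

```python
from math import log, sqrt

def q_integral(a, P, det_X, det_A=1.0):
    """Integral of dq/q over a^2 det A'' <= q <= (aP / det X''')^(1/2) det A''."""
    lower = a * a * det_A
    upper = sqrt(a * P / det_X) * det_A
    return log(upper / lower)

def closed_form(a, P, det_X):
    """(1/2) log P - (3/2) log a - (1/2) log det X'''."""
    return 0.5 * log(P) - 1.5 * log(a) - 0.5 * log(det_X)

# The value does not depend on det A'' and agrees with the closed form.
print(q_integral(3, 1e9, 1.7, det_A=0.6), q_integral(3, 1e9, 1.7), closed_form(3, 1e9, 1.7))
```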
