
Three-dimensional continued fractions and Kloosterman sums

© 2015 RAS(DoM) and LMS
Citation: A. V. Ustinov 2015 Russ. Math. Surv. 70 483. DOI: 10.1070/RM2015v070n03ABEH004953


Abstract

This survey is devoted to results related to metric properties of classical continued fractions and Voronoi–Minkowski three-dimensional continued fractions. The main focus is on applications of analytic methods based on estimates of Kloosterman sums. An apparatus is developed for solving problems about three-dimensional lattices. The approach is based on reduction to the preceding dimension, an idea used earlier by Linnik and Skubenko in the study of integer solutions of the determinant equation $\det X=P$, where $X$ is a $3\times 3$ matrix with independent coefficients and $P$ is an increasing parameter. The proposed method is used for studying statistical properties of Voronoi–Minkowski three-dimensional continued fractions in lattices with a fixed determinant. In particular, an asymptotic formula with polynomial lowering in the remainder term is proved for the average number of Minkowski bases. This result can be regarded as a three-dimensional analogue of Porter's theorem on the average length of finite continued fractions.

Bibliography: 127 titles.


§ 1. Introduction

1.1. Linnik's problem

Many number-theoretic problems can be reduced to the study of a Diophantine equation

$$F(x_1,\dots,x_n)=P, \tag{1.1}$$

where $F$ is a homogeneous polynomial and $P$ is an integer. The only general method that makes it possible to describe the asymptotic properties of solutions of equation (1.1) is the Hardy–Littlewood circle method, which, however, requires additional properties of the polynomial $F$ (see [1]). In certain situations the distribution of the solutions of (1.1) can be investigated by methods of algebraic number theory and algebraic geometry (see [2], [3]), but in the general case one does not even know estimates of the correct order for the number of solutions.

If equation (1.1) defines a homogeneous variety with an action of a linear algebraic group, then the possibility arises of applying methods of harmonic analysis on this group (see [4]–[7]). An important special case of such a situation is the determinant equation

$$\det X=P, \tag{1.2}$$

where $X$ is a square matrix with independent coefficients. For $3\times 3$ matrices Linnik and Skubenko [8] (see also [9], Chap. VIII) proved that as $P\to\infty$ the integer solutions of equation (1.2) are uniformly distributed with respect to the Haar measure. They were solving the problem under the assumption that the normalized matrix $\widetilde{X}=XP^{-1/3}$ is contained in some fixed domain $\Omega\subset SL_3(\mathbb{R})$ of finite measure. They proved an asymptotic formula for the number of solutions of (1.2) without explicitly indicating a lowering in the remainder term. (An explicit estimate for the remainder term for the domain $\|\widetilde{X}\|\leqslant 1$, where $\|\,\cdot\,\|$ is the Euclidean norm, was given in [10].) In the general case the problem of the distribution of the integer solutions of equation (1.1) is known as Linnik's problem and is also usually considered under the assumption that $\widetilde{X}=XP^{-1/d}\ll 1$, where $d$ is the degree of the polynomial $F$ (see [4], [5], [10]). See [11]–[14] for development of the method of Linnik and Skubenko.

1.2. Linnik–Skubenko reduction

By Weyl's criterion (see, for example, [15]) a necessary and sufficient condition for the uniform distribution of a system of functions $(f_1(x),\dots,f_n(x))$ is that

$$\lim_{N\to\infty}\frac{1}{N}\sum_{x=1}^{N}e\bigl(m_1f_1(x)+\dots+m_nf_n(x)\bigr)=0,$$

where $m_1,\dots,m_n$ are arbitrary integers that are not simultaneously equal to zero, and $e(t)=e^{2\pi\sqrt{-1}\,t}$. This condition makes it possible to reduce the study of the uniform distribution of systems of functions to estimates of the corresponding trigonometric sums.
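Weyl's criterion is easy to probe numerically. The following Python sketch is ours, not from the survey; the sequence $\{n\sqrt{2}\}$ and all tolerances are illustrative choices. It estimates the normalized trigonometric sums for a sequence known to be uniformly distributed modulo 1:

```python
import math

def weyl_sum(points, m):
    """Normalized trigonometric sum (1/N) * sum_n e(m * x_n), e(t) = exp(2*pi*i*t)."""
    total = sum(complex(math.cos(2 * math.pi * m * x),
                        math.sin(2 * math.pi * m * x)) for x in points)
    return total / len(points)

# The fractional parts {n*sqrt(2)} are uniformly distributed modulo 1,
# so the normalized sums must tend to 0 for every fixed non-zero integer m.
N = 20000
points = [(n * math.sqrt(2)) % 1.0 for n in range(1, N + 1)]
sums = {m: abs(weyl_sum(points, m)) for m in (1, 2, 3)}
```

For a sequence that is not uniformly distributed, such as $x_n=\{n/3\}$, the criterion fails: the sum with $m=3$ equals $1$ identically.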

Many problems related to planar integer lattices (see details in §2.2) can be reduced to the investigation of solutions of the determinant equation

$$\begin{vmatrix} a_1 & a_2\\ b_1 & b_2 \end{vmatrix}=a_1b_2-a_2b_1=P. \tag{1.3}$$

This equation can be replaced by the equivalent congruence

$$a_2b_1+P\equiv 0 \pmod{a}, \tag{1.4}$$

assuming that $a_1=a> 0$ is fixed and that the value of $b_2$ is determined from the equation $b_2=(a_2b_1+P)a^{-1}$.

By using Weyl's criterion the problem of the uniform distribution of solutions of the congruence (1.4) can be reduced to estimates of the sums

Equation (1.5)

where $m,n,q$ are arbitrary integers, $a$ is a positive integer, and $\delta_a$ is the characteristic function of divisibility by $a$:

$$\delta_a(x)=\begin{cases} 1, & x\equiv 0\pmod{a},\\ 0, & x\not\equiv 0\pmod{a}. \end{cases}$$

For $q=-1$ the sums (1.5) coincide with the classical Kloosterman sums

$$K(m,n;a)=\sum_{\substack{1\leqslant x\leqslant a\\ (x,a)=1}} e\biggl(\frac{mx+n\overline{x}}{a}\biggr),\qquad x\overline{x}\equiv 1\pmod{a}. \tag{1.6}$$

Non-trivial estimates are known for the sums (1.6) and (1.5), and this makes it possible to find asymptotic formulae for sums of the form

Equation (1.7)

by replacing them with the corresponding integrals. Problems in the geometry of numbers, the theory of continued fractions, and so on, can be reduced to the calculation of similar sums (see the survey [16]).
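For illustration, the sums (1.6) can be computed directly from the definition. The Python sketch below is ours (the choice of moduli is arbitrary); it checks the value $K(1,1;3)=-1$ and Weil's bound $|K(m,n;p)|\leqslant 2\sqrt{p}$ for a prime modulus $p\nmid mn$:

```python
import cmath
import math

def kloosterman(m, n, a):
    """Classical Kloosterman sum K(m, n; a): sum of e((m*x + n*x_inv)/a)
    over x coprime to a, where x * x_inv == 1 (mod a)."""
    total = 0j
    for x in range(1, a + 1):
        if math.gcd(x, a) == 1:
            x_inv = pow(x, -1, a)  # modular inverse (Python 3.8+)
            total += cmath.exp(2j * cmath.pi * (m * x + n * x_inv) / a)
    return total

K3 = kloosterman(1, 1, 3)    # e(2/3) + e(4/3) = -1
Kp = kloosterman(1, 1, 101)  # prime modulus, to test against the Weil bound
```

The sum $K(1,1;a)$ is real, since the substitution $x\mapsto a-x$ pairs each term with its complex conjugate.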

In the three-dimensional case, bases of lattices with determinant $P$ are parametrized by solutions of the determinant equation (1.2), where $X$ is a matrix of the form

$$X=\begin{pmatrix} a_1 & a_2 & a_3\\ b_1 & b_2 & b_3\\ c_1 & c_2 & c_3 \end{pmatrix}, \tag{1.8}$$

in which the coordinates of the basis vectors are written in columns. In the study of Voronoi–Minkowski three-dimensional continued fractions (see the original publications [17], [18], and [19], as well as their exposition in [20], [21], and [22]), the necessity arises of counting the solutions of (1.2) for which the normalized matrix $\widetilde{X}=XP^{-1/3}$ can vary in a domain $\Omega\subset SL_3(\mathbb{R})$ of infinite measure.

The Linnik–Skubenko method was based on reduction to the preceding dimension — to the determinant equation (1.3). The main idea was that if the matrix $\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$ is fixed and has non-zero determinant $q$, and $(q,P)=1$, then for each solution (1.8) of equation (1.2) it is possible to construct a series of solutions

Equation (1.9)

where $z\in\mathbb{Z}_q^*$, $zz^{-1}\equiv 1\pmod{q}$, and $s,t,u,v$ are arbitrary integers. The presence of the parameter $z$, which is non-linearly involved in the parametrization (1.9), makes it possible to use Kloosterman sums for reducing the problem of the distribution of the solutions of (1.2) to sums of the form (1.7), the methods of calculation of which are well known.

In [23] a more precise version of the Linnik–Skubenko reduction was proposed which is applicable, in particular, for domains $\Omega$ of infinite volume. The auxiliary two-dimensional problems arising after the reduction were solved in [24]. In the present paper the results in [23] and [24] are applied to the study of statistical properties of Voronoi–Minkowski three-dimensional continued fractions. One can expect that the proposed approach will turn out to be useful also for solution of other problems related to three-dimensional lattices.

1.3. Theorems of Heilbronn and Porter

For a rational $r$, let $l(r)$ denote the length of the expansion of $r$ into a finite continued fraction

$$r=[a_0;a_1,\dots,a_l]=a_0+\cfrac{1}{a_1+\cfrac{1}{a_2+\dotsb+\cfrac{1}{a_l}}}\,,$$

where $a_0=\lfloor r\rfloor$ (the integer part of $r$), $a_1,\dots,a_l$ are positive integers, and $a_l\geqslant 2$ for $l\geqslant1$.

Heilbronn [25] proved an asymptotic formula for the average value of $l(r)$ taken over rational numbers $r$ with a fixed denominator:

$$\frac{1}{\varphi(P)}\,\sideset{}{^{*}}\sum_{a=1}^{P} l\biggl(\frac{a}{P}\biggr)=\frac{2\log 2}{\zeta(2)}\,\log P+O(\log^{4}\log P) \tag{1.10}$$

(henceforth an asterisk means that the summation is carried out over the reduced system of residues). Porter later [26] refined this result by isolating the next significant term, which is an absolute constant:

$$\frac{1}{\varphi(P)}\,\sideset{}{^{*}}\sum_{a=1}^{P} l\biggl(\frac{a}{P}\biggr)=\mathscr{Q}_{1}(\log P)+O\bigl(P^{-1/6+\varepsilon}\bigr) \tag{1.11}$$

(we denote by $\mathscr{Q}_m(x)$ a polynomial of degree $m$ in a variable $x$; the constants in the symbols $O$ are always assumed to depend on an arbitrarily small positive number $\varepsilon$). Heilbronn's proof is elementary. Porter used estimates for Kloosterman sums and estimates for trigonometric sums according to van der Corput.

1.4. Brief statement of the main result

In the present paper the refined version in [23] of Linnik–Skubenko reduction is used to construct a method of analysis of minimal bases in three-dimensional lattices. In particular, this method enables us to prove a three-dimensional analogue of Porter's result (1.11) for them.

Theorem 1.1.  The average number of Minkowski bases over totally primitive lattices with determinant $P$ has the asymptotics

Equation (1.12)

A more precise statement of this result will be given below after the definitions of all the requisite notions (see Theorem 3.3 below). From the viewpoint of Linnik's problem, the presence in the asymptotics of a polynomial of second degree in the logarithm of $P$ is explained by the fact that when counting solutions of the equation $\det X=P$ one has to deal with a domain of infinite volume on the variety defined by the equation $\det \widetilde{X}=1$.

A three-dimensional analogue of Heilbronn's theorem (1.10) was proved by Illarionov [27] (see [28]–[30] concerning other multidimensional generalizations). Illarionov's arguments make it possible to determine the leading term in the asymptotic formula (1.12) with remainder $O(\log P\log\log P)$.

Equation (1.11) can be interpreted as a formula for the average length of the Euclidean algorithm applied to a pair of numbers $(a,P)$ such that $1\leqslant a\leqslant P$ and $(a,P)=1$. From the geometric viewpoint, the left-hand side of equation (1.11) can be understood as the average number of minimal bases in lattices with bases from the pair of vectors $(1,a)$ and $(0,P)$. The formula (1.12) describes the average number of minimal bases in three-dimensional lattices generated by the vectors $(1,0,a)$, $(0,1,b)$, and $(0,0,P)$ (see §2.3). The same quantity can be interpreted as the average number of all possible bases that can appear in the Euclidean algorithm applied to a triple $(a,b,P)$ in which $1\leqslant a,b\leqslant P$ and $(a,P)=(b,P)=1$.

1.5. Plan of the paper

In §2 we give a brief survey of results connected with the metric theory of infinite and finite continued fractions.

In §3 we discuss three-dimensional continued fractions according to Voronoi and Minkowski. The precise statement of the main result is given.

In order to illustrate the scheme of the proof of the main theorem, we briefly describe in §4 the main steps needed to solve a model problem — the proof of a simplified version of equation (1.11). The general scheme of arguments in the proof of the main Theorem 3.3 is the same.

In §5 the solutions of equation (1.2) are divided into groups, in each of which the solutions will be counted independently (until a certain moment). At the end of §5.3 we describe the detailed scheme of proof of the main result.

In §§6–8 asymptotic formulae are proved for the number of solutions of equation (1.2) with fixed corner element $a_1$ and with corner minor $q=\begin{vmatrix} a_1 & a_2\\ b_1 & b_2 \end{vmatrix}$, by three different methods.

In §9 we complete the proof of the main Theorem 3.3.

§ 2. Metric properties of continued fractions

2.1. Gauss measure

The metric theory of continued fractions goes back to Gauss' problem on the typical behaviour of numbers of the form

$$\alpha_n=T^{n}(\alpha)=[0;a_{n+1},a_{n+2},\dots],$$

where $\alpha=[0;a_{1},a_{2},\dots]$ is a random number in the interval $[0,1)$ and $T$ is the Gauss map:

$$T(\alpha)=\biggl\{\frac{1}{\alpha}\biggr\}\quad(\alpha\ne 0),\qquad T(0)=0.$$

For a real number $\xi\in[0,1]$, let $F_n(\xi)$ denote the measure of the set of numbers $\alpha\in[0,1)$ for which $\alpha_n \leqslant \xi$. In studying iterations of the map $T$, Gauss arrived at the conjecture that

$$\lim_{n\to\infty}F_n(\xi)=\frac{\log(1+\xi)}{\log 2} \tag{2.1}$$

(this is known from the correspondence of Gauss with Laplace; see [31], Chap. 3). Kuz'min [32] obtained the asymptotic formula

$$F_n(\xi)=\frac{\log(1+\xi)}{\log 2}+O\bigl(e^{-\lambda\sqrt{n}}\bigr),\qquad \lambda>0,$$

from which Gauss' conjecture follows. Kuz'min's result was refined by Lévy [33] and Wirsing [34]. The definitive solution of Gauss' problem is due to Babenko [35]. He proved the existence of an infinite sequence of numbers $\lambda_j$ decreasing to zero,

and a corresponding sequence of analytic functions $\psi_k(\xi)$ such that

$$F_n(\xi)=\frac{\log(1+\xi)}{\log 2}+\sum_{k=2}^{\infty}\lambda_k^{\,n}\,\psi_k(\xi). \tag{2.2}$$

Equation (2.1) means that the typical behaviour of the numbers $x_n=T^n(x)$ is described by the Gauss measure

$$\mu(A)=\frac{1}{\log 2}\int_{A}\frac{d\alpha}{1+\alpha}\,,$$

which is invariant under the map $T$. In particular, this implies that the probabilities of the appearance of positive integers $k$ as partial quotients of real numbers are described by the Gauss–Kuz'min distribution

$$p_k=\frac{1}{\log 2}\,\log\biggl(1+\frac{1}{k(k+2)}\biggr),\qquad k=1,2,\dots\,. \tag{2.3}$$
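The distribution (2.3) and the invariance of the Gauss measure under $T$ can be checked numerically. In the Python sketch below (ours; the truncation parameters are arbitrary), $G(x)=\log(1+x)/\log 2$ is the distribution function of the Gauss measure, and the preimage $T^{-1}([0,\xi])=\bigcup_{k\geqslant 1}[1/(k+\xi),1/k]$ must again have measure $G(\xi)$:

```python
import math

def G(x):
    """Distribution function of the Gauss measure: G(x) = log(1 + x) / log 2."""
    return math.log1p(x) / math.log(2)

def gauss_kuzmin(k):
    """Probability of the partial quotient k; it equals G(1/k) - G(1/(k+1))."""
    return math.log2(1 + 1 / (k * (k + 2)))

def preimage_measure(xi, K=200000):
    """Gauss measure of T^{-1}([0, xi]) = union of [1/(k + xi), 1/k] over k >= 1,
    truncated at k = K; invariance of the measure means this equals G(xi)."""
    return sum(G(1 / k) - G(1 / (k + xi)) for k in range(1, K + 1))
```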

In many cases it is more convenient to consider the extended Gauss measure

$$d\overline{\mu}(\alpha,\beta)=\frac{1}{\log 2}\,\frac{d\alpha\,d\beta}{(1+\alpha\beta)^{2}}\,,\qquad(\alpha,\beta)\in[0,1)^{2}, \tag{2.4}$$

which is invariant under the map

$$\overline{T}(\alpha,\beta)=\biggl(\biggl\{\frac{1}{\alpha}\biggr\},\,\frac{1}{\lfloor 1/\alpha\rfloor+\beta}\biggr)$$

(see [36]), which is almost everywhere invertible. If we expand the coordinates of the initial point $(\alpha_0,\beta_0)$ into continued fractions

$$\alpha_0=[0;a_0,a_1,a_2,\dots],\qquad \beta_0=[0;a_{-1},a_{-2},\dots], \tag{2.5}$$

then on the doubly infinite sequence $(\dots,a_{-2},a_{-1},a_0, a_1,a_2,\dots)$ obtained by concatenation of the expansions (2.5) written in opposite directions the map $\overline T$ is equivalent to a shift: $\overline T^{n}(\alpha_0,\beta_0)=(\alpha_{n},\beta_{n})$, where $n$ is an arbitrary integer, $\alpha_n=[0;a_n,a_{n+1},\dots]$, and $\beta_n=[0;a_{n-1},a_{n-2},\dots]$.
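With the normalizing factor $1/\log 2$, (2.4) is a probability measure on the unit square: $\frac{1}{\log 2}\int_0^1\!\int_0^1\frac{d\alpha\,d\beta}{(1+\alpha\beta)^2}=1$. The Python sketch below (ours; it assumes this standard density for (2.4), and the quadrature grid is an arbitrary choice) confirms the normalization by a midpoint rule:

```python
import math

def density(alpha, beta):
    """Assumed density of the extended Gauss measure on the unit square."""
    return 1 / (math.log(2) * (1 + alpha * beta) ** 2)

def total_mass(n=400):
    """Midpoint-rule approximation of the integral of the density over [0,1]^2."""
    h = 1 / n
    return sum(density((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h
```

The inner integral can be evaluated in closed form, $\int_0^1\int_0^1(1+\alpha\beta)^{-2}\,d\beta\,d\alpha=\log 2$, which is exactly cancelled by the normalizing factor.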

2.2. Statistical properties of finite continued fractions

A discrete version of Gauss' problem is to study the statistical properties of finite continued fractions. For rational numbers (as well as for real numbers) it is convenient to interpret the Gauss–Kuz'min statistics in a wider sense. For real $\xi,\eta\in[0,1]$ and rational $r$, we define the Gauss–Kuz'min statistics by the equation

$$l_{\xi,\eta}(r)=\sum_{n=1}^{l}\bigl[\,[0;a_{n+1},\dots,a_{l}]\leqslant\xi\,\bigr]\cdot\bigl[\,[0;a_{n},a_{n-1},\dots,a_{1}]\leqslant\eta\,\bigr]. \tag{2.6}$$

We assume that an empty continued fraction is equal to zero by definition. If $A$ is some condition, then $[A]$ is the characteristic function of the set defined by this condition: $[A]=1$ if the condition holds, and $[A]=0$ otherwise. In particular, $l_{1,1}(r)=l(r)$.

The distribution of the partial quotients in the expansion of numbers $a/b$ in the case where $1\leqslant a\leqslant b\leqslant P$ and $P\to\infty$ was first studied in 1961 by Lochs (see [37], as well as [25], [38]). Later this problem was posed in a more general setting by Arnold as Problem 1993-11 in [39] (the papers [40]–[45] were devoted to Arnold's problem). Lochs' result can be interpreted as follows: the asymptotic formula

Equation (2.7)

holds for the average value of the Gauss–Kuz'min statistics (2.6) in which the leading coefficient is proportional to the Gauss measure of the rectangle $[0,\xi]\times [0,\eta]$.

The function $C(\xi,\eta)$, as well as the left-hand side of equation (2.7), is discontinuous at all the points that have at least one rational coordinate. This function is defined by a singular series, that is, a series consisting of the remainder terms of asymptotic formulae (see [37], [44], [46], [47]).

Equation (1.11) can also be generalized to the case of the Gauss–Kuz'min statistics:

$$\frac{1}{\varphi(P)}\,\sideset{}{^{*}}\sum_{a=1}^{P} l_{\xi,\eta}\biggl(\frac{a}{P}\biggr)=\frac{2\log(1+\xi\eta)}{\zeta(2)}\,\log P+\widetilde{C}(\xi,\eta)+O\bigl(P^{-1/6+\varepsilon}\bigr) \tag{2.8}$$

(for $\eta=1$ the proof is given in [43]; the formula for the principal term follows from Heilbronn's result [25]; the functions $C(\xi,\eta)$ and $\widetilde{C}(\xi,\eta)$ can be expressed in terms of each other). By comparing (2.2) with (2.7) (or with (2.8)) we can conclude that the principal terms in the continuous and discrete problems are proportional and are determined by an invariant measure, while the next significant terms differ and are of a fundamentally different nature.
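The proportionality of the principal terms can also be seen experimentally: the empirical frequency of a fixed partial quotient over all fractions $a/P$ is close to the corresponding Gauss–Kuz'min probability. A Python sketch (ours; the modulus $P=1009$ and the loose tolerance are illustrative):

```python
import math

def partial_quotients(num, den):
    """Partial quotients a1, ..., al of num/den for 0 < num < den (a0 = 0 is dropped)."""
    q = []
    while den:
        q.append(num // den)
        num, den = den, num % den
    return q[1:]

P = 1009
digits = []
for a in range(1, P):
    digits.extend(partial_quotients(a, P))
freq_one = digits.count(1) / len(digits)
predicted = math.log2(4 / 3)  # Gauss-Kuzmin probability of the partial quotient 1
```

The agreement is only approximate for fixed $P$: boundary effects (for instance, $a_l\geqslant 2$) enter the second-order term, which is exactly the phenomenon discussed above.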

The analytic apparatus used in the study of statistical properties of finite continued fractions makes it possible to solve also other problems in which continued fractions are used as an auxiliary tool.

In particular, this makes it possible

$\bullet$ to analyse the typical behaviour of the Euclidean algorithms with rounding off to the nearest integer [48], [49], with even and odd partial quotients [50]–[52], with by-excess division [53], [54], and with subtractive division [55], [56], and to analyse the typical behaviour of Minkowski diagonal fractions [57] and continued fractions of more general form [58], as well as to study continued fraction expansions of quadratic irrationals [45];

$\bullet$ to find the distribution density in Sinai billiards (circular scatterers are placed at nodes of the lattice $\mathbb{Z}^2$) of the random variable equal to the length of the free path of a particle [59]–[62], and (in an equivalent problem) to calculate the joint distribution density of the lengths of neighbouring segments connecting the origin with primitive points of the integer lattice [63], [64];

$\bullet$ to describe the limit distribution of Frobenius numbers with three arguments [65]–[69];

$\bullet$ to prove the existence of a limit distribution density for partial Gauss sums and partial theta-series [70]–[72], and also to find these densities [73].

It should be pointed out that for all the problems listed above there exist other approaches based on ergodic theory and methods of the geometry of numbers. Ergodic methods usually turn out to be applicable in a more general situation, but in comparison with analytic methods they require averaging over a greater number of parameters and produce less accurate remainder terms.

In particular, by using ergodic theory it is possible

$\bullet$ to analyse a wide class of Euclidean algorithms [74]–[76];

$\bullet$ to study Sinai billiards in spaces of arbitrary dimension [77], [78], and to study the behaviour of the free path of particles in quasicrystalline structures [79];

$\bullet$ to prove the existence of a limit distribution density for Frobenius numbers with an arbitrary number of arguments [80], and to describe the properties of this density [81]–[83];

$\bullet$ to obtain results on the behaviour of partial Gauss sums and partial theta-series [84], [85].

2.3. Totally primitive lattices

Let $M(v_1,\dots,v_s)$ denote the matrix in which the coordinates of the vectors $v_1,\dots,v_s$ are written in the columns. A full lattice $\Lambda\subset\mathbb{Z}^s$ with basis $(v_1,\dots,v_s)$ is said to be totally primitive if for every row of the matrix $M(v_1,\dots,v_s)$ the minors corresponding to the elements of this row are setwise coprime. For example, for a matrix $\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$ this condition means that $(a_1,a_2)=(b_1,b_2)=1$, that is, for $s=2$ the notions of primitive and totally primitive lattice coincide. If $s=3$, then for a basis matrix of the form (1.8) the condition of being totally primitive is written in the form

$$(b_2c_3-b_3c_2,\ b_1c_3-b_3c_1,\ b_1c_2-b_2c_1)=1, \tag{2.9}$$

$$(a_2c_3-a_3c_2,\ a_1c_3-a_3c_1,\ a_1c_2-a_2c_1)=1, \tag{2.10}$$

$$(a_2b_3-a_3b_2,\ a_1b_3-a_3b_1,\ a_1b_2-a_2b_1)=1. \tag{2.11}$$

An equivalent definition of a totally primitive lattice is obtained if the lattice is required to have a basis with a matrix

$$\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ a & b & P \end{pmatrix}, \tag{2.12}$$

where $0\leqslant a, b<P$ and $(a,P)=(b,P)=1$. In particular, this implies that there exist $\varphi^{2}(P)$ three-dimensional totally primitive lattices with determinant $P$.
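The equivalence of the two definitions can be checked mechanically for small determinants. In the Python sketch below (ours; all helper names are ad hoc), `totally_primitive` tests the row-wise coprimality of minors, and the count over the matrices (2.12) recovers $\varphi^{2}(P)$:

```python
from functools import reduce
from math import gcd

def minor(M, i, j):
    """2x2 minor of the 3x3 matrix M obtained by deleting row i and column j."""
    rows = [row for k, row in enumerate(M) if k != i]
    cols = [c for c in range(3) if c != j]
    return rows[0][cols[0]] * rows[1][cols[1]] - rows[0][cols[1]] * rows[1][cols[0]]

def totally_primitive(M):
    """True if, for every row, the minors attached to its elements are setwise coprime."""
    return all(reduce(gcd, (abs(minor(M, i, j)) for j in range(3))) == 1
               for i in range(3))

def basis_matrix(a, b, P):
    """Basis matrix with columns (1, 0, a), (0, 1, b), (0, 0, P), as in (2.12)."""
    return [[1, 0, 0], [0, 1, 0], [a, b, P]]
```

For this family of matrices the three row gcds reduce to $(P,a)$, $(P,b)$, and $1$, so total primitivity is equivalent to $(a,P)=(b,P)=1$.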

From the viewpoint of the theory of Diophantine approximations, the local minima of a lattice with basis matrix (2.12) are the best approximations of the linear form $x_1\dfrac{a}{P}+x_{2}\dfrac{b}{P}+x_3$. All that was said above about primitive three-dimensional lattices can be extended in obvious fashion to the case of arbitrary dimension $s\geqslant 3$.

2.4. Multidimensional analogues of the Gauss measure

The Gauss measure is a special case of a more general construction. The set of bases in $s$-dimensional lattices can be identified with the set of matrices $GL_s(\mathbb{R})$. The definitions of local minima and minimal systems of vectors (see §3.1) are independent of the choice of scales on the coordinate axes, and therefore for studying the properties of minimal bases it is natural to consider the quotient space $\mathscr{X}_s= D_s(\mathbb{R}) \setminus GL_s(\mathbb{R})$, where $D_s(\mathbb{R})$ is the group of diagonal invertible $s\times s$ matrices with real coefficients. The bi-invariant Haar measure on $GL_s(\mathbb{R})$

where $g\in GL_s(\mathbb{R})$ and $dg=dg_{11}\,dg_{12}\cdots dg_{ss}$ is the Lebesgue measure, induces on $\mathscr{X}_s$ the quotient measure $\mu$, which is a right Haar measure and which remains invariant under the left action of $D_s(\mathbb{R})$:

$$\mu(h\,A\,g)=\mu(A)\qquad\text{for every measurable set } A\subset\mathscr{X}_s$$

for any $h\in D_s(\mathbb{R})$ and $g\in G=GL_s(\mathbb{R})$.

Setting $\overline g=(\overline g_{ij})\in \mathscr{X}_s$, where $\overline g_{ij}=g_{ij}g_{ii}^{-1}$ for $1\leqslant i,j\leqslant s$, we can define the measure $\mu$ in the chart $\overline g_{11}=\overline g_{22}=\cdots=\overline g_{ss}=1$ by

$$\mu(d\overline{g}\,)=\frac{1}{|\det\overline{g}\,|^{\,s}}\,\prod_{\substack{1\leqslant i,j\leqslant s\\ i\ne j}}d\overline{g}_{ij}\,. \tag{2.13}$$

Then the right invariance of $\mu$ follows from the formula

$$\frac{dg}{|\det g|^{s}}=\frac{dg_{11}}{g_{11}}\,\frac{dg_{22}}{g_{22}}\cdots\frac{dg_{ss}}{g_{ss}}\;\mu(d\overline{g}\,),$$

in which $\dfrac{dg_{11}}{g_{11}}\,\dfrac{dg_{22}}{g_{22}}\cdots \dfrac{dg_{ss}}{g_{ss}}$ is the Haar measure on $D_s(\mathbb{R})$.

For $s=2$ the measure $\mu$, taken on the space of matrices of the form $\begin{pmatrix} 1 & -\alpha\\ \beta & 1\end{pmatrix}$ (normalized Voronoi matrices), coincides up to the normalizing factor $1/\log 2$ with the extended Gauss measure (2.4). For $s=3$ the measure $\mu$ arises in the study of the statistical properties of Klein polyhedra (see [86], [87]) and Voronoi–Minkowski continued fractions (see [27]). An essential difference between these two objects is the fact that Klein polyhedra are parametrized by the points of the whole space $\mathscr{X}_s$, the measure of which is infinite, while to non-degenerate minimal systems of vectors (that is, systems whose matrices have non-zero determinant; see §3) in the space $\mathscr{X}_s$ there corresponds a domain of finite measure $\mu$: if a matrix $g\in GL_s(\mathbb{R})$ with diagonal dominance (in each row the absolute values of the non-diagonal elements do not exceed the absolute value of the diagonal element) defines a non-degenerate minimal system of vectors of a lattice $\Lambda$, then by Minkowski's convex body theorem, $g_{11}g_{22}\cdots g_{ss}\leqslant\det\Lambda$ and

§ 3. Continued fractions and lattices

3.1. Geometry of continued fractions

There are two geometric interpretations of classical continued fractions admitting a natural generalization to the multidimensional case. In the first, due to Klein (see [88], [89], and also an earlier remark of Smith [90], pp. 146–147), a continued fraction is identified with the convex hull (the Klein polygon) of the points of the integer lattice that lie in two adjacent angles. The second interpretation, proposed independently by Voronoi and Minkowski (see [17], [18] and [19], [91], as well as the reiteration of the original results in [20], [21], and [22]), is based on the use of local minima of lattices, minimal systems, and extremal parallelepipeds (see the definitions below). In planar lattices the vertices of Klein polygons (after a linear transformation taking the sides of the angles to coordinate axes) can be identified with Voronoi local minima. But the geometric constructions of Klein and Voronoi–Minkowski become different starting from dimension $3$ (see [92], [93]).

We recall the requisite definitions going back to Voronoi and Minkowski. A lattice $\Lambda\subset\mathbb{R}^s$ is said to be irreducible (or a lattice of general position) if the coordinate hyperplanes do not contain nodes of the lattice other than the origin; in the opposite case the lattice is said to be reducible. The set of full $s$-dimensional lattices (that is, lattices of dimension coinciding with the dimension of the space) is denoted by $\mathscr{L}_s(\mathbb{R})$, and the subset of it consisting of irreducible lattices by $\mathscr{L}^*_s(\mathbb{R})$.

For a non-empty finite set $A\subset\mathbb{R}^s$, we put

In other words, $\operatorname{Box}(A)$ is the smallest parallelepiped circumscribed around the set $A$ (we consider only parallelepipeds with centre at the origin and with faces parallel to the coordinate planes).

A system of nodes of order $r$ of a lattice $\Lambda$ (not necessarily a full lattice) is defined to be any finite $r$-tuple $(v_1,\dots,v_r)$ of non-zero nodes of $\Lambda$ in which $v_i\ne\pm v_j$ ($1\leqslant i<j\leqslant r$). With an arbitrary system $S=(v_1,\dots,v_r)$ we associate the matrix $M(v_1,\dots,v_r)$ by writing the coordinates of the vectors $v_1,\dots,v_r$ in the columns.

A node $\gamma$ of a lattice $\Lambda\in\mathscr{L}_s(\mathbb{R})$ is called a Voronoi relative (local) minimum of $\Lambda$ (henceforth, simply a minimum) if the parallelepiped $\overline{\operatorname{Box}}(\gamma)$ does not contain nodes of $\Lambda$ other than its own vertices and the origin (see [18]). The set of all local minima of $\Lambda$ is denoted by $\mathfrak{M}(\Lambda)$. If $\Lambda$ has several minimal vectors $v_1,\dots,v_k$ such that $|v_i|=|v_j|$ ($1\leqslant i<j\leqslant k$), then we agree to include in $\mathfrak{M}(\Lambda)$ only one of these vectors.

A system $S$ of vectors of the lattice $\Lambda$ is said to be minimal if the parallelepiped $\operatorname{Box}(S)$ does not contain nodes of $\Lambda$ other than the origin. In particular, for irreducible lattices the notion of a minimal system of order $1$ coincides with the notion of a local minimum. For reducible lattices the definition of a minimal system has to be made more precise (see [94]).

In the two-dimensional case we introduce on the set $\mathfrak{M}(\Lambda)$ of local minima the structure of a sequence

$$\dots,\,v_{-2},\,v_{-1},\,v_{0},\,v_{1},\,v_{2},\,\dots, \tag{3.1}$$

in which the vectors $v_n=((-1)^nx_n,y_n)$ ($x_n,y_n>0$) are ordered by decrease of the first coordinate: $y_{n+1}>y_n$, $x_{n+1}<x_n$. Here every minimal pair of vectors has the form $(v_n,v_{n+1})$ (that is, consists of neighbouring local minima) and is a basis of the lattice $\Lambda$ (see [18]); such pairs are called Voronoi bases.

By considering the normalized matrices

$$\begin{pmatrix} 1 & -\alpha_n\\ \beta_n & 1 \end{pmatrix},\qquad \alpha_n=\frac{x_n}{x_{n-1}}\,,\quad \beta_n=\frac{y_{n-1}}{y_n}\,, \tag{3.2}$$

we deduce that from the geometric viewpoint the extended Gauss map $\overline T(\alpha_n,\beta_n)=(\alpha_{n+1},\beta_{n+1})$ means transition from the Voronoi basis $(v_{n-1},v_{n})$ to the adjacent basis $(v_n,v_{n+1})$. Therefore, we can say that the extended Gauss measure (2.4) describes the typical behaviour of normalized Voronoi bases.

With a rational number $r=a/P$ such that $0\leqslant a< P$ and $(a,P)=1$ it is natural to associate the lattice $\Lambda(r)$ with basis matrix $\begin{pmatrix} 1&0\\ a&P \end{pmatrix}$. Obviously, the map $r\to\Lambda(r)$ establishes a one-to-one correspondence between fractions of the form $a/P$ with $0\leqslant a< P$ and $(a,P)=1$, and the two-dimensional primitive lattices with determinant $P$. For $a\leqslant P/2$ the set of all local minima of the lattice $\Lambda(a/P)$ coincides with the set of vertices of convex hulls of points of the lattice $\Lambda(r)$ that lie in coordinate quadrants I and II. For $a/P=[0;a_1,\dots,a_l]\leqslant 1/2$ the local minima have the form

$$v_j=\bigl(y_j,\,(-1)^{j}x_j\bigr),\qquad -1\leqslant j\leqslant l,$$

where the sequences $\{x_j\}$ and $\{y_j\}$ are defined by

$$y_{-1}=0,\quad y_0=1,\quad y_{j+1}=a_{j+1}y_j+y_{j-1};\qquad x_{-1}=P,\quad x_0=a,\quad x_{j+1}=x_{j-1}-a_{j+1}x_j.$$

(For $l=-1$ the fraction $[0;a_1,\dots,a_l]$ is assumed to be equal to $1/0$ by definition.) Here the set of Voronoi bases coincides with the set of pairs $(v_j,v_{j+1})$, where $0\leqslant j\leqslant l$. For $a>P/2$ the coordinates of the local minima are determined in similar fashion from the continued fraction expansion of the number $(P-a)/P$.
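Voronoi's description can be observed by brute force. The Python sketch below (ours; the sample lattice $\Lambda(5/12)$ and the search window are illustrative choices) finds the local minima of a planar lattice directly from the definition and checks that neighbouring minima form bases, that is, that each pair has determinant $\pm P$:

```python
def nodes(a, P, bound):
    """Nodes m*(1, a) + n*(0, P) with both coordinates bounded by `bound`."""
    pts = []
    for m in range(-bound, bound + 1):
        for n in range(-2 * bound, 2 * bound + 1):
            x, y = m, m * a + n * P
            if abs(x) <= bound and abs(y) <= bound and (x, y) != (0, 0):
                pts.append((x, y))
    return pts

def local_minima(a, P):
    """Local minima with x >= 0 (one representative of each pair +-v):
    no other node lies in the closed box [-|x|, |x|] x [-|y|, |y|]."""
    pts = nodes(a, P, P)
    result = []
    for v in pts:
        if v[0] < 0 or (v[0] == 0 and v[1] < 0):
            continue  # skip the mirror image -v
        neg = (-v[0], -v[1])
        if not any(w != v and w != neg
                   and abs(w[0]) <= abs(v[0]) and abs(w[1]) <= abs(v[1])
                   for w in pts):
            result.append(v)
    return sorted(result)  # ordered by increasing first coordinate

minima = local_minima(5, 12)
dets = [abs(u[0] * v[1] - u[1] * v[0]) for u, v in zip(minima, minima[1:])]
```

For $5/12=[0;2,2,2]$ the minima run from $(0,12)$ down to $(12,0)$, and every pair of neighbours spans the whole lattice of determinant $12$.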

3.2. Minkowski bases

Let $\Pi=[0,\xi]\times [0,\eta]\subset[0,1]^2$, and let $N_\Pi(P)$ denote the number of Voronoi bases $(v_{n-1},v_n)$ (in all the primitive lattices with determinant $P$) for which the coefficients of the normalized matrix (3.2) satisfy the condition $(\alpha_n,\beta_n)\in\Pi$. Then (2.8) can be rewritten in the form

where

Let $G_s$ denote the group generated by the following elementary transformations acting on the set of $s\times s$ matrices:

(i) permutations of columns and multiplication of columns by $-1$ (renumbering of the basis vectors and changing their orientation);

(ii) permutations of rows and multiplication of rows by $-1$ (renaming coordinate axes and changing their directions).

Two $s\times s$ matrices are regarded as equivalent if they are taken one to the other by the action of the group $G_s$.

As noted above, the purpose of this paper is to develop analytic methods that enable us to prove a three-dimensional analogue of equation (2.8). The classification of minimal triples of vectors becomes somewhat more difficult in the three-dimensional case. A complete description of minimal triples in lattices of general position is given by the following result of Minkowski.

Theorem 3.1. (Minkowski) Let $S=(v_1,v_2,v_3)$ be a minimal system of a lattice $\Lambda\in\mathscr{L}_3^*$. If the system $S$ is non-degenerate, then it is a basis of $\Lambda$, and the matrix $M(v_1,v_2,v_3)$ is equivalent to one of the two canonical forms

Equation (3.3)

Equation (3.4)

But if the system $S$ is degenerate, then $v_1\pm v_2\pm v_3=0$ for some combination of signs, and the matrix $M(v_1,v_2,v_3)$ can be reduced by the action of the group $G_3$ to the form

Equation (3.5)

(In all three cases it is assumed that $x_i,y_i,z_i\geqslant 0$ and the basis matrix has diagonal dominance: $x_2,x_3\leqslant x_1$, $y_1,y_3\leqslant y_2$, and $z_1,z_2\leqslant z_3$.)

The converse is also true: a system of three vectors $(v_1,v_2,v_3)$ with matrix equivalent to one of the matrices of the form (3.3) or (3.4) is a minimal system of the full lattice $\Lambda=\langle v_1,v_2,v_3\rangle$; a system of vectors $(v_1,v_2,v_3)$ with matrix of the form (3.5) is a minimal system of the rank-$2$ lattice $\Lambda=\langle v_1,v_2,v_3\rangle=\langle v_1,v_2\rangle$.

Minkowski stated this theorem without proof (see [19], [91]). A detailed proof can be found in [22], papers 109–110 (see also [95]–[98]).

In accordance with Theorem 3.1, minimal systems with matrices equivalent to (3.3) or (3.4) are called Minkowski bases of type I or II, respectively.

For reducible lattices the classification of minimal systems becomes more difficult (see [94]). But it is Minkowski bases that are of main interest, since any local minimum $v$ can be supplemented to form a Minkowski basis by extending (in a lattice of general position that is close to the given lattice $\Lambda$) the parallelepiped $\operatorname{Box}(v)$ along the coordinate axes.

3.3. Three-dimensional continued fractions

For an arbitrary $T\subset \mathbb{R}^3$ let $T'=T$ if $0\notin T$ and $T'=T\setminus\{0\}$ if $0\in T$, and let

With each discrete set $T\subset \mathbb{R}^3$ ($T\ne \{0\}$) we associate the orthogonal surface $\mathscr{P}(T)$ that is defined as the boundary of the set

The Voronoi–Minkowski three-dimensional continued fraction associated with a lattice $\Lambda$ is defined as the orthogonal surface $\mathscr{P}(\Lambda)$.

The fact that $S$ is discrete implies that the surface $\mathscr{P}(S)$ has only finitely many vertices inside any bounded set. The set of concave vertices of $\mathscr{P}(S)$ coincides with $\mathfrak{M}(S)$ — the set of local minima. Corresponding to each convex vertex of $\mathscr{P}(S)$ is an extremal parallelepiped — a parallelepiped of the form $\operatorname{Box}(v_1,v_2,v_3)$, where $v_1,v_2,v_3$ are local minima that lie strictly inside three mutually perpendicular faces of $\operatorname{Box}(v_1,v_2,v_3)$. In other words, the extremal parallelepiped is characterized by the fact that its dimensions cannot be increased in such a way that it remains free of points in the set $S$.

Example 3.2.  Let $S$ consist of non-zero nodes of the lattice $\Lambda=\langle e_1,e_2,e_3\rangle$, where $e_1=(0,5,0)$, $e_2=(0,0,5)$, and $e_3=(1,1,2)$. Then $\mathfrak{M}(S)=\{e_1,e_2,e_3,e_4,e_5\}$, where $e_4=(2,2,-1)$ and $e_5=(5,0,0)$. The surface $\mathscr{P}(S)$ is depicted in Fig. 1.
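Example 3.2 can be verified by brute force. The Python sketch below is ours; the enumeration ranges are chosen large enough that every lattice node with coordinates bounded by $5$ is covered, which suffices for checking all the relevant boxes:

```python
E1, E2, E3 = (0, 5, 0), (0, 0, 5), (1, 1, 2)

def node(a, b, c):
    """The lattice node a*E1 + b*E2 + c*E3."""
    return tuple(a * E1[i] + b * E2[i] + c * E3[i] for i in range(3))

# All non-zero nodes with |coordinates| <= 5; the ranges for a, b, c are
# large enough that no such node is missed.
pts = [p for p in (node(a, b, c)
                   for a in range(-4, 5)
                   for b in range(-4, 5)
                   for c in range(-7, 8))
       if p != (0, 0, 0) and max(abs(t) for t in p) <= 5]

def is_local_minimum(v):
    """True if no node other than 0 and +-v lies in the closed box of v."""
    neg = tuple(-t for t in v)
    return not any(w != v and w != neg
                   and all(abs(w[i]) <= abs(v[i]) for i in range(3))
                   for w in pts)

def normalize(v):
    """Pick one representative of the pair +-v."""
    for t in v:
        if t:
            return v if t > 0 else tuple(-x for x in v)

minima = {normalize(v) for v in pts if is_local_minimum(v)}
```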

Figure 1. The surface $\mathscr{P}(S)$ from Example 3.2

For describing the polyhedron $\mathscr{P}(S)$ we define the Minkowski–Voronoi complex $\operatorname{MV}(S)$ as the two-dimensional complex whose vertices are the extremal parallelepipeds, whose edges are the pairs of the form $(\operatorname{Box}(\gamma_1,\gamma_2,\gamma_3), \operatorname{Box}(\gamma_2,\gamma_3,\gamma_4))$, and whose faces are the local minima $\gamma_0$ surrounded by chains of edges of the form

(see Fig. 2).

Figure 2. Example of the surface $\mathscr{P}(S)$ and the corresponding complex $\operatorname{MV}(S)$

If $S$ is a set of general position, then $\mathscr{P}(S)$ has a more regular structure. In this case it is natural to define two mutually dual planar graphs — the Voronoi graph $\operatorname{Vor}(S)$ and the Minkowski graph $\operatorname{Min}(S)$. The vertices, edges, and faces of the Voronoi (Minkowski) graph are assumed to be, respectively, the vertices (faces), edges, and faces (vertices) of the complex $\operatorname{MV}(S)$.

Figure 3. Example of a Voronoi graph

The graphs $\operatorname{Vor}(S)$ and $\operatorname{Min}(S)$ can be depicted on the surface $\mathscr{P}(S)$ by the following rules. The vertices of $\operatorname{Vor}(S)$ are the peaks (convex vertices) of $\mathscr{P}(S)$, the edges are pairs of convex edges of $\mathscr{P}(S)$ (all the vertices of $\operatorname{Vor}(S)$ have degree $3$), and the faces are the domains formed after erasing the local minima and the edges going out from them (see Fig. 3). The vertices of $\operatorname{Min}(S)$ are the local minima (concave vertices of $\mathscr{P}(S)$), and each face is a triangle whose edges connect three local minima on the surface of some extremal parallelepiped ($\operatorname{Min}(S)$ is a triangulation of the plane; the concave edges of $\mathscr{P}(S)$ can be regarded as part of the edges of $\operatorname{Min}(S)$; see Fig. 4).

Figure 4. Example of a Minkowski graph

Figure 5. The Voronoi graph and its canonical diagram

Figure 6. Geometric meaning of the directions on the canonical diagram

Figure 7.

The edges of each of the graphs $\operatorname{Min}(S)$ and $\operatorname{Vor}(S)$ are in a one-to-one correspondence with the saddle vertices of $\mathscr{P}(S)$.

It is convenient to depict the Voronoi graph on the plane $x+y+z=0$ in the form of the canonical diagram — a graph whose edges are segments in three fixed directions (see Fig. 5).

The canonical diagram preserves information about the mutual disposition of extremal parallelepipeds. Let $\gamma_i=(\pm x_i,\pm y_i,\pm z_i)$ ($i=1,2,3$) and suppose that the matrix $M(\gamma_1,\gamma_2,\gamma_3)$ has diagonal dominance. Suppose that we pass from the extremal parallelepiped $\operatorname{Box}(\gamma_1,\gamma_2,\gamma_3)$ to the adjacent parallelepiped by moving along the canonical diagram in the `Eastern' direction (the direction $1$ in Fig. 6). Then the movement takes place along the edge with label $(x_2>x_3)$, and the adjacent sector (see Fig. 6 on the right) is denoted by $(\gamma_2,\gamma_3)$. This means that such a passage is possible only if $x_2>x_3$, and the adjacent parallelepiped has the form $\operatorname{Box}(\gamma_1',\gamma_2,\gamma_3)$. In particular, this implies that there exist $8$ types of local structure of vertices of the canonical diagram (of the two radii that cut out each of the three grey sectors, exactly one is chosen) (see Fig. 7).

The adjacent sectors in Fig. 6 on the left have labels $x^\downarrow$ and $y^\uparrow$; therefore, the linear dimensions of the parallelepiped $\operatorname{Box}(\gamma_1',\gamma_2,\gamma_3)$ compared with $\operatorname{Box}(\gamma_1,\gamma_2,\gamma_3)$ are smaller in the first coordinate, larger in the second, and the same in the third.

By choosing in a special way the orientation and colouring of the edges, one can introduce the structure of Schnyder trees on the Voronoi graph (see [99]). This makes it possible to depict any finite subgraph of $\operatorname{Vor}(S)$ in the form of a canonical diagram with preservation of the mutual disposition of the vertices (for any two convex vertices of the surface $\mathscr{P}(S)$, their coordinates in space are connected by the same inequalities as the coordinates of the corresponding vertices of the canonical diagram on the plane $x+y+z=0$). More detailed information about Voronoi and Minkowski graphs can be found in [100].

An interesting problem is to find necessary and sufficient conditions for a graph satisfying the obvious properties of a canonical diagram to be in fact the canonical diagram of the Voronoi graph of some lattice. It is also unknown whether an infinite Voronoi graph can always be depicted in the form of a canonical diagram with preservation of the mutual disposition of the vertices in such a way that no limit points appear. One can conjecture that this is always possible at least for the periodic Voronoi graphs corresponding to totally real cubic fields.

For a given lattice $\Lambda$, the process of constructing the three-dimensional continued fraction of the surface $\mathscr{P}(\Lambda)$ consists in successively constructing elements of $\mathscr{P}(\Lambda)$ — the minimal triples of vectors that form the faces of $\operatorname{Min}(\Lambda)$. Some triples can be degenerate (see Theorem 3.1), but such a situation cannot arise too often: faces of $\operatorname{Min}(\Lambda)$ corresponding to degenerate triples cannot be adjacent (see [22], paper 11, Theorem 3.3). To find an initial Minkowski basis, we use methods of Voronoi (see [20], §59). At each of the next steps, we calculate the adjacent triples for a given Minkowski basis. If a triple turns out to be degenerate, then the adjacent triples are already bases (all the transition formulae can be written out explicitly; see [19], and also [22], pp. 402–405). In view of Theorem 3.1, three-dimensional continued fractions can be interpreted as a dynamical system with two-dimensional time, an invariant measure (2.13), and a phase space consisting of the matrices of $D_3(\mathbb{R})\setminus GL_3(\mathbb{R})$ equivalent to the Minkowski matrices (3.3) or (3.4). The algorithm for successively finding the Minkowski bases can also be applied for integer lattices, by passing to infinitesimally close lattices of general position when coordinates coincide.

The construction of three-dimensional continued fractions described above was initially proposed (in different forms) by Voronoi and Minkowski as a tool for finding fundamental units in totally real cubic fields. Corresponding to the rings of integers in such fields are lattices $\Lambda$ for which $\mathscr{P}(\Lambda)$ has a doubly periodic structure; thus, the problem of finding fundamental units reduces to finding minimal periods of $\mathscr{P}(\Lambda)$ (see [20]). For example, let $\theta=2\cos(2\pi/7)$, $\theta'=2\cos(4\pi/7)$, and $\theta''=2\cos(6\pi/7)$ be the roots of the cubic equation $\theta^3+\theta^2-2\theta-1=0$. We can choose the triple $\{1,\theta,\theta^2\}$ as a basis of the ring of integers of the field $\mathbb{Q}(\theta)$. Then to the fundamental units $\theta$, $-1-\theta$ there correspond two independent periods of $\mathscr{P}(\Lambda)$, where $\Lambda$ is the algebraic lattice generated by the vectors $e_j=(\theta^j,(\theta')^j,(\theta'')^j)$ ($j=0,1,2$) (see Fig. 8; the bold dots in the picture denote degenerate minimal triples of vectors).
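The unit claims in this example are easy to check numerically. The following sketch (our own illustration, not part of the original argument) verifies in floating point that $\theta$ and $-1-\theta$ have field norm $1$:

```python
import math

# Conjugates of θ = 2cos(2π/7): the roots θ, θ', θ'' listed above.
conj = [2 * math.cos(2 * math.pi * k / 7) for k in (1, 2, 3)]

def norm(g):
    """Field norm of g(θ): the product of g over all conjugates of θ."""
    return math.prod(g(t) for t in conj)

# θ and -1-θ have norm 1 and are therefore units of the ring of integers.
assert abs(norm(lambda t: t) - 1) < 1e-9
assert abs(norm(lambda t: -1 - t) - 1) < 1e-9
```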

Figure 8. The Voronoi graph for the field $\mathbb{Q}(\theta)$, $\theta=2\cos(2\pi/7)$

Figure 9 depicts the canonical diagram constructed from the triple of numbers $\theta=2\cos(2\pi/9)$, $\theta'=2\cos(4\pi/9)$, $\theta''=2\cos(6\pi/9)$ which are the roots of the cubic equation $\theta^3-3\theta+1=0$. A basis of the ring of integers of the field is given by $\{1,\theta,\theta^2\}$, and $\theta$, $1-\theta$ are fundamental units.

Figure 9. The Voronoi graph for the field $\mathbb{Q}(\theta)$, $\theta=2\cos(2\pi/9)$

The two examples above are interesting in that the numbers $2\cos(2\pi/7)$ and $2\cos(2\pi/9)$ are the beginning of a three-dimensional analogue of the Markov spectrum (see [101], [102], and also the isolation theorems in [103], [104]); in the classical theory of continued fractions the corresponding numbers are

For further development of the Voronoi and Minkowski algorithms see [105]–[112].

We also mention that for Voronoi–Minkowski three-dimensional continued fractions it is possible to prove an analogue of Vahlen's theorem (see [94], [95], [113]–[115]). Concerning applications of the theory of local minima see [29], [30], [116]–[122].

Information about other multidimensional generalizations of continued fractions can be found in [123].

3.4. Statement of the main result

A three-dimensional analogue of the problem of Gauss statistics for finite continued fractions is the question of the statistical properties of the Minkowski bases described in Theorem 3.1. The question of the behaviour on average of elements of Klein polyhedra reduces to the calculation of Minkowski matrices with certain additional restrictions or to the calculation of matrices with similar properties (see [27]–[30]). We confine ourselves to Minkowski bases on totally primitive lattices (see the definition in §2.3), since this leads to a simpler and more natural answer.

For a matrix $X$ of the form (1.8) let $a$, $b$, and $c$ denote the maximal absolute values of elements in the rows of $X$:

Equation (3.6)

Let $X'$, $X''$, $X'''$ denote the following matrices:

Equation (3.7)

In particular, if $X$ is a basis matrix of the form (3.3) or (3.4), then the matrices $X'$, $X''$, $X'''$ have the form

respectively, where $|\alpha_i|$, $|\beta_i|$, $|\gamma_i|\leqslant 1$.

For an arbitrary matrix set $M$ let $M(P)$, $M'$, $M''$, and $M'''$ denote the following sets:

Let $\mathscr{M}$ be the set of Minkowski basis matrices of fixed type (I or II) with fixed signature and integer coefficients. It follows from Theorem 3.1 that in the calculation of matrices in the set $\mathscr{M}$ it is sufficient to confine oneself to the cases when

Equation (3.8)

where $x_i=|a_i|$, $y_i=|b_i|$, $z_i=|c_i|$ satisfy the inequalities of Theorem 3.1.

As noted above, the main result of the paper is a three-dimensional generalization of equation (2.8), which is an asymptotic formula for the mean value of the Gauss–Kuz'min statistics of finite continued fractions. We define three-dimensional Gauss–Kuz'min statistics as follows. Let us fix a tuple of real numbers $(\xi_2,\xi_3,\eta_1,\eta_3, \zeta_1,\zeta_2)\in(0,1]^6$ and consider the parallelepiped

Equation (3.9)

where $I(\xi_3)=[-\xi_3,0]$ for matrices of type I, and $I(\xi_3)=[0,\xi_3]$ for matrices of type II. Then the three-dimensional Gauss–Kuz'min statistics (corresponding to the matrices in $\mathscr{M}$ of given type and signature) for a lattice $\Lambda\subset\mathbb{Z}^3$ are defined to be sums of the form

where $\boldsymbol{\alpha}=(\alpha_2,\alpha_3)$, $\boldsymbol{\beta}=(\beta_1,\beta_3)$, $\boldsymbol{\gamma}=(\gamma_1,\gamma_2)$, and $\alpha_i=\alpha_i(X)$, $\beta_i=\beta_i(X)$, $\gamma_i=\gamma_i(X)$ are found from (3.7) and (3.8). Under this approach, a three-dimensional analogue of the sum on the left-hand side of equation (2.8) is the quantity

Equation (3.10)

Henceforth the symbol $\#$ means that the sum is taken over totally primitive matrices $X$, that is, over matrices satisfying the conditions (2.9)–(2.11).

Theorem 3.3.  For any positive integer $P$ and any real $\varepsilon>0$

Equation (3.11)

where $\mathscr{Q}_2(x)$ is a polynomial of second degree with leading coefficient

A detailed scheme of the proof of Theorem 3.3 is given in §5.3.

§ 4. Two-dimensional case as a model problem

4.1. Statement of the problem

We write Voronoi matrices in the form

Let $\mathscr{V}$ denote the set of all primitive Voronoi matrices:

Let $a$ and $b$ (within §4) denote the maximal absolute values of the elements in the rows of the matrix $A$:

Here $ab\leqslant P\leqslant 2ab$. For a matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2\end{pmatrix}$, let $A'$ and $A''$ denote the matrices

Equation (4.1)

The corresponding sets are denoted by $\mathscr{V}'$ and $\mathscr{V}''$:

Let us fix a pair of real numbers $\xi$, $\eta\in[0,1]$ and define the rectangle $\Pi=[0,\xi]\times [0,\eta]$. We consider the problem of calculating the quantity

equal to the number of primitive Voronoi matrices with determinant $P$ for which the coefficients of the normalized matrix belong to $\Pi$. A solution to this problem is given by the following theorem.

Theorem 4.1.  Let $P$ be a positive integer and $\varepsilon>0$ a real number. Then

Equation (4.2)

where

The remainder term in Theorem 4.1 is worse than the remainder term in Porter's result (1.11). This is due to the fact that the proof of Theorem 4.1 is simpler: instead of estimates of trigonometric sums by van der Corput's method, this proof uses the idea of approximating the boundaries of domains by step-functions. The proof of Theorem 3.3 (a three-dimensional analogue of Theorem 4.1) is based on the same approach. Below, all the main steps are briefly described in order to sketch the scheme of proof of the main result.

4.2. Division into cases

We divide the set of all Voronoi matrices into two parts:

Correspondingly, the quantity $N(P)$ can be represented in the form $N(P)=N_1(P)+N_2(P)$, where the definition of $N_\ell(P)$ ($\ell=1,2$) is obtained from the definition of $N(P)$ by imposing the additional condition $A\in\mathscr{V}_\ell$. To prove Theorem 4.1 it suffices to verify the asymptotic formula

Equation (4.3)

The proof of (4.3) for $\ell=1$ will imply that if the non-strict inequality $x_1\leqslant y_2$ in the definition of $N_1(P)$ is replaced by the strict inequality $x_1<y_2$, then only the form of the constant $c_0^{(1)}(\Pi)$ changes in (4.3). The map

establishes a one-to-one correspondence between the matrices in the set $\mathscr{V}_1$ for which $x_1<y_2$ and the matrices in $\mathscr{V}_2$. Therefore, to prove (4.3) for $\ell=2$ (and thus Theorem 4.1) it suffices to verify this equation for $\ell=1$. In what follows we assume that $\ell=1$.

Let $\mathscr{V}_\ell(a,P)$ denote the set of matrices $A\in\mathscr{V}_\ell(P)$ for which $x_1=a$. The sets $\mathscr{V}'_\ell(a,P)$ and $\mathscr{V}''_\ell(a,P)$ are defined by analogy with $\mathscr{V}'$ and $\mathscr{V}''$.

To verify (4.3) we first prove an asymptotic formula for $N_\ell(a,P)=|\mathscr{V}_\ell(a,P)|$. We do this in two different ways, first by elementary considerations, and second by using estimates of Kloosterman sums.

4.3. Linear parametrization of solutions

If we fix numbers $a_1$ and $a_2$ with $(a_1,a_2)=1$, then we can find integers $\widetilde{x}_1$ and $\widetilde{x}_2$ such that

Equation (4.4)

Thus, all the solutions of the equation

with respect to the unknowns $b_1$, $b_2$ admit the linear parametrization

Equation (4.5)

where $u\in\mathbb{Z}$ and $\widetilde{b}_{i}=P\widetilde{x}_{i}$. It follows from the equalities

that a solution obtained by the formula (4.5) defines a primitive matrix $A$ if and only if $(u,P)=1$.
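The parametrization can be made concrete by a short sketch (our own illustration, assuming — consistently with §4.1 — that the equation being solved is $a_1b_2-a_2b_1=P$; the helper names are ours):

```python
def ext_gcd(a, b):
    """Extended Euclid: return (g, s, t) with a*s + b*t = g = gcd(a, b)."""
    if b == 0:
        return (a, 1, 0)
    g, s, t = ext_gcd(b, a % b)
    return (g, t, s - (a // b) * t)

def solutions(a1, a2, P, u_range):
    """Pairs (b1, b2) = (P*x1 + u*a1, P*x2 + u*a2) for u in u_range,
    where a1*x2 - a2*x1 = 1; every pair satisfies a1*b2 - a2*b1 = P."""
    g, s, t = ext_gcd(a1, a2)
    assert g == 1                    # a1 and a2 must be coprime
    x1, x2 = -t, s                   # a1*x2 - a2*x1 = a1*s + a2*t = 1
    return [(P * x1 + u * a1, P * x2 + u * a2) for u in u_range]
```

As stated above, primitivity of the resulting matrix corresponds to the extra condition $(u,P)=1$.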

4.4. First variant of estimation of the remainder term

We obtain an asymptotic formula for $N_\ell(a,P)$ based on elementary considerations. It follows from the equality $y_2=(P-x_2y_1)/x_1$ that the conditions $y_1\leqslant y_2$ and $x_1\leqslant y_2$ characterizing the set $\mathscr{V}_\ell(a,P)$ (recall that we consider only the case $\ell=1$) can be written in the form $y_1\leqslant f(x_2)$, where

Equation (4.6)

Furthermore,

Equation (4.7)

and (under the condition that $f(t)=f_2(t)$)

Equation (4.8)

Using the linear parametrization (4.5), we find that

Equation (4.9)

Passing to the variable $\beta=y_1/y_2$ and using the equalities

we rewrite (4.9) in the form

Summing the last equation over $x_2$ and passing to the variable $\alpha=x_2/x_1$, we find that

Equation (4.10)

where $\rho_0(a)\ll a^{-2+\varepsilon}$.

Remark 4.2.  Both sides of (4.9) are estimated as $O(Pa^{-2})$. This enables us to obtain, in particular, the following asymptotic formula with a trivial estimate of the remainder:

Equation (4.11)

4.5. Second variant of estimation of the remainder term

The second approach to the calculation of $N_\ell(a,P)$ consists in counting the number of solutions of the congruence $xy+P\equiv 0\pmod{a}$ that lie below the graph of some monotonic function. We approximate this domain by rectangles, and in every rectangle we reduce the problem to estimates of Kloosterman sums.
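The underlying density heuristic can be illustrated by brute force. In the sketch below (our own illustration, with arbitrary small parameters), when $a\mid X$, $a\mid Y$, and $(P,a)=1$, each $x$ coprime to $a$ contributes exactly $Y/a$ solutions, so the count equals $(X/a)(Y/a)\varphi(a)$ exactly:

```python
from math import gcd

def count_solutions(a, P, X, Y):
    """#{(x, y): 1 <= x <= X, 1 <= y <= Y, x*y + P ≡ 0 (mod a)}."""
    return sum(1 for x in range(1, X + 1) for y in range(1, Y + 1)
               if (x * y + P) % a == 0)

def phi(a):
    """Euler's totient function, by direct count."""
    return sum(1 for x in range(1, a + 1) if gcd(x, a) == 1)
```

For a box not aligned with the modulus the count deviates from $XY\varphi(a)/a^2$, and bounding this deviation is exactly where the estimates of Kloosterman sums enter.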

Proposition 4.3.  Let $a> 0$ and let

Then the asymptotic formula

Equation (4.12)

holds, where

Proposition 4.3 is proved by standard methods (see, for example, Theorem 3 in [24]; a generalization to the case of an arbitrary linear function can be found in [124]). It follows from the formula (4.12) that for an arbitrary non-negative function $f$ such that $Z_2\leqslant f(x)\leqslant Z_2+V$ ($V\ll Z_2$) for $x\in [Y_1,Y_1+Z_1)$ the following asymptotic formula holds:

Equation (4.13)

We apply this formula to the function $f$ defined by (4.6). For this we choose a positive integer $r\leqslant a$ and represent the interval $[0,a]$ in which $x_2$ varies in the form

where $m\ll \log a$, $I(0)=W_0=[0,a/r]$,

We represent the quantity $N_\ell(a,P)$ in the form

where the definition of $N_{\ell,k}(a,P)$ is obtained from the definition of $N_\ell(a,P)$ by imposing the additional condition $x_2\in W_k$. By approximating $f$ on each interval $I(j)$ by constant functions one can prove that

Equation (4.14)

For this it is sufficient to use the formula (4.11) for $k=0$ and to sum equation (4.13) over the intervals $I(j)\subset W_k$ for $k> 0$. By (4.7) and (4.8) we have $|f'(x)|\ll rP\cdot 2^{-k}a^{-2}$ on each of these intervals. The values of $f$ vary within an interval of length $V=O(P\cdot 2^{-k}a^{-1})$, and this leads to a remainder $O(Z_1V/a)=O(P\cdot 2^{-k}a^{-1}r^{-1})$. It remains to take into account that the number of intervals $I(j)\subset W_k$ is $O(2^{k})$.

We sum (4.14) over $k$ and pass to the variables $\beta=y_1/y_2$ and $\alpha=x_2/x_1$:

The value $r=[P^{1/2}a^{-3/4}]$ is chosen based on the relation $P/(ra)\asymp ra^{1/2}$. The requirement $r\leqslant a$ holds under the condition that $a\geqslant P^{2/7}$. As a result we obtain an asymptotic formula for $N_\ell(a,P)$ with a second version of the remainder:

Equation (4.15)

4.6. Estimation of the total remainder

Comparing the formulae (4.10) and (4.15), we find that it makes sense to use the first one for $a\leqslant P^{2/5}$ and the second for $a> P^{2/5}$. Thus, for the sum of all remainder terms we obtain the estimate

Equation (4.16)

4.7. Calculation of the principal term

If $a>\sqrt{P}$, then the set $\mathscr{V}_\ell(a,P)$ becomes empty. Nevertheless, we can assign a meaning to the formula (4.10) by setting $\rho_0(a)=0$. Under this convention, it follows from the estimate $\rho_0(a)\ll a^{-2+\varepsilon}$ that

As noted above, the condition $\alpha_1\leqslant \beta_2$, which is satisfied by the matrices in the set $\mathscr{V}''_\ell(a,P)$, is equivalent to the inequality $a^2\leqslant P/\det A''$. Therefore, the relations (4.10), (4.15), and (4.16) imply that

The condition $a\leqslant y_2$, which must be satisfied by the matrices in $\mathscr{V}_\ell(a,P)$, is not invariant under the left action of $D_2(\mathbb{R})$. When we pass to the set $\mathscr{V}_\ell''(a,P)$, this condition takes the form $a^2\leqslant P/\det A''$, that is,

Thus,

Applying the relation

to the inner sum, we arrive at the required formula (4.3).

§ 5. Division into cases

In the two-dimensional case the set $\mathscr{V}$ was divided into two subsets $\mathscr{V}_1$ and $\mathscr{V}_2$ which were almost the same (see §4.2). In the three-dimensional case the division has a more complicated structure and essentially uses geometric properties of Minkowski bases.

5.1. Reduced matrices

Let $X$ be a Minkowski basis matrix of the form (1.8) and let $a,b,c$ be defined by (3.6). Then $P\asymp abc$. Indeed, on the one hand, the obvious inequality $P\leqslant 6 abc$ holds. On the other hand, $P\geqslant abc$ by Minkowski's convex body theorem. In particular, the inequality $P\geqslant abc$ means that among the minors corresponding to the elements of any row of $X$ there is always at least one of the maximal possible order of magnitude. For example,

An important role in the arguments below will be played by division of the set of all Minkowski basis matrices into charts — subsets on which the Linnik–Skubenko reduction will be conducted. In every chart the position of the minor of maximal order (for every row) will be fixed.

Definition 5.1.  Let $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2\end{pmatrix}$, let $q=\det A\ne 0$, and let $X$ be the matrix of a Minkowski basis of the form (1.8). The matrix $X$ is said to be reduced if it satisfies the following conditions.

  • (1)  
    $a_1=a$.
  • (2)  
    $q\geqslant ab/4$.
  • (3)  
    The basis $(e_1,e_2)$ of the lattice $\Lambda=\langle e_1,e_2\rangle$ is close to a minimal basis in the following sense: one of the bases $(e_1,e_2)$, $(e_1,e_1+e_2)$, $(e_2,e_1\pm e_2)$ is a Voronoi basis of $\Lambda$.

In other words, properties 1 and 2 mean that in a reduced matrix the corner minor $q$ and the corner element $a_1$ are as large as possible in order of magnitude. Property 3 means that the matrix $A$ can be reconstructed from the lattice $\Lambda$ with basis $(a_1,b_1)$, $(a_2,b_2)$ almost uniquely: the number of Voronoi bases in a lattice with determinant $q$, like the length of the continued fraction expansion of a number $d/q$, is $O(\log(q+1))$, and therefore the number of possible matrices $A$ for a given lattice $\Lambda$ is also bounded by a quantity $O(\log(q+1))$.
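The logarithmic bound invoked here is the classical bound on the number of Euclidean division steps. A minimal sketch (our own illustration) computing the length of the continued fraction expansion of $d/q$:

```python
def cf_length(d, q):
    """Number of division steps (partial quotients) produced by the
    Euclidean algorithm applied to the fraction d/q."""
    n = 0
    while q:
        d, q = q, d % q
        n += 1
    return n
```

By Lamé's theorem the worst case occurs for consecutive Fibonacci numbers, so the length is $O(\log(q+1))$.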

We denote by $g_1,\dots,g_6$ the elements of the group $G_3$ permuting the coefficients $a,b,c$ of the matrix $X$ (and preserving the diagonal dominance of $X$).

Lemma 5.2.  The set $\mathscr{M}$ can be partitioned into finitely many subsets in such a way that:

(i) every set of the partition is defined by a finite set of inequalities that are invariant under the left action of $D_3(\mathbb{R})$;

(ii) if $\widetilde{\mathscr{M}}$ is one of the sets of the partition, then for any $i=1,\dots,6$ at least one of the sets $g_i\bigl(\widetilde{\mathscr{M}}\,\bigr)$ or $g_i\bigl(\widetilde{\mathscr{M}}\,\bigr)\begin{pmatrix} 1 & 0 & 0\\ 0 & 0 &1\\0 & 1 & 0\end{pmatrix}$ consists of reduced matrices.

Proof.  By Theorem 3.1, it is sufficient to construct the required partition for the sets (3.3) and (3.4). In each of them we introduce the additional partition by all hyperplanes of the form $x_i=x_j$, $y_i=y_j$, $z_i=z_j$ ($i\ne j$). Consider an arbitrary set in the resulting partition. If it consists of matrices of type I, then all its elements satisfy the conditions 1–3 (the matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2\end{pmatrix}$ defines a Voronoi basis, since $b_1\leqslant 0$).

Suppose that the set under consideration consists of matrices of type II. After the rows of the matrix $X$ are ordered by increasing maximal elements, three variants of sign arrangement arise. Besides the case (3.4), the following two cases are also possible:

Equation (5.1)

Equation (5.2)

In the matrices (3.4) and (5.1) the element $b_1$ is negative. Hence, as for matrices of type I, we have $q=x_1y_2+x_2y_1\geqslant x_1y_2=ab$ and the basis $(e_1,e_2)$ is a Voronoi basis. For the matrices (5.2) we additionally subdivide by the planes $a_2=a_1/2$ and $b_1=b_2/2$ and consider two cases:

  • 1)  
    $b_1> 0$ and ($a_2\leqslant a_1/2$ or ($a_2>a_1/2$, $b_1\leqslant b_2/2$));
  • 2)  
    $b_1> 0$, $a_2>a_1/2$, $b_1> b_2/2$.

In the first case, $q=x_1y_2-x_2y_1\geqslant x_1y_2/2=ab/2$, and for $a_2\leqslant a_1/2$ we can choose as a Voronoi basis the pair $(e_1-e_2,e_2)$ with the matrix $\begin{pmatrix} x_1-x_2 & x_2\\ y_1-y_2 & y_2\end{pmatrix}$, while for $b_1\leqslant b_2/2$ we can choose the pair $(e_1,e_2-e_1)$ with the matrix $\begin{pmatrix} x_1 & -(x_1-x_2)\\ y_1 & y_2-y_1 \end{pmatrix}$.

In the second case, we transpose the second and third columns in $X$:

If the condition $y_3\geqslant y_1$ holds in the matrix $\begin{pmatrix} x_1 & x_3\\ -y_1 & y_3\end{pmatrix}$, then the basis of $\Lambda=\langle e_1,e_2\rangle$ consisting of the vectors $e_1=(x_1,-y_1)$ and $e_2=(x_3,y_3)$ is a Voronoi basis and $q=x_1y_3+x_3y_1\geqslant x_1y_1\geqslant ab/2$. In the remaining case ($y_3<y_1$, $x_3\geqslant x_2$) we can choose as a Voronoi basis the pair $(e_2,e_2-e_1)$ with the matrix $\begin{pmatrix} x_3 & -(x_1-x_3)\\ y_3 & y_1+y_3 \end{pmatrix}$. Here $q=x_1y_3+x_3y_1\geqslant y_1x_2\geqslant ab/4$. $\square$

Thus, with each matrix $A$ we can associate a Voronoi basis of the lattice $\Lambda$, and to each Voronoi basis there correspond at most four matrices $A$.

Remark 5.3.  It suffices to prove Theorem 3.3 after replacing in its statement (and in the definition (3.10) of the quantity $\mathscr{N}_\Pi(P)$) the set $\mathscr{M}$ by an arbitrary set $\widetilde{\mathscr{M}}$ of the partition constructed in Lemma 5.2. Then the set $\widetilde{\mathscr{M}}$ can be represented in the form

Equation (5.3)

where $g_\ell\in G_3$, and each of the sets $\mathscr{M}_\ell$ consists of reduced matrices satisfying the conditions $a\leqslant b\leqslant c$ (in the definition of $\mathscr{M}_\ell$, the non-strict inequalities between $a$, $b$, and $c$ can be replaced by strict inequalities, so that the sets $g_\ell(\mathscr{M}_\ell)$ are pairwise disjoint).

Remark 5.4.  The three-dimensional Gauss measure is invariant under the left action of $D_3(\mathbb{R})$. In particular, for $\beta_1'=\beta_1/\beta_3$, $\beta_2'=1/\beta_3$, $\gamma_1'=\gamma_1/\gamma_2$, and $\gamma_3'=1/\gamma_2$ we have

Thus, the measure of the set $D_3(\mathbb{R})\setminus\mathscr{M}$ is independent of whether the second and third columns of the matrix $X$ were transposed or not.

Remarks 5.3 and 5.4 imply that to prove Theorem 3.3 it suffices to verify the following assertion.

Theorem 5.5.  Let $\mathscr{M}_\ell$ be one of the sets of the partition (5.3) and let $\Pi$ be the parallelepiped defined by equation (3.9). Then

Equation (5.4)

where $\mathscr{Q}_2^{(\ell)}$ is a polynomial of second degree with leading coefficient

Equation (5.5)

To simplify the exposition, we conduct the proof of Theorem 5.5 under the assumption that all the parameters $\xi_2,\xi_3,\eta_1,\eta_3,\zeta_1,\zeta_2$ defining the dimensions of $\Pi$ are equal to 1. In the general case the arguments will be the same.

5.2. Properties of the constructed partition

Lemma 5.6.  Suppose that $X$ is a reduced matrix. Then

Proof.  The assertion of the lemma follows from the inequalities $q\leqslant 2ab$ and $abc\leqslant P\leqslant 6abc$ and the property 2 of reduced matrices. $\square$

Lemma 5.7.  The partition constructed in Lemma 5.2 has the following additional properties: each of the sets $\mathscr{M}_\ell(a,q,P)$ is defined by finitely many inequalities of the form $\pm c_2\leqslant f_i(a_2,a_3,b_1,b_3,c_1)$ ($1\leqslant i\leqslant i_0$), each of which is imposed on the corresponding domain $\Omega_i=\Omega_i(a_2,a_3,b_1,b_3,c_1)$. Furthermore, $f_i\ll P/q$ and

Equation (5.6)

where $U=|a_1b_3-a_3b_1|$.

Proof.  Consider Minkowski matrices of type I. Obviously, the estimate $f_i\ll P/q$ always holds, since $c_2\leqslant c\asymp P/q$ for reduced matrices (see Lemma 5.6). We consider successively all the functions that can define the limits of variation of $z_2$ for the matrices $X$ of the form (3.3) with $\det X=P$. These are the functions $f_{i}$ ($i=1,\dots,4$) that are defined, respectively, by the conditions $z_1=z_2$, $z_1=z_3$, $z_2=z_3$ (arising in the initial partition of the set $\mathscr{M}$), and $z_3=y_2$ (the part of the boundary appearing because of the inequality $b\leqslant c$). For the first function $f_1=z_1$ we have

The other functions are found from the equation $\det X=P$:

(If a function $f_i$, where $i=1,\dots,4$, defines the boundary of the domain of variation of $c_2$, then, as noted above, $f_i\ll P/q$, and therefore the denominator $U=|a_1b_3- a_3b_1|$ in such cases is non-zero.) If $f_i=F_i/G_i\ll P/q$, then

For all the functions under consideration we have

Therefore, to verify the assertion of the lemma it is sufficient to show that $|G_i|\gg U$. For $f_2$ and $f_4$ this is obvious, and for $f_3$ it follows from the inequalities on the elements of the Minkowski matrix:

(for $x_2\geqslant x_3$ this follows from the inequality $x_2y_1-x_3y_1\geqslant0$, for $y_3\geqslant y_1$ it follows from the inequality $x_1y_3-x_3y_1\geqslant 0$, but for $z_1\geqslant z_2$ the function $f_2$ cannot define the boundary of the domain of variation of $c_2$, since then the inequality $z_1>z_2$ would hold inside this domain, which contradicts the condition $z_1=z_3$ defining $f_2$).

For matrices of type II, the only difference in the proof of the estimates (5.6) is the need to consider the function $f_5$ defined by the condition $z_1+z_2=z_3$. For this function the conditions (5.6) are verified in the same way as for the other functions. $\square$

5.3. Scheme of proof of the main result

To prove (5.4) we represent the set $\mathscr{M}_\ell(P)$ in the form

where $\mathscr{M}_\ell(a,q,P)$ is the set of matrices $X\in\mathscr{M}_\ell(P)$ in which the values of the corner element $a_1=a$ and the corner minor $\det A=q$ are fixed. Then

Equation (5.7)

In the last equation it is assumed that in the summation over $a_2$, $a_3$, $b_1$, $b_3$, $c_1$, $c_2$ the values of $b_2$ and $c_3$ are determined by the equations $\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}=q$ and $\det X=P$, respectively.

In the process of the proof we will pass successively (in various orders) from summation over the variables $a_2$, $a_3$, $b_1$, $b_3$, $c_1$, $c_2$ to integration over the variables $\alpha_i=a_ia_1^{-1}$, $\beta_i=b_ib_2^{-1}$, $\gamma_i=c_ic_3^{-1}$. In the end this will enable us to transform the sum in (5.7) into the integral in (5.5).

We define the parameters $\varkappa$ and $\lambda$ by $a=P^\varkappa$ and $q=P^\lambda$. In addition we divide the domain

in which the pairs $(a,q)$ vary into three parts depending on the values of $\varkappa$ and $\lambda$ (see Fig. 10).

Figure 10. Partition of the domain of variation of the parameters $\varkappa$ and $\lambda$

We prove the asymptotic formulae for $\mathscr{N}_\ell(a,q,P)$ in three different ways, depending on which of the domains $\Omega_i$ ($i=1,2,3$) the pair $(a,q)$ belongs to.

1. For small values of $a$ and $b$ (that is, for $(a,q)\in\Omega_1$), equation (1.2) is first solved as a linear equation with respect to $c_1$, $c_2$, $c_3$ (see §6). The solutions are counted by elementary considerations. A reduction is performed to a two-dimensional problem, which is solved by the methods of §4.

2. If the values of $a$, $b$, $c$ are comparable in size (that is, for $(a,q)\in\Omega_2$), then the matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$ is fixed and equation (1.2) can be solved with respect to the unknowns $a_3$, $b_3$, $c_1$, $c_2$, $c_3$ (see §7). The solutions are counted by using a non-linear Linnik–Skubenko parametrization and estimates of Kloosterman sums. Again a reduction is performed to a two-dimensional problem, which is also solved by using the methods of §4.

3. In the remaining case, when $a$ is small while $b$ and $c$ are large (that is, $(a,q)\in\Omega_3$), equation (1.2) is solved with respect to the unknowns $b_3$, $c_1$, $c_2$, $c_3$ (see §8). The reduction to a two-dimensional problem is effected by using a special version of the Linnik–Skubenko reduction which preserves the value of $a_3$, and this two-dimensional problem is solved by elementary methods.

Correspondingly, the quantity $\mathscr{N}_\ell(P)$ is represented in the form

Equation (5.8)

where

Equation (5.9)

For each case we indicate the order of the transitions from summation to integration and the assertions in which these transitions are effected.

1. $(c_1,c_2)$, Lemmas 6.2, 6.3, and 6.4; $b_3$, Lemma 6.7; $b_1$, Lemma 6.9; $(a_2,a_3)$, Lemma 9.13; $q$, Proposition 9.14.

2. $(c_1,c_2,a_3,b_3)$, Theorem 7.1 and Corollary 7.3; $(a_2,b_1)$, Proposition 7.7; $q$, Proposition 9.12.

3. $(c_1,c_2,b_3)$, Theorem 8.1 and Corollary 8.2; $b_1$, Proposition 8.3; $(a_2,a_3)$, Lemma 9.13; $q$, Proposition 9.14.

After this, the three results obtained are substituted into equation (5.8). The last transition, from summation to integration with respect to the variable $a_1$, is implemented in the proof of Theorem 5.5.

After the transition to matrices in which some coefficients become real numbers, the symbol $\#$ will mean that, instead of the total primitivity conditions (2.9)–(2.11), only those of the restrictions are imposed that still make sense (that is, those involving only integers):

5.4. Different versions of Kloosterman sums

For the Kloosterman sums (1.6) we know the estimate

Equation (5.10)

($\tau(q)$ is the number of divisors of $q$), proved by Weil [125] for prime $q$ and extended to arbitrary $q$ by Estermann [126].
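The definition (1.6) is not reproduced in this excerpt; assuming the classical Kloosterman sum $K_q(m,n)=\sum_{x \bmod q,\ (x,q)=1} e^{2\pi i(mx+n\overline{x})/q}$, the Weil–Estermann bound of the form (5.10) can be checked numerically (a sketch, not part of the original argument):

```python
from math import gcd, sqrt, cos, pi

def kloosterman(m, n, q):
    """K_q(m, n) = sum over x mod q with (x, q) = 1 of e((m*x + n*xbar)/q).

    The sum is real: x -> -x permutes the residues prime to q and conjugates
    each term, so it suffices to sum the cosines.
    """
    s = 0.0
    for x in range(1, q + 1):
        if gcd(x, q) == 1:
            xbar = pow(x, -1, q)  # modular inverse of x mod q (Python >= 3.8)
            s += cos(2 * pi * (m * x + n * xbar) / q)
    return s

def tau(q):
    """Number of divisors of q."""
    return sum(1 for d in range(1, q + 1) if q % d == 0)

# Weil-Estermann: |K_q(m, n)| <= tau(q) * (m, n, q)^{1/2} * q^{1/2}.
for q in range(2, 60):
    for m, n in [(1, 1), (2, 3), (1, q - 1)]:
        bound = tau(q) * sqrt(gcd(gcd(m, n), q)) * sqrt(q)
        assert abs(kloosterman(m, n, q)) <= bound + 1e-9
```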

The trigonometric sums (1.5) are responsible for the distribution of the solutions of equation (1.3) (and the equivalent congruence (1.4)). For these sums it is convenient to use the estimate

Equation (5.11)

(see [44], Lemma 1), which generalizes the inequality (5.10).

As noted above, in the proof of the main result of this paper the reduction to the two-dimensional case is performed in three different ways. In the first two cases it is necessary to study solutions of the equation (1.3) (the congruence (1.4)) under the additional conditions $(a,x,y,z)=1$ or $(a,x)=1$. The corresponding trigonometric sums are defined by

Equation (5.12)

Equation (5.13)

Estimation of the sums (5.12) and (5.13) reduces to the estimate (5.11) (see [24], Lemmas 1–3). In both cases this makes it possible to prove the uniform distribution of the solutions of the congruence $xy+q\equiv 0\pmod{a}$ with the corresponding restrictions (see Propositions 6.8 and 7.4 below). For brevity we use the notation $K^{\times}_a(q)=K^{\times}_a(0,0,q)$.

§ 6. First variant of estimation of the remainder

In this section the reduction to the two-dimensional case is performed by elementary considerations. For the two-dimensional case similar arguments were described in §§4.3 and 4.4. In the transition to integration in the second row of the matrix $X$, standard "two-dimensional" methods are used, based on estimates of Kloosterman sums (see §4.5).

6.1. Linear parametrization of solutions

Lemma 6.1.  A matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$ with determinant $q\ne0$ can be supplemented to form a matrix

Equation (6.1)

satisfying the condition

Equation (6.2)

if and only if $(a_1,a_2,b_1,b_2)=1$.

For an arbitrary matrix $\overline A $ satisfying the condition (6.2) there are integers $c_1$, $c_2$, $c_3$ such that

Equation (6.3)

See the proof in [23].

Lemma 6.2.  If the matrix (6.1) satisfies the condition (6.2) and $(\widetilde{c}_1,\widetilde{c}_2,\widetilde{c}_3)$ is a particular solution of equation (6.3), then the following assertions hold.

(i) All the solutions of the equation

Equation (6.4)

with respect to the unknowns $c_1,c_2,c_3$ have the form

Equation (6.5)

(ii) The formula (6.5) produces different solutions $(c_1,c_2,c_3)$ for different pairs $(u,v)$.

(iii) A solution $(c_1,c_2,c_3)$ obtained by the formula (6.5) defines a totally primitive matrix (1.8) if and only if $(u,P)=(v,P)=1$.

Proof.  (i) An arbitrary integer vector $c=(c_1,c_2,c_3)$ can be represented as a linear combination of the vectors $a=(a_1,a_2,a_3)$, $b=(b_1,b_2,b_3)$, and $\widetilde{c}=(\widetilde{c}_1,\widetilde{c}_2,\widetilde{c}_3)$ with integer coefficients. It follows from (6.4) that the coefficient of $\widetilde{c}$ must be equal to $P$. This proves that representation in the form (6.5) is possible.

(ii) It follows from condition (6.2) that the vectors $a$ and $b$ are linearly independent. Therefore, different solutions $(c_1,c_2,c_3)$ correspond to different pairs $(u,v)$.

(iii) The last assertion of the lemma is verified using the formulae (6.5) and (6.2):

Equation (6.6)

Equation (6.7)

Thus, the matrix $X$ is totally primitive if and only if $(u,P)=(v,P)=1$. $\square$

Suppose that $\Omega$ is a planar domain with rectifiable boundary. Let $\operatorname{Area}(\Omega)$ denote the area of this domain, let $\mathscr{P}(\Omega)$ be its perimeter, and let $N(\Omega)$ be the number of points of the lattice $\mathbb{Z}^2$ lying inside $\Omega$. For a convex domain Jarnik's inequality is known:
$$\bigl|N(\Omega)-\operatorname{Area}(\Omega)\bigr|\leqslant \mathscr{P}(\Omega)+1.$$

For an arbitrary simply connected planar domain $\Omega$ with rectifiable boundary we have

Equation (6.8)

(see, for example, [67], Lemma 1).
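Inequalities of the type (6.8) can be illustrated on the disk of radius $R$, where the deviation of the lattice-point count from the area is in practice far smaller than the perimeter (a numerical sketch; the exact constant in (6.8) is not used here):

```python
from math import pi, isqrt

def lattice_points_in_disk(R):
    """N(R) = #{(x, y) in Z^2 : x^2 + y^2 <= R^2} for an integer radius R."""
    count = 0
    for x in range(-R, R + 1):
        y_max = isqrt(R * R - x * x)  # largest y with x^2 + y^2 <= R^2
        count += 2 * y_max + 1
    return count

# |N - Area| <= Perimeter + 1 for this convex domain (Jarnik-type bound).
for R in (5, 10, 20, 40):
    N = lattice_points_in_disk(R)
    assert abs(N - pi * R * R) <= 2 * pi * R + 1
```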

Lemma 6.3.  Let $P$ be a positive integer, let $\Omega$ be a simply connected planar domain with rectifiable boundary such that $\mathscr{P}(\Omega)\gg 1$, and let

Then

Proof.  By the Möbius inversion formula,

The required asymptotic formula is obtained if in the inner sum we perform the change of variables $u=d_1u'$, $v=d_2v'$, use the inequality (6.8) in the new variables $u'$, $v'$, and estimate the perimeter of the diminished copy of the domain $\Omega$ as $O(\mathscr{P}(\Omega))$:
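The Möbius-inversion step used in this proof, removing coprimality conditions at the cost of a sum over divisors, can be checked exactly in a model situation: counting pairs $(u,v)$ in a box with $(u,P)=(v,P)=1$ (a toy sketch with small parameters):

```python
from math import gcd

def mobius(n):
    """Mobius function by trial division (adequate for small n)."""
    if n == 1:
        return 1
    m, p, res = n, 2, 1
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0  # square factor
            res = -res
        p += 1
    if m > 1:
        res = -res
    return res

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def count_direct(P, N):
    """Pairs (u, v) in [1, N]^2 with (u, P) = (v, P) = 1, by direct count."""
    return sum(1 for u in range(1, N + 1) for v in range(1, N + 1)
               if gcd(u, P) == 1 and gcd(v, P) == 1)

def count_mobius(P, N):
    """The same count via Mobius inversion over divisors d1 | P, d2 | P."""
    return sum(mobius(d1) * mobius(d2) * (N // d1) * (N // d2)
               for d1 in divisors(P) for d2 in divisors(P))

for P in (6, 10, 30):
    for N in (20, 50):
        assert count_direct(P, N) == count_mobius(P, N)
```

The main term of the count is $(\varphi(P)/P)^2 N^2$, matching the density appearing in Lemma 6.3.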

6.2. Transition to integration in the third row

We consider the set of matrices $X\in\mathscr{M}_\ell(a,q,P)$ for which the matrix $\overline A=\begin{pmatrix} a_1 & a_2 & a_3\\ b_1 & b_2 & b_3\end{pmatrix}$ is fixed. Since $q=\begin{vmatrix} a_1 & a_2\\ b_1 & b_2\end{vmatrix}\ne 0$, the value of $c_3$ is uniquely expressible in terms of $c_1$ and $c_2$. Therefore, any conditions imposed on the variables $c_1,c_2,c_3$ can be written in the form $(c_1,c_2)\in\Omega(\overline A)$.

Lemma 6.4.  Suppose that $\overline A$ satisfies the condition (6.2), $\Omega$ is a convex domain, and

Then

Proof.  By construction the set $\mathscr{M}_\ell(a,q,P)$ consists of matrices of the form (3.3), (3.4), (5.1), or (5.2). Hence, the variables $c_1,c_2,c_3$ must satisfy some conditions in the following list:

Thus, the domain of variation of the variables $c_1,c_2$ is contained in a convex polygon with linear dimensions estimated as $O(c)=O(P/q)$ by Lemma 5.6. It follows from equation (6.5) that the linear dimensions of the corresponding polygon on the plane $Ouv$ are $O(bP/q^2)$. By Lemma 6.3,

Corollary 6.5.  Assume the hypotheses of Lemma 6.4. Then

Equation (6.9)

Furthermore, for fixed $\widetilde{a}_3$ and $\widetilde{b}_3$

Equation (6.10)

Proof.  To verify the estimate (6.9) it suffices to use a trivial estimate for the integral in Lemma 6.4. Equation (6.10) follows from the fact that every term in it is $O(P^{2+\varepsilon}q^{-3})$. $\square$

6.3. Transition to integration in the second row

By the norm of a function we always mean the $L^\infty$-norm.

Lemma 6.6.  Let $a$ and $D$ be positive integers with $D\mid a$. Suppose that a function $f$ on an interval $I$ has finitely many intervals of monotonicity. Then the following asymptotic formulae hold:

Equation (6.11)

Equation (6.12)

Equation (6.13)

See the proof in [24], Lemma 4.

Lemma 6.7.  Let $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$, $q=\det A>0$, $(a_1,a_2,b_1,b_2)= 1$, $D=(a_1,a_2)$, $(D,a_3)=1$, and

for all $n$ in an interval $I$. Then for the sum

the following asymptotic formula holds:

Proof.  Suppose that the matrix $A$ can be reduced to the form $\begin{pmatrix} D & 0\\ \alpha & qD^{-1}\end{pmatrix}$ by elementary transformations of columns. Then the condition (6.2) is equivalent to the equality

Equation (6.14)

Since $(D,a_3)=1$, we have $\bigl(q,(q/D)a_3\bigr)=(q/D)(D,a_3)=q/D$, and (6.14) means that $\biggl(\dfrac{q}{D}\,, \begin{vmatrix} D &a_3 \\ \alpha &n\end{vmatrix}\biggr)=1$. Therefore,

It follows from the condition $\delta\mid(Dn-\alpha a_3)$ that $(\delta,D)\mid\alpha a_3$. By the hypothesis of the lemma, $(D,a_3)=1$, and therefore $(\delta,D)\mid \alpha$. Furthermore, $(D,q/D,\alpha)=(a_1,a_2,b_1,b_2)=1$. Thus, it follows from the relations $(\delta,D)\mid \alpha $ and $\delta\mid q/D$ that $(\delta,D)=1$. For $(\delta,D)=1$ the congruence $Dn-\alpha a_3\equiv 0\pmod{\delta}$ is equivalent to the condition $n\equiv n_0\pmod{\delta}$, where $n_0\equiv \alpha a_3 D^{-1}\pmod{\delta}$. To complete the proof of the lemma it remains to use (6.11):

If $(x,a)=1$, then every interval $[Y,Y+a)$ contains exactly one solution of the congruence $xy+q\equiv 0\pmod{a}$ with respect to the unknown $y$. Hence, for an arbitrary function $G$ defined on the rectangle $[Y_1,Y_1+Z_1)\times [Y_2,Y_2+Z_2)$, a natural approximation for the sum

is given by the sum

Proposition 6.8.  Let $G$ be a non-negative function and suppose that for any $z$ with $0\leqslant z\leqslant\|G\|$ the inequality $G(x,y)\leqslant z$ defines in a rectangle $I=[0,a]\times [0,Z_2]$ (where $Z_2\ll q/a$) the domain $\Omega_z=\{(x,y)\in I\colon y\leqslant f_z(x)\}$, and the number of intervals of monotonicity of all the functions $f_z$ is bounded by an absolute constant. Then

Equation (6.15)

See the proof in [24], Theorem 4.
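The observation preceding Proposition 6.8, that for $(x,a)=1$ every interval of length $a$ contains exactly one solution $y$ of $xy+q\equiv 0\pmod{a}$, is easy to verify directly (a sketch with small parameters):

```python
from math import gcd

def solutions_in_interval(x, q, a, Y):
    """Solutions y in [Y, Y + a) of x*y + q = 0 (mod a)."""
    return [y for y in range(Y, Y + a) if (x * y + q) % a == 0]

# When (x, a) = 1 the congruence has exactly one solution
# in every half-open interval of length a.
for a in range(2, 40):
    for x in range(1, a):
        if gcd(x, a) == 1:
            for Y in (-7, 0, 13):
                assert len(solutions_in_interval(x, 5, a, Y)) == 1
```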

Lemma 6.9.  Suppose that $a_1=a$ and $D\mid (a,q)$. Then the sum

satisfies the asymptotic formula

Proof.  We transform the indicated sum, introducing the variables $a_1'=a_1D^{-1}$, $a_2'=a_2D^{-1}$, and $a_3'=a_3D^{-1}$ (where $a_3'$ is not necessarily an integer):

We get rid of the condition $(D,b_1,b_2)=1$ by using the Möbius function:

where $b_1'=b_1\delta^{-1}$, $b_2'=b_2\delta^{-1}$, and $b_3'=b_3\delta^{-1}$. We apply Proposition 6.8 to the inner double sum. By the hypothesis of the lemma,

Furthermore, the function $b_1=f(a_2)$ implicitly defined by the equations

can be reduced to the form

Therefore, $f$ has finitely many intervals of monotonicity, and Proposition 6.8 is indeed applicable. Thus,

We sum the remainders that have appeared. The variable $b_3'$ varies in an interval of length $O\bigl(q/(a\delta)\bigr)$, and the variable $a_3$ in an interval of length $O(a)$. Thus, the sum of the remainders can be estimated by the sum

We transform the sum of the principal terms, passing to the variables $a_2=Da_2'$ and $a_3=Da_3'$:

After this it remains to pass to the variables $\beta_1=b_1/b_2$ and $\beta_3=b_3/b_2$, where $b_2=(q+a_2b_1)a_1^{-1}$. Since

Equation (6.16)

we obtain the required principal term

Proposition 6.10.  Suppose that $q\gg a^2$. Then for the quantity $\mathscr{N}_\ell(a,q,P)$ defined by (5.7) the following asymptotic formula holds:

Equation (6.17)

where $D=(a,a_2)$, $R_1(a,q,P)=a^{-2}qP^{1+\varepsilon}$,

Equation (6.18)

Equation (6.19)

Proof.  It follows from the inequality (5.11) that

Equation (6.20)

We transform the sum $\mathscr{N}_\ell(a,q,P)$, using Lemma 6.4 and the estimate (6.20):

In the double integral we pass to the variables $\gamma_1=c_1c_3^{-1}$ and $\gamma_2=c_2c_3^{-1}$. Since $c_3=P(\det X')^{-1}$, we have

Equation (6.21)

Therefore,

Using Lemma 6.7, we replace the summation over the variable $b_3$ by an integration:

By Lemma 6.9 we arrive at the following equation, which is equivalent to the formula (6.17):

§ 7. Second variant of estimation of the remainder

Let $\lambda_1(\Lambda)$ and $\lambda_2(\Lambda)$ denote successive minima of a two-dimensional lattice $\Lambda$, that is, $\lambda_1(\Lambda)$ is the length of a shortest non-zero vector $e_1\in\Lambda$, and $\lambda_2(\Lambda)$ is the length of a shortest vector $e_2\in\Lambda$ that is linearly independent of $e_1$. Let $\Lambda(A)$ denote the lattice consisting of solutions of the system of congruences

and let

Equation (7.1)
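The successive minima of a planar lattice can be computed by the classical Lagrange (Gauss) reduction of a basis; the following sketch assumes this standard algorithm (it is not a procedure of the survey itself):

```python
from math import hypot, isclose, sqrt

def successive_minima_2d(e1, e2):
    """Lagrange (Gauss) reduction of a 2D lattice basis.

    Returns (lambda_1, lambda_2): the lengths of a shortest non-zero vector
    and of a shortest vector linearly independent of it.
    """
    while True:
        if hypot(*e1) > hypot(*e2):
            e1, e2 = e2, e1  # keep e1 the shorter vector
        n1 = e1[0] ** 2 + e1[1] ** 2
        mu = round((e1[0] * e2[0] + e1[1] * e2[1]) / n1)  # nearest integer
        e2p = (e2[0] - mu * e1[0], e2[1] - mu * e1[1])
        if hypot(*e2p) >= hypot(*e2):
            # no further shortening possible: the basis is reduced
            return hypot(*e1), hypot(*e2)
        e2 = e2p

# (1,0), (100,1) generates Z^2, so both minima equal 1.
assert successive_minima_2d((1, 0), (100, 1)) == (1.0, 1.0)

# Basis (2,0), (1,3) of a lattice of determinant 6.
l1, l2 = successive_minima_2d((2, 0), (1, 3))
assert l1 == 2.0 and isclose(l2, sqrt(10))
```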

Theorem 7.1.  Suppose that the matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2 \end{pmatrix}$ is fixed, $q=\det A\ne 0$, $(a_1,a_2,b_1,b_2)=1$, and the matrix $X$ has the form (1.8). Then for any parallelepiped

with dimensions satisfying the conditions $1\leqslant Z_1, Z_2\leqslant q$ and $1\leqslant Z_3, Z_4\leqslant Pq$,

Equation (7.2)

where

Equation (7.3)

See the proof in [23].

Remark 7.2.  Theorem 7.1 is a variant of Lemma 7 in [8]. However, in [8] the contribution of the summands containing $\lambda_2(A)$ and $\lambda_2(A^T)$ to the remainder term was not taken into account. Further arguments show that it is precisely these summands that make the main contribution to the final remainder (see the choice of the parameters $r_1$ and $r$ in the proof of Proposition 9.8). Here the estimate $\lambda_2(A)\ll q^{1/2+\varepsilon}$ is used, which is valid only on average with respect to $A$.

Corollary 7.3.  Suppose that the hypotheses of Theorem 7.1 hold, the dimensions of the parallelepiped

satisfy the inequalities $1\leqslant Z_1,Z_2\leqslant q$ and $1\leqslant Z_3\leqslant Pq$, and for $(a_3,b_3,c_1)\in I'$ the values of a non-negative function $f(a_3,b_3,c_1)$ vary within the interval $[Z_4,Z_4+V)$, where $1\leqslant V\ll Z_4\leqslant Pq$. Then

Proof.  Replacing $f(a_3,b_3,c_1)$ in the second sum by $Z_4$ and $Z_4+V$, we obtain, respectively, a lower and an upper estimate of the sum being calculated. Applying Theorem 7.1 and using the estimate

we arrive at the assertion of the corollary. $\square$

Proposition 7.4.  Let $a$ be a positive integer, let $q$ be an integer, let $I=[Y_1,Y_1+ Z_1)\times [Y_2,Y_2+Z_2)$, where $Z_1,Z_2>0$, and let

Then

Equation (7.4)

where

Equation (7.5)

See the proof in [24], Theorem 2.

Corollary 7.5.  Suppose that the hypotheses of Proposition 7.4 hold and that for $(x,y)\in I$ the values of a function $f(x,y)$ are contained in the segment $[Z_3,Z_3+V]$, where $Z_3>0$ and $V \ll Z_3$. Then

where $R(Z_1,Z_2)$ is defined by (7.5).

Proof.  Replacing $f(x,y)$ in this sum by $Z_3$ and $Z_3+V$, we obtain, respectively, a lower and an upper estimate for this sum. Using Proposition 7.4, the estimates (6.20), and

we arrive at the assertion of the corollary. $\square$

Corollary 7.6.  Suppose that the hypotheses of Proposition 7.4 hold and for $(x,y)\in I$ the values of a function $f(x,y)$ are contained in a segment $[0,Z_3]$. Then

Proposition 7.7.  For any positive integers $r_1,r_2\leqslant a$

Equation (7.6)

where

and the remainders $R(Z_1,Z_2)$ and $R_A(Z_1,Z_2,Z_3,Z_4)$ are determined by (7.5) and (7.3), respectively.

Remark 7.8.  For the parameter $\lambda_2(A^T)$ defined in (7.1), we use the trivial estimate $\lambda_2(A^T)\ll q/a$. Using the analogous estimate for $\lambda_2(A)$ does not enable us to single out the principal term in Theorem 3.3. In what follows we will prove (see §9.1) that, on average with respect to $A$, the upper estimate $\lambda_2(A)\leqslant q^{1/2+\varepsilon}$ holds and is close to the lower estimate. This is the reason for keeping the summation over $a_2$ and $b_1$ in the remainder term $R_2(a,q,P)$. A similar situation arises in Proposition 8.3.

Proof of Proposition 7.7.  The transition from (7.10) to (7.11) will be realized in two stages. First, by using Corollary 7.3 we pass from summation to integration with respect to the variables $a_3$, $b_3$, $c_1$, $c_2$, and then by using Corollary 7.5 we pass from summation to integration with respect to the variables $a_2$, $b_1$.

The intervals

Equation (7.7)

(their lengths were estimated in Lemma 5.6) in which the respective variables $a_3$, $b_3$, $c_1$ vary are divided into $r_1$ equal intervals, and the intervals $I_\nu=[Y_\nu,Y_\nu+Z_\nu)$ ($\nu=4,5$; $Z_4=a+1$, $Z_5=4q/a+1$) in which $a_2$, $b_1$ vary are divided into $r_2$ equal intervals:

We define a function

Equation (7.8)

It follows from the estimates $0\leqslant H(j_1,j_2;b_1)\ll a$ that the rectangle $I_{1}\times I_{2}$ can be represented in the form $I_{1}\times I_{2}=\bigsqcup_{k=0}^m W_k(b_1)$, where $m\ll \log a$ and

Correspondingly, we write the sum $\mathscr{N}_\ell(a,q,P)$ in the form

Equation (7.9)

where

Equation (7.10)

Then Proposition 7.7 will be proved if we show that

Equation (7.11)

where

Equation (7.12)

Indeed, it follows from (7.11) and (7.9) that

Thus, to obtain (7.6) it remains to pass from the variables $a_2,a_3,b_1,b_3, c_1,c_2$ to the variables $\alpha_2,\alpha_3,\beta_1,\beta_3,\gamma_1,\gamma_2$ using the relations (6.21), (6.16), and

We now estimate the number of rectangles of the partition

Equation (7.13)

in each of the sets $W_k(b_1)$. If $I_{1}(j_1)\times I_{2}(j_2)\subset W_k(b_1)$, then there is a point $(a_3,b_3)\in I_{1}(j_1)\times I_{2}(j_2)$ for which $|a_1b_3-a_3b_1|\leqslant 2^{k+1}q/r_1$. Since $a_1=a$, this inequality defines a strip of width $2^{k+2}q/(ar_1)\ll 2^{k}b/r_1$ on the plane $Oa_3b_3$ (in the direction of the $Ob_3$-axis). Therefore, above each point of the $Oa_3$-axis there are $O(2^k)$ rectangles of the partition (7.13) that have common points with this strip. Consequently, the set $W_k(b_1)$ consists of $O(r_1\cdot 2^k)$ rectangles of the partition.

We represent the sum $\mathscr{N}_{\ell,k}(a,q,P)$ in the form

Equation (7.14)

where

We now prove the asymptotic formula

Equation (7.15)

where $F_k(a_2,b_1)$ is defined by (7.12).

We note that the number of points in each of the rectangles of the partition (7.13) can be estimated as $O(qr_1^{-2})$. Thus, it follows from equation (6.10) (see Corollary 6.5) that for any rectangle $I_{1}(j_1)\times I_{2}(j_2)$ we can use the following formula with a trivial estimate for the remainder term:

Equation (7.16)

For $k=0$ equation (7.15) follows from (7.16). Hence, we assume that $k>0$ in what follows.

For fixed $a_2$ and $b_1$ the pair of variables $(a_3,b_3)$ can vary inside some rectangle $\Pi(a_2,b_1)$, by the definition of the Minkowski matrices (3.3) and (3.4). Let $\Omega_{1,2}(a_2,b_1)$ denote the domain consisting of those rectangles of the partition (7.13) that are completely contained in $\Pi(a_2,b_1)$, and let $\Omega_{1,2}'(a_2,b_1)$ be the domain consisting of the rectangles of (7.13) that have common points with the boundary of $\Pi(a_2,b_1)$. Obviously, $\Omega_{1,2}'(a_2,b_1)$ consists of $O(r_1)$ rectangles of (7.13), and by (7.16) we have

Thus, to prove (7.15) it suffices to verify that

Equation (7.17)

where $\widetilde{W}_{k}(a_2,b_1)=W_k(b_1)\cap\Omega_{1,2}(a_2,b_1)$.

Let us fix an arbitrary rectangle $I_{1}(j_1)\times I_{2}(j_2)\subset \widetilde{W}_{k}(a_2,b_1)$. We apply Corollary 7.3 on each of the parallelepipeds $I_{1}(j_1)\times I_{2}(j_2)\times I_3(j_3)$. Since $(a_3,b_3)\in W_k(b_1)$ ($k>0$), we have $|a_1b_3-a_3b_1|\gg U=2^kq/r_1$, and by Lemma 5.7 each of the functions $f_i$ defining the boundary of the set $\Omega(\overline A)$ satisfies the estimates

Equation (7.18)

Therefore, on each of the parallelepipeds $I_{1}(j_1)\times I_{2}(j_2)\times I_3(j_3)$ the values of the function $f_i$ (under the condition that its graph defines the boundary) are contained in an interval of length $V=O(P\cdot 2^{-k}q^{-1})$. By Corollary 7.3,

We arrive at (7.17) by summing the last equality over $j_3$ and over the rectangles $I_{1}(j_1)\times I_{2}(j_2)\subset \widetilde{W}_{k}(a_2,b_1)$ (the number of which is $O(r_1\cdot 2^k)$). Thus, (7.15) is also proved.

Substituting (7.15) into (7.14), we get that to prove the main formula (7.6) it suffices to verify that

Equation (7.19)

We note that

Equation (7.20)

and therefore $F_k$ always satisfies the trivial estimate

Equation (7.21)

Hence, by Corollary 7.6, in each of the rectangles of the partition

Equation (7.22)

we can apply the formula

Equation (7.23)

The boundary of the domain of variation of the pair of variables $(a_2,b_1)$ is determined by the conditions $b_2\geqslant a_1$ and $b_2\geqslant |b_1|$ (where $b_2=(q+a_2b_1)/a_1$) and therefore consists of parts of the graphs of the functions

Equation (7.24)

Let $\Omega_{4,5}$ denote the domain consisting of those rectangles of the partition (7.22) that are completely contained in the domain of variation of $(a_2,b_1)$, and let $\Omega_{4,5}'$ be the domain consisting of those rectangles of (7.22) that intersect the boundary of the domain of variation of $(a_2,b_1)$. It follows from the monotonicity of the function (7.24) that $\Omega_{4,5}'$ consists of $O(r_2)$ rectangles of (7.22). Using the formula (7.23) in each of them, we get that

Therefore, to prove (7.19) it is sufficient to verify the asymptotic formula

This equation in turn is a consequence of the fact that for any rectangle $I_{4}(j_4)\times I_{5}(j_5)\subset \Omega_{4,5}$ we have

The last relation follows from Corollary 7.5 under the condition that the values of $F_k(a_2,b_1)$ vary within an interval of length

Equation (7.25)

We now verify this condition.

Since $(a_3,b_3)\in W_k(b_1)$, it follows from Lemma 5.7 that each of the functions $f_i$ defining the boundary of the domain $\Omega(\overline A)$ (for $b_2=(q+a_2b_1)a_1^{-1}$) satisfies estimates analogous to (7.18):

Hence, for $(a_2,b_1)\in I_{4}(j_4)\times I_{5}(j_5)$ the area of the domain $\Omega(\overline A)$ can vary within an interval of length $O\biggl(\dfrac{r_1}{2^k}\,\dfrac{P^2}{r_2q^2}\biggr)$. This implies that the values of $F_k(a_2,b_1)$ can vary in an interval of length

which agrees with the estimate (7.25).

The values of $F_k(a_2,b_1)$ can also vary because the domain $W_k$ itself changes: different values of $b_1$ may give rise to different domains $W_k(b_1)$. Assume that $b_1,b_1'\in I_5(j_5)$ and $W_k(b_1)\ne W_k(b_1')$; for example, suppose that $I_{1}(j_1)\times I_{2}(j_2)\subset W_k(b_1)\setminus W_k(b_1')$, that is,

Equation (7.26)

where the function $H$ is defined by (7.8). The values of $H$ for fixed $j_1,j_2$ and different $b_1,b_1'$ can differ by $O(r_1/r_2)$. Moreover, $H(j_1,j_2+1;b_1)-H(j_1,j_2;b_1)\gg 1$. Therefore, for a fixed $j_1$ the number of indices $j_2$ for which the conditions (7.26) can hold can be estimated as $O(r_1/r_2+1)$. Thus, the areas of the figures $W_k(b_1)$ and $W_k(b_1')$ differ by at most

This fact and the estimate (7.20) imply that the values of $F_k(a_2,b_1)$ can then vary in an interval of length

which again agrees with the estimate (7.25). $\square$

§ 8. Third variant of estimation of the remainder

Theorem 8.1.  Suppose that a matrix $A=\begin{pmatrix} a_1 & a_2\\ b_1 & b_2\end{pmatrix}$ and a coefficient $a_3$ are fixed, $q=\det A\ne 0$, $(a_1,a_2,b_1,b_2)=1$, $D=(a_1,a_2)$, $(D,a_3)=1$, and the matrix $X$ has the form (1.8). Then any parallelepiped

with $1\leqslant Z_2\leqslant q$ and $1\leqslant Z_3, Z_4\leqslant Pq$ satisfies

Equation (8.1)

where

and $\lambda_2(A)$ is defined in (7.1).

See the proof in [23].

Corollary 8.2.  Suppose that the hypotheses of Theorem 8.1 hold, the dimensions of the rectangle

satisfy the inequalities $1\leqslant Z_2\leqslant q$ and $1\leqslant Z_3\leqslant Pq$, and for $(b_3,c_1)\in I'$ the values of a non-negative function $f(b_3,c_1)$ vary within an interval $[Z_4,Z_4+V)$, where $1\leqslant V\ll Z_4\leqslant Pq$. Then

The proof is similar to the proof of Corollary 7.3.

Proposition 8.3.  Let $r$ be a positive integer such that $1\leqslant r\leqslant a$. Then the sum $\mathscr{N}_\ell(a,q,P)$ in (5.7) satisfies the asymptotic formula

where $c(q,D)$ is defined by (6.18), $\rho_3(a,q)\ll aq^{-2+\varepsilon}$,

and $R_A(Z_2,Z_3,Z_4)$ is the remainder in Theorem 8.1.

Proof.  The first part of the proof in essence repeats the proof of Proposition 7.7.

The intervals $I_2=[Y_2,Y_2+Z_2)$ and $I_3=[Y_3,Y_3+Z_3)$ (see (7.7)) in which the respective variables $b_3$ and $c_1$ vary are divided into $r$ equal intervals:

Equation (8.2)

We define a function

Let the interval $I_{2}$ be represented as $I_{2}=\bigsqcup_{k=0}^m W_k(b_1)$, where $m\ll \log a$,

We write the sum $\mathscr{N}_{\ell}(a,q,P)$ in the form

Equation (8.3)

where

Equation (8.4)

Let us now verify the asymptotic formula

Equation (8.5)

To this end we first estimate the number of intervals $I_{2}(j)$ contained in each of the sets $W_k(b_1)$. If $I_{2}(j)\subset W_k(b_1)$, then there exists a point $b_3\in I_{2}(j)$ for which the inequality

Equation (8.6)

holds. Since $a_1=a$, this inequality defines a segment of length $2^{k+2}q/(ar)$ on the $Ob_3$-axis. Therefore, there exist $O(2^k)$ intervals $I_{2}(j)$ which have common points with this segment, that is, the set $W_k(b_1)$ consists of $O(2^k)$ intervals of the partition.

It follows from (6.10) that for any interval $I_{2}(j)$ we have the following asymptotic formula with a trivial estimate of the remainder term:

Equation (8.7)

For $k=0$ equation (8.5) follows from (8.7). Hence, in what follows we assume that $k>0$.

Equation (8.5) will be proved if for any interval $I_{2}(j)\subset W_k(b_1)$ we show that

Equation (8.8)

(For the intervals $I_{2}(j)$ that intersect $W_k(b_1)$ only partially, it suffices to use (8.7).)

If $b_3\in W_k(b_1)$, then $|a_1b_3-a_3b_1|\gg U={2^kq}/{r}$. Therefore, it follows from the conditions (5.6) that for each of the functions $f_i$ whose graphs define the boundary of $\Omega(\overline A)$ we have the estimates

Hence, on each of the rectangles $I_2(j_2)\times I_3(j_3)$ the values of the functions $f_i$ (under the condition that the corresponding part of the graph defines the boundary) are contained in an interval of length $V=O(Pq^{-1}\cdot 2^{-k})$. On each of the parallelepipeds $I(j_2,j_3)=I_{2}(j_2)\times I_{3}(j_3)$ with $I_{2}(j_2)\subset W_k$ we use Corollary 8.2:

Summing the resulting equation over $j_3$, we arrive at (8.8). Thus, (8.5) is proved.

The further steps will be different from the proof of Proposition 7.7.

We get from (8.3)–(8.5) that

Using (6.21), we pass to the variables $\gamma_1,\gamma_2$:

where

Equation (8.9)

To prove the proposition it remains to verify that the sum $S(a,q,\boldsymbol{\gamma})$ satisfies the asymptotic formula

Equation (8.10)

where $c(q,D)$ is defined by (6.18).

Suppose that the values of $a_2$, $a_3$, $b_3$, $\gamma_1$, and $\gamma_2$ are fixed and consider the sum

Let $(u_0,v_0)$ be a particular solution of the equation $\begin{vmatrix} a_1 & a_2 \\ u & v \end{vmatrix}=D$. Then

and all the solutions of the equation $\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}=q$ with respect to the unknowns $b_1,b_2$ can be written in the form

Since $\begin{pmatrix} a_1/D& u_0\\ a_2/D&v_0 \end{pmatrix} \in SL_2(\mathbb{Z})$, we have

Therefore,

where

This function has finitely many intervals of monotonicity and satisfies the estimate $F(t)\ll q^{-3}$. By Lemma 6.6,

The formula for $\sigma(I)$ thus obtained lets us replace the summation over the variable $b_1$ in (8.9) by integration. Using the definition (6.18) of $c(q,D)$, we find that

Passing from the variables $b_1,b_3$ to the variables $\beta_1,\beta_3$ and using (6.16), we obtain the required formula (8.10). Proposition 8.3 is proved. $\square$

§ 9. Completion of the proof of the main result

9.1. On reduced bases in planar lattices

Lemma 9.1.  Let $Q$ be a positive integer, let $a_1,\dots,a_Q$ and $\lambda_1\leqslant\cdots\leqslant\lambda_Q$ be real numbers, and let $g$ be a continuously differentiable function defined on the segment $[\lambda_1,\lambda_Q]$. Then

Equation (9.1)

Proof.  To verify the assertion of the lemma it suffices to transform the integral on the right-hand side of (9.1):

Corollary 9.2.  Suppose that the hypotheses of Lemma 9.1 hold and $a_1=\dots=a_Q= 1$. Then

Lemma 9.3.  Let $t\geqslant 1$ be a real number. Then the number of lattices $\Lambda\subset\mathbb{Z}^2$ such that $\det\Lambda=q$ and $\lambda_2(\Lambda)\geqslant t\sqrt{q}$ can be estimated as $O(q^{1+\varepsilon}t^{-2})$.

Proof.  Since $\lambda_2(\Lambda)\leqslant q$, the assertion of the lemma obviously holds for $t>\sqrt{q}$. Thus, in what follows we assume that $t\leqslant\sqrt{q}$. Let $(e_1,e_2)$ be a reduced basis of the lattice $\Lambda$ with determinant $q$ (that is, $e_1$ is a shortest vector of the lattice, and the projection of $e_2$ onto $e_1$ lies between $-e_1/2$ and $e_1/2$). Then it follows from the inequality $\|e_2\|\geqslant t\sqrt{q}$ that $\|e_1\|\leqslant 2\sqrt{q}/t$. For a fixed vector $e_1=(x_1,y_1)$ the endpoint of the vector $e_2=(x_2,y_2)$ must lie on the straight line $\{(x,y)\colon xy_1-yx_1= q\}$ and satisfy the condition $-\|e_1\|^2/2<(e_1,e_2)\leqslant\|e_1\|^2/2$. Therefore, the vector $e_2$ can be chosen in $(x_1,y_1)$ ways. Thus,

Lemma 9.4.  Let $\beta\leqslant 2$. Then

Proof.  We use Corollary 9.2, setting $g(\lambda)=\lambda^\beta$. As $\lambda_1,\dots,\lambda_Q$ we take the numbers $\lambda_2(\Lambda)$ over all lattices $\Lambda$ with determinant $q$, arranged in ascending order. Furthermore, $Q=\sigma(q)\ll q^{1+\varepsilon}$ and $\lambda_1\ll\sqrt{q}$ (we can always choose a vector $e_1$ with length of order $\sqrt{q}$ and supplement it to form a reduced basis with determinant $q$). Hence,

Substituting the estimate in Lemma 9.3 into the last integral, we arrive at the assertion of the lemma. $\square$
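The count $Q=\sigma(q)$ of sublattices of $\mathbb{Z}^2$ with determinant $q$, used in the proof above, corresponds to the standard enumeration of sublattices by Hermite normal forms $\begin{pmatrix} a & b\\ 0 & d\end{pmatrix}$ with $ad=q$ and $0\leqslant b<a$; this can be confirmed directly for small $q$ (a sketch relying only on this standard correspondence):

```python
def sublattices_index_q(q):
    """Hermite-normal-form bases ((a, b), (0, d)) of the sublattices of Z^2
    of index q: a * d = q, 0 <= b < a."""
    return [((a, b), (0, d))
            for d in range(1, q + 1) if q % d == 0
            for a in [q // d]
            for b in range(a)]

def sigma(q):
    """Sum of divisors of q."""
    return sum(d for d in range(1, q + 1) if q % d == 0)

# The number of sublattices of determinant q equals sigma(q).
for q in (6, 12, 30):
    assert len(sublattices_index_q(q)) == sigma(q)
```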

Corollary 9.5.  Let $\beta\leqslant 2$ and let

Then

Proof.  We transform this sum:

By property 3 of reduced matrices, to each term in the inner sum there corresponds some Voronoi basis of the lattice $\Lambda$, and to each Voronoi basis there correspond at most four matrices $A$. Consequently, the inner sum can be estimated, up to a constant, by the number of minimal bases, that is,

Estimating the remaining sum by Lemma 9.4, we arrive at the assertion of the corollary. $\square$

Corollary 9.6.  Suppose that $\beta\leqslant 2$, $0< A_1\leqslant A_2$, and $\alpha$ is an arbitrary real number. Then

Proof.  The required estimate is obtained from Corollary 9.5 by using the Abel transformation

Setting $f(a)=a^\alpha$ and $g(a)=\sum_{a_2,b_1}\lambda_2^\beta(A)[A\in\mathscr{A}_\ell(a,q)]$, we get that

9.2. Estimation of the remainder term

Lemma 9.7.  Let $a$ be a positive integer, and let $\alpha,Q_1,Q_2$ be real numbers with $0<Q_1<Q_2$. Then

Proof.  Setting $d=(a,q)$, we have

Proposition 9.8.  The following estimates hold:

Proof.  We sum the remainder $R_1(a,q,P)=\displaystyle\frac{P^{1+\varepsilon}q}{a^{2}}$ (see Proposition 6.10) over the points $(a,q)\in\Omega_1$:

Let us estimate the sum of the remainders $R_2(a,q,P)$. By Proposition 7.7,

We choose the values of the parameters $r_1,r_2$ based on the equations

assuming that $\lambda_2(A)\asymp q^{1/2}$:

(Note that the estimates $r_1,r_2\leqslant a$ hold in the domain $\Omega_2$, and therefore Proposition 7.7 is indeed applicable.) Then it follows from the inequality $\lambda_2(A)\geqslant q^{1/2}$ that for $(a,q)\in\Omega_2$

We estimate the sum of each of the terms over the pairs $(a,q)\in\Omega_2$. For the first term,

Applying Corollary 9.6 and Lemma 9.7 to the inner sums, we arrive at the required estimate:

We sum the remaining part of the remainder $R_2(a,q,P)$. The required estimate also is verified by using Lemma 9.7:

Consider the third remainder (see Proposition 8.3):

We choose the value $r=\lfloor P^{1/3}a^{-1/3}q^{-5/12}\rfloor$ based on the equation

again assuming that $\lambda_2(A)\asymp q^{1/2}$. For this choice of $r$ and for $(a,q)\in\Omega_3$ we have

Applying Corollary 9.6 and Lemma 9.7 successively, we find that

9.3. Auxiliary assertions

Along with the Euler function $\varphi(q)$, we use the functions

Lemma 9.9.  Let $Q\geqslant 2$. Then

Equation (9.2)

Equation (9.3)

Equation (9.4)

where $c_0$ is an absolute constant,

Equation (9.5)

and $\gamma$ is the Euler constant.

Proof.  We verify equation (9.2). Expressing $\varphi_2(q)$ in terms of the Möbius function, we arrive at the relations

To calculate the resulting sum it is sufficient to use the fact that

and in particular,

We transform the second sum:

Using the equation

Equation (9.6)

where $\gamma_1$ is the first Stieltjes constant (see [127], §2.21), we find that

Setting

we obtain the second formula of the lemma.

See the proof of (9.4) in [24], Lemma 9. $\square$
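Mean-value formulae of the type (9.2)–(9.4) can be illustrated by the classical analogue for Euler's function, $\sum_{q\leqslant Q}\varphi(q)=\frac{3}{\pi^2}Q^2+O(Q\log Q)$ (a standard fact used here only as an illustration, not the specific formula of Lemma 9.9):

```python
from math import gcd, pi

def phi(n):
    """Euler's totient, by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

Q = 1000
total = sum(phi(q) for q in range(1, Q + 1))
main = 3 * Q * Q / pi ** 2

# The relative error decays like (log Q)/Q.
assert abs(total - main) / main < 0.01
```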

Lemma 9.10.  Let $D$ be a positive integer and let $c(q,D)$ be defined by (6.18). Then the sum

satisfies the asymptotic formula

Proof.  Setting $q=Dq_1$, we obtain for this sum the representation

We transform the sum obtained, introducing the parameters $\Delta=(D,q_1)$, $q_2=q_1\Delta^{-1}$, and $q_3=q_2\delta^{-1}$:

We apply equation (9.4) to the inner sum:

Substituting the equation (see [24], Lemma 8)

into the last formula, we arrive at the assertion of the lemma. $\square$

9.4. Calculation of the principal term

To complete the proof of Theorem 5.5 it remains to pass to integration with respect to the variables in the first row of the matrix $X$. First we need to sum over the variable $q$.

The conditions $a\leqslant b\leqslant c$ satisfied by the matrices $X\in\mathscr{M}_\ell(a,q,P)$ are not invariant under the left action of $D_3(\mathbb{R})$. We express $c$ and $b$ from the equalities

where $A''=\begin{pmatrix} 1 & \alpha_2\\ \beta_1 & 1 \end{pmatrix}$. Then after passing to the set $\mathscr{M}_\ell'''(a,q,P)$, the inequalities $a\leqslant b\leqslant c$ take the form

Equation (9.7)

Therefore, for given coefficients of the matrix $X'''$, for each $i=1,2,3$ the domain $\Omega_i(X''')$ in which $a$ and $q$ can vary is defined by

Remark 9.11.  As in the two-dimensional case (see §4.7), it is natural to extend by zero the values of the functions $\rho_{2,3}(a,q)$ for those pairs $(a,q)$ for which the sets $\mathscr{M}_\ell(a,q,P)$ are empty. Thus, we assume that the values $\rho_{2,3}(a,q)$ are defined for all positive integers $a$ and $q$.

Proposition 9.12.  The quantity $\mathscr{N}_\ell^{(2)}(P)$ defined by (5.9) satisfies the asymptotic formula

where $\rho_{2}(a)\ll a^{-2+\varepsilon}$.

Proof.  By Propositions 7.7 and 9.8 we have

Equation (9.8)

where

The set $\widetilde{\mathscr{M}}'''$ is obtained from $\mathscr{M}_\ell'''(a,q,P)$ by discarding the conditions (9.7) while keeping the remaining (invariant) inequalities. Therefore,

Equation (9.9)

and

We apply the relation

(see [24], Corollary 3) to the sum obtained above. The lower limit of summation $Q_1$ satisfies the estimate $Q_1\gg a^2$. Thus,

Substituting the last formula into (9.8), we arrive at the required equality. $\square$

Lemma 9.13.  Suppose that $\Omega\subset[-a,a]^2$ and the boundary of the domain $\Omega$ has length $O(a)$. Then the sum

satisfies the asymptotic formula

Proof.  We get rid of the coprimeness condition by using the Möbius function:

Let

Then

Consequently,

The number of unit squares intersected by the boundary of the domain $d^{-1}\Omega$ is $O(a')$. In each of these squares, we can use the following formula with a trivial estimate of the remainder term:

Thus,

Proposition 9.14. 

where $\rho_{1,3}(a)\ll a^{-5/4+\varepsilon}$.

Proof.  By Propositions 6.10, 8.3, and 9.8 we have

Equation (9.10)

where

It follows from equation (9.9) that

We apply Lemma 9.10 to the sum over $q$:

By Lemma 9.7,

Therefore,

Using Lemma 9.13, we arrive at the asymptotic formula

Equation (9.11)

Furthermore, by Lemma 9.7 we have

Equation (9.12)

Equation (9.13)

Substituting (9.11), (9.12), and (9.13) into (9.10), we arrive at the required equality. $\square$

Corollary 9.15.  Asymptotically,

where $\rho(a)\ll a^{-5/4+\varepsilon}$.

For the proof it is sufficient to substitute the asymptotic formulae in Propositions 9.12 and 9.14 into (5.8).

Proof of Theorem 5.5.  Since

it follows by Corollary 9.15 that

We transform the integral with respect to the variable $q$:

Substituting the resulting expression into the formula for $\mathscr{N}_\ell(P)$ and using (9.2) and (9.3), we arrive at the assertion of the theorem. $\square$
