In another blog post of mine I talk about the “polynomial zeta function” \(\zeta_q(s)\) and prove an analogue of the Riemann hypothesis for it (indeed, this zeta has no zeroes at all). At the end I mention that this zeta function is, in fact, the “correct” zeta function for this purpose. In this blog post I am going to derive a formula analogous to von Mangoldt’s explicit formula. This is a little project for me to see whether I’m able to work out all the details.

The proof is based on the exposition in Davenport’s Multiplicative Number Theory and requires some understanding of complex analysis.

Preliminary definitions and facts

\(\newcommand{\re}{\operatorname{Re}}\newcommand{\im}{\operatorname{Im}}\)
Fix a finite field \(\mathbb F_q\) with \(q\) elements. Throughout we are only interested in monic polynomials in \(\mathbb F_q[x]\). For such a polynomial \(f\), we define its norm to be \(Nf=|\mathbb F_q[x]/(f)|=q^{\deg f}\). We define the zeta function for \(\re s>1\) by
\[\zeta_q(s)=\sum_f(Nf)^{-s}\]
(this and the following sums range over monic polynomials). Using multiplicativity of the norm and uniqueness of factorization in \(\mathbb F_q[x]\) we can establish an alternative expression for this zeta, known as the Euler product:
\[\zeta_q(s)=\prod_{p\text{ prime}}(1-(Np)^{-s})^{-1}.\]
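(In one line: every monic \(f\) factors uniquely into monic irreducibles, so
\[\sum_f(Nf)^{-s}=\prod_{p\text{ prime}}\sum_{k=0}^\infty(N(p^k))^{-s}=\prod_{p\text{ prime}}(1-(Np)^{-s})^{-1},\]
the rearrangement being justified by absolute convergence for \(\re s>1\).)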
It’s easy to give an explicit formula for \(\zeta_q\) (unlike for the standard zeta function or many of its variants; for reference, the formula itself is recorded right after this list), but in order to show off analytic techniques, we will only use the following few facts:

  • \(\zeta_q\) can be extended to a meromorphic function on the whole complex plane,
  • \(\zeta_q\) is nonzero everywhere and has only simple poles at points \(1+2\pi in/\log q,n\in\mathbb Z\),
  • \(\frac{\zeta_q'}{\zeta_q}\) has simple poles with residue \(-1\) at points \(1+2\pi in/\log q,n\in\mathbb Z\) and is holomorphic elsewhere (this follows from the previous two points),
  • \(\left|\frac{\zeta_q'(s)}{\zeta_q(s)}\right|\) is bounded when \(|s-\rho|>\varepsilon\) for all poles \(\rho\) of \(\zeta_q\) and any fixed \(\varepsilon\).
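(For reference, the explicit formula alluded to above: there are exactly \(q^n\) monic polynomials of degree \(n\), each of norm \(q^n\), so for \(\re s>1\)
\[\zeta_q(s)=\sum_{n=0}^\infty q^n\cdot q^{-ns}=\frac{1}{1-q^{1-s}},\]
and all of the facts listed above can be read off from this expression.)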

For \(\re s>1\) we can find an expression for \(\frac{\zeta_q'}{\zeta_q}\) by taking the natural logarithm (denoted below by \(\log\)) of \(\zeta_q\) and differentiating (for this reason we call \(\frac{\zeta_q'}{\zeta_q}\) the logarithmic derivative of \(\zeta_q\)). Skipping the intermediate steps, we get
\[-\frac{\zeta_q'(s)}{\zeta_q(s)}=\sum_{p\text{ prime}}\log Np\cdot\sum_{k=1}^\infty (N(p^k))^{-s}.\]
If we introduce the von Mangoldt function for polynomials, which is defined by \(\Lambda_q(f)=\log Np\) if \(f=p^k\) for some irreducible \(p\) and \(k\geq 1\), and \(\Lambda_q(f)=0\) otherwise, then we find
\[-\frac{\zeta_q'(s)}{\zeta_q(s)}=\sum_f\Lambda_q(f)(Nf)^{-s}\qquad(*)\]
(note: in my blog post on the Riemann hypothesis I define \(\Lambda_q\) using the logarithm to the base \(q\). This only makes a difference of a factor \(\log q\), and the convention used here makes the formulas simpler. In the last section we will get rid of it, though). This formula is the starting point for the analytic arguments which will establish an explicit formula for the partial sums \(\psi_q(x)=\sum_{Nf\leq x}\Lambda_q(f)\), meaning the sum over all polynomials with norm at most \(x\). In fact, it will be more convenient to deal with the modified summatory function \(\widetilde\psi_q(x)=\sum_{Nf\leq x}'\Lambda_q(f)\), where \(\sum'\) indicates that if \(Nf=x\), then we count only half of the \(\Lambda_q(f)\) term.
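As a concrete example, take \(q=2\). The monic polynomials of norm at most \(4\) are: \(1\) (with \(\Lambda_2=0\)), the two irreducibles \(x\) and \(x+1\) of degree \(1\) (each contributing \(\log 2\)), and the four monic polynomials of degree \(2\), namely \(x^2\) and \(x^2+1=(x+1)^2\) (prime powers, each contributing \(\log 2\)), \(x^2+x=x(x+1)\) (contributing \(0\)) and the irreducible \(x^2+x+1\) (contributing \(\log 4\)). Hence \(\psi_2(4)=6\log 2\), while \(\widetilde\psi_2(4)=2\log 2+\frac{1}{2}\cdot 4\log 2=4\log 2\), since the norm-\(4\) terms are only counted with weight \(\frac{1}{2}\).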

Heuristic

\(\newcommand{\Res}{\operatorname{Res}}\)
There is a rather simple heuristic argument which works for many Dirichlet series and which shows how the partial sums of an arithmetic function “should” behave asymptotically. We start off with the following integral: for \(c,y\) positive and real we have
\[\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\frac{y^s}{s}\mathrm ds=\begin{cases}
1 & \text{for }y>1,\\
\frac{1}{2} & \text{for }y=1,\\
0 & \text{for }y<1,
\end{cases}\]
where \(\int_{c-i\infty}^{c+i\infty}\) means the limit of line integrals \(\int_{c-iT}^{c+iT}\) as \(T\) goes to infinity. Therefore if we multiply \((*)\) by \(\frac{x^s}{s}\) and integrate from \(2-i\infty\) to \(2+i\infty\), we get (heuristically! we ignore issues with swapping the integral and the sum)
\[\frac{1}{2\pi i}\int_{2-i\infty}^{2+i\infty}-\frac{\zeta_q'(s)}{\zeta_q(s)}\frac{x^s}{s}\mathrm ds=\sum_f\Lambda_q(f)\frac{1}{2\pi i}\int_{2-i\infty}^{2+i\infty}\frac{(x/Nf)^s}{s}\mathrm ds=\widetilde\psi_q(x).\]
Now we move the integration contour — we continuously deform the line \((c-i\infty,c+i\infty)\) from \(c=2\) to \(-\infty\). Here we use (heuristically) the residue theorem, which says that the value of the integral only changes when the contour passes through a pole of the integrand. If the integral vanishes as \(c\rightarrow-\infty\), this will give us
\[\frac{1}{2\pi i}\int_{2-i\infty}^{2+i\infty}-\frac{\zeta_q'(s)}{\zeta_q(s)}\frac{x^s}{s}\mathrm ds=\sum_z\Res\left(-\frac{\zeta_q'}{\zeta_q}\frac{x^s}{s},z\right).\]
The poles of this function appear exactly at \(s=0\), where the residue is equal to \(-\frac{\zeta_q'(0)}{\zeta_q(0)}\), and at the poles of \(\frac{\zeta_q'}{\zeta_q}\). The residue of \(-\frac{\zeta_q'}{\zeta_q}\) itself at each pole \(\rho\) is \(1\), but since we multiply by \(\frac{x^s}{s}\), the residue of the integrand at \(\rho\) is \(\frac{x^\rho}{\rho}\). In the end, taking into account the formula for the poles of \(\zeta_q\), we obtain
\[\widetilde\psi_q(x)=-\frac{\zeta_q'(0)}{\zeta_q(0)}+\sum_{k=-\infty}^\infty\frac{x^{1+2\pi ik/\log q}}{1+2\pi ik/\log q}.\]

While the argument above is not quite a proof, this formula turns out to be fully correct. The following section contains a more formal argument.

Rigorous derivation

Formally deriving the above formula is a bit more work, but still doesn’t require anything beyond basic complex analysis. If you are satisfied with the above heuristic and are more interested in the consequences of this explicit formula, I recommend skipping ahead to the next section and returning to this one later.

For any \(x,T\) real and positive we consider the integral \(J(x,T)=\frac{1}{2\pi i}\int_{2-iT}^{2+iT}-\frac{\zeta_q'(s)}{\zeta_q(s)}\frac{x^s}{s}\mathrm ds\). If \(T\) doesn’t coincide with an imaginary part of a pole, for any \(U>0\) we can apply the residue theorem to the rectangular contour with vertices \(2-iT,2+iT,-U+iT,-U-iT\) to get that \(J(x,T)\) is equal to the sum of residues, \(-\frac{\zeta_q'(0)}{\zeta_q(0)}+\sum_{|\im\rho|\leq T}\frac{x^\rho}{\rho}\), \(\rho\) ranging over the poles, plus the sum of three integrals along the line segments going through \(2-iT,-U-iT,-U+iT,2+iT\).

Perturbing \(T\) by a bounded amount (which won’t affect the value as \(T\) goes to infinity, since the integrand is \(O(x^2/T)\) on the affected piece of the line \(\re s=2\)) we can make it so that the contour is not close to any of the poles (moving, for example, \(T\) to some \((2k+1)\pi/\log q\)). By one of the properties above, \(\left|\frac{\zeta_q'}{\zeta_q}\right|\) is then bounded on that contour by some constant \(A\). For \(U\geq T\), we have \(|s|\geq T\) on the three added segments and \(|x^s|=x^{\re s}\). On the vertical line segment we have \(\re s=-U\), so the integral over this segment can be estimated by
\[\int_{-T}^T A\frac{x^{-U}}{T}\mathrm dt=2Ax^{-U},\]
which goes to zero as \(U\rightarrow\infty\) provided \(x>1\) (this is the first and last place where we need that assumption!) and on each of the horizontal segments the integral can be estimated by
\[\int_{-U}^2 A\frac{x^t}{T}\mathrm dt\leq\frac{A}{T}\int_{-\infty}^2x^t\mathrm dt=\frac{A}{T}\frac{x^2}{\log x}.\]
Hence we find \(J(x,T)=-\frac{\zeta_q'(0)}{\zeta_q(0)}+\sum_{|\im\rho|\leq T}\frac{x^\rho}{\rho}+O(T^{-1}x^2(\log x)^{-1})\) (the error term could be improved by a factor \(x\) if we took, say, \(1+(\log x)^{-1}\) in place of \(2\)).

Now let’s estimate the difference between \(J(x,T)\) and \(\widetilde\psi_q(x)\). For this, we again use the integral \(\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\frac{y^s}{s}\mathrm ds\) from the heuristic, but this time we need to know how quickly the truncated integral converges to its limit, in order to control the error terms. Let
\[I(y,T)=\frac{1}{2\pi i}\int_{c-iT}^{c+iT}\frac{y^s}{s}\mathrm ds\\
\delta(y)=\begin{cases}
1 &\text{for }y>1,\\
\frac{1}{2} &\text{for }y=1,\\
0 &\text{for }y<1.
\end{cases}\]
Then the following estimate holds, which is proven in Davenport’s book and which I don’t reprove here:
\[|I(y,T)-\delta(y)|<\begin{cases} y^c\min\{1,T^{-1}|\log y|^{-1}\} &\text{for }y\neq 1,\\ cT^{-1} &\text{for }y=1. \end{cases}\]
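As a quick numerical sanity check of this bound (not needed for the argument), here is a small sketch; the test values of \(y\) and \(T\) and the crude trapezoidal quadrature below are arbitrary choices of mine.

```python
# Numerically approximate I(y, T) on the line Re s = c = 2 and compare
# |I(y, T) - delta(y)| with the bound quoted above.
# Sketch only: the test values and the crude trapezoidal quadrature
# are arbitrary choices.
import numpy as np
from math import log, pi

def I(y, T, c=2.0, steps=200_001):
    t = np.linspace(-T, T, steps)
    vals = y ** (c + 1j * t) / (c + 1j * t)  # integrand y^s / s with s = c + it
    dt = t[1] - t[0]
    integral = (vals[:-1] + vals[1:]).sum() * dt / 2  # trapezoid rule
    # ds = i dt, so the 1/(2 pi i) in front becomes 1/(2 pi)
    return (integral / (2 * pi)).real

def delta(y):
    return 1.0 if y > 1 else (0.5 if y == 1 else 0.0)

for y, T in [(2.0, 50.0), (0.5, 50.0), (1.0, 50.0)]:
    err = abs(I(y, T) - delta(y))
    bound = 2 / T if y == 1 else y ** 2 * min(1.0, 1 / (T * abs(log(y))))
    print(f"y = {y}: |I - delta| = {err:.5f}, bound = {bound:.5f}")
```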
Note that we have \(\widetilde\psi_q(x)=\sum_f\Lambda_q(f)\delta\left(\frac{Nf}{x}\right)\) and, appealing to uniform convergence for \(\re s>1+\varepsilon\),
\[J(x,T)\stackrel{(*)}{=}\frac{1}{2\pi i}\int_{2-iT}^{2+iT}\left(\sum_f\Lambda_q(f)(Nf)^{-s}\right)\frac{x^s}{s}\mathrm ds=\sum_f\Lambda_q(f)I\left(\frac{Nf}{x},T\right),\]
therefore our goal is to estimate the difference
\[R(x,T)=J(x,T)-\widetilde\psi_q(x)=\sum_f\Lambda_q(f)\left(I\left(\frac{Nf}{x},T\right)-\delta\left(\frac{Nf}{x}\right)\right)\]
and show it goes to zero with \(T\) going to infinity. We have (since \(c=2\) here)
\[|R(x,T)|\leq\sum_{Nf\neq x}\Lambda_q(f)\left(\frac{x}{Nf}\right)^2\min\left\{1,T^{-1}\left|\log\frac{x}{Nf}\right|^{-1}\right\}+2T^{-1}\sum_{Nf=x}\Lambda_q(f).\]
In the last sum, note that \(\Lambda_q(f)\leq\log Nf=\log x\) and, if \(x=q^d\) (so that there are any nonzero terms at all), the number of terms is at most the number of monic polynomials of degree \(d\), which is \(q^d=x\), hence this last sum is \(O(T^{-1}x\log x)\).

For \(Nf\) smaller than \(\frac{3}{4}x\) or larger than \(\frac{5}{4}x\), \(\left|\log\frac{x}{Nf}\right|^{-1}=O(1)\), hence the sum over these terms is \(O\left(x^2T^{-1}\sum_f\Lambda_q(f)(Nf)^{-2}\right)=O(x^2T^{-1})\).

Let \(\langle x\rangle\) be the distance between \(x\) and the closest power of \(q\) (distinct from \(x\) if \(x\) happens to be one). Then for any \(f\) with \(Nf\neq x\) we have \(|Nf-x|\geq\langle x\rangle\). Hence we have
\[\left|\log\frac{x}{Nf}\right|=\left|\log\frac{Nf}{x}\right|\geq\left|\log\left(1\pm\frac{\langle x\rangle}{x}\right)\right|\geq\frac{\langle x\rangle}{2x}\]
(using \(|\log(1-u)|\geq u\) for the minus sign and, since \(\langle x\rangle\leq x\), \(\log(1+u)\geq u/2\) for \(0\leq u\leq 1\) for the plus sign),
hence the contribution of any \(f\) with \(\frac{3}{4}x\leq Nf\leq\frac{5}{4}x\) to the first sum is, up to a constant, \(\Lambda_q(f)\frac{x}{T\langle x\rangle}=O\left(\frac{x\log x}{T\langle x\rangle}\right)\). There are \(O(x)\) monic polynomials with norm in this range, so they contribute \(O\left(\frac{x^2\log x}{T\langle x\rangle}\right)\). In the end, this gives
\[R(x,T)=O\left(\frac{x\log x}{T}\right)+O\left(\frac{x^2}{T}\right)+O\left(\frac{x^2\log x}{T\langle x\rangle}\right)=O\left(\frac{x^2}{T}\max\left\{1,\frac{\log x}{\langle x\rangle}\right\}\right).\]

Finally, we arrive at the equality
\[\widetilde\psi_q(x)=-\frac{\zeta_q'(0)}{\zeta_q(0)}+\sum_{|\im\rho|\leq T}\frac{x^\rho}{\rho}+O\left(\frac{x^2}{T}\max\left\{1,\frac{\log x}{\langle x\rangle}\right\}\right)\]
for \(x>1\), the sum being over the poles \(\rho\) of \(\zeta_q\). In particular, for a fixed \(x>1\), letting \(T\rightarrow\infty\) we get
\[\widetilde\psi_q(x)=-\frac{\zeta_q'(0)}{\zeta_q(0)}+\sum_{\rho}\frac{x^\rho}{\rho}=-\frac{\zeta_q'(0)}{\zeta_q(0)}+\sum_{k=-\infty}^\infty\frac{x^{1+2\pi ik/\log q}}{1+2\pi ik/\log q}.\]

phew.

Consequences

There are two main reasons why we can’t derive the prime number theorem for polynomials (which one perhaps should call the irreducible polynomial theorem, but that doesn’t have the same ring to it) in a manner similar to how one derives the standard PNT from the standard explicit formula (or, more precisely, from uniform bounds on its rate of convergence):

  • The error bound is horrible, since now we need \(T\) to be noticeably larger than \(x\) to make the error term smaller than the main term, and for \(T\) so large it becomes difficult to estimate the main term. Also, the error term cannot be easily improved, because the terms cluster at norms equal to powers of \(q\).
  • The poles lie on the line \(\re s=1\), so \(\left|\frac{x^\rho}{\rho}\right|=\frac{x^{\re\rho}}{|\rho|}=\frac{x}{|\rho|}\), which has the same order of magnitude as the “intended” main term \(x\).

In fact, there have to be some problems here. The reason is that PNT doesn’t hold in the expected way — we do not have \(\widetilde\psi_q(x)\sim x\). This comes from the fact that the only possible norms are powers of \(q\), so \(\widetilde\psi_q\) is constant between them and the gaps are quite large. However, we can still derive a form of PNT when we restrict \(x\) to the values \(q^n\) for \(n>0\). Indeed, for these \(x\) we have, for a pole \(\rho=1+2\pi ik/\log q\),\[x^\rho=(q^n)^{1+2\pi ik/\log q}=q^ne^{n\log q\cdot 2\pi ik/\log q}=q^ne^{2\pi ink}=q^n,\]
hence the explicit formula takes the form
\[\widetilde\psi_q(q^n)=-\frac{\zeta_q'(0)}{\zeta_q(0)}+q^n\sum_{k=-\infty}^\infty\frac{1}{1+2\pi ik/\log q}.\]
We can find the sum of this series — first we note that, if we pair up the terms for \(k\) and \(-k\), we get
\[\frac{1}{1+2\pi ik/\log q}+\frac{1}{1-2\pi ik/\log q}=\frac{2}{1+(2\pi k/\log q)^2},\]
hence this sum is equal to \(1+2\sum_{k=1}^\infty\frac{1}{1+(\pi k/z)^2}\) for \(z=\frac{\log q}{2}\). This series is (equivalent to) a well-known partial fraction formula for the hyperbolic cotangent:
\[1+2\sum_{k=1}^\infty\frac{1}{1+(\pi k/z)^2}=z\coth z=z\frac{e^{2z}+1}{e^{2z}-1}=\frac{\log q}{2}\frac{q+1}{q-1}.\]
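A quick numerical check of this evaluation (a small sketch; the choice \(q=3\) and the cutoff \(K\) below are arbitrary choices of mine):

```python
# Compare symmetric partial sums of the series with the closed form
# (log q / 2) * (q + 1) / (q - 1).  Sketch: q and the cutoff K are arbitrary.
from math import log, pi

q, K = 3, 10**5
z = log(q) / 2
partial = 1 + 2 * sum(1 / (1 + (pi * k / z) ** 2) for k in range(1, K + 1))
closed = (log(q) / 2) * (q + 1) / (q - 1)
print(partial, closed)  # the two agree to several decimal places
```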
We can also find the value of the logarithmic derivative at \(0\), which is most easily done using the explicit form of \(\zeta_q\): since \(\zeta_q(s)=(1-q^{1-s})^{-1}\), we have \(\frac{\zeta_q'(s)}{\zeta_q(s)}=-\frac{q^{1-s}\log q}{1-q^{1-s}}\), so \(\frac{\zeta_q'(0)}{\zeta_q(0)}=\frac{q\log q}{q-1}\). The explicit formula now says
\[\widetilde\psi_q(q^n)=\left(-\frac{q}{q-1}+\frac{q^n}{2}\frac{q+1}{q-1}\right)\log q.\]
Now we translate this to knowledge about the unmodified \(\psi_q\), using the fact \(\widetilde\psi_q(q^n)=\frac{1}{2}(\psi_q(q^{n-1})+\psi_q(q^n))\) for \(n\geq 1\). Note that \(\psi_q(q^0)=\psi_q(1)=0\). For \(n=1\), the explicit formula gives \(\widetilde\psi_q(q)=\frac{q}{2}\log q\), so clearly \(\psi_q(q)=q\log q\). For \(n=2\), we find \(\widetilde\psi_q(q^2)=\left(q+\frac{q^2}{2}\right)\log q\), so \(\psi_q(q^2)=\left(q+q^2\right)\log q\). A pattern slowly emerges — we have, for \(n\geq 0\),
\[\psi_q(q^n)=\log q\sum_{i=1}^nq^i,\]
which is most easily seen by rewriting the explicit formula as
\[\widetilde\psi_q(q^n)=\left(-\frac{q}{q-1}+\frac{q^n}{2}\frac{q+1}{q-1}\right)\log q=\frac{\log q}{2}\left(\frac{q^{n+1}-q}{q-1}+\frac{q^n-q}{q-1}\right)\\
=\frac{\log q}{2}\left(2q+2q^2+\dots+2q^{n-1}+q^n\right)=\frac{\log q}{2}\left(\sum_{i=1}^nq^i+\sum_{i=1}^{n-1}q^i\right)\]
and using induction. From there, it is clear that we have
\[\sum_{\deg f=n}\Lambda_q(f)=\sum_{Nf=q^n}\Lambda_q(f)=q^n\log q.\]
(If we were to use base \(q\) logarithm in the definition of \(\Lambda_q\), we could write this in a very PNT-esque way — for \(x\) a power of \(q\), we would have \(\sum_{Nf=x}\Lambda_q(f)=x\).)
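Here is a small brute-force check of this identity (a sketch of mine, not part of the derivation: I take \(q\) prime so that coefficient arithmetic is just arithmetic modulo \(q\), and I test irreducibility by naive trial division, which is only sensible for very small degrees).

```python
# Brute-force check of  sum_{deg f = n} Lambda_q(f) = q^n * log(q)
# for small q and n.  Sketch: q is assumed prime and irreducibility is
# tested by trial division, so keep q and N small.
from itertools import product
from math import log

q, N = 3, 6  # field size and the largest degree to check

def divides(p, f):
    """Does the monic polynomial p divide the monic polynomial f over F_q?
    Polynomials are lists of coefficients, lowest degree first, leading 1."""
    r = list(f)
    while len(r) >= len(p):
        c, shift = r[-1], len(r) - len(p)
        for i, pi in enumerate(p):
            r[shift + i] = (r[shift + i] - c * pi) % q
        while r and r[-1] == 0:
            r.pop()
    return not r  # divisible exactly when the remainder is zero

def monics(n):
    """All monic polynomials of degree n over F_q."""
    for lower in product(range(q), repeat=n):
        yield list(lower) + [1]

# find the monic irreducibles of each degree by trial division
irred = {}
for d in range(1, N + 1):
    smaller = [p for e in range(1, d // 2 + 1) for p in irred[e]]
    irred[d] = [f for f in monics(d) if not any(divides(p, f) for p in smaller)]

# every f of degree n with Lambda_q(f) != 0 is p^(n/d) for an irreducible p
# of some degree d dividing n, and contributes log(Np) = d * log(q)
for n in range(1, N + 1):
    total = sum(d * len(irred[d]) for d in irred if n % d == 0)
    print(n, total * log(q), q ** n * log(q))  # the two columns agree
```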

Let’s quickly rethink what this sum really counts — for every divisor \(d\) of \(n\) and every irreducible polynomial \(p\) of degree \(d\), the power \(p^{n/d}\) has degree \(n\) and contributes a term \(\Lambda_q(p^{n/d})=\log Np=\log q^{\deg p}=\deg p\cdot\log q\). Putting this into the formula above and getting rid of the \(\log q\) factor, we find
\[\sum_{d\mid n}\sum_{p\text{ prime},\deg p=d}d=q^n.\]
Writing \(c(d)\) for the number of irreducible polynomials of degree \(d\), we get the formula
\[\sum_{d\mid n}dc(d)=q^n.\]
Using Möbius inversion we can get an explicit formula for \(c(n)\), namely \(c(n)=\frac{1}{n}\sum_{d\mid n}\mu(d)q^{n/d}\), and from there find
\[c(n)=\frac{q^n}{n}+O(q^{n/2}),\]
or, in more PNT-esque way, when \(x\) is a power of \(q\),
\[\sum_{p\text{ prime},Np=x}1=\frac{x}{\log_qx}+O(\sqrt{x}).\]
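(A tiny script to see the count and the square-root error term numerically; the choice \(q=2\), the range of degrees and the helper functions below are my own, not part of the derivation.)

```python
# c(n) via Moebius inversion of  sum_{d | n} d * c(d) = q^n,  compared
# with q^n / n.  Sketch: q = 2 and the range of n are arbitrary choices.
q = 2

def mobius(m):
    """Moebius function by trial factorization (fine for small m)."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result

def c(n):
    """Number of monic irreducibles of degree n over F_q."""
    return sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0) // n

for n in range(1, 13):
    print(n, c(n), q ** n / n, abs(c(n) - q ** n / n) <= q ** (n / 2))
```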
Note that we have this nonzero error term, which one might find somewhat worrying, given that formulas up to now have been exact. This is because only counting irreducible polynomials is somewhat “wrong” — what one should do is count powers of these as well, properly weighted. More precisely, a power \(p^k\) should be counted as \(\frac{1}{k}\)-th of an irreducible polynomial. Then the “correct” prime-counting function would be
\[\sum_{k=1}^\infty\sum_{\deg p^k=n}\frac{1}{k}=\sum_{k\mid n}\frac{1}{k}c\left(\frac{n}{k}\right).\]
This turns out to be exactly equal to \(\frac{q^n}{n}\), and indeed is just the formula for \(\sum_{d\mid n}dc(d)\) divided by \(n\). Notably, this formula is an analogue of Riemann’s explicit formula. It is possible, though a bit more technically challenging, to derive Riemann’s formula directly; in the polynomial setting, however, we can simply deduce it from the other explicit formula, which, to the best of my knowledge, is not possible in the standard setting of natural numbers.
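For the record, the one-line verification: substituting \(d=n/k\),
\[\sum_{k\mid n}\frac{1}{k}c\left(\frac{n}{k}\right)=\frac{1}{n}\sum_{k\mid n}\frac{n}{k}c\left(\frac{n}{k}\right)=\frac{1}{n}\sum_{d\mid n}dc(d)=\frac{q^n}{n}.\]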

And we would have gotten away with it, too, if it wasn’t for you meddling nontrivial zeros!