Syncopated Systems
Seriously Sound Science

On Prime Numbers

On August 6, 2002, a trio of authors at the Indian Institute of Technology Kanpur released for review an impressive paper titled PRIMES is in P, in which they present a relatively fast algorithm for testing a given number for primality, now known as the Agrawal–Kayal–Saxena primality test, often abbreviated as the AKS primality test and also known as the cyclotomic AKS test. (One of my outstanding mathematics professors from St. Edward's University, Dr. Michael Engquist, brought this paper to my attention hardly more than a week after its publication.) The AKS primality test renewed my interest in prime numbers, and is particularly exciting because it is "the first primality-proving algorithm to be simultaneously general, polynomial, deterministic, and unconditional".

I had written a brief essay on prime numbers in 2006, the same year that Agrawal, Kayal, and Saxena received the Gödel Prize and the Fulkerson Prize.

In 2019, I had occasion to revisit the topic of prime numbers, especially in the context of computer science. (I should note that I experiment largely for my own edification, and that my studies have been informal at best.)

What Are Prime Numbers?

A prime number is a natural number (ℕ) greater than 1 that cannot be formed by multiplying two smaller natural numbers. (Although the number 1 is excluded from being called a prime number, I find it useful to include it in the set of numbers used for visualizations.)

So, for any given prime number p, a natural number n divides p if and only if n is equal to either 1 or p. Using mathematical notation, this may be expressed as:

n | p  ⟺  n ∈ { 1, p }
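The definition translates almost word for word into code. The following sketch (my own illustrative C++, with hypothetical naming, not code from my experiments) simply asks whether two smaller natural numbers multiply to give n:

```cpp
#include <cassert>

// A literal translation of the definition: a natural number n > 1 is
// prime when no two smaller natural numbers multiply to give n.
bool is_prime(unsigned n) {
    if (n < 2) return false;               // 0 and 1 are excluded by definition
    for (unsigned a = 2; a < n; ++a)
        for (unsigned b = 2; b < n; ++b)
            if (a * b == n) return false;  // n = a * b, so n is composite
    return true;
}
```

This literal version does far more work than necessary; the Algorithms section below discusses progressively faster approaches.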

Why Are Prime Numbers Interesting and Important?

In computer science, prime numbers have notable applications in areas including cryptography, which is the science of encoding and decoding messages to create relative security in communication. Modern cryptographic practices, such as RSA (Rivest–Shamir–Adleman) encryption, rely upon the selection of secret (and usually very large) prime numbers. Using these cryptographic methods, those able to identify larger prime numbers should, in theory, be better able to securely encode their own messages, and those able to factor larger numbers better able to decode their adversaries' messages. Cryptography also provides a foundation for blockchain-based systems such as Bitcoin that emerged around 2008–2009.

Number theorist and my childhood mentor LaFarr Stuart introduced me to the work of another number theorist named Derrick Henry Lehmer (1905-1991), whose interesting work with early electronic computing and prime numbers includes the Lucas-Lehmer test (LLT) for primality and a 1926 bicycle-chain sieve (an electromechanical device designed to select numbers to test for primality), which LaFarr suggested reproducing using modern electronics.

How Many Prime Numbers Are There?

There are infinitely many prime numbers, a fact proved by Euclid circa 300 BCE. However, the number of prime numbers in a finite interval is finite, and the proportion of prime numbers in an interval quickly becomes smaller as the interval becomes larger.

As an example, I programmatically created the following table of the prime numbers in the interval [1,1000], of which there are 168 (169 if counting 1). In the table, composite (non-prime) numbers appear in silver with strikethrough, and the numbers along the right side keep a running total of the prime numbers through the end of each line.

The Prime Numbers in the Interval [1,1000]

Prime/Composite #s                                                # Primes (+1)
(Composite numbers, rendered in silver strikethrough in the original, appear here in parentheses; 1 is starred because it is counted but not prime.)

1* 2 3 (4) 5 (6) 7 (8) (9) (10) | 5
11 (12) 13 (14) (15) (16) 17 (18) 19 (20) | 9
(21) (22) 23 (24) (25) (26) (27) (28) 29 (30) | 11
31 (32) (33) (34) (35) (36) 37 (38) (39) (40) | 13
41 (42) 43 (44) (45) (46) 47 (48) (49) (50) | 16
(51) (52) 53 (54) (55) (56) (57) (58) 59 (60) | 18
61 (62) (63) (64) (65) (66) 67 (68) (69) (70) | 20
71 (72) 73 (74) (75) (76) (77) (78) 79 (80) | 23
(81) (82) 83 (84) (85) (86) (87) (88) 89 (90) | 25
(91) (92) (93) (94) (95) (96) 97 (98) (99) (100) | 26
101 (102) 103 (104) (105) (106) 107 (108) 109 (110) | 30
(111) (112) 113 (114) (115) (116) (117) (118) (119) (120) | 31
(121) (122) (123) (124) (125) (126) 127 (128) (129) (130) | 32
131 (132) (133) (134) (135) (136) 137 (138) 139 (140) | 35
(141) (142) (143) (144) (145) (146) (147) (148) 149 (150) | 36
151 (152) (153) (154) (155) (156) 157 (158) (159) (160) | 38
(161) (162) 163 (164) (165) (166) 167 (168) (169) (170) | 40
(171) (172) 173 (174) (175) (176) (177) (178) 179 (180) | 42
181 (182) (183) (184) (185) (186) (187) (188) (189) (190) | 43
191 (192) 193 (194) (195) (196) 197 (198) 199 (200) | 47
(201) (202) (203) (204) (205) (206) (207) (208) (209) (210) | 47
211 (212) (213) (214) (215) (216) (217) (218) (219) (220) | 48
(221) (222) 223 (224) (225) (226) 227 (228) 229 (230) | 51
(231) (232) 233 (234) (235) (236) (237) (238) 239 (240) | 53
241 (242) (243) (244) (245) (246) (247) (248) (249) (250) | 54
251 (252) (253) (254) (255) (256) 257 (258) (259) (260) | 56
(261) (262) 263 (264) (265) (266) (267) (268) 269 (270) | 58
271 (272) (273) (274) (275) (276) 277 (278) (279) (280) | 60
281 (282) 283 (284) (285) (286) (287) (288) (289) (290) | 62
(291) (292) 293 (294) (295) (296) (297) (298) (299) (300) | 63
(301) (302) (303) (304) (305) (306) 307 (308) (309) (310) | 64
311 (312) 313 (314) (315) (316) 317 (318) (319) (320) | 67
(321) (322) (323) (324) (325) (326) (327) (328) (329) (330) | 67
331 (332) (333) (334) (335) (336) 337 (338) (339) (340) | 69
(341) (342) (343) (344) (345) (346) 347 (348) 349 (350) | 71
(351) (352) 353 (354) (355) (356) (357) (358) 359 (360) | 73
(361) (362) (363) (364) (365) (366) 367 (368) (369) (370) | 74
(371) (372) 373 (374) (375) (376) (377) (378) 379 (380) | 76
(381) (382) 383 (384) (385) (386) (387) (388) 389 (390) | 78
(391) (392) (393) (394) (395) (396) 397 (398) (399) (400) | 79
401 (402) (403) (404) (405) (406) (407) (408) 409 (410) | 81
(411) (412) (413) (414) (415) (416) (417) (418) 419 (420) | 82
421 (422) (423) (424) (425) (426) (427) (428) (429) (430) | 83
431 (432) 433 (434) (435) (436) (437) (438) 439 (440) | 86
(441) (442) 443 (444) (445) (446) (447) (448) 449 (450) | 88
(451) (452) (453) (454) (455) (456) 457 (458) (459) (460) | 89
461 (462) 463 (464) (465) (466) 467 (468) (469) (470) | 92
(471) (472) (473) (474) (475) (476) (477) (478) 479 (480) | 93
(481) (482) (483) (484) (485) (486) 487 (488) (489) (490) | 94
491 (492) (493) (494) (495) (496) (497) (498) 499 (500) | 96
(501) (502) 503 (504) (505) (506) (507) (508) 509 (510) | 98
(511) (512) (513) (514) (515) (516) (517) (518) (519) (520) | 98
521 (522) 523 (524) (525) (526) (527) (528) (529) (530) | 100
(531) (532) (533) (534) (535) (536) (537) (538) (539) (540) | 100
541 (542) (543) (544) (545) (546) 547 (548) (549) (550) | 102
(551) (552) (553) (554) (555) (556) 557 (558) (559) (560) | 103
(561) (562) 563 (564) (565) (566) (567) (568) 569 (570) | 105
571 (572) (573) (574) (575) (576) 577 (578) (579) (580) | 107
(581) (582) (583) (584) (585) (586) 587 (588) (589) (590) | 108
(591) (592) 593 (594) (595) (596) (597) (598) 599 (600) | 110
601 (602) (603) (604) (605) (606) 607 (608) (609) (610) | 112
(611) (612) 613 (614) (615) (616) 617 (618) 619 (620) | 115
(621) (622) (623) (624) (625) (626) (627) (628) (629) (630) | 115
631 (632) (633) (634) (635) (636) (637) (638) (639) (640) | 116
641 (642) 643 (644) (645) (646) 647 (648) (649) (650) | 119
(651) (652) 653 (654) (655) (656) (657) (658) 659 (660) | 121
661 (662) (663) (664) (665) (666) (667) (668) (669) (670) | 122
(671) (672) 673 (674) (675) (676) 677 (678) (679) (680) | 124
(681) (682) 683 (684) (685) (686) (687) (688) (689) (690) | 125
691 (692) (693) (694) (695) (696) (697) (698) (699) (700) | 126
701 (702) (703) (704) (705) (706) (707) (708) 709 (710) | 128
(711) (712) (713) (714) (715) (716) (717) (718) 719 (720) | 129
(721) (722) (723) (724) (725) (726) 727 (728) (729) (730) | 130
(731) (732) 733 (734) (735) (736) (737) (738) 739 (740) | 132
(741) (742) 743 (744) (745) (746) (747) (748) (749) (750) | 133
751 (752) (753) (754) (755) (756) 757 (758) (759) (760) | 135
761 (762) (763) (764) (765) (766) (767) (768) 769 (770) | 137
(771) (772) 773 (774) (775) (776) (777) (778) (779) (780) | 138
(781) (782) (783) (784) (785) (786) 787 (788) (789) (790) | 139
(791) (792) (793) (794) (795) (796) 797 (798) (799) (800) | 140
(801) (802) (803) (804) (805) (806) (807) (808) 809 (810) | 141
811 (812) (813) (814) (815) (816) (817) (818) (819) (820) | 142
821 (822) 823 (824) (825) (826) 827 (828) 829 (830) | 146
(831) (832) (833) (834) (835) (836) (837) (838) 839 (840) | 147
(841) (842) (843) (844) (845) (846) (847) (848) (849) (850) | 147
(851) (852) 853 (854) (855) (856) 857 (858) 859 (860) | 150
(861) (862) 863 (864) (865) (866) (867) (868) (869) (870) | 151
(871) (872) (873) (874) (875) (876) 877 (878) (879) (880) | 152
881 (882) 883 (884) (885) (886) 887 (888) (889) (890) | 155
(891) (892) (893) (894) (895) (896) (897) (898) (899) (900) | 155
(901) (902) (903) (904) (905) (906) 907 (908) (909) (910) | 156
911 (912) (913) (914) (915) (916) (917) (918) 919 (920) | 158
(921) (922) (923) (924) (925) (926) (927) (928) 929 (930) | 159
(931) (932) (933) (934) (935) (936) 937 (938) (939) (940) | 160
941 (942) (943) (944) (945) (946) 947 (948) (949) (950) | 162
(951) (952) 953 (954) (955) (956) (957) (958) (959) (960) | 163
(961) (962) (963) (964) (965) (966) 967 (968) (969) (970) | 164
971 (972) (973) (974) (975) (976) 977 (978) (979) (980) | 166
(981) (982) 983 (984) (985) (986) (987) (988) (989) (990) | 167
991 (992) (993) (994) (995) (996) 997 (998) (999) (1000) | 169
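The totals in the table are easy to double-check with a few lines of C++. This is a sketch of the idea, not the program that generated the table:

```cpp
#include <cassert>

// Trial division up to sqrt(n): if n has a nontrivial factor,
// at least one of its factors is at most sqrt(n), so it is
// enough to test divisors d with d * d <= n.
bool is_prime(int n) {
    if (n < 2) return false;
    for (int d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Count the primes in the closed interval [1, limit].
int count_primes(int limit) {
    int count = 0;
    for (int n = 2; n <= limit; ++n)
        if (is_prime(n)) ++count;
    return count;
}
```

Here count_primes(1000) returns 168, or 169 when 1 is counted, matching the table's final running total.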

Methods For Approximating Numbers of Primes

The number of primes in any closed interval [1,n] is represented as ℙ(n).

The Prime-Counting Function

The prime-counting function used to approximate the number of prime numbers on a given interval is Π(n), written without Greek characters as Pi(n). Showing the approximation, the function is defined and calculated as:

ℙ(n) ≈ Π(n) = n / logₑ n = n / ln n

After generating a table of several thousand prime numbers with my Apple iMac, I found that this function doesn't provide a very good approximation. (My old iMac, with only a 600 MHz PowerPC G3 processor running my quick-and-dirty C++ program, tests for primality and prints only about 4000 numbers a minute, using Euclid's algorithm to find the greatest common divisor of the number under test.) Running my little program for about 10 minutes, I found that:

End of         Rounded Approximate     Actual Number       Deviation (Δ)
Interval (n)   Number of Primes (Π)    of Primes (ℙ) +1
10             4                       5                   -1      -13.14%
100            22                      26                  -4      -16.48%
1000           145                     169                 -24     -14.34%
10000          1086                    1230                -144    -11.73%
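The deviation column can be reproduced directly. The helper below (my own illustrative naming) compares the unrounded estimate n / ln n against the actual counts from the table:

```cpp
#include <cmath>

// Percentage deviation of the estimate Pi(n) = n / ln n from the
// actual number of primes (which, per the table above, includes 1).
double deviation_percent(double n, double actual_primes) {
    double estimate = n / std::log(n);                // std::log is ln
    return 100.0 * (estimate / actual_primes - 1.0);  // negative: undercount
}
```

For example, deviation_percent(1000, 169) evaluates to about -14.34, matching the third row of the table.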

Refining Methods for Approximating the Number of Primes

In 2006, I had suggested that the prime-counting function Π(n) = n ÷ ln n could be improved by multiplying by 10/9, so the resulting approximation formula would look like this:

ℙ(n) ≈ (10/9) Π(n) = 10n / (9 logₑ n) = 10n / (9 ln n)

For relatively small intervals, this method's results compare favorably to those obtained via the methods discussed previously, as indicated via the table below.

End of         Actual Number       Deviation (Δ)
Interval (n)   of Primes (ℙ) +1    n / ln n      10n / (9 ln n)
10             5                   -13.14%       -3.49%
100            26                  -16.48%       -7.20%
1000           169                 -14.34%       -4.82%
10000          1230                -11.73%       -1.92%
15000          1755                -11.12%       -1.24%
20000          2263                -10.76%       -0.84%
25000          2763                -10.65%       -0.72%
30000          3246                -10.35%       -0.39%
35000          3733                -10.39%       -0.43%
40000          4204                -10.21%       -0.23%
45000          4676                -10.18%       -0.20%

Averages:                          -11.77%       -0.60%
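The refined column is the same computation scaled by 10/9; as a sketch (again with my own illustrative naming):

```cpp
#include <cmath>

// Percentage deviation of the refined estimate (10/9) * n / ln n
// from the actual number of primes (including 1, as tabulated above).
double refined_deviation_percent(double n, double actual_primes) {
    double estimate = (10.0 * n) / (9.0 * std::log(n));
    return 100.0 * (estimate / actual_primes - 1.0);
}
```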

My Experiments

In my 2006 essay, I described using my Apple iMac (with a PowerPC processor running at 600 MHz) to process about 4000 numbers per second. In 2019, my calculations on my Panasonic Toughbook CF-30 (with a 2-core Intel® Core 2 Duo L7500 processor running at 1.60 GHz) were more than 162 times faster. (Over the same small interval I processed in 2006, my 2019 code runs so much faster that the difference is practically immeasurable.)

Total processor resources available in 2019 are about 5.33 times what I had in 2006. But, as of this writing, I am experimenting only with a single thread that uses only a single processor core, so the improvement in hardware should speed up my experiments by only about 2.67 times. Accounting for differences between RISC and CISC processor architectures, the throughput of the newer processor core might actually be about 8 times faster.

So what would make my software run about 20 times faster? The differences in processing speeds between my experiments in 2006 and 2019 are likely best explained by differences in the algorithms I used.

Algorithms

In the field of computer science, we often describe computational time complexity in terms of big O notation, which is more formally known as Bachmann–Landau notation or asymptotic notation.

In my 2006 essay, I noted that I had used Euclid's algorithm, which dates back to circa 300 BCE.

In 2019, I initially used a very slow method of brute-force computation called trial division (first described in 1202 by Fibonacci in his book Liber Abaci), which has a complexity of O(n²). Soon afterward, I improved my method to include a crude sieve. (Though much different in its implementation, it is functionally equivalent to the Sieve of Eratosthenes from circa 200 BCE or [Leonhard] Euler's Sieve from circa 1750 CE.) This reduced the time complexity to linear time, with processing on the order of O(n), as observed in 1978 by Gries and Misra.
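For reference, the classic formulation of the Sieve of Eratosthenes (which my sieve resembles in function, though not in implementation) can be sketched as:

```cpp
#include <vector>

// Sieve of Eratosthenes: cross off the multiples of each prime found,
// starting at p * p (smaller multiples were crossed off earlier).
std::vector<int> primes_upto(int n) {
    std::vector<bool> composite(n + 1, false);
    std::vector<int> primes;
    for (int p = 2; p <= n; ++p) {
        if (composite[p]) continue;   // already crossed off: not prime
        primes.push_back(p);
        for (long long m = (long long)p * p; m <= n; m += p)
            composite[m] = true;      // every multiple of p is composite
    }
    return primes;
}
```

Unlike the linear sieve of Gries and Misra, this version may cross off a composite more than once (12 is crossed off by both 2 and 3), giving O(n log log n) rather than strictly O(n) work.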

Afterward, my early experiments with wheel factorization using the relatively small basis { 2, 3, 5 } ran at about the same speed. Using a larger basis improved speed but yielded solution sets that were incomplete.
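The idea behind a wheel over the basis { 2, 3, 5 } is that, beyond the basis primes themselves, every prime falls on one of eight residues modulo 30 (2 × 3 × 5), so only 8 of every 30 numbers need be tested at all. A sketch of the candidate generator (my own illustrative code, not my experimental program):

```cpp
#include <vector>

// Wheel over the basis { 2, 3, 5 }: the "spokes" are the residues
// modulo 30 that are coprime to 2, 3, and 5. Every prime > 5 lands
// on a spoke, but some spoke numbers (such as 49 = 7 * 7) are
// composite, so candidates still require a primality test.
std::vector<long> wheel_candidates(long limit) {
    static const int spokes[] = { 1, 7, 11, 13, 17, 19, 23, 29 };
    std::vector<long> out = { 2, 3, 5 };  // the basis primes themselves
    for (long base = 0; base <= limit; base += 30)
        for (int r : spokes) {
            long n = base + r;
            if (n > 1 && n <= limit)      // skip 1 (spoke r = 1 at base 0)
                out.push_back(n);
        }
    return out;
}
```

Only 8 of every 30 numbers (about 27%) survive the wheel, which is why the basis trims the work without, by itself, deciding primality.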

One of the things that's really fun and exciting is that I saw all of these gains in performance before even implementing the AKS primality test.

Summary

Although multiplying Π(n) by 10/9 greatly improves estimates for relatively small values of n, multiplying by this constant causes the estimate to deviate when n reaches about 50,000 (5 × 10⁴) and actually makes the estimate poorer when n exceeds approximately 500,000,000 (5 × 10⁸). (In 2006, I had also suggested an intermediate improvement, which I had already shown via a table would deviate when n exceeded only about 5,000.)

Interestingly, multiplying Π(n) by 100/99 improves the estimate of ℙ(n) over the range of numbers through 10²⁷ that have been tested for primality and summarized on the Wikipedia page for the prime-counting function. This estimate is less accurate than (10/9) Π(n) when n is less than about 500,000,000 (5 × 10⁸), and I expect the improvement to deviate when n becomes significantly larger than 10²⁷.

I expect to continue occasional experiments with electronics, computers, and prime numbers, and I welcome input via my contact page.