Syncopated Systems®
Comprehensive Computer Technology Solutions

On Prime Numbers

On August 6, 2002, a trio of authors at the Indian Institute of Technology, Kanpur released for review an impressive paper titled Primes is in P, in which they present a (potential) relatively-fast algorithm for testing a given number for primality. One of my outstanding mathematics professors from St. Edward's University, Dr. Michael Engquist, brought this paper to my attention hardly more than a week after its publication. This renewed my interest in prime numbers.

Why Are Prime Numbers Important?

Among other areas, prime numbers are very import to cryptography (the science of encoding and decoding messages to create relative security in communication). Modern cryptographic practices (such as R.S.A. encryption) rely upon the selection of relatively unknown (and usually very large) prime numbers. Using these cryptographic methods, those able to identify larger prime numbers should, in theory, be better able to securely encode their own messages and to decode their adversaries' messages.

What Are Prime Numbers?

Prime numbers are numbers that can only be divided evenly (that is, divided with no remainder) by 1 (the multiplicative identity) and the number itself. So, for any given prime number p, n divides p if and only if n is equal to either 1 or p. Using mathematical notation, this may be expressed as:

n | p, n = { 1, p }

As an example, I created (programatically) the following table of prime numbers in the interval [1,1000], of which there are 169. In the table, composite (non-prime) numbers appear in cells with grey backgrounds and the numbers along the right side keep a running total of the number of prime numbers within the interval through the end of each line.

The Prime Numbers In the Interval [1,1000]

numberstotal
123456789105
111213141516171819209
2122232425262728293011
3132333435363738394013
4142434445464748495016
5152535455565758596018
6162636465666768697020
7172737475767778798023
8182838485868788899025
91929394959697989910026
10110210310410510610710810911030
11111211311411511611711811912031
12112212312412512612712812913032
13113213313413513613713813914035
14114214314414514614714814915036
15115215315415515615715815916038
16116216316416516616716816917040
17117217317417517617717817918042
18118218318418518618718818919043
19119219319419519619719819920047
20120220320420520620720820921047
21121221321421521621721821922048
22122222322422522622722822923051
23123223323423523623723823924053
24124224324424524624724824925054
25125225325425525625725825926056
26126226326426526626726826927058
27127227327427527627727827928060
28128228328428528628728828929062
29129229329429529629729829930063
30130230330430530630730830931064
31131231331431531631731831932067
32132232332432532632732832933067
33133233333433533633733833934069
34134234334434534634734834935071
35135235335435535635735835936073
36136236336436536636736836937074
37137237337437537637737837938076
38138238338438538638738838939078
39139239339439539639739839940079
40140240340440540640740840941081
41141241341441541641741841942082
42142242342442542642742842943083
43143243343443543643743843944086
44144244344444544644744844945088
45145245345445545645745845946089
46146246346446546646746846947092
47147247347447547647747847948093
48148248348448548648748848949094
49149249349449549649749849950096
50150250350450550650750850951098
51151251351451551651751851952098
521522523524525526527528529530100
531532533534535536537538539540100
541542543544545546547548549550102
551552553554555556557558559560103
561562563564565566567568569570105
571572573574575576577578579580107
581582583584585586587588589590108
591592593594595596597598599600110
601602603604605606607608609610112
611612613614615616617618619620115
621622623624625626627628629630115
631632633634635636637638639640116
641642643644645646647648649650119
651652653654655656657658659660121
661662663664665666667668669670122
671672673674675676677678679680124
681682683684685686687688689690125
691692693694695696697698699700126
701702703704705706707708709710128
711712713714715716717718719720129
721722723724725726727728729730130
731732733734735736737738739740132
741742743744745746747748749750133
751752753754755756757758759760135
761762763764765766767768769770137
771772773774775776777778779780138
781782783784785786787788789790139
791792793794795796797798799800140
801802803804805806807808809810141
811812813814815816817818819820142
821822823824825826827828829830146
831832833834835836837838839840147
841842843844845846847848849850147
851852853854855856857858859860150
861862863864865866867868869870151
871872873874875876877878879880152
881882883884885886887888889890155
891892893894895896897898899900155
901902903904905906907908909910156
911912913914915916917918919920158
921922923924925926927928929930159
931932933934935936937938939940160
941942943944945946947948949950162
951952953954955956957958959960163
961962963964965966967968969970164
971972973974975976977978979980166
981982983984985986987988989990167
9919929939949959969979989991000169

How Many Prime Numbers Are There?

As there are infinitely many numbers, there are infinitely many prime numbers. However, the number of prime numbers in a finite interval is also finite, and the proportion of prime numbers in an interval quickly becomes smaller as the interval becomes larger.

Methods For Approximating Numbers of Primes

In discussing how to build modern electronic versions of the 1926 D. H. Lehmer bicycle-chain sieve (an electromechanical device designed to select numbers to test for primality), my long-time mentor LaFarr Stuart taught me to represent the number of primes in any closed interval [1,n] as the function Pi(n). He also taught me the (n / ln n) method to approximate Pi(n).

The (n / ln n) Method

The first method I used to approximate the number of primes on an interval [1,n] was

Pi(n)  
~
 n   =   n 


loge n ln n

After generating a table of several thousand prime numbers with my Apple iMac, I found that this doesn't provide a very good approximation. (My old iMac with only a 600 MHz PowerPC G3 processor running my quick-and-dirty C++ program tests for primality and prints only about 4000 numbers a minute, using Euclid's algorithm to find the greatest common denominator of the number under test.) Running my little program for about 10 minutes, I found that:

n (end of
interval)
approximated number
of primes (rounded)
actual number
of primes
deviation
(delta)
10 4 5 -1 -13.14%
100 22 26 -4 -16.48%
1000 145 169 -24 -14.34%
10000 1086 1230 -144 -11.73%

My (n / 2 log n) Method

Looking at a table of these numbers, I started wondering how many of the numbers in a given interval are prime. This was a simple matter of dividing Pi(n) by n. I realized then (based on work I performed 2002/08/15) that there seems to be a pretty apparent – perhaps even obvious – relationship between the magnitudes of n and Pi(n):

5/10 = 1/2
26/100 = ~1/4
169/1000 = ~1/6
1230/10000 = ~1/8

Simply by counting the number of zeros in the values for n above, which yields log n, I realized that it was half the denominator of my approximation of Pi(n). This is surprisingly simple and appears to yield a much closer approximation of Pi(n) than the (n / ln n) method, as shown below:

n (end of
interval)
approximated number
of primes (rounded)
actual number
of primes
deviation
(delta)
10 4 5 0 0%
100 25 26 -1 -3.85%
1000 167 169 -2 -1.38%
10000 1250 1230 20 +1.63%

Refining Methods for Approximating Pi(n)

Though the percentage error using my (n / 2 log n) method is much less than that of the previous method, this method of approximation tends to overstate rather than understate how many prime numbers are in a given range. What causes me even more concern, however, was the apparent divergence (increasing error as n increases) of these approximations. The previous method tended to converge, which is preferable, so I re-evaluated that method.

I found that I could greatly reduce the error in the (n / ln n) method by multiplying the result by the scalar 10/9, so the resulting approximation formula would look like this:

Pi(n)  
~
 10 n   =   10 n 


9 loge n 9 ln n

The results obtained from this method compare favorably to those obtained via the methods discussed previously, as indicated via the table below.

n (end of
interval)
actual number
of primes
deviation (delta)
(n / ln n) (n / 2 log n) (10 n / 9 ln n)
10 5 -13.14% 0% -3.49%
100 26 -16.48% -3.85% -7.20%
1000 169 -14.34% -1.38% -4.82%
10000 1230 -11.73% +1.63% -1.92%
15000 1755 -11.12% +2.33% -1.24%
20000 2263 -10.76% +2.74% -0.84%
25000 2763 -10.65% +2.87% -0.72%
30000 3246 -10.35% +3.22% -0.39%
35000 3733 -10.39% +3.17% -0.43%
40000 4204 -10.21% +3.37% -0.23%
45000 4676 -10.18% +3.41% -0.20%
   


averages: -11.77% +2.91% -0.60%

Summary

As indicated in the table above, the approximation obtained by using my (10 n / 9 ln n) method is much closer to the actual number of primes. In addition, though the values used here for n are relatively small (six digits, vs. the largest known prime number having roughly one million digits), the margin of error seems to get smaller as n gets bigger, so this approximation appears to converge upon the correct answer as n increases.