On Prime Numbers

by John R. Carlsen

Published December 26, 2006; Updated October 21, 2020

Since first publishing this, I have made many corrections and improvements. I apologize for inaccuracies that appeared in earlier versions.

What Are Prime Numbers?

A prime number is a natural number greater than 1 that cannot be formed by multiplying two smaller natural numbers. (The sets of these numbers may be represented with the symbols ℙ and ℕ, respectively.)

So, for any given prime number p, a natural number x divides p if and only if x is equal to either 1 or p. Using mathematical notation, this may be expressed as:

x | p ⟺ x ∈ { 1, p }

(This may be read as “The value x evenly divides p if and only if x is a member of the set containing 1 and p”.)
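
As a minimal sketch of this definition in Python (the function names here are chosen only for illustration), the following lists every divisor of a number and calls the number prime when that list is exactly 1 and the number itself. It is meant to mirror the definition, not to be fast.

```python
def divisors(n: int) -> list[int]:
    """Return every natural number that evenly divides n."""
    return [x for x in range(1, n + 1) if n % x == 0]


def is_prime_by_definition(p: int) -> bool:
    """A natural number p > 1 is prime when its only divisors are 1 and p."""
    return p > 1 and divisors(p) == [1, p]


# Prints the primes below 30.
print([n for n in range(1, 30) if is_prime_by_definition(n)])
```

Checking every candidate divisor up to p is far more work than necessary, but it keeps the code a direct restatement of the definition.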

Prime Numbers and Computers

In computer science, prime numbers have notable applications in areas including cryptography, which is the science of encoding and decoding messages to create relative security in communication. Modern cryptographic practices—such as RSA (Rivest–Shamir–Adleman) encryption—rely upon the selection of relatively unknown (and usually very large) prime numbers. Using these cryptographic methods, those able to identify larger prime numbers should, in theory, be better able to securely encode their own messages and to decode their adversaries’ messages. Cryptography also provides a foundation for blockchain-based systems such as Bitcoin that emerged around 2008-2009.

My Interest in Prime Numbers

Around the time I was born in 1970, my father shared an office at Control Data Corporation (CDC) with LaFarr Stuart, who later became my childhood mentor and lifelong friend.

During the Korean War, my father had taken a break from his studies at the University of Washington to enlist (twice) in the U.S. Army so that he could fight the cold war in Germany rather than the hot war in Korea; Stuart found that he could avoid fighting in Korea by staying in school. While postponing his service in the U.S. Air Force, he earned five four-year degrees. One of his degrees is in mathematics, in which he focused on number theory.

In 1979, my father and Stuart started teaching me how to take apart electronic circuits that were discarded by CDC, which I learned were logic modules designed by the company’s head engineer, Seymour Cray, now known as “the father of supercomputing”. They also taught me to put together circuits using kits from Radio Shack stores (then owned by Tandy Corporation), parts from local surplus stores, and (as I got a little older) a college-level digital logic trainer kit made by Digital Equipment Corporation (DEC), which Stuart loaned me, along with a first edition (1973 hardcover) Texas Instruments TTL Data Book. Stuart also loaned me a COSMAC Elf single-board computer built around the 1802 8-bit microprocessor made by RCA (where he had worked earlier), through which I learned to enter programs through a hexadecimal keypad using machine code.

After convincing my father to buy an Atari 800 home computer (probably in 1983, when retailers dropped its price to make way for the new 1200XL model), one of the first things I did with it was to write a program to test integers to find and print perfect numbers. It wasn’t a great program, but we’ve all got to start somewhere. I also learned quickly that I could increase the speed of calculations by about 25% through turning off the display subsystem, and turning it back on only after I pressed a key to monitor the program.

(I had put together the printer from component assemblies for an Atari 820 40-column dot-matrix printer, which I bought at a surplus store, and a wooden box my father built, which unfortunately seemed to amplify the noise of the print head. I later got a disk drive that someone had assembled from surplus parts, which made me learn about computer hardware maintenance and repair. That led to my first business, its expansion and my start in online advertising in 1985, a job at Atari in 1987, and likely my later jobs at IBM research and for Atari founder Nolan Bushnell, effectively becoming a springboard for my career.)

Among many other things, Stuart introduced me to the work of another number theorist named Derrick Henry Lehmer (1905-1991), whose interesting work with early electronic computing and prime numbers includes the Lucas-Lehmer test (LLT) for primality and his 1926 electromechanical bicycle-chain sieve device (designed to select numbers to test for primality), which Stuart suggested we might reproduce using modern electronics.

Although such an implementation might circulate the sieve’s patterns through rings of shift registers, I believe that doing this might seem naïve compared to leaving the data in situ and accessing them by indexing through matrices.

For building such a sieve, many technologies exist today that are still better, faster, and/or cheaper.

Around 2002, as I completed a computer science degree, I had seen some interesting developments from Air Force Research Laboratories that suggested to me that a much better sieve could be constructed using newer technology, though at much greater cost. (Since then, there have also been great advancements in quantum computing, which potentially open entire new frontiers in computer science.)

On August 6, 2002, a trio of authors at the Indian Institute of Technology, Kanpur released for review an impressive paper titled PRIMES is in P, in which they present a deterministic, polynomial-time algorithm for testing a given number for primality. It is now known as the Agrawal–Kayal–Saxena primality test, often abbreviated as the AKS primality test, and also known as the cyclotomic AKS test. (One of my outstanding mathematics professors from Saint Edward’s University, Dr. Michael Engquist, brought this paper to my attention hardly more than a week after its publication.) The AKS primality test renewed my interest in prime numbers, and is particularly exciting because it is “the first primality-proving algorithm to be simultaneously general, polynomial, deterministic, and unconditional”.

I had created this page as a brief essay on prime numbers in 2006, the same year that Agrawal, Kayal, and Saxena received the Gödel Prize and the Fulkerson Prize.

In 2009, I taught in Texas as an assistant professor of computer science before returning to Silicon Valley.

I occasionally revisit the topic of prime numbers, which I sometimes find useful as a foundation for practical applications in computer science. (I should note that I experiment largely for my own edification, and that my studies have been informal at best.)

How Many Prime Numbers Are There?

Just as there are infinitely many integers, there are infinitely many prime numbers. However, the number of prime numbers in a finite interval is also finite, and the proportion of prime numbers in an interval quickly becomes smaller as the interval becomes larger.

As an example, I created the following table of prime numbers in the interval [1,1000], of which there are 168. In the table:

Circles indicate prime numbers and double circles indicate Mersenne prime numbers
Strikethroughs indicate composite (non-prime) numbers
Arcs over composite numbers indicate the twin prime numbers on either side
Colors indicate factors of 2 (red), factors of 5 (blue), and factors of 10 (violet)
The number at the end of each line is a running count of prime numbers

The Prime Numbers in the Interval [1,1000]

Numbers Prime & Composite										Total Prime
1	2	3	4	5	6	7	8	9	10	4
11	12	13	14	15	16	17	18	19	20	8
21	22	23	24	25	26	27	28	29	30	10
31	32	33	34	35	36	37	38	39	40	12
41	42	43	44	45	46	47	48	49	50	15
51	52	53	54	55	56	57	58	59	60	17
61	62	63	64	65	66	67	68	69	70	19
71	72	73	74	75	76	77	78	79	80	22
81	82	83	84	85	86	87	88	89	90	24
91	92	93	94	95	96	97	98	99	100	25
101	102	103	104	105	106	107	108	109	110	29
111	112	113	114	115	116	117	118	119	120	30
121	122	123	124	125	126	127	128	129	130	31
131	132	133	134	135	136	137	138	139	140	34
141	142	143	144	145	146	147	148	149	150	35
151	152	153	154	155	156	157	158	159	160	37
161	162	163	164	165	166	167	168	169	170	39
171	172	173	174	175	176	177	178	179	180	41
181	182	183	184	185	186	187	188	189	190	42
191	192	193	194	195	196	197	198	199	200	46
201	202	203	204	205	206	207	208	209	210	46
211	212	213	214	215	216	217	218	219	220	47
221	222	223	224	225	226	227	228	229	230	50
231	232	233	234	235	236	237	238	239	240	52
241	242	243	244	245	246	247	248	249	250	53
251	252	253	254	255	256	257	258	259	260	55
261	262	263	264	265	266	267	268	269	270	57
271	272	273	274	275	276	277	278	279	280	59
281	282	283	284	285	286	287	288	289	290	61
291	292	293	294	295	296	297	298	299	300	62
301	302	303	304	305	306	307	308	309	310	63
311	312	313	314	315	316	317	318	319	320	66
321	322	323	324	325	326	327	328	329	330	66
331	332	333	334	335	336	337	338	339	340	68
341	342	343	344	345	346	347	348	349	350	70
351	352	353	354	355	356	357	358	359	360	72
361	362	363	364	365	366	367	368	369	370	73
371	372	373	374	375	376	377	378	379	380	75
381	382	383	384	385	386	387	388	389	390	77
391	392	393	394	395	396	397	398	399	400	78
401	402	403	404	405	406	407	408	409	410	80
411	412	413	414	415	416	417	418	419	420	81
421	422	423	424	425	426	427	428	429	430	82
431	432	433	434	435	436	437	438	439	440	85
441	442	443	444	445	446	447	448	449	450	87
451	452	453	454	455	456	457	458	459	460	88
461	462	463	464	465	466	467	468	469	470	91
471	472	473	474	475	476	477	478	479	480	92
481	482	483	484	485	486	487	488	489	490	93
491	492	493	494	495	496	497	498	499	500	95
501	502	503	504	505	506	507	508	509	510	97
511	512	513	514	515	516	517	518	519	520	97
521	522	523	524	525	526	527	528	529	530	99
531	532	533	534	535	536	537	538	539	540	99
541	542	543	544	545	546	547	548	549	550	101
551	552	553	554	555	556	557	558	559	560	102
561	562	563	564	565	566	567	568	569	570	104
571	572	573	574	575	576	577	578	579	580	106
581	582	583	584	585	586	587	588	589	590	107
591	592	593	594	595	596	597	598	599	600	109
601	602	603	604	605	606	607	608	609	610	111
611	612	613	614	615	616	617	618	619	620	114
621	622	623	624	625	626	627	628	629	630	114
631	632	633	634	635	636	637	638	639	640	115
641	642	643	644	645	646	647	648	649	650	118
651	652	653	654	655	656	657	658	659	660	120
661	662	663	664	665	666	667	668	669	670	121
671	672	673	674	675	676	677	678	679	680	123
681	682	683	684	685	686	687	688	689	690	124
691	692	693	694	695	696	697	698	699	700	125
701	702	703	704	705	706	707	708	709	710	127
711	712	713	714	715	716	717	718	719	720	128
721	722	723	724	725	726	727	728	729	730	129
731	732	733	734	735	736	737	738	739	740	131
741	742	743	744	745	746	747	748	749	750	132
751	752	753	754	755	756	757	758	759	760	134
761	762	763	764	765	766	767	768	769	770	136
771	772	773	774	775	776	777	778	779	780	137
781	782	783	784	785	786	787	788	789	790	138
791	792	793	794	795	796	797	798	799	800	139
801	802	803	804	805	806	807	808	809	810	140
811	812	813	814	815	816	817	818	819	820	141
821	822	823	824	825	826	827	828	829	830	145
831	832	833	834	835	836	837	838	839	840	146
841	842	843	844	845	846	847	848	849	850	146
851	852	853	854	855	856	857	858	859	860	149
861	862	863	864	865	866	867	868	869	870	150
871	872	873	874	875	876	877	878	879	880	151
881	882	883	884	885	886	887	888	889	890	154
891	892	893	894	895	896	897	898	899	900	154
901	902	903	904	905	906	907	908	909	910	155
911	912	913	914	915	916	917	918	919	920	157
921	922	923	924	925	926	927	928	929	930	158
931	932	933	934	935	936	937	938	939	940	159
941	942	943	944	945	946	947	948	949	950	161
951	952	953	954	955	956	957	958	959	960	162
961	962	963	964	965	966	967	968	969	970	163
971	972	973	974	975	976	977	978	979	980	165
981	982	983	984	985	986	987	988	989	990	166
991	992	993	994	995	996	997	998	999	1000	168
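
The running counts in the last column can be recomputed mechanically. The sketch below is one way to do it in Python, assuming a plain Sieve of Eratosthenes; the helper names sieve and running_counts are illustrative and are not taken from any program described in this essay.

```python
import math


def sieve(limit: int) -> list[bool]:
    """Sieve of Eratosthenes: flags[n] is True exactly when n is prime."""
    flags = [True] * (limit + 1)
    flags[0] = flags[1] = False          # assumes limit >= 1
    for n in range(2, math.isqrt(limit) + 1):
        if flags[n]:
            for multiple in range(n * n, limit + 1, n):
                flags[multiple] = False
    return flags


def running_counts(limit: int, row_width: int = 10) -> dict[int, int]:
    """Map the last number of each table row to the count of primes up to it."""
    flags = sieve(limit)
    counts, total = {}, 0
    for n in range(1, limit + 1):
        total += flags[n]                # True counts as 1
        if n % row_width == 0:
            counts[n] = total
    return counts


counts = running_counts(1000)
for row_end in (10, 100, 500, 1000):
    print(row_end, counts[row_end])
```

Comparing the dictionary against the table row by row is a quick way to catch transcription slips in a hand-built table.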

The Prime-Counting Function

The number of primes in a closed interval [1,x] is represented by the prime-counting function π(x), which may also be written without Greek letters as pi(x).

No simple closed-form formula is known for the exact number of primes in an interval, and counting them exactly can be tedious and/or computationally intense.

However, there do exist methods of estimation that vary from simple though inaccurate to more accurate though complex.

Methods For Approximating Numbers of Primes

Starting in the 18th century, the number of prime numbers in an interval [1,x] could be approximated as x ÷ ln x. This might be expressed in longer forms as:

|ℙ[1,x]| = π(x) ≈ x ÷ log_e x = x ÷ ln x

By the end of the 19th century, a more-accurate method of approximation was developed using the logarithmic integral function li(x), but the calculus required makes it more complex to use.
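
For readers who want to evaluate li(x) without doing the calculus by hand, here is a minimal numerical sketch in Python. It assumes the standard series li(x) = γ + ln(ln x) + Σ (ln x)^k ÷ (k · k!) for x > 1, where γ is the Euler-Mascheroni constant; the function name li and the number of terms are choices made for this illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def li(x: float, terms: int = 200) -> float:
    """Logarithmic integral via gamma + ln(ln x) + sum of (ln x)^k / (k * k!)."""
    if x <= 1:
        raise ValueError("this series is only used here for x > 1")
    log_x = math.log(x)
    total = EULER_GAMMA + math.log(log_x)
    term = 1.0                       # holds (ln x)^k / k! as k advances
    for k in range(1, terms + 1):
        term *= log_x / k
        total += term / k
    return total


for x in (10**3, 10**6):
    print(x, round(li(x), 3))
```

Two hundred terms are more than enough for the modest values of x used here; much larger arguments would call for a more careful method than this sketch.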

My Experiments

In the early 2000s, after generating a table of several thousand prime numbers with my Apple iMac, I found that the earlier method doesn’t provide a very good approximation.

My old iMac, with only a 600 MHz PowerPC G3 processor, running my quick-and-dirty C++ program, tests for primality and prints only about 4000 numbers a minute, using Euclid’s algorithm to find the greatest common divisor of the number under test.

Running my little program for about 10 minutes, I found:

Interval Size	Actual Number of Primes	Rounded Approximate Number of Primes	Deviation	Deviation (%)
[1,x]	π(x)	x ÷ ln x	Δ	Δ ÷ π(x)
10	4	4	0	0%
100	25	22	-3	-13.2%
1,000	168	145	-23	-13.8%
10,000	1,229	1,086	-143	-11.7%
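
The comparison in this table can be reproduced with a few lines of Python. This is only a sketch: the known values of π(x) are copied from the table, the name estimate is illustrative, and the percentage is computed from the unrounded estimate, so for x = 10 (where x ÷ ln x is about 4.34) it reports a small positive percentage even though the rounded deviation is 0.

```python
import math

# Known values of the prime-counting function, copied from the table above.
KNOWN_PI = {10: 4, 100: 25, 1_000: 168, 10_000: 1_229}


def estimate(x: int) -> float:
    """The classic approximation x / ln x."""
    return x / math.log(x)


for x, actual in KNOWN_PI.items():
    approx = estimate(x)
    delta = round(approx) - actual                 # deviation of the rounded estimate
    percent = 100 * (approx - actual) / actual     # uses the unrounded estimate
    print(f"{x:>6,}  {actual:>5,}  {round(approx):>5,}  {delta:>5}  {percent:6.1f}%")
```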

Refining Methods for Approximating the Number of Primes

In 2006, I had suggested that the approximation x ÷ ln x could be improved by multiplying by (10 ÷ 9). For relatively small intervals, this method’s results compare favorably to those obtained via the methods discussed previously, as indicated in the table below. Although multiplying x ÷ ln x by (10 ÷ 9) greatly improves estimates for relatively small values of x, the multiplied estimate begins to exceed π(x) when x reaches about 50,000 (5 × 10⁴), and it actually becomes poorer than the unmultiplied estimate when x exceeds approximately 500,000,000 (5 × 10⁸).

Interestingly, multiplying x ÷ ln x by (100 ÷ 99) improves the estimate over a range of numbers at least through 10²⁷. Such an approximation might look like this:

|ℙ[1,x]| = π(x) ≈ 100 x ÷ (99 log_e x) = 100 x ÷ (99 ln x)

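As shown in the table below, for intervals less than about 500,000,000 (5 × 10⁸) this estimate is less accurate than multiplying by (10 ÷ 9). However, over the range shown, multiplying by (100 ÷ 99) also appears to converge on the correct value, though not nearly as quickly as li(x). (By the prime number theorem, the ratio of π(x) to x ÷ ln x tends to 1, so any fixed multiplier other than 1 must eventually overshoot; the factor (100 ÷ 99) would settle at about 1% high for extremely large x.) Multiplying by (1000 ÷ 999) yields approximations that are less accurate over this range, so the factor (100 ÷ 99) appears to be what I believe is known in scientific circles as a lucky guess.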

Interval Size	Actual Number of Primes	Deviation (Δ)
[1,x]	π(x)	(x ÷ ln x) − π(x)	(10 x ÷ 9 ln x) − π(x)	(100 x ÷ 99 ln x) − π(x)	li(x) − π(x)
10³	168	-13.8%	-4.17%	-13.1%	6.0%
10⁶	78,498	-7.79%	2.45%	-6.86%	0.17%
10⁹	50,847,534	-5.10%	5.45%	-4.14%	0.0033%
10¹²	37,607,912,018	-3.77%	6.93%	-2.79%	0.00010%
10¹⁵	29,844,570,422,669	-2.99%	7.79%	-2.01%	0.0000035%
10¹⁸	24,739,954,287,740,860	-2.48%	8.36%	-1.49%	0.000000089%
10²¹	21,127,269,486,018,731,928	-2.11%	8.76%	-1.13%	0.0000000028%
10²⁴	18,435,599,767,349,200,867,866	-1.84%	9.06%	-0.85%	0.000000000093%
10²⁷	16,352,460,426,841,680,446,427,399	-1.64%	9.29%	-0.64%	0.0000000000031%
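
The first three deviation columns can be recomputed from the exact counts. The Python sketch below does so for the first four rows; the values of π(x) are copied from the table, and the names MULTIPLIERS and percent_deviation are introduced only for this illustration.

```python
import math

# Exact counts copied from the table above.
KNOWN_PI = {
    10**3: 168,
    10**6: 78_498,
    10**9: 50_847_534,
    10**12: 37_607_912_018,
}

MULTIPLIERS = {"1": 1.0, "10/9": 10 / 9, "100/99": 100 / 99}


def percent_deviation(x: int, multiplier: float) -> float:
    """Signed error of multiplier * x / ln x relative to the known pi(x)."""
    estimate = multiplier * x / math.log(x)
    return 100 * (estimate - KNOWN_PI[x]) / KNOWN_PI[x]


for x in KNOWN_PI:
    row = "  ".join(
        f"{name}: {percent_deviation(x, m):+6.2f}%" for name, m in MULTIPLIERS.items()
    )
    print(f"x = {x:.0e}  {row}")
```

Because this sketch works from the unrounded estimate, the row for 10³ can differ in the last digit from figures that were derived from rounded estimates.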

In my 2006 essay, I described using my Apple iMac (with a PowerPC processor running at 600 MHz) to process about 4000 numbers per second. In 2019, my calculations on my Panasonic Toughbook CF-30 (with a 2-core Intel® Core™2 Duo L7500 processor running at 1.60 GHz) were more than 162 times faster. (Over the same small interval I processed in 2006, my code in 2019 runs so much faster that it’s practically immeasurable.)

Total processor resources available in 2019 are about 5.33 times what I had in 2006. But, as of this writing, I am experimenting only with a single thread that uses only a single processor core, so the improvement in hardware should speed up my experiments by only about 2.67 times. Accounting for differences between RISC and CISC processor architectures, the throughput of the newer processor core might actually be about 8 times faster.

So what would make my software run about 20 times faster? The differences in processing speeds between my experiments in 2006 and 2019 are likely best explained by differences in the algorithms I used.
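
The arithmetic behind these ratios is simple enough to spell out. In the Python sketch below, the clock speeds, core count, and the observed factor of 162 come from the text above; the factor of 8 for per-core throughput is the rough estimate given there, not a measurement, and the variable names are illustrative.

```python
IMAC_GHZ = 0.6              # 600 MHz PowerPC G3, one core
TOUGHBOOK_GHZ = 1.6         # Core 2 Duo L7500
TOUGHBOOK_CORES = 2
OBSERVED_SPEEDUP = 162      # "more than 162 times faster"
ASSUMED_CORE_SPEEDUP = 8    # rough per-core estimate from the text, not measured

total_ratio = TOUGHBOOK_CORES * TOUGHBOOK_GHZ / IMAC_GHZ       # all cores combined
single_core_ratio = TOUGHBOOK_GHZ / IMAC_GHZ                   # one thread on one core
algorithmic_factor = OBSERVED_SPEEDUP / ASSUMED_CORE_SPEEDUP   # left for the algorithm

print(f"{total_ratio:.2f}  {single_core_ratio:.2f}  {algorithmic_factor:.2f}")
```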

Algorithms

In the field of computer science, we often describe computational time complexity in terms of big O notation, which is more formally known as Bachmann–Landau notation or asymptotic notation.

In 2006, I had noted that I used Euclid’s algorithm, which dates back to about 300 BCE.
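
The original 2006 C++ program is not reproduced here. As a hedged illustration of how Euclid’s algorithm can drive a primality test, the Python sketch below assumes one plausible approach: a number n is composite exactly when it shares a factor with some k where 1 < k ≤ √n. The function names are hypothetical.

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: replace (a, b) with (b, a mod b) until b is zero."""
    while b:
        a, b = b, a % b
    return a


def is_prime_via_gcd(n: int) -> bool:
    """Return True when no k with 1 < k <= sqrt(n) shares a factor with n."""
    if n < 2:
        return False
    k = 2
    while k * k <= n:
        if gcd(n, k) != 1:       # a shared factor means n is composite
            return False
        k += 1
    return True


print(sum(is_prime_via_gcd(n) for n in range(1, 101)))
```

Each call to gcd costs more than a single remainder operation, which is one reason a test built this way is slower than plain trial division.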

In 2019, I initially used a very slow method of brute force computation called trial division (first described in 1202 by Fibonacci in his book Liber Abaci), which, when every number up to x is tested against all smaller candidates, has a complexity of O(x²). Soon afterward, I improved my method to include a crude sieve. (Though much different in its implementation, it is functionally equivalent to the Sieve of Eratosthenes from circa 200 BCE or Euler’s Sieve from circa 1750 CE.) This reduced the time complexity to nearly linear: the classic Sieve of Eratosthenes takes on the order of O(x log log x) operations, and a sieve that runs in linear time, O(x), was described in 1978 by Gries and Misra.
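
To make the contrast concrete, here is a minimal Python sketch of both approaches. The first function is brute-force trial division; the second is a common formulation of a linear sieve (often called Euler’s sieve), in which each composite is crossed off once by its smallest prime factor. Neither is the code used in the experiments described here, and the second is not necessarily the formulation in Gries and Misra’s paper; the function names are illustrative.

```python
def count_by_trial_division(x: int) -> int:
    """Count primes up to x by dividing each n by every smaller candidate."""
    count = 0
    for n in range(2, x + 1):
        if all(n % d for d in range(2, n)):   # up to n - 2 divisions per number
            count += 1
    return count


def linear_sieve(x: int) -> list[int]:
    """Return the primes up to x, marking each composite exactly once."""
    smallest_factor = [0] * (x + 1)
    primes = []
    for n in range(2, x + 1):
        if smallest_factor[n] == 0:           # never marked, so n is prime
            smallest_factor[n] = n
            primes.append(n)
        for p in primes:
            # Stop once p exceeds n's smallest prime factor or n * p leaves the range.
            if p > smallest_factor[n] or n * p > x:
                break
            smallest_factor[n * p] = p
    return primes


print(count_by_trial_division(1000), len(linear_sieve(10_000)))
```

The sieve trades memory for time: it stores one entry for every number up to x, which is what allows it to avoid repeated division.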

Afterward, my early experiments with wheel factorization using the relatively small basis { 2, 3, 5 } were computed at about the same speed. Using a larger basis improved speed, but yielded solution sets that were incomplete. (I believe that was likely caused by overflowing the integer type I had used, but I have not yet had occasion to delve into it again.)
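
As an illustration of the idea, the Python sketch below applies a wheel with the basis { 2, 3, 5 } to trial division: after checking the basis primes, it tries only divisors that are coprime to 30, which are 8 of every 30 integers. This is a generic textbook-style sketch with hypothetical names, not the experimental code described above. Python integers do not overflow, so it cannot reproduce the incomplete results mentioned.

```python
import math
from itertools import cycle

BASIS = (2, 3, 5)
# Gaps between successive numbers coprime to 30, starting from 7:
# 7, 11, 13, 17, 19, 23, 29, 31, 37, ...
GAPS = (4, 2, 4, 2, 4, 6, 2, 6)


def is_prime_wheel(n: int) -> bool:
    """Trial division that skips every multiple of 2, 3, and 5."""
    if n < 2:
        return False
    for p in BASIS:
        if n % p == 0:
            return n == p            # only the basis prime itself is prime
    limit = math.isqrt(n)
    d = 7
    for gap in cycle(GAPS):
        if d > limit:
            return True
        if n % d == 0:
            return False
        d += gap


print(sum(is_prime_wheel(n) for n in range(1, 1001)))
```

A larger basis such as { 2, 3, 5, 7 } lengthens the gap table (48 residues out of every 210) while skipping a slightly larger share of candidates.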

Pollard’s Rho Algorithm

For comparison, in 2020, my laptop computer, which is already about 10 years old, can evaluate the first 10,000 integers in a fraction of a second, and the first billion (10⁹) integers in less than four hours, using the Linux command factor. This uses Pollard’s rho algorithm, which was developed in 1975 and still appears to be among the most practical algorithms for factoring integers of modest size. To count the prime numbers in a certain interval (such as from one to one million) and report the amount of time spent doing it, the following string of commands and parameters could be entered at the command line prompt ($):

$ time seq 1 1000000 | factor | grep "[0-9]*: [0-9]*$" | wc -l
78498

real 0m6.963s
user 0m5.524s
sys 0m0.040s

(With the parameters specified above, seq generates the first one million integers and feeds them to factor, which prints each integer followed by its prime factors. That output goes to grep, which passes along only the lines containing a single factor (the prime numbers) to wc, which counts the lines and reports the expected number of prime numbers in that interval. Finally, time reports the amount of time required to perform the task, which was nearly seven seconds.)
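
For readers curious about the algorithm itself, here is a minimal Python sketch of Pollard’s rho using Floyd’s cycle-finding. It is a textbook-style illustration with a hypothetical function name, not the code inside the factor command, which combines several techniques. It assumes the caller already knows n is composite; given a prime it will search for a very long time before giving up.

```python
import math


def pollard_rho(n: int) -> int:
    """Return a non-trivial factor of a composite n (Floyd cycle-finding variant)."""
    if n % 2 == 0:
        return 2
    for c in range(1, n):                 # retry with a new constant if a walk fails
        x = y = 2
        d = 1
        while d == 1:
            x = (x * x + c) % n           # tortoise: one step of x -> x^2 + c
            y = (y * y + c) % n           # hare: two steps
            y = (y * y + c) % n
            d = math.gcd(abs(x - y), n)
        if d != n:                        # d == n means this walk cycled without success
            return d
    raise ValueError("no factor found; n may be prime")


factor = pollard_rho(8051)
print(factor, 8051 // factor)
```

The walk finds a factor when two values of the sequence coincide modulo a prime divisor of n before they coincide modulo n itself, which tends to happen after roughly the square root of that divisor in steps.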

Summary

I expect to continue occasional experiments with prime numbers, especially as they relate to practical applications with computers and electronics.

I welcome input via my contact page.

“Syncopated Systems” and “Syncopated Software” are registered trademarks, the interlaced tuning forks device and the “seriously sound science” and “recreate reality” slogans are trademarks, and all contents (except as otherwise noted) are copyright ©2004-2025 Syncopated Systems. ALL RIGHTS RESERVED. Any reproduction without written permission or other infringement is prohibited by United States and international laws.