Today I’ll discuss how to create a prime number generator using the Sieve of Eratosthenes.

The reason I’m using this particular sieve is because of this line from the Wikipedia page: “The sieve of Eratosthenes is one of the most efficient ways to find all of the smaller primes (below 10 million or so).” Note that this script increments its search initially, not using latter iterations of “crossed-off” numbers which makes it run through the list comparison many more times that necessary.

Again, copied from the Wikipedia page, here is how the sieve works:

- Create a list of consecutive integers from 2 to
*n*: (2, 3, 4, …,*n*). - Initially, let
*p*equal 2, the first prime number. - Starting from
*p*, count up in increments of*p*and mark each of these numbers greater than*p*itself in the list. These will be multiples of*p*: 2*p*, 3*p*, 4*p*, etc.; note that some of them may have already been marked. - Find the first number greater than
*p*in the list that is not marked. If there was no such number, stop. Otherwise, let*p*now equal this number (which is the next prime), and repeat from step 3.

To get started, lets break down the initial steps from above into usable BASH code.

Step 1:

*Create a list of consecutive integers from 2 to n: (2, 3, 4, …, n).*

Read the upper search limit, *n*, from the user.

read -p "Enter upper limit of search: " upper_search_limit

Then create the list of integers using *seq* (I use sort to ensure later commands accept the data correctly)

sieve="$( seq 2 $upper_search_limit | sort )"

Step 2/3:

*Initially, let p equal 2, the first prime number. Then, s*

*tarting at the number 2, count up in increments of 2 and mark those numbers off the list.*

To do this, I started at 2, then incremented by 2 until the *upper_search_limit* was reached, creating a list of all odd numbers while still including the number 2. I actually didn’t mark them off the list yet, but will only check these numbers to create new lists later (thus avoiding creating a list for each and every even number, since no even number can be a prime).

for i in 2 $( seq 3 2 $upper_search_limit )

This is one of the major sources of inefficiencies in my code, because it goes through and checks each odd number rather than only checking those that had been marked off the list from later iterations.

Step 4:

*Starting from p, count up in increments of p and mark each of these numbers greater than p itself in the list. These will be multiples of p: 2p, 3p, 4p, etc.; note that some of them may have already been marked.*

Comparing the current sieve (which initially contained all numbers) to the first iteration of the loop (which started at 2 and incremented by 2, thus containing a list of all even numbers), and only keeping the numbers unique to the original list, yields the new sieve shown below (up to a search limit of 10, not filtering through *sort*).

Here, the sieve is set ( *sieve**=”$(..)” *) to the lines unique to the numbers in the first list (the *-23* switches), where the first list is the previous iteration of the sieve (* echo “$sieve” *), and the second list is the current iteration of the sieve, using the odd number list generated in step 3 above ( *seq $(( $i * $i )) $i $upper_search_limit | sort ) *).

sieve="$( comm -23 <(echo "$sieve") <(seq $(( $i * $i )) $i $upper_search_limit | sort ) )"

Remember that the only numbers crossed off the list are the ones that start *after** *the initial “seed” number, hence 2 and 3 both stay in the passes below (and the reason for the square ( *$i * $i* ) in the code above — study the Wikipedia page, especially the animation on the right to see that the first number crossed off for each new number is always the square of the “seed” number).

Also remember that I used *sort* to ensure the lists were handled properly by *comm*.

What happens:

Start #s: 1st pass: New #s: 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12 13 13 14 14 15 15 16 16 17 17 18 18 19 19 20 20

Start #s: 2nd pass: New #s: 2 2 3 3 5 5 7 7 9 9 11 11 12 13 13 15 15 17 17 18 19 19

Note that while the code has found all prime numbers from 2 to our upper search limit in 2 passes, it still will run through the loop 8 more times for the “seed” numbers, 5, 7, 9, 11, 13, 15, 17, 19! While this is horribly inefficient for such a small scope, ensuring, say, multiples of 5, 7 and 9 are crossed off is essential for checking primes up to 100, and so on for higher search limits.

Step 5:

*Display the results*

This is pretty self explanatory.

echo "Primes found up to: $upper_search_limit" echo "$sieve" | sort -n

Now for the final code itself.

#!/bin/bash #set -xv read -p "Enter upper limit of search: " upper_search_limit sieve="$( seq 2 $upper_search_limit | sort )" for i in 2 $( seq 3 2 $upper_search_limit ) do sieve="$( comm -23 <(echo "$sieve") <(seq $(( $i * $i )) $i $upper_search_limit | sort ) )" done echo "Primes found up to: $upper_search_limit" echo "$sieve" | sort -n

Once the script is complete, save and make it executable

chmod +x prime_sieve

And run it

./prime_sieve

And, wallah!, all the primes up to the search limit (100 in this case).

2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

If you wrap the script in a *time* command ( *t**ime {..} *), you can see how long it takes to sieve for each prime list to the upper search limit.

Primes up to 100:

real 0m0.249s user 0m0.008s sys 0m0.020s

Primes up to 1,000:

real 0m1.384s user 0m0.076s sys 0m0.096s

Primes up to 10,000:

real 0m31.941s user 0m5.260s sys 0m1.952s

And just for fun, primes up to 100,000:

real 17m33.030s user 9m20.343s sys 1m58.447s