I have no intention of competing with GIMPS. To me this is a fun scripting exercise that may produce useful results beyond this initial application. Our goal is to take a bunch of integers and pick out the prime numbers. There are a couple of simple ways to spot a prime number from shell. One is to use perl (or python, or java, etc.) with the awesome “/^1?$|^(11+?)\1+$/” regex (Abigail, 1998).

Another way is to use the “factor” utility (part of the coreutils package) that decomposes integers into prime factors. You can wrap that command in a small script called “primefind”:

Here are examples of using both methods to find all prime number between 1 and 2^4:

As far as performance is concerned, the second method is considerably (about three times) faster. The next step is to increase the range, but here we may run into our first unexpected result: a Perl bug with the regex iteration limit. The bug is known, but seems to re-appear in various Perl versions, including some of the latest:

The Perl bug aside, this elegant regex approach will eventually grind to a halt as it tries every possible factorization as a pattern match. The regex makes n attempts to match a string of length n against each given pattern. And, as unary expansion, this gives us 2^(2n).

Consider the following shell script (geirha, 2008):

While a bit more complex than the Perl regex, this approach is a lot faster for large integers. We can couple it with xargs to take advantage of multi-core CPUs:

Uncork your CPUs

To get the most out of your CPUs, you may want to disable automatic CPU frequency scaling (kondeman daemon):

Going back to the factor utility, here’s a much faster implementation:

You may get even better time using parallel and passing multiple arguments per line (option -N). The advantage of parallel over xargs in this case is the ability to distribute load across multiple compute nodes on the network.

With very large numbers, seq becomes useless:

Brace expansion will not work either:

Shell arithmetic is out of the question as well (and, contrary to popular belief, the answer to life, the universe and everything is not 42, but it’s close.):

These are all unfortunate consequences of the range limitation of a 64-bit signed long int. To a point, the answer is to stick with bc, until you hit this limit:

The dc doesn’t work either:

And we can’t use Bash exponentiation:

Don’t be angry with Unix: considering that the observable universe consists of about 1011 galaxies, 1021 stars, 1078 atoms, and 1088 photons, the 10^10^10 limit of Bash seems rather generous. Luckily, here is Hypercalc, either Perl or JavaScript version. The large-number limit of Hypercalc in Knuth up-arrow notation is 10↑↑(1010):

But what to do with it is a subject for later.

According to Wikipedia, the largest known prime number is 274,207,281 − 1 found by GIMPS in January of 2016. This is still well within bc and factor limitations, so, theoretically at least, running the following command will eventually confirm this finding:

However, how long would this take? Well, 274,207,281 − 1 contains 22,338,618 digits, which puts it between 10^10^7 and 10^10^8 in the following attempt to estimate approximate time required to run the command above:

The time I got for each iteration on a 2.9GHz Xeon E5-2690 and 96GB of RAM is as follows:

As you can see, I had no patience to get past 10^10^6. Still, based on those few data points and assuming exponential growth (a good assumption in this case), I got y = 1.151806818·10-8 e3.815837901 x , where x is 10^10^x and y is estimated time to execute “bc <<< 10^10^x | factor”. Using this equation, we can see about how long it would’ve taken my computer to confirm that 274,207,281 − 1 is in fact a prime number:

So, I would have needed somewhere between 1.3 hours to 2.4 days, but closer to hours. And that’s just to confirm a known prime number. As you may imagine, actually finding one in that range would require lots of CPUs or an equivalent amount of luck.

So what about the elusive 10^10^10 – how long would it take me to confirm a prime number in that range, when one is inevitably found? Well, the Wolfram Alpha I’ve been using to solve the above equation also fails at that point and probably for the very same reason. The aforementioned Hypercalc, however, has no such problem:

So, just over thirteen-and-a-half years.

Leave A Reply

Please enter your comment!
Please enter your name here