|
Creating the Shortest Prime Number Program
This is the story of my effort to write the shortest prime number listing program.
I was bored the other day and Daryl was wondering if 2003 was a prime number, so I decided to write a prime number finder. Good times. Here's the result of that: primes.c
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <vector>
using namespace std;
vector<int> primes;
bool IsPrime(int i)
{
// Check for divisibility by 2
if ( i & 0x01 == 0 && i != 2 )
return false;
int maxTestVal = (int)sqrt(i);
for( vector<int>::iterator itr = primes.begin(); itr != primes.end()
; ++itr )
{
if ( (*itr) > maxTestVal )
return true;
int t = ( i / (*itr) );
if ( t * (*itr) == i )
return false;
}
return true;
}
int main(int argc, char **argv)
{
if ( argc != 2 )
{
printf("Usage: primes <max-num>\n");
return -1;
}
int maxNum = atoi( argv[1] );
primes.reserve( (int)sqrt(maxNum) );
for( int i = 2; i <= maxNum; ++i )
{
if ( IsPrime(i) )
{
primes.push_back(i);
printf("%i\n", i);
}
}
return 0;
}
|
Now that wasn't very tough.. if an argument was supplied, the program finds all primes up to (and including) the argument.
You'll note that I didn't use the most naive algorithm. I only check for divisibility with the number in question and previous primes less than the number in question's square root. It worked well, but it seemed kinda of beefy at a total of 789 characters. Thus after much obfustication I got it down to a measely 273 characters. Compiles with a simple "g++" command. Since it's so short, I'll just show you the entire file: short_primes.c:
#include <vector>
#include <math.h>
#define I int
I main(I c,char**v){typedef std::vector<I> P;P p;for(I i=2;i<=(c==2?atoi(
v[1]):99);){for(P::iterator r=p.begin();r!=p.end()&&*r<=(I)sqrt(i);++r)if
(i/ *r**r==i)goto l;printf("%i\n",*p.insert(p.end(),i));l:++i;}return 0;}
|
I thought that was pretty good, but I knew I could get it shorter if I used the naive algorithm.. so after some more work I got it down to a puny 139 characters. However, to compile it I'd reccommend "gcc -std=c99 --no-warnings" due to some of the optimizations I made (like taking out the #include's). Here it is: short_primes_naive.c:
int main(int c,char**v){for(int i=2,j;i<=(c==
2?atoi(v[1]):99);){for(j=2;j<=i/2;++j)if(i/j*
j==i)goto l;printf("%i\n",i);l:++i;}return 0;}
|
There is one difference, between the two shorter programs and the original one. The shorter ones both default to listing the primes up to 99 if no argument is specified, whereas the original one displayed an error. Could have made the shorter versions shorter by not having the default value, but I like it. :)
Ooh! After more work and much re-arranging of the code I got it down to a fancy 127 characters! I converted the for loops into while loops, removed the goto and used some more side effects. Like above, this should be compiled with "gcc -std=c99 --no-warnings". Here it is: shortest_primes_naive.c:
int main(int c,char**v){int i,j;while(j=1,++i<=(c==2?atoi(v[1])
:99)){while(++j<i&&i/j*j!=i);if(j==i)printf("%i\n",i);}return;}
|
Oh my my! It's amazing what tips one can find on the net! After reducing my code EVEN MORE (I had it down to 107 characters), I Googled and the results has helped me shorten the program quite a bit! Right now it's down to a puny 97 characters! One really nice thing is that because the variables are declared outside of the for loop, it compiles nicely with just "gcc" (of course you can always use "gcc --no-warnings" if you like). Here it is: really_shortest_primes_naive.c:
i,j;main(c,v)int*v;{for(;i++<(c-2?99:atoi(v[1]))
;)for(j=1;++j-i?j<i&i/j*j<i:printf("%i\n",i););}
|
Apparently I've been obsessing about getting this code ever smaller for over the past week, so now that I finally have it under 100 characters I think I'll take a little break. But if anyone finds a way to shorten the code even more, be sure to let me know!!
Alrighty, so I've got it down to a mere 91 characters now (had some stupid redundant checks going on!):
i=1,j;main(c,v)int*v;{for(;j=i++<(c-2?99:atoi(v[1]));)for(;++j-i?i%j:!printf("%i\n",i););}
|
Just when you thought it was all over! Hard to believe, but people actually read this page, and this one fellow actually sent me an email with his smaller version of a prime number program (weighed in a 91 characters, which at the time I only had my 97 character version up on my site)... so anyhow, after more communication (and more looking at the shocc results) we managed to get this beast reduced even further! Here's the gory details:
Person |
Comments |
Code |
# chars |
Joerg |
His version |
i=1,j;main(a,b)int*b;{for(;i++<(a-1?atoi(b[1]):99);i-j||printf("%d\n",i))for(j=1;i%++j;);} |
91 |
Me |
Moved initialization of j into first for loop |
i=1,j;main(c,v)int*v;{for(;j=i++<(c-2?99:atoi(v[1]));i-j||printf("%i\n",i))for(;i%++j;);} |
90 |
Joerg |
Shorter declaration of v |
*v;i=1,j;main(a,b){for(;j=i++<(a-1?atoi(1[v=b]):99);i-j||printf("%d\n",i))for(;i%++j;);} |
89 |
Me |
Removed variable j |
*v;i=1;main(a,b){for(a=a-1?atoi(1[v=b]):99;b=i++<a;i-b||printf("%d\n",i))for(;i%++b;);} |
88 |
Joerg |
Removed variable i |
*v;main(a,b){for(v=a-1?atoi((a=1)[v=b]):99;b=a++<v;a-b||printf("%d\n",a))for(;a%++b;);} |
88 |
Me |
Re-wrote argument check (now only works with 0 or 1 command-line arguments) |
*v;main(a,b){for(v=1[v=b]?atoi(v[--a]):99;b=a++<v;a-b||printf("%d\n",a))for(;a%++b;);} |
87 |
Me |
Refactored code to use only 1 for loop |
*v;main(a,b){for(v=1[v=b]?atoi(v[--a]):99;a%++b%a||(a^b||printf("%d\n",a),b=a++<v););} |
87 |
While this code is quite a bit smaller, the running time of the advanced algorithm is superior.. while both programs will find prime numbers in the range specified, the shorter ones are almost useless for numbers greater than 100,000 or so. Here are some timings I took to compare them (computing the primes up to 20,000):
Program |
Timings |
short_primes.c |
real 0m0.132s
user 0m0.110s
sys 0m0.020s
|
short_primes_naive.c |
real 0m3.592s
user 0m2.870s
sys 0m0.000s
|
really_shortest_primes_naive.c |
real 0m9.039s
user 0m6.630s
sys 0m0.060s
|
As you can clearly see, the shorter the code is, the less room for optimizations.. thus it gets much much slower!
Here are some of the techniques I used:
- one letter variable names
- making variables global (instead of passing parameters)
- manually inlining functions (even better than the above tip)
- #define'ing long and/or repeated words (bah, #define is too long a word)
- using the teriary operator (?:) instead of if's
- re-arranging loops to only have one statement inside them (then you don't need braces)
- using side-effects (ie: c = d++);
- using "&" instead of "&&" (when possible)
- not declaring variable types (defaults to int)
- not declaring my parameter types, unless different than int
- using expressions like "c-2" instead of "c!=2"
Nothing too crazy.. but it was an interesting excercise nonetheless!
Having nothing really to do yesterday I left my computer running for 12 hours and listed all the primes between 1 and 1 billion! Turns out there are 50,847,534 primes! You can download the humongous file here (123 MB).. all the primes you could ever want, in human-readable ASCII format, one prime per line. The file itself takes a good 10 minutes or so to load into an editor (at least the ones I use).. if I get really bored I might make a little program that'll look up primes for you or something (using a quick binary-search of course). Here's a brief summary of the primes (only 0.0000786% of the entire file!):
First 20 |
Last 20 |
2
3
5
7
11
13
17
19
23
29
31
37
41
43
47
53
59
61
67
71
|
999999491
999999503
999999527
999999541
999999587
999999599
999999607
999999613
999999667
999999677
999999733
999999739
999999751
999999757
999999761
999999797
999999883
999999893
999999929
999999937
|
|