Thursday, December 27, 2007

Development: Optimize Prime Number Search Using Sieve of Eratosthenes

For some reason I have been playing one of those online games where you have a series of hacking or programming related challenges to pass. The most recent challenge I did was one involving a series of silly computations on large numbers involving retrieving all of that numbers preceding prime numbers. Not too difficult, however, performance was a key component of the exercise since you had to have your program give an answer in under 3 seconds. And to add to my frustrations, I decided to do the whole thing in Java due to Javas utilities for network communication.

First, I figured I would take the input number, loop through all numbers up to it, and add any prime number to a List. What resulted was something like below:

import java.util.ArrayList;
import java.util.List;


public class BadPrimer {

/*
* Generic prime number check
*/
public boolean isPrime(int number)
{
if (number == 1)
return false;

for (int x = 2; x < number; x++)
{
if ((number % x) == 0)
{
return false;
}
}

return true;
}

/**
* Find all prime numbers before number
* @param number
* @return
*/
public List findAllPrimes(int number)
{
List l = new ArrayList();

for (int x = 2; x <= number; x++)
{
if (isPrime(x))
{
l.add(x);
}
}

return l;
}

public static void main(String[] args) {
BadPrimer p = new BadPrimer();

List l = p.findAllPrimes(8000000);
for (java.util.Iterator i = l.iterator(); i.hasNext();)
{
Integer number = (Integer)i.next();
System.out.println(number);
}
}

}


Needless to say, that was not quite up to task. Worked great on small numbers, but for numbers in the area of 8,000,000 or so, it was just horrendously bad. So, back to the drawing board. My next thought was, why not generate the list of primes beforehand. But generating the list took just as long, and wasn’t very efficient. There are things I could have done to speed things up, but the bottom line was this just wasn’t efficient enough.

So I had to hit the books a little and find some of those really old school math algorithms that weird Greeks came up with thousands of years ago to do large calculations without the aid of computers. What I came up with was the Sieve of Eratosthenes, a way of taking a set of numbers in a list, and eliminating numbers that aren’t prime numbers, really quickly. So, with pre-generating a list of Prime numbers using the found algorithm, my next attempt looked like this:

import java.util.Arrays;

public class BadPrimer {
public int [] primeArray;

/**
* using the sieve of Eratosthenes algorithm, quite cool actually
*/
public void buildPrimeSet()
{
int MAX_SIZE = 10000000;

//create a boolean array and set all elements to true
boolean [] numArray = new boolean[MAX_SIZE];
Arrays.fill(numArray, true);

//we already know 0 and 1 are not prime numbers, so ignore them
numArray[0] = false;
numArray[1] = false;

//x will be out driving prime loop, y will be the elimination loop
for (int x = 2; x < MAX_SIZE; x++)
{
//if x is still true, it is a prime, and we need to keep it
if (numArray[x])
{
//advance our inner loop, starting at twice the current position of x, and start dividing
//If you use y++ as the counter advance, the generation takes to long
for (int y = (x * 2); y < MAX_SIZE; y += x)
{
//if y is already false, dont bother setting
if (numArray[y])
{
numArray[y] = false;
}
}
}
}

int totalCount = 0;

//find the total number of primes
//this could be done in the above loop, but for logic
//illistration, I kept it here
for (int x = 2; x < MAX_SIZE; x++)
{
if (numArray[x])
{
totalCount++;
}
}

//create our array based on the number of primes
//and populate the array with the prime numbers
//Note: there are better ways of doing this, such as adding
//the prime numbers in the above loop when they are found, but
//I did it this way for logic reason, not efficiency
primeArray = new int[totalCount];
int pos = 0; //a position counter
for (int x = 2; x < MAX_SIZE; x++)
{
if (numArray[x])
{
primeArray[pos] = x;
pos++;
}
}
}

/**
* Find all prime numbers before number
* @param number
* @return
*/
public int findAllPrimes(int number)
{
//using x as our arary position
//go through the list until the value in our
//array is greater than the number used. We now have the cut off position
//to mark all prime numbers lower than our current number
int x = 0;
while (primeArray[x] <= number)
{
x++;
}

return x;
}

public static void main(String[] args) {
BadPrimer p = new BadPrimer();

p.buildPrimeSet();
int primeCutOff = p.findAllPrimes(31337);

for (int x = 0; x < primeCutOff; x++)
{
System.out.println(p.primeArray[x]);
}
}

}


I saved you a lot of the iterations I went through. The final chunk above was after learning quite a few things. First, working with primitives is much faster than working with objects. I abandoned the idea of using a List of integers and instead went with an Array of integers. This made a large improvement in performance since I didn’t have to do any conversions of object types, and I found out the hard way that Java will take int from a List and convert them to Integer objects. The second biggest thing was eliminating unnecessary iterations in my loops. At first, I was going through the inner loop in buildPrimeSet() using y++ instead of y+=x. This wasted a lot of iterations since the most a number divided by something else will do is halve it (integer wise). So if x was 50, it didn’t make sense to test 51 – 99 since they aren’t valid multiples of X, and that kind of defeats the purpose of the Sieve of Eratosthenes algorithm, which states to eliminate numbers that are multiples of the current prime.

So, the end result, what took hours to run before gets run in a matter of milliseconds now. Valuable lesson... I might need to brush up on some of my algorithm development skills. Regardless, this took care of the first part of the calculation in finding all the prime numbers for the requested number, the remaining parts will remain a mystery so I don’t give away the answer to the challenge.

No comments: