1. No Profile Picture
Registered User
Devshed Newbie (0 - 499 posts)

Join Date
Mar 2013
Posts
1
Rep Power
0

#### Primes

Hello all. I am trying to make a program to generate all the primes up to 10^9 however i am not able to make an array of integers of that size. I understand how the sieve works but how can I go about generating all the primes up to said number?
2. One way would be to have an array of bits, rather than an array of bytes (or worse, integers).

So you only use one bit to indicate whether a number is prime or not.

• sepp2k1 agrees
3. No Profile Picture
Registered User
Devshed Newbie (0 - 499 posts)

Join Date
Dec 2005
Posts
22
Rep Power
0
Certainly one efficient way of doing it by Salems method. But however an alternative way is to use linked list?

ssharish2005
4. No Profile Picture
Contributing User
Devshed Newbie (0 - 499 posts)

Join Date
Apr 2012
Posts
83
Rep Power
39
Originally Posted by thomas12
i am not able to make an array of integers of that size.
You don't need an array of integers for a sieve - just an array of booleans. That should cut down your memory consumption by a factor of 4, down to 1GB, which should fit into your memory fine.

Using bits instead of bytes, as salem suggested, will cut it down by another factor of 8, down to 125MB. If you're using C++, you get this for free by using a vector<bool>.

Also note that you should allocate large arrays on the heap, not the stack. The amount of memory that can be allocated on the stack is much more limited than on the heap. On my system it's 8MB, so a sieve for one billion numbers won't fit on the stack even if you use bits.

Originally Posted by ssharish2005
But however an alternative way is to use linked list?
Using a linked list instead of an array will consume more memory, not less. Also sieve algorithms rely on random access being O(1). If you use a linked list for a sieve, that completely destroys its runtime - you might just as well use a naive non-sieve algorithm then.
5. Code:
```   p: inv 1e9
50847534
p: 50847534
1000000007```
You should end up with 50,847,533 prime numbers.

You could use Miller-Rabin test instead of sieve.

To quickly compute a large sieve you could compute in separate blocks. Maybe I'll implement this to better understand timing based on working within my level 2 cache.

In other words, instead of sieving all billion numbers at once do them in blocks of a kilobyte or something. Probably needless to say, the blocking is separable from bit representation versus unsigned int. Bits give you a factor of 32 more numbers you can sieve.
6. No Profile Picture
Contributing User
Devshed Intermediate (1500 - 1999 posts)

Join Date
Feb 2004
Location
San Francisco Bay
Posts
1,939
Rep Power
1313
There's a significant speed cost to implementing the sieve of Eratosthenes using a linked list, though. The usual implementation makes pretty heavy use of random access.

A very common optimization both for time and memory is not to represent integers that are divisible by a small prime. For example, for a positive integer N not to be divisible by 2 or 3, N % 6 must be 1 or 5. That automatically eliminates 2/3 of the composite numbers that a naive implementation would consider. If you're careful, you don't need to waste array space on those obviously composite numbers:

array[0] -> 1
array[1] -> 5
array[2] -> 7
array[3] -> 11
etc.

You can eliminate more than two small primes, but you'll find diminishing returns quickly. I found that eliminating up to 7 small primes worked pretty well.

Combined with the bit-array optimization, this should put this problem well within reach. To get all the primes up to 10^9, if you eliminate multiples of 2 and 3 and use a single bit for each candidate prime, you're looking at an array of about 10 million 32-bit integers.

There are algorithms that are a lot more efficient memory-wise, but they tend to sacrifice something in speed. A pretty good compromise is in this paper:

http://www.cs.tufts.edu/~nr/comp150f.../Sieve-JFP.pdf

It uses a lazy functional programming model, but the algorithm is pretty easy to understand. I implemented a C++ version of it a while ago, and it came out to about 200 lines of code including the basic heap operations which I decided to implement myself.