Thread: Primes

  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2013
    Rep Power


    Hello all. I am trying to make a program to generate all the primes up to 10^9 however i am not able to make an array of integers of that size. I understand how the sieve works but how can I go about generating all the primes up to said number?
  2. #2
  3. Contributed User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jun 2005
    Rep Power
    One way would be to have an array of bits, rather than an array of bytes (or worse, integers).

    So you only use one bit to indicate whether a number is prime or not.

    Comments on this post

    • sepp2k1 agrees
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2005
    Rep Power
    Certainly one efficient way of doing it by Salems method. But however an alternative way is to use linked list?

  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2012
    Rep Power
    Originally Posted by thomas12
    i am not able to make an array of integers of that size.
    You don't need an array of integers for a sieve - just an array of booleans. That should cut down your memory consumption by a factor of 4, down to 1GB, which should fit into your memory fine.

    Using bits instead of bytes, as salem suggested, will cut it down by another factor of 8, down to 125MB. If you're using C++, you get this for free by using a vector<bool>.

    Also note that you should allocate large arrays on the heap, not the stack. The amount of memory that can be allocated on the stack is much more limited than on the heap. On my system it's 8MB, so a sieve for one billion numbers won't fit on the stack even if you use bits.

    Originally Posted by ssharish2005
    But however an alternative way is to use linked list?
    Using a linked list instead of an array will consume more memory, not less. Also sieve algorithms rely on random access being O(1). If you use a linked list for a sieve, that completely destroys its runtime - you might just as well use a naive non-sieve algorithm then.
  8. #5
  9. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Rep Power
       p: inv 1e9
       p: 50847534
    You should end up with 50,847,533 prime numbers.

    You could use Miller-Rabin test instead of sieve.

    To quickly compute a large sieve you could compute in separate blocks. Maybe I'll implement this to better understand timing based on working within my level 2 cache.

    In other words, instead of sieving all billion numbers at once do them in blocks of a kilobyte or something. Probably needless to say, the blocking is separable from bit representation versus unsigned int. Bits give you a factor of 32 more numbers you can sieve.
    [code]Code tags[/code] are essential for python code and Makefiles!
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    San Francisco Bay
    Rep Power
    There's a significant speed cost to implementing the sieve of Eratosthenes using a linked list, though. The usual implementation makes pretty heavy use of random access.

    A very common optimization both for time and memory is not to represent integers that are divisible by a small prime. For example, for a positive integer N not to be divisible by 2 or 3, N % 6 must be 1 or 5. That automatically eliminates 2/3 of the composite numbers that a naive implementation would consider. If you're careful, you don't need to waste array space on those obviously composite numbers:

    array[0] -> 1
    array[1] -> 5
    array[2] -> 7
    array[3] -> 11

    You can eliminate more than two small primes, but you'll find diminishing returns quickly. I found that eliminating up to 7 small primes worked pretty well.

    Combined with the bit-array optimization, this should put this problem well within reach. To get all the primes up to 10^9, if you eliminate multiples of 2 and 3 and use a single bit for each candidate prime, you're looking at an array of about 10 million 32-bit integers.

    There are algorithms that are a lot more efficient memory-wise, but they tend to sacrifice something in speed. A pretty good compromise is in this paper:

    It uses a lazy functional programming model, but the algorithm is pretty easy to understand. I implemented a C++ version of it a while ago, and it came out to about 200 lines of code including the basic heap operations which I decided to implement myself.

IMN logo majestic logo threadwatch logo seochat tools logo