Yes, p must be very large. I think the current recommendation is to use p of at least 2048 bits (about 617 decimal digits). Using even larger p should be more secure, but as p grows larger so does the time needed to compute the DH key exchange. Today, 8192 bits is the maximum option on most systems.
Note that once a size has been chosen, not all primes of that size are suitable. For example, p-1 (which will always be composite) must have at least one large factor, where "large" is usually understood to mean at least 256 bits. Suitable prime numbers are often called strong primes.
There are fairly efficient tests for pseudoprimality testing, which make it possible to find an arbitrary 2048-bit strong prime within a few minutes on an ordinary computer. However, the process is a little complicated, involving several steps and techniques.
Once p has been found, it is also necessary to find a generator (also called a primitive root) g, with large order in p. For a large p, there are very many such generators, and for Diffie-Hellman, one is as good as another. Some DH groups use g=2; others use large g (for example, 256 bits in length, or even as large as p). For DH, the size of g has little practical importance.
Finding a generator depends on how the prime p has been constructed. It requires knowing the factorization of p-1. If p were chosen at random, factoring p-1 would in general not be feasible, but p is always chosen so that at least some of the factors of p-1 are known: this is necessary to ensure that there is at least one large factor.
With knowledge of the factors of p-1, an arbitrary integer can quickly be tested to determine whether it is a generator of sufficiently large group size.
However, it isn't necessary to go the work of finding a suitable p and g. With Diffie-Hellman, the group (specified by p and g) is not secret. Further, there is no security advantage to using a different group from anybody else. For example, suppose that in the world, there are 500 million computers whose users accept the security level of a 2048-bit p with a subgroup of 256 bits. It is absolutely fine for all of them to use the same group!
For this reason, various DH groups have been published for anyone to use. These groups have been checked for suitable properties, and some have p specially chosen to make the computations a little faster. You can look up one of these and use it!
Here is an example:
IETF RFC 5114
(see "2.3. 2048-bit MODP Group with 256-bit Prime Order Subgroup")