September 12th, 2012, 04:20 AM

Binning Data
PART A
Write a program to calculate the distribution of numbers produced by the random number function
rand().
1) Calculate a random number between 0 and 1. To do this use rand() and divide by
RAND_MAX. Remember to use a CAST for RAND_MAX.
2) Next write some code to ‘bin’ the data. Create an array where each element represents a bin
(use, say, 10 bins to start but make this variable). Increment (increase by one) the array element
where the number falls. You need to calculate which bin the random number will fall into. If x is
the random number between 0 and 1 and there are 10 bins, then the bin number will be the integer
part of x*10 (assuming your array starts at 0).
Repeat this for a large number of times (use a for(){} loop).
Okay, so I think I have everything down but one thing.
This is my code.
Code:
#include <stdio.h>
#include <stdlib.h>
int main()
{
float x;
int i,j;
for(i=0;i<9;i++)
{
x = rand()/(double) RAND_MAX; printf("%f\n",x);
}
if ((0 <= x) && (x < 0.1)) {/*statements*/ }
else if ((0.1 <= x) && (x < 0.2)) { /* statements */ }
else if ((0.2 <= x) && (x < 0.3)) { /* statements */ }
else if ((0.3 <= x) && (x < 0.4)) { /* statements */ }
else if ((0.4 <= x) && (x < 0.5)) { /* statements */ }
else if ((0.5 <= x) && (x < 0.6)) { /* statements */ }
else if ((0.6 <= x) && (x < 0.7)) { /* statements */ }
else if ((0.7 <= x) && (x < 0.8)) { /* statements */ }
else if ((0.8 <= x) && (x < 0.9)) { /* statements */ }
else ((0.9 <= x) && (x < 1.0)) { /* statements */ }
return 0;
}
So all I need to do is declare the bins by the statements.
I had a look around online and found this thread http://cboard.cprogramming.com/cprogramming/145610readingdatafilebinning.html
But I'm not sure how to use it in mine if I can at all. As far as I can tell this looks like a good idea as it doesn't actually matter what the numbers are that fall into the bins, it's the distribution that counts.
Any thoughts?
September 12th, 2012, 06:10 AM

I think you need to look at the logic in your if statements; I doubt any would execute. Also, your instructions talk about running this in a for loop, and while you've got one, you ain't runnin' it in the loop.
It looks to me like you are not taking the time to understand what you are trying to do and are just randomly coding. Take a step back, write out the logic in high-level pseudocode, and model that code by hand. Once you have something that works on paper, _then_ convert it into code and start again.
September 12th, 2012, 08:58 AM

Originally Posted by mitakeet
I think you need to look at the logic in your if statements; I doubt any would execute. Also, your instructions talk about running this in a for loop, and while you've got one, you ain't runnin' it in the loop.
It looks to me like you are not taking the time to understand what you are trying to do and are just randomly coding. Take a step back, write out the logic in high-level pseudocode, and model that code by hand. Once you have something that works on paper, _then_ convert it into code and start again.
Okay, I had another go and came up with this lot. It isn't quite doing what I expected: the number of elements in each bin is far greater than it should be. :S
Code:
#include <stdio.h>
#include <stdlib.h>
#define N_SAMPLES 1000
#define N_BINS 10
int main()
{
int i,bins[N_BINS],j; //declares integer variables i and j, and integer array bins
float x,nums[N_SAMPLES],y;//declares floating point variables x and y, and float array nums
for(i=0; i<N_SAMPLES; i++)
{
nums[i]=0;
}
for(i=0; i<N_BINS; i++)
{
bins[i]=0;
}//initializes both arrays
for(i=0;i<N_SAMPLES; i++)
{
x = rand()/(double) RAND_MAX;//x is a random number between 0 and 1
nums[i]=x;//assigns a random number between 0 and 1 to each element in array 'nums'
}
for(i=0;i<N_SAMPLES;i++)//for every random number generated
{
for(j=0;j<N_BINS;j++)//for each bin
{
y=nums[i];
if((y>=j/10) && (y<(j+1)/10))
{
bins[j] = (bins[j] + 1);
}
else
{
bins[j] = (bins[j]);
}
}
}
for(j=0;j<N_BINS;j++)
{
printf("Bin Number: %d\n",j);
printf("Number of Elements: %d\n", &bins[j]);
}
return 0;
}
September 12th, 2012, 09:03 AM

That is certainly a _lot_ better. Why, though, do you feel the need to create an array and store all the numbers between zero and one? Why not just create them one at a time and test them?
Why did you switch from the stack of if/elseif statements? I bet it would be a lot easier to debug if you stuck with what you had before!
September 12th, 2012, 09:51 AM

Code:
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h> /* for time(), used to seed srand() */
#define N_SAMPLES 1000
#define N_BINS 10
#define SYMBOL '*'
#define MAX(A,B) ((A) < (B) ? (B) : (A))
int main() {
char bar[50];
int i,j,bins[N_BINS],max_counts;
/*Lambert Electronics, LLC. USA, NY*/
double x,nums[N_SAMPLES];
srand(time(NULL)); /* seed the prng (pseudo random number generator) */
puts("/* initialized bar for histogram */");
memset(bar,SYMBOL,sizeof bar);
puts("/* removed initialization memset(nums,0,sizeof(nums)); */");
for(i=0; i<N_BINS; i++)
bins[i]=0;
puts("/* initialize nums with original data *****************/");
for(i=0;i<N_SAMPLES; i++) {
for (x = rand(); RAND_MAX == x; x = rand())
puts("/* AVOIDED RAND_MAX *****************/");
nums[i] = x/(double)RAND_MAX;
}
puts("/* Directly compute the bins index *****************/");
for(i=0;i<N_SAMPLES;i++)
++bins[(int)(N_BINS*nums[i])]; /* note buffer overrun if we had retained RAND_MAX */
puts("Lambert Electronics, LLC. USA, NY");
puts("/* Find the maximum counts for histogram */");
for (i = max_counts = 0; i < N_BINS; ++i)
max_counts = MAX(max_counts,bins[i]);
for(i=0;i<N_BINS;++i) {
printf("%3d",i);
j = 40*(bins[i]/(double)max_counts);
bar[j] = 0;
puts(bar);
bar[j] = SYMBOL;
}
return 0;
}
Note: Some people learn by example. Since Sophie has struggled with this problem and thought about it a bit, I hope some lights flash inside her head, and that she gains insight and becomes a better programmer.
How exactly does "directly computing the bin index" work?
Why was initializing nums to 0 useless?
What does bar[j] = 0; do?
* We save the nums because Sophie might want to use them for other purposes: statistical tests, min/max, different bin sizes, checking that the grand sum of counts matches N_SAMPLES, et cetera.
Last edited by b49P23TIvg; September 12th, 2012 at 10:37 AM.