Initialising very large arrays- how?
September 2nd, 2003, 05:41 AM
|
|
Registered User
|
|
Join Date: Sep 2003
Location: UK
Posts: 20
Time spent in forums: < 1 sec
Reputation Power: 0
|
|
|
Initialising very large arrays- how?
I am a newbie writing a program in C that reads a file and parses it as bytes. The file is well over 2 million bytes long, and I found that I couldn't declare an array large enough to hold that many char values inside a function.
I understand that this is because local arrays are placed on the stack, which has a limited size (I'm using Visual C++ 6.0, where the default is 1 MB). Eventually I learned about memory allocation:
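#include <stdlib.h>
#define MAXREAD 2500127
int main(void){
char *vptr;
vptr = (char *) calloc(MAXREAD, sizeof(char)); /* zero-filled block on the heap */
if (vptr == NULL) return 1; /* allocation failed */
....
....
free(vptr);
return 0;
}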
And so used a pointer to a block of memory as an array. But what if I did want to declare a very large array within a function, without resorting to making it global or without changing the stack size?
What would you suggest?
Last edited by Puffer : September 3rd, 2003 at 03:28 AM.
|

September 2nd, 2003, 08:22 AM
|
 |
pogremar
|
|
Join Date: Jul 2003
Location: At Work
Posts: 958
  
Time spent in forums: 3 Days 18 h 21 m 50 sec
Reputation Power: 11
|
|
You would do the same within the function. However, if you're going to manipulate the data in the function and plan to return the manipulated data, you should allocate the array before passing it to the function, so that the caller still holds the pointer and can free the memory once the function returns.
Code:
#include <stdio.h>
#include <stdlib.h>
#define SIZE 2000000
char* theFunction(char *theString);
void main(){
...
...
...
char *p = (char*) calloc(SIZE, sizeof(char));
if (p == NULL) return; // allocation failed
p = theFunction(p);
printf("%s \n", p);
free(p); // the caller allocated the block, so the caller frees it
}
char * theFunction(char *theString){
// the function does something clever with the data
return theString;
}
__________________
Some day I'll create a smart quote to put here.
|

September 2nd, 2003, 01:14 PM
|
 |
Contributing User
|
|
Join Date: Aug 2003
Location: UK
|
|
|
You can declare a large array as static. It will then not be created on the stack. However, the memory will be permanently allocated from program start to finish, even when the array is not in scope. Also, the data in the array is retained between calls, and it is always the same memory, which usually means the function is not re-entrant (i.e. it may not be suitable for recursive algorithms or multi-threading).
static char array[MAXREAD] ;
The array will be initialised to all zero, as are all static variables that have no explicit initialiser. You can use the memset() function if zero is not what you need.
Dynamic memory allocation is usually the preferred approach, though static memory is simpler. If you declare a static array outside of any function, it has file scope and cannot be accessed from other modules. Data declared outside a function without static is, confusingly, still statically allocated, but it has global scope and can be accessed from other modules.
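A minimal sketch of those points, for what it's worth. The names `big_buffer` and `calls` are made up for illustration; only `MAXREAD` comes from the original post.

```c
#include <stddef.h>

#define MAXREAD 2500127

/* File scope + static: statically allocated, and invisible to
   other modules (hypothetical counter, for illustration only). */
static int calls;

/* Returns a pointer to a large static array and reports how many
   times it has been called. The array is not on the stack, and
   every call hands back the same memory, so this function is not
   re-entrant. */
char *big_buffer(int *call_count)
{
    static char array[MAXREAD];   /* all zero at program start */

    calls++;
    if (call_count != NULL)
        *call_count = calls;
    return array;
}
```

A caller that wants some fill value other than zero could follow the call with something like memset(buf, ' ', MAXREAD). Anything written into the array is still there on the next call.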
Clifford.
Last edited by clifford : September 2nd, 2003 at 02:00 PM.
|

September 2nd, 2003, 09:39 PM
|
|
Offensive Member
|
|
Join Date: Oct 2002
Location: in the perfect world
|
|
|
Who is teaching the obsolete 'void main()'?
Standards now say that an application must return a value to the OS: either 0 or an error code.
People will point and laugh at you............
::Points at kubicon & puffer::
HA HA HA!
__________________
The essence of Christianity is told us in the Garden of Eden history. The fruit that was forbidden was on the Tree of Knowledge. The subtext is, All the suffering you have is because you wanted to find out what was going on. You could be in the Garden of Eden if you had just kept your f***ing mouth shut and hadn't asked any questions.
Frank Zappa
|

September 3rd, 2003, 03:26 AM
|
|
Registered User
|
|
Join Date: Sep 2003
Location: UK
Posts: 20
Time spent in forums: < 1 sec
Reputation Power: 0
|
|
|
thanks guys, I shall grep your wisdom at leisure.
*****************************************
remember, unarmed combat leads to toe wrestling
*****************************************
Last edited by Puffer : September 3rd, 2003 at 03:46 AM.
|

September 3rd, 2003, 07:49 AM
|
 |
/(bb|[^b]{2})/
|
|
Join Date: Nov 2001
Location: Somewhere in the great unknown
|
|
Quote: Originally posted by TechNoFear
Who is teaching the obsolete 'void main()' ?
Standards now say that an application must return a value to the OS. Either 0 or an error code.
People will point and laugh at you............
::Points at kubicon & puffer::
HA HA HA!
You should know that as long as there have been standards, there have been "teachers" who do not follow them. It is left up to us to correct this when we run into it.
|

September 3rd, 2003, 08:22 AM
|
 |
pogremar
|
|
Join Date: Jul 2003
Location: At Work
Posts: 958
  
Time spent in forums: 3 Days 18 h 21 m 50 sec
Reputation Power: 11
|
|
|
I was just being lazy and didn't feel like putting in the extra return 0; however, I've done wrong. I will accept my deserved spank.
|

September 3rd, 2003, 05:24 PM
|
 |
Contributing User
|
|
Join Date: Aug 2003
Location: UK
|
|
Quote: Originally posted by kubicon
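I was just being lazy, didn't feel like putting the extra return 0...
There is no need. While the standard states that main() must be declared as returning int, explicitly returning a value is optional in C++ and in C99, where reaching the end of main() is treated as return 0. This is only true of main(), not of any other function. (In C89, falling off the end of main() leaves the termination status undefined, so an explicit return is still the safe habit there.)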
This is a concession to systems for which a return from main() has no meaning, such as OS-less embedded systems, or an OS that does not take returns from executables. In systems where an executable termination value has meaning, you would expect the compiler to implicitly insert a return 0.
Clifford.
|

September 3rd, 2003, 05:35 PM
|
 |
/(bb|[^b]{2})/
|
|
Join Date: Nov 2001
Location: Somewhere in the great unknown
|
|
|
I think the focus here is more on declaring main() with an int return type instead of void.
|

September 4th, 2003, 04:43 AM
|
|
Contributing User
|
|
Join Date: Aug 2003
Posts: 55
Time spent in forums: < 1 sec
Reputation Power: 10
|
|
|
Yes, declaring the array static will most likely put it in the .BSS section (segment), or possibly the .DATA section. I wouldn't advise putting such a large amount of data in a segment.
I have read that you shouldn't really try to allocate more than 32K of contiguous memory. I can't remember where I read this, but it seems sensible. For such a large file, you should consider processing the data a part at a time.
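Processing a file a part at a time could look roughly like the sketch below. The function name `count_byte_chunked` and the byte-counting task are hypothetical stand-ins for whatever parsing is actually needed; the 32 KB chunk size is just the figure mentioned above.

```c
#include <stdio.h>

#define CHUNK_SIZE 32768  /* 32 KB per read */

/* Counts how many bytes in the file equal `target`, reading one
   chunk at a time so only CHUNK_SIZE bytes are held in memory.
   Returns -1 if the file cannot be opened or a read fails. */
long count_byte_chunked(const char *path, unsigned char target)
{
    unsigned char chunk[CHUNK_SIZE];
    long count = 0;
    size_t got, i;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;

    /* fread returns fewer bytes than requested only at EOF or on error. */
    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        for (i = 0; i < got; i++)
            if (chunk[i] == target)
                count++;
    }

    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return count;
}
```

This works well when each byte (or record) can be handled independently. If the parser needs to look across chunk boundaries, it has to carry state from one chunk to the next, which is where the extra complexity comes from.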
|

September 4th, 2003, 05:01 AM
|
 |
Contributing User
|
|
Join Date: Aug 2003
Location: UK
|
|
Quote: Originally posted by xtor
...I wouldn't advise on putting such large data in a segment.
You shouldn't really try to allocate anything over 32K of contiguous memory...
Without any justification or reference to a reputable source, I would be suspicious of these assertions. They seem to come from the days of MS-DOS, limited memory, and real-mode memory models. On a modern PC, even a 2.5MB array would hardly raise an eyebrow with me, so long as you did not try to statically initialise it, since that would a) require a large source file and b) make the .exe that much larger. However, I would still consider the alternatives before resorting to such a large static array: dynamic memory allocation, file I/O, or memory-mapped files.
Clifford
|

September 4th, 2003, 05:18 AM
|
|
Contributing User
|
|
Join Date: Aug 2003
Posts: 55
Time spent in forums: < 1 sec
Reputation Power: 10
|
|
It is possible that they came from real-mode OSes. The .exe would not have to be much larger for the data in the segment to be initialised, though.
I am not exactly good at assembly, but here is code that sets space reserved in the .BSS section to 0 and then writes that memory out. You can see in an strace that the write is 512 zero bytes.
Code:
%define BUF_LEN 512
section .bss
buf resb BUF_LEN ; reserved only, takes no space in the executable
section .text
global _start
_start:
xor eax, eax ; AL = 0, the fill byte
mov edi, buf
mov ecx, BUF_LEN
cld ; fill upwards
rep stosb ; store AL at [edi], ecx times
mov eax, 4 ; sys_write
mov ebx, 1 ; stdout
mov ecx, buf
mov edx, BUF_LEN
int 0x80
mov eax, 1 ; sys_exit
mov ebx, 0
int 0x80
I would most likely go for memory mapping, much more efficient.
I will try to find some resources to give some reasons behind my comments, I never read much into them (bad idea).
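For anyone curious what the memory-mapping route looks like, here is a rough sketch assuming a POSIX system (the same kind of Linux box as the assembly above). The helper names `map_file` and `unmap_file` are made up; `open`, `fstat`, `mmap` and `munmap` are the standard POSIX calls. On Windows the equivalent would go through CreateFileMapping and MapViewOfFile instead.

```c
#include <stddef.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Maps a whole file read-only. On success, returns the start of the
   mapping and stores its length in *len. Returns NULL if the file
   cannot be opened, is empty, or cannot be mapped. */
const unsigned char *map_file(const char *path, size_t *len)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return p;
}

/* Releases a mapping obtained from map_file(). */
void unmap_file(const unsigned char *p, size_t len)
{
    munmap((void *)p, len);
}
```

The mapped bytes can then be indexed like an ordinary array, and the OS pages the file in as it is touched. Whether that beats plain reads depends on the access pattern, so it is worth measuring rather than assuming.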
Last edited by xtor : September 4th, 2003 at 05:53 AM.
|

September 4th, 2003, 09:04 AM
|
 |
pogremar
|
|
Join Date: Jul 2003
Location: At Work
Posts: 958
  
Time spent in forums: 3 Days 18 h 21 m 50 sec
Reputation Power: 11
|
|
|
I agree with Clifford. I'm writing a spider now that indexes html/php/text files. If I couldn't allocate more than 32K at a time, the program wouldn't work, as some of the html files are 1 MB long (for testing purposes). I read the file contents into dynamically allocated char arrays. I have indexed folders containing thousands of files and it's working just fine.
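Roughly, the approach looks like the sketch below. This is not the spider's actual code; `read_whole_file` is a hypothetical helper, and it assumes the file is a regular file small enough for its size to fit in a long.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads an entire file into a heap buffer and appends a '\0'.
   Stores the number of bytes read in *size. The caller must
   free() the returned buffer. Returns NULL on any failure. */
char *read_whole_file(const char *path, long *size)
{
    FILE *fp = fopen(path, "rb");
    char *buf;
    long len;

    if (fp == NULL)
        return NULL;

    /* Find the file length by seeking to the end. */
    if (fseek(fp, 0L, SEEK_END) != 0 || (len = ftell(fp)) < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    buf = malloc((size_t)len + 1);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }
    if (fread(buf, 1, (size_t)len, fp) != (size_t)len) {
        free(buf);
        fclose(fp);
        return NULL;
    }
    fclose(fp);

    buf[len] = '\0';
    *size = len;
    return buf;
}
```

With the whole file in one buffer, the parsing code can scan forwards and backwards freely, which is what keeps it simple compared to a chunked reader.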
|

September 4th, 2003, 09:52 AM
|
|
Contributing User
|
|
Join Date: Aug 2003
Posts: 55
Time spent in forums: < 1 sec
Reputation Power: 10
|
|
|
You could make it work perfectly while only reading 32Kb at a time; it's just that you don't know how to.
I'm not saying don't read more than 32Kb, I'm just saying I read somewhere that it's not a good idea and you may want to look into it.
|

September 4th, 2003, 11:36 AM
|
 |
Contributing User
|
|
Join Date: Aug 2003
Location: UK
|
|
Quote: Originally posted by xtor
You could make it work perfectly whilst only reading 32Kb at a time, it's just you don't know how to.
I'm not saying don't read more than 32Kb, I'm just saying I read somewhere it's not a good idea and you may want to look into it.
I remember contributing to kubicon's post about reading in large files. Yes, he could have read it in chunks, but in his application that would have made the code more complex and error-prone. Reading the whole file made the code simpler. But as you say, each case should be considered on its merits.
Clifford
|