October 24th, 2013, 12:43 PM
Read file line by line store in array
My situation is this: I have a file which contains 1234567855260 lines.
The length of each line is unknown.
wc -l data.txt
I need to read the file and store it in an array called Line, so that I can access it in a for loop and, after some calculation, print like below
Experts, please help me. I guess it requires memory allocation with malloc, but I am not an expert, so please help. I use the gcc compiler.
October 24th, 2013, 01:14 PM
You can't store the whole thing in memory: if each line was a single character, that would be >1TB just to hold the contents of the file, not including newline characters, plus another ~5TB (4 bytes * 1.23 trillion strings) for the string pointers, or ~10TB with the 8-byte pointers of a 64-bit build.
October 24th, 2013, 01:34 PM
As noted, you don't have enough RAM to hold that file.
Since the amount you will read in and store is not known at compile time, in general some kind of dynamic data structure would be needed, which would necessitate using malloc and free.
Once you have read that data in, what are you going to do with it? IOW, how are you to process it? How much data will you need to have read in for each step of that processing?
Faced with this kind of problem, I would set up a loop to process the entire file. Each time through the loop, I'd read in an amount of data -- since you can make this a fixed amount, that should remove the requirement for dynamic memory management. If the results of that processing need to be saved, then I would save them -- that could involve writing the results to an output file.
As for your for-loop, since you'd be reading in from a file, just loop until the read call itself fails (for example, fgets() returning NULL). After that, feof() tells you whether the loop stopped at EOF rather than on a read error; testing feof() before each read tends to process the last line twice.
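Here is a minimal sketch of that kind of loop, in case it helps. It is not the only way to do it. The names `process_lines` and `CHUNK_SIZE` are made up for this example, and the "calculation" is just the length of each line, since the real one was not described. It reads through a fixed-size buffer, so no malloc is needed, and it stops when fgets() returns NULL instead of testing feof() before each read.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 4096 /* fixed buffer size; an arbitrary choice for this sketch */

/* Reads `in` one fixed-size piece at a time and writes "<line number> <length>"
 * to `out` for every line. A line longer than the buffer arrives in several
 * pieces, so its length is accumulated until the newline shows up.
 * Returns the number of lines seen, or -1 on a read error. */
static long long process_lines(FILE *in, FILE *out)
{
    char buffer[CHUNK_SIZE];
    long long line_count = 0;
    unsigned long long line_length = 0;
    int in_line = 0; /* nonzero while the current line is only partly read */

    assert(in != NULL && out != NULL);

    /* fgets() returns NULL at end-of-file or on error, which ends the loop. */
    while (fgets(buffer, sizeof buffer, in) != NULL) {
        size_t piece = strlen(buffer);
        in_line = 1;
        if (piece > 0 && buffer[piece - 1] == '\n') {
            line_length += piece - 1; /* do not count the newline itself */
            line_count++;
            fprintf(out, "%lld %llu\n", line_count, line_length);
            line_length = 0;
            in_line = 0;
        } else {
            line_length += piece; /* part of a long line, or a last line without newline */
        }
    }

    if (ferror(in))
        return -1;

    if (in_line) { /* the final line had no trailing newline */
        line_count++;
        fprintf(out, "%lld %llu\n", line_count, line_length);
    }
    return line_count;
}
```

You would call it as `process_lines(fopen("data.txt", "r"), stdout)` after checking that fopen() succeeded. Memory use stays at one buffer no matter how many lines the file has.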
October 24th, 2013, 01:45 PM
Why have you randomly placed inappropriate text in code tags in your post? Just to make it really hard to read!?
Then why do I not believe the size of your file? Is it because of the infeasible size and the fact that the first 8 digits are in sequence perhaps?
Anyhow, I suggest that you do not read very large files into memory at all - it will probably just cause lots of page swapping. Rather use a memory mapped file; you may still get page swapping, but directly on the target file rather than to and from the swap file.
A memory mapped file makes it look as if the entire file is read into memory and it can be accessed as if it were a single large array. The OS memory manager automatically manages the exchange of data between physical memory and the file in a virtual address space without explicit file i/o.
To get line-by-line access to the data, you will need to scan it for the line ends (unless all lines are the same length, then you can simply increment an index by the line length).
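As a rough sketch of the scanning step, assuming a POSIX system (mmap() is not part of standard C, and Windows has a different API): the helper names `count_lines` and `count_lines_mapped` are made up for illustration, and the per-line work here is only counting. The file is mapped read-only and walked with memchr() to find each line end.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Walks an in-memory byte range and counts its lines.
 * A final line without a trailing newline is counted too. */
static size_t count_lines(const char *data, size_t size)
{
    size_t lines = 0;
    const char *cursor = data;
    const char *end = data + size;

    assert(data != NULL);

    while (cursor < end) {
        const char *newline = memchr(cursor, '\n', (size_t)(end - cursor));
        /* The bytes from cursor up to the newline (or to end) are one line;
         * real per-line processing would go here. */
        lines++;
        cursor = newline ? newline + 1 : end;
    }
    return lines;
}

/* Maps the file at `path` read-only and counts its lines.
 * Returns -1 if the file cannot be opened, examined, or mapped. */
static long long count_lines_mapped(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat info;
    if (fstat(fd, &info) != 0) {
        close(fd);
        return -1;
    }
    if (info.st_size == 0) { /* mmap() rejects a zero-length mapping */
        close(fd);
        return 0;
    }

    size_t size = (size_t)info.st_size;
    char *data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (data == MAP_FAILED)
        return -1;

    long long lines = (long long)count_lines(data, size);
    munmap(data, size);
    return lines;
}
```

Calling `count_lines_mapped("data.txt")` never copies the file into a buffer of your own; the OS pages it in as the scan touches it. Mapping a file larger than the address space needs a 64-bit build.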
October 24th, 2013, 11:31 PM
Someone please give me the syntax for this. I will split the file and use it afterwards. I need code. I have 16GB RAM and the processor is an Intel Xeon(R) E series.
October 25th, 2013, 10:08 AM
The link I posted includes an example. Did you follow it?
Originally Posted by Peter_P
Regarding your processor and RAM, and assuming you have a 64 bit OS, then it is possible to address a file the size you suggest, but your proposal to print every line sequentially deserves some consideration. Say you were able to output 1 million lines per second (which is unlikely, and even if you could is there really any point - who's going to read them!?), the output of the file would take a mere 14 days.
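For anyone who wants to check that estimate or try other output rates, a small sketch of the arithmetic follows. The function name `days_to_print` is made up for this example, and it assumes a steady rate with no other overhead.

```c
#include <assert.h>

#define SECONDS_PER_DAY 86400.0

/* Days needed to emit `line_count` lines at a steady `lines_per_second`. */
static double days_to_print(double line_count, double lines_per_second)
{
    assert(lines_per_second > 0.0);
    return line_count / lines_per_second / SECONDS_PER_DAY;
}
```

`days_to_print(1234567855260.0, 1e6)` comes out at roughly 14.3 days.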