#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    2
    Rep Power
    0

    Read file line by line store in array


    Hi experts

    My situation is like this: I have a file that contains 1234567855260 lines.

    That is
    Code:
    wc -l data.txt
    1234567855260
    The length of each line is unknown.

    I need to read the file and store it in an array called line, so that I can loop over it with a for loop and, after some calculation, print each line like below:

    Code:
    /* assumes line is an array of strings, e.g. char *line[N] */
    for (i = 0; i < sizeof(line) / sizeof(line[0]); i++) {
        printf("%s\n", line[i]);
    }
    Please help me, experts. I guess it requires memory allocation with malloc, but I am not an expert. I use the gcc compiler.
  2. #2
  3. Transforming Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,141
    Rep Power
    9398
    You can't store the whole thing in memory: if each line were a single character, that would be >1TB to hold just the contents of the file, not including newline characters, plus another ~5TB (4 bytes * 1.23 trillion strings) for the string pointers, or ~10TB with the 8-byte pointers of a 64-bit build.
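
    The estimate above can be checked with a few lines of C. This is only a sketch of the arithmetic; the function name array_bytes and its parameters are made up for illustration and do not come from the thread.

```c
#include <assert.h>
#include <limits.h>

/* Bytes needed to hold `lines` strings of `bytes_per_line` bytes each,
 * plus one pointer of `pointer_size` bytes per string. */
unsigned long long array_bytes(unsigned long long lines,
                               unsigned long long bytes_per_line,
                               unsigned long long pointer_size)
{
    unsigned long long per_line = bytes_per_line + pointer_size;

    /* guard against overflow of the multiplication below */
    assert(lines == 0 || per_line <= ULLONG_MAX / lines);
    return lines * per_line;
}
```

    Calling array_bytes(1234567855260ULL, 1, 4) gives 6172839276300 bytes, roughly 6TB; with 8-byte pointers the result is 11111110697340 bytes, roughly 11TB.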
  4. #3
  5. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,181
    Rep Power
    2222
    As noted, you don't have enough RAM to hold that file.

    Since the amount of data you will read in and store is not known at compile time, in general you would need some kind of dynamic data structure, which means using malloc and free.
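
    For a file (or a split piece of one) that does fit in memory, such a dynamic structure might look like the sketch below. It assumes a POSIX system, because getline() is a POSIX function rather than standard C; read_lines and free_lines are hypothetical helper names, not anything from the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Frees an array returned by read_lines. */
void free_lines(char **lines, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(lines[i]);
    free(lines);
}

/* Reads every line of `in` into a malloc'd array of malloc'd strings,
 * with the trailing newline removed. Stores the line count in *count.
 * Returns NULL on allocation failure, and also for an empty stream
 * (in which case *count is 0). */
char **read_lines(FILE *in, size_t *count)
{
    char **lines = NULL;
    size_t used = 0, cap = 0;
    char *buf = NULL;       /* getline allocates and grows this buffer */
    size_t bufcap = 0;
    ssize_t len;

    assert(in != NULL && count != NULL);
    *count = 0;

    while ((len = getline(&buf, &bufcap, in)) != -1) {
        if (len > 0 && buf[len - 1] == '\n')
            buf[--len] = '\0';              /* strip the newline */
        if (used == cap) {                  /* grow the pointer array */
            size_t newcap = cap ? cap * 2 : 16;
            char **tmp = realloc(lines, newcap * sizeof *tmp);
            if (tmp == NULL)
                goto fail;
            lines = tmp;
            cap = newcap;
        }
        lines[used] = malloc((size_t)len + 1);
        if (lines[used] == NULL)
            goto fail;
        memcpy(lines[used], buf, (size_t)len + 1);
        used++;
    }
    free(buf);
    *count = used;
    return lines;

fail:
    free(buf);
    free_lines(lines, used);
    return NULL;
}
```

    After the call, lines[i] can be printed in a for loop running from 0 to count - 1, and free_lines releases everything when you are done.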

    Once you have read that data in, what are you going to do with it? IOW, how are you to process it? How much data will you need to have read in for each step of that processing?

    Faced with this kind of a problem, I would set up a loop to process the entire file. Each time through the loop, I'd read in an amount of data -- since you can make this a fixed amount, that should remove the requirement for dynamic memory management. If the results of that processing need to be saved, then I would save it -- that could involve writing the results to an output file.

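    As for your for-loop: since you'd be reading from a file, just loop until the read call itself reports the end of input (fgets() returns NULL, for example). After the loop, feof() tells you whether it stopped at EOF or because of a read error.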
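
    A minimal sketch of that kind of loop follows. It assumes the per-line "calculation" is just the line length, which is a stand-in for whatever processing is really needed; process_lines and CHUNK_SIZE are names invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 4096   /* fixed buffer size; an arbitrary choice */

/* Reads `in` one fixed-size piece at a time and writes the length of each
 * line (excluding the newline) to `out`. Returns the number of lines seen.
 * A line longer than the buffer arrives in several pieces, so its length
 * is accumulated until the piece that ends in '\n'. */
unsigned long long process_lines(FILE *in, FILE *out)
{
    char buf[CHUNK_SIZE];
    unsigned long long lines = 0, length = 0;
    int in_line = 0;      /* nonzero while a line is only partly read */

    assert(in != NULL && out != NULL);

    /* fgets returns NULL at end of file or on error, which ends the loop */
    while (fgets(buf, sizeof buf, in) != NULL) {
        size_t n = strlen(buf);
        in_line = 1;
        if (n > 0 && buf[n - 1] == '\n') {
            length += n - 1;
            fprintf(out, "%llu\n", length);
            lines++;
            length = 0;
            in_line = 0;
        } else {
            length += n;  /* partial line: keep accumulating */
        }
    }
    if (in_line) {        /* last line had no trailing newline */
        fprintf(out, "%llu\n", length);
        lines++;
    }
    return lines;
}
```

    Memory use stays at one buffer no matter how large the file is, and no malloc is involved.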
  6. #4
  7. Contributing User

    Join Date
    Aug 2003
    Location
    UK
    Posts
    5,114
    Rep Power
    1803
    Why have you randomly placed inappropriate text in code tags in your post? Just to make it really hard to read!?

    Then why do I not believe the size of your file? Is it because of the infeasible size and the fact that the first 8 digits are in sequence, perhaps?

    Anyhow, I suggest that you do not read very large files into memory at all - it will probably just cause lots of page swapping. Rather, use a memory mapped file; you may still get page swapping, but directly on the target file rather than to and from the swap file.

    A memory mapped file makes it look as if the entire file is read into memory and it can be accessed as if it were a single large array. The OS memory manager automatically manages the exchange of data between physical memory and the file in a virtual address space without explicit file i/o.

    To get line-by-line access to the data, you will need to scan it for the line ends (unless all lines are the same length, then you can simply increment an index by the line length).
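
    As a rough illustration of the approach described above, the sketch below maps a file and scans it for line ends. It assumes a POSIX system (mmap(), fstat()); on Windows the equivalent calls are different. The function name count_lines_mapped is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Counts '\n' characters in the file behind `fd` by mapping it read-only,
 * which matches what `wc -l` reports. Returns -1 on failure. An empty file
 * is reported as 0 lines because mmap() rejects a zero-length mapping. */
long long count_lines_mapped(int fd)
{
    struct stat st;

    assert(fd >= 0);
    if (fstat(fd, &st) != 0)
        return -1;
    if (st.st_size == 0)
        return 0;

    size_t size = (size_t)st.st_size;
    char *data = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED)
        return -1;

    long long lines = 0;
    const char *p = data;
    const char *end = data + size;

    /* memchr finds the next line end; the bytes from p up to nl are one line */
    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        if (nl == NULL)
            break;        /* trailing text without a newline is not counted */
        lines++;
        p = nl + 1;
    }
    munmap(data, size);
    return lines;
}
```

    Inside the loop, p and nl delimit one line, so that is the place to do per-line work instead of just counting.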
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2013
    Posts
    2
    Rep Power
    0
    Someone please give me the syntax for this. I will split the file and use it afterwards. I need code. I have 16GB RAM and the processor is an Intel Xeon(R) E series.
  10. #6
  11. Contributing User

    Join Date
    Aug 2003
    Location
    UK
    Posts
    5,114
    Rep Power
    1803
    Originally Posted by Peter_P
    Someone please give me the syntax for this. I will split the file and use it afterwards. I need code. I have 16GB RAM and the processor is an Intel Xeon(R) E series.
    The link I posted includes an example. Did you follow it?

    Regarding your processor and RAM, and assuming you have a 64-bit OS, it is possible to address a file of the size you suggest, but your proposal to print every line sequentially deserves some consideration. Say you were able to output 1 million lines per second (which is unlikely, and even if you could, is there really any point - who's going to read them!?), the output of the file would take a mere 14 days.
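
    The 14-day figure follows directly from the line count and the assumed output rate. A small sketch of the arithmetic, with days_to_print as a hypothetical helper name:

```c
#include <assert.h>

/* Days needed to print `lines` lines at `lines_per_second`. */
double days_to_print(double lines, double lines_per_second)
{
    assert(lines_per_second > 0.0);
    return lines / lines_per_second / 86400.0;   /* 86400 seconds per day */
}
```

    For 1234567855260 lines at 1 million lines per second this comes to a little over 14 days.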
