#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    109
    Rep Power
    3

    Reading from file- pointer to buffer?


    When I use the function fread(), I know I'm supposed to provide a pointer to a buffer.

    First: what exactly is a buffer?

    Second: suppose I have an array of structs (each struct contains, e.g., the name and ID of a person) and I want fread() to go through my struct array. What will my pointer look like if the array of structs is declared as below?

    (Will the pointer be the name "array"?)
    Code:
    typedef struct 
    {
        char *name;
        int ID;
    }Person;
    
    Person array[10];/* Let's say i already wrote the names and ID's */
    
    fread(array, sizeof(Person), 10, file_pointer); /* just guessing here */
    What i mean:

    Code:
    Person array[10]={{"jerry", 15}, {"george", 23}, {"mike", 98},....}; /* 10 elements */
    
    fread(array, ....... )  <-----  correct ???
    Thanks in advance.
  2. #2
  3. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,242
    Rep Power
    2222
    For what a buffer is, read the Wikipedia article at http://en.wikipedia.org/wiki/Data_buffer.

    Basically, as data is moved around it arrives at different speeds and times, usually when the CPU is doing something else. Electronically, data is only present at a hardware input for a short and specific amount of time, so a fast-acting program (e.g., an Interrupt Service Routine (ISR)) needs to read it while it's there and put it somewhere until your program can get around to reading it. In addition, your program may not even be interested in reading it until several pieces of data have accumulated; C's console input functions, for example, typically see nothing until the Enter key has been hit. Also, data is read from and written to disk not as individual bytes but as entire sectors (anywhere from 128 to 4096 bytes, depending on the properties of the specific media), so incoming data has to be stored somewhere the program can read it from, and when a program "writes to disk" the output data needs to be held somewhere until an entire sector can actually be written.

    So what you need to do is to provide intermediate storage for that data. That intermediate storage is called a buffer. The OS uses and provides a number of buffers, the keyboard input buffer being one. Within user programs, it is customary to declare your own buffers (e.g., unsigned char arrays or char arrays). You either build in them the output that you want to write, or you have an I/O function fill them so that you can then read and extract data from them.
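
    To make the idea of a program-declared buffer concrete, here is a minimal sketch, not a definitive implementation. The helper name count_bytes and the 256-byte buffer size are assumptions chosen purely for illustration; only fopen, fread, and fclose come from the standard library.

    ```c
    #include <stdio.h>

    /* Hypothetical helper: counts the bytes in a file by reading it
       through a small buffer, one chunk at a time. */
    long count_bytes(const char *path)
    {
        unsigned char buffer[256];   /* the intermediate storage */
        size_t got;
        long total = 0;
        FILE *fp = fopen(path, "rb");

        if (fp == NULL)
            return -1;

        /* fread stores up to sizeof buffer bytes at the address we pass
           and returns how many items (here, single bytes) it stored. */
        while ((got = fread(buffer, 1, sizeof buffer, fp)) > 0)
            total += (long)got;

        fclose(fp);
        return total;
    }
    ```

    Calling count_bytes("somefile.bin") would return the file's length in bytes, or -1 if the file cannot be opened. The point is only that "buffer" here is an ordinary array whose address is handed to fread.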

    The term is also used in electronics, as in buffer amplifiers, which connect and isolate incompatible circuits.

    In general, the idea that you present would work, but your example will not.
    Code:
    typedef struct 
    {
        char *name;
        int ID;
    }Person;
    
    Person array[10];/* Let's say i already wrote the names and ID's */
    
    fread(array, sizeof(Person), 10, file_pointer); /* just guessing here */
    You are telling fread to take the memory image of array and read a block of data of that size from a disk file and to store it in array. That could only work if an identically formatted Person array of that size had been previously written to that file. It could only be guaranteed to be identically formatted if you compiled the program with the same compiler on the same platform (preferably on the exact same computer) and with the exact same optimization and other settings. The reason for that is that compilers will add padding bytes between struct fields in an attempt to optimize them by lining them up on word boundaries (it makes a difference for the hardware performing the actual memory fetches and stores). Different compilers on different computers may pad the exact same struct declaration differently. Calculate the size of the struct as the sum of the sizes of all its fields and then compare that to sizeof(Person). For fun, change the order of the fields and run that test again.
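
    The size comparison suggested above could be sketched roughly like this. The helper names (person_field_sum, person_padding, print_person_layout) are made up for the illustration, and the numbers printed depend entirely on your compiler and platform, so no particular output is claimed.

    ```c
    #include <stdio.h>
    #include <stddef.h>

    typedef struct
    {
        char *name;
        int ID;
    } Person;

    /* Sum of the field sizes, ignoring any padding the compiler adds. */
    size_t person_field_sum(void)
    {
        return sizeof(char *) + sizeof(int);
    }

    /* Number of padding bytes in Person on this platform (may be zero). */
    size_t person_padding(void)
    {
        return sizeof(Person) - person_field_sum();
    }

    /* Prints the layout so two compilers or platforms can be compared. */
    void print_person_layout(void)
    {
        printf("sum of fields : %zu\n", person_field_sum());
        printf("sizeof(Person): %zu\n", sizeof(Person));
        printf("offset of ID  : %zu\n", offsetof(Person, ID));
    }
    ```

    If the two sizes differ, the difference is padding, and a file written as a raw memory image includes those padding bytes too.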

    But your example will not work because name is a pointer. fwrite would write the entire array out to disk just as you tell it to, but what you will be saving to disk will be the addresses that the name fields are currently pointing to. Do the fread in a different program and fread will dutifully write those pointer addresses into the name fields of your array just as you tell it to. But those addresses would be garbage. The strings that the name fields used to point to are gone and the memory addresses they are pointing to might not even be within your program's memory space. Basically, they are no better than uninitialized pointers.

    The problem you have here is one of serialization (http://en.wikipedia.org/wiki/Serialization). When you read that array in from disk, you want to recreate the array that you had saved to disk. That means that you need to save the strings that those pointers used to point to so that when you read the array back in and recreate that array you will have the name pointers pointing to the same strings as before.

    Serialization is a non-trivial topic. There are several methods and a number of languages support serialization. Neither C nor C++ supports it directly, but you can devise your own method or make use of a third-party library that does provide support.

    Now, if name had been a char array, your approach of "write out one big fat blob and then plop it back in" would have worked. And trivially, if your program had written to the disk and then read back in all in one run, it might have seemed to work just because that part of the heap hadn't changed, but that's building your house on quicksand.
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    109
    Rep Power
    3
    I see.

    Ok, let's say name is not a pointer, just a regular array of chars.

    So basically what i have to do is provide a pointer to the beginning of the array of structs ??
    Code:
    typedef struct
    {
        char name[20];
        int ID;
    }Person;
    
    Person array[10]={**all the above....**};
    
    Person *ptr= &array;
    
    fread(ptr, .......);
    
    fread(array, .......); <----- THE SAME ??
    Will the above be correct ?

    Although, why wouldn't "array" work, if a name of an array acts as if it were a pointer ? doesn't "array" hold the address to the beginning of its elements ?

    Sorry if i didn't get the full idea, it's just that i only started reading about the subject so i'm kinda struggling with the smaller things.

    I can understand how the pointers will basically be CUT-OFF from the data they used to point, so only the pointers will be copied (=which are addresses).
  6. #4
  7. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,242
    Rep Power
    2222
    First, does it work or not? Try running your program. Have one program write the array to a file and then have another read it. What happens when you run the second program? I call that the "Jello Test", wherein the proof of the pudding is in the eating -- IOW, does it work or not?

    Then add some diagnostic printouts. In the program that creates the file, print out the addresses of array and of the strings that the name fields point to. In the program that "recreates" the array, do the same.

    When you execute a program, it gets loaded into memory where there is a large enough block of free memory. That means that you can never predict where your program will be loaded and hence you cannot predict where any of your data will be in terms of an absolute address (you know its relative location within your program's memory space, but you don't know where that memory space will be placed). When you malloc memory for those name fields to point to, you pull that malloc'd memory out of the heap, which is a region of memory in your memory space set aside just for that purpose; when you execute your program, you cannot predict where that heap will be located.

    So, let's say that array[0].name points to address 0x0002F800 and your heap starts at 0x0002F000. The string there is "Dennis Ritchie". You write array to disk and close the program. Please note that the only place that the actual name information, "Dennis Ritchie", resided was in the heap at location 0x0002F800; the disk file has none of that information except for the memory location it was at. Since you closed the program, that memory space is freed up for the OS to assign to another process. When you run the second program (maybe the next day after you had rebooted the computer), it gets loaded into memory in a different location than the first, so now the heap is at 0x00058000. You read the array data back in and it says that array[0].name points to 0x0002F800. What's in 0x0002F800? If that memory has been used by other processes, it's very likely that "Dennis Ritchie" has been overwritten or else just plain lost by the computer having been powered down (RAM is volatile memory, meaning that it only remembers as long as it's powered up -- "A computer's attention span is only as long as its power cord."). Even worse, 0x0002F800 does not lie within your program's memory space, so when you attempt to access location 0x0002F800 you commit an access violation and the operating system terminates you immediately with extreme prejudice and for just cause.

    What you need to do is to save the string data in the file as well and then when you read the array from the file you need to also read the string data, malloc and store them in the heap, and update the appropriate name fields with the corresponding string's address.
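
    One possible way to do that, sketched under assumptions: the file layout below (ID, then the name's length, then the name's bytes) is invented for this example, as are the names save_person, load_person, and MAX_NAME_LEN. It is still tied to one compiler and platform because it writes int and size_t in their in-memory form.

    ```c
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAX_NAME_LEN 255   /* hypothetical sanity limit */

    typedef struct
    {
        char *name;
        int ID;
    } Person;

    /* Writes ID, name length, then the name bytes; returns 1 on success. */
    int save_person(FILE *fp, const Person *p)
    {
        size_t len = strlen(p->name);

        if (len > MAX_NAME_LEN)
            return 0;
        return fwrite(&p->ID, sizeof p->ID, 1, fp) == 1
            && fwrite(&len, sizeof len, 1, fp) == 1
            && fwrite(p->name, 1, len, fp) == len;
    }

    /* Rebuilds one Person; the caller must free p->name afterwards. */
    int load_person(FILE *fp, Person *p)
    {
        size_t len;

        if (fread(&p->ID, sizeof p->ID, 1, fp) != 1
            || fread(&len, sizeof len, 1, fp) != 1
            || len > MAX_NAME_LEN)
            return 0;

        p->name = malloc(len + 1);   /* fresh heap storage for the string */
        if (p->name == NULL)
            return 0;

        if (fread(p->name, 1, len, fp) != len) {
            free(p->name);
            p->name = NULL;
            return 0;
        }
        p->name[len] = '\0';         /* the file stores no terminator */
        return 1;
    }
    ```

    Note that the pointer value itself is never written. The name field gets a brand-new address each time the record is loaded.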

    Or you could change name to a char array so that the string data would be stored within the array. Then you could write the entire array to disk and read it back again later, keeping in mind the caveats about structure padding.
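
    A minimal sketch of that char-array variant, assuming the file is written and read by the same build of the same program. The function names save_people and load_people are hypothetical.

    ```c
    #include <stdio.h>

    typedef struct
    {
        char name[20];
        int ID;
    } Person;

    /* Writes the whole array as one block; returns 1 on success. */
    int save_people(const char *path, const Person *people, size_t n)
    {
        FILE *fp = fopen(path, "wb");
        size_t written;

        if (fp == NULL)
            return 0;
        written = fwrite(people, sizeof(Person), n, fp);
        return fclose(fp) == 0 && written == n;
    }

    /* Reads up to n records into people; returns how many were read. */
    size_t load_people(const char *path, Person *people, size_t n)
    {
        FILE *fp = fopen(path, "rb");
        size_t got;

        if (fp == NULL)
            return 0;
        got = fread(people, sizeof(Person), n, fp);
        fclose(fp);
        return got;
    }
    ```

    Because the characters of each name live inside the struct, the block written to disk now contains the actual string data rather than an address.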

    Originally Posted by C Learner
    Code:
    Person *ptr= &array;
    
    fread(ptr, .......);

    Will the above be correct ?
    No, it wouldn't, as the warning about trying to assign a Person (*)[10] to a Person * would tell you. That statement should be:
    Person *ptr= array;

    PS
    You edited your message out from under me!

    Basically, the new code is correct except for the extra amper (i.e., &array) in the declaration. Remember, in most expressions array decays to a pointer of type Person * that points to its first element. &array yields the same address, but its type is Person (*)[10] (a pointer to the whole array) rather than Person * . Taking that address is not undefined behavior, but assigning it to a Person * without a cast is a type mismatch, so the compiler will complain about incompatible pointer types.
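
    The difference between array and &array can be checked with a small sketch like the following. The function names are invented for the illustration.

    ```c
    #include <stdio.h>

    typedef struct
    {
        char name[20];
        int ID;
    } Person;

    /* Returns 1 if array and &array refer to the same address. */
    int same_address(void)
    {
        Person array[10];
        Person *first = array;          /* array decays to Person *         */
        Person (*whole)[10] = &array;   /* &array points to the whole array */

        return (void *)first == (void *)whole;
    }

    /* Size of what the decayed pointer points at: one Person. */
    size_t pointee_size_decayed(void)
    {
        Person array[10];
        return sizeof *array;
    }

    /* Size of what &array points at: all ten Persons. */
    size_t pointee_size_address(void)
    {
        Person array[10];
        return sizeof *&array;
    }
    ```

    The addresses compare equal, but the two pointers disagree about how big the thing they point at is, which is why the compiler treats them as different types.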
    Last edited by dwise1_aol; August 15th, 2013 at 04:18 PM.
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    109
    Rep Power
    3
    PS- yea i confused that one a bit. What i mean is actually
    Code:
    Person *ptr= array;
    I forgot that the name array is the address itself.

    So what fread() "sees" is EXACTLY what it copies. When it sees a pointer (aka an address) it copies the address (which is just a number) and doesn't bother to think about what it points to. It takes EXACTLY what it is given; the buffer pointer is only the "starting line".

    It's like a simple worker, given the initial starting point, and starts collecting everything in its way
  10. #6
  11. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,242
    Rep Power
    2222
    Yep! Bits are bits and bytes are bytes. All it knows is that it's told to read this many bytes from the file starting at the current file position and write them as one contiguous block starting at this address. It makes absolutely no attempt to interpret what it's reading; that's somebody else's job. And when it's done, it tells you how many items (of the size you specified) it read.
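
    Since fread's return value is a count of complete items rather than bytes, a caller would typically compare it with the count requested. A rough sketch, where read_ints and its parameters are hypothetical names:

    ```c
    #include <stdio.h>

    /* Reads up to max ints from fp into out and returns how many complete
       ints were stored; sets *hit_error to 1 if the stream reported an
       error, or to 0 otherwise. */
    size_t read_ints(FILE *fp, int *out, size_t max, int *hit_error)
    {
        size_t got = fread(out, sizeof(int), max, fp);

        /* A short count means end-of-file or an error; ferror tells which. */
        *hit_error = (got < max) && ferror(fp);
        return got;
    }
    ```

    If the file holds only three ints and you ask for five, the function returns 3 with *hit_error left at 0, and the last two elements of out are untouched.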
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    109
    Rep Power
    3
    Thanks ! :)

    That really cleared things up, btw i really appreciate the extra info about file read\write it helps me acquire a better view of what is actually going on.

    I usually like to know a bit more than the topic itself, since it gives a wider view.

    I noticed that reading a chapter gives me 20% understanding, doing the exercises gives me 80% understanding, and studying the next 2-3 topics completes my understanding to 98%.
  14. #8
  15. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,242
    Rep Power
    2222
    And a pattern that I noticed in school was that I'd learn about something one semester and either didn't quite get it or I couldn't see any use for it (eg, pointers) and then the next semester that thing I couldn't quite understand made perfect sense and I'd be using it all the time.

    Another study technique you might want to use is to occasionally go back and re-read the earlier chapters. You should find that you will get a lot more out of them the second time through. And you will find things that you missed the first time.
  16. #9
  17. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    109
    Rep Power
    3
    Yep, exactly what i intended to do, since i finished most of the book, and wanted to just read the new chapters.

    The binary search function (the one that searches for an int in v[0] to v[n-1], where the values increase as you advance into the array) is also used in the FILE input\output chapter, only its syntax is harder to read because they replace simple expressions with local expressions for file I\O (at least harder for me). But still, such simple code, which searches by dividing by 2 and looking at the middle, is used everywhere.
