    #1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2010
    Posts
    41
    Rep Power
    0

    Question Read a hdf file and convert into ASCII format


    Hi,
    I have a file with an .hdf extension, and I need to read it and convert it to an ASCII format using a C program.

    The problem seems small, but I can't find a way to do it.
    Thanks
  2. #2
  3. I'm Baaaaaaack!
    Devshed God 1st Plane (5500 - 5999 posts)

    Join Date
    Jul 2003
    Location
    Maryland
    Posts
    5,538
    Rep Power
    248
    You gotsta to know what the format of the file is in order to do anything meaningful with it. Also, 'conversion to ASCII' is a totally meaningless term. ASCII is a way to interpret binary data and all binary data can be interpreted as some ASCII character already. What you really want to do is convert the machine readable hdf format (whatever that is) into a human-readable equivalent. Have you googled for an existing converter or attempted to find the data layout description?

    My blog, The Fount of Useless Information http://sol-biotech.com/wordpress/
    Free code: http://sol-biotech.com/code/.
    Secure Programming: http://sol-biotech.com/code/SecProgFAQ.html.
    Performance Programming: http://sol-biotech.com/code/PerformanceProgramming.html.
    LinkedIn Profile: http://www.linkedin.com/in/keithoxenrider

    It is not that old programmers are any smarter or code better, it is just that they have made the same stupid mistake so many times that it is second nature to fix it.
    --Me, I just made it up

    The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore, all progress depends on the unreasonable man.
    --George Bernard Shaw
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2010
    Posts
    41
    Rep Power
    0
    Originally Posted by mitakeet
    You gotsta to know what the format of the file is in order to do anything meaningful with it. Also, 'conversion to ASCII' is a totally meaningless term. ASCII is a way to interpret binary data and all binary data can be interpreted as some ASCII character already. What you really want to do is convert the machine readable hdf format (whatever that is) into a human-readable equivalent. Have you googled for an existing converter or attempted to find the data layout description?
    Thanks for your quick reply.
    The file's extension is .hdf and I want to convert it into some ASCII format (anything you suggest, even another binary form, will also work).
    I just want to know how to read the .hdf file in a C program and convert it into some other form.

    Thanks
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2010
    Posts
    41
    Rep Power
    0
    Originally Posted by mitakeet
    You gotsta to know what the format of the file is in order to do anything meaningful with it. Also, 'conversion to ASCII' is a totally meaningless term. ASCII is a way to interpret binary data and all binary data can be interpreted as some ASCII character already. What you really want to do is convert the machine readable hdf format (whatever that is) into a human-readable equivalent. Have you googled for an existing converter or attempted to find the data layout description?
    Also, I can't use an existing tool: I have a single small file and need to demonstrate this with a C program. After much hair-scratching and nail-biting, I still haven't found a way.
  8. #5
  9. I'm Baaaaaaack!
    Devshed God 1st Plane (5500 - 5999 posts)

    Join Date
    Jul 2003
    Location
    Maryland
    Posts
    5,538
    Rep Power
    248
    If you don't know the format of the file, you can't possibly have any prayer of converting it to anything in C or any other language. If you don't want to take the time to learn the format and don't want to take the time to find an existing converter, then I suggest you give up on the project as it is unlikely anyone here will provide you with such a tool.

  10. #6
  11. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,325
    Rep Power
    2228
    Originally Posted by cmaheshwari16
    Also not by using tool , i have some single small file , just to show a C programme. After hair scratching and nail biting ...didnt found the way.
    Hierarchical Data Format? Or a graphics file format? Sorry, but we don't have time to play "20 Questions" with you.

    STFW! (as you've already been told). hdf file format

    Or were you just trying to generate a hex dump of the file? In that case any hex editor would do the job. Or a utility such as xxd (my own personal choice). Or even MS-DOS' old debug command (only if I'm very desperate), which is still around as of WinXP.
    Last edited by dwise1_aol; October 22nd, 2010 at 11:31 AM.
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2010
    Posts
    41
    Rep Power
    0
    Originally Posted by mitakeet
    If you don't know the format of the file, you can't possibly have any prayer of converting it to anything in C or any other language. If you don't want to take the time to learn the format and don't want to take the time to find an existing converter, then I suggest you give up on the project as it is unlikely anyone here will provide you with such a tool.
    Code:
    If you don't know the format of the file,
    The format of the file is .hdf.

    Code:
    you can't possibly have any prayer of converting it to anything
    It's not just any format but an ASCII format; the sample converted file that I have has an .ascii extension.

    Code:
    If you don't want to take the time to learn the format

    I am ready to learn the format, but how? If the file extension is .hdf, doesn't that mean it is an HDF-format file?

    Code:
    don't want to take the time to find an existing converter
    I am not allowed to use a converter; I have to write the small conversion program myself.

    Code:
    then I suggest you give up on the project as it is unlikely anyone here will provide you with such a tool.
    I am really very sorry, but you should not say that. I do not want a ready-made tool; I just want to know how to read these special formats and convert them into some other format.



    As for simple formats like .txt, .doc, etc., we can open them directly in read/write mode in C and save some particular content from them into another file (with a different extension). Can we do the same for the above formats as well?


    Thanks very much for your reply above.
  14. #8
  15. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,325
    Rep Power
    2228
    So do you even have any clue what a HDF file is? Because there's really nothing unique about a file extension; it's only a convention that's only as good as everybody allows it to be. You mentioned .doc as a "simple" format. To most of us it means that it's a MS Word document, but over the years I've seen a number of other editors and word processors that used the exact same file extension and I can pretty much guarantee that they did not use the same file format. For that matter, even if we were to restrict .doc to Word, then it becomes a matter of which version; different versions of Word used different file formats, such that later versions had to convert older files to the new format.

    In Google'ing, I found at least two very different types of files having a .hdf extension, hierarchical data format and a graphics format. And the former example came in a variety of versions. So, yet again, do you have any clue what this HDF file is supposed to be?

    Assuming that you don't know what the file's format is, then you need to find candidate formats (ever hear the word, Google?). Then you need to know exactly what's in that file; a really good tool for that is generating a hex dump of the file, for which I already told you of some tools. Then read the hex dump according to each candidate format until you find a match. The beginning of the file should contain the file header which should contain information about the file (eg, version, header size, verification that it's actually the file format you're expecting), so that is what you should examine.
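
    The header check described above can be sketched as follows. This is a minimal sketch, not a full format detector: the function name guess_hdf_kind and its return strings are made up for illustration, and it only looks at offset 0 (the HDF5 specification also allows the signature at offsets 512, 1024, 2048 and so on, which this sketch ignores). The byte values are the published HDF5 signature and the HDF4 magic number.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compare the first bytes of a file against two
   known signatures. Only offset 0 is examined. */
const char *guess_hdf_kind(const unsigned char *head, size_t n)
{
    /* HDF5 signature: \211 H D F \r \n \032 \n */
    static const unsigned char hdf5_sig[8] =
        {0x89, 'H', 'D', 'F', 0x0d, 0x0a, 0x1a, 0x0a};
    /* HDF4 magic number: 0x0E 0x03 0x13 0x01 */
    static const unsigned char hdf4_sig[4] = {0x0e, 0x03, 0x13, 0x01};

    assert(head != NULL);
    if (n >= sizeof hdf5_sig && memcmp(head, hdf5_sig, sizeof hdf5_sig) == 0)
        return "HDF5";
    if (n >= sizeof hdf4_sig && memcmp(head, hdf4_sig, sizeof hdf4_sig) == 0)
        return "HDF4";
    return "unknown";
}
```

    To use it, fopen() the file with "rb", fread() the first 8 bytes into an unsigned char array, and pass the array and the number of bytes actually read.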

    So then, are we supposed to now also guess that you do not have any clue as to how to read that file? You simply open the file as a binary file and read a block of data into an array of unsigned char (which are bytes) and then extract the data from the proper offsets as per the file's format.
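
    Reading a block into an array of unsigned char and pulling fields out at known offsets might look like the sketch below. The helper names (read_block, get_u16_le, get_u32_le) are invented for this example, and the little-endian byte order is an assumption; the real offsets, widths and byte order have to come from the format's specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Read up to cap bytes from the start of a file opened in binary mode.
   Returns the number of bytes read, or 0 if the file cannot be opened. */
size_t read_block(const char *path, unsigned char *buf, size_t cap)
{
    FILE *fp;
    size_t n;

    assert(path != NULL && buf != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;
    n = fread(buf, 1, cap, fp);
    fclose(fp);
    return n;
}

/* Assemble a 16-bit unsigned value stored low byte first at buf[off]. */
uint16_t get_u16_le(const unsigned char *buf, size_t off)
{
    return (uint16_t)(buf[off] | (buf[off + 1] << 8));
}

/* Assemble a 32-bit unsigned value stored low byte first at buf[off]. */
uint32_t get_u32_le(const unsigned char *buf, size_t off)
{
    return (uint32_t)buf[off]
         | ((uint32_t)buf[off + 1] << 8)
         | ((uint32_t)buf[off + 2] << 16)
         | ((uint32_t)buf[off + 3] << 24);
}
```

    Building the value from individual bytes, rather than casting the buffer to an int pointer, keeps the result the same regardless of the byte order or alignment rules of the machine running the program.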

    We've done it lots of times, so we know what questions to ask you, which we've been asking.
  16. #9
  17. Banned ;)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Nov 2001
    Location
    Woodland Hills, Los Angeles County, California, USA
    Posts
    9,782
    Rep Power
    4302
    I don't think you are stating the problem clearly. Here's the deal. You can read any file with any extension in C with code like this:
    Code:
    #include <stdio.h>
    
    int main(void)
    {
        FILE *fp;
    
        fp = fopen("file.ext", "rb");
        if (fp == NULL) {
            fprintf(stderr, "Could not open file.ext\n");
            return -1;
        }
        /* ... now read the file with fgets(), fread(), fscanf() etc. ... */
        fclose(fp);
        return 0;
    }
    Of course, how you interpret the data is left to your program. That's why people make well-defined file formats. Which brings me to the main question: what do you mean by an .hdf file? What other programs read .hdf files? Once we know that, we can look up the file format and make sense of the data. If you can't answer this bit, then you're pretty much on your own.
    Up the Irons
    What Would Jimi Do? Smash amps. Burn guitar. Take the groupies home.
    "Death Before Dishonour, my Friends!!" - Bruce Dickinson, Iron Maiden Aug 20, 2005 @ OzzFest
    Down with Sharon Osbourne

    "I wouldn't hire a butcher to fix my car. I also wouldn't hire a marketing firm to build my website." - Nilpo
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2010
    Posts
    41
    Rep Power
    0
    Originally Posted by dwise1_aol
    So do you even have any clue what a HDF file is? Because there's really nothing unique about a file extension; it's only a convention that's only as good as everybody allows it to be. You mentioned .doc as a "simple" format. To most of us it means that it's a MS Word document, but over the years I've seen a number of other editors and word processors that used the exact same file extension and I can pretty much guarantee that they did not use the same file format. For that matter, even if we were to restrict .doc to Word, then it becomes a matter of which version; different versions of Word used different file formats, such that later versions had to convert older files to the new format.

    In Google'ing, I found at least two very different types of files having a .hdf extension, hierarchical data format and a graphics format. And the former example came in a variety of versions. So, yet again, do you have any clue what this HDF file is supposed to be?

    Assuming that you don't know what the file's format is, then you need to find candidate formats (ever hear the word, Google?). Then you need to know exactly what's in that file; a really good tool for that is generating a hex dump of the file, for which I already told you of some tools. Then read the hex dump according to each candidate format until you find a match. The beginning of the file should contain the file header which should contain information about the file (eg, version, header size, verification that it's actually the file format you're expecting), so that is what you should examine.

    So then, are we supposed to now also guess that you do not have any clue as to how to read that file? You simply open the file as a binary file and read a block of data into an array of unsigned char (which are bytes) and then extract the data from the proper offsets as per the file's format.

    We've done it lots of times, so we know what questions to ask you, which we've been asking.

    Thanks for all your replies. Now it's clear that the extension by itself makes no difference.
    HDF is some scientific format.
    Actually, I was confused because I cannot see the contents of the file, so I did not know which data would be converted and saved to the other file.
    Also, if I save the contents of one file to another, is it enough to give the new file the extension I want, or is there some other procedure for changing the format?

    Keeping all your points in mind, I will clarify this with my senior on the next working day.

    Then I will post what the actual task is. I am new to this, so I did not have much in-depth knowledge about it.

    Thanks for all your replies.
  20. #11
  21. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2010
    Posts
    41
    Rep Power
    0

    Now more clarity about the problem


    Hello,
    I now know the format of the .hdf file: it is HDF5.
    I have a program that converts this HDF5 file into binary.
    My objective is to convert that binary into ASCII. I can either convert directly from HDF5 to ASCII, or separately from the binary to ASCII.

    The first option is very tricky.
    While converting the HDF file into binary, the program also produces two text files. One of them gives some information about the file; here is what it contains:
    /home/chandanm/Desktop/output///S1L2B2010221_04640_04641_Wind_direction.binary Wind_direction UNSIGNED_INT16 3 860 36 6

    Below is the program I tried. I have tried many possibilities, which are now commented out.
    I would appreciate it if anyone could help.


    Code:
    #include <stdio.h>
    #include<string.h>
    #include<stdlib.h>
    int main()
    {
    FILE *fp1;
    FILE *fp2;
    char ch;
    short int num;
    int arr[36][340];
    char *buffer;
    char line[2000];
    int i,j;
    unsigned long fileLen;
    unsigned long bufferLen;
    fp1=fopen("S1L2B2010221_04640_04641_Wind_direction.binary","rb");
    fp2=fopen("out.txt","wa");
    
    if(fp1==NULL){
            fprintf(stderr,"could not open file\n");
            return -1;
    }
    
    fseek(fp1,0,SEEK_END);
    fileLen=ftell(fp1);
    fseek(fp1,0,SEEK_SET);
    
    //allocating memory
    buffer=(char *)malloc(fileLen+1);
    
    if(!buffer){
    fprintf(stderr,"memory err");
    fclose(fp1);
    return;
    }
    
    //read file contents into buffer
    fread(buffer,fileLen,1,fp1);
    fclose(fp1);
    bufferLen=sizeof(buffer);
    printf("%ul",bufferLen);
    for(i=0;i<bufferLen;++i)
    fwrite(buffer,bufferLen,1,fp2);
    
    /*
    if(!feof(fp1))
    {
    fread(&num,sizeof(int),1,fp1);
    fwrite(&num,sizeof(int),1,fp2);
    }
    else
    printf("end of file");
    */
    /*
    fread(arr,sizeof(arr),36*430,fp1);
    for(i=0;i<36;i++)
    for(j=0;j<430;j++)
    
    fwrite(&arr[i][j],sizeof(int),1,fp2);
    */
    /*while(fgets(line,sizeof line ,fp1)!=NULL)
    {
    
    fputs(line,fp2);
    }*/
    /*
    while(!feof(fp1))
    {
    fread(&num,sizeof(num),1,fp1);    //for reading integer
    fwrite(&num,sizeof(num),1,fp2);
    //fputc(num,fp2);                   //for writing integer 
    //ch=' ';
    //fread(&ch,sizeof(ch),1,fp1);    //for reading char
    //fputc(ch,fp2);
    //ch=' ';
    //fputc(ch,fp2);
    //printf("%c",ch);
    }
    */
    fclose(fp1);
    fclose(fp2);
    FILE *fp3=fopen("out.txt","ra");
    while(!feof(fp3))
    {
    fget(fp3);
    printf("%c",ch);
    }
    }

  22. #12
  23. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,325
    Rep Power
    2228
    Thank you for having used code tags to post the program. However, the reason for using code tags is to preserve indentation! Posting unindented code within code tags can very easily be seen as an intentional insult.

    Here is your code properly formatted (I hope; trying to make sense of unformatted code can be tricky, which is why the author needs to format it himself), so that others might be more inclined to look at it -- when asking for free help, it is a very bad idea to make helping you much more difficult than it needs to be:

    Code:
    #include <stdio.h>
    #include<string.h>
    #include<stdlib.h>
    
    int main()
    {
        FILE *fp1;
        FILE *fp2;
        char ch;
        short int num;
        int arr[36][340];
        char *buffer;
        char line[2000];
        int i,j;
        unsigned long fileLen;
        unsigned long bufferLen;
        
        fp1=fopen("S1L2B2010221_04640_04641_Wind_direction.binary","rb");
        fp2=fopen("out.txt","wa");
    
        if(fp1==NULL)
        {
            fprintf(stderr,"could not open file\n");
            return -1;
        }
    
        fseek(fp1,0,SEEK_END);
        fileLen=ftell(fp1);
        fseek(fp1,0,SEEK_SET);
    
        //allocating memory
        buffer=(char *)malloc(fileLen+1);
    
        if(!buffer)
        {
            fprintf(stderr,"memory err");
            fclose(fp1);
            return;
        }
    
        //read file contents into buffer
        fread(buffer,fileLen,1,fp1);
        fclose(fp1);
        bufferLen=sizeof(buffer);
        printf("%ul",bufferLen);
        for(i=0;i<bufferLen;++i)
            fwrite(buffer,bufferLen,1,fp2);
    
        /*
        if(!feof(fp1))
        {
            fread(&num,sizeof(int),1,fp1);
            fwrite(&num,sizeof(int),1,fp2);
        }
        else
            printf("end of file");
        */
        /*
        fread(arr,sizeof(arr),36*430,fp1);
        for(i=0;i<36;i++)
            for(j=0;j<430;j++)
    
                fwrite(&arr[i][j],sizeof(int),1,fp2);
        */
        /*while(fgets(line,sizeof line ,fp1)!=NULL)
        {
    
            fputs(line,fp2);
        }*/
        /*
        while(!feof(fp1))
        {
            fread(&num,sizeof(num),1,fp1);    //for reading integer
            fwrite(&num,sizeof(num),1,fp2);
            //fputc(num,fp2);                   //for writing integer 
            //ch=' ';
            //fread(&ch,sizeof(ch),1,fp1);    //for reading char
            //fputc(ch,fp2);
            //ch=' ';
            //fputc(ch,fp2);
            //printf("%c",ch);
        }
        */
        fclose(fp1);
        fclose(fp2);
    
        FILE *fp3=fopen("out.txt","ra");
        while(!feof(fp3))
        {
            fget(fp3);
            printf("%c",ch);
        }
    }
    I'm confused on a few things. You say:
    Now i have the format of the hdf its an HDF5 format.
    i have one program that converts this HDF5 into binary.
    my objective is to convert that binary into ASCII. I can convert it directly into ASCII from HDF5 or individually from binary to ASCII.
    What does it mean to "convert the HDF into binary"? Isn't HDF5 already a binary format? What is the new format of this binary output?

    Now, I assume that by "convert that binary into ASCII" you mean that you want to print out in text what the values of the binary fields are. That would be accomplished by opening a text file for output and using fprintf to write to it. But that's not what I'm seeing.

    The input file, fp1, is being opened for "read" and as a binary file; that is OK. The output file, fp2, is being opened for "write" and "append" ("wa"), which is not a valid combination: "w" truncates the file and "a" appends to it, so you must pick one. The C standard leaves the behavior of an unrecognized mode string undefined, so what actually happens to out.txt depends on your library.

    But there's some confusion as to whether out.txt is a text file or a binary. My understanding is that in UNIX/Linux, not specifying binary automatically makes it text. But my experience is in MS-DOS/Windows, where we either explicitly specify text or binary or else that will depend on the setting of a system variable, fmode, which may or may not also exist in UNIX (I don't know about that). From the pathname of the input file, I assume that you're using UNIX or Linux.

    out.txt should be a text file, but the problem I see is that the program is treating it as a binary file:
    Code:
        //read file contents into buffer
        fread(buffer,fileLen,1,fp1);
        fclose(fp1);
        bufferLen=sizeof(buffer);
        printf("%ul",bufferLen);
        for(i=0;i<bufferLen;++i)
            fwrite(buffer,bufferLen,1,fp2);
    The fread reads in the entire contents of the input file, which is a binary file. But then fwrite does the reverse, writing the entire contents back out to the output file, out.txt, exactly as one would do with a binary file. Absolutely no translation of the input file's binary contents has been performed.

    Furthermore, the exact same entire contents of the input file is being written out multiple times. How many multiple times? How big is the input file? That many times. If you do a directory listing, you should find that the size of the output file, out.txt, is the square of the size of the input file. And if you examine a hex dump of out.txt and compare it with a hex dump of the input file, you should see the input file's entire contents being repeated in out.txt. I do not understand why anyone would want to generate an output file like that.
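
    A side note on that loop: buffer is declared as char *, so sizeof(buffer) yields the size of the pointer itself (typically 4 or 8 bytes), not the number of bytes read from the file. With 8-byte pointers the loop therefore writes the first 8 bytes of the buffer 8 times, 64 bytes in total. The file length is already available in fileLen. A minimal sketch of the difference, with both function names invented for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* What sizeof reports for a char* variable: the pointer itself, not the data. */
size_t size_of_char_pointer(void)
{
    char *buffer = NULL;
    return sizeof(buffer);      /* same value as sizeof(char *) */
}

/* Write len bytes of buffer exactly once.
   Returns 1 on success, 0 on a short write. */
int write_once(FILE *out, const char *buffer, size_t len)
{
    assert(out != NULL && buffer != NULL);
    return fwrite(buffer, 1, len, out) == len;
}
```

    Calling write_once(fp2, buffer, fileLen) would copy the input exactly once, but it would still be a raw byte-for-byte copy with no conversion to text.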

    You have looked up the HDF5 format, haven't you? What a viewer program needs to do is to fread() some of the contents of the file into a buffer and then extract the contents of the desired fields into variables which you can then either use to access the data or else convert to ASCII and print to an output text file. For example, if you extract an integer value, you can convert it to ASCII by either using sprintf() to store it into a char buffer that you later fputs() to a text file, or else you fprintf() it directly to the text file -- please note that sprintf() and fprintf() work almost exactly like printf(), except for where the output goes (sprintf to a string, fprintf to a file, and printf to the standard output stream, stdout).
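
    As a rough illustration of the fprintf() route, here is a sketch that reads 16-bit unsigned values and writes them out as decimal text. It assumes the .binary file is nothing but a flat sequence of 16-bit unsigned integers in the machine's native byte order, which has not been confirmed in this thread. The function name dump_u16_as_text and the per_line parameter are made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Read 16-bit unsigned values from `in` (opened "rb") and write them as
   decimal text to `out` (opened "w"), per_line values on each row.
   Returns the number of values converted.
   ASSUMPTION: the data is a flat array in native byte order. */
long dump_u16_as_text(FILE *in, FILE *out, int per_line)
{
    uint16_t value;
    long count = 0;

    assert(in != NULL && out != NULL && per_line > 0);
    while (fread(&value, sizeof value, 1, in) == 1) {
        if (count > 0)                      /* separator before every value but the first */
            fputc((count % per_line == 0) ? '\n' : ' ', out);
        fprintf(out, "%u", (unsigned)value); /* binary value -> decimal digits */
        ++count;
    }
    if (count > 0)
        fputc('\n', out);
    return count;
}
```

    The conversion itself is the single fprintf() call: the two bytes read into value are turned into the characters of a decimal number. Everything else is bookkeeping for spaces and line breaks.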

    The commented-out code does nothing at all to deal with the format of a HDF5 file, in which data is stored in trees and as objects. Instead, that code purports to read and output individual data types, but it still outputs the binary values read in as binary, not converted to text. This program you got knows nothing at all about HDF5 files, which you should have immediately realized when you obtained a copy of the HDF5 format specification. You did obtain one, didn't you? Eg, HDF Group - HDF5 or HDF5 File Format Specification Version 2.0

    Why don't you just use an existing HDF5 viewer application?


    PS

    This code at the end of the program makes me wonder about the author:
    Code:
        FILE *fp3=fopen("out.txt","ra");
        while(!feof(fp3))
        {
            fget(fp3);
            printf("%c",ch);
        }
    Opening the file for read and append? That makes absolutely no sense at all! You append to an output file, but not to an input file. Did he think that "a" meant "text"? If so, then that is indicative of his incompetence. Furthermore, he's reading that input file byte-by-byte and writing each byte out as a character, but since out.txt contains binary data many of those bytes will not contain ASCII codes for printable characters; when you run that program (if you can even get it to compile) you will notice that your computer will beep at you at times, which happens when you try to print a byte containing the value of 7, AKA "bell".
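
    For comparison, a byte-by-byte read loop that does compile could look like the sketch below. The standard function is fgetc() (there is no fget()), its result has to be stored before it is printed, and testing the return value against EOF avoids the extra pass that while(!feof(...)) makes. The function name print_safely and the choice of '.' for non-printable bytes are illustrative only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Copy `in` to `out` one byte at a time, replacing non-printable bytes
   (such as 7, the bell) with '.'. Returns the number of bytes read. */
long print_safely(FILE *in, FILE *out)
{
    int c;              /* int, not char, so EOF can be told apart from data */
    long count = 0;

    assert(in != NULL && out != NULL);
    while ((c = fgetc(in)) != EOF) {
        fputc(isprint(c) ? c : '.', out);
        ++count;
    }
    return count;
}
```

    Passing stdout as the second argument prints to the terminal without the beeps.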
    Last edited by dwise1_aol; October 27th, 2010 at 11:58 AM.
  24. #13
  25. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2010
    Posts
    41
    Rep Power
    0
    Thanks for noticing all the errors......


    Code:
    I'm confused on a few things. You say:
    Quote:
    Now i have the format of the hdf its an HDF5 format.
    i have one program that converts this HDF5 into binary.
    my objective is to convert that binary into ASCII. I can convert it directly into ASCII from HDF5 or individually from binary to ASCII.
    Now i have the format of the hdf its an HDF5 format.

    The file being in HDF5 format means it has binary content that holds some information.
    That information is being extracted and saved into another binary file.

    i have one program that converts this HDF5 into binary.

    I have a C program that converts this raw HDF file into another file with a .binary extension.
    It has some information in it, but I cannot view it.

    my objective is to convert that binary into ASCII. I can convert it directly into ASCII from HDF5 or individually from binary to ASCII

    My aim is to get a text file so that we can read what is in that binary file. The .hdf5 file could also be converted directly into a readable (.ascii) form, but that is proving difficult. So I am thinking of just converting the binary file (produced from the .h5 file by the C program) into ASCII.

    So ultimately what we have to do is convert the binary file to ASCII (human-readable text).

    /home/chandanm/Desktop/output///ABC.binary Wind_direction UNSIGNED_INT16 3 860 36 6

    Can you get some information about the file from the bold text? (This line comes from the converter's output, which consists of some text files and some binary files.)

    The output file, fp2, is being opened for "write" and "append"

    Well, that's my mistake. I have also tried without the "a"; I thought "a" stood for ASCII and just tried it blindly. It also does not work when opening with only "w". Yes, you are right that I am working on Linux.
    So how do I open a text file in write mode?


    Code:
    for(i=0;i<bufferLen;++i)
            fwrite(buffer,bufferLen,1,fp2);
    Code:
    The fread reads in the entire contents of the input file, which is a binary file. But then fwrite does the reverse, writing the entire contents back out to the output file, out.txt, exactly as one would do with a binary file. Absolutely no translation of the input file's binary contents has been performed.


    How do I write the content in text format? Is there a direct function for that, or do we have to modify something?

    Code:
    Furthermore, the exact same entire contents of the input file is being written out multiple times. How many multiple times? How big is the input file? That many times. If you do a directory listing, you should find that the size of the output file, out.txt, is the square of the size of the input file.


    I cannot understand this. Do you mean that I am writing the same contents twice to my output file? If yes, where?

    Code:
    The commented-out code does nothing at all to deal with the format of a HDF5 file, in which data is stored in trees and as objects. Instead, that code purports to read and output individual data types, but it still outputs the binary values read in as binary, not converted to text.


    How do I convert it? That is the problem.

    Code:
    FILE *fp3=fopen("out.txt","ra");
        while(!feof(fp3))
        {
            fget(fp3);
            printf("%c",ch);
        }
    Opening the file for read and append?


    Yeah, you are right, that really makes no sense. I will change it accordingly.



    So with the above code I am reading the file correctly in binary mode (reading the whole file at once).
    Now, how do I process this buffer (the file's contents) and store it in a human-readable text file?

    Appreciate your help.
  26. #14
  27. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,325
    Rep Power
    2228
    Please use quote tags to quote text. That allows the text to wrap around and remain readable. Misusing code tags to quote text removes that wrap-around feature and renders the quoted text unreadable. Go back and read your reply to me to see what I mean. I'm substituting the correct tags below.


    Thanks for clearing up the confusion. I thought you were presenting the .binary file as the original. That is good that you have a program to extract the data from the HDF5 file, because writing that yourself would not be a trivial task.

    However, that returns us to the problem that I had mentioned:
    Originally Posted by dwise1_aol
    What is the new format of this binary output?
    That is by no means a trivial question. You inform us:
    Originally Posted by cmaheshwari16
    /home/chandanm/Desktop/output///ABC.binary Wind_direction UNSIGNED_INT16 3 860 36 6

    can u find some information of the file from the bold character(this property is from the converted output in which we are getting some text files and some binary output files)
    So then the format is encoded into the program's output. OK, but what does it mean? As best as I can decipher it, the data is in 16-bit unsigned ints, but I have no clue what the following numbers mean. Did you look in the documentation that accompanied the program? Or a decent programmer will include a help option for displaying information on using the program: usually -h or --help . If it's not there, then look for a comment in the source code that tells us. If there is no comment, then search the code for the line that outputs that information and see what variables those numbers come from; even a slightly-better-than-lousy programmer will give his variables meaningful names, especially in a project with the scope and complexity of this extraction program. Lacking even that, trace the variables back into the code to see how their values get assigned.
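
    One way to test a guess about those numbers before digging through the source is sketched below. It is purely a guess that "3" is a dimension count and "860 36 6" are the dimensions; nothing in this thread confirms that. The struct and both function names are hypothetical. The idea is to split the info line into fields with sscanf() and see whether the implied byte count matches the actual size of the .binary file.

```c
#include <assert.h>
#include <stdio.h>

struct info_line {
    char path[512];
    char name[64];
    char type[32];
    long n[4];      /* the four trailing numbers; their meaning is undocumented here */
};

/* Split one line of the info file into fields.
   Returns 1 if all seven fields were found, 0 otherwise. */
int parse_info_line(const char *line, struct info_line *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, "%511s %63s %31s %ld %ld %ld %ld",
                  out->path, out->name, out->type,
                  &out->n[0], &out->n[1], &out->n[2], &out->n[3]) == 7;
}

/* GUESS: if n[0] is a dimension count and n[1..3] are the dimensions,
   a file of 16-bit values would hold this many bytes. */
long expected_bytes_if_dims(const struct info_line *info)
{
    assert(info != NULL);
    return info->n[1] * info->n[2] * info->n[3] * 2L;
}
```

    For the line quoted above this gives 860 * 36 * 6 * 2 = 371520 bytes. If the file size obtained with fseek()/ftell() matches, the guess is at least plausible; if it does not, the extraction program's source remains the place to look.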

    The bottom line on this question is that you have all the resources to answer it, whereas we have none. You will have to look for it; we can only offer suggestions of where to look.

    Originally Posted by cmaheshwari16
    well its my mistake I have tried without [a] also, i thought of [a] for ASCII i just tried blindly. Well its also not working when opening with only "w". Yes u r right that i am working on Linux.
    You guessed? In programming, never guess; always know! After all, when you read the man page for fopen(), you would have immediately seen that "a" meant "append", right? Whenever you are going to use or modify a standard library function call, always be sure that you know what it does and how to use it. All that information is in its man page or whatever Linux provides you nowadays (I remember an info utility having come into use about a decade ago). Always remember RTFM ("Read the Manual!").

    Originally Posted by cmaheshwari16
    So how do I open a text file in write mode?
    I had already covered this (bold added here):
    Originally Posted by dwise1_aol
    My understanding is that in UNIX/Linux, not specifying binary automatically makes it text. But my experience is in MS-DOS/Windows, where we either explicitly specify text or binary or else that will depend on the setting of a system variable, fmode, which may or may not also exist in UNIX (I don't know about that).
    "b" means binary. In MS-DOS/Windows (where most of my experience lies), fopen was extended to also allow "t" for "text". Nowhere in the Linux documentation for fopen have I found a "t" mode; instead, the file defaults to text if you do not tell it "b".

    But then once you do open the file as "text", you need to treat it like a text file, not like a binary file as you had been doing.
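
    A minimal sketch of the point above, assuming a Linux system where a mode string without "b" gives a text stream: open the file with "w" and write formatted text to it. The function name write_value_as_text is made up for illustration.

```c
#include <stdio.h>

/* Opens `path` as a text file for writing ("w", no "b") and writes
   `value` to it as one line of decimal text.
   Returns 0 on success, -1 on failure. */
int write_value_as_text(const char *path, int value)
{
    FILE *out = fopen(path, "w");
    if (out == NULL) {
        perror(path);               /* report why the open failed */
        return -1;
    }
    fprintf(out, "%d\n", value);    /* the int becomes ASCII digits here */
    return fclose(out) == 0 ? 0 : -1;
}
```

    Calling write_value_as_text("out.txt", 4660) should leave a file that contains the single line 4660.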

    Originally Posted by cmaheshwari16
    How do I write the content in text format? Is there a direct function for that, or do we have to make some modifications?

    . . .

    How to convert? That is the problem.

    . . .

    So with the above code I am reading the file correctly in binary mode [reading the whole file at a time].
    Now how do I process this buffer [file output] and store it in a human-readable text file?
    As I had written, more than once I believe (I cannot tell whether you had quoted it in your reply, since you had rendered that unreadable):
    Originally Posted by dwise1_aol
    For example, if you extract an integer value, you can convert it to ASCII by either using sprintf() to store it into a char buffer that you later fputs() to a text file, or else you fprintf() it directly to the text file -- please note that sprintf() and fprintf() work almost exactly like printf(), except for where the output goes (sprintf to a string, fprintf to a file, and printf to the standard output stream, stdout).
    C does not perform any automatic conversions, so simply reading in binary and writing it to a text file will not work; all you will get is the same thing written out to the text file. For another thing, there is absolutely nothing whatsoever inherent about binary data that will tell you what kind of data it is -- bits are bits and bytes are bytes. The only thing that gives any binary data any identity as to what kind of data it is is how the program handles it. IOW, you tell your program to treat data as integer and it will, but there's nothing about that data which can tell the program what data type it is. That is why part of the information in the original HDF5 file is a description of the data and what type and size it is. Your program cannot simply read any binary data and know how to convert it; you the programmer must know what kind of binary data you're dealing with and how to convert it and then you must tell the computer how to convert it.
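
    To make the conversion step concrete, here is a small sketch of turning an integer that is already in memory into text. It uses snprintf(), the bounds-checked C99 relative of the sprintf() mentioned above; the function name int_to_text is made up for illustration.

```c
#include <stdio.h>

/* Formats `value` as decimal text into `text`, which holds `size` bytes.
   Works like printf("%d", value), except the output goes to a string.
   Returns the number of characters written (not counting the
   terminating '\0'), or -1 if the text did not fit. */
int int_to_text(char *text, size_t size, int value)
{
    int n = snprintf(text, size, "%d", value);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

    The resulting string could then be handed to fputs(); calling fprintf() on the output file directly does both steps at once.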

    Along with the data type are two other variable properties of binary data: size and endianness (AKA "byte order").

    For example, the size of int is implementation-dependent, which means that it can be different on different systems (which is one reason why we need the sizeof operator). On older 16-bit PCs, int was 16 bits long; now it's usually 32 bits long, and that is still the case on most 64-bit systems, where it is typically pointers (and, on Linux, long) that are 64 bits long. So just knowing that you've got ints in a binary file is not enough; you need to know how many bytes are in each int. As it turns out, your extraction program gave us that information, "INT16", meaning two bytes (8 bits per byte).
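
    One way to see the size issue on your own machine is to print a few sizes. This is only a sketch: the function name print_integer_sizes is made up, and the fixed-width types int16_t and uint16_t come from the C99 header <stdint.h>, which the code discussed in this thread does not use.

```c
#include <stdint.h>
#include <stdio.h>

/* Prints the size in bytes of a few integer types to `out`.
   short, int and long depend on the compiler and platform;
   int16_t and uint16_t are exactly 16 bits wherever they exist. */
void print_integer_sizes(FILE *out)
{
    fprintf(out, "short    : %zu bytes\n", sizeof(short));
    fprintf(out, "int      : %zu bytes\n", sizeof(int));
    fprintf(out, "long     : %zu bytes\n", sizeof(long));
    fprintf(out, "int16_t  : %zu bytes\n", sizeof(int16_t));
    fprintf(out, "uint16_t : %zu bytes\n", sizeof(uint16_t));
}
```

    Calling print_integer_sizes(stdout) shows what your compiler uses. For file data labelled INT16 or UNSIGNED_INT16, the fixed-width types avoid having to guess.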

    But is the first byte the most significant byte (MSB) or the least (LSB)? For example, if our int's value is 0x1234 (4660 decimal) and it's stored in the file MSB-first (AKA "big endian") and we read it in LSB-first (AKA "little-endian", which is what an Intel-based platform is), then we will mis-read 0x1234 as 0x3412 (13330 decimal). 4660 is not the same as 13330.
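
    The 0x1234 example can be checked with two small helpers that build a 16-bit value from the same two bytes in either order. The helper names are made up for illustration.

```c
#include <stdint.h>

/* Interprets two bytes as a 16-bit value stored MSB-first (big-endian). */
uint16_t from_big_endian(const unsigned char b[2])
{
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Interprets the same two bytes as stored LSB-first (little-endian). */
uint16_t from_little_endian(const unsigned char b[2])
{
    return (uint16_t)(b[0] | (b[1] << 8));
}
```

    Given the bytes 0x12 0x34 in that order, from_big_endian() yields 0x1234 (4660) and from_little_endian() yields 0x3412 (13330).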

    Here's a little test program that explores this, along with providing some techniques for building integers out of a byte stream (something I've been doing for a living for decades):
    Code:
    #include <stdio.h> 
    
    //  Little-endian means that the least significant byte (LSB) comes first,
    //  Big-endian means that the most significant byte (MSB) comes first.
    //  See Wikipedia article at http://en.wikipedia.org/wiki/Endianness
    #define LITTLE_ENDIAN   // #define for little-endian case
    //#undef LITTLE_ENDIAN   // #undef for big-endian case
    
    int main()
    {
        FILE *filePtr;
        unsigned char buffer[10];
        int i;
        short n;
    
        filePtr = fopen("test","wb");
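        for (i = 1 ; i <= 10 ; i++)
        {
    #ifdef LITTLE_ENDIAN
            // write the LSB first; building the bytes explicitly works
            //   no matter what the computer's own endianness is
            buffer[0] = (unsigned char)(i & 0x00FF);
            buffer[1] = (unsigned char)((i >> 8) & 0x00FF);
    #else
            buffer[0] = (unsigned char)((i >> 8) & 0x00FF);
            buffer[1] = (unsigned char)(i & 0x00FF);
    #endif    
            fwrite(buffer,sizeof(short),1,filePtr);
        }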
        fclose(filePtr);
    
        filePtr = fopen("test","rb");
        for (i = 0 ; i < 10 ; i++)
        {
    #ifdef LITTLE_ENDIAN
            fread(buffer, sizeof(short), 1, filePtr);
            n = (short)buffer[0] & 0x00FF;
            n |= ((short)buffer[1] << 8) & 0xFF00;
            
            // or could fread each value directly into the variable,
            //   but ONLY if the file's endianness matches the computer's
            // fread(&n, sizeof(short), 1, filePtr);
    #else
            fread(buffer, sizeof(short), 1, filePtr);
            n = ((short)buffer[0] << 8) & 0xFF00;
            n |= (short)buffer[1] & 0x00FF;
    #endif    
            printf("%d\n", n);
        }
        fclose(filePtr);
        
        return 0;
    }
    The program writes the integers 1 through 10 to a binary file, then reads them back in and prints them out to the display (you could just as easily have fprintf'd them out to a text file -- HINT, HINT).

    Here's a hex dump of the file produced by the big-endian case:
    Code:
    0000000: 0001 0002 0003 0004 0005 0006 0007 0008  ................
    0000010: 0009 000a                                ....
    And here's a hex dump of the file produced by the little-endian case:
    Code:
    0000000: 0100 0200 0300 0400 0500 0600 0700 0800  ................
    0000010: 0900 0a00                                ....
    As you can see, the bytes of the little-endian ints are in the reverse of the order we would expect. You will also see that the manner in which I wrote them out to the binary file and read them back in was different depending on the endianness of the data in that binary file. And in both cases, this is what is written out to the display:
    Code:
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    So what's the endianness of the INT16s in that .binary file? You could look at a hex dump of it to find out.

    And now you have an example of reading from a binary file and converting it to ASCII output.
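
    If no hex dump utility is at hand, a small one can be sketched in C. This is an illustration only: the function name hex_dump and its output layout (offset, then up to 16 bytes per line) are my own choices and do not match the dumps shown above byte for byte.

```c
#include <stdio.h>

/* Prints up to `max_bytes` bytes read from `in` as hex to `out`,
   16 bytes per line, each line prefixed with its offset in the file.
   Returns the number of bytes dumped. */
size_t hex_dump(FILE *in, FILE *out, size_t max_bytes)
{
    unsigned char row[16];
    size_t total = 0;

    while (total < max_bytes) {
        size_t want = max_bytes - total;
        if (want > sizeof row)
            want = sizeof row;

        size_t got = fread(row, 1, want, in);
        if (got == 0)
            break;                      /* end of file or read error */

        fprintf(out, "%07zx:", total);
        for (size_t i = 0; i < got; i++)
            fprintf(out, " %02x", row[i]);
        fputc('\n', out);

        total += got;
    }
    return total;
}
```

    For a file that starts with the values 1 and 2 stored as little-endian 16-bit integers, the first line would begin 0000000: 01 00 02 00; stored big-endian, the same values would show up as 00 01 00 02.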


    Now just because the question was raised (inserting quote from my original message):
    Originally Posted by cmaheshwari16

    Originally Posted by dwise1_aol
    Code:
        //read file contents into buffer
        fread(buffer,fileLen,1,fp1);
        fclose(fp1);
        bufferLen=sizeof(buffer);
        printf("%ul",bufferLen);
        for(i=0;i<bufferLen;++i)
            fwrite(buffer,bufferLen,1,fp2);
    The fread reads in the entire contents of the input file, which is a binary file. But then fwrite does the reverse, writing the entire contents back out to the output file, out.txt, exactly as one would do with a binary file. Absolutely no translation of the input file's binary contents has been performed.

    Furthermore, the exact same entire contents of the input file is being written out multiple times. How many multiple times? How big is the input file? That many times. If you do a directory listing, you should find that the size of the output file, out.txt, is the square of the size of the input file. And if you examine a hex dump of out.txt and compare it with a hex dump of the input file, you should see the input file's entire contents being repeated in out.txt. I do not understand why anyone would want to generate an output file like that.
    I cannot understand this line, do u want to say that i am writing the same contents two times in my output file if yes then where?
    Let me add comments to your code to help explain:
    Code:
        //read file contents into buffer
        fread(buffer,fileLen,1,fp1);
        // buffer now contains the entire contents of the file
        //  the number of bytes of data in the buffer is fileLen
        fclose(fp1);
        // hopefully setting bufferLen to the same value
        //   as fileLen; no known need for introducing this variable
        //  Indeed, this introduces a bug, since buffer is a pointer
        //    and the size of a pointer is typically 4 or 8 bytes
        //     (I hadn't noticed this bug before).
        bufferLen=sizeof(buffer);
        // NOTE: "%ul" prints an unsigned int followed by a literal 'l';
        //   "%lu" (unsigned long) or "%zu" (size_t) was probably intended.
        printf("%ul",bufferLen);   // WHAT VALUE GOT DISPLAYED?
    
        // Now output the entire contents of the file
        //    and do that bufferLen times.
        // If bufferLen == fileLen, then the output file will be 
        //     fileLen-squared in size.
        //  But if bufferLen == 4 (or whatever got displayed), 
        //    then you would be writing out the first 4 bytes
        //    of the buffer 4 times over, such that the size
        //    of the output file would be 16 (or the square of
        //    whatever value bufferLen turns out to have been set
        //    to).  Please note that only the first four bytes in
        //    buffer would ever be written out; all the remaining
        //    data is ignored.
        for(i=0;i<bufferLen;++i)
            fwrite(buffer,bufferLen,1,fp2);
    As I say, now I see that sizeof(buffer) bug, which changes the details of my analysis, but you're still looping for no perceivable reason. What were you trying to accomplish with that code?
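
    For comparison, here is a sketch of the same read-then-write step without the sizeof and loop problems: the byte count is kept in its own variable and fwrite() is called exactly once. The function name copy_file_once is made up, and finding the length with fseek()/ftell() is an assumption about how fileLen was obtained. Note that this still only copies the bytes unchanged; it does no conversion to text.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads the whole of `in_path` into a heap buffer and writes it out
   once to `out_path`. Returns the number of bytes copied, or -1 on error. */
long copy_file_once(const char *in_path, const char *out_path)
{
    FILE *in = fopen(in_path, "rb");
    if (in == NULL)
        return -1;

    /* Find the file length by seeking to the end. */
    if (fseek(in, 0, SEEK_END) != 0) { fclose(in); return -1; }
    long file_len = ftell(in);
    if (file_len < 0) { fclose(in); return -1; }
    rewind(in);

    unsigned char *buffer = malloc(file_len > 0 ? (size_t)file_len : 1);
    if (buffer == NULL) { fclose(in); return -1; }

    /* sizeof(buffer) would be the size of the pointer here, not the
       amount of data, so the count returned by fread() is kept instead. */
    size_t got = fread(buffer, 1, (size_t)file_len, in);
    fclose(in);

    FILE *out = fopen(out_path, "wb");
    if (out == NULL) { free(buffer); return -1; }

    size_t put = fwrite(buffer, 1, got, out);   /* one call, no loop */
    free(buffer);

    if (fclose(out) != 0 || put != got)
        return -1;
    return (long)put;
}
```

    With this version the output file should be exactly as large as the input file, rather than a multiple of it.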

    Comments on this post

    • Joseph Taylor agrees
    Last edited by dwise1_aol; October 28th, 2010 at 12:53 PM.
  28. #15
  29. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2010
    Posts
    41
    Rep Power
    0

    This is the final thing that I can show.


    Thanks for your reply.

    See the MATLAB program below for the conversion of the binary file. [The description comes from the output of the program.]

    The generated text file contains this information [there are some more files as well]:

    /home/output///ABC_04640_04641_Lat.binary Lat INT16 2 860 36 0

    /home/output///AB_2010221_04640_04641_Long.binary Long UNSIGNED_INT16 2 860 36 0

    Code:
    % Build the latitude file name and open it; 'l' selects little-endian byte order
    file=['Data/Binary/ABC',num2str(year,'%4.4i'),num2str(day,'%3.3i'),'_',num2str(dummy,'%5.5i'),'_',num2str(dummy+1,'%5.5i'),'_Lat.binary'];
    fid_lat = fopen(file,'r','l');
    
    if fid_lat>0
    
        file=['Data/Binary/F1',num2str(year,'%4.4i'),num2str(day,'%3.3i'),'_',num2str(dummy,'%5.5i'),'_',num2str(dummy+1,'%5.5i'),'_Lon.binary'];
        fid_lon = fopen(file,'r','l');
    
    
        for swath=1:2
    
            % each pass reads 36*430 16-bit values and scales them by 1/100
            lat = fread(fid_lat,[36,430],'int16');
            lat=lat/100.;
    
            lon = fread(fid_lon,[36,430],'uint16');
            lon=lon/100.;
    
            % one row per value: [lon lat]
            H=[reshape(lon,36*430,1) reshape(lat,36*430,1) ];
    
            [fid,errmsg] = fopen(['Data/',num2str(time(215,1:4),'%4.4i'),'/OS_L2B_',num2str(time(215,1:4),'%4.4i'),num2str(time(215,6:8),'%3.3i'),num2str(time(215,10:11),'%2.2i'),num2str(time(215,13:14),'%2.2i'),'.dat'],'w');
    
            fprintf(fid,'%9.2f %8.2f \n',H');
            fclose(fid);
    
        end   %%% swath=1,2
    
        fclose(fid_lat);
        fclose(fid_lon);
    end   %%% if fid_lat>0
    So the generated output file contains lines like these:

    -9999.00 -9999.00 -9999.00 -9999.00 0 093090116
    -9999.00 -9999.00 -9999.00 -9999.00 0 093090116
    |
    |
    |
    |

    the bolded values are from other binary files.

    If you can work out the format of the binary file from this MATLAB program and convert it into a C program, I would appreciate it very much.
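
    A rough C sketch of what the MATLAB code above appears to do, under these assumptions: both files hold 16-bit values in little-endian order (the 'l' argument to fopen), Lat is signed and Long is unsigned (as in the listing), each value is scaled by 1/100, and the output is one "lon lat" pair per line. The function names are made up, and the loop simply runs until either file ends rather than reproducing the two 36 x 430 swaths.

```c
#include <stdint.h>
#include <stdio.h>

/* Reads one 16-bit little-endian value from `in`.
   Returns 1 on success, 0 at end of file or on error. */
static int read_u16_le(FILE *in, uint16_t *value)
{
    unsigned char b[2];
    if (fread(b, 1, 2, in) != 2)
        return 0;
    *value = (uint16_t)(b[0] | (b[1] << 8));
    return 1;
}

/* Reads latitude (signed) and longitude (unsigned) values in step from
   two binary streams and writes one "lon lat" text line per pair.
   Returns the number of lines written. */
long write_lon_lat(FILE *lat_in, FILE *lon_in, FILE *out)
{
    uint16_t raw_lat, raw_lon;
    long lines = 0;

    while (read_u16_le(lat_in, &raw_lat) && read_u16_le(lon_in, &raw_lon)) {
        /* Reinterpret the latitude bits as a signed 16-bit value. */
        long signed_lat = raw_lat >= 0x8000u ? (long)raw_lat - 65536L
                                             : (long)raw_lat;
        double lat = signed_lat / 100.0;
        double lon = raw_lon / 100.0;

        /* Same format string as the MATLAB fprintf call. */
        fprintf(out, "%9.2f %8.2f \n", lon, lat);
        lines++;
    }
    return lines;
}
```

    The caller would open the two .binary files with "rb" and the output file with "w", then pass the three streams in. Whether the real files match these assumptions is best checked against a hex dump of the first few values.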