#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2013
    Posts
    11
    Rep Power
    0

    Question Difficulty in reading tab delimited file


    I am trying to read the file 'exdata.txt', which is a tab-delimited file with the following contents:
    Code:
    ID	seq	len
    082054	AAAG	46742
    53948	AAAGGGATAGAAAAAACGAA	37
    53948	AAAGGGAGACTTTGGATAAGG	39
    253	ALFPGELDY	15
    085241	ASHHHHHH	23
    184152	ASAS	11
    184152	AGGSGASAS	16
    184152	AGGGSGASAS	21
    184152	AGGGSGASAS	26
    184152	AASGASAS	31
    184152	AAAGSGXSGASAS	13
    I am using the following code to read it and store the first two columns in two arrays of strings.
    Code:
    #include <stdio.h>
    #include <string.h>
    
    int bufferSize = 300;
    
    int  main()
    {
        int i,num, cnt, j, k;
        char *k1,  *v1, *pch;
        char *keys1[100], *values1[100];
        char line[bufferSize];
        
        FILE* infile_ptr = fopen("exdata.txt", "r");
        if (!infile_ptr) {
            printf("Couldn't open the file for reading\n");
            return 0;
        }
        
        j=0;
        while(fgets(line, bufferSize, infile_ptr) != NULL)
        {
            cnt = 0;
            pch = strtok(line,"\t");
            while (pch != NULL)
            {
                if(cnt==0){k1 = pch;}
                if(cnt==1){v1 = pch;}
                
                pch = strtok(NULL, "\t");
                cnt++;
            }
            printf("Read from file: %s\t%s\n", k1,v1);
            *(keys1+j) = k1;
            *(values1+j) = v1;
            j++;
        }
        
        for (k=0;k<j;k++)
            printf("%s\t%s\n", keys1[k], values1[k]);
        return 0;
    }
    The result that I get is weird and I am not sure why. Where is my code wrong?


    Code:
    184152	4152
    184152	AAAGSGXSGASAS
    184152	
    184152	
    184152	52
    184152	AAAGSGXSGASAS
    184152	AAAGSGXSGASAS
    184152	AAAGSGXSGASAS
    184152	AAAGSGXSGASAS
    184152	AAAGSGXSGASAS
    184152	AAAGSGXSGASAS
    184152	AAAGSGXSGASAS
  2. #2
  3. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2013
    Location
    Saint-Petersburg, Russia
    Posts
    236
    Rep Power
    28
    Hi!

    It looks like you are making things more complex than necessary. How about reading the lines with sscanf or fscanf?

    You can use fscanf directly, but I think it would be more robust to use fgets+sscanf:

    Code:
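    char line[256];
    char part1[100], part2[100], part3[100];
    
    /* fgets returns NULL at end of file or on error */
    while (fgets(line, sizeof(line), f) != NULL) {
        /* %99s limits each field to 99 chars plus the terminating '\0' */
        if (sscanf(line, "%99s %99s %99s", part1, part2, part3) == 3) {
            /* all three fields were read; use them here */
        }
    }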
  4. #3
  5. Contributed User
    Devshed Specialist (4000 - 4499 posts)

    Join Date
    Jun 2005
    Posts
    4,376
    Rep Power
    1871
    > The result that I get is weird and I am not sure why. Where is my code wrong?
    The problem is, you're copying pointers, not strings.

    > *(keys1+j) = k1;
    > *(values1+j) = v1;
    k1 and v1 always point somewhere inside your line buffer, which gets overwritten with each new line read.

    You need to allocate memory and strcpy() the strings into it if you want to preserve their values.
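
    A minimal sketch of the effect described above, assuming a small buffer that is overwritten with strcpy() in place of fgets(). The function name saved_pointer_sees_new_text is made up for illustration and is not part of the original program.

```c
#include <string.h>

/* Hypothetical demo: save a pointer into a buffer, overwrite the buffer,
   and report whether the saved pointer now sees the new text. */
int saved_pointer_sees_new_text(void)
{
    char line[32];
    char *saved;

    strcpy(line, "082054");
    saved = line;            /* copies the address, not the characters */

    strcpy(line, "184152");  /* stands in for fgets() reading the next line */

    /* saved still points into line, so it now reads "184152" */
    return strcmp(saved, "184152") == 0;
}
```

    The function returns 1: the pointer saved while the buffer held "082054" reads "184152" once the buffer is reused, which is why every stored key in the original output shows the ID from the last line.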
  6. #4
  7. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,145
    Rep Power
    2222
    I assume that your intention is to save the data in arrays of strings, keys1 and values1.

    This code does not copy strings into those arrays:
    Code:
            *(keys1+j) = k1;
            *(values1+j) = v1;
    What that code does is save those two pointers in those arrays. Both pointers point to locations inside line. If you subsequently overwrite what's in line, then you will have changed what every single one of those saved pointers is pointing to.

    For each element of those arrays you need to malloc enough space for the length of k1 or v1, plus one for the null terminator. Then you need to explicitly copy the string into that malloc'd space with the standard string function strcpy. Something like this (untested):
    Code:
            keys1[j] = malloc(strlen(k1)+1);
            strcpy(keys1[j], k1);
            values1[j] = malloc(strlen(v1)+1);
            strcpy(values1[j], v1);
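
    The same idea can be sketched as two small helpers, under a few assumptions that are not in the original post: the names copy_string and store_fields are hypothetical, the delimiter set is "\t\n" so a trailing newline is dropped, and a line with fewer than two fields is rejected instead of reusing stale pointers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated copy of s,
   or NULL if malloc fails. The caller must free() the result. */
char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);   /* +1 for the terminating '\0' */
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* Split a tab-delimited line in place and store private copies of the
   first two fields. Returns 1 on success, 0 on a short line or an
   allocation failure. Note that strtok modifies line. */
int store_fields(char *line, char **key_out, char **value_out)
{
    char *k = strtok(line, "\t\n");
    char *v = strtok(NULL, "\t\n");

    if (k == NULL || v == NULL)
        return 0;

    *key_out = copy_string(k);
    *value_out = copy_string(v);
    if (*key_out == NULL || *value_out == NULL) {
        free(*key_out);               /* free(NULL) is a no-op */
        free(*value_out);
        *key_out = *value_out = NULL;
        return 0;
    }
    return 1;
}
```

    Inside the read loop this would be called as store_fields(line, &keys1[j], &values1[j]), incrementing j only when it returns 1. Each stored string should eventually be released with free().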

    Comments on this post

    • marykindall agrees
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2013
    Posts
    11
    Rep Power
    0



    Thanks for pointing out the error. It worked as desired, but there was a warning message, shown below.
    Code:
    $ gcc a.c
    a.c: In function ‘main’:
    a.c:33: warning: incompatible implicit declaration of built-in function ‘malloc’
  10. #6
  11. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,145
    Rep Power
    2222
    When you use a C Standard Library function, you need to #include its header file. Otherwise, the compiler has no idea what you are talking about.

    When you use a library function, read its documentation. You need to know its syntax, arguments, return values, and required header files, as well as how it is used. If your compiler doesn't come with a help file and your computer doesn't come with man pages, then you can Google (or use whatever your favorite search engine is) for "man page malloc". Always read the documentation for each function you use until you are very familiar with it.

    For malloc, the standard header is stdlib.h. Some systems also provide malloc.h, but it is not part of the C standard.

    PS

    Ah, I see that you're most likely using Linux. The man pages should be installed on your system.
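
    As a minimal sketch of the fix for that warning: including stdlib.h gives the compiler the real declaration of malloc, so it no longer falls back on an implicit one. The helper name make_zeroed_ints is hypothetical and exists only to show a call compiling cleanly.

```c
#include <stdlib.h>   /* declares malloc() and free() */

/* Hypothetical helper: allocate an array of n ints, all set to zero.
   Returns NULL if the allocation fails. The caller must free() it. */
int *make_zeroed_ints(size_t n)
{
    int *p = malloc(n * sizeof *p);
    size_t i;

    if (p == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        p[i] = 0;
    return p;
}
```

    In the original program the equivalent change is a single extra line, #include <stdlib.h>, next to the existing includes.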

    Comments on this post

    • marykindall agrees
