#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Posts
    3
    Rep Power
    0

    Need help stitching up a memory leak


    Good evening, devshed, first post here.

    I'm writing in today because of a memory leak I've discovered with a patch I've been working on. (by I, I mean Valgrind discovered it ;P) This is my first real endeavor in C, so I'm at a bit of a loss.

    the patch itself is located here: https://gist.github.com/4054818

    The leak is being caused by the malloc at line 283 of the patch. I don't have the C-fu to close it up. Here's a simplified summary of what's going on, since the patch is a few hundred lines:

    I have a function that iterates through a global char[] array, chops it into pieces, loads those pieces into a linked list and processes them one at a time. The pseudo function would look like this:
    Code:
    //linked list structs - so in the end i'm working with node->data->text
    
    struct node {
       void *data;
       struct node *next;
    };
    struct ndata {
       char * text;
    };
    
    static char stext[A_LONG_STRING];
    
    void processthetext() {  //drawstatus() in the patch
        char * textbuf;
        int text_len;
        char * startpos = stext, curpos = stext;
    
       while (*curpos) {
    
           // two pointers move through stext... skip a bunch...
           // i found the chunk i want, now demarcated by these ptrs
    
          int text_len = curpos - startpos;
          txtbuf = malloc((text_len * sizeof(char)) + 1);
          head = addnode(head, strcpy(malloc(strlen(textbuf)), textbuf));
          //this malloc ^^ is the one that is causing the leak.
          free(textbuf);
       }
       // later...
       curn = head;
       do {
          outputthetext_fancily(curn->data->text);
       /* 
       here's where i'm stuck.
       curn->data->text gets lost and I don't know how to free it.
       free(curn->data->text) gives me 'invalid pointer' and the
       program dies.
       char * tmp = curn->data->text; free(tmp); also dies.
       */
          free(curn->data);
       // aaand, here ^^ is where it gets lost, 
       // since data has the pointer to *text
       } while (curn != head);
       destroy_llist(head);
    }
    
    ////////////// llist stuff
    
    void destroy_llist(struct node *head){
       //free() each *node
    }
    
    node * addnode(node * head, char * text) {
        // malloc() a new node
        // malloc() a new ndata
    
        data->text = text;
    
        // set the last node's *next to the new node
        return head;
    }
    So I have a pointer to a struct (node), which points to a struct(data), which has a pointer to a string, and the string is what's getting lost. The thing I don't get is that the other piece of information I'm extracting from the input string frees up just fine. I'm processing them in the exact same way, as far as I can tell.

    Might anyone have any suggestions? I'm bleeding out close to 1KB/every few seconds ... :(

    Many thanks,
    -la
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    Location
    San Francisco Bay
    Posts
    1,939
    Rep Power
    1313
    I looked at the code in your link and found some problems.

    (line 283)
    Code:
    head = addnode(head, ansi_text, "", strcpy(malloc(strlen(textbuf)), textbuf));
    That malloc call is not correct: you need to allocate strlen(x) + 1 bytes to contain a string x. This technically causes memory corruption, but a buffer overflow of one byte may not actually break anything, since some malloc implementations pad their memory blocks a bit. To compare with the next code excerpt, note that you are using a malloc()-ed memory block for the "text" field of this node's data.

    (line 286)
    Code:
    if (color[0] == 'r') {
        head = addnode(head, ansi_reset, strcpy( malloc(strlen(color)), color), "");
    } else if (color[0] == 'f') { //chops off 'fg:'
        head = addnode(head, ansi_fg, strcpy( malloc(strlen(color)-3), color_ptr+3),"");
    } else if (color[0] == 'b') {
        head = addnode(head, ansi_bg, strcpy(malloc(strlen(color)-3),color_ptr+3), "");
    } else {
        head = addnode(head, -1, "", "");
    }
    Now you are using a string literal for the "text" field of this node's data. String literals are not managed by libc, and attempting to free the block associated to a string literal leads to undefined behavior. This means at this point in the code, you've already shot yourself in the foot: some of your nodes have node->data->text that points to malloc()-ed memory, but some of them point to string literals. The malloc()-ed memory needs to be freed, or you have a memory leak, and the string literals cannot be freed, but unfortunately, you haven't kept track of which are which, so there's nothing you can do now.

    The easiest way to fix this may be to always allocate a new block for the text. Then you can free it when you free the node.

    Related to this: one thing I noticed about your code is that you aren't consistent about who "owns" the various memory blocks that you allocate. For example, addnode allocates memory for the "data" of the new node, but destroy_llist does not free it. This isn't good design: either the list implementation owns the data, in which case addnode should allocate memory for it and destroy_llist (or any "delete_node" function, if you had one) should free it, or the caller should be responsible for both allocating and freeing the memory. data->text should be treated similarly.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Posts
    3
    Rep Power
    0
    lux:

    thank you for your thoughtful reply.

    after further investigation with valgrind, I did find that the malloc(strlen()) issue was the source of the problem, which I have to admit was a bit surprising to me. But I do need to go back and study the structure of the stack anyway - this is my first C adventure, so the concept of managing memory is still a bit esoteric. I'm used to python, and I'm sure it shows; after writing this bit of code, it's incredible to me how much of the basic machinery is hidden away by higher-level languages. I had to sit down and think for a minute about how to do some pretty basic things, which are included for free with something like python. (i.e. "how do i copy a string?" or "how do I return an array from a function??") I'm curious now to read the python source and see how they implement some of their data structures.

    Prior to my realization of the buffer overflow issue, what I ended up doing was getting rid of the data struct and placing data members directly inside of the node struct; although, doing so neither fixed the problem, nor did it improve the design flaw that you pointed out.

    Regarding your last comment about data 'ownership' - are you suggesting that it would be better to have the add node function handle malloc()'s and let the list destructor function take care of dealloc's? For instance, what I now have essentially is:

    Code:
    struct node {
        int type;
        char * textdata;
        char * colordata;
        struct node * next;
    };
    
    struct node * addnode(struct node *head, int type, char * color, char * text);
    i'd change that to:

    Code:
    struct node {
        int type;
        char * data;
        struct node * next;
    };
    and have separate addnode functions, i.e.:

    Code:
    struct node * addtextnode(struct node *head, char * text, int length);
    struct node * addcolornode(struct node *head, char * color);
    then the list destructor could free(data) when it frees each node.

    Have I understood your comment correctly? I don't know of a good portable way to use a "void * data" type of construct without using some sort of callback mechanism, which I don't know how to do right now - but I guess I'll put that on my todo list. Along with getting a few idioms under my belt.

    At least you didn't see my first attempt to return a string from a function - it looked like this:

    Code:
    char * return_a_string() {
        static char[MAGIC_NUM] buf = "\0"; //well, it doesn't segfault now... :\
        char * ptr = buf;
        sprintf(buf,"hello world :(");
        return ptr;
    }
    The biggest leak turned out to be in a different function altogether; it looked like this:

    Code:
    void function() {
        buf = malloc()
        if (foo) { 
            return; // doh
        } else {
            do_stuff();
        }
        free(buf);
    }
    anyway, thanks for your help! :cool:
    -la0x1f
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    Location
    San Francisco Bay
    Posts
    1,939
    Rep Power
    1313
    Good job. I think you addressed everything I brought up.
    Originally Posted by la1111
    Regarding your last comment about data 'ownership' - are you suggesting that it would be better to have the add node function handle malloc()'s and let the list destructor function take care of dealloc's? For instance, what I now have essentially is:

    Code:
    struct node {
        int type;
        char * textdata;
        char * colordata;
        struct node * next;
    };
    
    struct node * addnode(struct node *head, int type, char * color, char * text);
    i'd change that to:

    Code:
    struct node {
        int type;
        char * data;
        struct node * next;
    };
    and have separate addnode functions, i.e.:

    Code:
    struct node * addtextnode(struct node *head, char * text, int length);
    struct node * addcolornode(struct node *head, char * color);
    then the list destructor could free(data) when it frees each node.

    Have I understood your comment correctly?
    Your code actually doesn't have the ownership problem anymore: the user is clearly responsible for allocating and freeing memory for the data. Your proposed change to the data structure is definitely a better design all around, though. Now, should you change it so that the list implementation owns the data? There are pros and cons. I'm leaning toward making the list own everything, since it's likely to result in cleaner code. It will reduce the number of places in your code that explicitly manage memory, which is generally a good thing.
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2012
    Posts
    3
    Rep Power
    0
    Originally Posted by Lux Perpetua
    Good job. I think you addressed everything I brought up. Your code actually doesn't have the ownership problem anymore: the user is clearly responsible for allocating and freeing memory for the data. Your proposed change to the data structure is definitely a better design all around, though. Now, should you change it so that the list implementation owns the data? There are pros and cons. I'm leaning toward making the list own everything, since it's likely to result in cleaner code. It will reduce the number of places in your code that explicitly manage memory, which is generally a good thing.
    great :) thanks for your help.

    glad I finally got around to learning some C - I feel like "xkcd #208" about pointers now.

    STAND BACK...

IMN logo majestic logo threadwatch logo seochat tools logo