#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2012
    Posts
    6
    Rep Power
    0

    Help with Null values in a string


    Hi all, I have written a C++ program that removes the unprintable characters from an input string by checking the ASCII value of each character.

    here is the code snippet:
    C++ Code:
    char *StripUnPrinChar( char *InStr )
    {
       char *result = NULL;
       char *OutStr = StripUnPrintableChars(InStr);
       result = OutStr;

       return result;
    }


    When my input string has NULL characters in it, the function only processes the string up to the first NULL, since NULL marks the end of a string, and ignores everything after it. Is there any way I can modify it to read past the NULL values in the string and strip them as well?

    Here is my main code snippet (on AIX a null byte is displayed as "^@"):

    c++ Code:
    int main( int argc, char *argv[] )
    {
       char TestString[]="asas^Mdy^@^@^@awe^Rabcdefh";
       printf("Input - %s \n", TestString);
     
       char *StrippedStr = StripUnPrinChar(TestString);
       printf("\nReturned string = %s\n", StrippedStr);
     
    }

    The output it prints is:

    Code:
    dy ut - asas
    dyput string is - asas
    But my expected output is:
    Returned String = asasdyaweabcdefh

    Appreciate all your help!
    Thanks,
    Faraway
    Last edited by requinix; October 7th, 2013 at 04:42 PM. Reason: fixed highlighting - only need the [highlight=c++][/highlight]
  2. #2
  3. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,091
    Rep Power
    2222
    You might want to go back and reformat your message so that it is readable. -- Disregard; I just saw that you have taken care of it.

    Since StripUnPrintableChars() is not a function in the Standard C Library, I would assume that you've written it too. Why didn't you present that function as well? If it's part of a third-party library or a function provided by your development environment, then why didn't you share that information with us?

    Either way, I don't see where your code attempts to deal with the problem that you describe, so how could we tell you where you're going wrong?

    Basically, approach the problem not as a string but as a binary stream, an array of bytes. You need two pieces of information, the array of bytes and its length. That's the same information you need for working with a string, but with a string you can use strlen to get the length. If you read that array of bytes from a source (eg, a serial port, a binary disk file), then you can get the length simply by counting how many bytes you read in. In the example that you give, you'd have to provide the length.

    So create a function that receives the byte array and its length and which then iterates through the array copying to a result array only those characters that are printable -- ctype.h has a function for testing for that.
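    A minimal sketch of the approach described above, assuming the caller can supply the byte count. The name StripUnprintableBytes is hypothetical, and returning a std::string is a choice made here for brevity, not something the thread prescribes. Note that std::isprint treats the space character as printable; std::isgraph would exclude it.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: copies only the printable bytes of data[0..length)
// into a std::string. The length is passed in, so embedded NUL bytes do not
// end the scan early.
std::string StripUnprintableBytes(const char *data, std::size_t length)
{
    std::string result;
    result.reserve(length);
    for (std::size_t i = 0; i < length; ++i)
    {
        // Convert to unsigned char first: passing a negative value to
        // std::isprint is undefined behavior.
        unsigned char c = static_cast<unsigned char>(data[i]);
        if (std::isprint(c))
        {
            result.push_back(static_cast<char>(c));
        }
    }
    return result;
}
```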
    Last edited by dwise1_aol; October 7th, 2013 at 04:49 PM.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2012
    Posts
    6
    Rep Power
    0
    Here is the function:
    C++ Code:
    char *StripUnPrintableChars(char *StrToStrip)
    {
       int j = 0;
       // The caller is responsible for calling free() on the returned buffer.
       char *StrippedStr = (char *) malloc( strlen(StrToStrip) + 1);
       // Both strlen() and this loop condition stop at the first NUL byte.
       for( int i = 0; StrToStrip[i]; i++)
       {
          // Keep only characters with ASCII codes 33..126; drop everything else.
          char CharToCheck = StrToStrip[i];
          if( (CharToCheck > 32) && (CharToCheck < 127) )
          {
             StrippedStr[j++] = CharToCheck;
          }
       }
       StrippedStr[j] = '\0';
       return StrippedStr;
    }
  6. #4
  7. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,091
    Rep Power
    2222
    OK, so you can also see why that won't work. It depends on a string function, strlen, to tell it the length of the string to be processed. And because of the NUL characters ('\0') strlen will not work correctly.

    Basic rule: When dealing with embedded null bytes, you cannot use string functions.

    So copy this function and give it a different name. Add a second parameter which is the length of the string and use that parameter in place of the strlen() call.

    Can you see why? I want to know that you understand what you're doing and why.
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2012
    Posts
    6
    Rep Power
    0
    Yes I do understand that, how do I get the string length in the above code snippet, other than using strlen()?
  10. #6
  11. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,091
    Rep Power
    2222
    Originally Posted by farawaydsky
    Yes I do understand that, how do I get the string length in the above code snippet, other than using strlen()?
    Like I already said:
    Originally Posted by dwise1_aol
    Add a second parameter which is the length of the string and use that parameter in place of the strlen() call.
    How do you get that value to pass in when you call the function? Like I already said:
    Originally Posted by dwise1_aol
    If you read that array of bytes from a source (eg, a serial port, a binary disk file), then you can get the length simply by counting how many bytes you read in. In the example that you give, you'd have to provide the length.
    You created that arbitrary string of binary bytes, so you know how many there are. In a real-world situation, you would have read that data in from some source -- such as a serial port, a binary disk file, a network packet -- and in that act of reading the data either the functions you used would tell you how many bytes were read or else you could count them as you receive them.

    But since this is not a real-world situation but rather one in which you created an arbitrary string of data, you can also create that arbitrary string's length.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2012
    Posts
    6
    Rep Power
    0
    Thanks for the help!!

    A third-party tool calls this C++ function, passing the string to be stripped as input, and I don't know how to read that input byte by byte in C++.

    One thing I could do is get the string length in the tool and pass it to the function as a parameter. But I doubt the tool will give an accurate length if the string has NULLs in it.

    Anyways I'll try it and keep you posted.
  14. #8
  15. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,091
    Rep Power
    2222
    So you're getting your data from a function/object in third-party software?

    What is the prototype of that function/object method?

    Is it supposed to return a string? What kind of string? (ie, C-style string or a basic string object)

    Is it supposed to "return" a string via the parameter list and have as its return value something else, like the length of the string?

    If it instead returns an array of binary data, does it also return the length of that array (ie, the number of bytes of data)?

    And the big question. Because I think that you are missing the big question: How could a string possibly contain embedded NULs (quite different from the pointer value, NULL -- count the "L"'s) and still be a string?

    Which then begs the question: if this third-party software says that it's giving you a string, why would that string ever contain embedded NULs?
    Last edited by dwise1_aol; October 8th, 2013 at 02:51 PM.
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jun 2012
    Posts
    6
    Rep Power
    0
    The tool reads the data from a mainframe database. The column in question is of type char and contains NULLs as part of the text.

    For example, a NAME field can contain FAR^@^@AWAY, and this field is passed to the C++ function to strip the NULLs.
  18. #10
  19. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,091
    Rep Power
    2222
    So when it passes that NAME field to you, does it tell you how long that field is or not?

    IOW, how does it expect you to know the size of the field?

    It appears that the mainframe has a definition for strings that it expects you to know. C's use of ASCIIZ strings (a string of ASCII characters terminated by a NUL byte, a null-terminator) is only one possible way to define a string. BASIC, Turbo Pascal, and many others would use the convention where the string started with the second element of the array and the first element would contain the string length; of course, this would limit strings to a maximum length of 255 characters.
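    To illustrate the length-prefixed convention, here is a small sketch that converts such a buffer into a std::string. The function name FromLengthPrefixed is hypothetical, and nothing in the thread confirms that the tool uses this layout. A std::string stores its own length, so embedded NUL bytes survive the conversion.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: interprets buffer[0] as the length and
// buffer[1..length] as the character data, as in the length-prefixed
// convention. The single length byte limits the text to 255 characters.
std::string FromLengthPrefixed(const unsigned char *buffer)
{
    std::size_t length = buffer[0];
    // The (pointer, count) constructor copies exactly 'length' bytes and
    // does not stop at NUL bytes.
    return std::string(reinterpret_cast<const char *>(buffer + 1), length);
}
```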

    So if that tool is handing you a string that uses a format different from a C-style string, then you need to find out just exactly what that foreign format is. And if its length is limited by nothing more than the database field width, then you need to find out how to obtain that field width.
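    If the field width does turn out to be obtainable, stripping the NULs could be as simple as the sketch below. It assumes the tool hands over a pointer to the raw field plus its width in bytes; the name StripNulsFromField and both parameters are hypothetical. It removes only NUL bytes, using the standard erase-remove idiom, and leaves every other character alone.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: copies a fixed-width field into a std::string and
// removes every NUL byte from it.
std::string StripNulsFromField(const char *field, std::size_t fieldWidth)
{
    // Copy exactly fieldWidth bytes, embedded NULs included.
    std::string text(field, fieldWidth);
    // std::remove shifts the kept characters forward and returns the new
    // logical end; erase then trims the leftover tail.
    text.erase(std::remove(text.begin(), text.end(), '\0'), text.end());
    return text;
}
```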
