#1
  1. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2010
    Posts
    67
    Rep Power
    0

    Problem related to strings...


    Check this program:
    Code:
    #include <stdio.h>

    int main(void)
    {
        char *sp;
        sp = "Hello";
        printf("%s",sp);
        return 0;
    }
    The program prints Hello, but how?

    It shouldn't actually work, right?

    When I declare char *sp, it's a char array, so I would expect it to print only if I write it like this:

    Code:
    for(i=0;i<strlen(sp);i++)
    {
        printf("%c",sp[i]);
    }
    Another question I have:
    If I do
    Code:
    printf("%c",sp );
    it prints a garbage value.
    If I do
    Code:
    printf("%c",sp[0]);
    it prints "H".

    How come?
  2. #2
    Actually, when I say sp, it refers to the address of the first memory location, right? I am quite confused :(
  4. #3
  5. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,145
    Rep Power
    2222
    There are a couple of different ways that I know of to create a string:
    1. Store the characters of the string in an array and use one array element, usually the very first one, to hold the length of the string. If that element is a single byte, the maximum length of a string is 255. This approach was used by Turbo Pascal and many, if not all, versions of BASIC (not to be confused with Visual Basic), as well as several other products I encountered 2 or 3 decades ago. If your development tool restricts the length of strings to 255, it is most likely using this approach.

    2. Store the string in an array of characters which is terminated by a unique character. On the IBM PC, the BIOS and DOS system calls used either a dollar sign or a zero byte to terminate their strings, which were described as ASCII$ or ASCIIZ, respectively.

    C-style strings use the ASCIIZ convention. Therefore, in C every string is an array of char which is terminated by a character whose ASCII code is zero, '\0'. That means that the first character in the string is also the first character in the array, the zero'th element.

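    In most expressions an array name is converted to a pointer to the array's first element, so array names can usually be used like pointers, though with a few important differences (eg, you can never assign a new pointer value to an array name, nor increment or decrement it, and sizeof applied to an array gives the size of the whole array rather than the size of a pointer). Either way, the pointer you get from an array name points to the first element of the array.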

    Code:
    #include <stdio.h>

    int main(void)
    {
        char *sp;
        sp = "Hello";
        printf("%s",sp);
        return 0;
    }
    With "%s", printf expects you to give it a char pointer, which is what sp is. No problem, since you declared sp to be a char pointer. Nor would there have been a problem if sp had been a char array, since an array passed to a function is converted to a pointer to its first element.

    With this line,
    sp = "Hello";
    you assign a value to sp. That value is the address of the string "Hello", which the compiler placed in static storage (typically read-only memory, so you must not try to modify it through sp). At that location, the string looks something like this: {'H', 'e', 'l', 'l', 'o', '\0'}. Please note the null terminator, the '\0', which marks the end of the string. The address assigned to sp is the location of the first character of the string, the 'H'.

    Code:
    printf("%c",sp[0]);
    As you observed, this printed out "H" (though without the quotation marks). That is because the value of sp[0] is the character 'H'. You are telling printf that it is to interpret the value that you give it as a character. Previously, with the "%s" you told printf to interpret the value you gave it, a pointer, as a pointer to a string.

    Let us digress for a moment. I trust that you have used scanf as well. When you use scanf, you give it pointers, the addresses of where you want it to store the values that it converts. All that scanf knows about those addresses is that they are addresses; scanf knows nothing about the variables' datatypes. Rather, scanf uses the format string to tell it what those datatypes are supposed to be. If you use the wrong conversion specifier in the format string, then scanf will not do what you want it to do. In the case of scanf, you especially need to tell it the size of the datatype: using "%f" (which expects the address of a float) to read into a double variable will result in garbled input (there's a recent question about that exact same problem on this forum). All that scanf knows about how to store values at those addresses is what you tell it in the format string; if you tell it the wrong thing, then disaster awaits you.

    printf works the same way, though it might appear more forgiving. You give it a list of values, but it's the format string that tells printf how to treat those values, how to interpret them. You've given it a value, but what does that value represent? That all depends on the conversion specifier you gave it. Give it the wrong specifier and you'll get nonsense (eg, giving it a floating-point value but an integer specifier). There is an old adage that computers don't do what you want them to do but rather only what you tell them to do. Format strings in scanf and printf are prime examples of this adage.

    If you tell printf "%s", then it will use the value as a pointer. But if you tell it "%c", then it will use that value as the ASCII code for the character to be printed. If you tell printf "%s" and give it a character value, then it will misinterpret that character value as a pointer, which will more likely than not point outside the permissible memory space of the program and cause the program to crash with a SEGFAULT or ACCESS error. And if you tell printf "%c" and give it a pointer to a string, then it will misinterpret that address as a character code and typically display some kind of nonsensical character. Strictly speaking, both mismatches are undefined behavior, so you cannot count on any particular result.

    Code:
    printf("%c",sp );
    This is what I just described to you. You're telling printf that sp is a char when in fact it is a char pointer. By giving printf misinformation, you caused it to produce unexpected results.


    Here's a related concept that may help, especially since it involves terminology that tends to crop up in error and warning messages: indirection, referencing, and dereferencing.

    When you use a normal variable normally (eg, use an int directly as an int), then there's no specific term for that that I know of. But if you access the value of an int through a pointer to that variable's address, then that is called indirection. More specifically, you can say that you have one level of indirection there. If you now use a pointer to a pointer to that int, then you have added a level of indirection; now you have two levels of indirection. And you can add even more levels of indirection, as many as you need or want.

    Levels of indirection is an important concept to understand, because it shows up in error and warning messages from your compiler. Specifically, if you attempt to assign a variable's value to a pointer, then the compiler should complain to you with a message about levels of indirection (Microsoft's compilers, for example, say that the types "differ in levels of indirection"). For whatever reason, this kind of message seems to show up a lot more in C++, so you really need to learn and understand this concept. A programming magazine (C Users Journal, though it went through a few changes along the way) once had a short series of articles containing an exercise in working with many levels of indirection; I would need to track down those articles in my own records and attempt to find them on-line, though I doubt the success of that search.

    Another term you will need to learn is reference. When you create a pointer, you are creating a reference to the data that you are interested in. In C, you're just creating a pointer and you rarely hear the term reference, but in C++ you do have references as an alternative to explicitly using pointers to pass function arguments by address (AKA "call by reference"). And in C#, which does not allow pointers outside of code marked unsafe, you explicitly have references that serve somewhat the same purpose as pointers do in C. Learn this concept, because you will encounter it later. Basically, when you create a reference, you add a level of indirection.

    Now, in C you will encounter the term dereference. Whenever you use a pointer to get to the data that it's pointing to, you dereference it. When you dereference, you remove a level of indirection.

    Now go back over your "problems". Look at exactly what you are telling the compiler. Think it through. It is all very logical. You just need to learn and understand how it works. We've all done that and so can you.
