#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Posts
    4
    Rep Power
    0

    Scanf limit characters


    Hi, I'm new to C and still learning how to do things. The program below works fine, provided the user stays within the 20, 3 or 32 character limits. However, once the user goes over a limit, it causes problems.

    So my question is: how do I use scanf so that I can limit the number of characters typed in, or take only the first x characters of the input and dump the rest?

    Thanks.

    #include <stdio.h>

    void main( void ) {

    char firstname[20];
    char lastname[20];
    char age[3];
    char serialnumber[32];

    printf("What is your first name?");
    scanf("%20s", firstname);
    printf("What is your last name?");
    scanf("%20s", lastname);
    printf("What is your age?");
    scanf("%2s", age);
    printf("What is your serial number?");
    scanf("%32s", serialnumber);
    printf("\nYour name is %s %s your age is %s and your serial number is %s.\n", firstname, lastname, age, serialnumber);

    }
  2. #2
  3. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,900
    Rep Power
    481
    Awesome! You recognize the problem. You're close; you need only reread and understand this paragraph from the scanf man page:
    Code:
    An optional decimal integer which specifies the maximum field width. Reading of characters stops either
    when this maximum is reached or when a nonmatching character is found, whichever happens first. Most
    conversions discard initial white space characters (the exceptions are noted below), and these discarded
    characters don't count toward the maximum field width. String input conversions store a terminating null
    byte ('\0') to mark the end of the input; the maximum field width does not include this terminator.
    It is absolutely ok to use extra space. Your compiler might use word alignments anyway.

    char firstname[24]; /* 3 extra bytes. */
    /*...*/
    scanf("%20s", firstname); /* fills up to 21 bytes */
    [code]Code tags[/code] are essential for python code and Makefiles!
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Posts
    4
    Rep Power
    0
    I have been trying to work out why a set of characters extending beyond its defined range would corrupt previous data held by C.

    I think I've come up with an explanation. I'd be grateful if someone else can confirm it as true or not.

    As far as I'm aware, when you create an identifier with a specific range, for example, an integer, or a fixed string of characters, then that data is inserted into the last memory block, and then written backwards.


    If the memory address starts at 0x1000 in my computer for whatever reason, then if I create a 20 character array I'm going to start writing the data at 0x1020 backwards towards 0x1000.
    If I then create a second character array of the same length, it would proceed from 0x1040 towards 0x1000 as well.

    Now if I type in more than 20 characters, it all gets dumped directly into memory. Scanf doesn't limit it in any way, so if I typed in a 40 character string, it would effectively overwrite the previous string stored in memory.

    Is this correct? Character arrays are written from the highest address back towards the lowest without any regard for the data placed at previous addresses?
  6. #4
  7. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,900
    Rep Power
    481
    C can map your program to specific hardware any way it sees fit, as long as the program it produces from standards-compliant source behaves in standards-compliant ways. Simply put, C need only produce a program that works as a C expert expects.

    Please show a specific program that demonstrates whatever it is you're trying to describe. If the program is that of your first post, you need only say so and show the exact input causing you grief. Either way, also show the input and maybe briefly show the output you expect, although that may only reinforce your bad thoughts.
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Posts
    4
    Rep Power
    0
    Please see my first post, this is the program example I am working with.
  10. #6
  11. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,900
    Rep Power
    481
    This C expert says that your program is incorrect.


    char lastname[20];
    char age[3];

    scanf("%20s", lastname);
    scanf("%2s", age);


    age is alright. You reserved three bytes. scanf is allowed to fill up to 3 bytes. Up to 2 characters from stdin, plus 1 character for the nul byte.

    lastname is wrong. You reserved 20 bytes but permit scanf to use up to 21 bytes. 20 because of the 20 in the "%20s" format specification string, and another for the nul byte to complete the ASCIIz string.

    Reread my other post. I already said this, though perhaps not as clearly.


    Compilation without errors, without warnings, is essential. It doesn't guarantee that your algorithm is valid or that you haven't made other mistakes.
    Last edited by b49P23TIvg; May 6th, 2013 at 09:33 AM.
  12. #7
  13. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,175
    Rep Power
    2222
    You seem to have the general idea of what happens when you overflow a buffer, but I see you making a lot of assumptions and doing a lot of guessing. Also, there's the issue that b49P23TIvg has already pointed out, which is that the details of how local variables are organized are arbitrary and entirely up to the compiler, just so long as the end result complies with the C language standard.

    So why guess? Why not just look directly at how your compiler organizes those local variables? The printf conversion specifier for an address is %p. Here is a short program I just wrote which shows you where those arrays are actually located on my system (WinXP, compiled with MinGW gcc version 2.95.3-6):
    Code:
    #include <stdio.h>
    
    int main( void ) 
    {
        char firstname[20];
        char lastname[20];
        char age[3];
        char serialnumber[32];
    
        printf("firstname starts at %p\n", firstname);
        printf("    firstname[0] at %p\n    firstname[19] at %p\n", 
                &firstname[0], &firstname[19]);
        
        printf("lastname starts at %p\n", lastname);
        printf("    lastname[0] at %p\n    lastname[19] at %p\n", 
                &lastname[0], &lastname[19]);
    
        printf("age starts at %p\n", age);
        printf("    age[0] at %p\n    age[2] at %p\n", 
                &age[0], &age[2]);
    
        printf("serialnumber starts at %p\n", serialnumber);
        printf("    serialnumber[0] at %p\n    serialnumber[31] at %p\n", 
                &serialnumber[0], &serialnumber[31]);
    
        return 0;
    }
    Here is what it output on my system:
    Code:
    C:\TEST>a
    firstname starts at 0022FF60
        firstname[0] at 0022FF60
        firstname[19] at 0022FF73
    lastname starts at 0022FF40
        lastname[0] at 0022FF40
        lastname[19] at 0022FF53
    age starts at 0022FF30
        age[0] at 0022FF30
        age[2] at 0022FF32
    serialnumber starts at 0022FF10
        serialnumber[0] at 0022FF10
        serialnumber[31] at 0022FF2F
    
    C:\TEST>
    Originally Posted by dcforeman
    IF the memory address starts at 0x1000 in my computer for whatever reason, then if I create a 20 character array I'm going to start writing the data at 0x1020 backwards towards 0x1000.
    And yet you see that the first character in firstname is at 0022FF60 while the last possible character is at 0022FF73, meaning that the string progresses forward through memory, not backwards as you had assumed. This demonstrates the advantage of looking at what's really happening instead of making guesses. Though, of course, mileage will vary, so you will need to see what addresses you get with your compiler on your system.

    You will also see that my compiler organized all the local variables backwards with the last one appearing first and the first one last, just like in the Bible:

    serialnumber starts at 0022FF10
    age starts at 0022FF30
    lastname starts at 0022FF40
    firstname starts at 0022FF60

    You will also notice that lastname ends at 0022FF53 but firstname starts 13 bytes later at 0022FF60. A reason for that is that the compiler can leave padding between some variables in order to place them on a word boundary that's more efficient to access; you will see this a lot in structs. So when you overflow lastname by one character, you won't see it affect firstname.

    However, we don't know what's following firstname. Remember, these local variables are on the stack along with variables and pointers that are needed to make the function work, including the return address from the function. Overwriting one of those can easily cause the program to crash. Until we learn exactly how your compiler organizes that stack frame, we cannot predict what will happen when you overflow firstname.

    So the secret is to not overflow any buffer.
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Posts
    4
    Rep Power
    0
    Thank you, dwise1_aol, that was a very useful and detailed post. I admit I'm having trouble with the whole pointers concept at the moment, so I'll head off and study those so I can analyse this sort of problem properly in the future.
