#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    6
    Rep Power
    0

    String to number conversion


    Hey guys. This is my first time on the site, and I'm wondering if you could help me understand what to do for a certain piece of code.

    I'm trying to get a user to enter a word as a string. After they enter the word, I want to convert each letter to a predetermined code. For example:

    If the user enters HELLO, it would print out the following:

    100101100001000010010

    if H = 1001, E = 011, L = 00001, O = 0010

    Split by letter, that is 1001 011 00001 00001 0010. This is called Huffman encoding.

    How would I go about doing this?

    I tried to write a for loop that goes through the string, checks each letter with an if statement, and writes the matching code into a different string.

    int i;
    char stringOne[100], stringTwo[100];

    printf("Please enter a string: ");
    scanf(" %s", stringOne);

    for(i = 0; i < 100; i++)
    {

    if(stringOne[i] = 'A')
    stringTwo[i] = '0001';

    }

    That's what I tried.

    Sorry if that was very confusing but thanks in advance for the help!
  2. #2
  3. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,145
    Rep Power
    2222
    stringTwo[i] = '0001';

    That is so wrong in so many ways.

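    '0001' is not a string in C. Single quotes are for a single character; with four characters between them you get a multi-character constant, which is an int with an implementation-defined value, not something you can meaningfully store in one char. A string needs double quotes: "0001".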

    What is it that you really want to do? Do you want to take a string, "HELLO", and use it to generate another string, "100101100001000010010"? Then if you need to generate an actual bit stream, you can easily translate that "binary string" into an actual bit stream.

    If that is not what you actually want to do, then I'm at a loss because I can't think of anything else you could actually be trying to do.
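
    The step from a "binary string" to an actual bit stream mentioned above could look roughly like the sketch below. The helper name packBits, the most-significant-bit-first order, and the zero padding of the last byte are all assumptions made for illustration; nothing in this thread specifies them.

```c
#include <stddef.h>
#include <string.h>

/* Packs a string of '0'/'1' characters into bytes, most significant bit
 * first. Returns the number of bytes written, or 0 if the output buffer
 * is too small or the input contains any other character.
 * packBits is a hypothetical helper, not part of any library. */
size_t packBits(const char *bits, unsigned char *out, size_t outSize)
{
    size_t nbits = strlen(bits);
    size_t nbytes = (nbits + 7) / 8;   /* round up to whole bytes */
    size_t i;

    if (nbytes > outSize)
        return 0;

    memset(out, 0, nbytes);            /* unused trailing bits stay 0 */
    for (i = 0; i < nbits; i++) {
        if (bits[i] == '1')
            out[i / 8] |= (unsigned char)(0x80u >> (i % 8));
        else if (bits[i] != '0')
            return 0;                  /* not a binary digit */
    }
    return nbytes;
}
```

    A real format would also need to record the number of valid bits somewhere, because the padding in the last byte is otherwise indistinguishable from data.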

    Apparently, you already have the Huffman encodings for the letters A-Z. When I worked on that project in school in 1981, we were just given a text file. We had to do a frequency count of all its characters, generate our own Huffman code table from that count, and then use the table to "send" and "receive", encoding and decoding in the process.

    So, create a code table, which would be an array of strings indexed by the letter to be encoded. I did it in Pascal, which has no problem defining different index ranges, but in C you are stuck with starting at zero. So, assuming all capital letters, take the next letter to be encoded, subtract 'A' from it, and use the result to index into the table to get the Huffman "bit string" for that letter. In case it's a new concept for you, you can do arithmetic with chars since their values are their character codes (ASCII on most systems); e.g., 'A' - 'A' = 0, 'C' - 'A' = 2, etc.
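
    As a small illustration of that character arithmetic, here is a sketch of a function that turns a capital letter into a zero-based index. The name letterIndex and the -1 return value for other characters are assumptions made for this example, and it relies on 'A' through 'Z' being contiguous, which holds for ASCII.

```c
/* Maps an uppercase letter to its 0-based position in the alphabet,
 * or returns -1 for anything else. letterIndex is a hypothetical name.
 * Assumes 'A'..'Z' are contiguous in the character set (true for ASCII). */
int letterIndex(char c)
{
    if (c < 'A' || c > 'Z')
        return -1;
    return c - 'A';   /* 'A' -> 0, 'C' -> 2, 'H' -> 7, 'Z' -> 25 */
}
```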

    Decoding with just that table is doable, but cumbersome. In my project, I used the table to construct a sort tree and then simply used the incoming "bit stream" to traverse the tree until I hit a terminal node, whereupon I read off the character there. If this is part of your project, work out a scheme that you would find easier to implement.

    While you could do all the encoding and decoding in C code, that would be very long and cumbersome to write and to maintain. It's better to create data structures that contain the encodings and that you can use with much less C code.
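
    For the decoding side, the tree idea described above might be sketched as follows. This is only one possible scheme, not the code from the 1981 project: the node pool, the function names, and the choice to store children as array indices are assumptions, and the tree holds only the four codes quoted in the first post (H, E, L, O).

```c
#include <stddef.h>

#define MAX_NODES 64             /* plenty for this small demo table */

struct Node {
    int child[2];                /* index of the '0' and '1' child, or -1 */
    char symbol;                 /* letter at a leaf, '\0' for inner nodes */
};

static struct Node nodes[MAX_NODES];
static int nodeCount;

/* Takes the next free node from the pool; returns -1 if the pool is full. */
static int newNode(void)
{
    if (nodeCount >= MAX_NODES)
        return -1;
    nodes[nodeCount].child[0] = -1;
    nodes[nodeCount].child[1] = -1;
    nodes[nodeCount].symbol = '\0';
    return nodeCount++;
}

/* Adds one code such as "1001" for one letter; node 0 is the root. */
static int addCode(const char *code, char symbol)
{
    int cur = 0;
    for (; *code != '\0'; code++) {
        int bit = (*code == '1');
        if (nodes[cur].child[bit] < 0) {
            int n = newNode();
            if (n < 0)
                return -1;
            nodes[cur].child[bit] = n;
        }
        cur = nodes[cur].child[bit];
    }
    nodes[cur].symbol = symbol;
    return 0;
}

/* Builds the tree from the only four codes given in this thread. */
void buildDemoTree(void)
{
    nodeCount = 0;
    newNode();                   /* root */
    addCode("1001", 'H');
    addCode("011", 'E');
    addCode("00001", 'L');
    addCode("0010", 'O');
}

/* Walks the tree bit by bit; returns 0 on success, -1 on bad input. */
int decode(const char *bits, char *out, size_t outSize)
{
    size_t len = 0;
    int cur = 0;
    for (; *bits != '\0'; bits++) {
        if (*bits != '0' && *bits != '1')
            return -1;
        cur = nodes[cur].child[*bits == '1'];
        if (cur < 0)
            return -1;                       /* no code starts this way */
        if (nodes[cur].symbol != '\0') {     /* reached a leaf */
            if (len + 1 >= outSize)
                return -1;
            out[len++] = nodes[cur].symbol;
            cur = 0;                         /* restart at the root */
        }
    }
    if (cur != 0 || outSize == 0)
        return -1;                           /* ended in the middle of a code */
    out[len] = '\0';
    return 0;
}
```

    This works because no Huffman code is a prefix of another, so the first leaf reached is always the right letter. After calling buildDemoTree(), decoding the string from the first post should give back HELLO.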
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    6
    Rep Power
    0
    Thanks for the help and I'm new at C so I'm sorry that was so wrong.

    Yes I am trying to do what you stated. I have values for A-Z and when a user enters a word it will translate it as it did in the example of Hello. Then I need to print out the code back to the user.

    You suggest that I build a code table and subtract 'A' from each letter that is given. Do you mean that if the word was HELLO again, the program would do the following?
    I would take 'H' - 'A' = 7. Then any time I get the number 7, I translate that into the value using the code table? I'm still unsure how I would use the code table to translate it into the values I'm given. Can you give me an example of what you mean by a code table?
    Thanks a lot!
  6. #4
  7. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,145
    Rep Power
    2222
    Simple array indexing using character arithmetic to convert the letter to the index. Have you or have you not yet learned about arrays? You say you're new to C, but this doesn't look like a usual newbie project. I honestly don't know what you've learned so far and what you haven't, so I don't know what I can assume.

    Off the top of my head using bogus values:
    Code:
    const char *codeTable[] = {
        "001",    // code for 'A', array index = 0
        "0011",   // code for 'B', array index = 1
        "0001",   // code for 'C', array index = 2
        "00011",  // code for 'D', array index = 3
        "011",    // code for 'E', array index = 4
        // and so on and so on for all the letters up to 'Z'
    };
    That's the code table. It's an array of strings conceptually indexed by the letter whose code is the string at that place in the array. But an array of 26 elements is indexed by 0 to 25, not 65 to 90 ('A' == 65, 'Z' == 90). So to translate the letters to the indices, we need to subtract 65 (AKA 'A') from each letter to get the index.

    Therefore, print the encoded "bit stream" for "HELLO" with
    Code:
        printf("%s%s%s%s%s", codeTable['H'-'A'], codeTable['E'-'A'], 
                codeTable['L'-'A'], codeTable['L'-'A'], codeTable['O'-'A']);
    Of course, that code is very crude and unsuitable, but it illustrates translating the input letter to an index into the code table and using the code table.
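
    Since the table only covers 'A' to 'Z', it may be worth guarding the lookup so that a lowercase letter or digit cannot index outside the array. The sketch below is one way to do that. The function name lookupCode is hypothetical, the designated initializers need C99 or later, and only the four codes quoted in the first post are filled in; the other slots are left as NULL rather than guessed.

```c
#include <stddef.h>

/* Only the four codes quoted in this thread are filled in; every other
 * slot is NULL until the real values are added. */
static const char *const codeTable[26] = {
    ['E' - 'A'] = "011",
    ['H' - 'A'] = "1001",
    ['L' - 'A'] = "00001",
    ['O' - 'A'] = "0010"
};

/* Returns the code for an uppercase letter, or NULL if c is not 'A'..'Z'
 * or has no entry yet. lookupCode is a hypothetical helper name. */
const char *lookupCode(char c)
{
    if (c < 'A' || c > 'Z')
        return NULL;
    return codeTable[c - 'A'];
}
```

    The caller then checks for NULL before printing, instead of passing a bad pointer to printf.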
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    6
    Rep Power
    0
    Thank you lots! That makes complete sense to me!
    I don't know why I was having trouble understanding what you were referencing.

    Now I understand that in the example you used HELLO, but in my program the user's input could be anything. To handle that, is there a way to replace the H in codeTable['H'-'A'] with the first character of the input array? I don't know if you can do this in C, but could you write codeTable[userSubmit[i] - 'A'] and make i go from 1 to 100 in a loop to get all the letters?
    That could be completely wrong, but that's the only thing I could think of trying. Thanks a lot in advance!
  10. #6
  11. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,145
    Rep Power
    2222
    The user input will be in a string, so just start at the beginning of the string and process each character one at a time.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    6
    Rep Power
    0
    Originally Posted by dwise1_aol
    The user input will be in a string, so just start at the beginning of the string and process each character one at a time.
    Okay, perfect! So say the largest input a user can submit is a string of 100 characters. To print the codes for all 100 characters, would I make a while loop such as:

    while(stringOne[100] != \0)
    {

    printf("%s\n", codeTable[ ? - 'A']);

    }

    I'm not sure what I'd put inside the codeTable index, before the - 'A', to get all of the user's input letters.
    Thanks a lot!
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    6
    Rep Power
    0
    I tried to do a while loop:

    while(stringOne[100] != '\0')
    {
    for(i=0; i <=100; i++)
    {

    printf("%s\n", codeTable[stringOne[i] - 'A']);

    }
    }

    This prints out what I want, but it won't stop at the end and gives me a seg fault. How would I make the loop stop after the last letter?
  16. #9
  17. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2009
    Posts
    149
    Rep Power
    36
    Code:
    for(i=0; i <=100; i++)
    Array indexing starts at 0, not 1. Therefore, the last element in your stringOne array is stringOne[99]. You try to read past this with your incorrect loop condition.
  18. #10
  19. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    6
    Rep Power
    0
    Originally Posted by jakotheshadows
    Code:
    for(i=0; i <=100; i++)
    Array indexing starts at 0, not 1. Therefore, the last element in your stringOne array is stringOne[99]. You try to read past this with your incorrect loop condition.
    Ah yes, thank you! Now instead of printing a bunch of junk, it returns this:

    Please enter a string: HELLO
    1001
    011
    00001
    00001
    0010

    Segmentation fault: 11

    It is supposed to stop after the 0010. Any idea what I could have done?
  20. #11
  21. Contributing User
    Devshed Supreme Being (6500+ posts)

    Join Date
    Jan 2003
    Location
    USA
    Posts
    7,145
    Rep Power
    2222
    Review C-style strings. A string ends with '\0'. Please note the single quotation marks, because they need to be used.

    Your for-loop's conditional expression (the middle one) needs to include the test for the character in stringOne[i] not being equal to '\0'. That will make the code conversion stop at the end of the input string.

    Lose the while loop because it is not only unnecessary but also worse than useless.
    Code:
        // The last element of stringOne is stringOne[99]
        // That means that stringOne[100] does not exist.  More specifically
        //     it's the memory location immediately following the stringOne array.
        // Probability is about 255/256 (99.6%) that its value is non-zero, so it's
        //     very likely to be true.  And since we have no known means by which 
        //     to change the contents of that memory location, that condition will
        //     never become false, thus trapping your program in an infinite loop;
        //     Like with a black hole, nothing can escape from within an infinite loop.
        while(stringOne[100] != '\0')
        {
            // This for-loop will be run each and every time you loop through the 
            //     infinite loop above, so the input string will be processed over and 
            //     over and over again ad infinitum.
            //
            // The loop is set to iterate 101 times.  Proper notation for looping only 100
            //     times would be (i=0; i < 100; i++)
            //
            // To make the conversion stop at the end of the input string, which will most
            //     likely be far less than 100 characters long, you need to place the test
            //     for stringOne[i] != '\0' in the for-statement; compound conditional
            //     expression is allowed and encouraged.
            for(i=0; i <=100; i++)
            {
    
                printf("%s\n", codeTable[stringOne[i] - 'A']);
    
            }
        }
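
    Putting the advice above together, a corrected version might look like the sketch below: no outer while loop, and a for loop that stops at the terminating '\0'. With the fixed 100-character stringOne buffer, the equivalent compound condition would be i < 100 && stringOne[i] != '\0'. The function name encode, the output buffer (the stringTwo idea from the first post), and the sparse table holding only H, E, L, and O are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Only the four codes quoted in this thread; other slots stay NULL. */
static const char *const codeTable[26] = {
    ['E' - 'A'] = "011",
    ['H' - 'A'] = "1001",
    ['L' - 'A'] = "00001",
    ['O' - 'A'] = "0010"
};

/* Appends the code for each letter of in to out, stopping at the end of
 * the input string. Returns 0 on success, or -1 if a character has no
 * code or out is too small. encode is a hypothetical helper name. */
int encode(const char *in, char *out, size_t outSize)
{
    size_t used = 0;
    size_t i;

    if (outSize == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; in[i] != '\0'; i++) {      /* stop at the terminator */
        const char *code;
        size_t codeLen;

        if (in[i] < 'A' || in[i] > 'Z')
            return -1;                      /* would index outside the table */
        code = codeTable[in[i] - 'A'];
        if (code == NULL)
            return -1;                      /* no code defined for this letter */
        codeLen = strlen(code);
        if (used + codeLen >= outSize)
            return -1;                      /* no room for code plus '\0' */
        memcpy(out + used, code, codeLen + 1);  /* copies the '\0' too */
        used += codeLen;
    }
    return 0;
}
```

    Reading the input with scanf("%99s", stringOne) instead of a bare %s keeps the user from overflowing the 100-character buffer, and the result of encode(stringOne, stringTwo, sizeof stringTwo) can be checked before printing stringTwo.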

    Comments on this post

    • jakotheshadows agrees : Yup.
