Discuss String to number conversion in the C Programming forum on Dev Shed.
Posts: 6
Time spent in forums: 2 h 37 m 46 sec
Reputation Power: 0
String to number conversion
Hey guys. This is my first time on the site and I'm just wondering if you could help me understand how to approach a certain piece of code.
I'm trying to get a user to enter a word as a string. After they enter the word, I'm trying to convert each letter to a predetermined code. For example:
If the user enters HELLO, it would print out the following:
100101100001000010010
if H = 1001, E = 011, L = 00001, O = 0010
I separated them with bold to show each letter (1001 011 00001 00001 0010). It's called Huffman encoding.
How would I go about doing this?
I tried to write a for loop that goes through the string, checks each letter with an if statement, and stores that letter's code in a different string.
int i;
char stringOne[100], stringTwo[100];
printf("Please enter a string: ");
scanf(" %s", stringOne);
for(i = 0; i < 100; i++)
{
if(stringOne[i] = 'A')
stringTwo[i] = '0001';
}
That's what I tried.
Sorry if that was very confusing but thanks in advance for the help!
Posts: 6,141
Time spent in forums: 2 Months 2 Weeks 3 Days 23 h 52 m 21 sec
Reputation Power: 1974
stringTwo[i] = '0001';
That is so wrong in so many ways.
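'0001' does not mean what you want it to mean in C. Single quotes are for a single character. '0001' is a multi-character constant whose int value is implementation-defined, so it cannot hold a four-digit code in one char!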
What is it that you really want to do? Do you want to take a string, "HELLO", and use it to generate another string, "100101100001000010010"? Then if you need to generate an actual bit stream, you can easily translate that "binary string" into an actual bit stream.
If that is not what you actually want to do, then I'm at a loss because I can't think of anything else you could actually be trying to do.
Apparently, you have the Huffman encodings for the letters A-Z already. When I worked that project in school in 1981, we were just given a text file. We had to do a frequency count of all its characters, generate our own Huffman code table from those counts, and then use that table to "send" and "receive", encoding and decoding in the process.
So, create a code table, which would be an array of strings indexed by the letter to be encoded. I did it in Pascal, which has no problem defining different index ranges, but in C you are stuck with starting at zero. So, assuming all capital letters, take the next letter to be encoded, subtract 'A' from it, and use that to index into the table to get the Huffman "bit string" for that letter. In case it's a new concept for you, you can do arithmetic with chars since their values are their character codes (ASCII on most systems); e.g., 'A' - 'A' = 0, 'C' - 'A' = 2, etc.
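As a minimal sketch of that character arithmetic (letter_index is a made-up helper name, not something from the assignment, and it assumes an ASCII-compatible character set where 'A' through 'Z' are contiguous):

```c
/* Hypothetical helper: maps 'A'..'Z' to the array index 0..25.
   Returns -1 for anything that is not a capital letter, so the
   caller never indexes outside a 26-element table.
   Assumes the capital letters are contiguous, as they are in ASCII. */
int letter_index(char c)
{
    if (c < 'A' || c > 'Z')
        return -1;
    return c - 'A';
}
```

With this, letter_index('H') is 7, which is the position the code for 'H' would occupy in a 26-entry table.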
Decoding with just that table is doable, but cumbersome. In my project, I used the table to construct a sort tree and then simply used the incoming "bit stream" to traverse the tree until I hit a terminal node, whereupon I read off the character there. If this is part of your project, work out a scheme that you would find easier to implement.
While you could do all the encoding and decoding in C code, that would be very long and cumbersome to write and to maintain. It's better to create data structures that contain the encodings and that you can use with much less C code.
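To make the tree idea for decoding a bit more concrete, here is a rough sketch, not the code from that 1981 project. It assumes the codes are strings of '0' and '1' characters and that no code is a prefix of another. All the names (struct node, tree_reset, tree_insert, tree_decode) and the fixed pool size are invented for this illustration.

```c
#include <stddef.h>

/* One node of the decoding tree. child[0] is followed on a '0',
   child[1] on a '1'. A node that ends a code stores its letter. */
struct node {
    int child[2];   /* indices into pool, -1 when there is no child */
    char symbol;    /* '\0' for interior nodes */
};

#define MAX_NODES 256   /* assumed to be enough for 26 short codes */

static struct node pool[MAX_NODES];
static int node_count;

static int new_node(void)
{
    if (node_count >= MAX_NODES)
        return -1;
    pool[node_count].child[0] = -1;
    pool[node_count].child[1] = -1;
    pool[node_count].symbol = '\0';
    return node_count++;
}

/* Must be called once before inserting: creates the root at index 0. */
void tree_reset(void)
{
    node_count = 0;
    new_node();
}

/* Adds one letter and its code to the tree. Returns 0 on success,
   -1 on a bad character in the code or a full pool. */
int tree_insert(char symbol, const char *code)
{
    int cur = 0;
    for (; *code != '\0'; code++) {
        int bit;
        if (*code != '0' && *code != '1')
            return -1;
        bit = *code - '0';
        if (pool[cur].child[bit] < 0) {
            int n = new_node();
            if (n < 0)
                return -1;
            pool[cur].child[bit] = n;
        }
        cur = pool[cur].child[bit];
    }
    pool[cur].symbol = symbol;
    return 0;
}

/* Walks the tree one bit at a time. Each time a node with a letter is
   reached, the letter is written to out and the walk restarts at the
   root. Returns the number of letters decoded, or -1 on invalid input,
   an incomplete final code, or an output buffer that is too small. */
int tree_decode(const char *bits, char *out, size_t outsize)
{
    size_t n = 0;
    int cur = 0;
    if (outsize == 0)
        return -1;
    for (; *bits != '\0'; bits++) {
        if (*bits != '0' && *bits != '1')
            return -1;
        cur = pool[cur].child[*bits - '0'];
        if (cur < 0)
            return -1;
        if (pool[cur].symbol != '\0') {
            if (n + 1 >= outsize)
                return -1;
            out[n++] = pool[cur].symbol;
            cur = 0;
        }
    }
    out[n] = '\0';
    return cur == 0 ? (int)n : -1;
}
```

After tree_reset() and one tree_insert() per letter, using the codes from the first post (H = 1001, E = 011, L = 00001, O = 0010), tree_decode("100101100001000010010", out, sizeof out) fills out with "HELLO". The array-based pool is only there to keep the sketch free of malloc; a pointer-based tree would work just as well.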
Posts: 6
Time spent in forums: 2 h 37 m 46 sec
Reputation Power: 0
Thanks for the help. I'm new at C, so I'm sorry that was so wrong.
Yes I am trying to do what you stated. I have values for A-Z and when a user enters a word it will translate it as it did in the example of Hello. Then I need to print out the code back to the user.
You suggest that I build a code table and subtract A from each letter that is given. Do you mean if the word was again HELLO I would make the program do the following:
I would take 'H' - 'A' = 7. Then anytime I get the number 7, I translate that into the value using the code table? I'm still unsure how I would use the code table to translate it into the values I'm given. Can you give me an example of what you mean by a code table?
Thanks a lot!
Posts: 6,141
Time spent in forums: 2 Months 2 Weeks 3 Days 23 h 52 m 21 sec
Reputation Power: 1974
Simple array indexing using character arithmetic to convert the letter to the index. Have you or have you not yet learned about arrays? You say you're new to C, but this doesn't look like a usual newbie project. I honestly don't know what you've learned so far and what you haven't, so I don't know what I can assume.
Off the top of my head using bogus values:
Code:
char *codeTable[] = {
"001", // code for 'A', array index = 0
"0011", // code for 'B', array index = 1
"0001", // code for 'C', array index = 2
"00011", // code for 'D', array index = 3
"011", // code for 'E', array index = 4
// and so on and so on for all the letters up to 'Z'
};
That's the code table. It's an array of strings conceptually indexed by the letter whose code is the string at that place in the array. But an array of 26 elements is indexed by 0 to 25, not 65 to 90 ('A' == 65, 'Z' == 90). So to translate the letters to the indices, we need to subtract 65 (AKA 'A') from each letter to get the index.
Therefore, print the encoded "bit stream" for "HELLO" with
Of course, that code is very crude and unsuitable, but it illustrates translating the input letter to an index into the code table and using the code table.
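A sketch of what such a hardcoded lookup might look like, using the four codes given in the first post (H = 1001, E = 011, L = 00001, O = 0010). The function name hello_bits is made up for this example, the table is only partly filled in, and writing into a buffer with snprintf instead of printing directly is just a choice made here so the result can be checked.

```c
#include <stdio.h>

/* Only the four letters from the HELLO example are filled in; every
   other entry is left as NULL. Uses C99 designated initializers so
   each code sits at index (letter - 'A'). */
static const char *codeTable[26] = {
    ['E' - 'A'] = "011",
    ['H' - 'A'] = "1001",
    ['L' - 'A'] = "00001",
    ['O' - 'A'] = "0010",
};

/* Crude on purpose: the letters are hardcoded, one table lookup each.
   Writes the concatenated codes into buf and returns what snprintf
   returns (the length the full string needs). */
int hello_bits(char *buf, size_t size)
{
    return snprintf(buf, size, "%s%s%s%s%s",
                    codeTable['H' - 'A'],
                    codeTable['E' - 'A'],
                    codeTable['L' - 'A'],
                    codeTable['L' - 'A'],
                    codeTable['O' - 'A']);
}
```

Calling hello_bits(buf, sizeof buf) with a 64-byte buf and then puts(buf) prints 100101100001000010010, the same string as in the first post.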
Posts: 6
Time spent in forums: 2 h 37 m 46 sec
Reputation Power: 0
Thank you lots! That makes complete sense to me!
I don't know why I was having trouble understanding what you were referencing.
Now, I understand that your example used HELLO, but in my program the input will be whatever the user types. To handle arbitrary input, is there a way I can replace the H in codeTable['H'-'A'] with the first character of the input array? I don't know if this is possible in C, but could you write codeTable[userSubmit[i] - 'A'] and make i go from 1 to 100 in a loop to get all the letters?
That could be completely wrong, but it's the only thing I could think of trying. Thanks a lot in advance!
Posts: 6
Time spent in forums: 2 h 37 m 46 sec
Reputation Power: 0
Quote:
Originally Posted by dwise1_aol
The user input will be in a string, so just start at the beginning of the string and process each character one at a time.
Okay, perfect! So say the largest input a user can submit is a string of 100 characters. To print the codes for all 100 characters, would I write a while loop such as:
while(stringOne[100] != \0)
{
printf("%s\n", codeTable[ ? - 'A']);
}
I'm not sure what I'd put inside the codeTable index, before the - 'A', to get all of the user's input letters.
Thanks a lot!
Posts: 149
Time spent in forums: 3 Days 12 h 1 m 16 sec
Reputation Power: 35
Code:
for(i=0; i <=100; i++)
Array indexing starts at 0, not 1. Therefore, the last element in your stringOne array is 99. You try to read past this with your incorrect loop condition.
Posts: 6
Time spent in forums: 2 h 37 m 46 sec
Reputation Power: 0
Quote:
Originally Posted by jakotheshadows
Code:
for(i=0; i <=100; i++)
Array indexing starts at 0, not 1. Therefore, the last element in your stringOne array is 99. You try to read past this with your incorrect loop condition.
Ah yes, thank you! Now instead of a bunch of junk, it returns this:
Please enter a string: HELLO
1001
011
00001
00001
0010
Segmentation fault: 11
It is supposed to stop after the 0010. Any idea what I could have done?
Posts: 6,141
Time spent in forums: 2 Months 2 Weeks 3 Days 23 h 52 m 21 sec
Reputation Power: 1974
Review C-style strings. A string ends with '\0'. Please note the single quotation marks, because they need to be used.
Your for-loop's conditional expression (the middle one) needs to include the test for the character in stringOne[i] not being equal to '\0'. That will make the code conversion stop at the end of the input string.
Lose the while loop because it is not only unnecessary but also worse than useless.
Code:
// The last element of stringOne is stringOne[99]
// That means that stringOne[100] does not exist. More specifically
// it's the memory location immediately following the stringOne array.
// Probability is about 255/256 (99.6%) that its value is non-zero, so it's
// very likely to be true. And since we have no known means by which
// to change the contents of that memory location, that condition will
// never become false, thus trapping your program in an infinite loop;
// Like with a black hole, nothing can escape from within an infinite loop.
while(stringOne[100] != '\0')
{
// This for-loop will be run each and every time you loop through the
// infinite loop above, so the input string will be processed over and
// over and over again ad infinitum.
//
// The loop is set to iterate 101 times. Proper notation for looping only 100
// times would be (i=0; i < 100; i++)
//
// To make the conversion stop at the end of the input string, which will most
// likely be far less than 100 characters long, you need to place the test
// for stringOne[i] != '\0' in the for-statement; compound conditional
// expression is allowed and encouraged.
for(i=0; i <=100; i++)
{
printf("%s\n", codeTable[stringOne[i] - 'A']);
}
}
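Putting the comments above together, one possible corrected version drops the while loop and lets a single for-loop stop at the terminating '\0'. This is only a sketch under a few assumptions: encode_word is a made-up name, the table holds just the four codes from the HELLO example (the other letters are left as NULL), and the codes are collected into an output buffer rather than printed one per line.

```c
#include <stddef.h>
#include <string.h>

/* Partial table: only the letters from the HELLO example are set. */
static const char *codeTable[26] = {
    ['E' - 'A'] = "011",
    ['H' - 'A'] = "1001",
    ['L' - 'A'] = "00001",
    ['O' - 'A'] = "0010",
    /* the remaining letters would be filled in the same way */
};

/* Appends the code for each letter of word to out.
   Returns 0 on success, -1 if a character is not 'A'..'Z', has no
   code in the table, or the result does not fit in out. */
int encode_word(const char *word, char *out, size_t outsize)
{
    size_t used = 0;
    size_t i;

    if (outsize == 0)
        return -1;
    out[0] = '\0';

    /* The loop ends at the string terminator, not at a fixed count. */
    for (i = 0; word[i] != '\0'; i++) {
        const char *code;
        size_t len;

        if (word[i] < 'A' || word[i] > 'Z')
            return -1;              /* would index outside the table */
        code = codeTable[word[i] - 'A'];
        if (code == NULL)
            return -1;              /* no code defined for this letter */
        len = strlen(code);
        if (used + len >= outsize)
            return -1;              /* no room for code plus '\0' */
        memcpy(out + used, code, len + 1);
        used += len;
    }
    return 0;
}
```

For input, scanf("%99s", stringOne) keeps the read inside a 100-element stringOne, and encode_word(stringOne, bits, sizeof bits) then produces the whole bit string in one call, provided bits is large enough for the longest code times 99 letters plus the terminator. The range check matters because a lowercase letter or a digit minus 'A' would index outside the table.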