#1
How to remove the null terminator from a string?
How to remove a null-byte from a string?
#2
Well, just find it and then replace it with some other value. But this seems like a bad thing to do in almost all cases when working with C-style strings, as all functions expecting a null terminator will run past the bounds and all sorts of bad things could happen. Why would you possibly want to do this?
#3
Perhaps you should explain what exactly you are trying to accomplish. All strings in C are null-terminated; "removing" the null terminator (whatever you mean by that) would make it a non-string.
#4
Well, it's hard to explain, but I really need to take away the null byte so I can pass the string as an array of chars to a function to do other jobs.
#5
The purpose is to make it a non-string:
I'm taking in command-line args, which are strings stored in an array of arrays of chars. I want to use them as individual chars without the null byte appended.
#6
How will you tell that function how many characters are in that non-string?
Consider that the strlen() function returns the number of characters in a null-terminated string, excluding the null terminator. If you need to copy a string into another buffer and exclude the null terminator (basically what you think you want to do), then simply copy strlen() characters; a for-loop would do it.
PS: Unless you can offer us a compelling reason for chucking the null terminator, I still say you don't really want to do it.
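For illustration, the for-loop copy described above might look like the following minimal sketch. The function name copy_without_terminator and its parameters are made up for this example; they are not from any library. It returns the number of characters copied, because without the terminator the caller has to carry the length separately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the characters of the C string `src` into
 * `dst` WITHOUT the terminating '\0'. Copies at most `dst_size` characters
 * and returns how many were copied, so the caller knows the length. */
size_t copy_without_terminator(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t n = strlen(src);      /* length, excluding the terminator */
    if (n > dst_size) {
        n = dst_size;            /* truncate rather than overflow dst */
    }
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];         /* the '\0' at src[n] is never copied */
    }
    return n;
}
```

The result in dst is not a string any more, so it must not be handed to strlen(), printf("%s") or similar functions; only the returned count says where the data ends.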
#7
Your question still isn't making much sense (to me)... maybe provide some code as an example?
#8
If you really want to make it a non-string, then that raises the question: what do you want to turn it into?
I think your problem is probably based on some misunderstanding of how C represents data. The more information you can provide, the sooner we can clear that up.
#9
There's no good reason to "get rid of" the null byte. I presume, that is, that you aren't THAT cramped for memory.
There's never any need to process the nul byte. You can process strlen characters, or you can process and stop when you reach the nul byte. The same is true for any functions to which you might pass a pointer to the array. You can also pass some length to those functions which they can use to process a specific number of bytes. You cannot pass arrays, in any case. You can only pass a pointer to some position within them, typically the first. In C you have to treat arrays one element at a time. It seems clear that you have yet to grasp this concept.
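As a rough sketch of the two approaches just described (process a given number of characters, or process until the nul byte), here are two small functions. The names count_in_range and count_in_string are invented for this example. Both receive only a pointer to the first element, never the array itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: count occurrences of `c` in the first `len`
 * characters starting at `p`. The nul byte plays no role here. */
size_t count_in_range(const char *p, size_t len, char c)
{
    assert(p != NULL);

    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (p[i] == c) {
            count++;
        }
    }
    return count;
}

/* Hypothetical example: same job, but stop at the nul byte instead of
 * taking a length. The nul byte is tested, never processed. */
size_t count_in_string(const char *s, char c)
{
    assert(s != NULL);

    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (*s == c) {
            count++;
        }
    }
    return count;
}
```

Calling count_in_range(argv[1], strlen(argv[1]), 'a') and count_in_string(argv[1], 'a') should give the same answer; in neither case does anything need to be removed from the string first.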
#10
The string exists in memory as an array of characters. You cannot "remove" the null terminator as such; it is merely a matter of interpretation. For example:
{'h', 'e', 'l', 'l', 'o', '\0'} could be interpreted as a C string occupying 6 characters or a char array of 5 characters (or 6 for that matter, or fewer - who cares!?). If you tell the function you pass the pointer to that there are 2 characters then it is a two character array with some junk following.
You can't have 'nothing' following the string, since it exists in memory. You can merely have stuff that you will choose to ignore (or indeed stuff that belongs to something else - adjacent variables, or the call stack for example). The nul termination is merely a convention to tell functions that handle C strings when to start ignoring without explicitly passing a length.
Compare for example strcpy() with memcpy(): they perform essentially the same function, except one uses the nul terminator, while the other requires you to tell it how many bytes to process.
Incidentally there are some very strong arguments to avoid C string handling in some cases (http://www.joelonsoftware.com/artic...0000000319.html) but your question indicates a lack of understanding of the nature of C strings.
Clifford
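To make the strcpy()/memcpy() comparison concrete, here is a small sketch built on the same six-byte array. The function copy_both_ways and its parameter names are invented for this example; strlen(), strcpy() and memcpy() are the standard <string.h> functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The same six bytes: a 5-character C string, or a 6-element char array. */
static const char hello[] = {'h', 'e', 'l', 'l', 'o', '\0'};

/* Hypothetical demo: copy `hello` once as a string and once as raw bytes.
 * `by_strcpy` must have room for 6 chars; `by_memcpy` for at least `n`. */
void copy_both_ways(char *by_strcpy, char *by_memcpy, size_t n)
{
    assert(by_strcpy != NULL && by_memcpy != NULL);
    assert(n <= sizeof hello);

    strcpy(by_strcpy, hello);      /* copies up to and including the '\0' */
    memcpy(by_memcpy, hello, n);   /* copies exactly n bytes, adds nothing */
}
```

Here strlen(hello) is 5 while sizeof hello is 6. With n set to 2, the memcpy() destination holds 'h' and 'e' followed by whatever was already in that memory, which is the "two character array with some junk following" case.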
#11
How about a way to do what you want - reference each character in an array:
Code:
#include <stdio.h>

/* Print each character of a nul-terminated string, one per line. */
void foo(char *array)
{
    char *p = NULL;
    int i = 0;

    for (p = array; *p; p++) /* the *p test fails for the nul byte */
    {
        printf("Element number %d: %c\n", i++, *p);
    }
}
Note the nul byte usage.
#12
Assuming that the OP just doesn't fully understand why the null-terminator is there:
The representation of strings is a basic problem for which there are several solutions. I don't remember how we did it in FORTRAN or in PL/I (that was 30 years ago, after all).
In BASIC, a string was represented internally (we did not have direct access to this) as a zero-based array of characters (most BASIC arrays were 1-based, or else you could select whether they'd be zero- or one-based). The string itself would start at index 1, while the string length would be at index 0. Since these were 8-bit bytes, that meant that strings' lengths could only range from 0 to 255 characters; it was impossible to have any strings that were longer, barring any work-arounds that might have existed.
Standard Pascal didn't have any built-in support for strings, so they were handled as character arrays. I'd have to go back and review how we did it with the Australian-Japanese compiler we had smuggled in back in 1979, but needless to say, since every compiler designer extended the language to be able to do useful stuff, they also all created their own string types. In Turbo Pascal's extension of the language, they adopted the BASIC approach, which similarly restricted string length to not exceed 255.
MS-DOS took a different approach which theoretically placed no restriction on the maximum length of a string: terminate the sequence of characters with a special character that will never appear within the string. There were two such possible terminators and specific functions required specific ones (there may have been a pattern to which required which, but it's been too long): ASCII$, which terminated with a dollar sign, and ASCIIZ, which terminated with a zero.
C, which predated MS-DOS, is another language that has no built-in support for strings, so we've adopted an ASCIIZ convention in which a string is a char array of virtually limitless length (i.e., it does not have an arbitrarily-set limit to the length) and which is terminated by a zero byte, AKA "null-terminated".
Again, it is the null-terminator that determines where the end of the string is. And, based on that convention, we have developed a number of standard library functions for operating on those strings, all of which depend very heavily on that null-terminator. And every other function that does anything with strings -- from the standard library, or from third-party libraries, or from user programs -- depends completely on that null-terminator.
Now, dBase III in its character fields follows this convention:
1. A string in a data field can be no longer than the length assigned to that field.
2. If a string is shorter, then it is padded to the right with spaces (0x20).
If you're inserting a C-style string into a dBase character field, you would start off by knowing how long the data field is. You would then copy all the characters in the string into the data field up to the length of the field. If you reach the null-terminator before reaching the end of the field, then you would write spaces into the rest of the field. Similarly, when copying a character data field into a C-style string, you would copy all the characters over, then you would trim the trailing spaces by overwriting the first one with a null-terminator.
The C++ basic string class hides the null-terminator from you, except when you expose the char array with the .c_str() method.
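The two copy directions described for a space-padded, fixed-length field could be sketched roughly as follows. This is not dBase code; the function names field_from_cstring and cstring_from_field are made up for the example, and the field is just a plain char buffer of known length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill a fixed-length field from a C string.
 * Copies up to `field_len` characters, then pads the rest with spaces.
 * No '\0' is ever written into the field. */
void field_from_cstring(char *field, size_t field_len, const char *s)
{
    assert(field != NULL && s != NULL);

    size_t i = 0;
    while (i < field_len && s[i] != '\0') {
        field[i] = s[i];
        i++;
    }
    while (i < field_len) {
        field[i++] = ' ';          /* pad to the right with 0x20 */
    }
}

/* Hypothetical helper: turn a space-padded field back into a C string.
 * `dst` must have room for field_len + 1 chars. Trailing spaces are
 * trimmed by putting the '\0' where the padding starts. */
void cstring_from_field(char *dst, const char *field, size_t field_len)
{
    assert(dst != NULL && field != NULL);

    memcpy(dst, field, field_len);
    size_t end = field_len;
    while (end > 0 && dst[end - 1] == ' ') {
        end--;
    }
    dst[end] = '\0';
}
```

One consequence of this convention is that a value that genuinely ends in spaces cannot be told apart from padding, so the round trip drops those spaces.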
#13
It has to be remembered, as others have pointed out, that strings in C are more of a programming convention than a part of the language. The only place where the core language supports them is in string literals; all other support (e.g., I/O functions, string functions) is library code.
The question of what you need to do is still relevant; it may be possible to avoid the problem entirely. What is the underlying problem you are trying to solve - what does the program (or at least this part of it) do? How is the function you are calling using the char array (or the pointer to it that is actually getting passed), and is it a function you wrote or a library function of some sort (i.e., can you change it if you need to)?
If you can change the function in question, the solution is simple: use the null-terminator to determine the end of the data rather than whatever approach you have been using. You would know better than us if this is practical or not.
If you can't change the function, then you may be able to change the array. If the function requires a fixed size array, and the data will always fit that size, then the answer is simple: make the array one larger than required, leaving the null terminator out of the area being used. Similarly, if the function takes a size as an argument, you can make sure that the array being passed is actually longer than the given size. If the array needs to be padded in some way with something other than the zero delimiter, then you should be able to append the extra padding as part of the string - making sure, of course, that the underlying array is large enough to hold the data, padding and zero delimiter.
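As one possible reading of the "one larger than required" suggestion, here is a minimal sketch. Everything in it is hypothetical: NAME_WIDTH, sum_fixed_width (a stand-in for a function you cannot change, which reads exactly NAME_WIDTH chars) and sum_padded_name are invented for illustration. Only snprintf() is a real library call.

```c
#include <assert.h>
#include <stdio.h>

#define NAME_WIDTH 8   /* assumed fixed size the callee expects */

/* Stand-in for a function you can't change: it reads exactly NAME_WIDTH
 * chars from `data` and never looks at anything after them. */
unsigned sum_fixed_width(const char *data)
{
    assert(data != NULL);

    unsigned sum = 0;
    for (int i = 0; i < NAME_WIDTH; i++) {
        sum += (unsigned char)data[i];
    }
    return sum;
}

/* Build the fixed-width data in an array one larger than required, so the
 * null terminator sits just outside the area the callee uses. */
unsigned sum_padded_name(const char *name)
{
    assert(name != NULL);

    char buf[NAME_WIDTH + 1];
    /* '-' left-justifies, the width pads with spaces, and the precision
     * truncates, so buf[0..NAME_WIDTH-1] is always fully populated and
     * snprintf puts the '\0' in buf[NAME_WIDTH]. */
    snprintf(buf, sizeof buf, "%-*.*s", NAME_WIDTH, NAME_WIDTH, name);
    return sum_fixed_width(buf);
}
```

The buffer stays a valid C string for the caller's own use, while the callee only ever sees the first NAME_WIDTH characters.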
Last edited by Schol-R-LEA : April 28th, 2008 at 06:01 PM.