#16
C/C++ jargon: undefined, unspecified, and implementation defined behaviour
[If/when I get around to setting up a web page, I'll place the text here into that, and this message will simply contain a link.]
This post is concerned with describing some common terms used by the C/C++ community. The meanings are firmly enshrined in the C and C++ standards, and are relevant to those of us who are keen to ensure our C or C++ programs run correctly on a range of target platforms, when compiled with different compilers.

Native English speakers, and even Americans (who speak one of the strangest dialects of English), often bang their desk and insist these definitions are wrong, often bringing out a conventional dictionary to support their argument. That's as may be: these definitions are relevant only to those of us who write in C or C++, not to any other English speaker. If you don't wish to program in C or C++, you do not need to use these definitions. But if you do, and you don't wish to have C/C++ gurus accuse you of not understanding what you're talking about, then you had better understand these terms.

One thing to remember is that some companies, when recruiting C/C++ programmers, ask questions designed to test your understanding of these concepts and ability to grasp their significance. Why do they do this? Because they care about writing portable and correct code, and a programmer who doesn't understand these terms probably has little experience in writing code portably and correctly.

I recommend you do not debate these definitions with your loved ones, unless they are C/C++ programmers or are very tolerant of your foibles (or both). But understanding these terms is a basic step in becoming a professional C/C++ software engineer (or geek to the masses) rather than a clueless hacker.

I would assume the phenomenon of C/C++ speak vs the spoken language occurs with other spoken languages (French, Danish, etc). I pick on English because I speak it.

OK; on with the definitions. These are lifted straight from the C and C++ standards (in all versions to date). They are interrelated.
Quote:
What does this mean? It basically means that the standards describe code as resulting in undefined behaviour if there is no limit on what happens when the code is compiled and executed. Two common examples are shown here ... PHP Code:
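A minimal sketch of two such constructs, assuming the usual textbook cases: writing through a null pointer and indexing past the end of an array. The helper names are hypothetical, and the offending statements are left as comments so that the sketch itself stays well defined:

```c
#include <stddef.h>

/* Undefined behaviour, case 1: dereferencing a null pointer.
 *     int *p = NULL;
 *     *p = 42;
 * A defined alternative: test the pointer before using it. */
int store_checked(int *p, int value)
{
    if (p == NULL)
        return 0;          /* nothing stored */
    *p = value;
    return 1;
}

/* Undefined behaviour, case 2: indexing outside the array.
 *     int a[3];
 *     a[3] = 42;          (valid indices are 0, 1 and 2)
 * A defined alternative: test the index against the element count. */
int set_checked(int *a, size_t count, size_t index, int value)
{
    if (a == NULL || index >= count)
        return 0;          /* nothing stored */
    a[index] = value;
    return 1;
}
```

The checks cost a comparison each, which is exactly the kind of overhead the standards decline to force on every program.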
In practice, compilers are often silent when code does this, and what often happens is an abnormal program termination (e.g. a core dump under Unix, a general protection fault under Windows). However, such a program crash is only one possibility. In theory, your computer could also give you, the programmer, an electric shock when it executes this code (it is perhaps unfortunate that undefined behaviour does not result in a programmer receiving electric shocks: undefined behaviour is one of the largest causes of unexplained software bugs and a significant proportion are caused by programmer carelessness).

Why do the standards allow such things? The above examples are constructed so it is obvious what is happening. However, in complex code where variables are shared or passed between functions, undefined behaviour is technically VERY difficult (and in some cases, where data may be read from files at run time, impossible) to detect. A compiler (or run time environment) that could detect all instances of undefined behaviour, in general, would be both expensive and slow. Programmers tend to insist on things like inexpensive compilers, fast compile times, minimal memory footprints, and fast execution of their code. Detecting all cases of undefined behaviour would compromise those things.

Most real-world examples of undefined behaviour encountered in practice are some sort of problem with pointers. Have a look at Dawei's page on pointers (a link is in an earlier post in this thread) for more details of common mistakes people make with pointers.
Quote:
Unspecified behaviour is like undefined behaviour, except that the standards impose some constraints on what is allowed to happen. However, those constraints do not result in only one possible action. An obvious example, with f(), g(), h(), and i() being functions, is the call PHP Code:
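The call in question has the shape f(g(), h(), i()). A minimal sketch, with hypothetical function bodies that announce themselves so the order becomes visible at run time:

```c
#include <stdio.h>

/* Each function reports that it ran, then returns a fixed value. */
static int g(void) { puts("g called"); return 1; }
static int h(void) { puts("h called"); return 2; }
static int i(void) { puts("i called"); return 3; }

/* Combines its arguments by position, so the result does not depend
   on the order in which the arguments were evaluated. */
static int f(int a, int b, int c) { return 100 * a + 10 * b + c; }

/* All three calls happen before f runs, but in an unspecified order. */
int call_unordered(void)
{
    return f(g(), h(), i());
}
```

The three lines of output may appear in any order, and the order may differ between compilers or optimisation levels; the value returned does not depend on it.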
In this example, the standard requires that g(), h(), and i() will be called and their return values then passed to f(). However, the standard does NOT specify the order in which g(), h(), and i() will be called, and that is the reason this call yields unspecified behaviour. Some compilers will call g() first and i() last. Some compilers do it in reverse order. In theory, a really smart compiler could do all three calls concurrently, by executing each on a different CPU (provided the result cannot be told apart from calling them one after another in some order).

The reason for giving such freedom to compiler writers is to allow performance optimisations on a range of target operating systems and hardware. In other words, it allows the compiler writer to decide the best order to do things.

Note: in this example, if one cares about the order in which g(), h(), and i() are called, one can do:
PHP Code:
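A minimal sketch of that rewrite, assuming the usual approach of storing each result in its own variable first. The logging helper and the function bodies are hypothetical, added only so the order can be checked:

```c
#include <string.h>

static char order[4];      /* call log; the last byte always stays '\0' */
static size_t calls;

/* Appends the caller's name to the log and passes the value through. */
static int record(char name, int value)
{
    if (calls < sizeof order - 1)
        order[calls++] = name;
    return value;
}

static int g(void) { return record('g', 1); }
static int h(void) { return record('h', 2); }
static int i(void) { return record('i', 3); }
static int f(int a, int b, int c) { return 100 * a + 10 * b + c; }

/* Calls g, h and i in exactly that order, then hands the results to f. */
int call_in_order(void)
{
    int a, b, c;
    memset(order, 0, sizeof order);
    calls = 0;
    a = g();               /* each full statement completes before the next */
    b = h();
    c = i();
    return f(a, b, c);
}

/* Returns the names of the functions in the order they actually ran. */
const char *call_order(void) { return order; }
```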
as the compiler is required to not reorder the function calls in this case. The code does not rely on any form of unspecified behaviour to work correctly.
Quote:
Common examples of implementation defined behaviour include the bit layout of integer and floating point types, and how operations like addition are implemented. These are things the implementation (i.e. compiler and libraries) is required to define and document, but different implementations are allowed to do these things differently. sizeof(int) is an example of an implementation defined value. With older 16 bit compilers, sizeof(int) was often 2. With modern 32-bit compilers, sizeof(int) is often 4.

Why is all the stuff above relevant? A lot of questions in various C and C++ bulletin boards (including this one) amount to throwing a bit of code on the table and asking why it doesn't do what the person asking the question thinks should have happened. In the vast majority of cases, the cause of a program crash is some form of undefined behaviour, so the quick, and correct, answer is usually to highlight the offending line and make some statement that this is undefined behaviour. The main reason I wrote this post is that a common counter-response to this is "huh", followed by wasted time/bandwidth expanding on the explanation.

I have also seen cases where people insist "unspecified behaviour" is actually "undefined behaviour". That may be literally true in the English language, but the language of C and C++ is subtly different, and insisting on it basically shows the person involved as being more interested in proving they are "right" than in understanding the message that people are giving them by choosing one word rather than another.

Keep in mind that professional C and C++ programmers who happen to be native English speakers speak about C and C++ in a language that is not actually English, despite having a lot in common with English. This is simply professional jargon. If you want to offer proof that you don't understand C or C++, you will speak only in English!!

If you want something more concise than this post, have a look here (link courtesy of nnxion).
For an example of the type of debate that really "inspired" this post, look here (link provided by InfoGeek)
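As a footnote to the implementation defined values mentioned above, here is a minimal sketch (the function names are hypothetical) that reports what a particular compiler chose for int:

```c
#include <limits.h>
#include <stdio.h>

/* Width of int's object representation, in bits, on this implementation. */
unsigned long int_bits(void)
{
    return (unsigned long)(sizeof(int) * CHAR_BIT);
}

/* Prints the choices this particular implementation made. */
void print_int_choices(void)
{
    printf("CHAR_BIT    = %d\n", CHAR_BIT);
    printf("sizeof(int) = %lu\n", (unsigned long)sizeof(int));
    printf("INT_MAX     = %d\n", INT_MAX);
}
```

The standards only guarantee lower bounds here (CHAR_BIT is at least 8, INT_MAX is at least 32767); the actual values are whatever the implementation documents.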
Last edited by grumpy : May 21st, 2005 at 09:42 PM.
#17
Not to belabor the point, but each specialty has its own vocabulary, often using words and phrases that have very distinct meanings within the specialty but just as often fuzzy, indistinct and/or contradictory meanings outside. As I developed expertise in the specialty of information science (of which programming in C/C++ is but a small part) I found reading the 'Jargon Dictionary' to be a very useful apportionment of my time. I suggest that any newbies here (indeed, anyone who has not yet made the time to read such a dictionary) would also find reading the dictionary time well spent. Below are a few of the probably hundreds of links to various editions of 'The' Jargon Dictionary; a google of 'jargon dictionary' will turn up thousands more. Of course not all are IT related: we aren't the only ones who opacify language (try reading some of the MBA jargon if you want a headache).
http://www.catb.org/~esr/jargon/
http://www.eps.mcgill.ca/jargon/jargon.html <-- all in a single HTML page, not for those of you on dialup!
http://www.lysator.liu.se/hackdict/...main_index.html
http://www.science.uva.nl/~mes/jargon/
#18
If you want to know how to make viruses, trojans or anything of that nature, DON'T ask in this forum. Search google or something.
#19
MANUALS
C
* http://en.wikibooks.org/wiki/Programming:C_contents
* http://www.cprogramming.com
* http://www.silicontao.com/ProgrammingGuide/index.html
* http://www.cyberdiem.com/vin/tutorials.html
* http://msdn2.microsoft.com/en-us/li...7se(VS.80).aspx
-------------------------------------
C++
* http://www.zeuscmd.com/tutorials/cplusplus/index.php
* http://www.cplusplus.com/doc/tutorial/
* http://en.wikibooks.org/wiki/C++
* http://cplus.about.com/od/beginnerc.../blcplustut.htm
* http://www.bloodshed.net/dev/doc/index.html
Note: If anyone wants C++ E-Books, please send me a PM. I have over 40 C++ E-Books.
#20
Pointers
A weird but informative video illustrating how pointers work, originally posted by para45 (hope you don't mind):
http://cslibrary.stanford.edu/104/
There are other useful links on that site, including a really nice introduction to pointers and memory in a printable, PDF format.
Here is what seems to be an informative thread regarding good C/C++ programming books.
Last edited by peenie : July 10th, 2006 at 09:51 AM.
#21
I tried to print the result of 2 ^ 8 but it gives nonsense. Why?
Because ^ does not mean "power", as you math fanboys would be used to. It's the XOR operation, which is one of many bitwise operations in C/C++ you should look out for.
PS: Eeh... (chews carrot) when's yer gonna put up them chappters three an' feur, doc?
You know guys, if anyone ever asks this again just redirect them to this post; it's much faster than slowly typing out a reply for every newbie idiot out there who can't be bothered to google.
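To make the difference concrete: 2 is binary 0010 and 8 is binary 1000, so 2 ^ 8 is binary 1010, which is 10. A minimal sketch of an integer power function follows; the name int_pow is hypothetical and it does no overflow checking:

```c
/* Raises base to a non-negative integer exponent by repeated squaring.
   The caller must make sure the result fits in a long. */
long int_pow(long base, unsigned exponent)
{
    long result = 1;
    while (exponent > 0) {
        if (exponent & 1u)       /* lowest bit set: this power contributes */
            result *= base;
        exponent >>= 1;
        if (exponent > 0)        /* skip the final, unneeded squaring */
            base *= base;
    }
    return result;
}
```

With this, int_pow(2, 8) gives the 256 the question was after. For powers of two, 1 << 8 does the same job, and for floating point values the standard library offers pow() in <math.h>.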
Last edited by jafet : August 30th, 2006 at 06:02 AM.
#22
EDIT1:
Sorry. I wrote another namespace guide. I did not notice the previous one. Please remove this :/
EDIT2:
@grumpy: Perhaps you could add to your namespaces post the explanation of why headers ending in .h should not be used with modern compilers? It is related, since the old .h headers declared their contents in the GLOBAL namespace, or actually above any namespace. When the C++ standard was published, those things were moved into namespace std and declared in headers without the .h (like <iostream>, not <iostream.h>).
Now many people seem to be asking why their good ol' code does not compile with newer compilers. Some of them notice a warning that iostream.h is deprecated and that iostream should be used instead (I believe some version of GCC gave that information). When the user then removes the .h, he gets errors about cout, string, ifstream etc. being undeclared (of course, because the namespace is not given).
Last edited by Mazzie : October 12th, 2006 at 07:27 AM.
#23
Quote:
How convenient, here's the video, you can watch it right in this post (just press play; turn sound on):
#24
About main()
The C/C++ standards require that a complete program must have a main(). Code:
int main( void ) ;
int main( int, char** ) ; /* Equivalent to int main( int, char*[] ) */
NOTE: char** is pointer to pointer to char. More on this later.

Any function named main should have one of the above signatures. The standard requires that your program have one and only one such function. Some compilers will allow a void return type. On Windows systems, some compilers will let you declare wmain() instead of main(). Embedded systems compilers often have their own specs and sometimes allow you to specify a function name other than main. These alternatives are all non-portable because conforming compilers are not required to accept them.

NOTE: It is a convention that the return value from main(), if any, be non-zero to indicate an error and zero for non-error execution, but this is by no means an absolute requirement. The only return values the standards themselves give a meaning to are 0 and EXIT_SUCCESS (success) and EXIT_FAILURE (failure), the latter two from <stdlib.h>.

The above are only declarations, however. To make your linker happy, you have to implement a function named main with one of the above signatures. This is because the c-runtime calls main() after performing some initialization (more on this later). The simplest you can get away with is:
Code:
/* Both standards */
int main( void ) { return 0; }
That is a pretty useless program, but this is about main() and we're trying to keep it simple. What you do in your main() is mostly up to you. If you need to process the arguments passed to your program it gets a bit more complicated: Code:
int main( int argc, char *argv[] )
{
    /* argc is number of non-null pointers in argv */
    /* argv is array of pointers to c-strings (nul terminated character arrays) */
    /* the last element in argv is always a NULL pointer */
    /* The size of the argv array is (argc + 1) * sizeof(char*) */
    /* sizeof( argv ) is always sizeof( char**) which on 32 bit systems is 4. */
    /* argv[0] is the name of your program */
    /* Your arg handling routines go here */
    return 0 ; /* or whatever status your arg handling produces */
}
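As a minimal sketch of what one such arg handling routine might look like: the helper name check_arg_count and the usage message are assumptions, not part of the post. It checks that exactly the wanted number of arguments follow the program name:

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns EXIT_SUCCESS when exactly `wanted` arguments follow the program
   name (that is, argc == wanted + 1); otherwise prints a usage line to
   stderr and returns EXIT_FAILURE. */
int check_arg_count(int argc, char *argv[], int wanted)
{
    if (argc != wanted + 1) {
        /* argv[0] is only usable when argc is greater than zero */
        const char *name = (argc > 0 && argv[0] != NULL) ? argv[0] : "program";
        fprintf(stderr, "usage: %s <%d argument(s)>\n", name, wanted);
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
```

Inside main() this could be used as `if ( check_arg_count( argc, argv, 2 ) != EXIT_SUCCESS ) return EXIT_FAILURE ;` before touching argv[1] and argv[2].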
NOTE: The argument names argc and argv are conventional but not required by the standard. You could just as easily call them fritz and fraq.

Now, this is where it often gets confusing for beginners, and even us old-timers have to pause momentarily to gather our wits. argv[] always has at least two elements in it! Even when argc is one!! (The standards require argv[argc] to be a NULL pointer.) In fact, argc is always at least 1 (well, on most operating systems that supply a command line anyway; the standards only promise that it is not negative).

This is where I must diverge to talk briefly about arrays and c-strings...

Except for string literals, there is no special type in C for strings as there is in other languages (C++ does have std::string but that is not relevant to this topic). We simply use character arrays. A "c-string" is an array of characters that by convention is terminated with a nul character. All string literals in C are c-strings, but string literals are also special because they can be used to initialize other arrays. A pointer to c-string is a pointer that points to the first element of a nul-terminated character array.
Code:
const char lit[] = "string literal" ; /* compiler automatically nul terminates these */
/* sizeof(lit) == strlen(lit) + 1 */
const char *pLit = "string literal" ; /* nul terminated by compiler */
/* sizeof(pLit) == sizeof(char*) == 4 on 32 bit system */
const char oops[3] = "string literal" ; /* Error: compiler complains that oops isn't big enough */
NOTE: sizeof(lit) != sizeof(pLit). This is important to remember. The compiler can't tell you how many characters pLit points to, but always knows how many characters are in lit because it is an array. strlen() on the other hand can determine the length of a c-string, but not the length of an un-terminated character array. If oops[] in the above example were initialized with {'a','b','c'} it would NOT be a c-string, and strlen( oops ) would be undefined behaviour that might well crash your program!

C arrays are "zero based". That is, arr[0] is the first element of the array, not arr[1]. A declaration like int arr[N] results in N elements being allocated by the compiler.
Code:
enum { aBufSize = 3 } ; /* a true compile-time constant in both C and C++ */
int a[aBufSize] ;
for ( int idx = 0 ; idx < aBufSize ; idx++ ) /* declaring idx here needs C99 or C++ */
{
    a[idx] = idx + 1 ;
}
The above code fills a[] with the integer values 1,2,3.

For int main( int argc, char **argv ), the compiler does not know how big the argv array is, but argv[] is always terminated with a NULL pointer. That is, the last element in the argv array is always NULL. argc is always the count of non-NULL pointers in argv[], NOT the number of elements in argv. So you could ignore argc:
Code:
#include <stdio.h> /* printf, NULL */

int main( int argc, char **argv )
{
    int MyNonNullElemCount = 0 ;
    char **pArgs = argv ;
    while ( NULL != *pArgs )
    {
        MyNonNullElemCount++ ;
        pArgs++ ;
    }
    printf( "argc: %d\nMyNonNullElemCount: %d\nReturn: %d\n", argc, MyNonNullElemCount, MyNonNullElemCount != argc ) ;
    return MyNonNullElemCount != argc ;
}
The above program prints argc, the final value in MyNonNullElemCount and the return value, then returns a non-zero value if MyNonNullElemCount does not match argc. This program should always return 0.

A major source of confusion is the fact that argc is not really the count of the arguments your program was invoked with! Yup, there's another extra element in argv; argv[0] is reserved by the standard to hold the name by which the program is invoked (when the environment can supply one). If you compiled prog1 but then renamed it prog2, then argv[0] holds a pointer to the c-string "prog2" (note that on many systems this string also includes the path from which the program was invoked). So
=>MyProg Arg1
results in argc being equal to 2. If your program isn't really interested in what its current filename is, you are free to ignore argv[0].

While all this seems confusing at first, it really turns out to be quite convenient. If your program requires N arguments, you can check that argc is N + 1. If it requires fewer than N arguments, you can check argc < N + 1. If your program processes an indeterminate number of arguments, you can walk argv from its second element (argv[1]) until you encounter a NULL pointer, or you can use argc as a count down variable.
Code:
#include <stdio.h> /* printf, NULL */

int main( int argc, char **argv )
{
    int MyNonNullElemCount = 0 ;
    int MyArgCount = 0 ;
    char **pArgs = argv ;
    printf( "Command line: " ) ;
    while ( NULL != *pArgs )
    {
        printf( "%s ", *pArgs ) ;
        if ( MyNonNullElemCount > 0 )
        {
            MyArgCount++ ; /* everything after argv[0] is a real argument */
        }
        MyNonNullElemCount++ ;
        pArgs++ ;
    }
    printf( "\nargc: %d\nMyNonNullElemCount: %d\nMyArgCount: %d\nReturn: %d\n", argc, MyNonNullElemCount, MyArgCount, MyNonNullElemCount != argc ) ;
    return MyNonNullElemCount != argc ;
}
If you are new to C, compile and run the above code. Step through it with your debugger. Learn why it works the way it does and you will be well on your way to being a C programmer.

How/who/what calls main()?
When you compile and link your program, your code is bound to a set of library functions and c-runtime code that is supplied by the makers of your compiler. The standards require that some things must be initialized before main() is called (which is mostly beyond the scope of this document) and some of those things must be initialized at run-time. Generally, the operating environment loads your program into memory and then jumps to the start address of your program. That start address is NOT main(); it resides in the C/C++ runtime. That code initializes and prepares the environment (including initialization of argv) and then calls main(). When your main() function returns, exits or aborts, execution picks up in the run-time code before returning to the operating environment.

Addenda
Both standards actually say/imply some things that are not specified above. This was to keep it simple and appropriate for beginners. As with all standards there is always some argument about the interpretation of one thing or another. The above is as portable across compilers as it gets.

Both standards treat main() as a special function. You can write:
Code:
int main( void ) {} /* No return statement! */
C++ and C99 (but not the older C89/C90) require that when execution reaches the closing brace '}' of main(), the effect is the same as returning a value of zero. It is considered bad form to rely on this behavior.

Thanks to Clifford, Salem, Scorpions4ever, sizablegrin for their helpful comments and time spent reviewing this post. Please feel free to PM me if you have any comments.
Last edited by jwdonahue : April 6th, 2007 at 06:00 PM. Reason: Adding names to the "thanks to" paragraph at the end.