C Programming
 
#1
August 18th, 2012, 03:51 AM
sagarkamble
Making of C Compiler

I want to know how exactly C code gets executed. I want to know what happens when we give the command to compile a C program.

#2
August 18th, 2012, 05:05 AM
bdb
It's not really an answer to your question, but you might like to read A Whirlwind Tutorial on Creating Really Teensy ELF Executables for Linux.

#3
August 19th, 2012, 06:17 AM
mitakeet
C code is converted to assembly, which is then assembled into binary instructions (often position-independent). The linker resolves any calls to other object (binary) files: at build time for static linking, or when the program is run for dynamically linked libraries. When you run the program, the loader puts the executable image into RAM and then launches the startup routine.

This isn't really a C/C++ question. The compiler converts the human readable text into a binary representation specific to the OS/hardware the program is expected to run on.

I have a couple of examples of code written directly in binary instructions if you are curious:

http://sol-biotech.com/code/SelfModifyingCPUID/
http://sol-biotech.com/code/CPE//

#4
August 20th, 2012, 11:35 PM
Lux Perpetua
Quote:
Originally Posted by sagarkamble
I want to know how exactly C code gets executed. I want to know what happens when we give the command to compile a C program.

1. Preprocess (insert #included files, replace macros by their definition, etc.)
2. Compile to assembly code
3. Assemble to machine code
4. Link

It's helpful to view the result of each stage. I'm going to use gcc for my examples because that's what I'm familiar with. It's most instructive to do this with a small "hello world"-type program. Suppose the source file you want to compile is called program.c.

1. Preprocess:
Code:
gcc -E program.c > program.i

Now, you can peruse program.i and see all the header files spliced right in and all your macros expanded. If you've never read stdio.h, you might be surprised by how big the preprocessed file is.

2. Compile:
Code:
gcc -S program.i

This creates an assembly source file called program.s, which you can view in a text editor. This part really benefits from using a very simple "hello world" program; you might be surprised how short the assembly source is, especially compared to how long the preprocessed file was! With a little effort, you can probably even understand the assembly code.

3. Assemble:
Code:
gcc -c program.s

You now have an object file, program.o. This is one small step away from being an executable program. The main difference has to do with any external functions or variables (for example, printf) you use in your source code: at this stage of the process, GCC has not tried to track those down for you, so those remain unresolved references in the object file. The object file is a binary file and thus not human readable, but there are utilities for extracting information from object files. A pretty useful one is nm, which can show all the external references in an object file:
Code:
nm -g program.o
You should see all the non-static functions in your code listed as well as any standard library functions like printf or scanf.

4. Link:
Code:
gcc program.o -o program

Not much to do now but run the program! (Of course, the compiled program is not human-readable. You can still get information from it in various ways, but I don't think it's relevant to your question any more.)

Last edited by Lux Perpetua : August 21st, 2012 at 03:42 AM. Reason: Fixing a typo

#5
August 21st, 2012, 12:43 AM
dwise1_aol
You're asking two different questions there, so we're a bit confused as to what you want.

I'll assume that you're asking about the C build process:

When the build involves multiple source files, each source file is compiled separately, such that the compiler starts each compilation with absolutely no knowledge of what it had found in any other source file. That is why you place type definitions, macros, extern variables, and function prototypes in a header file that's associated with a source file so that the other source files can #include it to let them know what's in that other source file.

With each source file, the compiler first runs the preprocessor, which executes metacommands (preprocessor directives) that start with #, such as #define, #include, and #ifdef. The preprocessor inserts the files indicated by the #include commands, expands macros (which are defined by #define), interprets conditional compilation commands by including or excluding the indicated code, etc. Basically, the preprocessor creates the final compilable form of the source file. In many compilers, you can command the compiler to generate an output file which is that final compilable form; how to do that differs from one compiler to another.

Then the compiler does its thing, parsing the source code, building symbol tables, translating the source code to assembly (or to an intermediate form that will then be converted to assembly) and converting the assembly to object code, which is mostly but not quite machine code. That object code goes into an object file (e.g., .obj in Microsoft, .o in Linux), which also marks up the object code wherever it accesses external resources (these are called unresolved symbols) and contains tables for the linker to use in resolving those unresolved symbols. The actual tables and file format depend on the compiler, etc.

When all the source files have been compiled, the linker is invoked to generate the executable. The linker takes all the object files and all the referenced libraries (special object files designed for reuse; .LIB in Microsoft and .a in Linux, the Standard C Library is an example, though you could create your own libraries) and links them all together in the executable, generating location tables in the process. Then it uses those tables to go through each object file and replace the "unresolved symbol" markers with the actual address of each symbol, AKA "resolving the addresses".

Each step depends on what can be known at that time. These times are known as "compile-time", "link-time", and "run-time", and it is absolutely and vitally necessary to know which "time" you are in. At compile-time, compiling a source file depends on header files to tell it what should exist in other source files or in libraries being linked in, so the object code contains markers, AKA "place holders", for address information to be inserted later. At link-time, linking handles all that, but it still does not know exactly where in memory the program will be loaded, so the linker has no idea of the exact memory location of each variable and function, information which is absolutely necessary for the code to actually execute.

For that reason, all addresses in the executable are resolved relative to a common starting address and the location of all addresses is marked either in the object code or in a relocation table in the executable. Then when you execute the program, you do so with a loader which obtains a block of memory for the program and then performs relocation of all the addresses. That creates in the memory a memory image which is executable as-is; note that in embedded programming, the end-result is a memory image that can be loaded into some kind of PROM (programmable read-only memory).

If you can get your hands on it, read The MS-DOS Encyclopedia (Microsoft Press, 1988 -- decades out-of-print by now, obviously). It not only explains the process excellently (it's where I learned what the loader does), but also provides and explains the file formats. Of course, that's the only reason for you to read it and those formats are obsolete. If you can find a similar description of the OS and compiler that you're using, then get that description and read it.

#6
August 26th, 2012, 08:23 PM
Schol-R-LEA
Sorry for a somewhat late reply, but if you are interested in executable formats and how they are used, you would do well to check out Linkers and Loaders by John Levine. While it is now somewhat dated, most of the information is still relevant, and an early version of the book is available for reading online on that page.

#7
August 29th, 2012, 02:02 PM
clifford
Quote:
Originally Posted by sagarkamble
I want to know how exactly C code gets executed.

C code is not executed; it is compiled into a machine-code executable. It is usually the operating system that is responsible for loading and starting execution of the code. Embedded systems or bootstrap code (where there is no OS) may be started by other mechanisms, but that is probably not what you are asking about here.

Quote:
Originally Posted by sagarkamble
I want to know what happens when we give the command to compile a C program.

Compilation of C comprises a number of stages, primarily:
  • Pre-processing - any line beginning with # is a preprocessor directive. The preprocessor outputs C code with all the #include'd code inserted, all the #define macro instances replaced, and any #if... conditional code included or removed as directed.
  • Compilation - the compiler proper generates "object" code. Some compilers generate assembler and then have an assembler pass to generate machine code; others generate machine code directly. The object code output by the compiler does not include code relating to external references to library code or code in separately compiled object code - the object file contains unresolved links to such code.
  • Linking - the linker is responsible for assembling separate modules and library code into a single executable, and resolving all unresolved references with references to the linked code.

The body of your post does not contain any clear and specific questions, and your title seems to be asking something else altogether; that suggests that you want to build a compiler?

Compilers can be complex things. First of all, they are required to generate assembler or machine code, so to create a compiler you must be familiar with the target instruction set and architecture. Moreover, to create a linker, you need to know how the OS loads an executable and the format of the executable file to support loading. Luckily C is a rather small and simple language for the most part, but still not insignificant. Modern compilers perform many complex optimisations requiring deep analysis of code flow and instruction execution.

One way to start studying a simple compiler implementation is perhaps to look at the source and documentation for Tiny C Compiler.
