#1 Avichal (Contributing User)

    Compiled vs Interpreted Languages


    I didn't find any sub-forum suitable for this question, so I posted it here.
    I have a lot of questions about compiled and interpreted languages that are troubling me.
    1) Why were interpreted languages created?
    2) Why are they slow? One of the answers I got was that because they are interpreted line by line, they take more time. Well, why can't they just be interpreted all at once after the whole program is written? (I asked this question before under the title "Why is Python slow?", but anyway I'm asking it here along with my other doubts.)
    3) Scripting languages like PHP, JavaScript, etc. are all interpreted. Why is that so? Why can't they be compiled languages?

    Thank You!
#2 Sarcky (Devshed Supreme Being)
    1) Because compiling can often take a while, and interpreted languages allow for faster debugging and more rapid prototyping. Plus, it's the INTERPRETER that has to be operating-system dependent rather than the SCRIPT: a compiled program must be built specifically for a target operating system (and often for specific hardware), whereas with an interpreted language only the INTERPRETER has to be OS-and-hardware-dependent. The script author doesn't have to worry about memory management, endianness, operating system hooks (to a certain extent), etc. Most PHP scripts will run on any operating system, whereas a C# program will generally run only on Windows (and often only on a recent version). Java is a half-and-half kind of thing: a compiled language whose bytecode runs on a virtual machine built for each operating system.

    2) Because "interpreting it all at once" is commonly referred to as "compiling"

    3) See #1.
#3 Avichal (Contributing User)
    Originally Posted by ManiacDan
    2) Because "interpreting it all at once" is commonly referred to as "compiling"
    How is "interpreting all at once" compiling? We convert the code intro bytecode which is what interpreting languages do, right? So instead of converting one line into bytecode and executing it, we convert the whole thing into bytecode and then run it. This way you increase the speed and also stay cross-platform
#4 Sarcky (Devshed Supreme Being)
    Interpreted languages are not simply run one line at a time. The whole thing has to be parsed and loaded in some way before it's run. PHP allows function declarations to come after the line calling them, for instance. The whole program is loaded, various important things (functions and classes and whatnot) are parsed, then the procedural bits are executed one step at a time. However, it is not compiled into a stand-alone executable like compiled languages are.
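
    A rough Python analogue of that load-then-execute behavior (PHP itself hoists function declarations; in Python a similar effect shows up for calls made from inside functions, since names are only looked up when the call actually runs) -- a minimal sketch, not how either engine is implemented internally:
    Code:
    # main() is written before later_helper(), but the call inside it works
    # because the whole file is loaded first; the name is only looked up when
    # the call actually runs, by which time later_helper() exists.
    def main():
        print(later_helper(21))

    def later_helper(n):
        return n * 2

    main()  # prints 42

    # Parsing really is a whole-program step: a syntax error anywhere stops
    # anything from running. Demonstrated on a string so this example itself
    # still runs.
    try:
        compile("print('first line')\n)oops(", "<demo>", "exec")
    except SyntaxError as e:
        print("parse failed before any line executed:", e.msg)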
#5 Avichal (Contributing User)
    Okay most of my doubts are clear. A few are still bugging me:
    I understand that interpreted languages were created so that they can be OS-independent.
    1) But why are interpreted languages executed line-by-line? Why can't they be converted to bytecode all at once before being run? (Maybe you answered this, but I didn't understand.)
    2) How do interpreted languages help with fast debugging?
#6 Sarcky (Devshed Supreme Being)
    1) Some of them are, but the longer it takes to compile and run a program, the slower it is to debug. There are accelerator apps which will halfway compile things like PHP.

    2) Debugging is often a matter of making very small changes and re-running the program over and over. If you have to wait 3 minutes for it to compile every time, that's going to significantly slow your debugging. Things like PHP only run for a second or two at most before the application is entirely finished anyway, so wasting time compiling them over and over is a pain for a developer.
#7 Scorpions4ever (Devshed Supreme Being)
    Originally Posted by Avichal
    Okay most of my doubts are clear. A few are still bugging me:
    I understand that interpreted languages were created so that they can be OS-independent.
    Not true. There are interpreted languages that only work on one OS. BASICA or CBASIC, for example. Some interpreters ran in limited memory environments better than compilers (for example, you could get some dialects of BASIC to run in as little as 2 KB of memory -- very useful when your entire computer had only 4 KB of RAM). That's why a lot of early home computers used to come with some dialect of BASIC built in.

    Originally Posted by Avichal
    1) But why are interpreted languages executed line-by-line? Why can't they be converted to bytecode all at once before being run? (Maybe you answered this, but I didn't understand.)
    Actually, some interpreters do this. For example, in Python, you get a generated .pyc file that is the byte-compiled version of the .py file, and the Python interpreter actually runs the .pyc file.
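
    For instance, you can do the byte-compiling step yourself with the standard library and peek at the result (a minimal sketch; the module name and contents here are made up):
    Code:
    import dis
    import py_compile

    # Write a tiny module to disk (the name is a placeholder), then byte-compile
    # it explicitly -- the interpreter does this automatically, caching the
    # result under __pycache__, the first time the module is imported.
    with open("example_module.py", "w") as src:
        src.write("def add(a, b):\n    return a + b\n")

    pyc_path = py_compile.compile("example_module.py")
    print("bytecode written to:", pyc_path)

    # Disassemble an equivalent function to see the instructions the
    # interpreter actually executes.
    def add(a, b):
        return a + b

    dis.dis(add)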

    I used to work with a long-forgotten 4GL language called DataFLEX that had a compiler that would produce bytecode. Then you would distribute this bytecode, which would run on a DataFLEX runtime on the target machine. Basically, the DataFLEX runtime created a virtual machine which would run your bytecode. The company that made this tool had DataFLEX runtimes that could run on DOS, UNIX, VAX, CP/M, OS/2 and a couple of other OSs.

    Most interpreted languages (even PHP and Perl) generally byte-compile as much as they can ahead of time. However, even with byte-compilation in the picture, many of these languages still need to run line by line, and the reason is mainly the dynamic nature of these languages. For example:
    Code:
    $x = 1;                       # $x holds a number
    print $x + 2;                 # numeric addition
    $x = "hello world";           # now $x holds a string
    print $x . " some more text"; # string concatenation
    $x = MyObject->new();         # now $x holds an object reference
    Here, the type of $x is changing from line to line. There is no way for the interpreter to infer what the type of $x is going to be until the point of interpreting each line.

    However, it is possible to write an interpreter that individually interprets each line every time, without doing any byte-code compilation in between (hell, I wrote three or four myself -- my last one was an interpreter of a subset of Scheme written in Perl, which I presented at a recent Perl Mongers meeting. 500 lines of code, including comments -- I'll write a series of Devshed articles about that, if anyone is interested).

    Originally Posted by Avichal
    2) How do interpreted languages help with fast debugging?
    You save time by not recompiling your program each time. Actually, I'm not sure how much of a saving that is, because a lot of modern build tools are pretty smart and fast about recompiling as well (e.g. make, mvn, ant, etc., which can recompile only the files that need it).

    Also, it is possible to write an interpreter or a compiler for most languages. For example, most people think of C as a compiled language, but there are some pretty good C interpreters around as well.
#8 Contributing User (Usually Japan when not on contract)
    Originally Posted by Avichal
    I understand that interpreted languages were created so that they can be OS-independent.
    This is a misconception borne of Sun's hypermarketing of Java in the 90's. Interpreted languages (Lisp, AWK, the original Algol partial implementations, the original shells, etc. come to mind) were interpreted because they were capable of things that compiler designs of their day could not deal with in an elegant way or within the resource constraints of the systems they were written for.
    1) But why are interpreted languages executed line-by-line?
    Interpreted languages are not necessarily executed line-by-line. This is a misconception borne of the tradition of "scripting" equating to "shell-language program", and of "interpreted language" commonly referring to shell languages themselves. A shell language has one overriding design constraint: it must be interactively useful as a single-line text interface to the system and also be useful as a full-blown multi-line programming language.

    For example, try to think of a way Bash could be interpreted other than line by line. The workaround that makes it usable as a full programming language is a directly mutable environment (ad-hoc global variable declarations). That's not a perfect workaround -- but it works. Both variables and functions are defined in the environment (in different ways).

    Other interpreted languages don't work this way at all. In Scheme, for example, nothing is executed until it is evaluated (there are Python implementations that work this way as well). So while the language runtime must necessarily use a parser to tokenize the language piece by piece, and a parse tree can be generated that signifies the meaning of the program, the actual execution path is not traversed until an expression is evaluated. As mentioned earlier, PHP is interpreted but permits function definitions to occur after their invocations -- that is because PHP performs parsing and evaluation in separate steps.
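
    A small Python sketch of that parse-then-evaluate split (unlike PHP, Python evaluates top-level statements in order, so the early call fails at evaluation time even though parsing accepted the whole program):
    Code:
    import ast

    # The first statement calls greet() before the 'def' appears in the source.
    source = "greet()\n\ndef greet():\n    print('hello')\n"

    # Step 1: parsing builds a tree describing the program without running it.
    tree = ast.parse(source)
    print([type(node).__name__ for node in tree.body])   # ['Expr', 'FunctionDef']

    # Step 2: evaluation happens separately -- and here it fails, because when
    # the call executes, the 'def' statement below it has not been evaluated yet
    # (PHP, by contrast, registers the function during its own load step).
    try:
        exec(compile(tree, "<demo>", "exec"))
    except NameError as err:
        print("evaluation failed:", err)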

    Actually, this whole "can be defined after it's called" thing is only possible in a language that mandates either a runtime interpreter or, at a minimum, a double-pass parsing process. Double-pass parsing permits some amazing things, but it can make compilation take a really long time, so it's rare even today (and double-pass usually also implies a pre-process pass... so that's at least 3 passes over the same source).

    Languages like Haskell are even more magical because, among other things, they adhere to "lazy evaluation" by definition -- which makes things a lot more interesting. Even more magical is the fact that there are both interpreted and compiled implementations of Haskell.
    Why can't they be converted to bytecode all at once before being run? (Maybe you answered this, but I didn't understand.)
    Because this is a function of the runtime, not the language. The interactive Python interpreter interprets code, building an in-memory representation of what has been defined so far, but the Python runtime will compile to bytecode any invoked source file that does not yet have current bytecode available, and from then on it references only the bytecode. Two different ways of doing things, but the same language.
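
    For example, in CPython you can ask the import machinery where it would cache that bytecode (a tiny sketch; "mymodule.py" is just a placeholder path):
    Code:
    import importlib.util
    import sys

    # Where the runtime would cache the bytecode for this source file.
    print(importlib.util.cache_from_source("mymodule.py"))
    # -> __pycache__/mymodule.cpython-312.pyc (the tag varies by interpreter)
    print(sys.implementation.cache_tag)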

    Erlang takes this even further, having source, compiled bytecode, and a runtime that can split running processes on the fly in a way that permits code updates without ever restarting the program that is running. Pretty neat... and not something you can do without a program independent of the one you want to execute acting as a supervisor and keeping track of environmental things like whether or not a newer version of the program has been installed while the old version was running. So the role of a runtime isn't restricted to interpreting the code; in some cases it also manages the code.

    Another interesting example is Common Lisp. One language but a bajillion implementations. Some are purely interpreted, using the language as a script. Some are fully compiled, treating the source files the same way a C compiler would. Some are compiled to native machine code by the interpreter on the fly and run about the same speed as equivalent code in C. That is, interpreter/compilers like SBCL actually compile statement-by-statement to machine code (not bytecode) as you type them and update re-definitions on the fly if this is what you want to happen.

    So it's not just that compilation to bytecode vs. script/textual interpretation is a language implementation issue rather than a language design issue -- it's that compilation to native code vs. bytecode is also an implementation issue. The language defines a way to express process. The implementation of the language is a way of making those human-friendly symbols into something the computer can understand.

    The more complex/flexible the needs of the language, the more likely it is that the first implementations will be interpreted, because there's a limit to how much wizardry a compiler developer can come up with in a day. Compiler techniques have also exploded over the last three decades -- what was once considered "interesting, but impractical to implement" is now commonplace. Some of that has to do with the resources we have available today, but a lot of it has to do with what we've learned about the expression of process over the last several decades of hard, long thinking.
    2) How do interpreted languages help with fast debugging?
    Above I mentioned that some runtimes build a parse tree in memory, and execute once an expression is evaluated. Because the program is running within another program (the interpreter), if the script crashes or encounters an error but the runtime doesn't itself crash, the runtime is still around to explain the exact state of the program that was running and the environment within which it was executing. This doesn't make debugging an interpreted language automatically better, but it makes writing magical debugging tools a lot easier.
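
    A minimal sketch of that kind of tool in Python, using the standard library's post-mortem debugger (the function and data here are invented for illustration):
    Code:
    import pdb
    import sys
    import traceback

    def total_price(records):
        total = 0
        for record in records:
            total += record["price"]    # blows up on the malformed record
        return total

    try:
        total_price([{"price": 10}, {"cost": 5}])
    except Exception:
        traceback.print_exc()
        # The interpreter is still alive after the "crash", so we can poke at
        # the dead program's exact state: locals like `record` and `total` are
        # still there for inspection at the pdb prompt.
        pdb.post_mortem(sys.exc_info()[2])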

    Another reason that interpreted languages are easier to debug is that you don't have to wait for a compile to occur. Some large programs take a long time to compile. If you're just trying to tweak a minor feature, it's a pain to have to recompile to test each incremental version -- which leads to a tendency to make huge edits and test them all at once. This can make debugging hard -- but it's not as technical a reason as the availability of interactive post-mortem analysis of the environment mentioned above.

    The psychology of the developer/team is critical to the success of any project. The positive impact of the relative ease and low cost of experimentation in an interpreted environment cannot be overstated.

    Anyway, the whole "interpreted vs. compiled" thing isn't nearly as clear-cut as beginner books and blogs about Java and C make it out to be. It is true that the newest, most powerful tools are also often the slowest, but that has more to do with how long it takes to clean up a good idea than with good ideas necessarily translating into slow code. Where I work it is common practice to reach first for high-powered abstractions (which may run slowly), and then see whether the result is "too slow" or not. Usually it is a little too slow, in one tiny area. So it's common to see a Python program written that does everything we need it to, but just a little too slowly. Then we profile it to figure out exactly which part is slow, and rewrite that part in C -- and wind up with a blazingly fast program written 99% in an easy, flexible, "interpreted" language that's easier to maintain than the equivalent C would have been.
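
    A rough sketch of that profile-first workflow in Python (the function names are stand-ins for a real program):
    Code:
    import cProfile
    import pstats

    # Hypothetical stand-ins for "the whole program" and its one slow corner.
    def slow_part():
        return sum(i * i for i in range(2_000_000))

    def rest_of_program():
        return sum(range(10_000))

    def main():
        slow_part()
        rest_of_program()

    profiler = cProfile.Profile()
    profiler.runcall(main)

    # The report makes it obvious that slow_part() is where the time goes,
    # which is the piece you would consider rewriting in C.
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)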
#9 Contributing User (Usually Japan when not on contract)
    By post times it looks like we were covering a lot of the same ground at different granularity at exactly the same time...
    Originally Posted by Scorpions4ever
    ...an interpreter of a subset of Scheme written in Perl, which I presented at a recent Perl Mongers meeting. 500 lines of code, including comments -- I'll write a series of Devshed articles about that, if anyone is interested.
    I'm interested. Both in the implementation and why you chose to implement the specific subset.

    (I'm writing a new data language within the Postgres engine that turns out to be awfully lispy -- though that wasn't the intent at the start.)
#10 Avichal (Contributing User)
    You just brought me back to square one.
    I thought interpreted languages were created to make programs OS-independent, but you say that's a misconception.
    I thought interpreted languages are executed line-by-line, but you say that is not necessarily the case.

    I know there are a lot of definitions out there on the net, but will you please define interpreted and compiled languages once more? Also, after all this discussion, I'm lost on the motive behind interpreted languages. Why were they created?

    Sorry if this discussion has gone on too long.
#11 Contributing User (Usually Japan when not on contract)
    You're confusing a language's definition with its implementation.

    I can design a language and never actually write a compiler for it. That would be a language design with no implementation.

    I can design a language that can do cool stuff that nobody can figure out how to write a compiler for, but that people can figure out how to interpret into a series of commands in a more limited language -- hence a language specification with an interpreted implementation.

    I can design a language that I write a compiler for. Like C. Or a compiler and an interpreter, like Haskell.

    It is easiest to implement most languages in an interpreted fashion. Otherwise you have to map out all the ways that the physical circuitry of the underlying machine must behave when your users do something the language design supports but the machine is oblivious to. It's easier to build a "virtual machine" that does know how to deal with all your language features, and map its (more limited) behavior to the (even more limited) behavior of the underlying hardware.
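
    A deliberately tiny sketch of that idea in Python -- the "language" here is just nested arithmetic tuples, invented for illustration, and the "virtual machine" maps them onto a few primitive operations:
    Code:
    # A toy "virtual machine" for one feature the hardware knows nothing about:
    # nested arithmetic expression trees. The VM maps that feature onto a
    # handful of primitive operations, the way a real interpreter maps a rich
    # language onto what the machine underneath can actually do.
    OPS = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }

    def evaluate(expr):
        """Interpret a nested (op, left, right) tuple; bare numbers are themselves."""
        if isinstance(expr, (int, float)):
            return expr
        op, left, right = expr
        return OPS[op](evaluate(left), evaluate(right))

    # (1 + 2) * (10 - 4)
    program = ("*", ("+", 1, 2), ("-", 10, 4))
    print(evaluate(program))   # 18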

    That's why most higher-level languages are either interpreted, "pre-compiled" to a restricted C-like intermediate language such as C--, or both.

    Anyway, you're getting wrapped around the axle about something that doesn't really need a strict definition in your mind just yet. My recommendation is to tuck the things in this thread in the back of your mind for a while and let them roast -- while you go write some programs in some language. Come back and revisit this after you wrestle with the living math a bit and a few things here will make a lot more sense. Then go program some more/learn some more indirectly related stuff. Then come back and think on this again. Rinse and repeat until clarity is achieved.

    That last paragraph represents what has been my own learning process over the years on any subject sufficiently deep to seem completely esoteric and magical at first look.
