Thread: Machine coding

    #1
  1. Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Dec 2004
    Location
    Meriden, Connecticut
    Posts
    1,797
    Rep Power
    154

    Machine coding


    I'm not too knowledgeable about what this really means. Am I correct in saying that assemblers [usually] convert their code to machine [readable] code? If so, then there must be some way to write this machine code directly, without the help of an assembler? Is machine code basically just binary?

    I am working on writing my own language, which I also want to compile to an executable file on a certain OS. I'd rather have my compiler convert my code directly to machine code, as opposed to ASM that an assembler then converts to machine code. Wouldn't this reduce compilation time? Not that it would be by much in many cases, but I'd like to try.

    Anyway, would anyone happen to be able to fill in the blank spots and perhaps direct me to places where I can learn how to write machine code on the x86? I'm guessing the OS matters? I'm on Windows at the moment, but I'd also like this to work on Linux.

    Any help is appreciated.
  2. #2
  3. fork while true;
    Devshed God 1st Plane (5500 - 5999 posts)

    Join Date
    May 2005
    Location
    England, UK
    Posts
    5,538
    Rep Power
    1051
    It's not worth the effort. The reason compilers support output to ASM first is that you can then manually tweak the ASM, whereas few people can tweak machine code. There's a lot to be said for modularity.

    GAS also does a bit of optimisation that would be missing from your compiler, with the LIKELY() pragmas, etc.
  4. #3
  5. Only the strong survives!!.
    Devshed God 1st Plane (5500 - 5999 posts)

    Join Date
    Feb 2003
    Location
    A World of wonders.
    Posts
    5,583
    Rep Power
    407
    I also think it would be a waste of time. Even if your compiler converts everything straight to machine code, note that machine code is hardly human-readable, so I think you should just let the assembler do what it was made for.
  6. #4
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2006
    Posts
    48
    Rep Power
    12
    Originally Posted by Yegg`
    I'm not too knowledgeable about what this really means. Am I correct in saying that assemblers [usually] convert their code to machine [readable] code? If so, then there must be some way to write this machine code directly, without the help of an assembler? Is machine code basically just binary?
    Assemblers generate object code, which you could say is binary. It can be represented in various numbering systems, but the ones I'm familiar with are usually expressed in hexadecimal notation just because that format makes sense on the machine in question and all the utilities use the same format.

    A number is a number is a number.
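
    To make the "a number is a number" point concrete, here is a minimal sketch that prints the same six bytes in three notations. The bytes are assumed to be 32-bit x86 machine code for `mov eax, 42` followed by `ret`; the helper names are made up for this illustration.

```python
# Six bytes of 32-bit x86 machine code: mov eax, 42 ; ret
# B8 = "mov eax, imm32", followed by the 32-bit value in little-endian order.
# C3 = "ret".
code = bytes([0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3])

def as_hex(data):
    """Render each byte as two hexadecimal digits."""
    return " ".join(f"{b:02X}" for b in data)

def as_binary(data):
    """Render each byte as eight binary digits."""
    return " ".join(f"{b:08b}" for b in data)

def as_decimal(data):
    """Render each byte as a plain decimal number."""
    return " ".join(str(b) for b in data)

print(as_hex(code))      # B8 2A 00 00 00 C3
print(as_binary(code))
print(as_decimal(code))  # 184 42 0 0 0 195
```

    The processor sees only the bit patterns. Hexadecimal is simply the most compact notation that still lines up with byte boundaries, which is probably why dumps and manuals use it.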

    Yes, there is most definitely a way to write object code without an assembler, and I've had several jobs where I got paid to do it. It's very interesting work and has its place. Sometimes it's not practical to use an assembler, and so we write in object code. It's also extraordinarily useful to be able to read object code. When you can do that, you can understand much from storage dumps without having to resort to program listings or disassemblers.

    Originally Posted by Yegg`
    I am working on writing my own language, which I also want to compile to an executable file on a certain OS. I'd rather have my compiler convert my code directly to machine code, as opposed to ASM that an assembler then converts to machine code. Wouldn't this reduce compilation time? Not that it would be by much in many cases, but I'd like to try.
    Most assemblers operate in several passes including macro (preprocessing) and two passes over the source to resolve symbols backwards and forwards, before generating object code. Except in R&D and academic environments (damn you, Niklaus Wirthless!) code is executed more than it's compiled. So people normally make a design decision to spend time during compilation and assembly to generate the best possible executable. After that it still has to be linked or bound to generate the executable, and even during runtime, depending on how the software is packaged, it may invoke yet other executables which have to be loaded. There's really quite a lot involved in getting something from source code, into memory, and giving it control.

    Originally Posted by Yegg`
    Anyway, would anyone happen to be able to fill in the blank spots and perhaps direct me to places where I can learn how to write machine code on the x86? I'm guessing the OS matters? I'm on Windows at the moment, but I'd also like this to work on Linux.
    Any help is appreciated.
    The best place to learn about your machine is from the processor manuals. Fortunately, Intel publishes extensive free documentation on x86 which I found in one of the links in one of the posts on the languages forum here about assembly language. The processor manuals explain the architecture of the machine and document all of the formats of instructions including their opcodes, operands, and other things like numbering systems notation. Anyone who needs to be expert in assembly language should also consult whatever processor manuals he has as third party assembly language texts cannot cover this sort of material in the level of detail required to understand enough to do a proper job of coding.

    In answer to your question about operating system dependencies, yes, this matters quite a lot even on the same hardware. Fundamental operations such as moving data around, doing arithmetic, and local flow control are architectural; they don't vary from OS to OS on the same hardware, because they're not operating system services. They're hardware instructions/features. But services such as obtaining and freeing memory, doing high-level (device-dependent) I/O, etc. are operating system services and are provided through interfaces. So you'll have to use the APIs provided by whatever platform (OS) you want your code to run on. And if you want the same mainline to run on multiple OSes, you can use the macro or preprocessor language of your assembler (if it provides one) to tailor the executable for each environment. If you don't have that support, you can write it. And if you want, you can do the support in object code, except that it's less efficient, as it will involve branching to/around environmental code instead of generating a purpose-built program for the OS...
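
    As a rough sketch of that split, a code generator might emit identical bytes for arithmetic on both systems and differ only in how it asks the OS to end the process. This assumes 32-bit x86, the Linux `int 0x80` exit convention, and a Win32 call through an import slot holding `ExitProcess`. The function names and the slot address `0x00402000` are made up for illustration; a real address would come from the linker and loader.

```python
import struct

def emit_add(a, b):
    """mov eax, a ; add eax, b. The same bytes work on any 32-bit x86 OS."""
    return b"\xB8" + struct.pack("<I", a) + b"\x05" + struct.pack("<I", b)

def emit_exit_linux(status):
    """Linux i386: eax = 1 (exit), ebx = status, then int 0x80."""
    return (b"\xB8" + struct.pack("<I", 1)
            + b"\xBB" + struct.pack("<I", status)
            + b"\xCD\x80")

def emit_exit_windows(status, import_slot):
    """Win32: push status ; call dword ptr [import_slot].

    import_slot is a hypothetical address of the import table entry
    that the loader fills in with the address of ExitProcess.
    The one-byte push only covers status values from -128 to 127.
    """
    return (b"\x6A" + struct.pack("<b", status)
            + b"\xFF\x15" + struct.pack("<I", import_slot))

def emit_program(target):
    """Same arithmetic for every target, OS-specific exit sequence."""
    body = emit_add(2, 3)
    if target == "linux":
        return body + emit_exit_linux(0)
    if target == "windows":
        return body + emit_exit_windows(0, 0x00402000)  # placeholder slot
    raise ValueError(f"unknown target: {target}")

print(emit_program("linux").hex())
print(emit_program("windows").hex())
```

    Both outputs start with the same ten bytes and diverge only in the exit sequence. These are raw instruction bytes, not a loadable file; the executable format (ELF or PE) is a further OS-specific layer on top.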

    This issue comes up more often than one might think. If you have a current assembler, it can generate opcodes that aren't supported on older hardware or older versions of the OS. What we do is test at execution time for various hardware and operating system features and tailor the execution path based on whether or not they're available.
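
    A minimal sketch of that kind of execution-time test, assuming a Linux-style `/proc/cpuinfo` as the feature source (on other systems the probe just returns an empty set). The routine names and the choice of `sse2` as the feature are hypothetical; the two routines stand in for a baseline path and one that would use newer instructions.

```python
def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the CPU feature flags Linux reports, or an empty set elsewhere."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()

def sum_baseline(values):
    """Stands in for code that uses only instructions every CPU has."""
    total = 0
    for v in values:
        total += v
    return total

def sum_fast(values):
    """Stands in for a routine that needs a newer instruction set."""
    return sum(values)

def select_sum(features):
    """Pick the routine once, based on what the hardware reports."""
    return sum_fast if "sse2" in features else sum_baseline

# Test once at startup, then call through the chosen routine.
sum_values = select_sum(read_cpu_flags())
print(sum_values([1, 2, 3]))  # 6
```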

    I made some comments that may be relevant to your question in this thread:

    http://forums.devshed.com/other-programming-languages-139/is-there-a-good-website-about-assembly-328013-2.html?pp=15

    Edit: Sorry, I remembered incorrectly about the link to the processor manuals. The GCC page has a nice list here: http://gcc.gnu.org/readings.html

    p.s. Thanks Simon

    Comments on this post

    • SimonGreenhill agrees
    Last edited by Randux; June 11th, 2006 at 12:21 PM. Reason: Incorrect pointer to processor manuals
  8. #5
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2006
    Posts
    48
    Rep Power
    12



    Originally Posted by LinuxPenguin
    It's not worth the effort. The reason compilers support output to ASM first is that you can then manually tweak the ASM, whereas few people can tweak machine code. There's a lot to be said for modularity.
    The reason that compilers produce assembly language is to save the compiler writers from having to do extra work!
  10. #6
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2006
    Posts
    4
    Rep Power
    0
    Originally Posted by Yegg`
    I'm not too knowledgeable about what this really means. Am I correct in saying that assemblers [usually] convert their code to machine [readable] code? If so, then there must be some way to write this machine code directly, without the help of an assembler? Is machine code basically just binary?
    Yes and no. There are different types of assembly language (ASM for short). I've worked with a few modern types as well as other versions like BAL on an HP-UX mainframe in college. Edit: My mistake, that was on the emulated IBM MVS. Duh :P

    Some types of higher-level ASM produce a pretty generic, intermediate bytecode which you use a platform-specific assembler to convert into machine-specific machine code.

    The really fun types of assembly are the ones where you are basically writing hardcore machine code yourself. In one of my classes, we used a type of ASM like this. We'd code using mnemonic codes like MOV, ADD, MUL, but then we'd look up the machine code for that operation (usually expressed in hexadecimal, but we'd also have to convert to binary occasionally) and write the instructions as machine code. We'd also manually pack and unpack decimal fields with paper and pencil, convert between ASCII/EBCDIC, etc. When we were done, we could literally write programs in machine code. Well, those of us who could make it through... It was the most dreaded class of the program.
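
    For anyone who hasn't met those paper-and-pencil exercises, here is a small sketch of them. It assumes the usual packed-decimal layout (one digit per 4-bit nibble, with a final sign nibble of C for plus or D for minus) and uses Python's built-in `cp037` codec as one common EBCDIC code page. The function names are made up for this example.

```python
def pack_decimal(n):
    """Pack an integer as decimal digits in nibbles, sign nibble last."""
    nibbles = str(abs(n)) + ("D" if n < 0 else "C")
    if len(nibbles) % 2:          # pad to a whole number of bytes
        nibbles = "0" + nibbles
    return bytes.fromhex(nibbles)

def unpack_decimal(data):
    """Reverse pack_decimal: read the digits, then apply the sign nibble."""
    nibbles = data.hex().upper()
    value = int(nibbles[:-1])
    return -value if nibbles[-1] == "D" else value

def to_ebcdic(text):
    """Encode text with the cp037 EBCDIC code page."""
    return text.encode("cp037")

print(pack_decimal(123).hex())          # 123c
print(pack_decimal(-45).hex())          # 045d
print(unpack_decimal(b"\x04\x5d"))      # -45
print("HELLO".encode("ascii").hex())    # 48454c4c4f
print(to_ebcdic("HELLO").hex())         # c8c5d3d3d6
```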

    Fun stuff! But if you're already going through the pains of making your own language, I suggest you just get it to use a higher-level ASM first. Then if it's not efficient enough, go hardcore, straight to optimized machine code.
  12. #7
  13. Commie Mutant Traitor
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Jun 2004
    Location
    Norcross, GA (again)
    Posts
    1,805
    Rep Power
    1570
    A thread about machine code and no one has mentioned Mel Kaye yet? For shame.
  14. #8
  15. Command Line Warrior
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2006
    Location
    Sector ZZ9 Plural Z Alpha
    Posts
    31
    Rep Power
    119
    All hail Mel!

    I've done some bytecode programming for the 8051 processor in my time. Very educational, but not practical.

    Great way to get optimized code, but VERY high-maintenance. For example: most jumps in assembler are relative. So adding a single line of code means you have to check ALL your branches for correctness. In ASM, you use labels to avoid this. The assembler fills in the correct jumps at the last moment, saving you the bother :-)
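
    A small sketch of why that hurts, assuming the 8051 short jump (`SJMP`, opcode 0x80), whose one-byte operand is a signed offset counted from the address of the next instruction. The `rel8` helper and the addresses are made up for illustration.

```python
SJMP = 0x80  # 8051 short jump: opcode byte followed by a signed 8-bit offset
NOP = 0x00   # 8051 no-operation, one byte

def rel8(jump_addr, target_addr, length=2):
    """Offset byte for a relative jump, counted from the next instruction."""
    offset = target_addr - (jump_addr + length)
    if not -128 <= offset <= 127:
        raise ValueError("target out of range for a short jump")
    return offset & 0xFF

# A jump at address 0 that skips three NOPs and lands at address 5.
before = bytes([SJMP, rel8(0x0000, 0x0005), NOP, NOP, NOP])

# Insert one more NOP in between: the target moves to address 6,
# so the offset byte of the jump has to be recalculated by hand.
after = bytes([SJMP, rel8(0x0000, 0x0006), NOP, NOP, NOP, NOP])

print(before.hex())  # 8003000000
print(after.hex())   # 800400000000
```

    With labels, the assembler redoes this arithmetic for every branch on every build. In raw machine code, each branch that crosses the inserted byte has to be found and patched manually.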
  16. #9
  17. Banned ;)
    Devshed Supreme Being (6500+ posts)

    Join Date
    Nov 2001
    Location
    Woodland Hills, Los Angeles County, California, USA
    Posts
    9,607
    Rep Power
    4247
    I used to have this crazy prof in my undergrad microprocessor programming class who had pretty much the entire 8085 instruction set's opcodes memorized.

    In this class, you would:
    1. First, write your program out in assembler.
    2. Hand assemble it using a sheet that contained the ASM instructions and their opcodes. Note that some instructions took 1 byte, some two and some 3 bytes, so you had to figure out the addresses of the instructions as well at this stage.
    3. Resolve all the JMP labels (in step 2, you figure out the addresses of the instructions as well, so you just need to replace the JMP labels with the actual addresses).
    4. Enter the program into the 8085 kit using a hex keypad and toggle switches, then run it.

    The prof could do steps 1-3 on the board without using a sheet since he knew the opcodes as well as how many bytes each took. The man could also pretty much resolve all the JMP labels in his head, so he would write the machine code out directly. Heck, I think he was only using the board so that the class could compare their answers with his to see if they got it correct -- IIRC he once keyed in a program directly and got it to work first time.
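
    Steps 1 to 3 can be sketched in a few lines, assuming a tiny made-up program and a lookup "sheet" containing only the handful of 8085 opcodes it uses. The program, the load address 2000h, and the function name are illustrative, not taken from the class.

```python
# The "sheet": opcode and total instruction length for a few 8085 instructions.
OPCODES = {
    "MVI A": (0x3E, 2),
    "MVI B": (0x06, 2),
    "ADD B": (0x80, 1),
    "DCR B": (0x05, 1),
    "JNZ":   (0xC2, 3),
    "JMP":   (0xC3, 3),
    "STA":   (0x32, 3),
    "HLT":   (0x76, 1),
}

# Step 1: the program in assembler, as (label, mnemonic, operand).
PROGRAM = [
    (None,   "MVI A", 0x05),
    (None,   "MVI B", 0x03),
    ("LOOP", "ADD B", None),
    (None,   "DCR B", None),
    (None,   "JNZ",   "LOOP"),
    (None,   "STA",   0x2050),
    (None,   "HLT",   None),
]

def hand_assemble(program, origin):
    # Step 2: walk the program, noting the address of every label.
    labels, addr = {}, origin
    for label, mnemonic, _ in program:
        if label:
            labels[label] = addr
        addr += OPCODES[mnemonic][1]
    # Step 3: emit opcodes, replacing label operands with real addresses.
    out = bytearray()
    for _, mnemonic, operand in program:
        opcode, length = OPCODES[mnemonic]
        out.append(opcode)
        if isinstance(operand, str):
            operand = labels[operand]
        if length == 2:
            out.append(operand & 0xFF)
        elif length == 3:  # 16-bit operand, low byte first
            out += bytes([operand & 0xFF, (operand >> 8) & 0xFF])
    return bytes(out), labels

machine_code, labels = hand_assemble(PROGRAM, 0x2000)
print(machine_code.hex(" "))  # 3e 05 06 03 80 05 c2 04 20 32 50 20 76
```

    The thirteen bytes printed are what would be keyed in at step 4, starting at address 2000h.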

    One more impressive feat from this man was that he would walk into class and take roll call. Then, at the end of the 1-hour long class, he would write down the #s of people who were absent on the board. Mind you, my class had 74 kids and at any given time, at least 30% would be playing hooky!

    Comments on this post

    • Schol-R-LEA agrees : That's an impressive memory. As for hand-assembling, it's a very insightful exercise to go through - once.
