I often see people recommend the dragon book for compiler design. Well, imho, the book severely overcomplicates things, so I'm going to write a 1-page tutorial here.

Step 1 - Let's Make Some Macros!

Below is a psuedocode example of a macro that performs an addition on 2 16-bit integers:

    #subtract_variable_from_variable (*result, *left_operand, *right_operand)
        push %hl            // save register 'hl'
        push %de            // save register 'de'
        ld %hl, (left_operand)        // load value at 'left_operand' into 'hl'
        ld %de, (right_operand)       // load value at 'right_operand' into 'de'
        sub %hl, %de        // subtract 'de' from 'hl'
        ld (result), %hl        // store 'hl' at 'result'
        pop %de            // restore register 'de'
        pop %hl            // restore register 'hl'
This macro will satisfy the instruction of `x = y - z`. There are 2 important things here. First, this macro has no side-effects. If we design all of our macro operations like this, then we know for a fact that they can't interfere with each-other's state, and thus can be composed in any order. Second, the result and the operands are all just pointers to memory. Doesn't really matter where at this point.

Let's make another 2 macros:

    #set_variable_to_constant (*variable, constant)
        push %hl            // save register 'hl'
        ld %hl, constant        // set 'hl' to the value of the constant
        ld (variable), %hl        // store 'hl' at the variable's address
        pop %hl            // restore register 'hl'

    #goto_address_if_variable_equals_0 (address, *variable)
        push %hl        // save register 'hl'
        ld %hl, (variable)        // set 'hl' to the value of the variable
        or %hl, %hl        // bitwise-or 'hl' with itself
        pop %hl            // restore 'hl'
        jz address        // jump to address if zero-flag was set
Now, what can we do with all these macros? Well, how about a loop?

    set_variable_to_constant (&counter, 5)     // repeat 5 times
    set_variable_to_constant (&step, 1)
    set_variable_to_constant (&zero, 0);
    subtract_variable_from_variable (&counter, &counter, &step)
    goto_address_if_variable_equals_0(done, counter)
    goto_address_if_variable_equals_0(loop, zero)
hmm... let's make that a bit more readable...

    counter = 5
    step = 1
    zero = 0
    counter = counter - step
    if counter = 0:
        goto done
    if zero = 0:
        goto loop
Congrats, you now have a turing complete compiler.

Of course, using it is tedius if you only have 3 operations, but writing code in it gets easier and easier as you add more macros.

But it generates such slow and crappy code!
This is where a technique known as instruction scheduling comes in. Because the side effects of each macro are contained, you can reorder them (within reason), and replace certain sequences of macros with a single macro that does the same thing, minus all the protection against clobbering, and taking advantage of additional instructions that the CPU has. You can also use this to keep variables that are 'hot' inside of registers instead of loading and storing them from memory all the time.

Where are my variables at in memory though?
Wherever you want them to be. The heap and stack are just popular ways of deciding where to put things. There's no universal law that says you must place your variables in any specific place. Most processors have a stack-pointer register, but if not, you can just decide that a variable at some arbitrary address will always point to the top of stack (which is yet another arbitrary block of memory. no rules ) When functions enter, they typically allocate space on the stack by subtracting from the stack pointer, and the compiler just decides that local variables should be put there.

What about expressions, and parsing, and (blah blah blah)
Expressions can all be broken down into single operations by using temporary variables. eg. (x = y + 4 * 3) can be rewritten as (tempvar = 4 * 3; x = y + tempvar). When the operands are all constants, it's safe to evaluate the expression at compile time. eg. in the previous example, it's safe to just replace 'tempvar' with '12'. This is a technique known as constant folding.

As for parsing, honestly, it doesn't matter. Many sources say that you should use a lexer to scan for tokens and pipe those into some state machine to construct a parse tree. I've always gotten by just fine with something simple like (pseudocode):

int length_of_return_statement (string *text) {
    int length;
    length = length_of_identifier (text);
    if (length == 0 || text->substring(0, length) != "return")
        return 0;
    length += length_of_whitespace (text + length);
    if (text[length] == ';') {
        return length;
    } else return 0;