|
|
|||||||||
|
|||||||||
| |||||||||
|
|
|
| |||||||||
![]() |
|
|
«
Previous Thread
|
Next Thread
»
|
Thread Tools | Search this Thread | Rate Thread | Display Modes |
|
#1
|
||||
|
||||
|
HOWTO: Structures in assembly language
I've been in a class on assembly programming recently, and I wrote the following piece up in response to a question from another student. I thought it might be worth posting here, as I know that we get questions along these lines from time to time, and it might be useful even for those who aren't doing any assembly work. The code examples are for the MIPS processor, and specifically the SPIM simulator, but the ideas mostly apply to pretty much any more or less conventional architecture.
Conceptually at least, a structure in assembly language is like a structure in any other language: a fixed group of heterogeneous variables which have their own names but which are part of a larger composite variable. For example, in C++, a struct might look like Code:
struct {
int foo, bar;
char baz;
} quux;
The vars inside the braces would all be part of the composite variable quux; to access one part of quux, say foo, you would use quux.foo or something similar. It's much the same in assembly language, except of course that the types of the variables are implicit rather than explicit. In the SPIM assembly language, the definition of structures has to be a set of programming conventions rather than an actual program construct; some more powerful assemblers do have ways of defining structures, but that is nothing more than a syntax support for such a convention, more or less. So, if we agree to use labels which start with the name of the structure as a whole, followed by a period, and then the name of the element in question, you would end up with: Code:
quux: # blind label marking the start of the
# structure, the same actual address as
# quux.foo in this case
quux.foo: .space 4 # a four-byte integer variable
quux.bar: .space 4 # another int variable
quux.baz: .space 1 # a one-byte char variable
.align 4 # fix the word alignment after baz
Note that because baz was a one byte value, we need to pad three bytes after it to ensure word alignment. The whole structure is 9 bytes long (12 if you count the padding, which can be important later). Now, you might say that this is a lot of pointless work for a single structure; while you can treat the composite variable as a single structure, you really have three separate values, with separate labels. In this case, it is true that they could have been just as easily treated as separate data values; however, this is equally true in the C example above. However, declaring a single struct in this manner is useful for showing the logical and conceptual connections between the elements; it shows that they are represent something as a whole. That having been said, the real power in the idea of data structures is not from declaring a single structure in this manner, but from declaring types of structures. Again, in C++, you could define a structure type: Code:
struct quux {
int foo, bar;
char baz;
};
quux flarp, *frink;
Now, instead of having a single struct named quux, we define a sort of pattern or template (not a template in the C++ sense, but bear with me) for defining structures of the same type. In assembly language, this sort of structure type would be defined not as a series of labels, but as a series of offsets, like so: Code:
quux.sizeof = 12 # size of quux structs, incl. padding quux.foo = 0 # first element of the quux structure quux.bar = 4 # 2nd element of the quux structure quux.baz = 8 # third element of the quux structure You would then define a variable to hold it that is the size of the whole structure: Code:
flarp .space quux.sizeof Note defining a pointer to a struct would still be the regular size of a pointer, one full word in the case of MIPS: Code:
frink .space 4 # size of the ptr, not quux
# ... do things ...
la $t0, flarp
la $t1, frink
sw $t0 ($t1) # frink now points to flarp
At this point, the structure variables that make up flarp are implicitly defined by the offsets rather than defined by explicit labels. To use the struct you would get a pointer to the start of the struct variable, then use that pointer plus the defined offsets to access the structure elements: Code:
lw $t0, flarp lb $t1, quux.baz($to) # get the value of flarp.baz Now we're ready to see the real advantage of structures: defining values at runtime, either on the stack or in the heap. To dynamically allocate a struct of a given type, you just allocate the space need for it: Code:
addi $sp, $sp, -20 sw $fp, $fp.fp($sp) move $fp, $sp sw $ra, fp.ra($fp) addi $t0, $fp, -20 # location of the struct on the stack la $t1, frink sw $t0, ($t1) # frink now points to the structure Note that the 12 bytes has to be in addition to anything else you are saving on the stack, such as the frame pointer, the return address, the s-registers, and so on; the struct (and any other local variables) would come after everything else being saved. This is also why we padded the struct to word alignment rather than halfword alignment - in the MIPS processor, values on the stack are always assigned in word increments (the same applies to heap structures). Just as in C, you have to be careful with pointers to values on the stack; once the stack is cleared, the variable is gone, and trying to use a pointer to it after that may severely corrupt your stack. This can be a tricky problem, as it will only become apparent if the stack grows up to that point in memory again; it may not show up for several function calls later. To get a structure on the heap, you would do much the same: Code:
li $a0, quux.sizeof li $v0, malloc # malloc == syscall 9 syscall sw $v0, frink # save the pointer to the allocated space Again, you have to be careful not to access a structure pointer after the struct has been freed; while this doesn't come up in SPIM (as there is no system call to free memory, and thus anything allocated in heap never gets reclaimed), it certainly would be important in a real-world system. In either case, you now can use the structure offsets to handle the elements of the structure, just as if they were in the global memory (at least throughout the lifetime of the structure variable). However, with heap variables, you can also create more elaborate data structures using structs and pointers, such as a linked list or tree: Code:
charNode.sizeof = 8 charNode.letter = 4 # again, note the need to pad the aligment charNode.next = 4 # pointer to the next struct in the list charList .space 4 # ptr to head of character list # add an 'a' into the front of the list li $a0, charNode.sizeof li $v0, malloc syscall move $a0, $v0 jal insert # ... insert: # sanity check - is the ptr to the new node valid? bnez $a0, insert_add li $v0, -1 # return error code jr $ra insert_add: lw $t0, charList # get list head ptr sw $t0, charNode.next($a0) # set node.next == old head sw $a0, charList # set the head to the new node move $v0, $a0 # return ptr to new list head jr $ra At this point, I've gone waaay far afield, but I hope that this has clarified more than it has confused. HTH, C&CW.
__________________
Rev First Speaker Schol-R-LEA;2 JAM LCF ELF KoR KCO BiWM TGIF #define KINSEY (rand() % 7) λ Scheme is the Red Pill Scheme in Short • Understanding the C/C++ Preprocessor Taming Python • A Highly Opinionated Review of Programming Languages for the Novice, v1.1 FOR SALE: One ShapeSystem 2300 CMD, extensively modified for human use. Includes s/w for anthro, transgender, sex-appeal enhance, & Gillian Anderson and Jason D. Poit clone forms. Some wear. $4500 obo. tverres@et.ins.gov Last edited by Schol-R-LEA : December 10th, 2006 at 03:19 PM. |
|
#2
|
|||
|
|||
|
Non-toy assemblers implement structs and build the offsets for you automatically.
|
|
#3
|
||||
|
||||
|
Do they implement structures in the manner that Schol-R-LEA did, or do they use macros?
|
|
#4
|
|||
|
|||
|
They implement them like a C compiler would from my perspective as a programmer.
Microsoft MASM has had this capability (and unions) for over 20 years. |
|
#5
|
||||
|
||||
|
This is true; I mentioned it in passing, but didn't give more detail because a) I originally wrote this for some fellow students in a class that used SPIM, b) the syntax for structures is different from one assembler to another, and c) the assembler pretty much just automates what I was showing, and I thought it important to show what the structures become in memory, at runtime.
|
|
#6
|
||||
|
||||
|
I know that I'm getting into a bit of an ego-wank with this, but I just came across another e-mail conversation I'd had with the same student around that time, explaining the use of jump tables (again, in SPIM assembly). It's not a very complete explanation, or as clear as (I hope) the structures one was, but I think I might as well share it for the same reasons (I've reformatted it somewhat for readability):
Quote:
Despite the last line of the original - HTH. |
![]() |
| Viewing: Dev Shed Forums > Programming Languages - More > Other Programming Languages > Structures in assembly language |
| Thread Tools | Search this Thread |
| Display Modes | Rate This Thread |
|
|
|
|