Skip to content
July 29, 2011 / codehost

Linked List in Assembly

A while ago I took a course where I learned the (Intel) processor architecture and coding in Assembly Language. If you could get past the lackluster presentation and a certain lack of motivation from the teacher, the stuff presented was pretty impressive.

It was fun seeing C code that was usually taken for granted be deconstructed in such a low level yet  logical manner. Later I wrote a small program that took advantage of existing C libraries and the OS to do something that, despite being better suited for higher level languages, was nice to tinker with. I will present it in today’s post.

The program is a set of procedures and data structures that represent a linked list, implementing a queue actually. The source code can be found in my Github corner. I wrote it in Linux, but forced the assembler (nasm) to compile Intel syntax, which I am more accustomed with.

First of all, the well known standard library functions used are:

  • malloc
  • free
  • putchar
  • puts

We have to declare them for the linker. Then, any self-respecting program has a data segment reserved, where I define the structure for the list node and several strings for messages. Finally, the .bss section is for our uninitialized variables, in our case the pointer to the NULL list head, because the list is empty at first.

extern malloc
extern free
extern putchar
extern puts

section .data
    size_i:             ; Used to determine the size of the structure
    struc node
        info: resd  1
        next: resd  1
    endstruc
    len: equ $ - size_i  ; Size of the data type

    nullMes:    db  'Null pointer - the list is empty', 0
    addMes:     db  'Adding a new element', 0
    printMes:   db  'Printing a linked list:', 0
    cleanMes:   db  'Cleaning an element', 0
    doneCleanMes:   db  'Ready for cleaning...', 0
    emptyListMes:    db  '- empty list -', 0

section .bss
    prim:   resd  1

Several functions are required to define the behavior of our queue. They are:

  • append: add an element at the end of the queue. If the queue is empty, it will be initialized
  • pop: remove the element at the tail of the queue
  • cleanup: free all resources and destroy the queue
  • print: nicely print the elements to the screen. The data type that it holds is the ASCII character, but it can be extended to anything

Here is the code, in that order. First the append procedure:

; Procedure: append
; Appends an element at the end of a linked list
; If the linked list is empty, initialize the list
; Params (in order of pushing on the stack):
; dword element - data to be added
; dword prim - first element in the list
; Return: none
; Modifies the value of prim if it is null

append:
    push ebp            ; Save the stack
    mov ebp, esp

    push eax            ; Save the registers
    push ebx

    push len            ; Size to get from the heap and pass the size to the malloc function
    call malloc         ; Call the malloc function - now eax has the address of the allocated memory

    mov ebx, [ebp + 12]
    mov [eax + info], ebx    ; Add the element to the node data field
    mov dword [eax + next], 0   ; Address of the next element is NULL, because it is the last element in the list

    mov ebx, [ebp + 8]  ; Retrieve the address to the first element
    cmp dword [ebx], 0
    je null_pointer

    mov ebx, [ebx]      ; This parameter was the address of the address
                        ; Now it is the address of the first element, in this case, not null
    ; If it is not NULL, find the address of the last element
next_element:
    cmp dword [ebx + next], 0
    je found_last
    mov ebx, [ebx + next]
    jmp next_element

found_last:
    push eax
    push addMes
    call puts
    add esp, 4              ; Restore the stack
    pop eax

    mov [ebx + next], eax   ; Last element is this one from the newly allocated memory block

go_out:
    pop ebx             ; Restore registers
    pop eax

    mov esp, ebp
    pop ebp
    ret 8               ; Return to the caller function and cleaning the stack

null_pointer:
    push eax
    push nullMes
    call puts
    add esp, 4
    pop eax

    mov [ebx], eax      ; Point the address of the first element to the allocated memory

    jmp go_out

Yup, “noodles” is the first thing that comes to mind when trying to analyze the code. In a nutshell, the code above retrieves one of the parameters from the stack – the data to add in the new node, then memory is allocated for the new node and its pointer placed in the EAX register. The node structure is filled with the required data and pointer to next element – NULL – as it will be the last element. Then a loop through all elements until the last one is reached, that will be linked to the newly created node through its “next” field. A nice message is printed when an element is added, or if something horrible happens.

Next on our menu is pop:

; Procedure: pop
; Retrieves the first element in the list in eax and deletes it.
; The list acts like a queue
; Params:
; - address to the address of the first element
; Return: eax = the element removed

pop:
    push ebp
    mov ebp, esp

    push ebx
    push edx
    push ecx

    mov ebx, [ebp + 8]          ; Address of the address of the first element
    mov eax, [ebx]              ; Address of the first element
    cmp eax, 0
    je donePop                  ; Don't do anything if the list is empty
    mov edx, eax
    mov eax, [eax + info]       ; The information stored in the element

    mov ecx, [edx + next]
    mov [ebx], ecx              ; The address to the first element changed to point to the next one

    pusha                       ; This is suboptimal but I don't know what registers are messed up by free (I cannot believe they coded it this way)
    push edx                    ; Free the allocated resources
    call free
    add esp, 4
    popa

donePop:
    pop ecx
    pop edx                     ; Done - restore the registers and return to the calling procedure
    pop ebx         

    mov esp, ebp
    pop ebp
    ret 4

This one will store the data in the first element in EAX as a return value. Then the address to the first element is changed to its successor. Memory is freed and everyone is happy. “cleanup” is next in line:

; Procedure: cleanup
; Deallocate the memory used by a linked list
; This function is recursive
; Params:
; - address of the list node to free ([ebp + 8])
; Return: none

cleanup:
    push ebp                    ; Save the stack pointer
    mov ebp, esp

    push eax                    ; Save the working registers

    mov eax, [ebp + 8]          ; Retrieve the parameters
    cmp eax, 0                  ; If the address is NULL, then it is past the end of the list
    je doneCleaning             ; No more recursive calls; print an appropriate message

    push dword [eax + next]     ; Push the address of the next element as parameter for the next call to this procedure
    call cleanup

    pusha                       ; Print a message that cleaning is underway
    push cleanMes               
    call puts                   
    add esp, 4
    popa

    push eax                    ; Push the address of the current element as parameter for the free function
    call free
    add esp, 4                  ; Aaargh I hate these unpredictable stdlib procedures!

doneAll:                        ; Prepare to exit the procedure
    pop eax
    mov esp, ebp
    pop ebp
    ret 4

doneCleaning:                   ; Print a message that the last element was passed to the procedure
    pusha
    push doneCleanMes
    call puts
    add esp, 4
    popa
    jmp doneAll

This is a recursive function. Memory is deallocated starting with the last element. Finally, to print to stdout use:

; Procedure: print
; Prints the elements of a linked list
; The elements are considered ASCII characters, otherwise things might blow in your face with various degrees of terribleness
; Params (in order of pushing on the stack):
; dword prim - address of the first element
; Return: none

print:
    push ebp
    mov ebp, esp

    push ebx
    mov ebx, [ebp + 8]  ; Address of the first element
    cmp ebx, 0
    je emptyList

    push eax
    push printMes       ; Print message "Printing a linked list"
    call puts
    add esp, 4
    pop eax

next_char:
    push dword [ebx + info]
    call putchar
    mov ebx, [ebx + next]
    cmp ebx, 0
    jne next_char

    push dword 10
    call putchar

done:
    pop ebx
    mov esp, ebp
    pop ebp
    ret 4

emptyList:
    pusha
    push emptyListMes
    call puts
    add esp, 4
    popa
    jmp done

Just some more pasta that iterates sequentially through the list to display each character in its glory.

Now lets write the main procedure. Here is where the .text segment is also declared, so all other procedures should follow this code snippet in the source file. The main function just appends elements in different, more or less catastrophic ways, prints the list, pops it, destroys it, etc.

section .text
    global main

main:
    push ebp            ; Standard procedure entry
    mov ebp, esp

    mov dword [prim], 0 ; Pointer is NULL initially - the list is empty

    push dword 'a'      ; Element to add to the list
    push prim           ; Address of the first element in the list and pass it to the init function, through the stack
    call append         ; Call the init function

    push dword 'b'
    push prim
    call append

    push dword 'c'
    push prim
    call append

    mov ecx, 1
zzzZZZ:
    push ecx            ; I don't know why or where ecx is changed, but if I don't save it, the loop breaks and becomes infinite
    push dword 'Z'
    push prim
    call append
    pop ecx
    loop zzzZZZ

    push dword [prim]   ; Send the list to printing to the stdout
    call print

    push prim           ; Remove the first element: a
    call pop

    push dword [prim]
    call print

    push prim           ; Remove the first element: b
    call pop

    push dword [prim]
    call print

    push prim           ; Remove the first element: c
    call pop

    push dword [prim]
    call print

    push prim           ; Remove the first element: Z
    call pop

    push dword [prim]
    call print

    push prim           ; Remove the first element: nil
    call pop

    push dword [prim]
    call print

    push dword [prim]   ; Deallocate memory
    call cleanup

    mov esp, ebp        ; Restore the stack
    pop ebp
    ret

In Linux, you can use this nifty script to assemble the program and start messing with it:

# Test if there is a filename argument
if test $# -eq 0;
then
    echo "Usage: asm filename"
    exit 0
fi

# If there is a file to be assembled
nasm -f elf $1
name=$(echo $1 | cut -d. -f1)
gcc -o $name $name.o
rm $name.o
Have fun. Oh, and here’s some comic relief from the Oatmeal: The State of the Web.
About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: