Valhalla Legends Forums Archive | Assembly Language (any cpu) | [MASM] Any suggestions on how to make my QuickSort Procedure better?

AuthorMessageTime
bethra
For an assignment I have to convert the following C code to MASM.

[code]
void quicksort(int a[], int left, int right)
{
int lp = left - 1;
int rp = right;
int v = a[right];

if (right <= left)
return;

while (1) {
while (a[++lp] < v)
;

while (v < a[--rp])
if (rp == left)
break;

if (lp >= rp)
break;

swap(a[lp], a[rp]);
}

swap(a[lp], a[right]);

quicksort(a, left, lp - 1);
quicksort(a, lp + 1, right);
}

void swap(int& x, int& y) {
  int temp = x;
  x = y;
  y = temp;
}
[/code]

Here is my working code, however I would like constructive criticism on how to make it better/more efficient.

[code]
TITLE  QuickSort Procedure (quicksort.asm)

INCLUDE Irvine32.inc

; Prototype for Swap procedure (swaps the values of two 32-bit integers)
Swap PROTO,
pValX:PTR DWORD, ; pointer to second array element
pValY:PTR DWORD ; pointer to second array element

.code
;----------------------------------------------------------
QuickSort PROC USES eax esi edi,
a:PTR DWORD, ; pointer to array
left:DWORD, ; left index
right:DWORD ; right index
LOCAL lp:DWORD, ; left partition
rp:DWORD, ; right partition
v:DWORD ; pivot, partitioning index
;
; Sort an array of 32-bit signed integers in ascending order
; using the quick sort algorithm.
; Receives: pointer to array, left index, right index
; Returns: nothing
;-----------------------------------------------------------

; if(right <= left)
mov eax,right ; EAX = right
cmp eax,left ; compare right to left
jle EndOfProc ; if (right <= left), jump to EndOfProc

; int lp = left - 1;
mov eax,left ; EAX = left
dec eax ; EAX = left - 1
mov lp,eax ; lp = left - 1

; int rp = right;
mov eax,right ; EAX = right
mov rp,eax ; rp = right

; int v = a[right];
mov esi,a ; points to address of a[0]
mov eax,right ; EAX = right
shl eax,2 ; EAX = right * 4
mov eax,[esi + eax] ; EAX = a[right]
mov v,eax ; v = a[right]

; while(1)
@startWhile0:

; while(a[++lp] < v);
@while1:

inc lp ; ++lp
mov esi,a ; points to address of a[0]
mov eax,lp ; EAX = lp
shl eax,2 ; EAX = lp * 4
mov eax,[esi + eax] ; EAX = a[lp]
cmp eax,v ; compare a[lp] to v
jl @while1 ; if (a[lp] < v), jump to @while1

; while (v < a[--rp])
@startWhile2:

dec rp ; --rp
mov esi,a ; points to address of a[0]
mov eax,rp ; EAX = rp
shl eax,2 ; EAX = rp * 4
mov eax,[esi + eax] ; EAX = a[rp]
cmp v,eax ; compare v to a[rp]
jnl @endWhile2 ; if !(v < a[rp]), jump to @endWhile2

; if (rp == left)
mov eax,rp ; EAX = rp
cmp eax,left ; compare rp to left
jne @startWhile2 ; if !(rp == left), jump to @startWhile2

@endWhile2:

; if (lp >= rp)
mov eax,lp ; EAX = lp
cmp eax,rp ; compare lp to rp
jge @endWhile0 ; if (lp >= rp), jump to @endWhile0

; swap(a[lp], a[rp]);
mov esi,a ; points to address of a[0]
mov eax,lp ; EAX = lp
shl eax,2 ; EAX = lp * 4
add esi,eax ; points to address of a[lp]
mov edi,a ; points to address of a[0]
mov eax,rp ; EAX = rp
shl eax,2 ; EAX = rp * 4
add edi,eax ; points to address of a[rp]
INVOKE Swap, esi, edi ; Swap the values of a[lp] and a[rp]

jmp @startWhile0 ; if (1), jump to @startWhile0

@endWhile0:

; swap(a[lp], a[right]);
mov esi,a ; points to address of a[0]
mov eax,lp ; EAX = lp
shl eax,2 ; EAX = lp * 4
add esi,eax ; points to address of a[lp]
mov edi,a ; points to address of a[0]
mov eax,right ; EAX = right
shl eax,2 ; EAX = right * 4
add edi,eax ; points to the address of a[right]
INVOKE Swap, esi, edi ; Swap the values of a[lp] and a[right]

; QuickSort(a, left, lp - 1);
mov eax,lp ; EAX = lp
dec eax ; --EAX or EAX = lp - 1
INVOKE QuickSort, a, left, eax ; Call QuickSort recursively

; QuickSort(a, lp + 1, right);
mov eax,lp ; EAX = lp
inc eax ; ++EAX or EAX = lp + 1
INVOKE QuickSort, a, eax, right ; Call QuickSort recursively

EndOfProc:

; return;
ret ; exit procedure

QuickSort ENDP

END
[/code]

This is the Swap procedure in case you're wondering.
[code]
TITLE Swap Procedure (swap.asm)

INCLUDE Irvine32.inc

.code
;-------------------------------------------------------
Swap PROC USES eax esi edi,
pValX:PTR DWORD, ; pointer to first integer
pValY:PTR DWORD ; pointer to second integer
;
; Exchange the values of two 32-bit integers
; Returns: nothing
;-------------------------------------------------------
mov esi,pValX ; get pointers
mov edi,pValY
mov eax,[esi] ; get first integer
xchg eax,[edi] ; exchange with second
mov [esi],eax ; replace first integer
ret
Swap ENDP

END
[/code]
December 7, 2006, 3:47 AM
bethra
Well it seems like I can't delete my own post.

Anyways, after playing around and trying to see if I can make the code better, this is my new QuickSort procedure.

[code]
TITLE  QuickSort Procedure (quicksort.asm)

INCLUDE Irvine32.inc

; Prototype for Swap procedure (swaps the values of two 32-bit integers)
Swap PROTO,
pValX:PTR DWORD, ; pointer to second array element
pValY:PTR DWORD ; pointer to second array element

.code
;----------------------------------------------------------
QuickSort PROC USES eax esi edi,
a:PTR DWORD, ; pointer to array
left:DWORD, ; left index
right:DWORD ; right index
LOCAL lp:DWORD, ; left partition
rp:DWORD, ; right partition
v:DWORD ; pivot, partitioning index
;
; Sort an array of 32-bit signed integers in ascending order
; using the quick sort algorithm.
; Receives: pointer to array, left index, right index
; Returns: nothing
;-----------------------------------------------------------

; if(right <= left)
mov eax,right ; EAX = right
cmp eax,left ; compare right to left
jle EndOfProc ; if (right <= left), jump to EndOfProc

; int lp = left - 1;
mov eax,left ; EAX = left
dec eax ; EAX = left - 1
mov lp,eax ; lp = left - 1

; int rp = right;
mov eax,right ; EAX = right
mov rp,eax ; rp = right

; int v = a[right];
mov eax,right ; EAX = right
shl eax,2 ; EAX = right * 4
add eax,a ; point to address of a[right]
mov eax,[eax] ; EAX = a[right]
mov v,eax ; v = a[right]

; while(1)
@startWhile0:

cld ; clear direction flag (forward)

inc lp ; ++lp
mov esi,a ; points to address of a[0]
mov eax,lp ; EAX = lp
shl eax,2 ; EAX = lp * 4
add esi,eax ; points to address of a[lp]

; while(a[lp] < v);
@startWhile1:

lodsd ; load a[lp] into EAX
cmp eax,v ; compare a[lp] to v
jnl @endWhile1 ; if !(a[lp] < v), jump to @endWhile1

inc lp ; ++lp
jmp @startWhile1 ; if (a[lp] < v), jump to @startWhile1

@endWhile1:

sub esi,4 ; point to address of a[lp]
mov edi,esi ; EDI = address of a[lp]

std ; set direction flag (reverse)

dec rp ; --rp
mov esi,a ; points to address of a[0]
mov eax,rp ; EAX = rp
shl eax,2 ; EAX = rp * 4
add esi,eax ; points to address of a[rp]

; while (v < a[rp])
@startWhile2:

lodsd ; load a[rp] into EAX
cmp v,eax ; compare v to a[rp]
jnl @endWhile2 ; if !(v < a[rp]), jump to @endWhile2

; if (rp == left)
mov eax,rp ; EAX = rp
cmp eax,left ; compare rp to left
je @endWhile2 ; if (rp == left), jump to @endWhile2

dec rp ; --rp
jmp @startWhile2 ; if (v < a[rp]), jump to @startWhile2

@endWhile2:

add esi,4 ; points to address of a[rp]

; if (lp >= rp)
mov eax,lp ; EAX = lp
cmp eax,rp ; compare lp to rp
jge @endWhile0 ; if (lp >= rp), jump to @endWhile0

; swap(a[lp], a[rp]);
INVOKE Swap, edi, esi ; Swap the values of a[lp] and a[rp]

jmp @startWhile0 ; if (1), jump to @startWhile0

@endWhile0:

; swap(a[lp], a[right]);
mov esi,a ; points to address of a[0]
mov eax,right ; EAX = right
shl eax,2 ; EAX = right * 4
add esi,eax ; points to the address of a[right]
INVOKE Swap, edi, esi ; Swap the values of a[lp] and a[right]

; QuickSort(a, left, lp - 1);
mov eax,lp ; EAX = lp
dec eax ; --EAX or EAX = lp - 1
INVOKE QuickSort, a, left, eax ; Call QuickSort recursively

; QuickSort(a, lp + 1, right);
mov eax,lp ; EAX = lp
inc eax ; ++EAX or EAX = lp + 1
INVOKE QuickSort, a, eax, right ; Call QuickSort recursively

EndOfProc:

; return;
ret ; exit procedure

QuickSort ENDP

END
[/code]

The main difference in this version of it is that I used the instruction "lodsd" in the left and right partitioning while loops to load the data addressed in ESI into EAX and automatically increment or decrement ESI by a DWORD.  Using this instruction reduced the number of instructions executed after the first time the left and right partition were incremented/decremented.  I assume that this is a good thing.

Also, instead of having a bunch of instructions to set ESI and EDI to the address of a[lp] and a[rp] before invoking the swap procedure, I removed them and was able to reuse the data already addressed in the registers.
December 14, 2006, 4:23 AM
bethra
Since there seems to either be a lack of people visiting this forum or simply disinterest in helping me optimize my code, could someone at least tell me if there are any programs that'll give stats on how fast or efficient the code runs so I can compare my two versions?

Both of my two versions work... actually I have 4 different ones all together.  The two I posted, as well as two versions of a different C implementation.

This semester (which ends this Friday) I'm taking/have taken an assembly course which is every MWF.  However, our professor really doesn't teach us squat and all the ASM I've learned this semester has been from teaching myself through the textbook.  I don't know whether I'm developing good or bad habits... in fact the only GRADED assignment that we've been given all semester will be this QuickSort (due this Friday).  Sigh.
December 14, 2006, 4:30 AM
Yegg
I've never used a profiler for assembly before, but I've also never heard of one. I'm sure there must be one out there somewhere. I would find a nice profiler (MSVS 2005 (or some other version) may have one) to record the speeds of the different parts of your program. This well help you determine which parts need improvement.
December 14, 2006, 7:52 PM
Kp
Since you've already succumbed to using INVOKE instead of constructing the call on your own, you could just instrument the sorter by hand.  For a program of this size, it would not take long to add code to count basic blocks.  Alternately, if you reimplemented this to run on Linux (again, not too much trouble for something of this size), you could run Callgrind (a member of the Valgrind family) to profile it.  However, this may be a bit much of a time investment if your program is due tomorrow.
December 15, 2006, 12:36 AM
TheMinistered
The only real thing I saw that could be optimized right off bat (without extensive analysis) was the beginning of your quicksort function, here is how i'd handle it:
[code]
mov eax, left
cmp eax, right
jge EndOfProc

dec eax
mov lp, eax

mov eax, right
mov rp, eax

                mov esi, a
shl eax, 2
add eax, a
mov eax, [eax + esi]
mov v, eax
...
[/code]
just a way of doing things by using less instructions...
January 1, 2007, 1:47 AM

Search