flat assembler
Message board for the users of flat assembler.

flat assembler > Main > Having issues with the comparisons and jumps

Author
Thread Post new topic Reply to topic
moveax41h



Joined: 18 Feb 2018
Posts: 46
I wrote a program in C which, given some amount of pennies, will tell you how many combinations are possible to change for that dollar amount in quarters, dimes, nickels, and pennies.

As an exercise, I decided to implement this in fasm...

The control flow of this program is kicking my butt, specifically in the area of the nested loops.

Because I am still pretty new to fasm, the only way I was even able to attempt this program as by eyeing up my C code. If anyone so desires, I can post it by request but it basically is 3 nested for loops. I was trying to "mimick" this in fasm even though I'm sure theres probably even a more "assembler" way to do it.

I believe the main math logic of my program is correct, but I am having issues with the comparisons and jumps and was wondering if anything glaringly stupid or bad popped out to anyone. In fact, I know for a fact that something will, which is why I wanted to ask.

Thanks.

Code:
format PE console

entry main

include 'macro/import32.inc'

section '.rdata' data readable
Msg db 'Enter an amount in PENNIES and we will give the change combinations for quarters, dimes, nickels, and pennies', 0
frmspec db '%d', 0
Msg2 db 13, 10, 'Total combinations: %d', 0
pnull db 'pause>nul', 0
section '.data' data readable writeable
combos dd 0
postquarter_total dd 0
postdime_total dd 0
q dd 0
d dd 0
n dd 0
p dd 0
userinput dd 0

section '.code' readable executable

main:
        ;setup main stack
        push ebp
        mov ebp, esp

        push Msg
        call [printf]
        add esp, 4

        push userinput
        push frmspec
        call [scanf]
        add esp, 8

.quarter_loop: ; q as counter
        mov eax, [userinput] ; Move the penny count into eax
        mov edx, 0 ; prepare to do a divide
        mov ecx, 25 ; we are dividing by 25 for quarters
        div ecx ; do the divide
        cmp eax, [q] ; compare the quotient with the counter
        jle endme ; if the quotient is not less than or equal to the counter, jmp out

        mov eax, [q] ; Quarter counter into eax
        mov ecx, 25 ; Prepare for mult by 25
        mul ecx ; mult by 25 product in eax
        mov ebx, [userinput] ; put pennies into ebx
        sub ebx, eax ; subtract the quotient of q * 25 from pennies
        mov [postquarter_total], ebx ; assign to postquarter_total
        inc [q]
        jmp .quarter_loop

.dime_loop:
        mov eax, [postquarter_total] ; Problem loop... need to base cmp off d here
        mov edx, 0
        mov ecx, 10
        div ecx
        cmp eax, [d]
        jle .quarter_loop


        mov eax, [d] ; Move the dime cou
        mov ecx, 10
        mul ecx
        mov ebx, [postquarter_total]
        sub ebx, eax
        mov [postdime_total], ebx
        inc [d]
        jmp .dime_loop
.nickel_loop:
        mov eax, [postdime_total]
        mov edx, 0
        mov ecx, 5
        div ecx
        cmp eax, [n] ; This is never satisfied because n never increments and is always 0
        jle .dime_loop ; This is our exit condition to go up to the prev loop

        mov eax, [n]
        mov ecx, 5
        mul ecx
        mov ebx, [postdime_total]
        sub ebx, eax
        mov [p], ebx
        inc [combos]
        inc [n]                ; Bug fix hopefully...
        jmp .nickel_loop

endme:
        push [combos]
        push Msg2
        call [printf]
        add esp, 8

        mov esp, ebp
        pop ebp
        push pnull
        call [system]
        add esp, 4

        push 0
        call [exit]

section '.idata' import data readable

library msvcrt, 'msvcrt.dll'
import msvcrt, \
printf, 'printf', \
scanf, 'scanf', \
system, 'system', \
exit, 'exit'    



After trying to debug this, I don't think the jmps at the end of my loops are going to work, but I need some way to properly tally up and exit the loop when needed. Before I added those, the nickel loop would get to 40 (when the user inputted 200 pennies for example) and then go up to the dimes, then quarters loops and then the quarter loop would immediately exit on its first iteration after the nickel loop was finished. By adding inc [q] followed by jmp .quarter_loop at the bottom of the loop, I was able to properly get the quarter loop to iterate but now of course quarter loop just fires and then the program exits. So basically, when I fall through the end of quarter_loop, quarter_loop won't fire the proper number of times and when I jmp back up, the rest of the program doesn't get run. I know I'm doing something massive wrong with the nested loops here. Thanks.

_________________
-moveax41h
Post 24 Mar 2018, 00:07
View user's profile Send private message Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15912
Location: SDSS J140821.67+025733.2
Why not just do a direct division for each coin?

quarters = floor (cents / 25)
dimes = floor ((cents - quarters * 25) / 10)
nickels = floor ((cents - quarters * 25 - dimes * 10) / 5)
pennies = cents - quarters * 25 - dimes * 10 - nickels * 5

No loops required.
Post 24 Mar 2018, 00:40
View user's profile Send private message Visit poster's website Reply with quote
tthsqe



Joined: 20 May 2009
Posts: 716
This problem can be solved in constant time (i.e. you dont need loops). Im sure revolution can cook up such constant time code for us in x86 here in a jiffy.

Edit: however, this will require revo to properly understand the question first. Wink

Lets make sure. With 11 pennies we got 4 combos?
11 = 10+1 = 5+5+1 = 5+1+...+1 = 1+...+1
Post 24 Mar 2018, 02:50
View user's profile Send private message Reply with quote
moveax41h



Joined: 18 Feb 2018
Posts: 46
Well, I need to find the number of all combinations together of each coin type to create some given number of pennies... I'm not sure if straight division will give me that, I think it will just give me how many of each coin type individually make up the whole 200 pennies for example. I'm actually doing all of those math operations in my code above but just getting the number of quarters, dimes, nickels, and pennies in and of itself doesn't give me the total number of unique combos of all of them if that makes sense. E.G. given 200 pennies, the correct answer is 1463 combinations. Given 900 pennies, the answer is 104650 combinations.

_________________
-moveax41h
Post 24 Mar 2018, 08:00
View user's profile Send private message Reply with quote
donn



Joined: 05 Mar 2010
Posts: 105
Haven't looked at your math, but have you considered back-to-back conditional jumps? I often use a style like this (which is similar to yours) and it seems efficient enough:

Code:
.nextXItemLabel:                             ; Could be quarters or the beginning of some loop
mov ecx, [someVariableIndexVal]     ; Load loop entry criteria
mov edx, [someOtherCount]             ; Load a comparison count, maybe total items
cmp ecx, edx                                   ; Criteria to begin loop
jnl .exitLabel
jne .anotherExitPointLabel                 ; Back- to-back jumps to other exit points. For efficiency, these can be checked outside the loop so they don't slow down loop performance, but may help during debugging.
cmp ecx, 0                                       ; Other entry checks
jne .noItemsLabel                             ; Can jump to different parts within the loop conditionally

; loop body
.noItemsLabel:                                ; Within body, may want to jump here when 0, for example
.nextYItemLabel:                             ; Inner loop, structured the same as outer
mov ecx, [someYVariableIndexVal]     ; Load Y loop entry criteria
mov edx, [someOtherYCount]             ; Load a Y comparison count, maybe total items
cmp ecx, edx                                   ; Criteria to begin loop
jnl .exitYLabel

; inner loop body

jmp nextYItemLabel
.exitYLabel:

jmp nextXItemLabel
.exitLabel:
; exited...
.anotherExitPointLabel:                       
    


So, the previous in short is structured like this:
Code:
nextXItem:                             ; Multiple entry critera checks performed immediately after label
   nextYItem:                                    ; Entry criteria checks performeda after label, similar to nextXItem's
       jmp nextYItem
   jmp nextXItem    


That should solve nested control flow. Alternatively, you could put loop entry criteria after each loop label, and redirect to another loop, like so:

Code:
nextXItem:                             ; Some entry checks could redirect to nextYItem
    jmp nextXItem
nextYItem:                                       ; Entry checks performed same as other example
    jmp nextYItem
    


but that's not really nested. Those are two loops of the same nesting level. If anyone else has best practices, maybe they can shed some insight. The first example is what I've been using lately. I may have misunderstood your intended control-flow structure, however.
Post 24 Mar 2018, 16:14
View user's profile Send private message Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15912
Location: SDSS J140821.67+025733.2
tthsqe wrote:
Edit: however, this will require revo to properly understand the question first. Wink
Haha, yes you are correct. I misunderstood the problem.
moveax41h wrote:
Well, I need to find the number of all combinations together of each coin type to create some given number of pennies... I'm not sure if straight division will give me that, I think it will just give me how many of each coin type individually make up the whole 200 pennies for example. I'm actually doing all of those math operations in my code above but just getting the number of quarters, dimes, nickels, and pennies in and of itself doesn't give me the total number of unique combos of all of them if that makes sense. E.G. given 200 pennies, the correct answer is 1463 combinations. Given 900 pennies, the answer is 104650 combinations.
I'd suggest that if you want to use the brute force counting method to use recursion with a global counter variable/register.
Post 24 Mar 2018, 16:35
View user's profile Send private message Visit poster's website Reply with quote
tthsqe



Joined: 20 May 2009
Posts: 716
For all of you with windoze boxes, could you give this a try? It should popup a message box with the correct answers.

Code:
format PE GUI 5.0
entry Start

include 'win32ax.inc'

section '.text' code readable executable


madd:
    ; ebx[0:1:2:3] = ebx[0:1:2:3] * eax + ecx
       push  esi
        mov  esi, eax
        mov  eax, dword[ebx + 4*2]
        mul  esi
       push  edx
       push  eax
        mov  eax, dword[ebx + 4*0]
        mul  esi
       push  edx
       push  eax
        mov  eax, dword[ebx + 4*1]
        mul  esi
       imul  esi, dword[ebx + 4*3]
        add  dword[esp + 4*0], ecx
        adc  dword[esp + 4*1], eax
        adc  dword[esp + 4*2], edx
        adc  dword[esp + 4*3], esi
        pop  dword[ebx + 4*0]
        pop  dword[ebx + 4*1]
        pop  dword[ebx + 4*2]
        pop  dword[ebx + 4*3]
        pop  esi
        ret

divide:
    ; ebx[0:1:2:3] = ebx[0:1:2:3] / ecx
    ; output remainder in edx
        xor  edx, edx
        mov  eax, dword[ebx + 4*3]
        div  ecx
        mov  dword[ebx + 4*3], eax
        mov  eax, dword[ebx + 4*2]
        div  ecx
        mov  dword[ebx + 4*2], eax
        mov  eax, dword[ebx + 4*1]
        div  ecx
        mov  dword[ebx + 4*1], eax
        mov  eax, dword[ebx + 4*0]
        div  ecx
        mov  dword[ebx + 4*0], eax
        ret

round:
    ; if 2*edx > ecx, add one to ebx[0:1:2:3]
        xor  eax, eax
        add  edx, edx
         jc  .big
        cmp  ecx, edx
.big:
        adc  dword[ebx + 4*0], eax
        adc  dword[ebx + 4*1], eax
        adc  dword[ebx + 4*2], eax
        adc  dword[ebx + 4*3], eax
        ret

print_num:
    ; print ebx[0:1:2:3] to edi
    ; also clobbers ebx[0:1:2:3]
       push  ebp
        mov  ebp, esp
        mov  ecx, 10
.loop1:
       call  divide
       push  edx
        mov  eax, dword[ebx + 4*0]
         or  eax, dword[ebx + 4*1]
         or  eax, dword[ebx + 4*2]
         or  eax, dword[ebx + 4*3]
        jnz  .loop1
.loop2:
        pop  eax
        add  eax, '0'
      stosb
        cmp  esp, ebp
        jne  .loop2
        pop  ebp
        ret



print_calculation:
    ; in    : eax number to calculate
    ; in/out: edi address of buffer to write calculation
       push  esi
        sub  esp, 4*4
        mov  ebx, esp
        mov  esi, eax   ; esi = number to calculate

        mov  eax, 'f('
      stosw
        mov  dword[ebx + 4*0], esi
        mov  dword[ebx + 4*1], 0
        mov  dword[ebx + 4*2], 0
        mov  dword[ebx + 4*3], 0
       call  print_num
        mov  eax, ') = '
      stosd

        mov  eax, esi
        xor  edx, edx
        mov  ecx, 5
        div  ecx
        mov  esi, eax   ; esi = floor(number to calculate/5)

        mov  dword[ebx + 4*0], 4
        mov  dword[ebx + 4*1], 0
        mov  dword[ebx + 4*2], 0
        mov  dword[ebx + 4*3], 0
        mov  eax, esi
        mov  ecx, 54
       call  madd
        mov  eax, esi
        mov  ecx, 212
       call  madd
        mov  eax, esi
        mov  ecx, 225
       call  madd
        mov  ecx, 240
       call  divide
       call  round
       call  print_num
        add  esp, 4*4
        pop  esi
        ret



Start:
        lea  edi, [Message]

        mov  eax, 0
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 11
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 99
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 200
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 900
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 999
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 12345
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, -1
       call  print_calculation
        xor  eax, eax
      stosb

     invoke  MessageBoxA, 0, Message, "result", MB_OK
     invoke  ExitProcess, 0





section '.data' data readable writeable
Message:
    rb 1000


section '.idata' import data readable writeable

  library kernel32,'KERNEL32.DLL',\
          user32,'USER32.DLL'

include 'api\kernel32.inc'
include 'api\user32.inc'    
Post 24 Mar 2018, 16:36
View user's profile Send private message Reply with quote
donn



Joined: 05 Mar 2010
Posts: 105
That was quick, numbers look correct?


Description: Result
Filesize: 20.41 KB
Viewed: 1356 Time(s)

result1.JPG


Post 24 Mar 2018, 18:08
View user's profile Send private message Reply with quote
moveax41h



Joined: 18 Feb 2018
Posts: 46
So I re-arranged my code because I realized my looping branches looked like crap and frankly to you experienced folks they probably still do...

It still doesn't work properly though, something is wrong, if I enter 200 I only get 40 combinations so it's a premature exit... Gonna have to do some more debugging but I'm still tryin:

Code:
format PE console

entry main

include 'macro/import32.inc'

section '.rdata' data readable
Msg db 'Enter an amount in PENNIES and we will give the change combinations for quarters, dimes, nickels, and pennies', 0
frmspec db '%d', 0
Msg2 db 13, 10, 'Total combinations: %d', 0
pnull db 'pause>nul', 0
section '.data' data readable writeable
combos dd 0
postquarter_total dd 0
postdime_total dd 0
q dd 0
d dd 0
n dd 0
p dd 0
userinput dd 0
quarters_divided dd 0

section '.code' readable executable

main:
        ;setup main stack
        push ebp
        mov ebp, esp

        push Msg
        call [printf]
        add esp, 4

        push userinput
        push frmspec
        call [scanf]
        add esp, 8

        ; Doesn't need to be in loop because this will be constant
        mov eax, [userinput] ; Move the penny count into eax
        mov edx, 0 ; prepare to do a divide
        mov ecx, 25 ; we are dividing by 25 for quarters
        div ecx ; do the divide
        mov [quarters_divided], eax

.quarter_loop: ; q as counter
        mov eax, [quarters_divided]
        cmp eax, [q] ; compare the quotient with the counter
        jle endme ; if the quotient is not less than or equal to the counter, jmp out

        mov eax, [q] ; Quarter counter into eax
        mov ecx, 25 ; Prepare for mult by 25
        mul ecx ; mult by 25 product in eax
        mov ebx, [userinput] ; put pennies into ebx
        sub ebx, eax ; subtract the quotient of q * 25 from pennies
        mov [postquarter_total], ebx ; assign to postquarter_total
        
        

        .dime_loop:
                mov eax, [postquarter_total] ; Problem loop... need to base cmp off d here
                mov edx, 0
                mov ecx, 10
                div ecx
                cmp eax, [d]
                jle .after_dime


                mov eax, [d] ; Move the dime cou
                mov ecx, 10
                mul ecx
                mov ebx, [postquarter_total]
                sub ebx, eax
                mov [postdime_total], ebx
        
                .nickel_loop:
                        mov eax, [postdime_total]
                        mov edx, 0
                        mov ecx, 5
                        div ecx
                        cmp eax, [n] ; This is never satisfied because n never increments and is always 0
                        jle .after_nickel ; This is our exit condition to go up to the prev loop

                        mov eax, [n]
                        mov ecx, 5
                        mul ecx
                        mov ebx, [postdime_total]
                        sub ebx, eax
                        mov [p], ebx
                        inc [combos]
                        inc [n]                ; Bug fix hopefully...
                        jmp .nickel_loop

                .after_nickel:
                inc [d]
                jmp .dime_loop
    .after_dime:
    inc [q]
    jmp .quarter_loop

endme:
        push [combos]
        push Msg2
        call [printf]
        add esp, 8

        mov esp, ebp
        pop ebp
        push pnull
        call [system]
        add esp, 4

        push 0
        call [exit]

section '.idata' import data readable

library msvcrt, 'msvcrt.dll'
import msvcrt, \
printf, 'printf', \
scanf, 'scanf', \
system, 'system', \
exit, 'exit'
    
[/code]

_________________
-moveax41h
Post 25 Mar 2018, 21:19
View user's profile Send private message Reply with quote
moveax41h



Joined: 18 Feb 2018
Posts: 46
This is fast as hell. Works like a charm. Did you write it without even testing it?

tthsqe wrote:
For all of you with windoze boxes, could you give this a try? It should popup a message box with the correct answers.

Code:
format PE GUI 5.0
entry Start

include 'win32ax.inc'

section '.text' code readable executable


madd:
    ; ebx[0:1:2:3] = ebx[0:1:2:3] * eax + ecx
       push  esi
        mov  esi, eax
        mov  eax, dword[ebx + 4*2]
        mul  esi
       push  edx
       push  eax
        mov  eax, dword[ebx + 4*0]
        mul  esi
       push  edx
       push  eax
        mov  eax, dword[ebx + 4*1]
        mul  esi
       imul  esi, dword[ebx + 4*3]
        add  dword[esp + 4*0], ecx
        adc  dword[esp + 4*1], eax
        adc  dword[esp + 4*2], edx
        adc  dword[esp + 4*3], esi
        pop  dword[ebx + 4*0]
        pop  dword[ebx + 4*1]
        pop  dword[ebx + 4*2]
        pop  dword[ebx + 4*3]
        pop  esi
        ret

divide:
    ; ebx[0:1:2:3] = ebx[0:1:2:3] / ecx
    ; output remainder in edx
        xor  edx, edx
        mov  eax, dword[ebx + 4*3]
        div  ecx
        mov  dword[ebx + 4*3], eax
        mov  eax, dword[ebx + 4*2]
        div  ecx
        mov  dword[ebx + 4*2], eax
        mov  eax, dword[ebx + 4*1]
        div  ecx
        mov  dword[ebx + 4*1], eax
        mov  eax, dword[ebx + 4*0]
        div  ecx
        mov  dword[ebx + 4*0], eax
        ret

round:
    ; if 2*edx > ecx, add one to ebx[0:1:2:3]
        xor  eax, eax
        add  edx, edx
         jc  .big
        cmp  ecx, edx
.big:
        adc  dword[ebx + 4*0], eax
        adc  dword[ebx + 4*1], eax
        adc  dword[ebx + 4*2], eax
        adc  dword[ebx + 4*3], eax
        ret

print_num:
    ; print ebx[0:1:2:3] to edi
    ; also clobbers ebx[0:1:2:3]
       push  ebp
        mov  ebp, esp
        mov  ecx, 10
.loop1:
       call  divide
       push  edx
        mov  eax, dword[ebx + 4*0]
         or  eax, dword[ebx + 4*1]
         or  eax, dword[ebx + 4*2]
         or  eax, dword[ebx + 4*3]
        jnz  .loop1
.loop2:
        pop  eax
        add  eax, '0'
      stosb
        cmp  esp, ebp
        jne  .loop2
        pop  ebp
        ret



print_calculation:
    ; in    : eax number to calculate
    ; in/out: edi address of buffer to write calculation
       push  esi
        sub  esp, 4*4
        mov  ebx, esp
        mov  esi, eax   ; esi = number to calculate

        mov  eax, 'f('
      stosw
        mov  dword[ebx + 4*0], esi
        mov  dword[ebx + 4*1], 0
        mov  dword[ebx + 4*2], 0
        mov  dword[ebx + 4*3], 0
       call  print_num
        mov  eax, ') = '
      stosd

        mov  eax, esi
        xor  edx, edx
        mov  ecx, 5
        div  ecx
        mov  esi, eax   ; esi = floor(number to calculate/5)

        mov  dword[ebx + 4*0], 4
        mov  dword[ebx + 4*1], 0
        mov  dword[ebx + 4*2], 0
        mov  dword[ebx + 4*3], 0
        mov  eax, esi
        mov  ecx, 54
       call  madd
        mov  eax, esi
        mov  ecx, 212
       call  madd
        mov  eax, esi
        mov  ecx, 225
       call  madd
        mov  ecx, 240
       call  divide
       call  round
       call  print_num
        add  esp, 4*4
        pop  esi
        ret



Start:
        lea  edi, [Message]

        mov  eax, 0
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 11
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 99
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 200
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 900
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 999
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, 12345
       call  print_calculation
        mov  al, 10
      stosb

        mov  eax, -1
       call  print_calculation
        xor  eax, eax
      stosb

     invoke  MessageBoxA, 0, Message, "result", MB_OK
     invoke  ExitProcess, 0





section '.data' data readable writeable
Message:
    rb 1000


section '.idata' import data readable writeable

  library kernel32,'KERNEL32.DLL',\
          user32,'USER32.DLL'

include 'api\kernel32.inc'
include 'api\user32.inc'    

_________________
-moveax41h
Post 25 Mar 2018, 22:36
View user's profile Send private message Reply with quote
tthsqe



Joined: 20 May 2009
Posts: 716
moveax41h, the logic of your asm code leaves me in doubt of your c code. Why do you need to divide? Try translating the following into asm. Of course the variable total doesn't need to be passed around in your asm version because you can reserve a register for it.
Code:
#include <stdio.h>

void countP(int n, int& total) {total++;}
void countN(int n, int& total) {do countP(n, total); while ((n-=5) >= 0);}
void countD(int n, int& total) {do countN(n, total); while ((n-=10) >= 0);}
void countQ(int n, int& total) {do countD(n, total); while ((n-=25) >= 0);}

int f(int n) {
    int total = 0;
    if (n >= 0) countQ(n, total);
    return total;
}

int main()
{
    int number, result;
loop:
    printf("enter integer to calculate\n");
    result = scanf("%d", &number);
    if (result == EOF || result == 0) {
        printf("exiting\n");
        return 0;
    }
    printf("f(%d) = %d\n",number,f(number));
    goto loop;
}    
Post 26 Mar 2018, 11:50
View user's profile Send private message Reply with quote
Furs



Joined: 04 Mar 2016
Posts: 1203
Technically, that's C++ code, because int& is a reference, and C doesn't have references. (i.e. you'll need a C++ compiler to compile it, not that it's a problem, just mentioning it if OP can't compile it)

I'm sure this can be simplified to a math problem with Combinations/Permutations stuff (so just need to calculate some factorials), but I'm not that knowledgeable there.
Post 26 Mar 2018, 12:16
View user's profile Send private message Reply with quote
moveax41h



Joined: 18 Feb 2018
Posts: 46
Here's the C program:


Code:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        printf("Usage: change [total cents]\n");
        return 1;
    }

    int total = atoi(argv[1]);

    printf("Total combinations of quarters, dimes, nickels, and pennies to make %d\n", total);

    int combos = 0;
    int postquarter_total;
    int postdime_total;
    for (int q = 0; q <= total / 25; q++) // quarters
    {
        postquarter_total = total - q * 25;
        for (int d = 0; d <= postquarter_total / 10; d++) // dimes
        {
            postdime_total = postquarter_total - d * 10;
            for (int n = 0; n <= postdime_total / 5; n++) // nickels
            {
                int p = postdime_total - n * 5; // pennies
                combos++;
            }
        }
    }

    printf("%d combinations\n", combos);

    return 0;
}
    
Post 26 Mar 2018, 20:47
View user's profile Send private message Reply with quote
tthsqe



Joined: 20 May 2009
Posts: 716
Ok c code is fine, you are simply suffering from a GCE concerning labels. I will give you control flow for a for loop, and hopefully you can then apply it three times.
Code:
for (a=b; a<=c; a+=d) {body}
    
Code:
        mov a, b
.check_cond:
        cmp a, c
        jg  .exit             ; exit loop if a>c
        body
        add a, d
        jmp .check_cond   ; unconditional jump back
.exit:
    

Some people would be up in arms about that extra uncondition jump from a performance point of view. This is also possible:
Code:
        mov a, b
.check_cond:
        cmp a, c
        jg  .exit           ; exit loop if a>c
.cond_true:
        body
        add a, d
        cmp a, c
        jle .cond_true  ; dont exit if a<=c
.exit:
    


x86 is a register-starved arch. I hope you fix your conversion and also convert the c code I recommended, and compare them for speed.
Post 26 Mar 2018, 22:19
View user's profile Send private message Reply with quote
DimonSoft



Joined: 03 Mar 2010
Posts: 306
Location: Belarus
tthsqe wrote:
Some people would be up in arms about that extra uncondition jump from a performance point of view.

<OffTop>They shouldn’t really be since static prediction always predicts unconditional jump as taken and, if memory serves me, it has a very small penalty. Unlike conditional branches which are predicted with more complex rules and the predictions for which may fail.</OffTop>
Post 27 Mar 2018, 15:09
View user's profile Send private message Visit poster's website Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15912
Location: SDSS J140821.67+025733.2
Predictions, and their rules, and even whether or not the CPU supports branch prediction, is not an absolute. Each CPU does things differently. Try not to fall into the trap of reading one CPU manual and thinking all other CPUs do the same thing or have the same behaviours. Some people use old CPUs, some people use new CPUs, some people use a different brand of CPU, some people use embedded single core CPUs, some people use server-class multi-core CPUs., some systems have fast SRAM, some systems have slow ROM, some CPUs have no cache, etc.
Post 27 Mar 2018, 22:24
View user's profile Send private message Visit poster's website Reply with quote
moveax41h



Joined: 18 Feb 2018
Posts: 46
Thanks for the help everyone. I finally squashed all the bugs and got this thing working.

#1 mistake was that I forgot to set each loop counter to 0 before entering the loop. Thus after each loop ran through all of its cycles in 1 iteration of the outer loop, it would never actually run again, severely misrepresenting the count.

The #2 mistake was that I should have been using jl instead of jle because the loops need to execute when the numbers being compared are equal to one another as well... It ONLY needs to jump out of the loop when the number is less.

These 2 fixes made the program work finally...

Code:
format PE console 

entry main 

include 'macro/import32.inc' 

section '.rdata' data readable 
Msg db 'Enter an amount in PENNIES and we will give the change combinations for quarters, dimes, nickels, and pennies', 0 
frmspec db '%d', 0 
Msg2 db 13, 10, 'Total combinations: %d', 0 
pnull db 'pause>nul', 0 
section '.data' data readable writeable 
combos dd 0 
postquarter_total dd 0 
postdime_total dd 0 
q dd 0 
d dd 0 
n dd 0 
p dd 0 
userinput dd 0 
quarters_divided dd 0 

section '.code' readable executable 

main: 
        ;setup main stack 
        push ebp 
        mov ebp, esp 

        push Msg 
        call [printf] 
        add esp, 4 

        push userinput 
        push frmspec 
        call [scanf] 
        add esp, 8 

        ; Doesn't need to be in loop because this will be constant 
        mov eax, [userinput] ; Move the penny count into eax 
        mov edx, 0 ; prepare to do a divide 
        mov ecx, 25 ; we are dividing by 25 for quarters 
        div ecx ; do the divide 
        mov [quarters_divided], eax 
        mov [q], 0

.quarter_loop: ; q as counter 
        mov eax, [quarters_divided]  
        cmp eax, [q] ; compare the quotient with the counter 
        jl endme ; if the quotient is not less than or equal to the counter, jmp out 

        mov eax, [q] ; Quarter counter into eax 
        mov ecx, 25 ; Prepare for mult by 25 
        mul ecx ; mult by 25 product in eax 
        mov ebx, [userinput] ; put pennies into ebx 
        sub ebx, eax ; subtract the quotient of q * 25 from pennies 
        mov [postquarter_total], ebx ; assign to postquarter_total 
        mov [d], 0 
         

        .dime_loop: 
                mov eax, [postquarter_total] ; Problem loop... need to base cmp off d here 
                mov edx, 0 
                mov ecx, 10 
                div ecx 
                cmp eax, [d] 
                jl .after_dime 


                mov eax, [d] ; Move the dime cou 
                mov ecx, 10 
                mul ecx 
                mov ebx, [postquarter_total] 
                sub ebx, eax 
                mov [postdime_total], ebx 
                        mov [n], 0
                .nickel_loop: 
                        mov eax, [postdime_total] 
                        mov edx, 0 
                        mov ecx, 5 
                        div ecx 
                        cmp eax, [n] ; This is never satisfied because n never increments and is always 0 
                        jl .after_nickel ; This is our exit condition to go up to the prev loop 

                        mov eax, [n] 
                        mov ecx, 5 
                        mul ecx 
                        mov ebx, [postdime_total] 
                        sub ebx, eax 
                        mov [p], ebx 
                        inc [combos] 
                        inc [n]                ; Bug fix hopefully... 
                        jmp .nickel_loop 

                .after_nickel: 
                inc [d] 
                jmp .dime_loop 
    .after_dime: 
    inc [q] 
    jmp .quarter_loop 

endme: 
        push [combos] 
        push Msg2 
        call [printf] 
        add esp, 8 

        mov esp, ebp 
        pop ebp 
        push pnull 
        call [system] 
        add esp, 4 

        push 0 
        call [exit] 

section '.idata' import data readable 

library msvcrt, 'msvcrt.dll' 
import msvcrt, \ 
printf, 'printf', \ 
scanf, 'scanf', \ 
system, 'system', \ 
exit, 'exit'     

_________________
-moveax41h
Post 28 Mar 2018, 07:50
View user's profile Send private message Reply with quote
DimonSoft



Joined: 03 Mar 2010
Posts: 306
Location: Belarus
revolution wrote:
Predictions, and their rules, and even whether or not the CPU supports branch prediction, is not an absolute. Each CPU does things differently. Try not to fall into the trap of reading one CPU manual and thinking all other CPUs do the same thing or have the same behaviours. Some people use old CPUs, some people use new CPUs, some people use a different brand of CPU, some people use embedded single core CPUs, some people use server-class multi-core CPUs., some systems have fast SRAM, some systems have slow ROM, some CPUs have no cache, etc.

Well, I’ve read quite a lot of optimization manuals by now, from very old x86 processors (the oldest was around 80286, I guess) to quite modern ones (like Core i7). The rules are the same for this particular case.

Why would one ever decide to predict an unconditional branch as not taken?

I guess, this is the case where your (generally useful) habit to remind about the relativity of optimization techniques becomes useless. Besides, there’s nothing wrong with taking two pieces of code that have a small difference and talking about possible pros and cons of each one. Engineering is all about making decisions and choosing implementations. We need some playground to share our knowledge on optimizations and the playground is made of such code snippets.

Nearly every pipelined CPU out there has a branch predictor. 99.999% of x86 CPUs (except really old ones, I guess) these days do have it. People with old CPUs already cope with the fact that most software either fails to run or is not optimized for their particular CPU model. The number of cores doesn’t render branch prediction talks wrong since each core has the same rules and the code snippet runs on the same core (with respect to task switching, of course).

What I was trying to say is that there’s no need to rewrite a loop just because someone might complain about unconditional jumps. If JMP suits the needs of this particular loop better, any attempts to avoid it with Jcc will almost definitely render its performance worse. A good advice comes with a rationale. There seems to be no rationale behind the fear of unconditional jumps ending loops.
Post 28 Mar 2018, 11:05
View user's profile Send private message Visit poster's website Reply with quote
tthsqe



Joined: 20 May 2009
Posts: 716
moveax41h: I'm glad you got it working, now can you make it faster? I will test your code when I can on a windows box, but at first sight it looks like it will be on the slow side.

I more aptly should have said "aesthetic" in place of "performance". Use of an unconditional jump is a hint that there might be a better way to layout your basic blocks. Besides, everyone else is doing it nowadays. Here is gcc output at O3 (you have to go all of the way to O0 to get a jmp) for you javascriptless users:

Code:
int f(int x);

int g(int a, int b, int c, int d) {
    for (a=b; a<=c; a+=d)
        a= f(a);
    return a;
}    

Code:
g(int, int, int, int):
  cmp esi, edx
  mov edi, esi
  jg .L6
  push rbp
  push rbx
  mov ebp, ecx
  mov ebx, edx
  sub rsp, 8
.L3:
  call f(int)
  lea edi, [rax+rbp]
  cmp ebx, edi
  jge .L3
  add rsp, 8
  mov eax, edi
  pop rbx
  pop rbp
  ret
.L6:
  mov eax, esi
  ret    
Post 28 Mar 2018, 11:33
View user's profile Send private message Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15912
Location: SDSS J140821.67+025733.2
DimonSoft wrote:
Why would one ever decide to predict an unconditional branch as not taken?
Thanks for the response. I was intending to comment on your statement about penalties and predictions of condition branches. I should have quoted the part I was referring to.
Post 28 Mar 2018, 12:19
View user's profile Send private message Visit poster's website Reply with quote
Display posts from previous:
Post new topic Reply to topic

Jump to:  


< Last Thread | Next Thread >
Forum Rules:
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum


Copyright © 1999-2018, Tomasz Grysztar.

Powered by rwasa.