flat assembler
Message board for the users of flat assembler.

Turn procedure into Macro
MacroZ



Joined: 12 Oct 2018
Posts: 30
Can you help me convert this simple 32-bit memory copy routine into a macro? The macro should detect the size of the copy and adjust the generated code accordingly. If the size is 1, it should only use mov al,[Src] and mov [Dst],al; if the size is 4, it should do the same with eax instead of al. For anything else it should use the code below.

I tried using if CpySize eq 1 and it works for a literal 1, but when I pass sizeof.structure, for example, it no longer detects the actual size.

I'm not sure whether the match macroinstruction should be used instead, but then how should it be used when I need a separate match for each size?

I want special code to be generated when CpySize is 1 through 12; for anything else it should use the code below.

Code:
proc MemCopy uses esi edi,CpySize,Src,Dst
  cld
  mov ecx,[CpySize]
  mov esi,[Src]
  mov edi,[Dst]
  shr ecx,2
  rep movsd
  mov ecx,[CpySize]
  and ecx,3
  rep movsb
  ret
endp

If CpySize is 1 it should generate this code:
Code:
mov al,[Src]
mov [Dst],al

If CpySize is 2 it should generate this code:
Code:
mov ax,[Src]
mov [Dst],ax

If CpySize is 3 it should generate this code:
Code:
mov ax,[Src]
mov cl,[Src+2]
mov [Dst],ax
mov [Dst+2],cl

If CpySize is 4 it should generate this code:
Code:
mov eax,[Src]
mov [Dst],eax

If CpySize is 5 it should generate this code:
Code:
mov eax,[Src]
mov cl,[Src+4]
mov [Dst],eax
mov [Dst+4],cl

If CpySize is 6 it should generate this code:
Code:
mov eax,[Src]
mov cx,[Src+4]
mov [Dst],eax
mov [Dst+4],cx

If CpySize is 7 it should generate this code:
Code:
mov eax,[Src]
mov cx,[Src+4]
mov dl,[Src+6]
mov [Dst],eax
mov [Dst+4],cx
mov [Dst+6],dl

If CpySize is 8 it should generate this code:
Code:
mov eax,[Src]
mov ecx,[Src+4]
mov [Dst],eax
mov [Dst+4],ecx

If CpySize is 9 it should generate this code:
Code:
mov eax,[Src]
mov ecx,[Src+4]
mov dl,[Src+8]
mov [Dst],eax
mov [Dst+4],ecx
mov [Dst+8],dl

If CpySize is 10 it should generate this code:
Code:
mov eax,[Src]
mov ecx,[Src+4]
mov dx,[Src+8]
mov [Dst],eax
mov [Dst+4],ecx
mov [Dst+8],dx

If CpySize is 11 it should generate this code:
Code:
mov eax,[Src]
mov ecx,[Src+4]
mov dx,[Src+8]
mov [Dst],eax
mov [Dst+4],ecx
mov [Dst+8],dx
mov al,[Src+10]
mov [Dst+10],al

and if CpySize is 12 it should generate this code:
Code:
mov eax,[Src]
mov ecx,[Src+4]
mov edx,[Src+8]
mov [Dst],eax
mov [Dst+4],ecx
mov [Dst+8],edx

Anything else, and it should use the code at the top. The macro should be able to detect when CpySize is passed as sizeof.structure. If CpySize is zero, nothing must be generated.
Post 12 Oct 2018, 18:01
MacroZ



Joined: 12 Oct 2018
Posts: 30
I managed to put something together. I don't know if this is "sustainable"; it takes a lot of if / else if statements to get this right. Is there an easier way to do it?

Code:
;##############################################################################################################

; Copy a block of memory from one address to another

; Entry
;       CpySize = Number of bytes to copy Imm32/Label or ebx
;       Src = Source address Imm32/Label or esi
;       Dst = Destination address Imm32/Label or edi
; Used Regs
;       ebx esi edi If CpySize > 12, caller must save these first
; Return
;       None

macro _m_MemCopy CpySize*,Src*,Dst* {
  if ~ CpySize eqtype eax
    if CpySize > 0
      if CpySize = 1
        mov al,byte [Src]
        mov byte [Dst],al
      else if CpySize = 2
        mov ax,word [Src]
        mov word [Dst],ax
      else if CpySize = 3
        mov ax,word [Src]
        mov cl,byte [Src+2]
        mov word [Dst],ax
        mov byte [Dst+2],cl
      else if CpySize = 4
        mov eax,dword [Src]
        mov dword [Dst],eax
      else if CpySize = 5
        mov eax,dword [Src]
        mov cl,byte [Src+4]
        mov dword [Dst],eax
        mov byte [Dst+4],cl
      else if CpySize = 6
        mov eax,dword [Src]
        mov cx,word [Src+4]
        mov dword [Dst],eax
        mov word [Dst+4],cx
      else if CpySize = 7
        mov eax,dword [Src]
        mov cx,word [Src+4]
        mov dl,byte [Src+6]
        mov dword [Dst],eax
        mov word [Dst+4],cx
        mov byte [Dst+6],dl
      else if CpySize = 8
        mov eax,dword [Src]
        mov ecx,dword [Src+4]
        mov dword [Dst],eax
        mov dword [Dst+4],ecx
      else if CpySize = 9
        mov eax,dword [Src]
        mov ecx,dword [Src+4]
        mov dl,byte [Src+8]
        mov dword [Dst],eax
        mov dword [Dst+4],ecx
        mov byte [Dst+8],dl
      else if CpySize = 10
        mov eax,dword [Src]
        mov ecx,dword [Src+4]
        mov dx,word [Src+8]
        mov dword [Dst],eax
        mov dword [Dst+4],ecx
        mov word [Dst+8],dx
      else if CpySize = 11
        mov eax,dword [Src]
        mov ecx,dword [Src+4]
        mov dx,word [Src+8]
        mov dword [Dst],eax
        mov dword [Dst+4],ecx
        mov word [Dst+8],dx
        mov al,byte [Src+10]
        mov byte [Dst+10],al
      else if CpySize = 12
        mov eax,dword [Src]
        mov ecx,dword [Src+4]
        mov edx,dword [Src+8]
        mov dword [Dst],eax
        mov dword [Dst+4],ecx
        mov dword [Dst+8],edx
      else
        cld
        if ~ Src eqtype eax
          mov esi,Src
        end if
        if ~ Dst eqtype eax
          mov edi,Dst
        end if
        if CpySize mod 4 = 0
          mov ecx,CpySize shr 2
          rep movsd
        else
          mov ecx,CpySize shr 2
          rep movsd
          mov ecx,CpySize and 3
          rep movsb
        end if
      end if
    end if
  else
    cld
    mov ecx,CpySize
    if ~ Src eqtype eax
      mov esi,Src
    end if
    if ~ Dst eqtype eax
      mov edi,Dst
    end if
    shr ecx,2
    rep movsd
    mov ecx,CpySize
    and ecx,3
    rep movsb
  end if
}

;##############################################################################################################    
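A note on why the numeric comparison matters in the macro above, as a minimal sketch: as far as I understand fasm 1, eq checks whether its two operands are the same symbols, while = compares their numeric values, so only = sees through a named constant. sizeof.PIXEL and show_cmp are made-up names for illustration; the constant only stands in for whatever sizeof.structure would be defined as.

```asm
sizeof.PIXEL = 3        ; hypothetical constant, stands in for sizeof.structure

macro show_cmp size* {
  if size eq 3
    display 'eq sees a literal 3',13,10
  end if
  if size = 3
    display '= sees the value 3',13,10
  end if
}

show_cmp 3              ; both messages are displayed
show_cmp sizeof.PIXEL   ; only the = message is displayed
```

This only holds when the size is a number or numeric constant; a register argument still needs the eqtype check first, since a register has no numeric value to compare.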
Post 12 Oct 2018, 20:53
Furs



Joined: 04 Mar 2016
Posts: 1424
Looks good to me. You could of course "share" some of the code since some cases are similar, but if you find it easier this way just keep it.
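One way the cases could share code, as a rough sketch only: let the assembler walk the size at assembly time and emit dword, word and byte moves as needed. _m_SmallCopy and offs are hypothetical names, and this variant differs from the hand-written cases in that it reuses eax for every piece and interleaves each load with its store instead of doing all loads first. It assumes CpySize is a number or numeric constant and Src/Dst are labels or registers.

```asm
macro _m_SmallCopy CpySize*,Src*,Dst* {
  local offs
  offs = 0
  ; emit one dword load/store pair per full 4 bytes remaining
  while CpySize - offs >= 4
    mov eax,dword [Src+offs]
    mov dword [Dst+offs],eax
    offs = offs + 4
  end while
  ; then at most one word
  if CpySize - offs >= 2
    mov ax,word [Src+offs]
    mov word [Dst+offs],ax
    offs = offs + 2
  end if
  ; then at most one byte
  if CpySize - offs = 1
    mov al,byte [Src+offs]
    mov byte [Dst+offs],al
  end if
}

; usage (src_buf and dst_buf are assumed labels):
;   _m_SmallCopy 7,src_buf,dst_buf    ; emits a dword, a word and a byte copy
```

With a size of 0 none of the conditions hold, so nothing is generated.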

Your MemCopy function is not very efficient for the final rep movsb: rep-prefixed string instructions have some startup overhead and only pay off when copying large blocks.

If you target newer CPUs with fast "rep movs" (I assume you do, since you use it in the first place), then just use rep movsb for the entire thing. The CPU is smart enough to do the large copy in the most optimal way (note that this should be done for larger blocks only). The beauties of proper CISC.

If you also target older CPUs, then "rep movs" is unfortunately not a fast way to copy memory.
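For the large-block case, a minimal sketch of the "rep movsb for the entire thing" idea could look like the following. _m_MemCopyRep is a hypothetical name; the sketch assumes a CPU where rep movsb is fast, 32-bit code, and that the caller does not mind esi, edi and ecx being overwritten.

```asm
macro _m_MemCopyRep CpySize*,Src*,Dst* {
  cld               ; copy forwards
  mov esi,Src       ; source address
  mov edi,Dst       ; destination address
  mov ecx,CpySize   ; byte count
  rep movsb         ; one instruction copies the whole block
}
```

Whether this beats the movsd/movsb split depends on the CPU and the block size, so it is worth measuring rather than assuming.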
Post 13 Oct 2018, 12:36
MacroZ



Joined: 12 Oct 2018
Posts: 30
I haven't tested rep movsb alone; I will test it. But on 64-bit and a fairly new computer, rep movs should probably be avoided altogether. In my experience, rep stos instructions are very fast (superior to regular instructions), but rep movs instructions are very slow and should be avoided. I tried creating a 64-bit memcopy routine some years back using rep movs, and it was not good compared to regular instructions. Here is the 64-bit memcopy using regular instructions (both macro and procedure):

Code:
;##############################################################################################################

; Copy a block of memory from one address to another

; Entry
;       CpySize = Number of bytes to copy Imm64/Label or rcx
;       Src = Source address Imm64/Label or rdx
;       Dst = Destination address Imm64/Label or r8 
;       bRbx = Set to TRUE to allow the use of rbx register or FALSE if not
;       bRsi = Set to TRUE to allow the use of rsi register or FALSE if not
;       bRdi = Set to TRUE to allow the use of rdi register or FALSE if not
;       bR12 = Set to TRUE to allow the use of r12 register or FALSE if not
;       bR13 = Set to TRUE to allow the use of r13 register or FALSE if not
;       bR14 = Set to TRUE to allow the use of r14 register or FALSE if not
;       bR15 = Set to TRUE to allow the use of r15 register or FALSE if not
; Used Regs
;       rbx rsi rdi and r12-r15 If caller set them to be used in the arguments
; Return
;       None

macro _m_MemCopy CpySize*,Src*,Dst*,bRbx*,bRsi*,bRdi*,bR12*,bR13*,bR14*,bR15* {
  local loop32,check8,check4,check1,loop1,bye

  maxsize = 39+bRbx*8+bRsi*8+bRdi*8+bR12*8+bR13*8+bR14*8+bR15*8
  if Src eqtype rax
    maxsize = maxsize - 8
  end if
  if Dst eqtype rax
    maxsize = maxsize - 8
  end if

  if ~ CpySize eqtype rax & CpySize > 0 & CpySize <= maxsize
    ; Small copy: load everything into registers first
    qcount = 0
    current_offset = 0
    if CpySize/8 > qcount
      qcount = qcount + 1
      mov rax,qword [Src]
      current_offset = current_offset + 8
    end if
    if bRbx = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov rbx,qword [Src+current_offset]
        current_offset = current_offset + 8
      end if
    end if
    if bRsi = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov rsi,qword [Src+current_offset]
        current_offset = current_offset + 8
      end if
    end if
    if bRdi = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov rdi,qword [Src+current_offset]
        current_offset = current_offset + 8
      end if
    end if
    if bR12 = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov r12,qword [Src+current_offset]
        current_offset = current_offset + 8
      end if
    end if
    if bR13 = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov r13,qword [Src+current_offset]
        current_offset = current_offset + 8
      end if
    end if
    if bR14 = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov r14,qword [Src+current_offset]
        current_offset = current_offset + 8
      end if
    end if
    if bR15 = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov r15,qword [Src+current_offset]
        current_offset = current_offset + 8
      end if
    end if
    if CpySize/8 > qcount
      qcount = qcount + 1
      mov rcx,qword [Src+current_offset]
      current_offset = current_offset + 8
    end if
    if ~ Src eqtype rax
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov rdx,qword [Src+current_offset]
        current_offset = current_offset + 8
      end if
    end if
    if ~ Dst eqtype rax
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov r8,qword [Src+current_offset]
        current_offset = current_offset + 8
      end if
    end if
    if CpySize and 4 = 4
      mov r9d,dword [Src+current_offset]
      current_offset = current_offset + 4
    else if CpySize/8 > qcount
      qcount = qcount + 1
      mov r9,qword [Src+current_offset]
      current_offset = current_offset + 8
    end if
    if CpySize and 2 = 2
      mov r10w,word [Src+current_offset]
      current_offset = current_offset + 2
    else if CpySize/8 > qcount
      qcount = qcount + 1
      mov r10,qword [Src+current_offset]
      current_offset = current_offset + 8
    end if
    if CpySize and 1 = 1
      mov r11b,byte [Src+current_offset]
      current_offset = current_offset + 1
    else if CpySize/8 > qcount
      qcount = qcount + 1
      mov r11,qword [Src+current_offset]
      current_offset = current_offset + 8
    end if

    ; Then store the registers in the same order
    current_offset = 0
    qcount = 0

    if CpySize/8 > qcount
      qcount = qcount + 1
      mov qword [Dst],rax
      current_offset = current_offset + 8
    end if
    if bRbx = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov qword [Dst+current_offset],rbx
        current_offset = current_offset + 8
      end if
    end if
    if bRsi = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov qword [Dst+current_offset],rsi
        current_offset = current_offset + 8
      end if
    end if
    if bRdi = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov qword [Dst+current_offset],rdi
        current_offset = current_offset + 8
      end if
    end if
    if bR12 = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov qword [Dst+current_offset],r12
        current_offset = current_offset + 8
      end if
    end if
    if bR13 = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov qword [Dst+current_offset],r13
        current_offset = current_offset + 8
      end if
    end if
    if bR14 = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov qword [Dst+current_offset],r14
        current_offset = current_offset + 8
      end if
    end if
    if bR15 = 1
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov qword [Dst+current_offset],r15
        current_offset = current_offset + 8
      end if
    end if
    if CpySize/8 > qcount
      qcount = qcount + 1
      mov qword [Dst+current_offset],rcx
      current_offset = current_offset + 8
    end if
    if ~ Src eqtype rax
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov qword [Dst+current_offset],rdx
        current_offset = current_offset + 8
      end if
    end if
    if ~ Dst eqtype rax
      if CpySize/8 > qcount
        qcount = qcount + 1
        mov qword [Dst+current_offset],r8
        current_offset = current_offset + 8
      end if
    end if
    if CpySize and 4 = 4
      mov dword [Dst+current_offset],r9d
      current_offset = current_offset + 4
    else if CpySize/8 > qcount
      qcount = qcount + 1
      mov qword [Dst+current_offset],r9
      current_offset = current_offset + 8
    end if
    if CpySize and 2 = 2
      mov word [Dst+current_offset],r10w
      current_offset = current_offset + 2
    else if CpySize/8 > qcount
      qcount = qcount + 1
      mov qword [Dst+current_offset],r10
      current_offset = current_offset + 8
    end if
    if CpySize and 1 = 1
      mov byte [Dst+current_offset],r11b
      current_offset = current_offset + 1
    else if CpySize/8 > qcount
      qcount = qcount + 1
      mov qword [Dst+current_offset],r11
      current_offset = current_offset + 8
    end if
  else if ~ CpySize eqtype rax & CpySize > 0 | CpySize eqtype rax
    ; Large or run-time sized copy: loop over 16-byte chunks
    if ~ CpySize eqtype rax
      mov rcx,CpySize
      mov r9,CpySize
      if ~ Src eqtype rax
        mov rdx,Src
      end if
      if ~ Dst eqtype rax
        mov r8,Dst
      end if
    else
      mov r9,rcx
      if ~ Src eqtype rax
        mov rdx,Src
      end if
      if ~ Dst eqtype rax
        mov r8,Dst
      end if
    end if
    shr rcx,4
    mov r11d,16
    jz check8
  align 8
  loop32:
    mov rax,[rdx]
    mov r10,[rdx+8]
    lea rdx,[rdx+r11]
    mov [r8],rax
    mov [r8+8],r10
    add r8,r11
    sub rcx,1
    jz check8
    mov rax,[rdx]
    mov r10,[rdx+8]
    lea rdx,[rdx+r11]
    mov [r8],rax
    mov [r8+8],r10
    add r8,r11
    sub rcx,1
    jnz loop32
  check8:
    test r9d,8
    jz check4
    mov rax,[rdx]
    lea rdx,[rdx+8]
    mov [r8],rax
    add r8,8
  check4:
    test r9d,4
    jz check1
    mov eax,[rdx]
    lea rdx,[rdx+4]
    mov [r8],eax
    add r8,4
  check1:
    and r9,3
    jz bye
  align 4
  loop1:
    mov al,[rdx]
    lea rdx,[rdx+1]
    mov [r8],al
    add r8,1
    sub r9,1
    jnz loop1
  bye:
  end if
}

;##############################################################################################################    
Code:
;##############################################################################################################

; Copy a block of memory from one address to another

; Entry
;   rcx = Number of bytes to copy
;   rdx = Source address
;   r8 = Destination address
; Return
;   None
proc MemCopy,CpySize,pSrc,pDst
  mov r9,rcx
  shr rcx,4
  mov r11d,16
  jz .check8
align 8
.loop32:
  mov rax,[rdx]
  mov r10,[rdx+8]
  lea rdx,[rdx+r11]
  mov [r8],rax
  mov [r8+8],r10
  add r8,r11
  sub rcx,1
  jz .check8
  mov rax,[rdx]
  mov r10,[rdx+8]
  lea rdx,[rdx+r11]
  mov [r8],rax
  mov [r8+8],r10
  add r8,r11
  sub rcx,1
  jnz .loop32
.check8:
  test r9d,8
  jz .check4
  mov rax,[rdx]
  lea rdx,[rdx+8]
  mov [r8],rax
  add r8,8
.check4:
  test r9d,4
  jz .check1
  mov eax,[rdx]
  lea rdx,[rdx+4]
  mov [r8],eax
  add r8,4
.check1:
  and r9,3
  jz .ret
align 4
.loop1:
  mov al,[rdx]
  lea rdx,[rdx+1]
  mov [r8],al
  add r8,1
  sub r9,1
  jnz .loop1
.ret:
  ret
endp

;##############################################################################################################    
Post 13 Oct 2018, 15:50
MacroZ



Joined: 12 Oct 2018
Posts: 30
I would like some input on the latest macro: is it good? Is there anything in the macro variant that should be designed differently? Are there any places where match would be better to use, perhaps with sub-macros inside the main macro?

If anyone is alive on the forum (it doesn't seem like people are active here).
Post 13 Oct 2018, 22:31
Furs



Joined: 04 Mar 2016
Posts: 1424
MacroZ wrote:
I haven't tested rep movsb alone, I will test it. But on 64-bit and a fairly new computer rep movs should probably be avoided all together.
Both rep stosb and rep movsb are fast on newer CPUs. I don't know of any CPU where only stos is fast compared to movs (well, on older CPUs they're both slow, obviously). I know that on Haswell (my CPU) both have been enhanced to use 256-bit operations internally.

I found this with a lot more info if you want: https://stackoverflow.com/questions/43343231/enhanced-rep-movsb-for-memcpy

Your macro seems pretty large, but if it works then it's fine. match is used when you need to do symbol comparisons.

Note that FASM has two stages. The preprocessor deals with text (symbols), and match is part of it. The second stage is the assembly stage. if statements are part of the assembly stage, so they can mostly refer to numbers only (with a few exceptions, such as registers and the like). Symbols defined in the assembly stage (with the = operator) can only hold such numeric values. They can't hold arbitrary text/symbols; you need the "equ" and "define" preprocessor directives for that.

match is useful for macros with "custom syntax", instead of just passing parameters normally. e.g. you can extract a parameter that looks like "rdi:true" into "rdi" and "true" and do other sorts of text processing (both of those are text).

But if you pass parameters like "true, true, true, false, true", you don't need it, assuming of course that true expands during preprocessing to some number (1?). Remember: variables in the assembly stage don't contain arbitrary text; all of it has already been replaced by the preprocessor.

You can actually leave "true" and "false" undefined and use match to check for "=true" (literal text) on a parameter, if you want, instead of using if. Then the check is resolved at preprocessing time.
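A small sketch of that kind of text matching, with made-up names (reg_opt is hypothetical, and true/false are deliberately left undefined): the pattern splits an argument such as rdi:true at the colon, and =true / =false match the literal words rather than acting as wildcards.

```asm
macro reg_opt spec* {
  ; name is a wildcard, the colon and =true are matched literally
  match name : =true, spec \{
    display 'option enabled',13,10
  \}
  match name : =false, spec \{
    display 'option disabled',13,10
  \}
}

reg_opt rdi:true        ; the first match applies
reg_opt rdi:false       ; the second match applies
```

Since match runs in the preprocessor, the block that does not match never reaches the assembly stage at all. Inside either block, name would be replaced by the text in front of the colon (rdi here).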
Post 14 Oct 2018, 14:18
MacroZ



Joined: 12 Oct 2018
Posts: 30
Care to show me how you would do the macro prototype?
Post 14 Oct 2018, 17:32
donn



Joined: 05 Mar 2010
Posts: 132
AMD also has some rep movs alternatives in the Software Optimization Guide for AMD64 Processors (publication 25112, Rev. 3.06, September 2005), Section 5.13. It's a bit old, so it's possible their method has been superseded. They have examples, which I found interesting; I implemented part of one and tested it myself. The alignment isn't implemented yet, but copying seemed to work:

Code:

        mov [linearCopy.copyAddress], rcx
        mov [linearCopy.copyDestAddress], rdx
        mov [linearCopy.copySize], r8

        .copySet:

        mov rsi, [linearCopy.copyAddress]
        mov rdi, [linearCopy.copyDestAddress]

        cld

        mov rax, [linearCopy.copySize]
        mov [linearCopy.copySizeRemainder], rax
        shr rax, 101b                           ; Divide by 32
        mov [linearCopy.copySize], rax
        mov rax, [linearCopy.copySize]
        mov rdx, 0
        mov rcx, 100000b
        imul rcx
        mov r10, [linearCopy.copySizeRemainder]
        sub r10, rax
        mov [linearCopy.copySizeRemainder], r10

        mov rax, [linearCopy.copySize]
        cmp rax, 0
        je linearCopy.smallCopyOnly

        ;and rsp, -32;align 16                  ; Not working yet
        .copyLarge:                            ; Copy in chunks of 4 qwords. AMD Optimization recommendation. Compare with rep movsq.
        mov r8, [rsi]
        mov r9, [rsi+1000b]
        add rsi, 100000b
        movnti [rdi], r8
        movnti [rdi+1000b], r9
        add rdi, 100000b
        mov r8, [rsi-10000b]
        mov r9, [rsi-1000b]
        dec rax
        movnti [rdi-10000b], r8
        movnti [rdi-1000b], r9
        jnz linearCopy.copyLarge

        .smallCopyOnly:

        mov rcx, [linearCopy.copySizeRemainder]

        rep movsb

        mov rax, [linearCopy.copyDestAddress]
    
Post 14 Oct 2018, 18:51
MacroZ



Joined: 12 Oct 2018
Posts: 30
Even if it is old, people can still make use of it; it doesn't stop there. I was thinking more about the macro implementation itself, but thanks anyway, it's a nice example to keep in mind.
Post 14 Oct 2018, 20:24
Furs



Joined: 04 Mar 2016
Posts: 1424
MacroZ wrote:
Care to show me how you would do the macro prototype?
Well I was just giving you some possibilities since you asked about match. In general, you use match when you want to do some text processing like that (i.e. if you want to match the literal "true" text, instead of having it replaced by a number constant).

In this case, I'd rather just specify the registers as a single parameter, each separated by space. It's the cleanest way to call this macro IMO.

Something like this:
Code:
@err fix macro +

macro _m_MemCopy CpySize*,Src*,Dst*,regs* {
  local loop32,check8,check4,check1,loop1,bye
  irp reg, bRbx,bRsi,bRdi,bR12,bR13,bR14,bR15 \{
    local reg
    reg = 0
  \}
  define reg
  irps r, regs \{
    match =rbx, r \\{ define reg bRbx \\}
    match =rsi, r \\{ define reg bRsi \\}
    match =rdi, r \\{ define reg bRdi \\}
    match =r12, r \\{ define reg bR12 \\}
    match =r13, r \\{ define reg bR13 \\}
    match =r14, r \\{ define reg bR14 \\}
    match =r15, r \\{ define reg bR15 \\}
    match , reg \\{
      @err "Bad register"
    \\}
    reg = 1
    restore reg
  \}
  restore reg

  ; more stuff
}
Use it like:
Code:
_m_MemCopy 1, 2, 3, rbx r13 r15    
(just showing the register parameters of course)

Just FYI, after preprocessing, this will look like:
Code:
; the locals here would be replaced by some local auto-generated names due to our use of local without the \ so it's part of macro, not irp
bRbx = 0
bRsi = 0
bRdi = 0
bR12 = 0
bR13 = 0
bR14 = 0
bR15 = 0

bRbx = 1
bR13 = 1
bR15 = 1    
You can also use a bitwise mask of flags (each register = 1 bit) if you want a more efficient assembly process (not that important; it only affects assembly time and memory usage).

It's only slightly important because the assembly stage is multi-pass, so this will get "evaluated" again on each pass if needed.
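A rough sketch of the bitmask variant, reduced to three registers to keep it short. The REG_* constants, _m_ShowRegs and mask are hypothetical names, and the display lines only stand in for whatever code the real macro would generate per register.

```asm
REG_RBX = 1 shl 0       ; hypothetical flag values, one bit per register
REG_RSI = 1 shl 1
REG_RDI = 1 shl 2

macro _m_ShowRegs regs* {
  local mask
  mask = 0
  ; set one bit for every register named in the list
  irps r, regs \{
    match =rbx, r \\{ mask = mask or REG_RBX \\}
    match =rsi, r \\{ mask = mask or REG_RSI \\}
    match =rdi, r \\{ mask = mask or REG_RDI \\}
  \}
  ; test the bits at assembly time
  if mask and REG_RBX <> 0
    display 'rbx may be used',13,10
  end if
  if mask and REG_RSI <> 0
    display 'rsi may be used',13,10
  end if
  if mask and REG_RDI <> 0
    display 'rdi may be used',13,10
  end if
}

_m_ShowRegs rbx rdi     ; reports rbx and rdi
```

This keeps a single assembly-time variable per invocation instead of one per register, at the cost of slightly less readable conditions.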
Post 14 Oct 2018, 21:49
MacroZ



Joined: 12 Oct 2018
Posts: 30
Something like that.

I will try it out as soon as I get back to coding. I hate it when I can't get macros as clean as that.
Post 14 Oct 2018, 22:49
Copyright © 1999-2019, Tomasz Grysztar.
