flat assembler
Message board for the users of flat assembler.
donn 12 Oct 2018, 15:07
x64 calling conventions on Windows are important; I included some links here:
https://board.flatassembler.net/topic.php?p=201275#201275
Alignment is also important. furs had a good method; I'll try to find a link for that post...
Tomasz Grysztar 12 Oct 2018, 16:33
Please note that x32 is usually only used to refer to a very specific ABI and thus it might be a bit confusing when used like here.
I would suggest naming this conversion "x86 to x86-64", though "x86 to x64" might be more appropriate since the Windows API is involved.
vafylec 13 Oct 2018, 10:53
- I've changed the title to 'tips to convert x86 to x64'.
- Out of interest, would the terms '32-bit x86' and '64-bit x86' be acceptable?
- I'm used to x16/x32/x64/x86 (16/32/64/16 or 32-bit) but don't mind using different terminology.
- Thanks donn and Tomasz Grysztar (and thanks for FASM).
vafylec 16 Oct 2018, 14:18
I've attempted to convert the asm from 32-bit to 64-bit, line by line.
I'd appreciate it greatly if anybody could check over the code. I've been wanting to convert this for around 2 years.

changes:
PE -> PE64
'win32a.inc' -> 'win64a.inc'
e__ -> r__ [register names][apart from in some cmp instructions]
popfd -> popfq
pushfd -> pushfq
:DWORD -> :QWORD

incorrect comment in original file (before/after):
xchg eax,ebx ;EBX=bytes 1..5 of Needle
xchg eax,ebx ;EBX=bytes 1..4 of Needle (0-based)

Code:

    ;attempt to convert 32-bit code (in link) to 64-bit code:
    ;Machine code binary buffer searching regardless of NULL - Scripts and Functions - AutoHotkey Community
    ;https://autohotkey.com/board/topic/23627-machine-code-binary-buffer-searching-regardless-of-null/#entry152824

    format PE64 GUI 4.0
    entry start

    include 'win64a.inc'

    section '.data' data readable writeable
    hayStack db '1111111122222111111'
    Needle db '22222'

    section '.code' code readable executable

    start:
      push 0 5 19 Needle hayStack
      call InBuf
      push -1 5 19 Needle hayStack
      call InBufRev
      invoke ExitProcess,0

    proc InBuf stdcall uses rbx rcx rdx rsi rdi, hayStack,Needle,hayStackSize,NeedleSize,StartOffset
      local lNeedleRemDwords:QWORD ;(NeedleSize-4)>>2
      local lNeedleRemTail:QWORD ;Needle remainder byte count (NeedleSize-4) mod 4 -> (0..3)
      local lNeedleRemPtr4:QWORD ;&Needle[4]
      pushfq
      mov rbx,[NeedleSize]
      cmp rbx,0
      jle .NotFound
      mov rcx,[hayStackSize]
      mov rax,[StartOffset]
      sub rcx,rax
      sub rcx,rbx
      inc rcx ;repetitions=hayStackSize-StartOffset-NeedleSize+1
      jle .NotFound
      mov rdi,[hayStack]
      add rdi,rax ;rdi=&(hayStack[StartOffset])
      ;load Needle FirstByte
      mov rsi,[Needle]
      xor rax,rax
      cld
      lodsb ; AL=Needle[0], keep RAX now!
      ;decide on needle length
      dec rbx
      jz .NeedleLenIs1
      dec rbx
      jz .NeedleLenIs2
      dec rbx
      jz .NeedleLenIs3
      dec rbx
      jz .NeedleLenIs4
      dec rbx
      jnz .NeedleLenIsLong
    ;.NeedleLenIs5:
      xchg rax,rbx
      lodsd ;AL=Needle[0]
      xchg rax,rbx ;RBX=bytes 1..4 of Needle (0-based)
    .ScanNeedleLenIs5:
      repne scasb
      jne .NotFound
      cmp [rdi],ebx
      jne .ScanNeedleLenIs5
      jmp .Found
    .NeedleLenIs4:
      dec rsi
      lodsd ;RAX=first 4 bytes of Needle
    .ScanNeedleLenIs4:
      repne scasb
      jne .NotFound
      cmp [rdi-1],eax
      jne .ScanNeedleLenIs4
      jmp .Found
    .NeedleLenIs1:
      repne scasb
      jne .NotFound
      jmp .Found
    .NeedleLenIs2:
      mov ah,[rsi]
    .ScanNeedleLenIs2:
      repne scasb
      jne .NotFound
      cmp [rdi],ah
      jne .ScanNeedleLenIs2
      jmp .Found
    .NeedleLenIs3:
      xchg rbx,rax
      lodsw
      xchg rbx,rax
    .ScanNeedleLenIs3:
      repne scasb
      jne .NotFound
      cmp [rdi],bx
      jne .ScanNeedleLenIs3
      jmp .Found
    .NeedleLenIsLong:
      ; get (needleSize-1)//4, (needleSize-1) mod 4
      dec rsi ;RSI=&(Needle[0])
      inc rbx ;RBX=NeedleSize-4
      lodsd ;RAX=first 4 bytes of Needle
      mov [lNeedleRemPtr4],rsi
      mov rdx,rbx
      shr rbx,2
      mov [lNeedleRemDwords],rbx
      and rdx,3
      mov [lNeedleRemTail],rdx
      xchg rbx,rdi ;RBX=save RDI buf ptr for scasb
      xchg rdx,rcx ;RDX=save RCX counter for scasb
    .ScanNeedleLenIsLong:
      xchg rdi,rbx ;load saved buf ptr
      xchg rcx,rdx ;load saved counter
    .ScanNeedleLenIsLongJustScan:
      repne scasb
      jne .NotFound
      ;check all 4 bytes
      cmp [rdi-1],eax
      jne .ScanNeedleLenIsLongJustScan
      ;check up to Needle's tail
      mov rbx,rdi
      mov rdx,rcx
      add rdi,3
      mov rsi,[lNeedleRemPtr4]
      mov rcx,[lNeedleRemDwords]
      test rcx,rcx
      jz .ScanNeedleLenIsLongTail
      repe cmpsd
      jne .ScanNeedleLenIsLong
    .ScanNeedleLenIsLongTail:
      mov rcx,[lNeedleRemTail]
      test rcx,rcx
      jz .ScanNeedleLenIsLongFound
      repe cmpsb
      jne .ScanNeedleLenIsLong
    .ScanNeedleLenIsLongFound:
      mov rdi,rbx
    ;FOUND!
    .Found:
      dec rdi
      mov rax,rdi
      sub rax,[hayStack]
    .popOut:
      popfq
      ret
    .NotFound:
      xor rax,rax
      not rax
      jmp .popOut
    endp

    ;@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

    proc InBufRev stdcall uses rbx rcx rdx rsi rdi, hayStack,Needle,hayStackSize,NeedleSize,StartOffsetOfLastByte
      local lNeedleRemDwords:QWORD ;(NeedleSize-4)>>2
      local lNeedleRemTail:QWORD ;Needle remainder byte count (NeedleSize-4) mod 4 -> (0..3)
      local lNeedleRemPtr4:QWORD ;&Needle[4]
      pushfq
      mov rbx,[NeedleSize]
      cmp rbx,0
      jle .NotFound
      mov rax,[hayStackSize]
      dec rax
      mov rcx,[StartOffsetOfLastByte]
      cmp rcx,-1
      cmovE rcx,rax
      cmp rax,rcx
      cmovL rcx,rax
      sub rcx,rbx
      mov rdi,rcx
      inc rcx ;repetitions=min(hayStackSize-1,StartOffsetOfLastByte)-NeedleSize+2
      jle .NotFound
      add rdi,[hayStack] ;rdi=&(hayStack[min(hayStackSize-1,StartOffsetOfLastByte)-NeedleSize+1])
      ;load Needle FirstByte
      mov rsi,[Needle]
      and rax,0
      cld
      lodsb ; AL=Needle[0], keep RAX now!
      ;decide on needle length
      dec rbx
      jz .NeedleLenIs1
      dec rbx
      jz .NeedleLenIs2
      dec rbx
      jz .NeedleLenIs3
      dec rbx
      jz .NeedleLenIs4
      dec rbx
      jnz .NeedleLenIsLong
    ;.NeedleLenIs5:
      xchg rax,rbx
      lodsd ;AL=Needle[0]
      xchg rax,rbx ;RBX=bytes 1..4 of Needle (0-based)
      std
    .ScanNeedleLenIs5:
      repne scasb
      jne .NotFound
      cmp [rdi+2],ebx
      jne .ScanNeedleLenIs5
      jmp .Found
    .NeedleLenIs1:
      std
      repne scasb
      jne .NotFound
      jmp .Found
    .NeedleLenIs2:
      std
      mov ah,[rsi] ;AH=Needle[1]
    .ScanNeedleLenIs2:
      repne scasb
      jne .NotFound
      cmp [rdi+2],ah
      jne .ScanNeedleLenIs2
      jmp .Found
    .NeedleLenIs3:
      xchg rbx,rax
      lodsw
      xchg rbx,rax
      std
    .ScanNeedleLenIs3:
      repne scasb
      jne .NotFound
      cmp [rdi+2],bx
      jne .ScanNeedleLenIs3
      jmp .Found
    .NeedleLenIs4:
      dec rsi
      lodsd ;RAX=first 4 bytes of Needle
      std
    .ScanNeedleLenIs4:
      repne scasb
      jne .NotFound
      cmp [rdi+1],eax
      jne .ScanNeedleLenIs4
      jmp .Found
    .NeedleLenIsLong:
      ; get (needleSize-1)//4, (needleSize-1) mod 4
      dec rsi ;RSI=&(Needle[0])
      inc rbx ;RBX=NeedleSize-4
      lodsd ;RAX=first 4 bytes of Needle
      mov [lNeedleRemPtr4],rsi
      mov rdx,rbx
      shr rbx,2
      mov [lNeedleRemDwords],rbx
      and rdx,3
      mov [lNeedleRemTail],rdx
      xchg rbx,rdi ;RBX=save RDI buf ptr for scasb
      xchg rdx,rcx ;RDX=save RCX counter for scasb
    .ScanNeedleLenIsLong:
      std
      xchg rdi,rbx ;load saved buf ptr
      xchg rcx,rdx ;load saved counter
    .ScanNeedleLenIsLongJustScan:
      repne scasb
      jne .NotFound
      ;check all 4 bytes
      cmp [rdi+1],eax
      jne .ScanNeedleLenIsLongJustScan
      ;check up to Needle's tail
      cld
      mov rbx,rdi
      mov rdx,rcx
      add rdi,5
      mov rsi,[lNeedleRemPtr4]
      mov rcx,[lNeedleRemDwords]
      test rcx,rcx
      jz .ScanNeedleLenIsLongTail
      repe cmpsd
      jne .ScanNeedleLenIsLong
    .ScanNeedleLenIsLongTail:
      mov rcx,[lNeedleRemTail]
      test rcx,rcx
      jz .ScanNeedleLenIsLongFound
      repe cmpsb
      jne .ScanNeedleLenIsLong
    .ScanNeedleLenIsLongFound:
      mov rdi,rbx
    ;FOUND!
    .Found:
      inc rdi
      mov rax,rdi
      sub rax,[hayStack]
    .popOut:
      popfq
      ret
    .NotFound:
      xor rax,rax
      not rax
      jmp .popOut
    endp

    data import
      library kernel32,'KERNEL32.DLL'
      import kernel32,ExitProcess,'ExitProcess'
    end data

Is there a way to test this code safely? E.g. inside some kind of GUI 'emulator' which displays the variable/register/flag contents.

Btw I've collected some resources here:
resources for learning assembly language - AutoHotkey Community
https://autohotkey.com/boards/viewtopic.php?f=17&t=57651
revolution 16 Oct 2018, 14:27
If you use pushfq/popfq then you don't need the uses clause since they essentially do the same thing; they both save and restore the registers.
If you want to watch it execute then you can use a debugger and step through each instruction one-by-one.
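For reference, a hand-written sketch of what each mechanism saves. It is not taken from the macro source: the label Demo is hypothetical, and the real `proc` prologue may also set up a frame. As far as the instruction set goes, `uses` pushes and pops the listed general-purpose registers, while pushfq/popfq save and restore only RFLAGS, which matters for the routines in this thread because they change the direction flag with std/cld.

```asm
use64

Demo:                           ; hypothetical label, for illustration only
        ; roughly what "uses rbx rsi" adds at entry: save the listed registers
        push    rbx
        push    rsi
        ; what the routine adds by hand: save RFLAGS, including the direction flag
        pushfq

        std                     ; the body is now free to change DF ...
        xor     ebx,ebx         ; ... and to clobber rbx and rsi
        xor     esi,esi

        popfq                   ; restores RFLAGS only (DF returns to the caller's value)
        ; roughly what "uses rbx rsi" adds before ret: restore in reverse order
        pop     rsi
        pop     rbx
        ret
```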
fasmnewbie 16 Oct 2018, 17:25
Not suggesting any solution here because I don't know what your code is doing. Just a few structural suggestions:
1. Lose the "stdcall" from those two PROCs. It causes the obvious syntax errors.
2. Your "start" should look like this:
Code:

    start:
      sub rsp,8 ;align stack. But it doesn't matter.
      fastcall InBuf,hayStack,Needle,19,5,0
      fastcall InBufRev,hayStack,Needle,19,5,-1
      invoke ExitProcess,0

Try these suggestions first and see how it goes.
fasmnewbie 16 Oct 2018, 17:28
revolution wrote:
If you use pushfq/popfq then you don't need the uses clause since they essentially do the same thing; they both save and restore the registers.
vafylec 16 Oct 2018, 18:26
- What the script does:
- When you compile the asm to an exe, you can extract two sections of machine code.
- That machine code can be used in AutoHotkey, and I suppose other programming languages, as machine code functions.
- InBuf(haystackAddr, needleAddr, haystackSize, needleSize, StartOffset:=0)
- InBufRev(haystackAddr, needleAddr, haystackSize, needleSize, StartOffsetOfLastNeedleByte:=-1)
- The functions search for binary data (including/excluding null bytes), and don't stop at null bytes.
- The functions have separate bits of code for 1/2/3/4/5/n-byte needles.
- Re. debuggers:
- I've only ever run ASM by compiling it and extracting the machine code, and running it in AutoHotkey. Code that I felt was safe enough to risk running. I don't know if there are added risks to running machine code directly, versus code from programming languages like AutoHotkey/C++/Java/Python, checked by an IDE. Does the exe just crash and you get an error message, or can it be worse?
- This mentions OllyDbg, is that the most common FASM debugger? And Fresh IDE is mentioned but not recommended.
flat assembler - Does it have a debugger?
https://board.flatassembler.net/topic.php?t=20184
- I was partly interested in finding something simple that I could recommend to people interested in learning ASM.
- Thanks revolution and fasmnewbie re. potential errors.
- In my reading, I didn't come across stack misalignment. The 5 pointers and RFLAGS are multiples of 8 bytes, so I might have thought they'd be safe.
- I'm used to padding so that items start at a multiple of their size (1/2/4/8)
- If the ASM can be made to work, maintaining both 'uses' and push/pop, even if one can be omitted, that would probably be best to begin with. And afterwards I might hope to simplify the code.
fasmnewbie 16 Oct 2018, 20:11
Quote:
If the ASM can be made to work, maintaining both 'uses' and push/pop, even if one can be omitted, that would probably be best to begin with. And afterwards I might hope to simplify the code

For FASTCALL, I think the stack is guaranteed to be aligned if you let FASM handle all your stack movement (including allocating the shadow space). That means it doesn't matter how many USES and LOCALS items you declare; the stack is guaranteed to be aligned. Things will get a bit out of control if you manually use PUSH/POP without balancing them back to a 16-byte boundary. Using PUSHFQ is such an example.

AFAIK, kernel32.dll services don't really observe the stack alignment, so your code might be safe at some point (but they still observe the shadow space). But not so when dealing with WinAPI such as user32 and similar stuff. Things will get ugly pretty quickly with segfaults if you don't observe the alignment.

Not a big fan of the high-level features of FASM, though. I favor working with plain instructions so I get to control every padding and alignment. Like this example below...

=== OUT-OF-TOPIC ===

I found that there's a small 'anomaly' when dealing with STRUCT inside LOCALS. For example, I can assign values to STRUCT members in a 32-bit PROC but can't do so in a 64-bit PROC. Example:
Code:

    format PE console
    include 'win32axp.inc'
    entry main

    section '.text' code readable executable

    proc main
      call IDCPU
      cinvoke getchar
      cinvoke exit,0
    endp

    proc IDCPU uses eax ebx ecx edx
      locals
        struct cpu
          fpu dd ?
          sse dd ?
          avx dd 7 ;<<-- acceptable in 32-bit, not in 64-bit
          brand rb 56
        ends
        myCPU cpu
        formt db '%s',0
        intel db 0ah,"FPU=%u SSE=%u AVX=%u",0ah,0
      endl
      mov eax,80000002h
      cpuid
      mov dword[myCPU.brand+ 0],eax
      mov dword[myCPU.brand+ 4],ebx
      mov dword[myCPU.brand+ 8],ecx
      mov dword[myCPU.brand+12],edx
      mov eax,80000003h
      cpuid
      mov dword[myCPU.brand+16],eax
      mov dword[myCPU.brand+20],ebx
      mov dword[myCPU.brand+24],ecx
      mov dword[myCPU.brand+28],edx
      mov eax,80000004h
      cpuid
      mov dword[myCPU.brand+32],eax
      mov dword[myCPU.brand+36],ebx
      mov dword[myCPU.brand+40],ecx
      mov dword[myCPU.brand+44],edx
      cinvoke printf,addr formt,addr myCPU.brand
      mov eax,1
      cpuid
      bt edx,0
      adc [myCPU.fpu],0
      bt edx,25
      adc [myCPU.sse],0
      bt ecx,28
      adc [myCPU.avx],0
      cinvoke printf,addr intel,[myCPU.fpu],[myCPU.sse],[myCPU.avx]
      ret
    endp

    section '.idata' import data readable
    library msvcrt,'msvcrt.dll'
    import msvcrt,\
      printf,'printf',\
      getchar,'getchar',\
      exit,'exit'

This is 32-bit code. If, for example, I assign a value to the STRUCT member "avx", then FASM will happily accept it. But if I change this code to 64-bit, it complains with an error. I don't know which one is supposed / not supposed to work. What's wrong?
revolution 17 Oct 2018, 00:43
fasmnewbie wrote:
vafylec 22 Oct 2018, 02:04
- I've been testing Fresh IDE, and it's great.
- I installed Fresh IDE, and I searched, but couldn't find a way to specify where to look for .inc files. Although it does find any files placed directly into 'C:\Program Files (x86)\Fresh'.
- Btw which other debuggers/debugging techniques do people use?
- Unfortunately, I couldn't get the 32-bit asm file to compile with Fresh IDE (although it did compile with FASM). (I know that the 32-bit version works, and I'm trying to produce a working 64-bit version.)
- I tried these 2 lines (and variants):
Code:

    proc InBuf stdcall uses ebx ecx edx esi edi, hayStack,Needle,hayStackSize,NeedleSize,StartOffset
    proc InBuf uses ebx ecx edx esi edi, hayStack,Needle,hayStackSize,NeedleSize,StartOffset

- But both gave the following error on Fresh IDE:
Code:

    Error: extra characters on line << if~defined interfaces.uses ebx ecx edx esi edi >> Noname1.asm [21]

- Thanks.
bitRAKE 04 Nov 2018, 16:24
https://github.com/x64dbg/x64dbg is a good debugger to watch the code.
Since this code is used by an external program, that interface is going to limit how the code is written/debugged. If AutoHotkey is expecting the Win64 ABI then you'll need to code to that, which means aligning the stack, creating the shadow space, etc.
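A minimal sketch of what aligning the stack and creating the shadow space looks like when done by hand, assuming the standard Win64 convention (first four integer arguments in RCX, RDX, R8, R9; further arguments on the stack above 32 bytes of shadow space). Sum5 is a hypothetical five-argument function used only for illustration; it is not part of the code in this thread.

```asm
format PE64 console
entry start

include 'win64a.inc'

section '.text' code readable executable

; Sum5 is hypothetical.
; in:  rcx, rdx, r8, r9 = arguments 1..4
;      [rsp+40]         = argument 5 (8 bytes return address + 32 bytes shadow space)
; out: rax = sum of the five arguments
Sum5:
        mov     rax,rcx
        add     rax,rdx
        add     rax,r8
        add     rax,r9
        add     rax,[rsp+40]
        ret

start:
        ; At entry rsp is 8 bytes off a 16-byte boundary (the return address).
        ; 56 bytes = 32 shadow space + 8 for argument 5 + 16 padding,
        ; which leaves rsp 16-byte aligned at every call below.
        sub     rsp,8*7

        mov     ecx,1
        mov     edx,2
        mov     r8d,3
        mov     r9d,4
        mov     qword [rsp+32],5 ; fifth argument sits just above the shadow space
        call    Sum5             ; rax = 1+2+3+4+5

        mov     ecx,eax          ; use the sum as the process exit code
        call    [ExitProcess]

section '.idata' import data readable writeable

        library kernel32,'KERNEL32.DLL'
        import  kernel32,ExitProcess,'ExitProcess'
```

InBuf and InBufRev take five arguments each, so under this convention their fifth argument (StartOffset) would arrive on the stack in the same way.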
vafylec 11 Apr 2019, 17:53
- I've been working on code to do file/console I/O, to help with testing.
- This code works in 32-bit, but I had trouble trying to convert it to 64-bit, if anyone can convert it.
Code:

    format PE GUI 4.0
    include 'win32a.inc'

    section '.text' code readable executable

    ;GENERIC_WRITE := 0x40000000
    ;OPEN_ALWAYS := 4
    invoke CreateFileA,_path,40000000h,0,0,4,0,0
    mov ebx,eax
    invoke WriteFile,ebx,_text,6,esp,0
    invoke CloseHandle,ebx
    invoke ExitProcess,0

    ;_path db 'C:\Users\me\Desktop\z test write 32-bit.txt',0 ;also works
    _path db 'z test write 32-bit.txt',0
    _text db 'abcdef',0

    data import
      library kernel32,'KERNEL32.DLL'
      import kernel32,\
        CreateFileA,'CreateFileA',\
        WriteFile,'WriteFile',\
        CloseHandle,'CloseHandle',\
        ExitProcess,'ExitProcess'
    end data

- Btw it would be greatly beneficial if someone could write a 64-bit version of cmd.zip ('Command line parameters'), here. Thanks.
flat assembler
https://flatassembler.net/examples.php
Ali.Z 11 Apr 2019, 19:57
read x86 calling conventions
read fasm doc
Code:

    format PE64 GUI 4.0
    include 'win64a.inc'

    section '.text' code readable executable

    frame
    invoke CreateFile,_path,GENERIC_WRITE,0,0,OPEN_ALWAYS,0,0
    mov rbx,rax
    invoke WriteFile,rbx,_text,6,rsp,0
    invoke CloseHandle,rbx
    endf
    invoke ExitProcess,0

    _path db 'z test write 64-bit.txt',0
    _text db 'abcdef',0

    data import
      library kernel32,'KERNEL32.DLL'
      import kernel32,\
        CreateFile,'CreateFileA',\
        WriteFile,'WriteFile',\
        CloseHandle,'CloseHandle',\
        ExitProcess,'ExitProcess'
    end data

_________________
Asm For Wise Humans
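A variant sketch of the same program, under a few assumptions that are not in the post above: an explicit `entry start`, a `sub rsp,8` at entry to align the stack as in the earlier replies, a separate writeable data section, and a dedicated `_written` variable (a hypothetical name) to receive the byte count instead of passing rsp. It is meant as an illustration of the layout, not as the definitive conversion.

```asm
format PE64 GUI 4.0
entry start

include 'win64a.inc'

section '.data' data readable writeable

  _path    db 'z test write 64-bit.txt',0
  _text    db 'abcdef',0
  _written dd ?                 ; hypothetical variable: receives the number of bytes written

section '.text' code readable executable

start:
        sub     rsp,8           ; rsp is 8 off a 16-byte boundary at entry; realign it
        invoke  CreateFileA,_path,GENERIC_WRITE,0,0,OPEN_ALWAYS,0,0
        cmp     rax,-1          ; -1 is INVALID_HANDLE_VALUE
        je      .done
        mov     rbx,rax         ; rbx is preserved across API calls
        invoke  WriteFile,rbx,_text,6,_written,0
        invoke  CloseHandle,rbx
  .done:
        invoke  ExitProcess,0

section '.idata' import data readable writeable

        library kernel32,'KERNEL32.DLL'
        import  kernel32,\
                CreateFileA,'CreateFileA',\
                WriteFile,'WriteFile',\
                CloseHandle,'CloseHandle',\
                ExitProcess,'ExitProcess'
```

Keeping the count in its own variable means it can be inspected in a debugger after WriteFile returns.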
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.