flat assembler
Message board for the users of flat assembler.
LineOfCodes Challenge
fasmnewbie 01 Oct 2017, 21:26
I am not a very good user of FASM's high-level features, but today I pushed myself to use them to convert my dumpreg routine to 64-bit Windows. Here's the code:
Code:
    ;--------------------------------------------
    ; Compile : fasm this.asm
    ; link    : gcc -m64 this.obj -o this.exe
    ;--------------------------------------------
    format MS64 coff
    include 'd:\fasmw\include\win64axp.inc'
    public main
    extrn printf

    section '.text' executable
    main:
        mov rbp,-34h ;verifier
        call dumpreg
        call dumpreg
        ret
    ;38 lines from head to toe
    proc dumpreg uses rax rcx rdx r8 r9 r10 r11 r12
        locals
            _rax db 'RAX|%p RBX|%p RCX|%p',0ah,0,0,0
            _rdx db 'RDX|%p RDI|%p RSI|%p',0ah,0,0,0
            _r8 db 'R8 |%p R9 |%p R10|%p',0ah,0,0,0
            _r11 db 'R11|%p R12|%p R13|%p',0ah,0,0,0
            _r14 db 'R14|%p R15|%p RBP|%p',0ah,0,0,0
            _rsp db 'RSP|%p RIP|%p',0ah,0,0
            regs rq 8
        endl
        mov [regs   ],rax
        mov [regs+ 8],rcx
        mov [regs+16],rdx
        mov [regs+24],r8
        mov [regs+32],r9
        mov [regs+40],r10
        mov [regs+48],r11
        mov [regs+56],r12
        mov r12,rsp
        sub rsp,512
        and rsp,-16
        fxsave [rsp]
        sub rsp,32
        fastcall printf,addr _rax,[regs],rbx,[regs+8]
        fastcall printf,addr _rdx,[regs+16],rdi,rsi
        fastcall printf,addr _r8,[regs+24],[regs+32],[regs+40]
        fastcall printf,addr _r11,[regs+48],[regs+56],r13
        fastcall printf,addr _r14,r14,r15,qword[rbp]
        mov r8,[rbp+8]
        sub r8,5
        lea rdx,[rbp+16]
        lea rcx,[_rsp]
        call printf
        add rsp,32
        fxrstor [rsp]
        mov rsp,r12
        ret
    endp

I think this is the shortest DUMPREG code in the world, and it showcases some of FASMW's high-level features. The challenge now is to cut the line count further:
1. Without breaking anything.
2. Still visually good (readable, etc.).
3. FXSAVE/FXRSTOR are to stay.
See if you can do something about it. The code is ready to compile as is; you only need to adjust the include path to your own win64axp.inc.

EDIT: I forgot to include the output, just in case we need to catch anything off balance.
Code:
    RAX|00007FF8B9DC3CF8 RBX|0000000000000001 RCX|0000000000000001
    RDX|0000000000171360 RDI|0000000000171330 RSI|0000000000000011
    R8 |00000000001742B0 R9 |0000000000171360 R10|0000000000000000
    R11|0000000000000246 R12|0000000000000001 R13|0000000000000008
    R14|0000000000000000 R15|0000000000000000 RBP|FFFFFFFFFFFFFFCC
    RSP|000000000060FE58 RIP|00000000004015B7
    RAX|00007FF8B9DC3CF8 RBX|0000000000000001 RCX|0000000000000001
    RDX|0000000000171360 RDI|0000000000171330 RSI|0000000000000011
    R8 |00000000001742B0 R9 |0000000000171360 R10|0000000000000000
    R11|0000000000000246 R12|0000000000000001 R13|0000000000000008
    R14|0000000000000000 R15|0000000000000000 RBP|FFFFFFFFFFFFFFCC
    RSP|000000000060FE58 RIP|00000000004015BC
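A note on the RIP column: dumpreg loads the return address from [rbp+8] and subtracts 5, which is the length of a direct `call rel32` (opcode E8 plus a 4-byte displacement), so the value shown is the address of the `call dumpreg` instruction itself. That is why two back-to-back calls report RIP values exactly 5 apart. A minimal C sketch of that arithmetic follows; `call_site` and `CALL_REL32_LEN` are illustrative names, not part of the original code, and the result is only meaningful when the caller used a direct near call (an indirect `call [mem]` has a different length).

```c
#include <assert.h>
#include <stdint.h>

/* Length in bytes of a direct near call: opcode E8 + 32-bit displacement. */
enum { CALL_REL32_LEN = 5 };

/*
 * Hypothetical helper: given the return address found on the stack,
 * return the address of the call instruction that pushed it.
 * Assumes the caller used a direct `call rel32`.
 */
uint64_t call_site(uint64_t return_address)
{
    assert(return_address >= CALL_REL32_LEN);
    return return_address - CALL_REL32_LEN;
}
```

For the first dump in the output, a return address of 0x4015BC gives 0x4015B7; the second call sits 5 bytes later and therefore reports 0x4015BC.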
fasmnewbie 02 Oct 2017, 19:01
Lines are down some more: New record = 31 L.O.C
I changed the display grid to 4 x 4. Two disadvantages, though:
1. RIP is now orphaned.
2. In Console and the PowerShell console the display is nice, but in the normal cmd prompt it is a bit off.
Advantages:
1. The volatile argument registers (RCX, RDX, R8, R9) are grouped in the same place for quick lookup.
2. RSP is displayed at the outer edge, for quick confirmation of the stack.
Code:
    ;compile and run in FASMW
    format PE64 console
    include 'win64axp.inc'

        call dumpreg
        call dumpreg
        call [getchar]
        mov rcx,0
        call [exit]
    ;31 lines inclusive of PROC/ENDP
    proc dumpreg uses rax rcx rdx r8 r9 r10 r11 r12
        locals
            _rax db 'RAX|%p RBX|%p RCX|%p RDX|%p',0ah,0,0,0,0
            _rdi db 'RDI|%p RSI|%p R8 |%p R9 |%p',0ah,0,0,0,0
            _r10 db 'R10|%p R11|%p R12|%p R13|%p',0ah,0,0,0,0
            _r14 db 'R14|%p R15|%p RBP|%p RSP|%p',0ah,0,0,0,0
            _rip db 'RIP|%p',0ah,0
            regs rq 8
        endl
        mov [regs   ],rax
        mov [regs+ 8],rcx
        mov [regs+16],rdx
        mov [regs+24],r8
        mov [regs+32],r9
        mov [regs+40],r10
        mov [regs+48],r11
        mov [regs+56],r12
        mov r12,rsp
        sub r12,512
        and r12,-16
        fxsave [r12]
        fastcall [printf],addr _rax,[regs],rbx,[regs+8],[regs+16]
        fastcall [printf],addr _rdi,rdi,rsi,[regs+24],[regs+32]
        fastcall [printf],addr _r10,[regs+40],[regs+48],[regs+56],r13
        fastcall [printf],addr _r14,r14,r15,qword[rbp],addr rbp+16
        mov rdx,[rbp+8]
        sub rdx,5
        fastcall [printf],addr _rip
        fxrstor [r12]
        ret
    endp

    data import
        library msvcrt,'msvcrt.dll'
        import msvcrt,\
               printf,'printf',\
               getchar,'getchar',\
               exit,'exit'
    end data

Output
Code:
    RAX|0000000000401000 RBX|0000000000000000 RCX|0000000000284000 RDX|0000000000401000
    RDI|0000000000000000 RSI|0000000000000000 R8 |0000000000284000 R9 |0000000000401000
    R10|0000000000000000 R11|0000000000000000 R12|0000000000000000 R13|0000000000000000
    R14|0000000000000000 R15|0000000000000000 RBP|0000000000000000 RSP|000000000008FF58
    RIP|0000000000401000
    RAX|0000000000401000 RBX|0000000000000000 RCX|0000000000284000 RDX|0000000000401000
    RDI|0000000000000000 RSI|0000000000000000 R8 |0000000000284000 R9 |0000000000401000
    R10|0000000000000000 R11|0000000000000000 R12|0000000000000000 R13|0000000000000000
    R14|0000000000000000 R15|0000000000000000 RBP|0000000000000000 RSP|000000000008FF58
    RIP|0000000000401005
Furs 02 Oct 2017, 20:27
Is there a reason you use multiple printf calls instead of just one with the entire thing? You could also just push the registers on the stack and simply address them from rsp (even the old rsp value could be pushed this way, but push it first, before it changes). If you push them on the stack, almost all of them (except the first 4) will also be parameters directly, so you can just call the function manually without any macro.
But I don't know if it fits your requirements, since these seem obvious. Also, do you use data below the stack pointer (fxsave) before a call to printf? That will screw it up, since the call itself and printf's own stack frame will overwrite whatever lies below rsp.
fasmnewbie 02 Oct 2017, 22:11
I think "fastcall" corrupts my FXSAVE area in the latest program (31 LOC). It works well in the first two programs. Probably it is because I use more than 4 arguments for printf. I am not sure how far fastcall pushes the stack down to accommodate the 5th argument and beyond. I guess that explains my refusal to use a single printf for the entire display.
I can't move the FXSAVE area into the LOCALS either, because I want this code to work in both aligned and unaligned stack environments. So I guess the score is back to 35 LOC then. If you can cut it down any further, do share your code. Just use the 2nd dumpreg output to track any unwanted changes to the registers (except, of course, for RIP).
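For readers following along: FXSAVE needs a 512-byte memory operand aligned to 16 bytes, which is why the code carves the area out with `sub rsp,512` followed by `and rsp,-16` instead of relying on the alignment of the caller's frame. A minimal C sketch of that arithmetic is below; `fxsave_area` is an illustrative name and the sample addresses in the note are only examples.

```c
#include <assert.h>
#include <stdint.h>

enum { FXSAVE_SIZE = 512, FXSAVE_ALIGN = 16 };

/*
 * Hypothetical helper mirroring `sub rsp,512` / `and rsp,-16`:
 * returns the base of a 512-byte, 16-byte-aligned area located
 * entirely below the incoming stack pointer value.
 */
uint64_t fxsave_area(uint64_t rsp)
{
    uint64_t base = (rsp - FXSAVE_SIZE) & ~(uint64_t)(FXSAVE_ALIGN - 1);

    assert(base % FXSAVE_ALIGN == 0);    /* aligned for FXSAVE        */
    assert(base + FXSAVE_SIZE <= rsp);   /* area fits below old rsp   */
    return base;
}
```

Whether the incoming value is 0x8FF58 (8 mod 16) or 0x8FF50 (already aligned), the result is 0x8FD50. The area is only safe across calls if rsp itself is moved down to that base, so that later pushes land below it.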
fasmnewbie 02 Oct 2017, 22:42
I revised the last code, but unfortunately I had to add one more LOC. So I guess the record is now back at 32 LOC. If you know any way to improve it, take this code and share your modifications here.
Code:
    ;compile and run in FASMW
    format PE64 console
    include 'win64axp.inc'
    entry main

    section '.text' code readable executable
    main:
        call dumpreg
        call dumpreg
        call [getchar]
        mov rcx,0
        call [exit]
    ;32 lines including PROC/ENDP
    proc dumpreg uses rax rcx rdx r8 r9 r10 r11 r12
        locals
            _rax db 'RAX|%p RBX|%p RCX|%p RDX|%p',0ah,0,0,0,0
            _rdi db 'RDI|%p RSI|%p R8 |%p R9 |%p',0ah,0,0,0,0
            _r10 db 'R10|%p R11|%p R12|%p R13|%p',0ah,0,0,0,0
            _r14 db 'R14|%p R15|%p RBP|%p RSP|%p',0ah,0,0,0,0
            _rip db 'RIP|%p',0ah,0
            regs rq 8
        endl
        mov [regs   ],rax
        mov [regs+ 8],rcx
        mov [regs+16],rdx
        mov [regs+24],r8
        mov [regs+32],r9
        mov [regs+40],r10
        mov [regs+48],r11
        mov [regs+56],r12
        mov r12,rsp
        sub rsp,512
        and rsp,-16
        fxsave [rsp]
        fastcall [printf],addr _rax,[regs],rbx,[regs+8],[regs+16]
        fastcall [printf],addr _rdi,rdi,rsi,[regs+24],[regs+32]
        fastcall [printf],addr _r10,[regs+40],[regs+48],[regs+56],r13
        fastcall [printf],addr _r14,r14,r15,qword[rbp],addr rbp+16
        mov rdx,[rbp+8]
        sub rdx,5
        fastcall [printf],addr _rip
        fxrstor [rsp]
        mov rsp,r12
        ret
    endp

    section '.idata' import data readable
        library msvcrt,'msvcrt.dll'
        import msvcrt,\
               printf,'printf',\
               getchar,'getchar',\
               exit,'exit'
fasmnewbie 02 Oct 2017, 23:04
If you need verification at each step of the above code, you can use BASE6.DLL from my BASELIB and call any routine that you think may help. For example, DUMPXMM from base6.dll can be used to verify whether the FXSAVE area is corrupt or not.
Code:
    ;-----------------------------------
    ;compile and run in FASMW
    ;use BASE6.DLL for verification work
    ;-----------------------------------
    format PE64 console
    include 'win64axp.inc'
    entry main

    section '.text' code readable executable
    main:
        ;sub rsp,40 ;if aligned stack
        mov rax,109.234
        movq xmm0,rax
        mov rax,9
        call [dumpxmm]
        call dumpregs
        call dumpregs ;make sure nothing is corrupt
                      ;except RIP
        mov rax,9 ;to verify FXSAVE/FXRSTOR
        call [dumpxmm]
        call [getchar]
        mov rcx,0
        call [exit]
    ;32 lines including PROC/ENDP
    proc dumpregs uses rax rcx rdx r8 r9 r10 r11 r12
        locals
            _rax db 'RAX|%p RBX|%p RCX|%p RDX|%p',0ah,0,0,0,0
            _rdi db 'RDI|%p RSI|%p R8 |%p R9 |%p',0ah,0,0,0,0
            _r10 db 'R10|%p R11|%p R12|%p R13|%p',0ah,0,0,0,0
            _r14 db 'R14|%p R15|%p RBP|%p RSP|%p',0ah,0,0,0,0
            _rip db 'RIP|%p',0ah,0
            regs rq 8
        endl
        mov [regs   ],rax
        mov [regs+ 8],rcx
        mov [regs+16],rdx
        mov [regs+24],r8
        mov [regs+32],r9
        mov [regs+40],r10
        mov [regs+48],r11
        mov [regs+56],r12
        mov r12,rsp
        sub rsp,512
        and rsp,-16
        fxsave [rsp]
        fastcall [printf],addr _rax,[regs],rbx,[regs+8],[regs+16]
        fastcall [printf],addr _rdi,rdi,rsi,[regs+24],[regs+32]
        fastcall [printf],addr _r10,[regs+40],[regs+48],[regs+56],r13
        fastcall [printf],addr _r14,r14,r15,qword[rbp],addr rbp+16
        mov rdx,[rbp+8]
        sub rdx,5
        fastcall [printf],addr _rip
        fxrstor [rsp]
        mov rsp,r12
        ret
    endp

    section '.idata' import data readable
        library msvcrt,'msvcrt.dll',\
                base6,'base6.dll'
        import base6,\
               dumpxmm,'dumpxmm'
        import msvcrt,\
               printf,'printf',\
               getchar,'getchar',\
               exit,'exit'

The output after inserting DUMPXMM from BASELIB:
Code:
    XMM0 : 0.0|109.234
    XMM1 : 0.0|0.0
    XMM2 : 0.0|0.0
    XMM3 : 0.0|0.0
    XMM4 : 0.0|0.0
    XMM5 : 0.0|0.0
    XMM6 : 0.0|0.0
    XMM7 : 0.0|0.0
    XMM8 : 0.0|0.0
    XMM9 : 0.0|0.0
    XMM10: 0.0|0.0
    XMM11: 0.0|0.0
    XMM12: 0.0|0.0
    XMM13: 0.0|0.0
    XMM14: 0.0|0.0
    XMM15: 0.0|0.0
    RAX|0000000000000009 RBX|0000000000000000 RCX|0000000000368000 RDX|0000000000401000
    RDI|0000000000000000 RSI|0000000000000000 R8 |0000000000368000 R9 |0000000000401000
    R10|0000000000000000 R11|0000000000000000 R12|0000000000000000 R13|0000000000000000
    R14|0000000000000000 R15|0000000000000000 RBP|0000000000000000 RSP|000000000008FF58
    RIP|000000000040101C
    RAX|0000000000000009 RBX|0000000000000000 RCX|0000000000368000 RDX|0000000000401000
    RDI|0000000000000000 RSI|0000000000000000 R8 |0000000000368000 R9 |0000000000401000
    R10|0000000000000000 R11|0000000000000000 R12|0000000000000000 R13|0000000000000000
    R14|0000000000000000 R15|0000000000000000 RBP|0000000000000000 RSP|000000000008FF58
    RIP|0000000000401021
    XMM0 : 0.0|109.234
    XMM1 : 0.0|0.0
    XMM2 : 0.0|0.0
    XMM3 : 0.0|0.0
    XMM4 : 0.0|0.0
    XMM5 : 0.0|0.0
    XMM6 : 0.0|0.0
    XMM7 : 0.0|0.0
    XMM8 : 0.0|0.0
    XMM9 : 0.0|0.0
    XMM10: 0.0|0.0
    XMM11: 0.0|0.0
    XMM12: 0.0|0.0
    XMM13: 0.0|0.0
    XMM14: 0.0|0.0
    XMM15: 0.0|0.0

Note that XMM0 is not corrupted: it is well preserved by FXSAVE/FXRSTOR even after many uses of printf, which destroys XMM0 to XMM5.
Furs 03 Oct 2017, 11:04
Well, before, you used r12 to point below the stack pointer without adjusting rsp, so the calls messed up the saved data. Now you adjust rsp, so it's fine; that's what I meant.
Anyway, I probably wouldn't be using macros if I were to code in asm; I prefer a more hands-on approach to asm (otherwise I'd be using an HLL)... I'll see if I can come up with something later.

EDIT: OK, here's a quickie. I only briefly tested it to see if it prints values that make sense; I did not check whether the values are actually "real" (I'd have to debug it), so it could be wrong, but you get the idea, I hope. No macros, but it does use the FASM quirk of pushing more than one operand on the same line. Even though I dislike it, I guess I had to use it here just to make it shorter in terms of lines: 17 lines if you exclude the format string (now it's in the data section instead of on the stack).
Code:
    format PE64 console
    include 'win64axp.inc'

        call dumpreg
        call dumpreg
        call [getchar]
        xor ecx, ecx
        call [exit]

    ; 17 lines including label (+6 lines if you include the printf format which is in data section)
    dumpreg:
        push rax rcx rdx r8 r9 r10 r11 ; save volatile regs
        enter 512,0
        and rsp, -16
        fxsave [rsp]
        push qword [rbp+8*8] rbp qword [rbp] r15 r14 r13 r12 r11 r10 r9 r8 rsi rdi rdx rax rax rax rax
        sub qword [rsp+17*8], 5 ; adjust original RIP
        add qword [rsp+16*8], 9*8 ; adjust original RSP
        mov r9, rcx
        lea rcx, [dumpreg_fmt]
        mov rdx, rax
        mov r8, rbx
        call qword [printf]
        fxrstor [rsp+18*8]
        leave
        pop r11 r10 r9 r8 rdx rcx rax
        ret

    data import
        library msvcrt,'msvcrt.dll'
        import msvcrt,\
               printf,'printf',\
               getchar,'getchar',\
               exit,'exit'
        dumpreg_fmt db \
            'RAX|%p RBX|%p RCX|%p RDX|%p',10,\
            'RDI|%p RSI|%p R8 |%p R9 |%p',10,\
            'R10|%p R11|%p R12|%p R13|%p',10,\
            'R14|%p R15|%p RBP|%p RSP|%p',10,\
            'RIP|%p',10,0
    end data

If you want, you can obviously compact it further... but anyway.
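The `8*8` and `9*8` offsets in this version follow from the frame layout: after the seven pushes and `enter`, [rbp] holds the saved rbp, the seven saved registers sit in the slots above it, and the return address is one slot higher still, so the caller's RSP before the call is one more slot up. A small C sketch of that bookkeeping, with illustrative names that do not appear in the original code:

```c
#include <assert.h>
#include <stdint.h>

enum {
    SLOT = 8,            /* size of one stack slot in 64-bit mode   */
    SAVED_VOLATILE = 7   /* rax rcx rdx r8 r9 r10 r11               */
};

/* Offset from rbp to the return address: one slot for the saved rbp
   plus one slot per saved register. */
uint64_t return_address_offset(void)
{
    return SLOT * (1 + SAVED_VOLATILE);
}

/* Caller's RSP just before the call: the slot above the return address. */
uint64_t caller_rsp(uint64_t rbp)
{
    assert(rbp % SLOT == 0);   /* rbp was set from rsp after 8-byte pushes */
    return rbp + return_address_offset() + SLOT;
}
```

With these definitions the return address offset is 64 (8*8), and a frame pointer of 0x8FF10 would correspond to a caller RSP of 0x8FF58 (rbp + 9*8).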
fasmnewbie 03 Oct 2017, 14:13
That's good code, although I was actually asking for a high-level approach to dumpreg, where everything is done in-house to facilitate modularity. I guess I'll have to make a new category in the competition: shortest LOC for the low-level category?
If you put your data outside, then you can further reduce the lines by transferring those FXSAVE outside the dumpreg body. Cut you 3 or so more LOC.
    section '.data'...
    align 16
    fxdata rb 512
Try it and let's see how it works.
fasmnewbie 03 Oct 2017, 14:35
Hi peeps, this is getting more interesting. Now we have 2 categories:
1. Shortest LOC for high-level features. Current record: 32 LOC (me).
2. Shortest LOC for low-level assembly. Current record: 17 LOC, about to be 14 LOC (Furs).
Why don't you people join in? Pick one category and let the coding gymnastics begin. It will be a fun learning experience for beginners too, to see how stack programming works and how it can be abused like hell! The more the merrier.
Furs 03 Oct 2017, 14:50
fasmnewbie wrote:
    If you put your data outside, then you can further reduce the lines by transferring those FXSAVE outside the dumpreg body. Cut you 3 or so more LOC.
(The reason is that such data may end up being written by two threads at once, while the stack is unique to each thread. The format string doesn't change, though, so there are no issues with two threads accessing it at once.)
fasmnewbie 03 Oct 2017, 14:58
Furs wrote:
Based on my observation, you can get rid of:
1. enter
2. and rsp,-16
3. leave
So you now have 14 LOC. Try it. You're a record holder now. Someone else might break it by morning, who knows. Lots of evil talents here on this board. Try to protect it.
revolution 04 Oct 2017, 00:00
Don't forget about vprintf. No need to push the registers twice.
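In C terms, `vprintf` and its relatives take the format string plus a single `va_list` instead of individual arguments. The suggestion presumably relies on the Win64 convention, where a `va_list` is a plain pointer to consecutive 8-byte argument slots, so a block of registers already pushed on the stack could be handed over as is; that is an ABI detail assumed here, not something portable C guarantees. The sketch below only shows the portable side of the API, using `vsnprintf` so the result can be inspected. `format_regs` is a hypothetical helper, and `%016llX` stands in for `%p`, whose output format is implementation-defined.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats any number of register values with a
 * single call by forwarding the variadic arguments as one va_list.
 * Returns the number of characters vsnprintf reports.
 */
int format_regs(char *out, size_t size, const char *fmt, ...)
{
    va_list args;
    int written;

    assert(out != NULL && size > 0);
    va_start(args, fmt);
    written = vsnprintf(out, size, fmt, args);
    va_end(args);
    return written;
}
```

The point for the challenge is that the callee receives one pointer-like argument covering all the values, rather than each value being passed separately.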
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.