win64 64 bit source samples, executables
Feryno 04 Aug 2005, 05:04
Last night I finished drivers for win64 with FASM.
What differs from win32: the API is called through a qword pointer located in the .rdata section, and rva ImportAddress and rva ImportLookup are exchanged in the INIT section.
But maybe on win32 the call must go through .rdata as well, because IDA reports that the IMPORT section seems to be destroyed in r0pc.sys for win32 (posted somewhere on this forum), which calls the API through a dword in the INIT section.
Thanks to Tomasz Grysztar for help and motivation; his work is big, great and hard, and it was a big motivation for me to finish drivers in FASM for win64.
Thanks for biew.exe, which helped me find out how to make the IMPORT sections correct.
See the history of my findings in the attached file. The first working driver, beeper64.asm, is compiled with the Microsoft compiler. The next two, produced by FASM (beeper64_2.asm and beeper64_3.asm), have a bad import section. From beeper64_4.asm on, the sections are correct.
format PE64 native 5.02 at 10000h
section '.text' code readable executable notpageable
; rcx=pDriverObject rdx=pDriverPath
xor eax,eax ; success exit code
section '.rdata' readable notpageable
imp_HalMakeBeep dq rva szHalmakebeep
section 'INIT' data import readable notpageable
dd rva ImportAddress; dd rva ImportLookup
dd rva szHal_dll
dd rva ImportLookup; dd rva ImportAddress
times 5 dd 0
dq rva szHalmakebeep
szHalmakebeep dw 0
szHal_dll db 'HAL.dll',0
Last edited by Feryno on 10 Aug 2005, 09:11; edited 2 times in total
Tomasz Grysztar 09 Aug 2005, 21:58
The package of Win64 driver examples is now in the official examples section.
Feryno 10 Aug 2005, 08:59
Please download the drivers from the official FASM examples section. The official version includes more useful stuff, and the included utilities have every absolute addressing (my vice from the old win32 style of coding) replaced with correct relative addressing.
Driver examples include:
- Driver without API calls, only writes to ports - makes a beep
- Driver with an API call, makes a beep
- Driver for reading and writing ports and executing ring0 protected instructions in ring3 user mode programs - an analogy of r0pc.sys posted somewhere on this forum. This driver supports stop, because it has routines implemented for this (you needn't restart Windows for repeated use, as you must with both beepers)
- stop_drv.exe (note: neither beeper driver can be stopped, the beepers don't have a stop routine)
- write_device.exe - sample of how to communicate with the a05.sys driver from a user mode program
- *.bat files showing how to use the drivers and utilities
Feryno 26 Aug 2005, 05:02
I finished a skeleton for a debugger (dbg01.exe in the attachment). It has no interface and no window, no interaction is possible, and it displays nothing... it is only a simple process that debugs another program. It puts one breakpoint at the start offset of the debugged exe and, after processing it, lets the exe run and terminate. Nothing great, a skeleton only... a few comments, and not a really good method for setting the startup breakpoint (I have a correct idea, described in the source, but it isn't finished yet: the address must be calculated, not every exe has it at 401000h...)
Jeremy Gordon has a great idea to port GoBug to win64.
Microsoft WinDbg 64-bit is very good and sufficient for every asm programmer.
Is anybody here interested in a win64 debugger, or willing to cooperate on its development?
Last edited by Feryno on 30 Aug 2005, 04:39; edited 1 time in total
Feryno 29 Aug 2005, 05:07
Prog.exe in the above debug.zip has a bug: a MOVDQA instruction to unaligned memory.
The interesting thing is:
1. When you boot Windows and load prog.exe with the small debugger included in debug.zip, the debugger fails (it doesn't have exception handling implemented yet). If you load it with dbg02.exe, included in the new dbg_2005_08_28.zip, you are informed about the exception code caused by prog.exe at memory address xxxx, where MOVDQA executes with unaligned memory. Prog.exe can't continue normally. If you retry debugging, you are informed about the exception every time.
2. If you simply run prog.exe from debug.zip (or prog_cause_exception.exe from dbg_2005_08_28.zip) outside a debugger, Windows runs it normally, everything is silent, and you aren't informed about any exception. It looks like Windows patches this instruction with MOVDQU (move to unaligned memory), because the MessageBox following the MOVDQA shows the correct content of the xmm register, which is transferred to the MessageBox text correctly by MOVDQA/MOVDQU (xmm holds an ASCII value, not binary or floating point).
3. After you have successfully run the exe in Windows outside a debugger, if you launch it inside the debugger (as a debuggee), the debugger never informs you about this exception in this exe until you reboot Windows. The program running as the debuggee continues correctly after the MOVAPS, and the MessageBox shows the correct content of the xmm register as well.
If you want to replay this, you needn't reboot Windows; simply copy or rename the exe with the exception to another filename.
Well, development of the debugger in asm continues. Here is a new version; first interaction is possible by the lazy method, single stepping. You have only 2 possibilities: single step or let the program run. Use dbg02_interactive.exe for this. The debugger shows the registers after every step. It has no disassembler, so you can't see anything of the instructions; only RIP (instruction pointer) shows where you are in the debuggee (or in the kernel if you step into an API). This debugger informs you about exceptions but doesn't handle them (not implemented yet). Very, very few things are implemented here; it's only an experimental debugger, something like a game for small children...
Last edited by Feryno on 01 Sep 2005, 11:47; edited 1 time in total
Feryno 01 Sep 2005, 11:42
Example from windbg64:
ModLoad: 000007ff`7d830000 000007ff`7d85c000 C:\WINDOWS\System32\CSCDLL.dll
(e3c.e40): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
*** ERROR: Symbol file could not be found. Defaulted to export symbols for C:\WINDOWS\system32\COMDLG32.DLL -
000007ff`7d36358f 660f7f442420 movdqa oword ptr [rsp+0x20],xmm0 ss:00000000`0006e578=fffffadfd92fcef00000000000050003
You can see the API attempting movdqa to [RSP+20h], whose effective address is 6E578h (not 16-byte aligned).
You cannot see this outside a debugger. If you simply run the program, you aren't warned about the exception; the program silently runs correctly.
This may be somewhere in the MSDN documentation, but I found it through my own mistakes... (I'm too lazy to study)
SEH in win64:
You can avoid API stack misalignment after reading the following:
Misalignment doesn't cause a program error, because the win64 exception handler patches movdqa with movdqu; you aren't warned about the exception.
The problem appears only when debugging, because the debugger sets its own SEH, which doesn't patch movaps with movups.
Jeremy Gordon reported win64's sensitivity to 16-byte alignment to me; somewhere he must use
to align the stack before an API call. I didn't notice this sensitivity myself in about 5 months of work with win64 assembler. But I know now that my trouble with GetCursorPos and CreateProcess was caused by misalignment.
The win64 API sensitivity to 16-byte stack alignment has been here from the win64 developer versions up to today. It looks like Microsoft added a patch routine to exception handling that replaces movaps with movups, which doesn't cause an exception on stack misalignment.
You can catch exceptions raised by an API only if the API runs in a debugger the first time. If you first run the executable with the API outside a debugger, you won't find the exceptions with a debugger until Windows is rebooted. Once the API is patched (the only way is to run the exe with the API outside a debugger), the exception doesn't occur in the debugger until reboot.
In conclusion, stack misalignment is transparent for every end user. But a programmer using a debugger may encounter it.
We can avoid this with sub rsp,8 at the start and by subtracting an odd multiple of 8 in every subroutine prolog.
At executable startup, rsp is something like 6FF78h.
At start, the stack is off 16-byte alignment by 8 bytes. We must correct the stack alignment; at the time of a CALL the stack must be 16-byte aligned.
Subroutines look similar: the CALL pushes a qword return address onto a 16-byte aligned stack, so on entry the stack is off 16-byte alignment by 8 bytes.
Note that only some APIs use xmm with movdqa
and only some APIs destroy [RSP+0],[RSP+8],[RSP+16],[RSP+24]
but we must always think of this.
Correct examples of API use:
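The original examples are not preserved in this copy of the thread. As a minimal sketch of the rule described above (not Feryno's own listing), a FASM program following it might look like this. The labels show_message, _caption and _message are made up for illustration, and the import table is written by hand in the style of the standard FASM PE64 demo:

```asm
format PE64 GUI
entry start

section '.text' code readable executable

  start:                        ; on entry rsp = 16n+8 (e.g. 6FF78h)
        sub     rsp,8*5         ; 8 bytes to realign + 32 bytes of parameter space
        call    show_message    ; rsp is 16-byte aligned at this CALL
        xor     ecx,ecx         ; exit code 0
        call    [ExitProcess]

  show_message:                 ; return address pushed, so rsp = 16n+8 again
        sub     rsp,8*5         ; odd multiple of 8: realign + parameter space
        xor     r9d,r9d         ; uType = MB_OK
        lea     r8,[_caption]
        lea     rdx,[_message]
        xor     ecx,ecx         ; hWnd = 0
        call    [MessageBoxA]   ; rsp is 16-byte aligned at this CALL
        add     rsp,8*5         ; undo the prolog
        ret

section '.data' data readable writeable

  _caption db 'Win64 stack alignment',0
  _message db 'Stack was 16-byte aligned at the API call.',0

section '.idata' import data readable writeable

  dd 0,0,0,RVA kernel_name,RVA kernel_table
  dd 0,0,0,RVA user_name,RVA user_table
  dd 0,0,0,0,0

  kernel_table:
    ExitProcess dq RVA _ExitProcess
    dq 0
  user_table:
    MessageBoxA dq RVA _MessageBoxA
    dq 0

  kernel_name db 'KERNEL32.DLL',0
  user_name db 'USER32.DLL',0

  _ExitProcess dw 0
    db 'ExitProcess',0
  _MessageBoxA dw 0
    db 'MessageBoxA',0
```

Both prologs subtract 8*5 = 40 bytes, an odd multiple of 8: 8 bytes restore 16-byte alignment, and the remaining 32 bytes cover [RSP+0] to [RSP+24], which some APIs overwrite.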
Feryno 23 Sep 2005, 06:36
For debugging, I suggest using the Microsoft win64 debugger: http://www.microsoft.com/whdc/devtools/debugging/install64bit.mspx
Here is a simple debugger in FASM that implements only very few functions (single step, step over, run; close isn't finished yet). It has no disassembler; you see only raw binary opcodes, not instructions. You can see the registers, but you can't modify them yet...
Something about the asm debugger structure:
1. the debugger itself - parent process, has the window message loop
2. thread - child of 1. - has the debug message loop; in order not to waste CPU time it waits in a suspended state until the parent (1.) sets e.g. single step and resumes the thread
3. debuggee - child of 2.
edit 2012-10-01 deleted attachment, reached quota limit
Last edited by Feryno on 01 Oct 2012, 09:47; edited 2 times in total
r22 25 Sep 2005, 03:23
quick question about win 64
What registers need to be preserved in the fastcall convention?
I.e. in 32-bit: EBX, EDI, ESI, EBP
Is it the same for 64bit?
I'm going to be making an encryption dll (don't have a 64bit system yet)
and I want to make sure it conforms to windows calling.
Feryno 26 Sep 2005, 06:51
Feryno 14 Oct 2005, 05:06
Next experimental development version of the debugger:
Its design is (and will be) similar to the legendary Borland Turbo Debugger
File - Open debuggee, Close debuggee
Run - Run, Trace into, Step over, Execute to
View - nothing finished yet
Change - General Purpose Registers, Flags
Breakpoints - At (=software breakpoint), Hardware breakpoint
Not supported yet:
File - Attach process, Open debuggee command line parameters
Run - Go to position, Program reset
View - everything needs to begin
Change - XMM registers, Flags hexa, Memory (change/save/load)
Breakpoints - View, Toggle, Delete all
The worst thing at the end: the disassembler part was started only now and is at most 1% finished; mostly you will see opcodes disassembled as "db XX".
Will anybody cooperate on it? Will anybody need it? The Microsoft win64 debugger is good enough, but it is too big (10 MB) and you must enter everything as commands. The FASM debugger doesn't and won't require typing commands at a shell line.
removed attachment because of quota
Last edited by Feryno on 02 Aug 2013, 08:58; edited 2 times in total
decard 14 Oct 2005, 07:46
That's a very useful project. I will surely need it, only I have to finally buy WinXP 64...
Reverend 15 Oct 2005, 12:14
Yeah, the project sounds great. Unfortunately I am still on a 32-bit machine and that doesn't seem likely to change in the near future. But keep up the good work!
Feryno 25 Oct 2005, 13:09
It seems like the SetThreadContext win64 API forgets to set bit 1 of RFlags to 1
( http://www.sandpile.org/aa64/rflags.htm )
But the AMD64 CPU sets this bit automatically after executing the first instruction, e.g. a single step. It's only an irrelevant, silent detail in the win64 API.
A typical value of the RFlags register looks like 00000202h (Interrupt Flag = 1, bit 1 = 1)
After SetThreadContext and GetThreadContext, RFlags is 00000200h (only Interrupt Flag = 1). After executing a single step, RFlags = 00000202h again, until SetThreadContext once more forgets to set bit 1 to 1
Again, it's only an irrelevant detail that doesn't matter in real assembler coding... I wasted a little time until I found out what happens.
You can test this with the attached files. Run dbg.exe, load prog_for_test_disassembler_04.exe as a debuggee and do a few single steps. You will see Flags with bit 1 = 0, because dbg10.exe uses SetThreadContext (when removing the Trap Flag in the handle_single_step routine) and GetThreadContext before displaying the Flags value. But when you single step a PUSHF instruction, you will see e.g. 00000302h pushed onto the stack (Interrupt Flag = 1, Trap Flag = 1, bit 1 = 1)
removed attachment because of quota
Last edited by Feryno on 02 Aug 2013, 08:58; edited 1 time in total
vid 25 Oct 2005, 14:12
what's bit 1 for? reserved to stay = 1?
MazeGen 25 Oct 2005, 14:30
Yes, it is hardcoded, probably because of some legacy issues.
vid 25 Oct 2005, 15:48
so it resets back to 1 after each instruction?
MazeGen 25 Oct 2005, 17:21
Sorry, I didn't read the original post carefully.
What Feryno said is weird, because bit 1 in rFlags seems to be not programmable (just like in old 8086):
AMD64 Programmer’s Manual Volume 1: Application Programming, chapter 3.1.4 Flags Register wrote:
Feryno 26 Oct 2005, 05:19
I think that win64 isn't able to clear bit 1 of rflags in the CPU. It's only a small mistake in the OS task switch structures.
You can find it in this typical way:
4. execute a single step in the debuggee
5. the debuggee is suspended
6. you call GetThreadContext (with the CONTEXT_ALL flag of course) and get rflags in the ThreadContext structure = e.g. 00000302h
7. you call SetThreadContext (again with CONTEXT_ALL)
8. you call GetThreadContext (again with CONTEXT_ALL) to reread the ThreadContext and... ooooops, you get rflags 00000300h
9. you resume the debuggee (API calls ContinueDebugEvent, WaitForDebugEvent), so one or more instructions execute (single step or breakpoint), and now, if you call GetThreadContext, bit 1 of rflags is set (=1).
You can now try 6., 7., 8. again. Every time, in 6. you get bit 1 = 1 and in 8. you get bit 1 = 0
Because the debuggee is suspended in 6., 7., 8., I suppose that the ThreadContext isn't reloaded into the CPU. Bit 1 maybe stays cleared (=0) only in the internal win64 task switch structures... After the debuggee is resumed, the CPU reloads the registers, including rflags, from the win64 task switch structures and maybe isn't able to clear bit 1 of rflags.
Maybe the same occurs in win32 as well as in win64...
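The CPU side of this guess can be checked without any debug API. The sketch below (an illustration, not part of the debugger sources) clears bit 1 in a copy of RFLAGS, loads it with POPFQ and reads the flags back. Since bit 1 is hardwired on AMD64, it is expected to read back as 1, so the process should exit with code 2. This says nothing about what SetThreadContext stores in the saved context, which is where the cleared bit is suspected to live.

```asm
format PE64 console
entry start

section '.text' code readable executable

  start:
        sub     rsp,8*5         ; align stack and reserve parameter space
        pushfq
        pop     rax             ; rax = current RFLAGS
        btr     rax,1           ; clear bit 1 in the copy
        push    rax
        popfq                   ; try to load RFLAGS with bit 1 = 0
        pushfq
        pop     rcx             ; read RFLAGS back
        and     ecx,2           ; keep only bit 1
        call    [ExitProcess]   ; exit code = 2 if bit 1 is still set

section '.idata' import data readable writeable

  dd 0,0,0,RVA kernel_name,RVA kernel_table
  dd 0,0,0,0,0

  kernel_table:
    ExitProcess dq RVA _ExitProcess
    dq 0

  kernel_name db 'KERNEL32.DLL',0

  _ExitProcess dw 0
    db 'ExitProcess',0
```

After running it from a command prompt, echo %errorlevel% shows the exit code.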
Feryno 22 Nov 2005, 06:22
The disassembler part of the debugger is done to the stage of the most frequent instructions. Neither FPU nor SSE instructions are implemented yet.
Please report disassembler mistakes to my mail or in this forum.
Some APIs used in dbg are very slow; I must identify them. I suspect some ListView API: about 2 single steps per second if you hold down F7, very slow... (dbg09 was much faster, without ListViews, using paint)
Some functions are not finished (remove breakpoint, attach process...).
edit day after:
3 disassembler mistakes fixed, updated attachments:
removed attachment because of quota
Last edited by Feryno on 02 Aug 2013, 08:59; edited 1 time in total
alorent 05 Dec 2005, 23:07
I'm new to this forum and just wanted to say that Feryno is doing a great job on the Win64 debugger
I was doing a bit of testing on it. It looks like the "INT x" instructions are not decoded OK: "CD03 ---> INT 00"
One question: why don't you divide the whole project into several ASM files? I'm sure it would then be easier for us to join this project... something like:
I hope this debugger will shadow OllyDbg for Win32
Keep up the good work!!!