flat assembler
Message board for the users of flat assembler.
B.Dubious 07 Apr 2020, 02:06
I'm trying to use the function StretchDIBits.
https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-stretchdibits

But I'm getting an 'undefined symbol' error on DIB_RGB_COLORS. I should be linking against gdi32.dll, so why am I getting the undefined symbol?

Code:
    library kernel,'kernel32.dll',\
            user,'user32.dll',\
            gdi,'gdi32.dll'

I would also like to know if I'm doing the following things properly. When using locals/endl I have the following issue.

Code:
    locals
      Width dd 0
    endl
    mov eax, [rdx+16]
    mov [Width], eax
    ;mov [Width], [rdx+16] ;can't do this

I also can't seem to do an invoke with something like [rdx+16] as a parameter. Do I have to first mov [rdx+16] into a register or variable?

When should I use stdcall vs invoke?

Sorry for the noob questions, and thanks for any help you can offer.
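For anyone following along, here is a minimal sketch of the two points in this post. It is an illustration rather than code from the thread. It assumes the standard fasm Windows include files are on the INCLUDE path, and the `sample` data and the use of the constant as an exit code are made up for the example. The value 0 for DIB_RGB_COLORS is taken from the Windows SDK header wingdi.h. Constants like this live in headers, not in gdi32.dll, so if no include file defines the name you have to define it yourself.

```asm
format PE64 console
entry start

include 'win64a.inc'

; Assumption: value copied from the Windows SDK header wingdi.h,
; where DIB_RGB_COLORS is 0 (and DIB_PAL_COLORS is 1).
DIB_RGB_COLORS = 0

section '.text' code readable executable

start:
        sub     rsp, 8*5            ; align the stack and reserve shadow space

        ; x86 has no memory-to-memory mov, so the value goes through a register
        lea     rdx, [sample]       ; rdx points at some structure (hypothetical data)
        mov     eax, [rdx+16]       ; load the dword at offset 16
        mov     [Width], eax        ; store it into the variable

        mov     ecx, DIB_RGB_COLORS ; the symbol now assembles; used as exit code here
        call    [ExitProcess]

section '.data' data readable writeable
  sample  dd 0, 0, 0, 0, 640        ; the fifth dword sits at offset 16
  Width   dd 0

section '.idata' import data readable writeable
  library kernel32,'KERNEL32.DLL'
  include 'api\kernel32.inc'
```

The same two-step load and store applies when `Width` is a local declared with locals/endl inside a proc.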
B.Dubious 07 Apr 2020, 02:59
Oh, I didn't realise that constants weren't in the dll. Why is it that I'm able to use some constants like MEM_RELEASE but I'm not able to use DIB_RGB_COLORS?

That makes sense, thanks. I should have read up on addressing modes more closely. I'm trying to hit the ground running but I'm moving with all the grace of a duck.

I already used some C code and the debugger to check what the constant was, but for knowledge's sake I needed to ask.

And given that I'm still just getting used to the basics I apologise for this next question, but I'm confused about when it's wise to use branch-less code (conditional moves etc.) versus when to use branches and let the branch predictor do its magic. I was under the impression that branch predictors these days are right something like 995 times out of 1000, and that the cost of a miss on the speculative execution is generally minimal. Is that mistaken? Is branch-less code really that much more efficient?

Are there any costs with intermixing AVX and SSE stuff?

Is there any documentation on Windows constants, or should I just keep using the debugger to figure it out?

You're fast becoming my favourite person, thanks for your help and patience.
revolution 07 Apr 2020, 03:33
If you used MEM_RELEASE before without a problem then that means one of your included files has defined it for you.

Don't worry about the branch predictor until you have your code working and tested. When you need to optimise it to run faster on your chosen system, then test it. To answer your question about its efficiency: it depends upon your access patterns. Some access patterns are well suited to such prediction; other patterns have terrible performance. You have to test your code both ways to see if it helps you or harms you.

Windows constants are officially documented in the Windows SDK includes, but many websites will also have some values.

For AVX and SSE it depends upon your CPU and system. Some CPUs will reduce the clock when using AVX2, some don't. Your system might overheat with the increased workload. Test your system to see how it is affected.
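To make "test your code both ways" concrete, here is a small sketch of the same operation written once with a branch and once with a conditional move. It is only an illustration: the routine names `max_branch` and `max_cmov` are made up, and the program merely checks that both versions agree (exit code 0). Which one is faster depends on how predictable the data is, so you would still need to time both on your own inputs and CPU.

```asm
format PE64 console
entry start

include 'win64a.inc'

section '.text' code readable executable

; max_branch: eax = signed max(ecx, edx), using a conditional jump
max_branch:
        mov     eax, ecx
        cmp     ecx, edx
        jge     .done           ; ecx >= edx: keep ecx
        mov     eax, edx
  .done:
        ret

; max_cmov: same result, using a conditional move instead of a jump
max_cmov:
        mov     eax, ecx
        cmp     ecx, edx
        cmovl   eax, edx        ; ecx < edx: take edx
        ret

start:
        sub     rsp, 8*5        ; align the stack and reserve shadow space

        mov     ecx, 3
        mov     edx, 7
        call    max_branch
        mov     ebx, eax        ; remember the branchy result

        mov     ecx, 3
        mov     edx, 7
        call    max_cmov

        sub     eax, ebx        ; 0 if both versions agree
        mov     ecx, eax
        call    [ExitProcess]

section '.idata' import data readable writeable
  library kernel32,'KERNEL32.DLL'
  include 'api\kernel32.inc'
```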
B.Dubious 07 Apr 2020, 04:30
I see. I've had a look and I notice that win64a.inc includes a whole lot of things; it just happens that the gdi file doesn't have DIB_RGB_COLORS. I'll read over them and see what is included so that I won't make the same error in future.

That's generally how I do things, I like to get things going, but lately I've been questioning my mental model in regard to this topic. I really need to read up on micro-architecture, but most things I see are a few years old, and some things I've seen recently are conflicting in their advice. But I do recall that in a video Fabian Giesen said pretty much the same thing. I'll add learning about the branch prediction stuff to my todo list for later.

Reduce the clock? Why do they do that? Do you know of any particularly good resources for learning more about microarchitecture and the like?
revolution 07 Apr 2020, 04:41
Clock reduction is to keep the power requirements within the specification. The AVX circuits can consume a lot of power and this can overload the power delivery. Or something like that anyway. Basically the CPU is expecting to do a lot more computational work so that equates to a lot more power drawn.
B.Dubious 07 Apr 2020, 05:11
Is the same clock reduction true for SSE and MMX? Can the clock reduction reduce performance in some instances? I've never heard this mentioned before, it's quite interesting.
revolution 07 Apr 2020, 05:21
Each CPU is different. And the behaviours change as new models come and go.
FWIW I've not seen any CPUs that alter the clock for instructions older than AVX2. But just because I haven't seen it doesn't mean it doesn't happen, or won't happen in the future. As always, check your CPU specs to see if it affects you.
B.Dubious 07 Apr 2020, 05:40
I wonder if there are any flags in the CPUID that could be checked for such things. I'll read into it further. Thanks again for all the great info.
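As a starting point for reading CPUID, here is a rough sketch that checks the AVX2 feature bit (leaf 7, sub-leaf 0, EBX bit 5). Note the limits: this only reports whether the CPU advertises AVX2, it says nothing about whether the clock is reduced while those instructions run, and a complete check would also confirm operating system support for the wider registers (OSXSAVE and XGETBV), which is left out here. The exit code convention (1 means AVX2 reported, 0 means not) is made up for the example.

```asm
format PE64 console
entry start

include 'win64a.inc'

section '.text' code readable executable

start:
        sub     rsp, 8*5        ; align the stack and reserve shadow space

        xor     eax, eax
        cpuid                   ; leaf 0: eax = highest supported basic leaf
        cmp     eax, 7
        jb      .no_avx2        ; leaf 7 is not available on this CPU

        mov     eax, 7
        xor     ecx, ecx        ; sub-leaf 0
        cpuid
        bt      ebx, 5          ; CPUID.(EAX=7,ECX=0):EBX bit 5 = AVX2
        jnc     .no_avx2

        mov     ecx, 1          ; exit code 1: CPU reports AVX2
        jmp     .exit
  .no_avx2:
        xor     ecx, ecx        ; exit code 0: AVX2 not reported
  .exit:
        call    [ExitProcess]

section '.idata' import data readable writeable
  library kernel32,'KERNEL32.DLL'
  include 'api\kernel32.inc'
```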
revolution 07 Apr 2020, 05:45
B.Dubious wrote: I wonder if there are any flags in the CPUID that could be checked for such things.
B.Dubious 07 Apr 2020, 19:27
I'll definitely look into it. At the moment I'm being driven rather crazy. I've been trying to debug this for hours and I've encountered so many different mistakes. I've learned a lot, but although I felt like I was coming to grips with this style of programming, I seem to have hit a brick wall!

I'm trying to call StretchDIBits, but it keeps returning 0. By now I've pretty much rewritten the whole thing from scratch, just to make sure I wasn't missing earlier mistakes that I wouldn't make now, and I've been looking at the disassembly in the debugger for ages! Now GetLastError returns 6, which is ERROR_INVALID_HANDLE. The only handle is the DC, but even if I call GetDC with the HWND that I stored from the window creation, it still fails with a 6. I don't even know what to do about it right now, it's very frustrating. As far as I can see all the arguments to the call are fair and valid.
revolution 07 Apr 2020, 20:19
Stop your debugger just before the call and examine all the values.
Looking at the disassembly is fine, but stepping through and watching it in action is a very powerful technique to spot problems.
B.Dubious 07 Apr 2020, 20:29
I did that. I kept introducing problems and then fixing them again while trying to work out what was wrong. It went on for hours and was quite painful! As far as I can see everything is as it should be... I'm probably too tired, I'll try again tomorrow. I did briefly get my heart racing when I ran to the op after the call and saw rax change, but it was crashing inside the function, haha.

I've just been using int3 a lot and moving things into registers to watch them. It's quite tricky to follow the disassembly, as I'm used to long variable names!
Copyright © 1999-2023, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
Website powered by rwasa.