flat assembler
Message board for the users of flat assembler.
Saving SSE state
system error 24 Feb 2015, 23:46
I tested the code on Windows 8 (64-bit).
HaHaAnonymous 24 Feb 2015, 23:54
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 17:51; edited 1 time in total
system error 25 Feb 2015, 00:14
HaHaAnonymous wrote:
LOL. I forgot about that. Thanks. One more question, if you don't mind: why would putchar (from msvcrt.dll) mess with the SSE registers? This is insanely annoying, because every time I use it, it clears xmm0 to xmm5 for no reason.
HaHaAnonymous 25 Feb 2015, 00:22
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 17:51; edited 1 time in total
system error 25 Feb 2015, 00:36
HaHaAnonymous wrote:
That is annoying. I don't understand why a function as simple as putchar would really need to mess with big fat extended registers. FAIL design! LOL
Tyler 25 Feb 2015, 01:15
Regardless of whether it does or doesn't, you shouldn't depend on it not messing with it unless it is guaranteed by the calling convention.
system error 25 Feb 2015, 01:47
Tyler wrote: Regardless of whether it does or doesn't, you shouldn't depend on it not messing with it unless it is guaranteed by the calling convention.
That's a problem, for example, if you are creating a general-purpose library where a simple routine like dispChar (which is central to information retrieval in text-based and string-based systems) would have to deal with saving and restoring the SSE state every time. For 100 routines that depend on one dispChar, one will have to deal with saving / restoring the XMMs 100 times. This bloat is contagious and really is a BAD design. I am glad Linux doesn't share this disease.
revolution 25 Feb 2015, 02:24
Then don't store your data in the XMM registers. There is no sense in saving and restoring it 100 times when instead you can put it in memory once and read it when required.
Your argument could be extended to RAX, or any other register. There comes a point where a trade-off has to be made. If everything were saved across all system calls then things would get saved far more often. But the saving is hidden behind the inscrutable OS call routine, so people don't realise how much extra work is being done. Whether your code saves it or the OS code saves it makes no difference to the performance. But if the OS saves fewer things and the user code only saves things when needed, then you get a performance boost.
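As a rough illustration of the put-it-in-memory-once approach described above, here is a minimal 32-bit fasm sketch. The label vec, the sample values and the characters printed are made up for the example; it assumes the standard win32a.inc macros and the putchar from msvcrt.dll discussed in this thread.

```
; Minimal sketch, not a definitive implementation.
; Assumptions: 32-bit fasm build, win32a.inc macros, putchar from msvcrt.dll.
; 'vec' is a hypothetical name chosen for this example.
format PE console
entry start

include 'win32a.inc'

section '.data' data readable writeable

  align 16
  vec dd 1.0, 2.0, 3.0, 4.0             ; the working value lives in memory

section '.text' code readable executable

start:
        movaps  xmm0, dqword [vec]      ; load the value
        addps   xmm0, xmm0              ; do some work on it
        movaps  dqword [vec], xmm0      ; store it once before calling out

        cinvoke putchar, 'A'            ; XMM contents are not guaranteed to survive
        cinvoke putchar, 'B'            ; no save/restore needed around each call

        movaps  xmm0, dqword [vec]      ; reload only when the value is needed again
        addps   xmm0, xmm0
        movaps  dqword [vec], xmm0

        invoke  ExitProcess, 0

section '.idata' import data readable writeable

  library kernel32, 'KERNEL32.DLL', \
          msvcrt, 'MSVCRT.DLL'
  import kernel32, ExitProcess, 'ExitProcess'
  import msvcrt, putchar, 'putchar'
```

The point of the layout is that the store happens once per batch of calls, so adding more putchar calls in the middle costs nothing extra.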
system error 25 Feb 2015, 04:23
revolution wrote: Then don't store your data in the XMM registers. There is no sense in saving and restoring it 100 times when instead you can put it in memory once and read it when required.
idk revo. I can live with the OS using up most GP registers. But involving fat registers such as XMM and YMM just to print a character is totally incomprehensible. You lose your XMM content right after the next unrelated line, like this:
Code:
movdqa xmm0, dqword [byebye]
cinvoke putchar ;----> bye bye XMM0
putchar is not even an FP routine! That should not happen, because XMMs are specialized registers where user code should be given more priority. IMO, XMM registers are for the users / applications, not for the OS. If the OS takes up most of the GP registers, then the XMMs should be left alone for user code to use.
system error 25 Feb 2015, 04:43
Ok I get it. Calling a putchar from msvcrt is like calling the entire COUNTER-STRIKE game to load! Because they both use SSE registers and they're both MATH-intensive. LOOOLL!
revolution 25 Feb 2015, 06:35
XMM is used for more than just floating point. They can also be used for integer arithmetic/boolean operations and for general data movement. I assume putChar places a character on the screen for viewing? If so then naturally that involves copying the character's bitmap data to the display memory so why not use XMM and do it efficiently?
system error 25 Feb 2015, 08:34
revolution wrote: XMM is used for more than just floating point. They can also be used for integer arithmetic/boolean operations and for general data movement. I assume putChar places a character on the screen for viewing? If so then naturally that involves copying the character's bitmap data to the display memory so why not use XMM and do it efficiently?
We are talking about CLI-based character rendering here, revo (which I believe has its own font ROM in the video hardware), not a graphical MS Word. The BIOS doesn't need SSE registers to display a 'C', and neither did the DOOM engine. They all work efficiently without SSE registers. I am not even asking for blinking text. Just a plain char to DOS, and I have to go through the entire SSE documentation for that?? LOL
revolution 25 Feb 2015, 08:50
The ROM fonts would only be active in full screen text mode in older versions of Windows (note that newer versions don't support this mode). But Windows can also display the console in graphics mode. Perhaps it also makes a graphical copy in case the user turns off full screen text? Anyhow, I don't know the details of what Windows does exactly, but it does still follow the calling convention as stated above, so there is no error or bug. And we can't go around having some functions follow one set of rules and others follow different sets of rules; it would all get too confusing and disorderly. If you really need to know exactly what Windows uses the lower XMM registers for then you can use a debugger to discover what it is doing.
Feryno 26 Feb 2015, 13:53
system error wrote: why would the putchar (from msvcrt.dll) mess up with the SSE registers? This is insanely annoying because everytime I use it, it clears xmm0 to xmm5 for no reason.
On the x64 MS Windows kernel, ring3 switches into ring0 using the syscall instruction, which transfers execution to KiSystemCall64, no matter whether it is a 64-bit app or a 32-bit app running in the compatibility submode of long mode. On the return from ring0 to ring3, the procedure is named KiSystemServiceExit, which at the end executes something like this:
Code:
pxor xmm0,xmm0
pxor xmm1,xmm1
pxor xmm2,xmm2
pxor xmm3,xmm3
pxor xmm4,xmm4
pxor xmm5,xmm5
mov rcx,[rbp+CONTEXT.RIP] ; get RIP pointing after the syscall instruction
mov r11,[rbp+CONTEXT.RFLAGS] ; RFLAGS
mov rbp,r9
mov rsp,r8
swapgs
sysretq
Under MS Windows x64, the nonvolatile XMM registers are xmm6...xmm15. To access xmm8...xmm15 your executable must not be 32-bit; you have to update it to x64.
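To make the nonvolatile-register point concrete, here is a minimal sketch of a 64-bit fasm program that keeps its working value in xmm6 across a putchar call. It assumes a PE64 build with the win64a.inc macros; the label vec and the sample values are made up for the example. Because the Microsoft x64 calling convention treats xmm6...xmm15 as callee-saved, the called function is expected to hand xmm6 back unchanged.

```
; Minimal sketch, not a definitive implementation.
; Assumptions: 64-bit fasm build, win64a.inc macros, putchar from msvcrt.dll.
; 'vec' is a hypothetical name chosen for this example.
format PE64 console
entry start

include 'win64a.inc'

section '.data' data readable writeable

  align 16
  vec dd 1.0, 2.0, 3.0, 4.0

section '.text' code readable executable

start:
        sub     rsp, 8                  ; make the stack 16-byte aligned

        movaps  xmm6, dqword [vec]      ; xmm6 is nonvolatile (callee-saved) on x64
        invoke  putchar, 'A'            ; xmm0..xmm5 may be lost, xmm6 must be preserved
        addps   xmm6, xmm6              ; keep working with the same register
        movaps  dqword [vec], xmm6

        invoke  ExitProcess, 0          ; never returns, so xmm6 is not restored here

section '.idata' import data readable writeable

  library kernel32, 'KERNEL32.DLL', \
          msvcrt, 'MSVCRT.DLL'
  import kernel32, ExitProcess, 'ExitProcess'
  import msvcrt, putchar, 'putchar'
```

A routine that returns to a caller would itself have to save xmm6 on entry and restore it before ret, since its own caller relies on the same rule. The sketch skips that only because it ends in ExitProcess.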
revolution 26 Feb 2015, 15:09
So I guess that the pxor instructions are a security precaution to ensure no out-of-process information leakage in case those registers were used for something sensitive.
system error 26 Feb 2015, 15:56
Feryno wrote:
Yeah. Actually the code runs perfectly on a 32-bit CPU. The only problem is when I run it on a 64-bit OS. That means I now have to do a manual save and restore whenever a string routine and a math routine cross paths.
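One way the manual save and restore could be confined to a single place is a wrapper around putchar, so that callers never repeat it. The following is a minimal 32-bit fasm sketch under that assumption. The dispChar name is borrowed from earlier in the thread, but this body, the AL calling interface and the label vec are hypothetical. It saves only xmm0...xmm5, the registers reported as cleared here, and uses movdqu because a 32-bit stack is not guaranteed to be 16-byte aligned.

```
; Minimal sketch, not a definitive implementation.
; Assumptions: 32-bit fasm build, win32a.inc macros, putchar from msvcrt.dll.
; 'dispChar' (character passed in AL) and 'vec' are hypothetical for this example.
format PE console
entry start

include 'win32a.inc'

section '.data' data readable writeable

  align 16
  vec dd 1.0, 2.0, 3.0, 4.0

section '.text' code readable executable

; in: AL = character to print
; preserves xmm0..xmm5; eax, ecx and edx are clobbered as in any cdecl call
dispChar:
        sub     esp, 6*16                       ; room for six XMM registers
        movdqu  dqword [esp+0*16], xmm0
        movdqu  dqword [esp+1*16], xmm1
        movdqu  dqword [esp+2*16], xmm2
        movdqu  dqword [esp+3*16], xmm3
        movdqu  dqword [esp+4*16], xmm4
        movdqu  dqword [esp+5*16], xmm5
        movzx   eax, al
        cinvoke putchar, eax                    ; cinvoke removes its own argument
        movdqu  xmm0, dqword [esp+0*16]
        movdqu  xmm1, dqword [esp+1*16]
        movdqu  xmm2, dqword [esp+2*16]
        movdqu  xmm3, dqword [esp+3*16]
        movdqu  xmm4, dqword [esp+4*16]
        movdqu  xmm5, dqword [esp+5*16]
        add     esp, 6*16
        ret

start:
        movaps  xmm0, dqword [vec]
        mov     al, 'A'
        call    dispChar                        ; xmm0 still holds the value afterwards
        addps   xmm0, xmm0
        movaps  dqword [vec], xmm0
        invoke  ExitProcess, 0

section '.idata' import data readable writeable

  library kernel32, 'KERNEL32.DLL', \
          msvcrt, 'MSVCRT.DLL'
  import kernel32, ExitProcess, 'ExitProcess'
  import msvcrt, putchar, 'putchar'
```

The cost is six stores and six loads on every character, which is the trade-off revolution describes: routines that do not keep live data in XMM registers pay for a save they never needed.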
system error 26 Feb 2015, 16:05
revolution wrote: The ROM fonts would only be active in full screen text mode in older versions of Windows OS (note that newer versions don't support this mode). But Windows can also display the console in graphics mode. Perhaps it makes a graphical copy also in case the user turns off full screen text? Anyhow, I don't know the details of what Windows does exactly but it does still follow the calling convention as stated above so there is no error or bug. And we can't go around having some functions follow one set of rules and others follow different sets of rules, it would all get too confusing and disorderly. If you really need to know exactly what Windows uses the lower XMM registers for then you can use a debugger to discover what it is doing.
Regardless of the calling convention, I think the SSE registers should be left alone, at least in the string or char routines. They can find excuses in DirectX routines, but a putchar?? LOL. It's overkill.
revolution 26 Feb 2015, 16:11
You need to tell MS about your suggestion. Perhaps they will like it and make a new calling convention and apply it to all string and char routines and then tell everyone to change their C code for the new convention.
system error 26 Feb 2015, 16:17
revolution wrote: So I guess that the pxor instructions are a security precaution to ensure no out-of-process information leakage in case those registers were used for something sensitive.
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.