flat assembler
Message board for the users of flat assembler.

task switch & SIMD register

N-LG



Joined: 14 Feb 2019
Posts: 40
Location: france
N-LG 02 Mar 2019, 11:47
I am currently studying how to back up the SIMD registers at each task switch. I only found the FXSAVE and FXRSTOR instructions. Is there another solution?

Currently I save the general-purpose registers on the stack. I looked at using the TSS, but it does not seem to provide anything for the SIMD registers.
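For reference, FXSAVE and FXRSTOR operate on a 512-byte memory area that must be aligned on a 16-byte boundary, so the area cannot simply be pushed next to the general registers without aligning it first. A minimal sketch in C of a per-task record that reserves such an area; the `task` struct, its fields and `fx_area_ok` are hypothetical names, not taken from any existing kernel:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define FXSAVE_AREA_SIZE  512u  /* FXSAVE/FXRSTOR always use a 512-byte image */
#define FXSAVE_AREA_ALIGN 16u   /* the memory operand must be 16-byte aligned */

/* Hypothetical per-task record: the general registers stay on the task's
   stack, the x87/MMX/SSE image lives in fx_area. */
struct task {
    alignas(16) uint8_t fx_area[FXSAVE_AREA_SIZE];
    uintptr_t saved_sp;          /* stack pointer saved at task switch */
};

/* Compile-time checks that the layout satisfies the alignment rule. */
static_assert(alignof(struct task) % FXSAVE_AREA_ALIGN == 0,
              "struct task must be 16-byte aligned");
static_assert(offsetof(struct task, fx_area) % FXSAVE_AREA_ALIGN == 0,
              "fx_area must start on a 16-byte boundary");

/* Returns 1 when the pointer is usable as an FXSAVE/FXRSTOR operand. */
static int fx_area_ok(const void *p)
{
    return ((uintptr_t)p % FXSAVE_AREA_ALIGN) == 0;
}
```

The switch code would pass the address of `fx_area` to FXSAVE for the outgoing task and to FXRSTOR for the incoming one.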
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20486
Location: In your JS exploiting you and your system
revolution 02 Mar 2019, 12:00
It is common for OSes not to save/restore the FPU/SSE/AVX registers on every task switch. Usually they employ lazy save/restore: the OS disables the FPU at the switch, and the state is only swapped when the new task actually starts to use the FPU. That way, if only a single task uses the FPU, no time is wasted saving and restoring the FPU registers.
N-LG 02 Mar 2019, 12:35
To avoid saving these registers unnecessarily, I thought of assigning each task a usage level for these registers and doing the backups accordingly.

The usage level would also make it possible to refuse to run tasks that are not compatible with the processor's instruction set. (None of this is implemented yet.)
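The usage-level idea could be sketched roughly as below. This is only an illustration in C: the flag names, `task_can_run` and `save_plan` are hypothetical, and the mapping from level to save instruction is an assumption (FXSAVE covers x87/MMX/SSE state, while the AVX registers need the XSAVE family).

```c
#include <assert.h>

/* Hypothetical per-task "usage level" flags; the names are illustrative. */
enum fp_use {
    USE_NONE = 0,
    USE_X87  = 1 << 0,
    USE_SSE  = 1 << 1,
    USE_AVX  = 1 << 2
};

enum save_kind { SAVE_NOTHING, SAVE_FXSAVE_AREA, SAVE_XSAVE_AREA };

/* A task may run only if every register set it declares is available
   on this processor. */
static int task_can_run(unsigned task_uses, unsigned cpu_has)
{
    return (task_uses & ~cpu_has) == 0;
}

/* Decide what the switch code has to save for the outgoing task. */
static enum save_kind save_plan(unsigned task_uses)
{
    /* Only the flags defined above are understood by this sketch. */
    assert((task_uses & ~(unsigned)(USE_X87 | USE_SSE | USE_AVX)) == 0);

    if (task_uses & USE_AVX)
        return SAVE_XSAVE_AREA;      /* AVX state needs XSAVE/XRSTOR */
    if (task_uses & (USE_X87 | USE_SSE))
        return SAVE_FXSAVE_AREA;     /* FXSAVE/FXRSTOR is enough */
    return SAVE_NOTHING;             /* integer-only task */
}
```

With this scheme the loader would call `task_can_run` once when the task is created, and the scheduler would consult `save_plan` at each switch.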
revolution 02 Mar 2019, 12:43
If you know all of your tasks and what they use then a list will work. But it can lead to inefficiency in some circumstances. For example, a task might not use the FPU during its upcoming time-slice, and you would still save and restore the FPU state even though it wasn't needed.
N-LG 02 Mar 2019, 13:02
I have not studied the FPU (I have not needed it yet) and I did not know it could be disabled.
When the FPU is disabled and the processor tries to execute an FPU instruction, does it generate an exception? Is it the #NM (Device Not Available) exception? Does it work the same way for the SIMD unit?
revolution 02 Mar 2019, 13:45
I use the term FPU as a proxy for all the floating-point related stuff: FPU/SSE/AVX. They are disabled by setting a bit in a control register (CR0 I think?). Then you get an exception if a task attempts to execute something that needs the FPU/SSE/AVX unit.

If you have the concept of which task "owns" the FPU then you can use that to determine where to save and restore its state.
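The bit in question is CR0.TS (bit 3), the exception is #NM (vector 7), and CLTS clears the bit again. The ownership idea can be sketched as a small user-space model in C. Everything here is a stand-in for illustration: `cr0`, `fpu_regs`, `switch_to`, `nm_handler` and `fpu_use` are hypothetical names, `memcpy` takes the place of FXSAVE/FXRSTOR, a single CPU is assumed, and proper initialization of a new task's FPU image is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CR0_TS (1u << 3)   /* CR0 bit 3: Task Switched */

struct task {
    unsigned char fpu_image[512];    /* where FXSAVE would store the state */
};

static unsigned cr0;                 /* modeled control register */
static unsigned char fpu_regs[512];  /* modeled live FPU/SSE registers */
static struct task *current;         /* task that is running now */
static struct task *fpu_owner;       /* task whose state is in fpu_regs */
static unsigned saves;               /* number of modeled FXSAVEs */

/* Task switch: touch no FPU state, only arm or disarm the trap. */
static void switch_to(struct task *next)
{
    current = next;
    if (next == fpu_owner)
        cr0 &= ~CR0_TS;   /* registers already hold next's state (CLTS) */
    else
        cr0 |= CR0_TS;    /* first FPU/SSE instruction will raise #NM */
}

/* Modeled #NM handler: swap the state only now that it is needed. */
static void nm_handler(void)
{
    assert(current != NULL);
    cr0 &= ~CR0_TS;                                  /* CLTS */
    if (fpu_owner == current)
        return;
    if (fpu_owner != NULL) {
        memcpy(fpu_owner->fpu_image, fpu_regs, sizeof fpu_regs); /* FXSAVE */
        saves++;
    }
    memcpy(fpu_regs, current->fpu_image, sizeof fpu_regs);       /* FXRSTOR */
    fpu_owner = current;
}

/* Models the current task executing an FPU/SSE instruction. */
static void fpu_use(unsigned char value)
{
    if (cr0 & CR0_TS)
        nm_handler();
    fpu_regs[0] = value;
}
```

In this model, switching away from a task and back to it without any other task touching the FPU costs no save or restore at all, which is the saving described above.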
N-LG 03 Mar 2019, 17:21
I think I will have each task emit a signal when it starts or finishes using the FPU / SIMD unit, so that the system knows when to save the registers.
However, I do not know whether there is any way other than FXSAVE / FXRSTOR to back up / restore the registers. I searched but did not find any.
revolution 03 Mar 2019, 18:12
There are also XSAVE, XRSTOR and XSAVEOPT.
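Whether these instructions are available is reported by CPUID: leaf 1 ECX bit 26 for the XSAVE feature set, and leaf 0Dh sub-leaf 1 EAX bit 0 for XSAVEOPT. Leaf 0Dh sub-leaf 0 returns in EBX the size of the save area for the currently enabled state components, and the area must be 64-byte aligned. A rough sketch in C of the selection logic; the function names are hypothetical, the register values are passed in by the caller rather than read from the CPU, and falling back to FXSAVE assumes FXSR support has been checked separately:

```c
#include <assert.h>
#include <stdint.h>

#define CPUID1_ECX_XSAVE      (1u << 26) /* CPUID.01h:ECX, XSAVE feature set */
#define CPUID_D1_EAX_XSAVEOPT (1u << 0)  /* CPUID.(0Dh,1):EAX, XSAVEOPT */
#define XSAVE_ALIGN           64u        /* XSAVE area alignment in bytes */

static_assert((XSAVE_ALIGN & (XSAVE_ALIGN - 1)) == 0,
              "alignment must be a power of two for the rounding below");

enum save_method { SAVE_FXSAVE, SAVE_XSAVE, SAVE_XSAVEOPT };

/* leaf1_ecx:  ECX returned by CPUID with EAX=1.
   leafd1_eax: EAX returned by CPUID with EAX=0Dh, ECX=1
               (pass 0 when leaf 0Dh is not available). */
static enum save_method pick_save_method(uint32_t leaf1_ecx,
                                         uint32_t leafd1_eax)
{
    if (!(leaf1_ecx & CPUID1_ECX_XSAVE))
        return SAVE_FXSAVE;              /* assumption: FXSR is present */
    if (leafd1_eax & CPUID_D1_EAX_XSAVEOPT)
        return SAVE_XSAVEOPT;
    return SAVE_XSAVE;
}

/* Rounds the size from CPUID.(EAX=0Dh,ECX=0):EBX up to a multiple of the
   alignment, so consecutive areas in an array stay 64-byte aligned. */
static uint32_t xsave_alloc_size(uint32_t leafd0_ebx)
{
    return (leafd0_ebx + XSAVE_ALIGN - 1) & ~(XSAVE_ALIGN - 1);
}
```

A kernel would also have to set CR4.OSXSAVE and program XCR0 before using XSAVE; that part is left out of the sketch.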
N-LG 03 Mar 2019, 21:27
OK, so this is the extension of FXSAVE / FXRSTOR for AVX. I will note it for later; I need to get SSE working properly first before trying AVX.
Thank you for all this information.
dunkaist



Joined: 31 Jul 2015
Posts: 24
dunkaist 03 Mar 2019, 21:28
You have to use XSAVE family instructions to save AVX context. You can't use lazy context switching via CR0.TS with XSAVE*, by design.
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8363
Location: Kraków, Poland
Tomasz Grysztar 03 Mar 2019, 21:53
dunkaist wrote:
You can't use lazy context switching via CR0.TS with XSAVE*, by design.
Wasn't that a misinterpretation?
dunkaist 06 Mar 2019, 22:41
Tomasz Grysztar wrote:
dunkaist wrote:
You can't use lazy context switching via CR0.TS with XSAVE*, by design.
Wasn't that a misinterpretation?

I was actually thinking about 'CR0.TS vs AVX' but wrote 'CR0.TS vs XSAVE'. And you, Tomasz, somehow sensed that and replied with a link to a 'CR0.TS vs AVX' patch.
It turns out even AVX512 context can be lazily switched. Thank you for pointing me to this!

Just for completeness, 'CR0.TS vs XSAVE' is possible though not recommended.
Intel wrote:
Vol 3, 13.4 DESIGNING OS FACILITIES FOR SAVING X87 FPU, SSE AND EXTENDED STATES ON TASK OR CONTEXT SWITCHES

...

The use of lazy restore mechanism in context switches is not recommended when XSAVE feature set is used to save/restore states for the following reasons.
  • With XSAVE feature set, Intel processors have optimizations in place to avoid saving the state components
    that are in their initial configurations or when they have not been modified since they were restored last.
    These optimizations eliminate the need for lazy restore. See section 13.5.4 in Intel® 64 and IA-32
    Architectures Software Developer’s Manual, Volume 1.

  • Intel processors have power optimizations when state components are in their initial configurations. Use of
    lazy restore retains the non-initial configuration of the last thread and is not power efficient.

  • Not all extended states support lazy restore mechanisms. As such, when one or more such states are
    enabled it becomes very inefficient to use lazy restore as it results in two separate state restore, one in
    context switch for the states that does not support lazy restore and one in the #NM handler for states that support lazy restore.
Copyright © 1999-2025, Tomasz Grysztar.