flat assembler
Message board for the users of flat assembler.

New instructions

Jin X
Joined: 06 Mar 2004
Posts: 70
Location: Russia
Hello!

There are new instructions described in the Intel SDM (October 2019):

CLRSSBSY, ENDBR32, ENDBR64, INCSSPD/INCSSPQ
RDSSPD/RDSSPQ, RSTORSSP, SAVEPREVSSP, SETSSBSY
VPCOMPRESSB/VPCOMPRESSW, VPDPBUSD, VPDPBUSDS, VPDPWSSD, VPDPWSSDS, VPEXPANDB/VPEXPANDW, VPOPCNT, VPSHLD, VPSHLDV, VPSHRD, VPSHRDV, VPSHUFBITQMB, WRSSD/WRSSQ, WRUSSD/WRUSSQ

All of them are currently absent from fasm 1 ;)


Last edited by Jin X on 27 Nov 2019, 13:11; edited 2 times in total
Post 27 Nov 2019, 12:57

Jin X
So, VPSHLD is present, but only as an XOP instruction.
Intel's VPSHLD is a group of AVX-512 (VBMI2/VL) instructions defined as VPSHLDW, VPSHLDD and VPSHLDQ :)
Some of the other instructions listed above are also groups of instructions. See the Intel SDM for more info.
Post 27 Nov 2019, 13:08

Tomasz Grysztar
Joined: 16 Jun 2003
Posts: 7797
Location: Kraków, Poland
You can get them for fasmg right now. But this should also pave the way for a future implementation in fasm 1.
Post 27 Nov 2019, 16:31

Jin X
I don't need it right now. I just wanted to let you know (just in case) :)

By the way, MOVSXD allows r16 and r32 as the first operand, but fasm 1 doesn't. This operation is meaningless, but it is described in the Intel SDM (and its encoding conflicts with ARPL outside 64-bit mode) :)
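
For anyone who needs the form that fasm 1 rejects, one possible workaround (a sketch, not something proposed in the thread) is to emit the bytes by hand. It assumes 64-bit mode, where opcode 63h /r without REX.W is the MOVSXD r32, r/m32 form. The ModRM byte C1h selects eax as destination and ecx as source.
Code:
use64
        db      0x63, 0xC1              ; movsxd eax, ecx (no REX.W: the 32-bit destination form)
        db      0x48, 0x63, 0xC1        ; movsxd rax, ecx (REX.W form, same bytes fasm emits for the mnemonic)
Outside 64-bit mode the same 63h opcode decodes as ARPL, so the first line is only meaningful under use64.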
Post 27 Nov 2019, 18:04

Tomasz Grysztar
Jin X wrote:
By the way, MOVSXD allows to use r16 and r32 as first operand but fasm 1 doesn't. This operation is meaningless but described in Intel SDM (and conflicting with ARPL in non-64-bit mode) :)
This has been discussed many years ago...
Post 27 Nov 2019, 18:44

Jin X
Some other comments:

1. PTWRITE is also absent in fasm 1.

2. It seems there's no way to specify the operand size of the XBEGIN instruction (rel16 or rel32). I don't know what reserved word should be used (maybe REL16 and REL32, like SHORT and NEAR for JMP: XBEGIN REL32 AbortProc).

3. SAL is encoded as SHL.
Post 27 Nov 2019, 19:01

Tomasz Grysztar
Jin X wrote:
1. PTWRITE is also absent in fasm 1.
It must have fallen through the cracks. Perhaps because it is not a part of any instruction set extension.

Jin X wrote:
2. It seems there's no way to specify operand size of XBEGIN instruction (rel16 or rel32). I don't know what reserved word should be used (maybe REL16 and REL32 as SHORT and NEAR for JMP: XBEGIN REL32 AbortProc).
Back when I was implementing it, I was not sure about this, so I decided to choose fasm's usual route and I just made it automatically optimized for size.

Jin X wrote:
3. SAL is encoded as SHL.
Officially there is no separate code for SAL, it is defined as just a synonym of SHL. See my post about some undocumented opcodes.
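
To illustrate the point about encodings, here is a small sketch (an editorial addition, not taken from the post referred to above). The documented shift-left encoding uses the /4 extension of opcode D1h. The undocumented alias is assumed here to be the /6 extension of the same opcode, which is how it is commonly described.
Code:
use32
        shl     eax, 1                  ; D1 E0: documented /4 encoding; "sal eax, 1" gives the same bytes
        db      0xD1, 0xF0              ; D1 F0: the /6 alias, emitted by hand since it has no mnemonic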
Post 27 Nov 2019, 19:23

Jin X
Tomasz Grysztar wrote:
This has been discussed many years ago...
You wrote:

1. The manuals list MOVSX/MOVZX with forms of first operand being larger than the second one.
MOVZX has no forms with the same operand size of 1st and 2nd operands but MOVSXD has.

2. The fact that it isn't really important for programmer which encoding the assembler chooses, allows even to make the "imprint" of the assembler on the code it generates.
You limit the use of assembler to your personal vision of what is important to all programmers and what is not. However, this use can be much wider. For example, it can be sizecoding (for demoscene, where code can be part of data), obfuscation, or self-modifying code, where encoding or instruction length can be important. And many other cases that do not occur to us.
Post 27 Nov 2019, 19:27

Jin X
Tomasz Grysztar wrote:
It must have fallen through the cracks. Perhaps because it is not a part of any instruction set extension.
Maybe, but it has a dedicated CPUID bit (EAX=14h, ECX=0: EBX bit 4).
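
A minimal sketch of testing that bit at run time. The label names are made up for the example, and the check of the highest basic leaf is an added precaution, not something stated in the thread.
Code:
use32
        xor     eax, eax
        cpuid                           ; EAX = highest supported basic CPUID leaf
        cmp     eax, 14h
        jb      no_ptwrite              ; leaf 14h is not available at all
        mov     eax, 14h
        xor     ecx, ecx                ; sub-leaf 0
        cpuid
        bt      ebx, 4                  ; CF = CPUID.(EAX=14h,ECX=0):EBX[4]
        jnc     no_ptwrite
have_ptwrite:
        ; PTWRITE may be used here
        jmp     done
no_ptwrite:
        ; fall back to code without PTWRITE
done: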

Tomasz Grysztar wrote:
Back when I was implementing it, I was not sure about this, so I decided to choose fasm's usual route and I just made it automatically optimized for size.
I see, but it can be important e.g. for self-modifying code :)
Post 27 Nov 2019, 19:34

Tomasz Grysztar
Jin X wrote:
MOVZX has no forms with the same operand size of 1st and 2nd operands but MOVSXD has.
The manuals have changed since that discussion was taking place. But, as Intel manuals are not completely consistent anyway, I stay with my design choices.

Jin X wrote:
You limit the use of assembler to your personal vision of what is important to all programmers and what is not. However, this use can be much wider. For example, it can be sizecoding (for demoscene, where code can be part of data), obfuscation, or self-modifying code, where encoding or instruction length can be important. And many other cases that do not occur to us.
Yeah, it is hard to satisfy everyone. This realization is one of the reasons why I made fasmg instead of fasm 2. With instructions implemented simply as optional macro packages, it is possible to customize them in any way you might need. By using different macros you can get an encoder that lets you customize everything, or one that emulates the syntax of another assembler, or whatever else might come to your mind.

Jin X wrote:
I see but it can be important e.g. for self-modifying code :)
The early versions of fasm had many features that made it possible to enforce a specific length of an immediate or displacement field, specifically for the purposes of self-modifying code. But I have been gradually withdrawing these features. As it turned out, such features are used exceptionally rarely (as self-modifying code itself is not used frequently, especially nowadays), and in the meantime fasm gained many new features that give the same effect without cluttering the basic syntax with additional options. For example, you can generate an instruction with an immediate value that enforces the size you need and then replace it with the value you want through the STORE directive.
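
A minimal sketch of that STORE technique. The label name and the placeholder constant are arbitrary choices for the example. The placeholder only has to be too large for the sign-extended 8-bit form, so that fasm picks the 32-bit immediate.
Code:
use32
patch_site:
        add     ecx, 0x7FFFFFFF         ; too large for imm8, so the imm32 form (81 C1 id) is emitted
        store   dword 1 at $-4          ; overwrite the 4-byte immediate: the bytes become 81 C1 01 00 00 00
Written directly, "add ecx, 1" would assemble to the 3-byte form 83 C1 01, so code patching the instruction at run time would have no 32-bit field to write into.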
Post 27 Nov 2019, 19:36

Tomasz Grysztar
OK, most of these new instruction sets were trivial to implement in fasm 1, the only ones left are AVX512_4VNNIW and CET_SS, because they need some specialized handlers. I could release a new version now, but there is no hurry, so I'm going to wait till I have the remaining instructions completed, too.

Having the (easier to make) fasmg macro implementations first is a great help in fasm 1 development as well. And it allows for additional cross-testing.
Post 27 Nov 2019, 20:53

alphis01
Joined: 05 Sep 2020
Posts: 7
Any idea when fasm 1 will support the new Intel CET instructions? I'd like to begin testing by the end of the year if possible. I need my binaries to conform to CET when it is enabled and they are running on supported hardware. If anyone wants to see the hardware, Intel is releasing Tiger Lake soon. It will support CET, and Win10 already has support built in.

I believe the Linux kernel is adding support as well.

Thanks!
Post 05 Sep 2020, 20:04

Tomasz Grysztar
The CET_SS and CET_IBT instruction sets were added in release 1.73.19, as noted in whatsnew.txt.
Post 05 Sep 2020, 20:34

alphis01
Ah I see! Thanks!
Post 06 Sep 2020, 05:06

alphis01
How do I enable use of IBT? I'm expecting to see function entries/jmp targets all prefixed with the new endbranch opcodes.

Do I need to do anything special to enable its use? Surely fasm's proc/jmp labels will automatically add them?

Thanks.
Post 06 Sep 2020, 13:15

Tomasz Grysztar
alphis01 wrote:
Do I need to do anything special to enable its use? Surely fasm's proc/jmp labels will automatically add them?
No, the standard macros do not do it, they remain compatible with old systems. But you can make a customized prologue/epilogue for "proc" macro. In general, if you need the new instructions in your code, you should just write them explicitly - this is an assembly language after all.
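
A rough sketch of such a customized prologue for 32-bit code. It assumes the standard fasm Windows headers, where "proc" takes the name of its prologue macro from the symbolic variable prologue@proc and the default one is called prologuedef. Those names are recalled from the headers' documentation and should be checked against your copy of proc32.inc. The name ibt_prologue is made up for the example.
Code:
include 'win32a.inc'                    ; assumed: standard headers providing "proc"

macro ibt_prologue procname,flag,parmbytes,localbytes,reglist
{
        endbr32                         ; branch target marker, placed right after the procedure label
        prologuedef procname,flag,parmbytes,localbytes,<reglist>   ; then the usual frame setup
}

prologue@proc equ ibt_prologue          ; every "proc" defined after this line uses the new prologue
Because the marker becomes the first instruction of every procedure, this only makes sense if all of them may be reached through indirect calls. It also needs a fasm release that knows the ENDBR32 mnemonic.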
Post 07 Sep 2020, 04:56

alphis01
Quote:
In general, if you need the new instructions in your code, you should just write them explicitly - this is an assembly language after all.


While normally I'd agree, the Intel CET functionality is more akin to a "call" pushing the return address on the stack and less like doing something manually, such as pushing params on the stack or moving them into regs (which you can still use invoke for).

The entire point of ENDBRANCH is that it's automatically inserted at call targets and jump targets. It would be a little silly to have to do this manually.

Keep in mind that this instruction is also expected to be added to all jump targets, which means jump tables as well. Perhaps even exception handlers.
Post 07 Sep 2020, 18:58

revolution
When all else fails, read the source
Joined: 24 Aug 2004
Posts: 17671
Location: In your JS exploiting you and your system
alphis01: You can write macros to insert the instructions for you if you want. But keep in mind that only newer CPUs can execute the new instructions so incompatibility may be an issue for others running your code.
Post 07 Sep 2020, 22:31

alphis01
Actually that's not true. I think there might be some misunderstanding of Intel CET and its IBT functionality. The new opcodes are actually just NOPs on CPUs that don't have the new ISA. So it's safe to have them in current code.

The main issue that I see is that since it's designed to work with calls (including virtual calls), jump tables, etc., these new endbr instructions need to be inserted automatically if "cet mode" is enabled. I'm not entirely sure macros can actually cover all the necessary ground going the more manual route.

At least I can manually shove these bytes in for testing, but be prepared for AVs on HW that supports CET in the cases where you forget to insert them. Or just don't write CET code.
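
A minimal sketch of putting the bytes in by hand, for a fasm 1 release older than the 1.73.19 mentioned earlier in the thread. The macro names simply mirror the official mnemonics. The encodings are F3 0F 1E FA for ENDBR64 and F3 0F 1E FB for ENDBR32.
Code:
macro endbr64 { db 0xF3,0x0F,0x1E,0xFA }
macro endbr32 { db 0xF3,0x0F,0x1E,0xFB }

use64
target:
        endbr64                         ; expands to the four bytes above
        ret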
Post 09 Sep 2020, 04:13

Tomasz Grysztar
alphis01 wrote:
Actually thats not true. I think there might be some misunderstanding of intel CET and its IBT functionality. The new opcodes are actually just NOPs on CPUs that don't have the new ISA. So its safe to have in current code.
While it is likely irrelevant to your case, it is worth noting that this is only safe on CPU generations that support the long NOP instructions. For example, my P5 machine does not. As the Intel manuals state, the multi-byte form of NOP is available on processors with model encoding 6 or F (the pattern of opcodes 0F18-0F1F made into reserved NOPs follows the patent that was filed in late 1995).

alphis01 wrote:
The main issue that I see is that since its designed to work with calls (including virtual calls)/jumptables/etc these new endbr instructions need to automatically be inserted if "cet mode" is enabled. I'm not entirely sure macros can actually cover all the necessary ground going the more manual route.
This is something that actually requires adding more semantic information to the code, for example using a special syntax to define a label that may be the target of an indirect jump. Because of the free-form character of assembly language (which is one of its strengths), it normally does not convey such structural information.

This is analogous to "proc" macro, which requires additional syntax wrapped around the assembly instructions to provide semantic information about the scope of procedures, which would be impossible to derive from a pure assembly. By using such structuring layers, a programmer also imposes constraints on the code - adding HLL concepts like "function boundaries" to assembly language that in its basic form does not have to conform to anything like that.

In a similar manner, to make the code conform to the rules of CET, you might also need to add new layers of structure to your code. For instance, normally the assembler would not be able to reliably tell which points of your code could be targets of indirect jumps - you need to provide such information in one form or another. It could be done by adding more macros structuring the code, or at least by making your code follow specific conventions.

For example, in pure assembly you would prepare a target for an indirect jump this way:
Code:
option:
        endbr64
        ; actual code after    
But if you opted for a more structured code, you could introduce "ibt" macro to define an "indirect branch target" as opposed to a simple label:
Code:
ibt option
        ; actual code after    
While in the base implementation the macro would only be adding a branch-terminating instruction, this approach also imbues the code with additional semantics, just like the "proc" macro. If in the future you need to do something more with such jump targets, you can extend the macro with additional functionality. On the other hand, if you kept using just plain labels, it would not be possible, because labels by themselves do not carry any more semantics than just marking a point in memory.
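
The post does not show the macro itself. A minimal sketch of what the base implementation of "ibt" could look like in fasm 1 (assuming a release that knows the ENDBR64 mnemonic) is:
Code:
macro ibt name
{
  name:                                 ; define the label itself
        endbr64                         ; and mark it as a valid indirect branch target
}
With this definition, "ibt option" assembles to exactly the same bytes as the pure assembly variant above, and any later extension of the macro applies to every target declared through it.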

Another option might be viable if your source code follows some specific conventions that would make it possible to derive information like "this label is going to be used as a target of an indirect branch" from existing constructions. But this would be a specialized solution, dependent on following the conventions strictly. And it would probably be better to use fasmg in that case, as it has macro facilities that are more capable for such purposes.
Post 09 Sep 2020, 09:24

Copyright © 1999-2020, Tomasz Grysztar.