flat assembler
Message board for the users of flat assembler.

Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8344
Location: Kraków, Poland
Tomasz Grysztar 02 May 2017, 22:00
(Split off from the thread [fasmg.x86] the long road ahead.)

I think it is worth noting why I have not started working on a native implementation of the x86 instruction set myself. I was a bit overwhelmed by my own ideas. I knew that to include everything I wanted I would have to put in so much work that I was not even sure it would pay off. And fasmg by itself is enough to keep me busy for a long time, while I find writing macros for it very satisfying.

What I had in mind for the "fasm 2" instruction encoder (I only shared it with revolution back then, but we did not discuss it much) was to include many switchable settings. In fasm 1 there are just a few settings for the instruction encoder: USE16, USE32 and USE64, and the recently added USEAVX256 and USEAVX512. For a new encoder I wanted to have many more, including a choice of supported CPU lines / instruction sets, but also options for encoding instructions differently, like enforcing long immediates, or choosing a different "assembler fingerprint".

In this vision the options could be set globally (like USE32 does) or just for a single instruction by adding a "decorator" to the line. For example, there could be settings called "rmdst" and "rmsrc" to select whether an instruction with "reg,reg" operands should be encoded using the "r/m,reg" opcode or the "reg,r/m" one. It would be possible to select this option globally with a line like:
Code:
use rmdst    
but also it would be possible to select it just for a single instruction:
Code:
xor eax,ebx {rmsrc}    


Similarly, an instruction set could be selected for the whole source, but then a single instruction from another instruction set could be assembled using the local setting:
Code:
use i286
bsr ax,bx {i386}    
Some other examples of what I had in mind:
Code:
add eax,0 {imm32} ; like "add eax,dword 0" in fasm 1
loadall {i286} ; like loadall286 in fasm 1    

These are just to give a general idea of what my plans for fasm 2 were; the actual implementation could end up different. At the time when I first envisioned these ideas, braces were not used anywhere in the Intel assembly syntax. This changed with AVX-512, and perhaps if I were to implement these ideas nowadays, I would reconsider the syntactic choice. But I do not really see myself starting such a project anytime soon. Or perhaps I would start working on a set of macros for fasmg that would have it all.


Last edited by Tomasz Grysztar on 02 May 2018, 08:48; edited 1 time in total
Tomasz Grysztar 20 Jul 2017, 20:06
I have started working on prototype macros for an "advanced x86 encoder". I took my existing x64 macros for fasmg as a base and I'm adapting them to the "use" engine I created. The processor and mode selection may look like:
Code:
use i386
use32 ; or: use 32    
But multiple options may be specified in a single line, so the above can be shortened to:
Code:
use i386, 32    
So far I have converted only a few of the instruction macros, so there are no options for additional instruction sets and even the basic ones are only partially done. But some of the options can already be tested, like the "rmdst" and "rmsrc" I mentioned earlier. For the decorators altering options in a single line I chose the {* *} syntax (it can be easily altered in the "parse_operands" macro):
Code:
use i386, 32 

xor eax,ebx {*rmdst*}   ; 31 D8
xor eax,ebx {*rmsrc*}   ; 33 C3

add ecx,1 {*imm8*}      ; 83 C1 01
add ecx,1 {*imm32*}     ; 81 C1 01 00 00 00

use AMD64, 64

mov rax,1000 {*imm32*}  ; 48 C7 C0 E8 03 00 00
mov rax,1000 {*imm64*}  ; 48 B8 E8 03 00 00 00 00 00 00    
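The byte sequences in the comments above follow from the standard x86 opcode forms. As a rough illustration only (this is a Python model, not code from x86-2.inc, and all function names here are hypothetical), the choice between the two forms could be sketched as:

```python
# Illustrative model of the encoding choices; not taken from x86-2.inc.
REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
       "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def modrm(mod, reg, rm):
    """Pack the three ModR/M fields into one byte."""
    return (mod << 6) | (reg << 3) | rm

def encode_xor_reg_reg(dst, src, form):
    """XOR with two 32-bit register operands, in 32-bit mode."""
    d, s = REG[dst], REG[src]
    if form == "rmdst":
        # 31 /r = XOR r/m32, r32: destination goes into the r/m field
        return bytes([0x31, modrm(3, s, d)])
    if form == "rmsrc":
        # 33 /r = XOR r32, r/m32: source goes into the r/m field
        return bytes([0x33, modrm(3, d, s)])
    raise ValueError("unknown form: " + form)

def encode_add_reg_imm(dst, value, size):
    """ADD r32, imm in 32-bit mode, with a forced immediate size."""
    d = REG[dst]
    if size == "imm8":
        # 83 /0 ib: sign-extended 8-bit immediate
        return bytes([0x83, modrm(3, 0, d)]) + value.to_bytes(1, "little", signed=True)
    if size == "imm32":
        # 81 /0 id: full 32-bit immediate
        return bytes([0x81, modrm(3, 0, d)]) + (value & 0xFFFFFFFF).to_bytes(4, "little")
    raise ValueError("unknown size: " + size)

def encode_mov_rax_imm(value, size):
    """MOV rax, imm in 64-bit mode, with a forced immediate size."""
    if size == "imm32":
        # REX.W C7 /0 id: 32-bit immediate sign-extended to 64 bits
        return bytes([0x48, 0xC7, modrm(3, 0, 0)]) + value.to_bytes(4, "little", signed=True)
    if size == "imm64":
        # REX.W B8+rd io: full 64-bit immediate
        return bytes([0x48, 0xB8]) + (value & 0xFFFFFFFFFFFFFFFF).to_bytes(8, "little")
    raise ValueError("unknown size: " + size)

print(encode_xor_reg_reg("eax", "ebx", "rmdst").hex())
print(encode_xor_reg_reg("eax", "ebx", "rmsrc").hex())
```

The model only covers the register and immediate cases shown in the example; memory operands, prefixes and other operand sizes are left out.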
The same settings that can be set with USE can also be specified in decorators, and vice versa. The difference is that a decorator alters settings only for a single line, while USE switches them semi-permanently (until they are changed again by another USE command):
Code:
use i186

mov eax,ebx {*i386*}
mov eax,ebx ; error    
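The scoping rule can be pictured with a small Python model. This is only a sketch under assumptions: the class, its method names and the linear ordering of CPU levels are hypothetical and are not how the macros are actually written.

```python
# Hypothetical model of USE versus decorator scoping; not from x86-2.inc.
# Assumption: CPU levels form a simple ordered list.
CPU_LEVELS = ["i186", "i286", "i386"]

class SettingsModel:
    def __init__(self):
        self.current = {"cpu": "i186"}

    def use(self, **options):
        """Semi-permanent change, like a USE line."""
        self.current.update(options)

    def check(self, required_cpu, **decorator):
        """Validate one instruction; decorator options apply to this call only."""
        effective = {**self.current, **decorator}
        if CPU_LEVELS.index(effective["cpu"]) < CPU_LEVELS.index(required_cpu):
            raise ValueError("instruction requires " + required_cpu)
        return effective

model = SettingsModel()
model.use(cpu="i186")
model.check("i386", cpu="i386")    # like: mov eax,ebx {*i386*}
try:
    model.check("i386")            # like: mov eax,ebx (no decorator)
except ValueError as err:
    print("error:", err)
```

The decorator is merged into a temporary copy of the settings, so the value selected by `use` is untouched when the next line is processed.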


Description: Advanced x86 encoder in form of fasmg macros - early prototype
Filename: x86-2.inc
Filesize: 98.24 KB
Downloaded: 1351 Time(s)



Last edited by Tomasz Grysztar on 31 Jul 2017, 08:39; edited 1 time in total
Tomasz Grysztar 18 Dec 2017, 14:47
I have another idea: by moving the decorator to the beginning of the line instead, I could still use simple braces without a risk of clashing with Intel syntax, because the decorator would always precede the instruction mnemonic and would not have any overlap with operands.

This would also make the instruction handlers potentially simpler, as it would be easier to parse the decorator before parsing the instruction, and the instruction would be parsed with complete knowledge of the applied settings. This would make the instruction handler implementation more similar to how it is done in fasm 1.

I'm attaching the modified version of X86-2.INC for fasmg that simulates this variant. Example code for this version looks like:
Code:
use i386, 32

{rmdst} xor eax,ebx     ; 31 D8
{rmsrc} xor eax,ebx     ; 33 C3

{imm8}  add ecx,1       ; 83 C1 01
{imm32} add ecx,1       ; 81 C1 01 00 00 00

use AMD64, 64 

{imm32} mov rax,1000    ; 48 C7 C0 E8 03 00 00
{imm64} mov rax,1000    ; 48 B8 E8 03 00 00 00 00 00 00    
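To illustrate why a leading decorator is easy to separate from the instruction, here is a minimal Python sketch. It is not the fasmg implementation; the function name is hypothetical, and the comma-separated list of options inside one pair of braces is an assumption made by analogy with the USE syntax.

```python
import re

# A decorator is recognized only at the very start of the line.
DECORATOR = re.compile(r"^\s*\{([^{}]*)\}\s*(.*)$")

def split_decorator(line):
    """Return (options, instruction) for one source line, comment removed."""
    code = line.split(";", 1)[0].strip()
    match = DECORATOR.match(code)
    if not match:
        return [], code
    # Assumption: several options may be listed, separated by commas.
    options = [o.strip() for o in match.group(1).split(",") if o.strip()]
    return options, match.group(2)

print(split_decorator("{imm32} add ecx,1       ; 81 C1 01 00 00 00"))
print(split_decorator("vaddps zmm1{k1}{z},zmm2,zmm3"))
```

Because the pattern is anchored at the start of the line, the braces of AVX-512 masking operands in the second line are left alone and reach the instruction handler unchanged.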


Description: Advanced x86 encoder in form of fasmg macros - another prototype (updated: 2018-05-10)
Filename: x86-2.inc
Filesize: 98.09 KB
Downloaded: 1261 Time(s)

Tomasz Grysztar 29 Jan 2023, 15:38
Years later, I'm finally revisiting this idea, now that fasmg has evolved enough for it to become more viable. CALM makes everything faster, and the new "??" interceptor makes it possible to handle the annotation syntax without affecting the base performance of non-annotated instructions.

I'm attaching a new prototype. The syntax is fully compatible with the previous one: the example code from the above post assembles the same (and it can be combined with IEV.ALM to verify the output of each instruction). Defining and testing groups of settings is now much simpler, though.

Keep in mind that since this is only a proof of concept, many instruction sets and variations of settings are not completely implemented. It is incrementally evolving towards the old vision of fasm 2, but it is not there yet. If you would like to actually start using it, please let me know.


Description: Advanced x86 encoder reimplemented with help of CALM
Filename: x86-2.inc
Filesize: 95.82 KB
Downloaded: 220 Time(s)

Tomasz Grysztar 31 Jan 2023, 15:51
Being quite happy with how it turned out, I'm publishing the latest version on GitHub: https://github.com/tgrysztar/fasmg/tree/master/packages/x86-2
While this is still in the "research" phase, it is already functional. You should be able to use it as a replacement for the basic encoder, although the range of supported instruction sets is limited.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.