Tutorial: adding custom instructions

Tomasz Grysztar

Joined: 16 Jun 2003
Posts: 8434
Location: Kraków, Poland

Tomasz Grysztar 27 Aug 2003, 17:05

This is the new version of this tutorial (the older ones are obsolete, since fasm's internals have changed a bit since they were written), and the first one I'm posting on this message board. I hope you'll find it useful, since I haven't managed to write the full internals documentation (that doesn't mean I won't finish it, but it will probably take a long time). Maybe I should also post a tutorial on porting fasm to other environments?

Example 1 - simple approach

Suppose we want a "bignop" instruction that takes no arguments and generates 7 bytes of value 90h. First step: add the name to the instruction table, so that fasm recognizes it. The name is 6 bytes long, so find the instructions_6 table (it's in the 'x86.inc' file, which contains the standard set of instructions) and add two new lines there (remember to keep the alphabetical order!):

Code:

db 'bignop',0
dw bignop_instruction-assembler

This defines our new instruction, with bignop_instruction as the handler procedure and zero as the additional parameter. Now, when fasm meets this instruction in the preprocessed and parsed source, it jumps to the handler (the bignop_instruction label) with the additional parameter in the AL register. So the only thing left is to write the handler. You may add it to "x86.inc", but the best solution is to create a new "custom.inc" file and put an "include 'custom.inc'" line somewhere in "x86.inc". If your editor can't handle text files larger than 64KB, just type the following command at the DOS prompt:

echo include 'custom.inc' >> x86.inc

Now create the "custom.inc" file, and put the bignop_instruction handler there:

Code:

bignop_instruction:
        mov     al,90h
        mov     ecx,7
        rep     stos byte [edi]
        jmp     instruction_assembled

This handler just generates 7 bytes of code, without reading any arguments, and then passes control back to the assembler. Every instruction handler should end with this jump.

Now recompile fasm and try the new instruction!
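A quick way to check the result is a tiny test source assembled with the rebuilt fasm. This is only a sketch: the file names test.asm and test.bin are arbitrary, and it assumes flat binary output (what fasm produces when no format directive is given), so the generated bytes can be inspected directly.

Code:

; test.asm - assemble with the rebuilt fasm:  fasm test.asm test.bin
        bignop                  ; the handler above stores 90h seven times

If the table entry and the handler are in place, test.bin should be exactly 7 bytes long, each of value 90h.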

Example 2 - argument processing

Now we will add the "varnop" instruction, which expects one argument: a number specifying the length of the instruction. Add its entry to the instructions_6 table just as for "bignop", this time pointing at varnop_instruction. The instruction handler is:

Code:

varnop_instruction:
        lods    byte [esi]
        cmp     al,'('
        jne     invalid_argument
        cmp     byte [esi],'.'
        je      invalid_value
        call    get_dword_value
        mov     ecx,eax
        mov     al,90h
        rep     stos byte [edi]
        jmp     instruction_assembled

The handler expects the argument at ESI, so it loads a byte and checks whether it marks a numeric expression (a "(" byte). If it doesn't, we jump to an error handler (look at "errors.inc" to see which errors can be reported; you can also add your own, it's simple). If the next byte is ".", the argument is a floating point number, which we don't want. Otherwise we call the "get_dword_value" procedure with ESI pointing to the first byte after the "(" character; it processes the whole expression and returns the resulting number in the EAX register. The handler then generates that many NOPs and exits.

There are many procedures that will process arguments for you; here is a list of the most useful ones:
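As with "bignop", a short test source shows what to expect. This is a sketch that assumes the "varnop" table entry and handler have been added as described; the file name is arbitrary.

Code:

; test2.asm - assemble with the rebuilt fasm
        varnop  3               ; plain number: 3 bytes of 90h
        varnop  2+2             ; the whole expression is evaluated: 4 bytes of 90h
; a floating point argument such as 1.5 should be rejected
; by the check for "." in the handler

The output should be 7 bytes of 90h in total.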

1. When the argument is a number, call one of the following procedures with ESI pointing to the first byte after "(":

get_byte_value - returns number in AL
get_word_value - returns number in AX
get_dword_value - returns number in EAX
get_pword_value - returns number in DX:EAX
get_qword_value - returns number in EDX:EAX
get_value - converts number of any type and returns it in EDX:EAX

2. When the argument is a register, the first byte at ESI is 10h. Load the second byte into AL and call one of:

convert_register - accepts only general purpose registers; sets AH to the size of the register (1, 2 or 4) and AL to the register code number.
convert_fpu_register - accepts only FPU registers; sets AH to 10 (the size of a single FPU register) and AL to the register code number.
convert_mmx_register - accepts only MMX registers; sets AH to the register size (8 or 16) and AL to the register code number.

These procedures also set the [argument_size] variable to the same value as AH. If [argument_size] is already set to something other than 0 and the sizes don't match, the error handler is called.
You can also process the second byte manually; the possible values are listed in the "symbols" table in the "x86.inc" file.

3. To process size overrides, call the get_size_operator procedure after loading the first byte of an argument into AL. If there is a size override, it sets [argument_size] to the proper value and loads the first byte of the next symbol into AL; otherwise it does nothing.

4. When the argument is a memory operand (the first byte is "["), call the get_address procedure with ESI pointing to the first byte after "[". It returns the address value in EDX, the base register code in BH, the index register code in BL, the index scale in CL, the address size override in CH and the segment register code in the [segment_register] variable. You can pass the unchanged BX, CX and EDX registers straight to the store_instruction procedure, with [base_code] set to the instruction code and [postbyte_register] set to the register code or instruction extension; the whole opcode is then generated and stored at EDI. If [base_code] is set to 0Fh, [extended_code] should contain the value of the second opcode byte.
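The memory case from item 4 might look roughly like the handler below. It is an untested sketch built only from the description above: the name incmem_instruction is hypothetical, and the opcode is the standard encoding of INC on a byte in memory (FE /0), chosen purely as an illustration.

Code:

incmem_instruction:                     ; hypothetical handler name
        lods    byte [esi]              ; first byte of the argument
        call    get_size_operator       ; handles an optional size override
        cmp     al,'['
        jne     invalid_argument        ; only memory operands are accepted here
        call    get_address             ; ESI points just after "["
        mov     [base_code],0FEh        ; opcode byte: INC r/m8
        mov     [postbyte_register],0   ; /0 instruction extension
        call    store_instruction       ; uses BX, CX and EDX from get_address
        jmp     instruction_assembled

A real handler would also check [argument_size] before choosing the opcode; that is left out here to keep the sketch short.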

To make the 16-bit version of an instruction (regardless of the "use16" or "use32" setting), call the operand_16bit_prefix procedure before generating the opcode. To make the 32-bit version, call operand_32bit_prefix.

Please look at the various instruction handlers in "x86.inc" for more complex examples.
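The register case from item 2 of the list can be sketched in the same spirit. The "regnop" instruction and the regnop_instruction label are made up for this illustration: the handler emits as many NOPs as the size of the general purpose register given as the argument, relying only on the behaviour of convert_register described above.

Code:

regnop_instruction:                     ; hypothetical handler name
        lods    byte [esi]
        cmp     al,10h                  ; 10h marks a register argument
        jne     invalid_argument
        lods    byte [esi]              ; second byte identifies the register
        call    convert_register        ; AH = register size, AL = register code
        movzx   ecx,ah                  ; use the size (1, 2 or 4) as the count
        mov     al,90h
        rep     stos byte [edi]
        jmp     instruction_assembled

With this handler and a matching table entry, "regnop ax" would be expected to produce 2 bytes of 90h and "regnop eax" 4 bytes.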

Example 3 - common handler

We can write a common handler for both of the above instructions, using the additional parameter field:

in "tables.inc":

Code:

db 'bignop',7
dw bignop_instruction-assembler

and

Code:

db 'varnop',0
dw bignop_instruction-assembler

in "custom.inc":

Code:

bignop_instruction:
        xor     ecx,ecx
        or      cl,al
        jnz     .store
        lods    byte [esi]
        cmp     al,'('
        jne     invalid_argument
        cmp     byte [esi],'.'
        je      invalid_value
        call    get_dword_value
        mov     ecx,eax
      .store:
        mov     al,90h
        rep     stos byte [edi]
        jmp     instruction_assembled

If the additional parameter is 0, the handler reads the count argument; otherwise it uses AL as the count.
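A small test source exercises both paths of the common handler. It is a sketch that assumes both table entries above are in place; the file name is arbitrary.

Code:

; test3.asm - assemble with the rebuilt fasm
        bignop                  ; parameter 7: count taken from AL, 7 bytes of 90h
        varnop  3               ; parameter 0: count read from the argument, 3 bytes of 90h

The output should be 10 bytes of 90h in total.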

Happy customizing!

scientica
Retired moderator

Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden

scientica 27 Aug 2003, 20:10

Privalov wrote:

This is the new version of this tutorial (as the older ones are obsolete, since fasm's internals have changed a bit since they were written), and the first which I'm posting on this messageboard. Hope you'll find it useful, as I failed with writing the full internals documentation (that doesn't mean I won't finish it, but it'll probably take a long time). Maybe I should post also some tutorial on porting fasm to other environments?

I got an idea: post the internals as tutorials like this, and then when you've covered 'enough' simply compile them. That way you don't have the burden of writing an entire "fasm under the hood" manual.
It would also be good because the documentation then has an easy overview: just stick these tutorials and it's easy to find the internals documentation one needs at the moment. OS developers simply select the porting tutorial, read on, port it, spread it and help more users see the beauty of fasm.

_________________
... a professor saying: "use this proprietary software to learn computer science" is the same as English professor handing you a copy of Shakespeare and saying: "use this book to learn Shakespeare without opening the book itself."
- Bradley Kuhn

pelaillo
Missing in inaction

Joined: 19 Jun 2003
Posts: 878
Location: Colombia

pelaillo 28 Aug 2003, 11:28

... and we (the growing users community) will fill them with well explained working examples in order to have the bare bones of your book on FASM.

Mac2004

Joined: 15 Dec 2003
Posts: 314

Mac2004 07 May 2004, 15:15

Hi pelaillo!!

You're absolutely right that 'the community' should help Tomasz with the fasm book.

regards
Mac2004

shutdownall

Joined: 02 Apr 2010
Posts: 517
Location: Munich

shutdownall 22 Nov 2011, 19:40

I just wanted to bump this thread, as it seemed to be lost on page 11 of Compiler Internals.
Maybe Tomasz has updates in the meantime, or there may be more tutorials here.
It would be nice to have a list of links to all the official documentation and/or tutorials.

shutdownall 22 Nov 2011, 19:48

I found one more thread here, but it's from 2004:

http://board.flatassembler.net/topic.php?t=60
