flat assembler
Message board for the users of flat assembler.

[fasmg] polynomials - how to distinguish between A+A and A*2

zhak 09 Mar 2017, 21:48
Tomasz, I applied some of the ideas from your x86 instruction set macros to parsing memory operands, for example using polynomials:
Code:
repeat elementsof addr
  meta = addr metadata %
  if meta relativeto x86.reg32
    expr = expr + addr element % * addr scale %
  end if
end repeat
    


However, there's a problem. Let's take two 32-bit instructions:
Code:
inc byte [eax + eax]
inc byte [eax * 2]
    

They do the same thing, but their encodings differ in the ModRM/SIB part:
Code:
[eax + eax] => 0x04 0x00
[eax * 2]   => 0x04 0x45 0x00000000
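
The two byte sequences above can be reproduced from the ModRM and SIB bit fields. The following is only an illustrative sketch: the variable names are made up, and the field values assume the standard 32-bit addressing encoding with INC r/m8 encoded as FE /0.

```fasmg
; ModRM: mod = 00, reg = /0 (the INC opcode extension), rm = 100 (a SIB byte follows)
modrm = (00b shl 6) + (000b shl 3) + 100b

; [eax + eax]: scale = 00 (x1), index = 000 (eax), base = 000 (eax)
sib_sum = (00b shl 6) + (000b shl 3) + 000b

; [eax * 2]: scale = 01 (x2), index = 000 (eax), base = 101 (no base, disp32 follows)
sib_scaled = (01b shl 6) + (000b shl 3) + 101b

assert modrm = 04h & sib_sum = 00h & sib_scaled = 45h
```

With no base register the SIB form requires a 32-bit displacement, which is where the trailing 0x00000000 comes from: the instruction is 7 bytes long instead of 3.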
    

I'm afraid there's no way to distinguish between A + A and A * 2 in an evaluated expression, am I right? The only way I can see to achieve it is by `match`ing, but I would like to avoid that for performance reasons. The easiest option would, of course, be to always use the A + A form, but I'd like the macros to produce what the user was trying to achieve.

Tomasz Grysztar 09 Mar 2017, 22:18
Yes, there is no way to distinguish them, and you have to use explicit MATCH if you need to differentiate them.
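
A minimal sketch of why the two spellings cannot be told apart after evaluation. The names reg32, summed and scaled are hypothetical stand-ins, not the definitions used by the actual x86 macros.

```fasmg
; Hypothetical stand-ins for the register definitions of the x86 macros.
element reg32
element eax? : reg32 + 0

summed = eax + eax
scaled = eax * 2

; Each value holds a single term: the element eax with scale 2.
assert elementsof summed = 1 & summed scale 1 = 2
assert elementsof scaled = 1 & scaled scale 1 = 2
; Their difference is the plain number 0, so no trace of the spelling is left.
assert summed relativeto scaled & summed - scaled = 0
```

If all three assertions hold, the assembler keeps only the simplified polynomial, and anything that depends on how the operand was written has to look at the source text instead.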

On a side note: the (much simpler) implementation of register algebra in fasm 1 also conflated these variants into the same result.

Depending on your constraints you could also try some other tricks. Let me think a moment...

Tomasz Grysztar 09 Mar 2017, 22:30
One peculiar trick that comes to my mind is to define registers as symbolic values to break the algebraic symmetry:
Code:
eax equ ea1+ea2

element ea1 : 'ea1'
element ea2 : 'ea2'

macro showpoly value
        local tmp
        tmp = value
        repeat 1, a : tmp scale 0
                display `a
        end repeat
        repeat elementsof tmp
                display '+',string tmp metadata %,'*'
                repeat 1, a: tmp scale %
                        display `a
                end repeat
        end repeat
        display 13,10
end macro

showpoly eax+eax        ; ea1+ea2+ea1+ea2
showpoly eax*2          ; ea1+ea2*2
showpoly 2*eax          ; ea1*2+ea2    
Whether this is a viable solution depends on your macro framework.
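
One possible way to use the split registers, sketched under the assumption that only a single register appears in the expression: compare the scales of the two halves. The names summed and scaled are illustrative only.

```fasmg
; Same definitions as in the post above.
eax equ ea1+ea2

element ea1 : 'ea1'
element ea2 : 'ea2'

; The text expands to ea1+ea2+ea1+ea2, which evaluates to 2*ea1 + 2*ea2.
summed = eax+eax
; The text expands to ea1+ea2*2, which evaluates to 1*ea1 + 2*ea2.
scaled = eax*2

; Only equality of the two scales is tested, so the order of terms does not matter.
assert elementsof summed = 2 & summed scale 1 = summed scale 2
assert elementsof scaled = 2 & scaled scale 1 <> scaled scale 2
```

Equal scales would then mean the register was only ever added as a whole, while unequal scales would reveal an explicit multiplication written next to the register name.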

zhak 09 Mar 2017, 23:07
Thanks for the idea! Interesting, but pretty complex. More importantly, it turns out I was worried about matching for nothing: one additional `match` (and one `if`) is enough. If you already know that expr scale 2 = 0 & expr scale 1 = 2 for the expression Scale * Index + Base + Displacement (which I compute earlier), then a single match a*b, addr is enough to solve this case.
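
A rough sketch of that idea, not the actual macro from the project. The macro name classify and the element reg32 are hypothetical, and the test `elementsof expr = 1` is used here in place of `expr scale 2 = 0` as an assumption about how to express "there is only one register term".

```fasmg
; Hypothetical stand-ins for the register definitions.
element reg32
element eax? : reg32 + 0

macro classify addr&
        local expr
        expr = addr
        ; Run the textual match only when the evaluated result is a lone register with scale 2.
        if elementsof expr = 1 & expr scale 1 = 2
                match a*b, addr
                        display 'written with *', 13, 10
                else
                        display 'written without *', 13, 10
                end match
        end if
end macro

classify eax*2
classify eax+eax
```

The `if` keeps the cost of `match` away from every operand that cannot be affected, which is the performance concern raised in the first post.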

revolution 10 Mar 2017, 00:38
Matching is fraught with problems.
Code:
mov eax,[eax+eax+4*2]
mov eax,[4+4+eax*2]    
There are so many ways to form valid expressions.
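
To illustrate the point with a sketch (reg32, first and second are hypothetical names): both operands evaluate to the same polynomial, yet a naive `match a*b` also fires on the first one, where the `*` belongs to the displacement.

```fasmg
; Hypothetical stand-ins for the register definitions.
element reg32
element eax? : reg32 + 0

first  = eax+eax+4*2
second = 4+4+eax*2

; Same register term (scale 2) and same displacement (8) in both cases.
assert first relativeto second & first - second = 0
assert first scale 0 = 8 & first scale 1 = 2

match a*b, eax+eax+4*2
        ; Here a is eax+eax+4 and b is 2, although the register is never multiplied in the text.
        display 'naive match fired', 13, 10
end match
```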

zhak 10 Mar 2017, 08:13
Indeed, you're right. I missed the first case.

Tomasz Grysztar 10 Mar 2017, 09:24
I'd think that you'd need to recognize just a few "special" expressions, since only then one could really expect to get a strictly defined result. That is: you could expect the address to be written exactly as "base+index*scale+displacement" or "index*scale+displacement" and only then enforce the strictly corresponding encoding (this is a bit similar to how the AT&T syntax in GAS handles it), while for any other free-form expression a good fail-safe would be to evaluate it algebraically and then optimize the instruction output just like fasm 1 does it.

If you like your match a*b, addr solution, then you could additionally check whether "a" is a register. If we agree that only "eax*2" should be treated specially, while variants such as "(eax+0)*2" should be evaluated and optimized, you'd need to detect whether something is exactly a token corresponding to a register. I have at least one trick that can do it:
Code:
__eax equ **

match a*b, addr
    match **, __#a
        ; reached only when "a" is exactly the token eax
    end match
end match