flat assembler
Message board for the users of flat assembler.

Compiler Internals: registers algebra engine
ouadji



Joined: 24 Dec 2008
Posts: 1081
Location: Belgium
Code:
cmp ax,[toto + (ebp*2)]
    
Result with fasm 1.69.14:
Code:
66 3B 84 2D xx xx xx xx ; xx xx xx xx = toto
But the bytes above encode cmp ax,[toto + ebp + ebp]!
How do I really get this:
Code:
cmp ax,[toto + (ebp*2)]   ; opcode: 66 3B 04 6D xx xx xx xx
Is this possible with fasm?



When I ask for "[(ebp*2) + displacement]",
fasm doesn't build the right opcode; it builds "[ebp + ebp + displacement]" instead.

Where is the problem?
With [(ebp*2) + X] the default segment is DS,
and with [ebp + ebp + X] the default segment is SS! (Intel vol. 1, section 3.7.5; see also figure 3.11.)
If I want to use the DS segment, I am forced to add the "DS:" prefix,
whereas normally it would not be necessary.
An assembler must scrupulously emit the instruction requested;
that is not the case here.

_________________
I am not young enough to know everything (Oscar Wilde)


Last edited by ouadji on 25 Jun 2010, 19:46; edited 6 times in total
Post 16 Jun 2010, 16:17
vid
Verbosity in development
Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
Obviously: use DB if you care about a particular encoding.
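A minimal sketch of that DB approach, assuming 32-bit code. The value given to toto is purely illustrative and stands in for the real address; the opcode bytes are the ones quoted in the first post.

```asm
use32
toto = 12345678h            ; illustrative value only, stands in for the real address

; cmp ax,[toto + ebp*2] with EBP as scaled index and no base register
db 66h                      ; operand-size prefix (16-bit operand in 32-bit code)
db 3Bh                      ; CMP r, r/m
db 04h                      ; ModR/M: mod=00, reg=000 (AX), r/m=100 (SIB follows)
db 6Dh                      ; SIB: scale=01 (*2), index=101 (EBP), base=101 (no base, disp32)
dd toto                     ; 32-bit displacement
```

The cost is readability: the instruction no longer appears as a mnemonic in the source, so a comment with the intended form is worth keeping next to the bytes.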
Post 16 Jun 2010, 17:33
edemko
Joined: 18 Jul 2009
Posts: 549
Hi ouadji,
a scale factor (2 in your case) may be used when an index register has been specified.
Refer to this document, p. 90:
Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture.
http://www.intel.com/Assets/PDF/manual/253665.pdf
Post 16 Jun 2010, 17:38
ouadji
Joined: 24 Dec 2008
Posts: 1081
Location: Belgium
edemko, yes, of course...
but the index is fully specified: cmp ax,[toto + (ebp*2)].
This is not a question of whether the encoding exists (the encoding is valid);
it's a question of what fasm actually assembles.

Post 16 Jun 2010, 19:35
edemko
Joined: 18 Jul 2009
Posts: 549
TG likes surprises
Code:
;lea eax,[eax+eax*8] ;OllyDbg's assemble window autotransforms it too
lea     eax,[eax*9]
;lea eax,[eax*4+8]; OllyDbg's assemble window fails
lea     eax,[4*(eax+2)]
    
Post 16 Jun 2010, 22:21
ouadji
Joined: 24 Dec 2008
Posts: 1081
Location: Belgium
Code:
A) cmp ax,[toto + (ebp*2)] ;DS segment

and

B) cmp ax,[toto + ebp + ebp] ;SS segment


"A" and "B" are not equivalent!

"A": DS is the default segment
"B": SS is the default segment

To make "A" and "B" equivalent, one needs this:

Code:
cmp ax,[ds:toto + ebp+ ebp]    
but that costs one more byte!

Sorry, but it's a bug!

I write this: cmp ax,["DS:" toto+(ebp*2)]
and I get this: cmp ax,["SS:" toto+ebp+ebp]


It's not the same result.

OllyDbg's assemble window fails? ...
lea eax,[eax*9] ... Try Syser, the best one!

Post 16 Jun 2010, 22:42
edemko
Joined: 18 Jul 2009
Posts: 549
if toto = 1 then
Code:
CPU Disasm
Address    Hex dump             Command                             Comments
<ModuleEnt     66:3B442D 01     cmp     ax,[word ss:ebp+ebp+1]
00401005       66:3B442D 01     cmp     ax,[word ss:ebp+ebp+1]
0040100A    .  C3               retn
    
Post 16 Jun 2010, 23:01
edemko
Joined: 18 Jul 2009
Posts: 549
btw "otot" (reversed "toto") means "that" in a Slavic dialect
also, "toto" sounds French
anyway, I'll stop arguing and go read some
Post 16 Jun 2010, 23:03
ouadji
Joined: 24 Dec 2008
Posts: 1081
Location: Belgium

toto, glop, gloup, ... my favorite names for variables.
And you?

Post 16 Jun 2010, 23:09
LocoDelAssembly
Your code has a bug
Joined: 06 May 2005
Posts: 4633
Location: Argentina
Where is the bug? Why do you say that A) uses the DS segment, when the processor is supposed to switch to SS whenever a reference to (R|E){BP, SP} is detected?
Post 16 Jun 2010, 23:18
ouadji
Joined: 24 Dec 2008
Posts: 1081
Location: Belgium


Quote:
Why you say that A) uses DS segment when the processor is supposed to switch
to SS when a reference to (R|E){BP, SP} is detected?
Yes, but only if "ebp" is used as the base, not as the index.

Intel vol. 1, section 3.7.5 (see also figure 3.11):

When the ESP or EBP register is used as the base,
the SS segment is the default segment.
In all other cases, the DS segment is the default segment.

[no base + (ebp*2) + toto] = DS
[ebp + ebp + toto] = SS
(the second ebp == ebp*1, the index)

Inside "(ebp*2)" we have index (ebp) * scale (2); "ebp" is not the base.
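To make the base/index distinction concrete, here is a sketch that builds the ModR/M and SIB bytes of both forms from their bit fields (mod:2 reg:3 r/m:3 and scale:2 index:3 base:3). The constant names and the value of toto are made up for illustration; the resulting bytes are the ones quoted earlier in the thread.

```asm
use32
toto = 12345678h                    ; illustrative value only

REG_AX = 000b                       ; reg field for AX
RM_SIB = 100b                       ; r/m=100: a SIB byte follows
R_EBP  = 101b                       ; register number of EBP

; A) [toto + ebp*2]: mod=00 with base=101 means "no base, disp32", default segment DS
db 66h, 3Bh
db (00b shl 6) or (REG_AX shl 3) or RM_SIB      ; ModR/M = 04h
db (01b shl 6) or (R_EBP shl 3) or R_EBP        ; SIB    = 6Dh (scale *2, index EBP, no base)
dd toto

; B) [toto + ebp + ebp]: mod=10 with base=101 means "base EBP, disp32", default segment SS
db 66h, 3Bh
db (10b shl 6) or (REG_AX shl 3) or RM_SIB      ; ModR/M = 84h
db (00b shl 6) or (R_EBP shl 3) or R_EBP        ; SIB    = 2Dh (scale *1, index EBP, base EBP)
dd toto
```

The only thing that flips the default segment is whether the base field really designates EBP, which depends on the mod field and not on the index.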

Last edited by ouadji on 16 Jun 2010, 23:46; edited 7 times in total
Post 16 Jun 2010, 23:22
edemko
Joined: 18 Jul 2009
Posts: 549
at ouadji's request the topic is going to be made sticky, hence the small letters

lately I prefix local names with a dot to show they are local: .a, .b
when I worked with Delphi, type prefixes were used: dwThis, dwThat
I like that fasm supports native languages, so a variable's name can be descriptive
still, I avoid such national naming; I use it only sometimes, in demos for Russian friends
Tomasz uses var_parts_etc; I like it, it saves time
VarParts_Etc <- oh, I'm too lazy to press SHIFT every time
if you have seen Borland's sources, most of the inline asm there was in CAPITALS: TEST [ME]
I can spend 6 hours reworking a proc to fit all the variables into registers
I'm also a big off-topic man; still, such things are needed sometimes
there are many funny constants like 0xDEADCODE: http://board.flatassembler.net/download.php?id=3636
there is DOS386, whose utterances make me smile


Last edited by edemko on 16 Jun 2010, 23:51; edited 1 time in total
Post 16 Jun 2010, 23:28
LocoDelAssembly
Your code has a bug
Joined: 06 May 2005
Posts: 4633
Location: Argentina
ouadji, I see the problem, and it is best seen with this simple test:
Code:
toto = 1
int3
cmp eax,[toto + ebp*2] ; SS segment, but could be DS if SIB with MOD=00 were used
cmp eax,[toto + ebp + ebp] ;SS segment
cmp eax,[toto + ebp*4] ;DS segment    
Code:
00401000 > CC               INT3
00401001   3B442D 01        CMP EAX,DWORD PTR SS:[EBP+EBP+1]
00401005   3B442D 01        CMP EAX,DWORD PTR SS:[EBP+EBP+1]
00401009   3B04AD 01000000  CMP EAX,DWORD PTR DS:[EBP*4+1]    


But note that the encoding favoring DS needs a disp32 instead of disp8 even when "toto" is small.
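For toto = 1, the candidate encodings can be laid side by side. This is a hand-written sketch using db, with the first line taken from the disassembly above and the others derived from it (3Eh is the DS override prefix); it is not output captured from fasm.

```asm
use32
db 3Bh, 44h, 2Dh, 01h               ; cmp eax,[ebp+ebp+1]     SS default, disp8:  4 bytes
db 3Eh, 3Bh, 44h, 2Dh, 01h          ; cmp eax,[ds:ebp+ebp+1]  DS via prefix 3Eh:  5 bytes
db 3Bh, 04h, 6Dh                    ; cmp eax,[ebp*2+1]       DS default, no base
dd 1                                ;   disp32 is mandatory without a base:       7 bytes
```

So with a small displacement the prefixed form is actually shorter than the prefix-free one; the no-base form only wins on size once the displacement needs 32 bits anyway.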

I think changing this would cause some trouble, but still, I suggest you edit your first post to add more explanation about this bug, so I can make the thread sticky.
Post 16 Jun 2010, 23:45
ouadji
Joined: 24 Dec 2008
Posts: 1081
Location: Belgium

Quote:
cmp eax,[toto + ebp*2] ; SS segment
That's false!

It is fasm that transforms [toto + (ebp*2)] into [toto + ebp + ebp].
These two forms both exist and are different!
[toto + (ebp*2)] == DS
[toto + ebp + ebp] == SS

But fasm replaces "(ebp*2)" with "(ebp + ebp)"!

A) cmp ax,[(ebp*2) + toto] == 66 3B 04 6D toto
B) cmp ax,[ebp + ebp + toto] == 66 3B 84 2D toto

If I ask for "A", fasm gives me "B"!


Post 16 Jun 2010, 23:57
LocoDelAssembly
Your code has a bug
Joined: 06 May 2005
Posts: 4633
Location: Argentina
Quote:

it's false !!!

it's "FASM" which transforms [toto + (ebp*2)] in [toto + ebp + ebp]
fasm is supposed to apply some algebra to perform some optimizations and even make some expressions (like EBP*3) compilable. In this case, however, it is changing the default segment as well, probably something overlooked by Tomasz (or not? maybe flat assembler has something to do with this?).

Please note that I already agreed this is a problem and that something must be done. But I need you to update your first post, because it provides no information about the issue: it only complains about the inability to force [ebp*2], without mentioning the consequences of not allowing this.

IMHO, if this is fixed, the best approach would be for fasm, when given [ds:ebp+ebp+disp32], to check whether some ModR/M+SIB combination exists that avoids the override prefix ([EBP*2+disp32] here), and, in cases where the displacement would fit in a disp8, to use [ds:ebp*2+dword disp8] so the prefix is still avoided at the expense of a larger encoding. Strictly following the expression as written could probably lead to spec violations and even slower instructions (at least my CPU takes two cycles for LEA when an index is scaled, instead of just one).
Post 17 Jun 2010, 01:34
ouadji
Joined: 24 Dec 2008
Posts: 1081
Location: Belgium
Quote:
Please note that I already agreed this is a problem
thank you (it seems obvious)
Quote:
I need you to update your first post as it is providing no information about this issue
ok, it's done
Quote:
probably something overlooked by Tomasz, or not ?
if "not", it's an obvious limitation of fasm

Post 17 Jun 2010, 06:29
Tomasz Grysztar
Assembly Artist
Joined: 16 Jun 2003
Posts: 7724
Location: Kraków, Poland
LocoDelAssembly wrote:
IMHO, if this is fixed, I think the best will be to change [ds:ebp+ebp+disp32] to check if some ModR/M+SIB exists that avoids using the override prefix ([EBP*2+disp32] here).
I agree. This would be a fix for just an optimization issue (that is: optimizing [ds:ebp+ebp] into a form without the prefix, shorter than the one generated now).

LocoDelAssembly wrote:
In this case however, it is changing the default segment as well, probably something overlooked by Tomasz (or not? maybe flat assembler has something to do with this? Wink).
Yes, it was overlooked. Because of the register algebra it performs, fasm is unable to distinguish between reg+reg and reg*2. I overlooked that this causes a problem with the default segment register being inconsistent. The unfortunate consequence is that, in fasm, [ebp+ebp] and [ebp*2] are ambiguous formulations. If you work in flat mode (where DS=SS) you need not worry about it and can just use the ambiguous forms, knowing that no segment prefix will be emitted. But where the segment matters you have to use a segment prefix, and right now fasm sometimes doesn't optimize it well (the aforementioned [ds:ebp+ebp] case).
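In source terms, that advice amounts to writing the segment out whenever it matters. A short sketch in fasm syntax, assuming 32-bit code and an illustrative value for toto; the comments restate the behaviour reported in this thread for fasm 1.69.14, and other versions may differ.

```asm
use32
toto = 1                    ; illustrative value only

cmp ax,[toto+ebp*2]         ; reported above to assemble as [toto+ebp+ebp], so SS is the default
cmp ax,[ds:toto+ebp*2]      ; DS requested explicitly; right segment, possibly via a 3Eh prefix
cmp ax,[ss:toto+ebp+ebp]    ; SS requested explicitly
```

In flat mode the first line is enough, since DS and SS address the same memory.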
Post 17 Jun 2010, 08:43
edemko
Joined: 18 Jul 2009
Posts: 549
Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 2A: Instruction Set Reference, A-M
http://www.intel.com/Assets/PDF/manual/253666.pdf
section 2.1(pg.33): Chapter 2 Instruction Format
Post 17 Jun 2010, 11:14
ouadji
Joined: 24 Dec 2008
Posts: 1081
Location: Belgium

Is it possible to remove this limitation in the next version?
(thank you Tomasz)


It would be great to get the opcode that was really requested.
If one asks for "(reg*2)", one should get "(reg*2)" and nothing else.
The same goes for "(reg+reg)".

Post 17 Jun 2010, 11:58
ouadji
Joined: 24 Dec 2008
Posts: 1081
Location: Belgium
Code:
inc dword [ebp+ebp+ebp+ebp+ebp]
    
This is not rigorous and doesn't respect the spirit of assembly.
fasm accepts the above, but does not generate the correct code for this:
Code:
inc dword [ebp*2]
    
The registers algebra engine isn't good.
fasm can't afford this kind of interpretation and approximation.

Sorry, but:
Code:
ebp*2 != ebp+ebp

and

[ebp+ebp+ebp+ebp+ebp] : is not an allowed addressing mode.
    
I think this engine (the registers algebra) should be redesigned and rebuilt!
Code:
EXPRESSI.INC
--------------
sib_allowed:
        or      bh,bh
        jnz     check_index_scale

        cmp     cl,2                    ; ebp*2 => ebp+ebp
        je      special_index_scale     ; "a small part" of the problem

        cmp     cl,3
        je      special_index_scale
    
"2" isn't a special index. "3","5","9" yes, but not "2".



Last edited by ouadji on 25 Jun 2010, 18:22; edited 1 time in total
Post 25 Jun 2010, 12:03
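On the remark in the last post that 3, 5 and 9 are special while 2 is not: the SIB scale field can only express 1, 2, 4 and 8, so an odd multiplier exists only as base + index*scale with the same register in both roles. A hand-encoded sketch with LEA (where the default segment does not matter); the bytes are worked out by hand from the SIB layout, not taken from fasm output.

```asm
use32
db 8Dh, 04h, 40h            ; lea eax,[eax+eax*2] = eax*3  (SIB 40h: scale *2, index EAX, base EAX)
db 8Dh, 04h, 80h            ; lea eax,[eax+eax*4] = eax*5  (SIB 80h: scale *4)
db 8Dh, 04h, 0C0h           ; lea eax,[eax+eax*8] = eax*9  (SIB C0h: scale *8)

; a multiplier of 2 is directly encodable as a scale, with no base register
db 8Dh, 04h, 45h            ; lea eax,[eax*2+0]            (SIB 45h: scale *2, index EAX, no base)
dd 0                        ;   disp32 is mandatory without a base
```

A multiplier of 2 is the one case with two genuinely different encodings (reg+reg and reg*2), which is where the segment ambiguity discussed in this thread comes from.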
Copyright © 1999-2020, Tomasz Grysztar.