flat assembler
Message board for the users of flat assembler.

Index > Compiler Internals > "db (-1) shr 1" results into "value out of ra

edemko



Joined: 18 Jul 2009
Posts: 549
edemko 04 Sep 2010, 19:20
[EDIT BY LOCO]I had to add this text to be able to edit this post to make it sticky. BTW, how did you make the forum allow you to post a blank post???[/EDIT]
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 04 Sep 2010, 19:52
Code:
rb (-1) shr X    


X = 0 <= Error: Invalid value [OK]
X = [1, 31] <= Error: Value out of range [MAYBE OK]
X = 32 <= Error: Invalid value [SHOULD BE SAME MESSAGE AS ABOVE]
X = [33,64] <= Either out of memory or success but the out of memory is OK

Although it would be great to have a consistent error message, it is not really a big problem: the maximum attainable RB would be somewhere near 2 GB (or 3 GB on 3G/1G systems, and maybe 4 GB on 64-bit OSes??), since fasm is a 32-bit program and does not employ AWE to extend to 64 GB of memory.
LocoDelAssembly 04 Sep 2010, 19:58
http://msdn.microsoft.com/en-us/library/aa366778%28VS.85%29.aspx

So yes, 64-bit systems would allow a 4 GB address space when the large-address-aware flag is set. X = 32 should then behave the same as X = [33,64].
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8351
Location: Kraków, Poland
Tomasz Grysztar 04 Sep 2010, 20:04
Code:
db (-1) shr 1    

Right, that's something I forgot to finish a long time ago. I'm going to fix it now. The "xor" operation was not finished either (only "not" works correctly in the current version).

LocoDelAssembly: yours is something completely different.
LocoDelAssembly 04 Sep 2010, 20:39
Mmmhh, I see, but I thought I copy&pasted edemko's code :? (maybe it was changed while I was replying??)

Anyway, what's the problem that needs fixing here? You mean that shr should be arithmetic shift right?
Tomasz Grysztar 04 Sep 2010, 21:16
LocoDelAssembly wrote:
Anyway, what's the problem that needs fixing here? You mean that shr should be arithmetic shift right?

When the value has a size explicitly stated (as in "a = byte -1" or "db -1"), fasm does its calculations in such a way that the result is as if the operations were performed on data of that size (unless there was some "overflow" during the calculations, but that's another story).

For example, to yield correct results for such pieces of code:
Code:
db not 80h
dw not 8000h
dd not 80000000h    
fasm has to calculate the value according to the size specified by the context; otherwise you'd get an "out of range" error.

However, I forgot to finish this feature, namely for the "xor" and "shr" operations. The cases where context-dependent calculations are needed with them are very rare, though, which is why it went unnoticed for a long time.
edemko 04 Sep 2010, 21:22
[quote=Loco]
how did you made the forum allow you to post a blank post?
[/quote]
Empty line using <Enter>.
[quote=Loco]
Mmmhh, I see but I thought I copy&pasted edemko's code (maybe it was changed while I was replying??)
[/quote]
No, you pasted authentic code :)
[quote=Tomasz]
fasm has to calculate the value accordingly to the size specified by context
[/quote]
I thought fasm operated on qwords.
LocoDelAssembly 04 Sep 2010, 21:44
Oh, I see why I couldn't do that here: only the JavaScript code prevents it; the server-side code does not complain about blank posts at all.

Small clarification: I meant that I pasted your code into my FASMW; in the forum post I obviously introduced the X variable :P
Tomasz Grysztar 04 Sep 2010, 21:46
edemko wrote:
I thought fasm operated on qwords.
It does; however, it applies some special handling when there is a context-specified size for the value.

I have these problems (including "rb" error messages) fixed in 1.69.19, but it may take some time with the upload, as I have some connection problems right now.


PS. That fasm's expression calculator operates on 64-bit values is in fact a serious flaw - it should operate on 65-bit ones (you may notice an echo of this fact in my fasm 2 presentation from fasmcon 2009). Because of this flaw fasm sometimes allows expressions it should not really allow (like "db 0FFFFFFFFFFFFFF80h"). However it is too late to correct it for 1.x architecture, so I have to live with it.
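To make the ambiguity concrete, here is a minimal Python sketch. It is not fasm's actual code: the helper names `wrap_signed` and `fits` are made up for illustration, and it simply assumes that the calculator keeps only the low 64 bits of a value. The last two lines compare that with a hypothetical 65-bit calculator.

```python
def wrap_signed(value, bits):
    """Keep the low `bits` bits of value and read them as two's complement."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value

def fits(value, bits):
    """Accept both the signed and the unsigned range for the given size."""
    return -(1 << (bits - 1)) <= value <= (1 << bits) - 1

literal = 0xFFFFFFFFFFFFFF80

# In 64 bits the literal has the same bit pattern as -80h,
# so a byte-sized range check lets it through.
as_64 = wrap_signed(literal, 64)

# With a 65th bit the literal stays a large positive number
# and the same range check rejects it.
as_65 = wrap_signed(literal, 65)

print(hex(as_64), fits(as_64, 8))
print(hex(as_65), fits(as_65, 8))
```

In this model the 64-bit result passes the byte check only because the sign information was lost, which is the flaw described above.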
Tomasz Grysztar 06 Sep 2010, 18:44
As these peculiarities of fasm's expression calculator were never actually documented in detail, I decided to write them down here, so that this information is kept somewhere.

TASM (which was fasm's initial inspiration in many areas) did its integer calculations on 33-bit values.
Since in those days 32 bits was the usual maximum size for integers, it wasn't a problem that it could not calculate on 64-bit numbers, for example. However, staying with plain 32 bits was also problematic, because in assembly language one doesn't specify whether a value is signed or unsigned. Therefore one could write "DD 0FFFFFFFFh" or "DD -80000000h" and in both cases expect it to be correct code. In other words, for an expression like the one for the DD directive, both signed and unsigned value ranges have to be allowed.
Therefore TASM did its calculations on 33 bits, so that by looking at the additional bit it could tell whether the value was positive or negative, and then check the allowed range accordingly.

When I was writing fasm, I decided to go a bit further, and I implemented calculations on 64-bit values. That wasn't really because I predicted that 64-bit processors would come; it was years before x86-64 was even first mentioned, and I didn't even think that fasm would be something I would still use by the time 64-bit processors arrived. So fasm's 64 bits were just an improvement over TASM's 33 bits and served the same purpose. If I had envisioned that one day 64 bits would become a size for everyday use (did it?), I would perhaps have implemented 65-bit calculations instead, analogously to the 33-bit ones.

Thus this decision had two consequences: one good, and one bad. The good one was that when the x86-64 architecture arrived, fasm was ready to deal with 64-bit integer expressions. The bad one was that this was not perfect, because without a 65th bit fasm is not able to distinguish a very large positive 64-bit number from a negative one.
I'm not saying it is definitely too late to implement 65-bit calculations in fasm. It is possible to do it with some effort; however, I'm not sure whether it is really that important (though it would certainly be very satisfying to me to have these things cleaned up). Another solution would be to disallow unsigned positive 64-bit values that don't fit into the signed range, but that would most probably trigger a wave of bug reports if people started getting an error message when trying to declare a quad word with such a value.

Now back to the topic of range checking, as there is more to it.
Allowing both the signed and the unsigned range applies of course not only to double word values, but to the other quantities as well. So if you define a byte value, you can specify either an unsigned one in the range 0 to 0FFh, or a signed one in the range -80h to 7Fh. So effectively byte values are allowed in the range from -80h to 0FFh.
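The combined range can be written down as a small Python sketch. This is only a model of the rule stated above, not code taken from fasm; the names `DIRECTIVE_BITS`, `composite_range` and `accepts` are invented for the example.

```python
# Bit sizes of a few fasm data directives.
DIRECTIVE_BITS = {"db": 8, "dw": 16, "dd": 32, "dp": 48, "dq": 64}

def composite_range(bits):
    """Lowest signed value up to highest unsigned value for the size."""
    return -(1 << (bits - 1)), (1 << bits) - 1

def accepts(directive, value):
    """True when the value passes the range check for that directive."""
    low, high = composite_range(DIRECTIVE_BITS[directive])
    return low <= value <= high

for name, bits in DIRECTIVE_BITS.items():
    low, high = composite_range(bits)
    print(name, hex(low), hex(high))
```

Note that the model uses unbounded Python integers, so the "dq" row shows the intended range; a calculator limited to 64 bits cannot actually tell the upper half of that range apart from negative values.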

So when you define a value of byte size, fasm first calculates it as 64-bit, and then checks whether it fits into the range -80h to 0FFh. This generally gives the expected results; however, it also creates some additional problems. An example of such a problem is:
Code:
db not 80h    
It is natural to expect that this will be assembled properly: the logical NOT is not the kind of operation, like addition or multiplication, where one would expect overflow to happen. When you do a bitwise operation on a value that fits into the range, you don't expect it to overflow. But when calculated as a 64-bit value, "not 80h" becomes 0FFFFFFFFFFFFFF7Fh, and it does not fit into the -80h to 0FFh range.
So how does fasm manage to assemble this? Well, it is aware of the size of the desired value from the context (like a DB directive or a constant definition with the BYTE size operator). And in such a case it performs logical operations like "not" appropriately for that size, so that the result does not overflow.
And thus with fasm these four instructions are all correct:
Code:
db not 80h
dw not 8000h 
dd not 80000000h
dp not 800000000000h    
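The difference between the plain 64-bit NOT and the size-aware one can be modelled in a few lines of Python. This is a simplified sketch and not fasm's implementation: `not_plain`, `not_in_context` and the other helper names are hypothetical, and the rule "mask to the context size when the operand is in range" is an assumption based on the description in this post.

```python
MASK64 = (1 << 64) - 1

def to_signed64(value):
    """Read the low 64 bits of value as a two's-complement number."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value

def fits(value, bits):
    """Accept both the signed and the unsigned range for the given size."""
    return -(1 << (bits - 1)) <= value <= (1 << bits) - 1

def not_plain(value):
    """NOT calculated on the full 64 bits."""
    return to_signed64(~value)

def not_in_context(value, bits):
    """NOT limited to the context size when the operand is in range."""
    if fits(value, bits):
        return ~value & ((1 << bits) - 1)
    return not_plain(value)

plain = not_plain(0x80)
print(hex(plain & MASK64), fits(plain, 8))   # full 64-bit pattern, byte check fails
print(hex(not_in_context(0x80, 8)))          # db not 80h
print(hex(not_in_context(0x8000, 16)))       # dw not 8000h
```

For operands whose plain result already fits (for example "not 0" in a byte context), the masked result differs only in how it is written, 0FFh instead of -1, and the stored byte is the same.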


There is a similar case with the XOR operation when one of its arguments is negative and the other is positive and large, like:
Code:
db (-1) xor 80h    
This also gets assembled only because fasm detects a special case when it has to take the context into consideration.

Note, however, that if any of the values calculated on is out of range, then fasm does not apply this special treatment, and calculates with full 64-bit precision. This means that the only difference between having this feature and not having it is that some operations are allowed that otherwise would give an error. It never happens that an expression that would be allowed anyway gives a different result. This is important, as it minimizes the chance that something will be calculated in a way other than expected.

The SHR operator is now also treated in this special way, to allow things like:
Code:
db (-1) shr 1    

However SHL is not affected by the context-derived value size - for the reason that it is sometimes used in the arithmetical sense, and thus is expected to overflow just like the multiplication. Also, if you use SHL to create some bit fields, it is better to detect an overflow and signal it with error, than to silently discard the overflowing bits.
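The XOR, SHR and SHL cases can be put side by side in a rough Python model. The helper names are hypothetical and the rule is simplified: the model limits XOR and SHR to the context size whenever all operands are in range, whereas fasm only detects the specific cases where this matters. The "81h shl 1" input is an illustrative value that does not appear in the thread.

```python
MASK64 = (1 << 64) - 1

def fits(value, bits):
    """Accept both the signed and the unsigned range for the given size."""
    return -(1 << (bits - 1)) <= value <= (1 << bits) - 1

def shr_plain(value, count):
    """Logical shift right of the full 64-bit pattern."""
    return (value & MASK64) >> count

def shr_in_context(value, count, bits):
    """Shift only the low `bits` bits when the operand is in range."""
    if fits(value, bits):
        return (value & ((1 << bits) - 1)) >> count
    return shr_plain(value, count)

def xor_in_context(a, b, bits):
    """Limit XOR to the context size when both operands are in range."""
    if fits(a, bits) and fits(b, bits):
        return (a ^ b) & ((1 << bits) - 1)
    return a ^ b

def shl_plain(value, count):
    """SHL is never limited to the context size, so it can overflow."""
    return value << count

print(hex(shr_plain(-1, 1)), fits(shr_plain(-1, 1), 8))  # out of byte range
print(hex(shr_in_context(-1, 1, 8)))                     # db (-1) shr 1
print(hex(xor_in_context(-1, 0x80, 8)))                  # db (-1) xor 80h
print(fits(shl_plain(0x81, 1), 8))                       # 81h shl 1 overflows a byte
```

Keeping SHL outside this scheme means a bit field that spills past the declared size is reported as an error instead of being silently truncated.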
edemko 07 Sep 2010, 04:12
-"neg" = "-" have priority thus those, like the rest, are cpu instructions protos, really as fasm will execute NEG etc instruction soon. That's why db -$ff should be allowed.
-Why, borders there are?..
-Using 65 arith even, and counting an expression, an intermediate may be a disallowed signed integer (will fasm report every wrong intermediate), and then you add some big value which will not clean bit65 and value stays negative but within range restrictions (will fasm count transcendent and, having found correct result store it). See, when you control user input, intermed. must be controlled too. Such control makes both values pre-operating which is not much convenient. Simulate cpu. Getting a bug report you'll always point at Intel or AMD ;)

Next:
Code:
BIG = $ffffffff'ffffffff
db (-1) shr 1 - BIG ;$80
db (-1) shr 1 - 1   ;$7e
    
Tomasz Grysztar 07 Sep 2010, 05:14
edemko wrote:
-Using 65 arith even, and counting an expression, an intermediate may be a disallowed signed integer(will fasm report every wrong intermediate), and then you add some big value which will not clean bit65 and value stays negative but withing range restrictions(will fasm count transcendent and, having found correct result store it).
If you mean the case when we add something to a large 65-bit negative value that wouldn't fit into 64 bits and obtain a negative value that fits into 64 bits, then this would be a correct and desirable effect (the same was true with 33-bit operations, BTW). The intermediate values need not fit into the range desired for the final value. For example you are allowed to do:
Code:
db -256+128    
or
Code:
db (144*343) and 3Fh    
or even
Code:
db (144*343) shr 8    
(in this last example SHR is not treated as limited to the value size because the input value is already out of range - this was discussed in my post above).
If similar possibilities applied to 64-bit numbers, it would not be a problem. The larger the intermediates you can calculate on, the better (that's why I said at fasmcon last year that fasm 2.0 would calculate on 129-bit numbers).
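The three expressions above can be checked with ordinary integer arithmetic. The Python snippet below is only an illustration (the `fits` helper is an invented name for the byte range check); it shows that the intermediates are out of byte range while the final values are not.

```python
def fits(value, bits):
    """Accept both the signed and the unsigned range for the given size."""
    return -(1 << (bits - 1)) <= value <= (1 << bits) - 1

product = 144 * 343              # intermediate, far outside the byte range

results = {
    "-256+128": -256 + 128,
    "(144*343) and 3Fh": product & 0x3F,
    "(144*343) shr 8": product >> 8,
}

print(hex(product), fits(product, 8))
for text, value in results.items():
    print(text, hex(value), fits(value, 8))
```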

edemko wrote:
Next:
Code:
BIG = $ffffffff'ffffffff
db (-1) shr 1 - BIG ;$80
db (-1) shr 1 - 1   ;$7e
    
And this is an example of the mentioned problem with 64-bit calculations that would be fixed by making them 65-bit. Since "(-1) shr 1" in a byte context is "7Fh", the result of the first subtraction would be -0FFFFFFFFFFFFFF80h, way outside the byte range (but stored correctly in 65 bits as 1.0000000000000080), and it would signal an error. The second one would give a small result and still be correct, as it is now. And "db (-1) shr 1 - (-1)" as well.
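The two outcomes for the first line can be reproduced with a short Python model. It is a sketch under the assumption that a calculator of a given width simply keeps that many low bits of every result; `wrap_signed` and `fits` are illustrative names, not fasm internals.

```python
def wrap_signed(value, bits):
    """Keep the low `bits` bits of value and read them as two's complement."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value

def fits(value, bits):
    """Accept both the signed and the unsigned range for the given size."""
    return -(1 << (bits - 1)) <= value <= (1 << bits) - 1

BIG = 0xFFFFFFFFFFFFFFFF
shifted = 0x7F                   # "(-1) shr 1" in a byte context

first_64 = wrap_signed(shifted - BIG, 64)   # wraps around to a small value
first_65 = wrap_signed(shifted - BIG, 65)   # keeps the true negative result
second = wrap_signed(shifted - 1, 64)
third = wrap_signed(shifted - (-1), 64)

print(hex(first_64), fits(first_64, 8))
print(hex(first_65), fits(first_65, 8))
print(hex(first_65 & ((1 << 65) - 1)))      # the 65-bit pattern
```

With 64 bits the wrapped result lands inside the byte range and is accepted; with 65 bits the same subtraction keeps its true value and fails the range check.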
bitRAKE



Joined: 21 Jul 2003
Posts: 4020
Location: vpcmpistri
bitRAKE 07 Sep 2010, 06:28
Why not make FASM 2.0 arbitrary precision? Combined with the context size selection currently used, of course. (Which is a totally awesome way to handle the matter, btw.)
Tomasz Grysztar 07 Sep 2010, 09:08
bitRAKE wrote:
Why not make FASM 2.0 arbitrary precision?
That obviously would be something nice. It may still be decided; I'm not starting fasm 2.0 in the near future. You know, they are going to release Duke Nukem Forever next year, so then I may get a chance to beat them in terms of development time. ;)
edemko 07 Sep 2010, 17:38
Tomasz wrote:

Also there would be a solution of disallowing unsigned positive 64-bit values that don't fit into signed range

Borland Delphi 7:
Code:
const
  ok1:ShortInt = -$80;
  ok2:ShortInt = -($7f+$01);
  ok3:ShortInt = $100-$100;
//  nok1:ShortInt = $80;
//  nok2:ShortInt = $7f+$01;
//  nok3:ShortInt = $81;
//  nok4:ShortInt = -$81;
    

See their brains: value treated positive regardless sign bit.
No operand size satisfies $80, $8000, etc. values.
Using such approach you are supposed finding new data types: signed and unsigned.
Tomasz wrote:

So if you define a byte value, you can specify either an unsigned one in the range 0 to 0FFh, or a signed one in the range -80h to 7Fh. So effectively byte values are allowed in the range from -80h to 0FFh.

See yourself: like Borland, -129 cannot be defined, -$81 too.
Hence range checking already works (I found it only now) but, to make it easy, within -128..+255.
Now look at -$81, you do not mind its sign bit too.
Tomasz wrote:

I'm not saying it is definitely too late to implement 65-bit calculations into fasm. It is possible to do it with some effort; however I'm not sure whether it is really so important (though it would certainly be very satisfying to me to have these things cleaned up).

Ok, i mean all is fine and no 65 needed.
(-super_small) + (-super_small) = super_puper_small, 65 won't help you.
Tomasz wrote:

But when calculated as 64-bit value, "not 80h" becomes 0FFFFFFFFFFFFFF7Fh, and it does not fit into -80h to 0FFh range.

As told over, you do not mention sign bit(let it be a convention).
"not $80" gets translated into qword in correct form with high parts zero due the convention.
As no minus is specified with "not": -$80 < $ffff'ffff'ffff'ff7f > 255.
It means you do all right.
There is nothing strange in that high_part_removal as that one is an extension.
Byte value is wrong when its extension isn't -1.
Seems to be clear... for "not" case at least.


Em... too hard.
edemko 07 Sep 2010, 17:47
shl, shr, or, xor are other cases so i cannot argue/agree with Tomasz as see no bit 65 role here, and a bit tired :(
Tomasz Grysztar 07 Sep 2010, 17:48
edemko: either you did not understand what I was writing about or I don't understand what you are writing about. Or perhaps both.
DOS386



Joined: 08 Dec 2006
Posts: 1900
DOS386 07 Sep 2010, 23:34
Tomasz Grysztar wrote:
As those fasm's expression calculator peculiarities were never actually documented in detail, I decided to write it down here, so that this information is kept somewhere.


Quote:
(much)

Very interesting ... maybe there should be a "Compiler Internals FAQ" or, even better, a set of downloadable docs including this one. There are several such large documentation posts by Tomasz here, but they are very hard to find when they sink :(
Tomasz Grysztar 08 Sep 2010, 09:57
And I'm adding just a small summary (which I hope may help to understand what is the point here):

We are talking about calculations performed on values of a fixed bit size, with an error signalled on overflow. We can then make those operations signed or unsigned, or alternatively allow both the signed and the unsigned results, at the price that some large positive results won't be distinguishable from negative ones - I will call this option "composite".

And the summary is: TASM operated on 33-bit signed values (so operated in range -2^32 to 2^32-1), fasm operates on 64-bit composite values (so in range -2^63 to 2^64-1, but with some of the positive results being indistinguishable from negative ones, and therefore with a very poor error checking), and it would be better if it operated on 65-bit signed (in range -2^64 to 2^64-1).
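The three ranges in this summary follow directly from the bit widths, as the small Python calculation below shows. The function names are made up for the example. It also counts the values in the 64-bit composite range, which is more than the number of distinct 64-bit patterns; this is where the ambiguity comes from.

```python
def signed_range(bits):
    """Range of a two's-complement value with the given number of bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def composite_range(bits):
    """Signed minimum up to unsigned maximum for the given number of bits."""
    return -(1 << (bits - 1)), (1 << bits) - 1

tasm = signed_range(33)          # TASM: 33-bit signed
fasm = composite_range(64)       # fasm 1.x: 64-bit composite
wanted = signed_range(65)        # the cleaner option: 65-bit signed

# The composite range holds more values than 64 bits can tell apart.
values_in_range = fasm[1] - fasm[0] + 1
patterns = 1 << 64

print(tasm, fasm, wanted)
print(values_in_range - patterns)    # how many values share a pattern with another
```

The surplus is 2^63, exactly the count of positive values from 2^63 to 2^64-1 that share their bit patterns with the negative values from -2^63 to -1.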

And why does fasm use 64-bit composite and not 64-bit signed (which would have no ambiguity problems)? It is because when fasm defines a value of specified size (like BYTE or WORD), it allows it to be in the composite range for that size, and thus analogously it should allow values in composite range for the QWORD size, and this means it has to calculate them in at least 64-bit composite (or at least 65-bit signed) precision.

And the special handling of some operations depending on the context-derived size of value is a separate issue, not related to choosing the overall precision of operations.


Last edited by Tomasz Grysztar on 08 Sep 2010, 11:43; edited 3 times in total
Tomasz Grysztar 08 Sep 2010, 10:05
DOS386 wrote:
Very interesting ... maybe there should be "Compiler Internals FAQ" or even better a set of downloadable docs including this one.
That might be a good idea - there are already many links under "advanced fasm topics" in the Important/interesting threads thread that could be used to compose such a set of references. However, many of them are not up to date, since fasm internals have evolved a bit. I would have to take each one of them, update it where needed and then make it a part of such a renewed reference.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.