flat assembler
Message board for the users of flat assembler.

Index > Main > Conditional moving without branching

Goto page Previous  1, 2
Author
Thread Post new topic Reply to topic
edfed



Joined: 20 Feb 2006
Posts: 4242
Location: 2018
edfed
goplat, congratulations for this proof of human optimisation. i doubt any compiler can do that...
Post 03 Apr 2008, 22:28
View user's profile Send private message Visit poster's website Reply with quote
AlexP



Joined: 14 Nov 2007
Posts: 561
Location: Out the window. Yes, that one.
AlexP
Quote:
goplat, congratulations for this proof of human optimisation. i doubt any compiler can do that...
Smile
Post 04 Apr 2008, 13:09
View user's profile Send private message Visit poster's website Reply with quote
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
I guess some kind of lexical analyzer generator *could* come up with this solution. Anyway, this is only proof that humans are better in problems with lesser complexity - I doubt you could compete Flex with handwriten lexer, for something more complex.


Last edited by vid on 04 Apr 2008, 13:45; edited 1 time in total
Post 04 Apr 2008, 13:39
View user's profile Send private message Visit poster's website AIM Address MSN Messenger ICQ Number Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17665
Location: In your JS exploiting you and your system
revolution
So, game on, which C compiler can find it? Any?
Post 04 Apr 2008, 13:40
View user's profile Send private message Visit poster's website Reply with quote
bitRAKE



Joined: 21 Jul 2003
Posts: 3044
Location: vpcmipstrm
bitRAKE
vid wrote:
I guess some kind of lexical analyzer generator *could* come up with this solution. Anyway, this is only proof that humans are better in problems with lesser complexity - I doubt you could compete Flex with handwriten lexer, for something more complex.
Luckily, there isn't just one human. We build great things all the time - the complexity of which is debatable.

_________________
¯\(°_o)/¯ unlicense.org
Post 04 Apr 2008, 13:56
View user's profile Send private message Visit poster's website Reply with quote
mattst88



Joined: 12 May 2006
Posts: 260
Location: South Carolina
mattst88
Goplat wrote:
bitRAKE wrote:
Most likely the smallest too.
Actually, it could be one byte smaller by changing "add eax,96" to "add al,96".


Other than being smaller, are there any other implications of doing this? Implications for 64-bit code, for example?

_________________
My x86 Instruction Reference -- includes SSE, SSE2, SSE3, SSSE3, SSE4 instructions.
Assembly Programmer's Journal
Post 04 Apr 2008, 13:58
View user's profile Send private message Visit poster's website Reply with quote
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
revolution: was that reaction to me? i think that most compilers would produce that from "x = (x>>1)+96;". Or what do you expect to be input for C compiler?

As for complexity, I meant something like FASMs parser, which translates ASCII tokens to different internal representation. This particular example is a very simple case of same problem. In more complex case, i doubt anyones handwritten parser would beat parser generated flex.
Post 04 Apr 2008, 14:13
View user's profile Send private message Visit poster's website AIM Address MSN Messenger ICQ Number Reply with quote
edfed



Joined: 20 Feb 2006
Posts: 4242
Location: 2018
edfed
sure a compiler can generate the same code if we enter this formula.

but can it make it if we enter:
selectcase x
case 128, x=160
case 256, x=224
case 192, x=192

there i doubt it will output this so simple and short :

shr eax,1
add eax,96


Last edited by edfed on 04 Apr 2008, 19:08; edited 1 time in total
Post 04 Apr 2008, 14:51
View user's profile Send private message Visit poster's website Reply with quote
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17665
Location: In your JS exploiting you and your system
revolution
edfed has the right idea for what I was thinking of, rather than feed the compiler the formula, give it the raw data and see what it does.

Of course, I don't expect this to be any sort of real test, just for fun only to see if the smart compiler folks have manage to put their smarts into the compiler.
Post 04 Apr 2008, 15:10
View user's profile Send private message Visit poster's website Reply with quote
bitRAKE



Joined: 21 Jul 2003
Posts: 3044
Location: vpcmipstrm
bitRAKE
vid wrote:
As for complexity, I meant something like FASMs parser, which translates ASCII tokens to different internal representation. This particular example is a very simple case of same problem. In more complex case, i doubt anyones handwritten parser would beat parser generated flex.
I disagree this isn't an example of a parsing problem. Certainly not in the flex sense of parsing.

Parsing can be done in any language with a decent model of the problem, and people code them by hand all the time. Why must we tackle a massive grammar in one piece? Smaller descriptions are easier to code and understand by the user (a human by the way).

_________________
¯\(°_o)/¯ unlicense.org
Post 04 Apr 2008, 16:14
View user's profile Send private message Visit poster's website Reply with quote
mattst88



Joined: 12 May 2006
Posts: 260
Location: South Carolina
mattst88
mattst88 wrote:
Goplat wrote:
bitRAKE wrote:
Most likely the smallest too.
Actually, it could be one byte smaller by changing "add eax,96" to "add al,96".


Other than being smaller, are there any other implications of doing this? Implications for 64-bit code, for example?

_________________
My x86 Instruction Reference -- includes SSE, SSE2, SSE3, SSSE3, SSE4 instructions.
Assembly Programmer's Journal
Post 04 Apr 2008, 23:55
View user's profile Send private message Visit poster's website Reply with quote
bitRAKE



Joined: 21 Jul 2003
Posts: 3044
Location: vpcmipstrm
bitRAKE
mattst88, do you mean - does it work in long mode? Of course, it does. The shift needs to operate on 16-bits or more (given the input range specified), but the add only needs to operate on the lower byte - upper bits should be zero. Somehow I feel you are after something else with that question though. There'd be a stall between the instructions, but they don't execute in a vacuum - most likely other instructions would mitigate the delay.

_________________
¯\(°_o)/¯ unlicense.org
Post 05 Apr 2008, 00:14
View user's profile Send private message Visit poster's website Reply with quote
DOS386



Joined: 08 Dec 2006
Posts: 1901
DOS386
I wrote:

Code:
; Input in AL, AX or EAX
sub al,128
shr al,1
add al,160
; Result in AL, AX or EAX
    


edfed wrote:

Quote:

(192-128)/2+160!=192
(256-128)/2+160!=224


Interesting. Please post the wrong results also !

> your easy puzzle resolution is completelly false.
> you presume of your power.

Evidence please.

> al cannot contain 9 bits. 256 ==> 9 bits.

Missed the point.

AlexP wrote:

> good luck with storing 256 into al

Missed the point.

256 can be stored into AL as 0 - if 0 itself is not a valid value ... wasn't this obvious enough ?

Goplat wrote:

> This is probably the fastest way to do it:
> shr eax,1
> add eax,96

Great Smile Reduced 3 instruction to 2 ... who dares to reduce them to just one ?

And now, seeing Goplat's code, after reducing EAX to AX as I had suggested, would work on 8086, and, even worse, my code would work even on 8080 , compare this to original "solutions":

Code:
    xor  ebx, ebx
    xor  ecx, ecx
    mov  eax, 128
    cmp eax,192
    sbb ebx,ebx
    cmp eax,256
    sbb ecx,ecx
    and ebx,32
    and ecx,12
    add eax,ebx
    lea eax,[eax+ecx-12]
    


Code:
mov eax, 224 ; default
mov ebx, 160 ; other two
mov ecx, 190
cmp edi, 192
cmove eax, ecx ; &
cmovb eax, ebx ; & Very SAFE to use ???
    


and the latest CMOV / SSE / 64-bit technology looks very very good Very Happy

bitRAKE wrote:

> does it work in long mode? Of course, it does.

YES. Just hang 56 idle ZERO bits on the top of your 8-bit integers and pull them with you all the time Very Happy
Post 05 Apr 2008, 03:11
View user's profile Send private message Reply with quote
AlexP



Joined: 14 Nov 2007
Posts: 561
Location: Out the window. Yes, that one.
AlexP
Quote:
256 can be stored into AL as 0 - if 0 itself is not a valid value
I don't think that 256 == 0 (mod 256) was what I was inferring, I think YOU missed the point Smile
Post 05 Apr 2008, 03:26
View user's profile Send private message Visit poster's website Reply with quote
DOS386



Joined: 08 Dec 2006
Posts: 1901
DOS386
> I don't think that 256 == 0 (mod 256) was what I was inferring, I think YOU missed the point

OK ... please reveal what you find in AL after

Code:
mov eax,256
; ud2 ; Display register values 
    

_________________
Bug Nr.: 12345

Title: Hello World program compiles to 100 KB !!!

Status: Closed: NOT a Bug
Post 05 Apr 2008, 03:49
View user's profile Send private message Reply with quote
edfed



Joined: 20 Feb 2006
Posts: 4242
Location: 2018
edfed
Quote:

(192-128)/2+160!=192
(256-128)/2+160!=224

then, my note will be a Z. sorry DOS386.
what was troubling me was the 9th bit, not available for a 8bit register.

your algo is the second fastest one. Very Happy
Quote:
who dares to reduce them to just one ?

I dare, only 1 instruction Very Happy, but a lot of wasted memory. Sad

Code:
...
mov eax,[table+eax]
...
table:
rb 128  ;there can be defined 128 bytes for something else
dd 160
rb 60   ; there can be defined 60 bytes for something else
dd 192
rb 60   ; there can be defined 60 other bytes for other something else
dd 224

    

Laughing
the same for 8080?
Code:
...
mov bl,[table+bx]
...
table:
rb 128  ;there can be defined 128 bytes for something else
db 160
rb 63   ; there can be defined 63 bytes for something else
db 192
rb 63   ; there can be defined 63 other bytes for other something else
db 224

    


note that if the compiler implement well the selectcase, it will be able to generate this code and use the extra bytes for some other usages...
Post 05 Apr 2008, 04:00
View user's profile Send private message Visit poster's website Reply with quote
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
Borsuc
DOS386 is right, and actually you guys did not get the point.

Just because you use 'al' doesn't mean 'eax' can't contain other values -- did you lost the sense of asm optimizations?

For example:

Code:
mov eax, 256
add al, 1

; eax will now be 257 Wink    
Post 05 Apr 2008, 12:10
View user's profile Send private message Reply with quote
edfed



Joined: 20 Feb 2006
Posts: 4242
Location: 2018
edfed
the value in the bit 9 is not important if we substract 128, the result will give 128, then, 0 - 128 = 128
cause : binary knows an inifite, + and - infinites are the same value, the one between 7Fh and 80h, 7Fh +1/2?
sorry for the mistake, i still banish my intellect with the Z notation.
i can decrease it with Z-- if you want.
Post 05 Apr 2008, 12:18
View user's profile Send private message Visit poster's website Reply with quote
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
Borsuc
edfed wrote:
sorry for the mistake, i still banish my intellect with the Z notation.
i can decrease it with Z-- if you want.
lol

don't be so harsh, we all make mistakes Wink
Post 05 Apr 2008, 12:24
View user's profile Send private message Reply with quote
Display posts from previous:
Post new topic Reply to topic

Jump to:  
Goto page Previous  1, 2

< Last Thread | Next Thread >
Forum Rules:
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum


Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.

Website powered by rwasa.