flat assembler
Message board for the users of flat assembler.

Beauty in x86 assembly?

bitRAKE



Joined: 21 Jul 2003
Posts: 4073
Location: vpcmpistri
bitRAKE 29 Apr 2018, 06:58
revolution wrote:
It's mind bogglingly complex.
...and we aren't getting any younger, lol. :D
That's what good tools are for - to leverage a limited mind and body.

Here we have the GCD - it's not fast, but it works:
Code:
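; GCD of two positive integers by repeated subtraction.
; The result is left in reg0; reg1 ends up zero.
macro gcd? reg0,reg1
  local _0,_1
        jmp _1
_0:     neg reg1                ; overshot below zero: make the difference positive
        xchg reg1, reg0         ; and carry on with it as the new subtrahend
_1:     sub reg1, reg0
        jg _1                   ; still positive: subtract again
        jne _0                  ; negative: fix up and swap; zero: done
end macro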
...create your own inline instruction. ;)
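
For anyone who wants to trace the loop without an assembler at hand, here is a rough Python model of the macro's control flow. It is only a sketch: the name gcd_sub is made up for illustration, it assumes both inputs are positive, and it ignores register width and overflow.

```python
from math import gcd  # reference implementation, used only for the self-check


def gcd_sub(reg0: int, reg1: int) -> int:
    """Model of the subtract-and-swap loop; the name gcd_sub is hypothetical."""
    if reg0 <= 0 or reg1 <= 0:
        # Assumption of this sketch: with reg0 = 0 the loop would never end.
        raise ValueError("both inputs must be positive")
    while True:
        reg1 -= reg0                 # _1: sub reg1, reg0
        if reg1 > 0:                 # jg _1: still positive, subtract again
            continue
        if reg1 == 0:                # neither jump taken: result is in reg0
            return reg0
        reg1 = -reg1                 # _0: neg reg1
        reg0, reg1 = reg1, reg0      # xchg reg1, reg0


if __name__ == "__main__":
    # Compare against math.gcd for a small range of positive inputs.
    assert all(gcd_sub(a, b) == gcd(a, b)
               for a in range(1, 40) for b in range(1, 40))
    print(gcd_sub(4, 10))
```

Each time the subtraction overshoots, the value left in reg0 gets strictly smaller while staying positive, so the loop has to end.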

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup
Furs



Joined: 04 Mar 2016
Posts: 2565
Furs 29 Apr 2018, 12:24
revolution wrote:
It isn't just the CPU though. Even if you knew exactly every transistor, it is still the code you are running that affects things. On some CPUs the OOO buffer is more than 100 instructions long. So you also have to know every one of those 100+ instructions ahead of your snippet, and which port they will go into, and what instructions are currently in each port, and how many ports you have, and whether or not the memory read/write buffers are full, and the current state of the BTB and caches, whether or not another SMP instruction stream is interleaved with your stream, etc. etc. etc. It's mind bogglingly complex.
Sometimes it's good to have unused ports or units in a thread, because they become free to use for another thread with Hyperthreading.

It's not often you see hyperthreading double the performance, but it does happen (and it did for me) when I ran a brute-force test of a very long, latency-bottlenecked algorithm (though you could of course parallelize across individual inputs). Running 8 threads (with HT) finished in almost half the time I had estimated for 4 threads, which was a 2 hour gain. And that's with me still using my PC for lightweight stuff (I lowered the priority of that test program to the minimum).
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20451
Location: In your JS exploiting you and your system
revolution 29 Apr 2018, 13:15
Furs wrote:
Sometimes it's good to have unused ports or units in a thread, because they become free to use for another thread with Hyperthreading.
Yes. If you know all about that particular CPU then you can do such things. But in six months' time, when a newer system is being used, all that knowledge becomes useless.
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 02 May 2018, 02:25
Remember the 62-byte Sudoku solver (DOS .COM) we discussed years ago?

There's also Assembly Gems (archived).

Bob Swart wrote about Borland Pascal Efficiency, and one interesting snippet is this:

Code:
function IsAscii(C: Char): Boolean;
 InLine(
   $5B/          {      POP     BX      }
   $31/$C0/      {      XOR     AX,AX   }
   $D0/$E3/      {      SHL     BL,1    }
   $1C/$FF);     {      SBB     AL,$FF  }
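
In case the trick is not obvious: SHL BL,1 shifts the top bit of the character into the carry flag, and SBB AL,$FF then computes 0 - $FF - CF, which wraps around to 1 when the carry is clear and to 0 when it is set. Below is a quick Python model of those four instructions. The name is_ascii_model is made up for illustration, and only the 8-bit arithmetic is modelled, not the Pascal calling convention.

```python
def is_ascii_model(c: int) -> bool:
    """Hypothetical 8-bit model of the inline IsAscii code."""
    bl = c & 0xFF                     # POP BX: the character lands in BL
    al = 0                            # XOR AX,AX
    carry = bl >> 7                   # SHL BL,1: bit 7 moves into CF
    al = (al - 0xFF - carry) & 0xFF   # SBB AL,$FF: AL = AL - $FF - CF
    return al == 1                    # AL is 1 (True) or 0 (False)


if __name__ == "__main__":
    # True exactly for character codes below 128.
    assert all(is_ascii_model(c) == (c < 128) for c in range(256))
```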
    
Picnic



Joined: 05 May 2007
Posts: 1403
Location: Piraeus, Greece
Picnic 03 May 2018, 14:20
This snippet prints a two-digit number in AL (11 bytes).
Code:
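putn:
        aam                     ; AH = AL / 10 (tens), AL = AL mod 10 (ones)
        push A                  ; extra return address: the first RET lands on A
A:      xchg al,ah              ; 1st pass: AL = tens, 2nd pass: AL = ones
        add al,48               ; digit to ASCII ('0' = 48)
        int 0x29                ; print the character in AL
        ret                     ; 1st time: jump to A, 2nd time: back to the caller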
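
The push A is what makes the routine so short: the body at A runs twice, because the first RET pops the pushed address instead of the caller's. Here is a rough Python model of that flow. The name putn_model and the list standing in for the stack are made up for illustration, and the sketch assumes the INT 29h handler leaves AX untouched.

```python
def putn_model(al: int) -> str:
    """Hypothetical model of putn; returns the characters it would print."""
    out = []
    ah, al = divmod(al & 0xFF, 10)    # aam: AH = tens, AL = ones
    stack = ["caller", "A"]           # push A, on top of the caller's return address
    while True:
        al, ah = ah, al               # A: xchg al,ah
        al = (al + 48) & 0xFF         # add al,48
        out.append(chr(al))           # int 0x29 prints AL (AX assumed preserved)
        if stack.pop() != "A":        # ret: first pop re-enters A, second leaves
            break
    return "".join(out)


if __name__ == "__main__":
    # Matches zero-padded two-digit formatting for 0..99.
    assert all(putn_model(n) == f"{n:02d}" for n in range(100))
    print(putn_model(42))
```

For values of 100 and above the tens "digit" no longer fits in one character, which is why the routine is limited to two-digit numbers.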
    
revolution 03 May 2018, 14:29
Picnic wrote:
This snippet prints a two-digit number in AL (11 bytes).
I presume this is intended for real mode DOS int 0x29?
Picnic 03 May 2018, 14:40
Yes, it is INT 29h, the fast console output (more about it here: Undocumented DOS Programming).

If I recall correctly, I saw it inside one of the 256-byte demos, many years ago.
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.

Website powered by rwasa.