flat assembler
Message board for the users of flat assembler.
MazeGen 03 Oct 2007, 14:12
It is because of the CISC nature of the early x86 instruction set. In short, there are two groups of instructions: general and specialized, like MOV versus LODS, JNZ versus LOOP, MOVSX versus CWDE.
BTW, those two blocks of code don't do the same thing. LODS additionally increments or decrements ESI.
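A minimal sketch of that difference, assuming 32-bit code and a hypothetical label `_string` (neither comes from the original question, which isn't quoted here). It only illustrates the extra pointer update that LODSB performs:

```asm
use32                           ; assumption: 32-bit code

        ; specialized form
        mov     esi,_string     ; _string is a hypothetical label
        cld                     ; DF = 0, string instructions step forward
        lodsb                   ; AL = [ESI], then ESI = ESI + 1

        ; general form with the same effect on AL and ESI while DF = 0
        mov     esi,_string
        mov     al,[esi]        ; AL = [ESI], ESI is left unchanged
        inc     esi             ; the step LODSB does implicitly
                                ; (INC alters arithmetic flags, LODSB does not)

_string db 'abc',0              ; sample data for the sketch
```

With DF = 1 (after STD), LODSB would decrement ESI instead, so the second form would need DEC ESI to match.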
xspeed 03 Oct 2007, 17:29
lodsb, especially if you run it with repXX (a rep prefix), will tend to run 5-10 times faster than mov.
MazeGen 03 Oct 2007, 17:31
REP LODSB? Interesting instruction. Very useful!
Feryno 04 Oct 2007, 08:42
A little off topic, but I see a use for REP LODSB, e.g. in self-modifying protected code:
1. the code prepares a hardware breakpoint at the beginning of the memory to be accessed by rep lodsb
2. the debug exception handler decrypts the byte that caused the exception and increments the debug register (DR0-DR3), so the debug exception keeps occurring until rep lodsb ends

code skeleton:
; code section is readable + WRITEABLE
- set exception handler
- set debug registers in the thread context so DR0 or DR1 or DR2 or DR3 points to encrypted_start and DR7 is set to trigger on memory read/write
lea esi,[encrypted_start]
cld
mov ecx,encrypted_size
repz lodsb
encrypted_start:
; some encrypted code here
; end of encrypted code
encrypted_size = $ - encrypted_start

exception01_handler:
- get DR0 or DR1 or DR2 or DR3 set before (from ThreadContext)
- decrypt byte at that address
- increment DR0 or DR1 or DR2 or DR3
- write incremented debug register back to the thread context
; end of exception handler

Don't think my brain is crazy... I'm just now thinking about this kind of code protection. Thank you for the tip. I used a similar rep scasd in my recent demo. But rep lodsd looks even crazier!!! A thing that looks useless may give great pleasure to someone else...
16bitPM 29 Mar 2012, 09:58
It's useful for loading code into the cache if timing is crucial, even on cached 286 systems.
LostCoder 29 Mar 2012, 11:38
Because of size, probably. In the good old 16-bit days storage media were small, and processors did not have advanced instruction caches, etc.
Therefore, the shorter a program was, the faster it ran, so a lot of attention was paid to code size. Check for yourself:
Code:
; sizes of the 32-bit forms below match assembly in 16-bit mode,
; where 66h/67h prefixes are added (in 32-bit mode mov esi,_string is 5 bytes)

; code with lodsb
mov esi,_string ; 6 bytes
cld ; 1 byte
lodsb ; 1 byte
; 8 bytes total

; same thing for old-school 16-bit
mov si,_string ; 3 bytes
cld ; 1 byte
lodsb ; 1 byte
; 5 bytes total

; emulation
mov esi,_string ; 6 bytes
; emulate cld
xor edx,edx ; 3 bytes ; use edx as "direction flag", use 0 or 1
; emulate lodsb
shl edx,1 ; 3 bytes ; convert direction flag to -1,1
dec edx ; 2 bytes ; 0 -> -1, 1 -> +1 (so edx = 0 steps backward, unlike a real cld)
mov al,[esi] ; 3 bytes
add esi,edx ; 3 bytes
; 20 bytes total

; same thing for old-school 16-bit
mov si,_string ; 3 bytes
; emulate cld
xor dx,dx ; 2 bytes ; use dx as "direction flag"
; emulate lodsb
shl dx,1 ; 2 bytes ; convert direction flag to usable delta
dec dx ; 1 byte
mov al,[si] ; 2 bytes
add si,dx ; 2 bytes
; 12 bytes total
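One way to check such counts without counting by hand is to let the assembler do the arithmetic with `$`. The sketch below is only an illustration for the 16-bit case: the labels `with_lodsb` and `emulated`, the size constants, and the numeric value given to `_string` are all made up for the example. It is meant to be assembled with fasm and inspected in a hex viewer, not executed.

```asm
use16                           ; assemble as 16-bit code

_string = 1234h                 ; hypothetical address standing in for a real label

with_lodsb:
        mov     si,_string
        cld
        lodsb
with_lodsb_size = $ - with_lodsb    ; bytes emitted since the label

emulated:
        mov     si,_string
        xor     dx,dx           ; dx plays the role of the direction flag
        shl     dx,1
        dec     dx              ; turn 0/1 into a -1/+1 step
        mov     al,[si]
        add     si,dx
emulated_size = $ - emulated

        db      with_lodsb_size ; size of the lodsb version, in bytes
        db      emulated_size   ; size of the emulation, in bytes
```

The last two bytes of the output file hold the two sizes, so any change to either sequence shows up there after reassembling.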
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.