flat assembler
Message board for the users of flat assembler.

Rotating Bits

revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20621
Location: In your JS exploiting you and your system
revolution 27 Apr 2010, 06:12
Code:
   xor       edx,edx
   mov       dl,[col]    
==
Code:
   movzx     edx,[col]    
Fanael



Joined: 03 Jul 2009
Posts: 168
Fanael 27 Apr 2010, 10:59
revolution wrote:
Code:
   xor       edx,edx
   mov       dl,[col]    
==
Code:
   movzx     edx,[col]    
It won't compile, you didn't specify the size of the operand. On the other hand,
Code:
movzx edx, byte [col]    
compiles correctly and even works.
revolution 27 Apr 2010, 11:09
Well, col should be defined as byte-sized in the source. It catches a lot of programming errors when the sizes are specified beforehand. If you use overrides everywhere, then what happens if you later decide to make 'col' a word-sized value?
Fanael 27 Apr 2010, 11:22
Ah, I was wrong - if col is defined as a byte or a halfword then your code compiles just fine. FASM never ceases to astonish me.
edfed



Joined: 20 Feb 2006
Posts: 4354
Location: Now
edfed 27 Apr 2010, 11:32
X*80 = X*64 + X*16 = (X*4 + X)*16
So, for speed:
Code:
push eax edx
movzx eax,byte[row]     ; eax = row
movzx edx,byte[col]     ; edx = col
lea eax,[eax*5]         ; eax = row*5
shl eax,4               ; eax = row*5*16 = row*80
add eax,edx             ; eax = row*80+col
mov [pos],eax
pop edx eax
ret
    


For more speed, pass the parameters in registers.
That saves the clocks spent saving and restoring registers.

Then:
Code:
movzx eax,[row]         ; row and col must be declared byte-sized here
movzx edx,[col]
call setpos
mov [pos],eax
...
setpos:                 ; in: eax = row, edx = col; out: eax = row*80+col
lea eax,[eax*5]
shl eax,4
add eax,edx
ret
    


Last edited by edfed on 27 Apr 2010, 11:40; edited 2 times in total
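
The decomposition X*80 = (X*4 + X)*16 used in the post above can be checked exhaustively for byte-sized inputs. The following is only a C sketch of the arithmetic, not part of the original post; the function names `pos_lea_shl`, `pos_mul` and `check_forms` are made up for illustration, and the comments map each step back to the instruction it is assumed to model.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-add form, mirroring lea eax,[eax*5] / shl eax,4 / add eax,edx. */
uint32_t pos_lea_shl(uint8_t row, uint8_t col)
{
    uint32_t eax = row;      /* movzx eax,byte[row] */
    uint32_t edx = col;      /* movzx edx,byte[col] */
    eax = eax + eax * 4;     /* lea eax,[eax*5] */
    eax <<= 4;               /* shl eax,4 */
    return eax + edx;        /* add eax,edx */
}

/* Reference form with a plain multiply (hypothetical helper). */
uint32_t pos_mul(uint8_t row, uint8_t col)
{
    return (uint32_t)row * 80u + col;
}

/* Asserts that both forms agree for every byte-sized row and col. */
void check_forms(void)
{
    for (unsigned row = 0; row < 256; row++)
        for (unsigned col = 0; col < 256; col++)
            assert(pos_lea_shl((uint8_t)row, (uint8_t)col) ==
                   pos_mul((uint8_t)row, (uint8_t)col));
}
```

Calling `check_forms()` once from a test program covers all 65536 input pairs. The sketch only confirms that the two forms compute the same value; it says nothing about which one is faster on a given CPU.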
revolution 27 Apr 2010, 11:39
edfed: Sure, as long as the multiplier 80 never changes, ever, then no problem. ;)
edfed 27 Apr 2010, 11:41
If the multiplier changes, then use imul with a memory operand:
Code:
imul eax,[Xres]
    
Fanael 27 Apr 2010, 11:53
No ARM version yet?
Code:
ldrb r0, [row]
ldrb r1, [col]
add r0, r0, r0, lsl #2
add r0, r1, r0, lsl #4
strh r0, [pos]
bx lr

row db ?
col db ?
pos dh ?    
;)
revolution 27 Apr 2010, 11:56
Fanael wrote:
No ARM version yet?
Code:
ldrb r0, [row]
ldrb r1, [col]
add r0, r0, r0, lsl #2
add r0, r1, r0, lsl #4
strh r0, [pos]
bx lr

row db ?
col db ?
pos dh ?    
;)
Yeah!
Quote:
flat assembler for ARM version 1.69.13 (216715 kilobytes memory)
2 passes, 24 bytes.
Code:
00000000: E5DF0010 V1     ldrb  r0,[r15,0x10]   ;=[0x18]
00000004: E5DF100D V1     ldrb  r1,[r15,0xD]    ;=[0x19]
00000008: E0800100 V1     add   r0,r0,r0,lsl 2
0000000C: E0810200 V1     add   r0,r1,r0,lsl 4
00000010: E1CF00B2 V4     strh  r0,[r15,0x2]    ;=[0x1A]
00000014: E12FFF1E V4T    bx    r14    
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 27 Apr 2010, 13:22
Code:
update_pos:
   pushad
   mov       eax, 80
   mul       [row]                 ; EAX[0:15] = 80*[row]; EAX[16:31] = 0
   movzx     edx, [col]
   add       eax, edx              ; row*80+col
   mov       [pos], eax            ; pos=row*80+col
   popad
   ret

update_pos_dword:
   pushad
   imul      eax, [row_dd], 80
   add       eax, [col_dd]         ; row*80+col
   mov       [pos], eax            ; pos=row*80+col
   popad
   ret    
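
The first routine above relies on `mul` with a byte operand writing only AX, so the upper half of EAX keeps the zero it received from `mov eax, 80`. The following C sketch models that behaviour under the assumption that `row` and `col` are byte-sized; it is not part of the original post, and `update_pos_model` and `check_update_pos` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Models: mov eax,80 / mul byte [row] / movzx edx,byte [col] / add eax,edx */
uint32_t update_pos_model(uint8_t row, uint8_t col)
{
    uint32_t eax = 80;                               /* mov eax,80 */
    uint16_t ax = (uint16_t)((eax & 0xFFu) * row);   /* mul r/m8: AX = AL * operand */
    eax = (eax & 0xFFFF0000u) | ax;                  /* only AX is written; bits 16..31 stay 0 */
    uint32_t edx = col;                              /* movzx edx,byte [col] */
    return eax + edx;                                /* add eax,edx */
}

/* Asserts that the model equals row*80+col for every byte-sized input. */
void check_update_pos(void)
{
    for (unsigned row = 0; row < 256; row++)
        for (unsigned col = 0; col < 256; col++)
            assert(update_pos_model((uint8_t)row, (uint8_t)col) ==
                   row * 80u + col);
}
```

The largest possible result is 255*80 + 255 = 20655, which fits in 16 bits, so the 8-bit multiply cannot overflow AX for these inputs.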
edfed 27 Apr 2010, 15:51
Why pushad? There are only EAX and EDX to save in the first case, and only EAX in the second.
pushad is only one instruction, but it uses a lot of cycles for nothing in these cases.
Fanael 27 Apr 2010, 16:05
There's no need to preserve anything at all, EAX and EDX can be clobbered according to most calling conventions (if you're not using one of these, then ignore this post).
LocoDelAssembly 27 Apr 2010, 17:08
Quote:

Why pushad? There are only EAX and EDX to save in the first case, and only EAX in the second.
pushad is only one instruction, but it uses a lot of cycles for nothing in these cases.

Yep, I just copy&pasted the code without paying any attention to the prologue and epilogue; the second should be "push eax"/"pop eax". I wanted to optimize for size*, since I guessed this function is probably not used much and it is best to save the code cache for more important stuff. "push eax edx" takes more space than pushad.

*Some tricks could be applied to further reduce size if the three variables are known to be near each other. Perhaps BCD tricks can do something here too?