flat assembler
Message board for the users of flat assembler.

M$'s magic touch to performance
DustWolf
I was just digging around my system DLLs, when I found this profound example of inefficiency:
Code:
OR ECX,FFFFFFFF
NOT ECX
CMP ECX,0FFFF    

This code appears in many procedures of NTDLL.DLL, which contains the kernel functions for often-executed code like string creation, memory copying, etc.

Any guesses what it's for? There are no jumps pointing anywhere in between those opcodes, and I find it hard to believe that anything in C calling itself an optimizer could be forced to admit that this is its legitimate output. :P

Somebody at M$ is writing bad code on purpose... but then you already knew that. :P

</rant>
Post 27 Aug 2006, 23:02
vid
I believe that is the product of some macros that test for >1000h (to tell a handle from a pointer), used with a -1 constant somehow... Anyway, it's a nice proof that compilers can beat humans at producing all that code, because a human cannot focus on 15 things at a time and can't remember the exact timing of every instruction...
Post 27 Aug 2006, 23:12
LocoDelAssembly
vid, but ECX is always zero at "cmp ecx, 0ffff". Maybe that code sets the FLAGS for something?

DustWolf, can you post some code that uses that sequence?
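The "ECX is always zero" observation can be checked by modelling the three quoted instructions with 32-bit masking. This is only a sketch in Python, not anything taken from NTDLL; the helper name `or_not_cmp` is made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def or_not_cmp(ecx):
    """Model OR ECX,FFFFFFFF / NOT ECX / CMP ECX,0FFFF for any starting ECX.

    Returns (ecx, zf, cf) as they would stand after the CMP.
    """
    ecx = (ecx | 0xFFFFFFFF) & MASK32   # or ecx, FFFFFFFF -> always FFFFFFFF
    ecx = ~ecx & MASK32                 # not ecx          -> always 0
    zf = ecx == 0xFFFF                  # CMP sets ZF when the operands are equal
    cf = ecx < 0xFFFF                   # CMP sets CF on an unsigned borrow
    return ecx, zf, cf
```

Whatever ECX held before, the model ends with ECX = 0, ZF clear and CF set, so the comparison only makes sense if some instruction between the OR and the NOT changes ECX.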
Post 27 Aug 2006, 23:16
DustWolf
locodelassembly wrote:
vid, but ECX is always zero at "cmp ecx, 0ffff". Maybe that code sets the FLAGS for something?

DustWolf, can you post some code that uses that sequence?


Wouldn't I be breaching a copyright if I did that?

I would suggest loading your NTDLL.DLL in Olly and looking at the procedure code for string initialization. It's pretty obvious and near the top. :P
Post 27 Aug 2006, 23:23
LocoDelAssembly
I searched NTDLL.DLL with W32Dasm but I can't find any occurrence of that sequence. Can you at least send me a private message? I don't work for Microsoft, so you can trust me :D

Regards
Post 27 Aug 2006, 23:42
f0dder
Which Windows version?
Is this alignment stuff or does it happen inside a procedure?
Post 28 Aug 2006, 12:30
LocoDelAssembly
DustWolf, that exact sequence doesn't appear in your PM; you left out the "REPNE SCAS BYTE PTR ES:[EDI]", which changes everything.

The exact sequence is:

Code:
.
.
.
:7C9112AF 83C9FF                  or ecx, FFFFFFFF
:7C9112B2 33C0                    xor eax, eax
:7C9112B4 F2                      repnz
:7C9112B5 AE                      scasb
:7C9112B6 F7D1                    not ecx
:7C9112B8 81F9FFFF0000            cmp ecx, 0000FFFF
:7C9112BE 7605                    jbe 7C9112C5
:7C9112C0 B9FFFF0000              mov ecx, 0000FFFF
.
.
.
    


That is from RtlInitAnsiString. That code scans for the NULL char, and if it is found in less than $FFFF bytes then ECX is set to $FFFF. The OR is just doing "mov ecx, $FFFFFFFF" in fewer bytes, and the NOT is needed to determine how many iterations "repnz scasb" did. The NOT can't be removed because not only does CMP use ECX, but some later instructions access memory using ECX as displacement.
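The points about the OR and the NOT can be checked with a few lines of Python. The byte strings are the encodings visible in the listing (`83C9FF` for the OR, and `B9` followed by a 32-bit immediate for `mov ecx, imm32`, as in the `mov ecx, 0000FFFF` line). The function name `count_after_not` is only illustrative.

```python
MASK32 = 0xFFFFFFFF

OR_ECX_M1 = bytes.fromhex("83C9FF")        # or ecx, FFFFFFFF (from the listing)
MOV_ECX_M1 = bytes.fromhex("B9FFFFFFFF")   # mov ecx, FFFFFFFF (B9 + imm32)

def count_after_not(iterations):
    """ECX starts at FFFFFFFF and repnz scasb decrements it once per byte
    scanned; NOT turns what is left back into the iteration count."""
    ecx = (MASK32 - iterations) & MASK32   # ECX after the scan
    return ~ecx & MASK32                   # not ecx
```

The OR form is 3 bytes against 5 for the MOV, and `count_after_not(n)` returns `n`, which is why the NOT recovers the number of bytes scanned (terminator included).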

Regards
Post 28 Aug 2006, 12:43
Maverick
Is it before a loop? :D

Remember padding..

_________________
Greets,
Fabio
Post 28 Aug 2006, 12:44
LocoDelAssembly
Quote:

The NOT can't be removed because not only does CMP use ECX, but some later instructions access memory using ECX as displacement.


Sorry, I'm wrong: ECX isn't used as a displacement, ECX is the operand stored in memory :P
Post 28 Aug 2006, 12:55
LocoDelAssembly
Quote:
That code scans for the NULL char, and if it is found in less than $FFFF bytes then ECX is set to $FFFF.


That's wrong too :S What's happening to me today??

If the NULL char is found after $FFFF or more iterations, then ECX is set to $FFFF.
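The corrected behaviour can be sketched as a small Python model of the listed sequence. It is an illustration only: the function name `init_length` is invented here, a Python index stands in for EDI, and the buffer is assumed to contain a 0 byte.

```python
MASK32 = 0xFFFFFFFF

def init_length(buf):
    """Model or/xor/repnz scasb/not/cmp/jbe/mov; assumes buf contains a 0 byte."""
    ecx = MASK32                   # or ecx, FFFFFFFF
    edi = 0                        # index standing in for EDI
    while ecx != 0:                # repnz stops when ECX runs out...
        byte = buf[edi]            # scasb: compare AL (0) with the byte at [edi]
        edi += 1
        ecx = (ecx - 1) & MASK32
        if byte == 0:              # ...or when the compare sets ZF
            break
    ecx = ~ecx & MASK32            # not ecx -> bytes scanned, terminator included
    if ecx > 0xFFFF:               # cmp ecx, 0000FFFF / jbe
        ecx = 0xFFFF               # mov ecx, 0000FFFF
    return ecx
```

In this model `init_length(b"ABC\0")` gives 4 and an empty string gives 1, while any string whose terminator is reached after $FFFF or more iterations gives $FFFF.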
Post 28 Aug 2006, 14:33
Madis731
:D :D :D
Code:
    lea ecx,[esi-4]
  nextDWORD:
    add ecx,4
    mov eax,[ecx]
    mov edx,eax
    not eax
    sub edx,01010101h
    and eax,80808080h
    and eax,edx
    je  nextDWORD
    sub ecx,1
    shr eax,8
    adc ecx,1
    shr eax,8
    adc ecx,1
    shr eax,8
    adc ecx,1
    shr eax,8
    adc ecx,1
    sub ecx,esi
    cmp ecx,0FFFFh
    jbe @f
    mov ecx,0FFFFh
  @@:
    

I love the optimizations you can do on the last three lines.
Code:
    movzx ecx,word[$-4] ; :D
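The `nextDWORD` loop in the first listing of this post tests four bytes per iteration. A minimal Python sketch of just that test follows; the name `zero_byte_flags` is hypothetical and only the five arithmetic instructions are modelled.

```python
MASK32 = 0xFFFFFFFF

def zero_byte_flags(dword):
    """Mirror of: mov edx,eax / not eax / sub edx,01010101h /
    and eax,80808080h / and eax,edx.
    The result is nonzero exactly when some byte of dword is 0."""
    edx = (dword - 0x01010101) & MASK32   # sub edx,01010101h
    eax = ~dword & MASK32                 # not eax
    eax &= 0x80808080                     # and eax,80808080h
    eax &= edx                            # and eax,edx
    return eax
```

The lowest set flag marks the first zero byte, but flags above it can be spurious because the subtraction borrows across bytes: 00000100h yields 80808080h even though its second byte is 01h.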
    
Post 29 Aug 2006, 16:46
LocoDelAssembly
Code:
    sub ecx,1 
    shr eax,8 
    adc ecx,1 
    shr eax,8 
    adc ecx,1 
    shr eax,8 
    adc ecx,1 
    shr eax,8 
    adc ecx,1 
    sub ecx,esi    


Is that reliable? What if the string is something like below?
Code:
 string db "ABCD", 0, "DEF"    


Quote:
I love the optimizations you can do on the last three lines.
Code:
movzx ecx,word[$-4] ; :D


Sorry, I don't get it :P Can you write the resulting code of that optimization?

Regards

[edit] Well I verified it, ECX is 8 with that string, and 4 if it is ["ABC", 0, "DEF"] or ["AB", 0, "DEF"] or ["A", 0, "DEF"] [/edit]

[edit2]
Code:
format PE GUI 4.0

    mov esi, string

    lea ecx,[esi-4]
  nextDWORD: 
    add ecx,4 
    mov eax,[ecx] 
    mov edx,eax 
    not eax 
    sub edx,01010101h 
    and eax,80808080h 
    and eax,edx 
    je  nextDWORD 
    sub ecx,1 
    shr eax,8 
    adc ecx,1 
    shr eax,8 
    adc ecx,1 
    shr eax,8 
    adc ecx,1 
    shr eax,8 
    adc ecx,1 
    sub ecx,esi 
    cmp ecx,0FFFFh 
    jbe @f 
    mov ecx,0FFFFh 
  @@: 
    int3 ; ECX = 7 but it should be 0 (or 1 if you want to emulate rep scas)

string:
  times 256 db 0    
[/edit2]
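The values reported in the edits above can be reproduced with a Python model of the posted routine. This is a sketch under assumptions: ESI is taken as offset 0 into the buffer, the buffer is padded with zero bytes so the dword reads stay in range, and `madis_length` is an invented name.

```python
import struct

def madis_length(buf):
    """Model the dword-at-a-time routine from the previous post."""
    data = buf + b"\0" * 4                    # padding keeps dword reads in range
    ecx = -4                                  # lea ecx,[esi-4] with ESI = 0
    while True:
        ecx += 4                              # add ecx,4
        (x,) = struct.unpack_from("<I", data, ecx)   # mov eax,[ecx]
        eax = ~x & 0x80808080 & ((x - 0x01010101) & 0xFFFFFFFF)
        if eax != 0:                          # je nextDWORD not taken
            break
    ecx -= 1                                  # sub ecx,1
    for _ in range(4):
        cf = (eax >> 7) & 1                   # shr eax,8 leaves the old bit 7 in CF
        eax >>= 8
        ecx += 1 + cf                         # adc ecx,1
    return min(ecx, 0xFFFF)                   # sub ecx,esi (ESI = 0) and the clamp
```

In this model the four SHR/ADC pairs always add 3 plus the number of set flag bits to the dword's offset, instead of locating the lowest flagged byte, which would explain why the result is 8, 4, 4, 4 and 7 for the test strings rather than 4, 3, 2, 1 and 0.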
Post 29 Aug 2006, 17:33
Copyright © 1999-2020, Tomasz Grysztar.