flat assembler
Message board for the users of flat assembler.

Curiosity with VS2008
kohlrak



Joined: 21 Jul 2006
Posts: 1421
Location: Uncle Sam's Pad
ouadji wrote:

serfasm wrote:
Code:
...... (A)
00D20AA1    8BD9            MOV EBX,ECX
00D20AA3    8BCB            MOV ECX,EBX ; (B)
    

that is already curious

I can not stand this kind of thing.
Probably remnants of older versions, uncleaned, not optimized.
how horrible ! It's the kind of code that prevents me from sleeping.

edit: the only explanation ...
a jump from another portion of the code onto the second opcode.
1) falling through from "A": effectively mov ecx,ecx ; doesn't change "ecx"
2) jumping to "B" (from anywhere) ... mov ecx,ebx


Sadly I've seen this sort of thing out of GCC as well (hm...). No wonder Java does so well... Anyway, I don't think the compiler's actually smart enough to reuse code like that. Heck, half the time I've even seen compilers ignore the good ol' inline hint.
Post 13 Mar 2010, 00:16
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17278
Location: In your JS exploiting you and your system
godomega wrote:
It's sad how the JIT wastes its full potential. I mean, it doesn't dare to touch any SSE instructions. They have no good reason to avoid them: they create code DYNAMICALLY, which means they can check which instruction sets are available and could even make the code completely processor-specific if they wanted to.
You make it sound so easy to vectorise code! Even humans have trouble vectorising code, so how can we expect to program computers to do the job if we can't even do it ourselves?
Post 13 Mar 2010, 01:26
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
Well, I've tried with this code:
Code:
Option Strict On

Imports System.Runtime.InteropServices

Class CustomStream
   Private pos As Integer
   Private storage() As Byte

   Public Function ReadByte() As Integer

      Return If(pos < storage.Length, storage(pos), -1)
   End Function

   Public Sub New(ByVal size As Integer)

      storage = New Byte(size - 1) {} ' In VB.NET the constructor specifies upper bound index, not size
   End Sub
End Class

Class RecordReader
   Public Shared Function ReadDouble(ByVal stream As CustomStream) As Double
      Dim result As Int64

      result = stream.ReadByte()
      result = result Or CLng(stream.ReadByte()) << 8
      result = result Or CLng(stream.ReadByte()) << 16
      result = result Or CLng(stream.ReadByte()) << 24
      result = result Or CLng(stream.ReadByte()) << 32
      result = result Or CLng(stream.ReadByte()) << 40
      result = result Or CLng(stream.ReadByte()) << 48
      result = result Or CLng(stream.ReadByte()) << 56

      Return BitConverter.Int64BitsToDouble(result)
   End Function

End Class

Module Module1
   <DllImport("KERNEL32.DLL", EntryPoint:="DebugBreak", _
        SetLastError:=False, CharSet:=CharSet.Unicode, _
        ExactSpelling:=True, _
        CallingConvention:=CallingConvention.StdCall)> _
        Public Sub DebugBreak()
   End Sub

   Const STREAM_SIZE As Integer = 1024 * 1024 * 16

   Sub Main()
      Dim stream As New CustomStream(STREAM_SIZE)
      Dim doubles(STREAM_SIZE \ 8 - 1) As Double
      Dim output(1) As Double

      For i As Integer = 0 To STREAM_SIZE \ 8 - 1
         ' Also tried "doubles.Length - 1" but the compiler still checks bounds per iteration...
         ' (and "i" was compared against memory instead of an immediate)

         doubles(i) = RecordReader.ReadDouble(stream)
      Next

      ' Consume values to prevent possible unwanted optimizations above
      For i As Integer = 0 To STREAM_SIZE \ 8 - 1 Step 2

         output(0) += doubles(i) : output(1) += doubles(i + 1) ' Will it be SSE2's ADDPD?
      Next

      DebugBreak()
      Console.WriteLine(output(0) + output(1)) ' Should be 0.0
   End Sub

End Module    


ReadDouble compiled to the same code as before, so I'll copy only Module1.Main:
Code:
00D10090   55               PUSH EBP
00D10091   8BEC             MOV EBP,ESP
00D10093   57               PUSH EDI
00D10094   56               PUSH ESI
00D10095   53               PUSH EBX
00D10096   83EC 0C          SUB ESP,0C
00D10099   B9 18319800      MOV ECX,983118
00D1009E   E8 791FC6FF      CALL 0097201C
00D100A3   8945 F0          MOV DWORD PTR SS:[EBP-10],EAX
00D100A6   8B4D F0          MOV ECX,DWORD PTR SS:[EBP-10]
00D100A9   BA 00000001      MOV EDX,1000000
00D100AE   FF15 08319800    CALL DWORD PTR DS:[983108]
00D100B4   BA 00002000      MOV EDX,200000
00D100B9   B9 4E591179      MOV ECX,7911594E
00D100BE   E8 CD21C6FF      CALL 00972290
00D100C3   8BF8             MOV EDI,EAX
00D100C5   BA 02000000      MOV EDX,2
00D100CA   B9 4E591179      MOV ECX,7911594E
00D100CF   E8 BC21C6FF      CALL 00972290
00D100D4   8BF0             MOV ESI,EAX                              ; First for-loop setup
00D100D6   33DB             XOR EBX,EBX
00D100D8   8B4D F0          MOV ECX,DWORD PTR SS:[EBP-10]            ; First for-loop
00D100DB   FF15 78319800    CALL DWORD PTR DS:[983178]               ; It goes directly to RecordReader.ReadDouble except for the first time.
00D100E1   3B5F 04          CMP EBX,DWORD PTR DS:[EDI+4]             ; Out of bounds check
00D100E4   0F83 85000000    JNB 00D1016F
00D100EA   DD5CDF 08        FSTP QWORD PTR DS:[EDI+EBX*8+8]
00D100EE   83C3 01          ADD EBX,1
00D100F1   70 77            JO SHORT 00D1016A                        ; Overflow check for the "i" variable. No comments...
00D100F3   81FB FFFF1F00    CMP EBX,1FFFFF
00D100F9  ^7E DD            JLE SHORT 00D100D8                       ; End of first for-loop
00D100FB   33C9             XOR ECX,ECX                              ; Second for-loop setup. No SSE :(
00D100FD   8B56 04          MOV EDX,DWORD PTR DS:[ESI+4]
00D10100   83FA 00          CMP EDX,0                                ; Second for-loop; Out of bounds check for output(0)
00D10103   76 6A            JBE SHORT 00D1016F
00D10105   DD46 08          FLD QWORD PTR DS:[ESI+8]
00D10108   3B4F 04          CMP ECX,DWORD PTR DS:[EDI+4]             ; Out of bounds check for doubles(i)
00D1010B   73 62            JNB SHORT 00D1016F
00D1010D   DC44CF 08        FADD QWORD PTR DS:[EDI+ECX*8+8]
00D10111   DD5E 08          FSTP QWORD PTR DS:[ESI+8]
00D10114   83FA 01          CMP EDX,1                                ; Out of bounds check for output(1)
00D10117   76 56            JBE SHORT 00D1016F
00D10119   DD46 10          FLD QWORD PTR DS:[ESI+10]
00D1011C   8BC1             MOV EAX,ECX
00D1011E   83C0 01          ADD EAX,1
00D10121   70 47            JO SHORT 00D1016A                        ; Overflow check of "i + 1"
00D10123   3B47 04          CMP EAX,DWORD PTR DS:[EDI+4]             ; Out of bounds check for doubles(i + 1)
00D10126   73 47            JNB SHORT 00D1016F
00D10128   DC44C7 08        FADD QWORD PTR DS:[EDI+EAX*8+8]
00D1012C   DD5E 10          FSTP QWORD PTR DS:[ESI+10]
00D1012F   83C1 02          ADD ECX,2
00D10132   70 36            JO SHORT 00D1016A                        ; Overflow check of "i" variable...
00D10134   81F9 FFFF1F00    CMP ECX,1FFFFF
00D1013A  ^7E C4            JLE SHORT 00D10100
00D1013C   E8 43BFC7FF      CALL 0098C084                            ; DebugBreak() call
00D10141   DD46 08          FLD QWORD PTR DS:[ESI+8]
00D10144   DC46 10          FADD QWORD PTR DS:[ESI+10]
00D10147   DD5D E8          FSTP QWORD PTR SS:[EBP-18]
00D1014A   E8 A1D15D78      CALL mscorlib.792ED2F0
00D1014F   DD45 E8          FLD QWORD PTR SS:[EBP-18]
00D10152   83EC 08          SUB ESP,8
00D10155   DD1C24           FSTP QWORD PTR SS:[ESP]
00D10158   8BC8             MOV ECX,EAX
00D1015A   8B01             MOV EAX,DWORD PTR DS:[ECX]
00D1015C   FF90 D0000000    CALL DWORD PTR DS:[EAX+D0]
00D10162   8D65 F4          LEA ESP,DWORD PTR SS:[EBP-C]
00D10165   5B               POP EBX
00D10166   5E               POP ESI
00D10167   5F               POP EDI
00D10168   5D               POP EBP
00D10169   C3               RETN

    


Configuring the project with "Remove integer overflow checks" removes the overflow checks on the "i" variables, but array bounds are still checked.
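
To see which way a given build is configured, a small probe like the following might help. It is only a sketch: the module and the helper IncrementOverflows are made-up names, not part of the program above. With overflow checks on, the increment is compiled with an ".ovf" arithmetic instruction and throws; with the checks removed it silently wraps around.

```vbnet
Option Strict On

Module OverflowCheckSketch
   ' Hypothetical helper (not from the thread): returns True when "i + 1" raises
   ' OverflowException, which only happens when integer overflow checks are enabled.
   Function IncrementOverflows(ByVal i As Integer) As Boolean
      Try
         i += 1 ' Checked add when overflow checks are on, plain add when they are removed
         Return False
      Catch ex As OverflowException
         Return True
      End Try
   End Function

   Sub Main()
      Console.WriteLine(IncrementOverflows(41))               ' Cannot overflow in either configuration
      Console.WriteLine(IncrementOverflows(Integer.MaxValue)) ' Result depends on the project setting
   End Sub
End Module
```

The second line of output is the interesting one: it reports whether the "Remove integer overflow checks" option was in effect for that build.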

What is surprising is that the JIT recognizes that the output array's length doesn't change (it is loaded into EDX once, before the loop), and the constants "0" and "1" don't either, yet the checks are still performed on every iteration (the CMP EDX,{0,1} instructions). Note that these checks are not present in the MSIL; only the array access instructions are used.
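
One way to sidestep the output() checks, sketched here under assumptions rather than measured: keep the two sums in local variables and build the array only after the loop. SumPairs and the module name are hypothetical, and nothing here claims the JIT will also drop the checks on doubles(i); the point is only that an array that is never indexed inside the loop cannot cost a bounds check there.

```vbnet
Option Strict On

Module LocalAccumulatorSketch
   ' Hypothetical rewrite of the second loop in Main: the two running sums are locals,
   ' so the two-element result array is never indexed inside the loop body.
   Function SumPairs(ByVal doubles() As Double) As Double()
      Dim evenSum As Double = 0.0
      Dim oddSum As Double = 0.0

      ' Assumes doubles.Length is even, as in the test program above
      For i As Integer = 0 To doubles.Length - 2 Step 2
         evenSum += doubles(i)
         oddSum += doubles(i + 1)
      Next

      Return New Double() {evenSum, oddSum}
   End Function

   Sub Main()
      Dim output() As Double = SumPairs(New Double() {1.0, 2.0, 3.0, 4.0})
      Console.WriteLine(output(0) + output(1))
   End Sub
End Module
```

Whether this changes the generated code for the doubles() accesses would have to be checked in the debugger the same way as above.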
Code:
.method public static void  Main() cil managed
{
  .entrypoint
  .custom instance void [mscorlib]System.STAThreadAttribute::.ctor() = ( 01 00 00 00 ) 
  // Code size       122 (0x7a)
  .maxstack  6
  .locals init (float64[] V_0,
           float64[] V_1,
           class TestsPad.CustomStream V_2,
           int32 V_3,
           int32 V_4,
           int32 V_5)
  IL_0000:  ldc.i4     0x1000000
  IL_0005:  newobj     instance void TestsPad.CustomStream::.ctor(int32)
  IL_000a:  stloc.2
  IL_000b:  ldc.i4     0x200000
  IL_0010:  newarr     [mscorlib]System.Double
  IL_0015:  stloc.0
  IL_0016:  ldc.i4.2
  IL_0017:  newarr     [mscorlib]System.Double
  IL_001c:  stloc.1
// Second for-loop part ({ld|st}elem.r8 are the Double() array access)
  IL_0037:  ldloc.1
  IL_0038:  ldc.i4.0
  IL_0039:  stloc.s    V_5
  IL_003b:  ldloc.s    V_5
  IL_003d:  ldloc.1
  IL_003e:  ldloc.s    V_5
  IL_0040:  ldelem.r8
  IL_0041:  ldloc.0
  IL_0042:  ldloc.s    V_4
  IL_0044:  ldelem.r8
  IL_0045:  add
  IL_0046:  stelem.r8
  IL_0047:  ldloc.1
  IL_0048:  ldc.i4.1
  IL_0049:  stloc.s    V_5
  IL_004b:  ldloc.s    V_5
  IL_004d:  ldloc.1
  IL_004e:  ldloc.s    V_5
  IL_0050:  ldelem.r8
  IL_0051:  ldloc.0
  IL_0052:  ldloc.s    V_4
  IL_0054:  ldc.i4.1
  IL_0055:  add
  IL_0056:  ldelem.r8
  IL_0057:  add
  IL_0058:  stelem.r8
  IL_0059:  ldloc.s    V_4
  IL_005b:  ldc.i4.2
  IL_005c:  add
  IL_005d:  stloc.s    V_4
  IL_005f:  ldloc.s    V_4
  IL_0061:  ldc.i4     0x1fffff
  IL_0066:  ble.s      IL_0037    


Even using the relaxation didn't make a difference. IL DASM shows this:
Code:
     CustomAttribute #8 (0c000009)
       -------------------------------------------------------
             CustomAttribute Type: 0a000027
          CustomAttributeName: System.Runtime.CompilerServices.CompilationRelaxationsAttribute :: instance void .ctor(int32)
              Length: 8
               Value : 01 00 08 00 00 00 00 00                          >                <
               ctor args: (8)    
So I believe I marked the "assembly" correctly using "Dim instance As CompilationRelaxationsAttribute".

PS: And yes, even in CustomStream.ReadByte() the bounds check is performed, even though my own code already does it.

PS2: Remember, OllyDbg is not even running while the program runs; it only appears when I press the "Debug" button in the crash dialog after the process crashes due to the DebugBreak() call.

PS3: Forgot to mention, the MSIL code corresponds to the program with overflow checks disabled. But the only difference is that the arithmetic instructions don't contain ".ovf".
Post 13 Mar 2010, 02:47
f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
kohlrak wrote:
Heck, half the time I've even seen compilers ignore the good ol' inline hint.
Because it is just that - a hint. Some compilers have a "forceinline" if you really-really-really believe a function should be inlined.
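
For the .NET side of this thread, the closest analogue is probably MethodImplAttribute. The sketch below assumes .NET Framework 4.5 or later, where MethodImplOptions.AggressiveInlining was added, so it would not compile against the 3.5 runtime discussed here. Square, SquareNoInline and the module name are made up for illustration, and AggressiveInlining is still only a request to the JIT, whereas NoInlining is binding.

```vbnet
Option Strict On

Imports System.Runtime.CompilerServices

Module InlineHintSketch
   ' Asks the JIT to inline this method if possible (a request, not a guarantee)
   <MethodImpl(MethodImplOptions.AggressiveInlining)> _
   Function Square(ByVal x As Integer) As Integer
      Return x * x
   End Function

   ' Forbids inlining, which is handy when a call must stay visible in a disassembly
   <MethodImpl(MethodImplOptions.NoInlining)> _
   Function SquareNoInline(ByVal x As Integer) As Integer
      Return x * x
   End Function

   Sub Main()
      Console.WriteLine(Square(7) + SquareNoInline(7))
   End Sub
End Module
```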

LocoDelAssembly: sounds pretty sad :(

IMHO JIT'ing and managed code have a lot of potential, but they require JITers that are probably a lot better than what we have today... and there'd be a fair amount of profiling overhead the first few times code is run.

Post 13 Mar 2010, 13:34
LocoDelAssembly
In this article they claim they remove some checks. In the case of CustomStream.ReadByte() the redundant check could actually be considered OK because of threading issues (the class has no means to change the array, though), but the checks of doubles() and output() are clearly unneeded (and it is already stupid that the compiler checks for overflow of the "i" variables even though it could have ruled that out at compile time).

Well, maybe VB.NET apps are marked as "intentionally make unoptimal code" to push C# a little more. If that is the case I kinda agree with them, VB.NET should be eradicated, but unfortunately I need it for my job...

[edit]The article I linked to might be referring to .NET Framework 4.0; I'm using 3.5 here.[/edit]


Last edited by LocoDelAssembly on 13 Mar 2010, 17:58; edited 1 time in total
Post 13 Mar 2010, 16:19
revolution
LocoDelAssembly wrote:
Well, maybe VB.NET apps are marked as "intentionally make unoptimal code" to push C# a little more. If that is the case I kinda agree with them, VB.NET should be eradicated, but unfortunately I need it for my job...
Hehe, it's time you looked for a new job then.
Post 13 Mar 2010, 16:24
kohlrak



revolution wrote:
LocoDelAssembly wrote:
Well, maybe VB.NET apps are marked as "intentionally make unoptimal code" to push C# a little more. If that is the case I kinda agree with them, VB.NET should be eradicated, but unfortunately I need it for my job...
Hehe, it's time you looked for a new job then.


Eh, so long as he ain't affiliated with the bloatware, he should be fine.

Quote:
(and it is already stupid that the compiler checks for overflow of the "i" variables even though it could have ruled that out at compile time).


I had to make a program for my VB class (when I was taking it) and I wasn't allowed to use things that weren't already covered, so I tried to re-make the bitwise instructions. It pained me to align them to 31 bits instead of 32.

Quote:
Well, maybe VB.NET apps are marked as "intentionally make unoptimal code" to push C# a little more. If that is the case I kinda agree with them, VB.NET should be eradicated, but unfortunately I need it for my job...


Reminds me of the 2-byte boolean values... My teacher actually agreed with me once that maybe, just maybe, that was ridiculous. I had to explain to him first that chars are 1 byte, but after I did, it didn't take long for him to join me on that one. He still swears up and down, though, that there must be something optimal about it. Unfortunately, the problem has been around for a long time, which is why so few people even use it.

Quote:

Because it is just that - a hint. Some compilers have a "forceinline" if you really-really-really believe a function should be inlined.


I'll keep that in mind. Just last night someone informed me that C++ has something called "assert" as well, which apparently is another thing that is "too difficult" and "too pointless" for people learning C++ to know. And here I thought this one tutorial I read was pretty good, since it wasn't afraid to cover bitwise operations, pointers, and goto. Guess I was wrong, as it didn't teach me anything like that (it could be made with little effort, but it's less code to write per project).

Quote:
IMHO JIT'ing and managed code have a lot of potential, but they require JITers that are probably a lot better than what we have today... and there'd be a fair amount of profiling overhead the first few times code is run.


Compilers in general have the potential to be almost as great as we are, but they still haven't been perfected for the x86. I'm told they do a much better job on other archs, though, like ARM.
Post 13 Mar 2010, 18:46

Copyright © 1999-2020, Tomasz Grysztar.
