flat assembler
Message board for the users of flat assembler.

Index > Windows > Java faster than ASM!?

drhowarddrfine



Joined: 10 Jul 2007
Posts: 533
drhowarddrfine 05 Feb 2008, 15:42
I got into an argument with a guy over this and C#'s "optimization" during JIT. I just don't get it. Though Java may optimize the code during JIT, the optimizing process itself takes time which the hand-coded program would not need. I can see where JIT-optimized code could be faster, because a human may not consider every possible angle, but if the human did, how is it possible for a JIT-compiled version to be faster? That just doesn't make sense to me.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20299
Location: In your JS exploiting you and your system
revolution 05 Feb 2008, 15:55
If a human is either lazy or does not have the time to think about optimisation then JIT can (in theory) be faster. But for any time critical code it would be folly to just simply leave it up to JIT to optimise it and think it will give beautiful and perfect code, because it won't and it can't. Just the start up overhead means it could never catch a hand-optimised piece of code.

An interesting question is this: Since JIT programs are written by humans can they ever outperform a human at optimisation?
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 05 Feb 2008, 16:02
revolution wrote:
Just the start up overhead means it could never catch a hand-optimised piece of code.

Unless you opt for profiled optimization, you only need the startup overhead once, though. And you can optimize for the specific processor the application is installed on... instead of having 32/64bit versions, P4, AMD64, core2, ... - but JITs aren't there yet.

revolution wrote:
An interesting question is this: Since JIT programs are written by humans can they ever outperform a human at optimisation?

"It depends" - machines have the advantage that they don't get tired/bored, even when faced with extremely similar & repetitive tasks. So in theory, you could code a "perfect optimizer", which does NP-complete analysis of the problem, something a human would never complete.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20299
Location: In your JS exploiting you and your system
revolution 05 Feb 2008, 16:12
f0dder wrote:
... NP-complete analysis of the problem ...
I have no idea what that means. I know what the NP-complete class of problems means but "NP-complete analysis"?
drhowarddrfine



Joined: 10 Jul 2007
Posts: 533
drhowarddrfine 05 Feb 2008, 18:50
So this, then, erases any doubt that crept into my mind from my argument with that guy. I've never understood how anyone could claim such a thing since ALL code winds up as asm. It boils down to good code runs faster than not as good code but no higher level language can run faster than asm.
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 05 Feb 2008, 19:36
Just some general comments:

"div" is always slow, use shifts and simple arithmetic instead (if possible). Anyways, I think the smaller the operand (e.g. dividing AX by an 8-bit divisor instead of DX:AX by a 16-bit one), the faster it'll be.

Doesn't "nop" have special hardware circuitry to be fast or is it still slow like the atomic "xchg" (since it's basically same opcode as "xchg ax,ax")?

I think Pentium M is just a P3 w/ SSE2 (not a P4), so keep that in mind.

Pentium 4 is faster with "add ?,1" instead of "inc". Even GCC generates such (if you tell it to via -march=).
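
The "use shifts instead of div" advice above only applies directly when the divisor is a power of two. A minimal Java sketch of that case follows; the class and method names are made up for illustration and do not come from the thread. Note the signed variant needs a small bias, because an arithmetic shift rounds toward negative infinity while division rounds toward zero.

```java
final class ShiftDivide {
    // Non-negative n only: a logical right shift by 3 is the same as n / 8.
    static int divideBy8(int n) {
        return n >>> 3;
    }

    // Any int: add 7 to negative values first so the shift rounds toward zero,
    // matching what the / operator does.
    static int divideBy8Signed(int n) {
        int bias = (n >> 31) & 7;   // 7 when n is negative, 0 otherwise
        return (n + bias) >> 3;
    }

    public static void main(String[] args) {
        System.out.println(divideBy8(100) + " " + (100 / 8));
        System.out.println(divideBy8Signed(-9) + " " + (-9 / 8));
    }
}
```

A divisor such as 7 is not a power of two, so a plain shift cannot replace the division in the loop being benchmarked in this thread.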
itsnobody



Joined: 01 Feb 2008
Posts: 93
Location: Silver Spring, MD
itsnobody 06 Feb 2008, 00:30
drhowarddrfine wrote:
So this, then, erases any doubt that crept into my mind from my argument with that guy. I've never understood how anyone could claim such a thing since ALL code winds up as asm. It boils down to good code runs faster than not as good code but no higher level language can run faster than asm.


Not really Java is still faster, using JIT and some better memory management when you add the div instructions

Or perhaps the Java runtimes are tweaked specifically for hardware
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 06 Feb 2008, 00:46
Can you show the loop where you are using division in both languages?
drhowarddrfine



Joined: 10 Jul 2007
Posts: 533
drhowarddrfine 06 Feb 2008, 03:13
itsnobody wrote:

Not really Java is still faster, using JIT and some better memory management when you add the div instructions

Or perhaps the Java runtimes are tweaked specifically for hardware
Impossible I say! How can Java do something I can't in assembly? Are you saying Java can tweak for specific hardware but I cannot in assembly? Please show me any code where Java does something assembly cannot. You can't do it. Impossible!
itsnobody



Joined: 01 Feb 2008
Posts: 93
Location: Silver Spring, MD
itsnobody 06 Feb 2008, 04:00
drhowarddrfine wrote:
itsnobody wrote:

Not really Java is still faster, using JIT and some better memory management when you add the div instructions

Or perhaps the Java runtimes are tweaked specifically for hardware
Impossible I say! How can Java do something I can't in assembly? Are you saying Java can tweak for specific hardware but I cannot in assembly? Please show me any code where Java does something assembly cannot. You can't do it. Impossible!


Well I don't know how Java is so fast, ASM is definitely faster in theory

As for the instruction I added in it was:
Code:
mov eax,ecx
mov ebx,7
div ebx
    


Which in Java was:
Code:
nop = count/7;
    


I don't know why the ASM version is so much slower: the Java version takes around 2400 milliseconds, while the ASM version takes around 2700-2800 milliseconds.
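
Only the one Java line is shown here, so the rest of the Java program is unknown. As a rough sketch of what a comparable timing loop could look like, assuming the same 100000000 iterations and the same count-down as the assembly version: the class name DivLoop, the warm-up pass and the checksum are all assumptions, not the poster's code. The checksum is there because a result that is never read (like "nop" above) is something a JIT compiler is allowed to drop; whether that happened in the original test cannot be told from the post.

```java
final class DivLoop {
    // Sums count / 7 for count = iterations down to 1, so every quotient is used.
    static long run(int iterations) {
        long sum = 0;
        for (int count = iterations; count != 0; count--) {
            sum += count / 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        run(100_000_000);                       // warm-up pass, gives the JIT a chance to compile the loop
        long start = System.nanoTime();
        long sum = run(100_000_000);            // timed pass
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(elapsedMs + " milliseconds (checksum " + sum + ")");
    }
}
```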
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 06 Feb 2008, 04:05
Please, FULL code, it is important what is around. Also, your division code seems to not clear EDX, so you are actually doing a messy division.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20299
Location: In your JS exploiting you and your system
revolution 06 Feb 2008, 04:08
Your div code should be:
Code:
mov eax,ecx
xor edx,edx
mov ebx,7
div ebx    
Plus you can move the "mov ebx,7" out of the loop.
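
Why the "xor edx,edx" matters: the 32-bit DIV takes the 64-bit value EDX:EAX as its dividend, so whatever is left in EDX (here, the remainder of the previous iteration) becomes the high half of the number being divided. The following is a small Java model of that behaviour, written only to illustrate the point; DivModel and div32 are hypothetical names, and the exception merely stands in for the CPU's divide error when the quotient does not fit in 32 bits.

```java
final class DivModel {
    // Models unsigned "div r32": dividend is EDX:EAX, result is {quotient, remainder}.
    static long[] div32(long edx, long eax, long divisor) {
        long dividend = ((edx & 0xFFFFFFFFL) << 32) | (eax & 0xFFFFFFFFL);
        long d = divisor & 0xFFFFFFFFL;
        long q = Long.divideUnsigned(dividend, d);      // throws ArithmeticException if d == 0
        long r = Long.remainderUnsigned(dividend, d);
        if (Long.compareUnsigned(q, 0xFFFFFFFFL) > 0) {
            throw new ArithmeticException("quotient does not fit in 32 bits");
        }
        return new long[] { q, r };
    }

    public static void main(String[] args) {
        long[] clean = div32(0, 100, 7);   // EDX cleared: plain 100 / 7
        long[] stale = div32(2, 100, 7);   // EDX still holds an old remainder of 2
        System.out.println(clean[0] + " r " + clean[1]);
        System.out.println(stale[0] + " r " + stale[1]);
    }
}
```

With EDX cleared the model gives 100 / 7; with a stale 2 in EDX the dividend is 2*2^32 + 100 and the quotient is a completely different, much larger number.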
itsnobody



Joined: 01 Feb 2008
Posts: 93
Location: Silver Spring, MD
itsnobody 06 Feb 2008, 04:20
LocoDelAssembly wrote:
Please, FULL code, it is important what is around. Also, your division code seems to not clear EDX, so you are actually doing a messy division.


Yes that's right, after adding xor edx, edx it reduced down to 2449 milliseconds...though still slightly slower than Java by 20 milliseconds or so...

Ok, here's the full code:
Code:
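include 'win32ax.inc'

.data
    start_time dd 0          ; tick count taken just before the loop
    _output rb 20            ; buffer for the formatted result
.code

start:
            invoke MessageBox,NULL,"Click Ok to start","Speed Test",MB_OK
            invoke GetTickCount
            mov [start_time],eax
            mov ecx, 100000000       ; loop counter, also used as the dividend
            place:
                mov eax,ecx
                mov ebx,7
                xor edx,edx          ; DIV divides EDX:EAX, so the high half must be zero
                div ebx
                dec ecx
                jnz place
            invoke GetTickCount
            sub eax, [start_time]    ; elapsed milliseconds
            cinvoke wsprintf,_output,"%d milliseconds",eax   ; wsprintf is cdecl, so cinvoke cleans up the stack
            invoke MessageBox,NULL,_output,"Speed Test",MB_OK
            invoke ExitProcess,0
 .end start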
    
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 06 Feb 2008, 04:26
Code:
                mov ebx,7
            align 16
            place:
                mov eax,ecx
                xor edx,edx
                div ebx 
                dec ecx 
                jnz place 
    


Seems to give another little speed-up. And still, you could replace the DIV with a MUL using the reciprocal trick, since 7 is an odd number. It is also possible for even divisors, but normally it doesn't work for every dividend.
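
The post does not spell out which "reciprocal trick" is meant, so the following Java sketch shows two common readings, both as assumptions rather than as the poster's method. The first multiplies by a rounded-up 2^35 / 7 and shifts, which gives the quotient for any non-negative int. The second multiplies by the inverse of 7 modulo 2^32, which exists because 7 is odd but only yields the right answer when the dividend is an exact multiple of 7. The class, method and constant names are made up for the example.

```java
final class ReciprocalDivide {
    // ceil(2^35 / 7). 7 * MAGIC overshoots 2^35 by only 3, which is small enough
    // that the rounding error never reaches the quotient for a 32-bit dividend.
    static final long MAGIC = 0x124924925L;

    // n / 7 for non-negative n, using a multiply and a shift instead of a divide.
    // The product may exceed Long.MAX_VALUE but stays below 2^64, so the
    // unsigned shift >>> still sees the right bits.
    static int divideBy7(int n) {
        return (int) ((n * MAGIC) >>> 35);
    }

    // 7 * INVERSE == 1 (mod 2^32).
    static final int INVERSE = 0xB6DB6DB7;

    // Correct only when n is an exact multiple of 7; otherwise the result is garbage.
    static int exactDivideBy7(int n) {
        return n * INVERSE;
    }

    public static void main(String[] args) {
        System.out.println(divideBy7(100_000_000) + " " + (100_000_000 / 7));
        System.out.println(exactDivideBy7(49));
    }
}
```

In assembly the first form maps to a MUL followed by a shift of the high half of the product, which is generally much cheaper than a DIV on the processors discussed in this thread.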
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20299
Location: In your JS exploiting you and your system
revolution 06 Feb 2008, 04:28
Don't forget to align the loop to 16.
Code:
...
mov ecx,100000000
mov ebx,7
align 16
place:
...    
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 06 Feb 2008, 04:34
http://board.flatassembler.net/topic.php?p=39254#39254 <- Link about division; pay special attention to the link posted by Tomasz Grysztar
itsnobody



Joined: 01 Feb 2008
Posts: 93
Location: Silver Spring, MD
itsnobody 06 Feb 2008, 04:38
LocoDelAssembly wrote:
Code:
                mov ebx,7
            align 16
            place:
                mov eax,ecx
                xor edx,edx
                div ebx 
                dec ecx 
                jnz place 
    


Seems to give another little speed up. And still, you could change the DIV by a MUL using the reciprocal trick since 7 is an odd number. It is also possible for even divisors but normally doesn't work for any dividend.


It doesn't seem to speed things up much

Still, even with all this optimization, I only get it equal to Java; no run ever came out faster than Java...all this shows is that somehow Java is as fast as ASM
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20299
Location: In your JS exploiting you and your system
revolution 06 Feb 2008, 04:47
itsnobody wrote:
Still even with all this optimization, I get it equal to Java, no time ever came up to be faster than Java...all this shows is that some how Java is as fast as ASM
Then I guess Java is the perfect language if you need to do nothing really really fast.
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4624
Location: Argentina
LocoDelAssembly 06 Feb 2008, 04:51
For certain code both will be equally fast; that can happen. What cannot happen, however, is that assembly is inherently slower than Java. If your assembly is slower, that just means Java found a better CPU instruction sequence than you did, which in turn means that you have lower skills than an HLL compiler.
itsnobody



Joined: 01 Feb 2008
Posts: 93
Location: Silver Spring, MD
itsnobody 06 Feb 2008, 04:51
revolution wrote:
itsnobody wrote:
Still even with all this optimization, I get it equal to Java, no time ever came up to be faster than Java...all this shows is that some how Java is as fast as ASM
Then I guess Java is the perfect language if you need to do nothing really really fast.


Nah Java is still slower for loading up and graphics, but JIT must be amazing
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.

Website powered by rwasa.