flat assembler
Message board for the users of flat assembler.

online compiler assembly tricks for you

sylware



Joined: 23 Oct 2020
Posts: 461
Location: Marseille/France
sylware 17 Oct 2022, 11:42
https://godbolt.org/noscript

You don't have to deal with the HORRIBLE AND INSANE SDK of those compilers.
Furs



Joined: 04 Mar 2016
Posts: 2564
Furs 17 Oct 2022, 13:18
They added noscript page, cool.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20450
Location: In your JS exploiting you and your system
revolution 17 Oct 2022, 14:39
It works on my JS-free browser. That is good. :)
revolution 17 Oct 2022, 14:45
Haha, I tried the default "return num * num;" code for ARM gcc trunk, and got this monstrosity:
Code:
square(int):
        push    {r7}
        sub     sp, sp, #12
        add     r7, sp, #0
        str     r0, [r7, #4]
        ldr     r3, [r7, #4]
        mul     r3, r3, r3
        mov     r0, r3
        adds    r7, r7, #12
        mov     sp, r7
        ldr     r7, [sp], #4
        bx      lr    
Why not simply this?
Code:
square(int):
        mul     r0, r0, r0
        bx      lr

Any normal person wouldn't even make it a function. It is just a single instruction.
sylware 17 Oct 2022, 15:01
revolution wrote:
Any normal person wouldn't even make it a function. It is just a single instruction.


That makes me think of memcpy and memset, which on x86_64 are now accelerated using "rep movsX" and "rep stosX". They should therefore not live in libc but be generated inline by compilers.
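A small illustration of that point, as a sketch rather than a statement about any particular libc: when the size is a compile-time constant, GCC and Clang will typically expand memcpy and memset inline at -O2 instead of calling the library. The struct and function names below are made up for the example, and whether "rep movs"/"rep stos" or plain moves get emitted depends on the compiler version, target, and tuning.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 64-byte record, used only for this illustration. */
struct packet {
    unsigned char bytes[64];
};

/* Fixed-size copy: the length is known at compile time, so an optimizing
 * compiler is free to expand it inline rather than call memcpy. */
void packet_copy(struct packet *dst, const struct packet *src) {
    memcpy(dst, src, sizeof *dst);
}

/* Fixed-size clear, the memset counterpart. */
void packet_clear(struct packet *p) {
    memset(p, 0, sizeof *p);
}

/* Small self-check of the two helpers. */
void check_packet(void) {
    struct packet a, b;
    memset(&a, 0x5a, sizeof a);
    packet_copy(&b, &a);
    assert(memcmp(&a, &b, sizeof a) == 0);
    packet_clear(&b);
    assert(b.bytes[0] == 0 && b.bytes[63] == 0);
}
```

Compiling this with -O2 -S and looking for a "call memcpy" in the output is a quick way to see what a given compiler actually does.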
revolution 17 Oct 2022, 15:08
Are there still HLL coders out there claiming that compilers produce the best possible code, and that humans can't possibly do any better?

Just show them these results. :D
DimonSoft



Joined: 03 Mar 2010
Posts: 1228
Location: Belarus
DimonSoft 17 Oct 2022, 21:13
They’ll just say it’s a corner case. Religion and HLL-oriented propaganda are still things.
Furs 18 Oct 2022, 13:24
revolution wrote:
Haha, I tried the default "return num * num;" code for ARM gcc trunk, and got this monstrosity ...

I don't know about ARM, but did you enable optimizations? This would be trivial for a compiler.

i.e. pass -Ofast (or more typically -O2, or -O3) on the command line.
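For anyone reproducing this outside the online compiler, a minimal sketch of the experiment follows. The function is the square example from the posts above; the arm-none-eabi-gcc toolchain name and the square.c filename are assumptions, not something stated in the thread.

```c
#include <assert.h>

/* square.c: the function discussed above.
 *
 * Assumed invocations (the toolchain name is a guess; any ARM-targeting
 * GCC should behave similarly):
 *   arm-none-eabi-gcc -S square.c       no -O flag, i.e. unoptimized
 *   arm-none-eabi-gcc -O2 -S square.c   optimized
 * -S stops after code generation and writes the assembly to square.s.
 */
int square(int num) {
    return num * num;
}

/* Small self-check so the file can also be run on the host. */
void check_square(void) {
    assert(square(7) == 49);
    assert(square(-3) == 9);
}
```

Without an -O flag GCC keeps every variable in a stack slot, which is where the store and reload around the mul in the long listing come from.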
revolution 18 Oct 2022, 13:31
Furs wrote:
... did you enable optimizations? Since this would be trivial for a compiler.

i.e. pass -Ofast (or more typically -O2, or -O3) on the command line.
I didn't. But I tried all three of your suggestions just now, and all three produce exactly the two-line mul+bx code.
revolution 19 Oct 2022, 06:12
Using -O3, this output is suboptimal.

Input:
Code:
int square(int num) {
    return num ^ (num << num);
}    
Output:
Code:
square(int):
        lsl     r3, r0, r0
        eors    r0, r0, r3
        bx      lr    
Can be:
Code:
        eor     r0, r0, r0, lsl r0
        bx      lr    
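To sanity-check that the single instruction leaves the same value in r0, here is a small C model of both sequences. It is only a sketch: it models the register-controlled LSL as the ARM documentation describes it (bottom byte of the shift register, amounts of 32 or more give 0) and ignores the flags and the extra scratch register entirely. The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a register-controlled LSL: only the bottom byte of the shift
 * register is used, and amounts of 32 or more produce 0. */
uint32_t arm_lsl(uint32_t value, uint32_t rs) {
    uint32_t amount = rs & 0xffu;
    return amount >= 32 ? 0u : value << amount;
}

/* lsl r3, r0, r0 ; eors r0, r0, r3 */
uint32_t two_step(uint32_t r0) {
    uint32_t r3 = arm_lsl(r0, r0);
    return r0 ^ r3;
}

/* eor r0, r0, r0, lsl r0 */
uint32_t fused(uint32_t r0) {
    return r0 ^ arm_lsl(r0, r0);
}

/* Compare the two models over a range of inputs. */
void check_fused(void) {
    for (uint32_t x = 0; x < 1024; x++) {
        assert(two_step(x) == fused(x));
    }
    assert(fused(3) == 27u);   /* 3 ^ (3 << 3) = 3 ^ 24 */
}
```

The register result is identical by construction; what the fused form saves is one instruction and the scratch register r3.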
Furs 19 Oct 2022, 13:12
Sounds like a typical case of the compiler not recognizing a more complicated pattern. In the instruction-level passes (the RTL stages in GCC), optimizers are usually just pattern matching.

There are a lot of optimization passes before that (the GIMPLE stages in GCC) which handle overall code optimizations that are not specific to a target instruction set, such as removing redundant code and simplifying expressions.
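To make "just pattern matching" concrete, here is a toy peephole combiner in C. It is a sketch of the general idea only, not how GCC's RTL passes are implemented: the instruction encoding, the names, and the tmp_is_dead parameter (standing in for a real liveness analysis) are all invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction set: just enough to express the example. */
enum op { OP_NOP, OP_LSL, OP_EOR, OP_EOR_LSL };

struct insn {
    enum op op;
    int rd;   /* destination register */
    int rn;   /* first source */
    int rm;   /* second source (the shifted value for OP_EOR_LSL) */
    int rs;   /* shift-amount register, used by OP_EOR_LSL only */
};

/* Pattern:  lsl t, a, s ; eor d, x, t   ==>   eor d, x, a, lsl s
 * Only valid when t is not read afterwards; the caller asserts that via
 * tmp_is_dead because this sketch has no liveness analysis.
 * Returns the number of pairs that were fused. */
size_t combine_shift_eor(struct insn *code, size_t n, int tmp_is_dead) {
    size_t fused = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        struct insn *shift = &code[i];
        struct insn *use = &code[i + 1];
        if (tmp_is_dead &&
            shift->op == OP_LSL && use->op == OP_EOR &&
            use->rm == shift->rd && use->rn != shift->rd) {
            use->op = OP_EOR_LSL;
            use->rs = shift->rm;   /* shift amount comes from the lsl */
            use->rm = shift->rn;   /* value being shifted */
            shift->op = OP_NOP;    /* the separate shift is no longer needed */
            fused++;
        }
    }
    return fused;
}

/* The pair from the thread: lsl r3, r0, r0 ; eor r0, r0, r3 */
void check_combine(void) {
    struct insn code[2] = {
        { OP_LSL, 3, 0, 0, 0 },
        { OP_EOR, 0, 0, 3, 0 },
    };
    assert(combine_shift_eor(code, 2, 1) == 1);
    assert(code[0].op == OP_NOP);
    assert(code[1].op == OP_EOR_LSL);
}
```

A matcher like this only fires on the exact shapes it was written for, which is one plausible way a fixed-shift form can be recognized while a register-shift form is missed.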
revolution 19 Oct 2022, 13:31
It knows the instruction with a fixed shift:
Code:
    return num ^ (num << 1);    
Code:
        eor     r0, r0, r0, lsl #1
        bx      lr    
Note the change from eors to eor, so this output probably comes from a different part of the compiler's code.
revolution 25 Oct 2022, 11:57
This is also suboptimal.
Code:
int square(int num) {
    return (num << num) ^ 0xffffffff ;
}    
Code:
square(int):
        lsls    r0, r0, r0
        mvns    r0, r0
        bx      lr    
Can be
Code:
square(int):
        mvn     r0, r0, lsl r0
        bx      lr    
Also, excessive and unnecessary updates to the flags can use more power, draining the battery faster.

This is why your "smart"phone needs recharging every 10 minutes. Bad code generation. :P
Furs 25 Oct 2022, 13:03
revolution wrote:
This is why your "smart"phone needs recharging every 10 minutes. Bad code generation. :P
Facts.
revolution 26 Oct 2022, 05:38
It appears that ARM gcc starts in Thumb mode by default, so some of the output above might be explained by those instructions not being available in that mode.

Now, using the options -marm -O3, we can see this:
Code:
int square(int num) {
    return num * num > 0 ? num * num + 1 : num * num - 1 ;
}    
Code:
square(int):
        mul     r0, r0, r0
        cmp     r0, #0
        addne   r0, r0, #1
        mvneq   r0, #0
        bx      lr    
Can be
Code:
        muls    r0, r0, r0
        addne   r0, r0, #1
        mvneq   r0, #0
        bx      lr    
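A note on why both listings test only for zero: signed overflow is undefined in C, so the compiler may presumably assume num * num is never negative, which turns "> 0" into "!= 0" and makes the else branch the constant -1. The C sketch below (function names invented for the example) compares the source expression with what the conditional instructions compute, for squares that do not overflow.

```c
#include <assert.h>

/* The source expression from the post, applied to an already computed square. */
int select_from_source(int sq) {
    return sq > 0 ? sq + 1 : sq - 1;
}

/* What the ARM sequence does: it tests for zero only.
 *   muls  r0, r0, r0   sets Z if the product is zero
 *   addne r0, r0, #1   product != 0  ->  product + 1
 *   mvneq r0, #0       product == 0  ->  -1
 */
int select_like_asm(int sq) {
    return sq != 0 ? sq + 1 : -1;
}

/* Compare both forms for every square that fits comfortably in an int. */
void check_select(void) {
    for (int n = -1000; n <= 1000; n++) {
        int sq = n * n;
        assert(select_from_source(sq) == select_like_asm(sq));
    }
}
```

The two forms differ for negative inputs, which is exactly the case the compiler is allowed to rule out here.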