flat assembler
Message board for the users of flat assembler.

Online C/C++ -> assembler

redsock



Joined: 09 Oct 2009
Posts: 283
Location: Australia
Saw this on HN this morning during my news reading:

http://godbolt.org ... very useful IMO for seeing different compilers' ideas about optimisations, etc. (and an excellent follow-on/plaything for all of the various threads here on the board about HLLs/compilers being better than hand-coded assembly, etc.).

_________________
2 Ton Digital - https://2ton.com.au/
Post 15 Dec 2016, 20:57
Furs



Joined: 04 Mar 2016
Posts: 1172
At first I thought it only supports non-optimized output but then I saw the compiler options part...

BTW, to do 32-bit code in GCC just add -m32 if anyone is interested like me.
Post 16 Dec 2016, 13:34
Roman



Joined: 21 Apr 2012
Posts: 353
So many asm commands!

My example:
Code:
mov eax,Number
mov ebx,eax
imul ebx



That's all!
But thank you for this.
Sometimes it is useful to look at the output code from a C++ example.
Post 19 Dec 2016, 12:40
l4m2



Joined: 15 Jan 2015
Posts: 612
Code:
int sgn(int n) { return n<0?-1:n>0; }

O3:
Code:
cmp DWORD PTR [esp+0x4],0x0
jl 8048420 <sgn(int)+0x10>
setg al
movzx eax,al
ret
xchg ax,ax
mov eax,0xffffffff
ret

My solution:
Code:
cmp edi, 0
setg al
setl ah
sub al, ah
movsx eax,al
Code:
mov eax, esi
sar eax, 31     ; eax = -1 if n < 0, else 0
neg esi         ; CF = 1 if n != 0
adc eax, eax    ; eax = 2*eax + CF = -1, 0 or 1
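
For anyone who wants to try this in Compiler Explorer: the setg/setl/sub sequence above corresponds to a common branch-free C idiom. The following is only an illustrative sketch, not part of the original post, and the name sgn_branchless is made up for this example. Whether a given compiler and optimisation level actually emits setg/setl/sub for it is something to check in the tool.

```c
/* Reference version, as posted above. */
int sgn(int n) { return n < 0 ? -1 : n > 0; }

/* Branch-free variant (hypothetical name, for illustration only).
   Each comparison evaluates to 0 or 1, so the difference is -1, 0 or 1. */
int sgn_branchless(int n) { return (n > 0) - (n < 0); }
```

Both functions should return the same value for every int, so they can be pasted side by side and their generated code compared.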


Last edited by l4m2 on 25 Jan 2017, 16:26; edited 4 times in total
Post 25 Jan 2017, 13:07
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15859
Location: 162173 Ryugu
Your solution incorrectly uses OR instead of adding. So it is not the same; the FLAGS can be different.

Edit: Oh wait, the ZF flag is undefined after MUL so your code is totally wrong!

Edit2: And even if ZF was defined, you might find that because of the latency from MUL the final result might not be any "faster". But you would have to check it in your app to see if it makes a measurable difference.
Post 25 Jan 2017, 13:12
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 6862
Location: Kraków, Poland
Well, modern compilers are certainly already capable of compensating for some of the shortcomings of HLL syntax by creating optimizations similar in effect to what a programmer thinking in assembly might write. My classic example of such "assembly programmer thinking" was my snippet from the '90s that computed the sum of the proper divisors of a number. It looked, as far as I remember, like this:
Code:
sum_divisors:
; in: edi = number, out: eax = sum of proper divisors
        mov     ecx,1
        xor     esi,esi
    add_divisor:
        add     esi,ecx
    next_divisor:
        inc     ecx
        mov     eax,edi
        xor     edx,edx
        div     ecx
        cmp     eax,ecx
        jb      done
        je      square_root
        test    edx,edx
        jnz     next_divisor
        add     esi,eax
        jmp     add_divisor
    done:
        mov     eax,esi
        ret
    square_root:
        test    edx,edx
        jnz     done
        add     eax,esi
        ret
There are a couple of assembly-specific constructions there, like triple branching (two conditional jumps in a row after a single comparison) and the efficient use of both the quotient and the remainder obtained through a single division.
Out of curiosity I checked whether I'd be able to write C code that would get optimized to something similar, and what I got was often really close, though wildly varying between compilers and their versions. I think icc 16 made the fastest one (though not the nicest-looking):
Code:
sum_divisors(unsigned int):
        mov       esi, 1                                #2.20
        mov       ecx, esi                              #2.30
..B1.2:                         # Preds ..B1.3 ..B1.1
        inc       esi                                   #4.7
        mov       eax, edi                              #5.20
        xor       edx, edx                              #5.20
        div       esi                                   #5.20
        cmp       esi, eax                              #7.16
        je        ..B1.6        # Prob 20%              #7.16
        lea       r8d, DWORD PTR [rcx+rax]              #2.30
        add       r8d, esi                              #4.7
        test      edx, edx                              #13.9
        cmove     ecx, r8d                              #13.9
        cmp       esi, eax                              #15.18
        jb        ..B1.2        # Prob 82%              #15.18
..B1.5:                         # Preds ..B1.3 ..B1.6
        mov       eax, ecx                              #16.12
        ret                                             #16.12
..B1.6:                         # Preds ..B1.2
                                # Infreq
        test      edx, edx                              #8.15
        jne       ..B1.5        # Prob 50%              #8.15
        add       ecx, esi                              #8.29
        mov       eax, ecx                              #8.29
        ret                                             #8.29
It appears to be even slightly faster than mine, perhaps thanks to the use of CMOV. Note that my snippet was only naïvely optimized, not taking into account any instruction sequencing rules of modern processors, just simply demonstrating the methods of thinking in assembly and structuring the code accordingly. I still believe that for larger programs the assembly-programming mindset allows one to create efficient and beautiful code that would be very hard to obtain from compilers (and that this may pay off even when no processor-specific optimizations are applied). But I acknowledge that, at least at the level of functions like the one above, compilers are already capable of generating code that looks almost as nice.
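
The C source that was fed to the compilers is not shown in this post. As a rough sketch only, a C function following the same control flow as the hand-written sum_divisors routine might look like the following. It is a reconstruction from the assembly listing, not the code that was actually compiled, and the variable names are made up for illustration.

```c
/* Sum of proper divisors, mirroring the control flow of the hand-written
   assembly. Like that routine, it is meant for n >= 2; for n < 2 it
   simply returns 1. */
unsigned int sum_divisors(unsigned int n)
{
    unsigned int sum = 1;            /* 1 divides every n */
    for (unsigned int d = 2; ; d++) {
        unsigned int q = n / d;      /* a single DIV yields the quotient ... */
        unsigned int r = n % d;      /* ... and the remainder */
        if (q < d)                   /* d is past the square root: done */
            break;
        if (q == d) {                /* d is the integer square root */
            if (r == 0)
                sum += d;            /* perfect square: count the root once */
            break;
        }
        if (r == 0)
            sum += d + q;            /* d and n/d are both divisors */
    }
    return sum;
}
```

For example, sum_divisors(12) adds 1, then 2 + 6, then 3 + 4, and stops at d = 4 because the quotient 3 is already smaller than d, giving 16.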
Post 28 Jan 2017, 23:38
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15859
Location: 162173 Ryugu
As an aside: In ARM32 code this type of code is really easy to do nicely and concisely. All that jumping about can be eliminated with conditional predicates.

But for ARM64 code it is not so nice any more. It seems that the compilers were incapable of making good use of the ARM32 conditional predicates, so ARM decided to remove most of them from the 64-bit instruction set (since they were "never" used). It seems that hand-crafted assembly code is a dying art, and worse, the CPU instruction sets are being designed to match the HLL compilers' capabilities.
Post 29 Jan 2017, 00:12
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 6862
Location: Kraków, Poland
revolution wrote:
As an aside: In ARM32 code this type of code is really easy to do nicely and concisely. All that jumping about can be eliminated with conditional predicates.
And this is exactly what was always the most appealing to me in this architecture. So it's a pity that the potential was wasted.
Post 29 Jan 2017, 16:03
TheRaven



Joined: 22 Apr 2008
Posts: 87
Location: U.S.A.
Compilers get better all the time; paraphrasing Ozzy Osbourne, "moving forward in reverse" epitomizes the "newer compiler designs" philosophy, where they're working their way back toward assembler.

Kids today get spit out of college with half the story, and these simpleton viewpoints (more opinionated than anything else) are the result: "HLL compilers are better than assemblers", totally ignorant of the fact that compilers still assemble.

Assembler is more about the CPU as a development library (API) and opcode is not really all that optional.

GCC and Gas are garbage. Read someone's post that GCC puts out really optimized code and about fell out of my seat (that was funny sh!t right there).

My picks:
Clang is taking over with good reason.
FAsm is insanely powerful at such a tiny size and cross platform as hell.


But, yeah, that Compiler Explorer looks fun as hell. Nice post!
Post 07 Jun 2017, 21:58
Furs



Joined: 04 Mar 2016
Posts: 1172
The reason compilers don't optimize better than humans is because of the attitude of the developers. They don't find many "minor" improvements worth it, even if someone submits a patch, prefer "simpler" code or other bullshit like that and won't accept the patch because "makes code more complex for minor stuff" lmfao. And you wonder why hand-written asm is superior? Because developers don't prioritize optimizations.

But when you have a massive pile of minor improvements they tend to add up, even if one by itself won't be a "bottleneck" (it seems all bad programmers only think of easy solutions to bottlenecks). After what, 2 or 3 decades? They still can't do basic optimizations in some cases, it's laughable.


Also, GCC is bad and has developers with crappy/retarded attitudes but Clang is worse. I mean, someone patched an error message to be explicit like in GCC (GCC was superior there, back then), but it showed something like "const char* foo" which makes perfect sense.

However this one moron deliberately changed it to his so-called "correct" position of the asterisk "const char *foo" (nevermind that it's not even correct and he's full of shit). I mean Clang is even written in C++, where pointers are distinct types so it makes sense for the * to be near the type (just as you use it in casts). Who the fuck does he think he is? This is just an example, you know, even if insignificant. Clang is ran by even worse retards than GCC.

Also the name is shit.
Post 09 Jun 2017, 11:35
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15859
Location: 162173 Ryugu
Admittedly it is a hard task to have the dev express the intent of the code through an HLL and then have the compiler try to determine the intent from the HLL code to create good assembly code.

But another problem I see people complain about is that HLL compiler on its maximum optimisation level will occasionally prune code it incorrectly thinks isn't used or can't ever be reached. That creates quite a headache for the dev to try and figure what is going on and how to fix it. Crazy times.
Post 09 Jun 2017, 12:05
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 6862
Location: Kraków, Poland
revolution wrote:
But another problem I see people complain about is that HLL compiler on its maximum optimisation level will occasionally prune code it incorrectly thinks isn't used or can't ever be reached. That creates quite a headache for the dev to try and figure what is going on and how to fix it. Crazy times.
Crazy. Wouldn't that be considered an actual bug?
Post 09 Jun 2017, 12:11
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15859
Location: 162173 Ryugu
Yes, it is a bug. It is not new either.

Works on O1 and O2, fails on O3.
Post 09 Jun 2017, 12:12
Furs



Joined: 04 Mar 2016
Posts: 1172
revolution wrote:
Admittedly it is a hard task to have the dev express the intent of the code through an HLL and then have the compiler try to determine the intent from the HLL code to create good assembly code.
I'm talking about the dev of the compiler, not the one who wants optimized code from his HLL source code. When I said they want "simpler code", I didn't mean end-users who use the compiler, I meant the compiler itself!

People who say "compilers most likely do a better job than you at optimizing" while at the same time never compile with all optimization settings on (and even if they do, the compiler is dumb on purpose to keep its code (who cares?) simpler) or are advocating "debug experience" instead... simply disgust me.

They need to STFU in optimization-related topics since they obviously treat it as second rate and stop spreading bullshit about compilers optimizing better when it's not true and that's partly because of compiler developers who suck.
Post 09 Jun 2017, 16:43
l_inc



Joined: 23 Oct 2009
Posts: 881
revolution
Quote:
Yes, it is a bug. It is not new either.

Works on O1 and O2, fails on O3.

Compiler bugs aren't impossible of course and also happen to be triggered by more aggressive optimizations, but my experience with respect to C tells me that such situations are much more likely to occur because of a sloppily coded C program relying on undefined or unspecified behaviour. Most C coders (not to mention C++) just don't know the language well enough.
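
A small illustration of the kind of code meant here, as a sketch only (the function names are made up for this example). Signed integer overflow is undefined behaviour in C, so an optimizing compiler is allowed to assume it never happens and simplify the first check away, while the second check is well defined at every optimization level.

```c
#include <limits.h>

/* Relies on undefined behaviour: x + 1 overflows when x == INT_MAX.
   A compiler may assume that never happens and reduce this to "return 0",
   which is how such code can "work" at one -O level and "fail" at another. */
int next_would_overflow_ub(int x) { return x + 1 < x; }

/* Well defined: tests the boundary without performing the overflow. */
int next_would_overflow_ok(int x) { return x == INT_MAX; }
```

Comparing the generated code for both functions at -O0 and -O2 in Compiler Explorer is an easy way to see whether a particular compiler takes advantage of this.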

_________________
Faith is a superposition of knowledge and fallacy
Post 11 Jun 2017, 10:35
Furs



Joined: 04 Mar 2016
Posts: 1172
Yeah, a famous one is "strict aliasing", which refers to type-based alias analysis. GCC does have a workaround for this without having to disable that optimization, though. You can apply the may_alias attribute to a type, which forces the compiler to treat it as "any type" and thus as able to alias with anything (unless it can prove the accesses don't overlap, obviously). Classic example of reading a float's bits as int:
Code:
#include <stdint.h>

uint32_t foo(float f)
{
  /* bar is a uint32_t that is allowed to alias any other type (GCC extension) */
  typedef uint32_t __attribute__((__may_alias__)) bar;
  return *(bar*)(&f);
}
This wouldn't work otherwise, since the compiler assumes uint32_t and float can never alias (different types). (In this case it doesn't matter, since the function is too short, but once it is used and gets inlined into a larger function, it will matter.)
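
For comparison, a compiler-independent way to get the same bits without any attribute is to copy the bytes with memcpy, which does not violate the aliasing rules. This is a minimal sketch, assuming float is 32 bits wide; float_bits is a made-up name for this example.

```c
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Copy the object representation of f into an integer.
   Copying bytes with memcpy is allowed regardless of the types involved. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

On a target using IEEE 754 single precision, float_bits(1.0f) would be 0x3f800000. Whether the memcpy call is compiled down to a single register move can be checked in Compiler Explorer.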
Post 11 Jun 2017, 11:34
l_inc



Joined: 23 Oct 2009
Posts: 881
Furs
The more or less standard-compliant way of aliasing these is to have a union. But it's still implementation-defined, of course, because the standard does not define the binary format used for floating-point numbers.
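
A minimal sketch of the union approach, assuming a 32-bit float (float_bits_union is a made-up name for this example). Reading a member other than the one last written reinterprets the stored bytes in C (C99 and later); in C++ the same code reads an inactive member, which is undefined there.

```c
#include <stdint.h>

/* Type punning through a union: write the float member,
   then read the same bytes back through the integer member. */
uint32_t float_bits_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}
```

The resulting value still depends on the floating-point format of the target; with IEEE 754 single precision, float_bits_union(1.0f) would be 0x3f800000.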

_________________
Faith is a superposition of knowledge and fallacy
Post 11 Jun 2017, 11:41

Copyright © 2004-2018, Tomasz Grysztar.

Powered by rwasa.