flat assembler
Message board for the users of flat assembler.
Online C/C++ -> assembler
redsock 15 Dec 2016, 20:57
Saw this on HN this morning during my news reading:
http://godbolt.org ... very useful IMO for seeing different compilers' ideas about optimisations, etc. (and an excellent follow-on/plaything for all of the various threads here on the board about HLLs/compilers being better than hand-coded, etc).
Roman 19 Dec 2016, 12:40
So many asm commands!
My example:
Code:
mov eax,Number
mov ebx,eax
imul ebx
That's all! But thank you for this. Sometimes it is useful to look at the output code for a C++ example.
l4m2 25 Jan 2017, 13:07
Code:
int sgn(int n) {
    return n<0?-1:n>0;
}

O3:
Code:
cmp DWORD PTR [esp+0x4],0x0
jl 8048420 <sgn(int)+0x10>
setg al
movzx eax,al
ret
xchg ax,ax
mov eax,0xffffffff
ret

My solution:
Code:
cmp edi, 0
setg al         ; al = 1 if n > 0
setl ah         ; ah = 1 if n < 0
sub al, ah      ; al = (n > 0) - (n < 0)
movsx eax,al

Code:
mov eax, esi
sar eax, 31     ; arithmetic shift: eax = -1 if n < 0, else 0
neg esi         ; CF = 1 if n != 0
adc eax, eax    ; eax = 2*eax + CF

Last edited by l4m2 on 25 Jan 2017, 16:26; edited 4 times in total
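For comparison, the setg/setl/sub variant above has a direct C counterpart. The following is only a sketch: the function names are made up for illustration, and whether a given compiler turns it into the same branchless sequence is something to check on Compiler Explorer rather than assume.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Branchy reference, same expression as in the post. */
int sgn_ref(int n)
{
    return n < 0 ? -1 : n > 0;
}

/* Branchless form: each comparison evaluates to 0 or 1,
   so the difference is -1, 0 or 1 (mirrors setg / setl / sub). */
int sgn_branchless(int n)
{
    return (n > 0) - (n < 0);
}

/* Hypothetical self-check: both forms agree on a few samples. */
void sgn_check(void)
{
    const int samples[] = { INT_MIN, -7, -1, 0, 1, 42, INT_MAX };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(sgn_ref(samples[i]) == sgn_branchless(samples[i]));
}
```

Calling sgn_check() from a test program exercises both versions, including the INT_MIN and INT_MAX edge cases.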
revolution 25 Jan 2017, 13:12
Your solution incorrectly uses or instead of adding. So it is not the same; the FLAGS can be different.
Edit: Oh wait, the ZF flag is undefined after MUL, so your code is totally wrong!
Edit2: And even if ZF were defined, you might find that, because of the latency of MUL, the final result is not any "faster". But you would have to check it in your app to see if it makes a measurable difference.
Tomasz Grysztar 28 Jan 2017, 23:38
Well, modern compilers are certainly already capable of compensating for some of the shortcomings of the HLL syntax by creating optimizations similar in effect to what a programmer thinking in assembly may write. My classical example of such "assembly programmer thinking" was my snippet from the '90s that computed the sum of proper divisors of a number. It looked, as far as I remember, like this:
Code:
sum_divisors:
; in: edi = number, out: eax = sum of proper divisors
        mov     ecx,1
        xor     esi,esi
    add_divisor:
        add     esi,ecx
    next_divisor:
        inc     ecx
        mov     eax,edi
        xor     edx,edx
        div     ecx
        cmp     eax,ecx
        jb      done
        je      square_root
        test    edx,edx
        jnz     next_divisor
        add     esi,eax
        jmp     add_divisor
    done:
        mov     eax,esi
        ret
    square_root:
        test    edx,edx
        jnz     done
        add     eax,esi
        ret

Out of curiosity I checked if I'd be able to write C code that would get optimized to something similar, and what I got was often really close, though wildly varying between compilers and their versions. I think icc 16 made the fastest one (though not the nicest-looking):
Code:
sum_divisors(unsigned int):
        mov     esi, 1                          #2.20
        mov     ecx, esi                        #2.30
..B1.2:                                         # Preds ..B1.3 ..B1.1
        inc     esi                             #4.7
        mov     eax, edi                        #5.20
        xor     edx, edx                        #5.20
        div     esi                             #5.20
        cmp     esi, eax                        #7.16
        je      ..B1.6          # Prob 20%      #7.16
        lea     r8d, DWORD PTR [rcx+rax]        #2.30
        add     r8d, esi                        #4.7
        test    edx, edx                        #13.9
        cmove   ecx, r8d                        #13.9
        cmp     esi, eax                        #15.18
        jb      ..B1.2          # Prob 82%      #15.18
..B1.5:                                         # Preds ..B1.3 ..B1.6
        mov     eax, ecx                        #16.12
        ret                                     #16.12
..B1.6:                                         # Preds ..B1.2 # Infreq
        test    edx, edx                        #8.15
        jne     ..B1.5          # Prob 50%      #8.15
        add     ecx, esi                        #8.29
        mov     eax, ecx                        #8.29
        ret                                     #8.29
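The C source that was fed to the compilers is not shown in the post. A minimal sketch that follows the same control flow as the hand-written routine might look like the code below. The variable names and the self-check helper are assumptions made for illustration, and, like the assembly, this version returns 1 for the inputs 0 and 1.

```c
#include <assert.h>

/* Sum of proper divisors by trial division up to sqrt(n).
   Each divisor d <= sqrt(n) is paired with its quotient q = n / d. */
unsigned int sum_divisors(unsigned int n)
{
    unsigned int sum = 1;   /* 1 is counted up front, as in the assembly */
    unsigned int d = 1;

    for (;;) {
        ++d;
        unsigned int q = n / d;
        unsigned int r = n % d;

        if (q < d)          /* went past the square root: finished */
            break;
        if (q == d) {       /* d is the integer square root candidate */
            if (r == 0)
                sum += d;   /* perfect square: count the root once */
            break;
        }
        if (r == 0)
            sum += d + q;   /* count both members of the divisor pair */
    }
    return sum;
}

/* Hypothetical self-check on two perfect numbers. */
void sum_divisors_check(void)
{
    assert(sum_divisors(6) == 6);
    assert(sum_divisors(28) == 28);
}
```

Pasting a function like this into Compiler Explorer at different optimization levels is one way to reproduce the kind of comparison described above; the exact output will vary by compiler and version.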
revolution 29 Jan 2017, 00:12
As an aside: In ARM32 code this type of code is really easy to do nicely and concisely. All that jumping about can be eliminated with conditional predicates.
But for ARM64 code it is not so nice any more. It seems that the compilers were incapable of making good use of the ARM32 conditional predicates, so ARM decided to remove most of them from the 64-bit instruction set (since they were "never" used). It seems that hand-crafted assembly code is a dying art, and worse, the CPU instruction sets are being designed to match the HLL compilers' capabilities.
Tomasz Grysztar 29 Jan 2017, 16:03
revolution wrote: As an aside: In ARM32 code this type of code is really easy to do nicely and concisely. All that jumping about can be eliminated with conditional predicates.
TheRaven 07 Jun 2017, 21:58
Compilers get better all the time; paraphrasing Ozzy Osbourne, "moving forward in reverse" epitomizes the "newer compiler designs" philosophy, where they're working their way back toward assembler.
Kids today get spit out of college with half the story, and these simpleton viewpoints (more opinionated than anything else) are the result: "HLL compilers are better than assemblers", totally ignorant of the fact that compilers still assemble. Assembler is more about the CPU as a development library (API), and opcode is not really all that optional. GCC and Gas are garbage. Read someone's post that GCC puts out really optimized code and about fell out of my seat (that was funny sh!t right there).
My picks: Clang is taking over with good reason. FAsm is insanely powerful at such a tiny size and cross-platform as hell. But, yeah, that Compiler Explorer looks fun as hell. Nice post!
Furs 09 Jun 2017, 11:35
The reason compilers don't optimize better than humans is the attitude of the developers. They don't find many "minor" improvements worth it, even if someone submits a patch; they prefer "simpler" code or other bullshit like that and won't accept the patch because it "makes code more complex for minor stuff" lmfao. And you wonder why hand-written asm is superior? Because developers don't prioritize optimizations.
But when you have a massive pile of minor improvements they tend to add up, even if one by itself won't be a "bottleneck" (it seems all bad programmers only think of easy solutions to bottlenecks). After what, 2 or 3 decades? They still can't do basic optimizations in some cases, it's laughable.
Also, GCC is bad and has developers with crappy/retarded attitudes, but Clang is worse. I mean, someone patched an error message to be explicit like in GCC (GCC was superior there, back then), but it showed something like "const char* foo", which makes perfect sense. However, this one moron deliberately changed it to his so-called "correct" position of the asterisk, "const char *foo" (never mind that it's not even correct and he's full of shit). I mean, Clang is even written in C++, where pointers are distinct types, so it makes sense for the * to be near the type (just as you use it in casts). Who the fuck does he think he is? This is just an example, you know, even if insignificant. Clang is run by even worse retards than GCC. Also the name is shit.
revolution 09 Jun 2017, 12:05
Admittedly it is a hard task to have the dev express the intent of the code through an HLL and then have the compiler try to determine the intent from the HLL code to create good assembly code.
But another problem I see people complain about is that an HLL compiler at its maximum optimisation level will occasionally prune code it incorrectly thinks isn't used or can't ever be reached. That creates quite a headache for the dev, who has to figure out what is going on and how to fix it. Crazy times.
Tomasz Grysztar 09 Jun 2017, 12:11
revolution wrote: But another problem I see people complain about is that HLL compiler on its maximum optimisation level will occasionally prune code it incorrectly thinks isn't used or can't ever be reached. That creates quite a headache for the dev to try and figure what is going on and how to fix it. Crazy times.
revolution 09 Jun 2017, 12:12
Yes, it is a bug. It is not new either.
Works on O1 and O2, fails on O3.
Furs 09 Jun 2017, 16:43
revolution wrote: Admittedly it is a hard task to have the dev express the intent of the code through an HLL and then have the compiler try to determine the intent from the HLL code to create good assembly code.

People who say "compilers most likely do a better job than you at optimizing" while at the same time never compiling with all optimization settings on (and even if they do, the compiler is dumb on purpose to keep its code (who cares?) simpler), or who advocate "debug experience" instead... simply disgust me. They need to STFU in optimization-related topics, since they obviously treat it as second rate, and stop spreading bullshit about compilers optimizing better when it's not true, and that's partly because of compiler developers who suck.
l_inc 11 Jun 2017, 10:35
revolution
Quote: Yes, it is a bug. It is not new either.

Compiler bugs aren't impossible of course and also happen to be triggered by more aggressive optimizations, but my experience with respect to C tells me that such situations are much more likely to occur because of a sloppily coded C program relying on undefined or unspecified behaviour. Most C coders (not to mention C++) just don't know the language well enough.
_________________
Faith is a superposition of knowledge and fallacy
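A commonly cited instance of the situation l_inc describes is an overflow check that itself relies on signed overflow. The sketch below is an illustration chosen here, not an example from the thread, and the function names are hypothetical. The fragile form appears only in a comment; the functions show one well-defined way to express the same intent.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Fragile: signed overflow is undefined behaviour in C, so an
   optimizer is allowed to assume x + 1 never wraps, fold the
   condition to "false", and drop the guarded code:

       if (x + 1 < x) { ... handle overflow ... }
*/

/* Well-defined alternative: compare against the limit before adding. */
bool increment_overflows(int x)
{
    return x == INT_MAX;
}

/* Saturating increment built on the check above (illustrative helper). */
int saturating_increment(int x)
{
    return increment_overflows(x) ? INT_MAX : x + 1;
}

/* Hypothetical self-check. */
void overflow_check(void)
{
    assert(increment_overflows(INT_MAX));
    assert(saturating_increment(INT_MAX) == INT_MAX);
    assert(saturating_increment(41) == 42);
}
```

Because the second form never performs the overflowing addition, its behaviour should not depend on the optimization level.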
Furs 11 Jun 2017, 11:34
Yeah, a famous one is "strict aliasing", which refers to type-based aliasing detection. GCC does have a workaround for this without having to disable that optimization, though. You can apply the may_alias attribute to a type, which will force the compiler to treat it as "any type" and thus assume it can alias with anything (unless it can prove otherwise from the access range, obviously). Classic example of reading a float's bits as int:
Code:
#include <stdint.h>

uint32_t foo(float f)
{
    /* bar may alias any other type (GCC extension) */
    typedef uint32_t __attribute__((__may_alias__)) bar;
    return *(bar*)(&f);
}
l_inc 11 Jun 2017, 11:41
Furs
The more or less standard-compliant way of aliasing these is to have a union. But it's still implementation-defined, of course, because the standard does not define the binary format used for floating-point numbers.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.