flat assembler
Message board for the users of flat assembler.
Optimizing - is it true?
f0dder 02 Mar 2005, 12:43
I thought the branch hints basically execute as no-ops on older processors?
S.T.A.S. 02 Mar 2005, 22:30
I believe the 2E and 3E bytes take no additional time to execute because they are prefixes (and so are considered part of the opcode), not NOP equivalents.
f0dder 02 Mar 2005, 22:35
STAS, I meant "nop" in the sense that they shouldn't have any effect on CPUs that don't support them as hints. Since they're just CS: and DS: prefixes, they shouldn't take any additional time to execute. They do add a bit to code size, but that is negligible - you're probably only going to use the hints in time-critical code anyway.
S.T.A.S. 02 Mar 2005, 23:14
Oh, we're both talking about the same things
Ralph 04 Mar 2005, 19:07
Hey while we're on optimizing, I always wondered what would be faster, mov or an alu instruction. For example, which one is faster?
Code:
sub eax,eax
sub ebx,ebx
Code:
sub eax,eax
mov ebx,eax
I'm guessing the first one, because the mov depends on the sub before it. If that's the case, what happens if you space them out enough to eliminate the dependency?
Madis731 07 Mar 2005, 17:54
The mov is theoretically faster because it only copies 32 bits, whereas add/sub must also take care of up to 32 carries.
Ralph 07 Mar 2005, 23:55
That's a good point. I wasn't thinking about flags. Thanks.
S.T.A.S. 11 Mar 2005, 09:43
There's another point:
IA-32 Intel® Architecture Optimization Reference Manual wrote:
Use xor and pxor instructions to clear registers and break dependencies for integer operations
AMD Athlon™ Processor x86 Code Optimization Guide wrote:
To clear an integer register to all 0s, use “XOR reg, reg”. The AMD Athlon processor is able to avoid the false read dependency on the XOR instruction.
tom tobias 11 Mar 2005, 11:14
And here's another point. DON'T use Boolean operators to accomplish NON-BOOLEAN operations, IF you want others to understand your CODE, as opposed to having others able to read your PROGRAM.
The distinction between MOV and XOR is enormous, from a functional point of view. The 30 picoseconds saved by using XOR instead of MOV is IRRELEVANT. PRIORITY number one is, was, and always will be READABILITY. If your goal is to replace the current contents of a register with zeros, the proper way to do that is to MOV zeros into the register, NOT add, subtract, or implement even sillier mathematical manipulations, like exclusive or. Since many view computer science as a branch of MATHEMATICS, this debate will not be easily won by me, and I suppose the vast majority of those perusing this forum have no idea what the perceived issue is! So long as programmers work BY THEMSELVES, it makes no difference what one uses. BUT, on any collaborative endeavor, the BOTTLENECK is ALWAYS readability.
IronFelix 11 Mar 2005, 12:29
I think it is obvious to a GOOD assembler programmer that XOR EAX,EAX means MOV EAX,0. As for me, XOR EAX,EAX is clearer to read than MOV EAX,0, because the first instruction is faster and easy to understand.
Readability is important, of course, but you can't write the fastest program without optimization, which makes your code much less readable. Use comments - it really helps. And of course optimization must be performed AFTER the readable code is ready and works.
S.T.A.S. 11 Mar 2005, 12:41
tom tobias, yes, you're right: it's that good old story, "How to program in Pascal using a C compiler"...
However, does it have any relation to the art of assembly? And, by the way, I'm confused to read about XOR as a boolean operator; it's just straightforward arithmetic, isn't it?
Tomasz Grysztar 11 Mar 2005, 14:12
eXclusive OR is a boolean operator, the same as AND and OR are.
madmatt 11 Mar 2005, 14:42
Just my two cents' worth:
For AND, OR and XOR, the source operand's bits determine what gets changed in the destination operand's bits; NEG and NOT take just a single operand:
The AND operator uses the source bits to determine which destination bits get cleared: 1 = keep this bit the same, 0 = clear this bit.
The OR operator uses the source bits to determine which destination bits get set: 1 = set this bit, 0 = unchanged.
The XOR operator uses the source bits to determine which destination bits get "flipped": 1 = reverse this bit (0 becomes 1 and 1 becomes 0), 0 = unchanged.
The NEG operator simply reverses the sign of its operand (two's complement negation).
The NOT operator inverts every bit of its operand, which gives the same result as NEG minus one (NOT x = -x - 1).
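The mask behaviour described in this post can be modelled in C as a rough sketch, assuming 32-bit operands. The function names are hypothetical and not from the thread; each one stands in for the x86 instruction of the same name and ignores the flags the real instruction sets.

```c
#include <assert.h>
#include <stdint.h>

/* AND: a 0 bit in the mask clears the destination bit, a 1 bit keeps it. */
uint32_t apply_and(uint32_t dest, uint32_t mask) { return dest & mask; }

/* OR: a 1 bit in the mask sets the destination bit, a 0 bit leaves it alone. */
uint32_t apply_or(uint32_t dest, uint32_t mask) { return dest | mask; }

/* XOR: a 1 bit in the mask flips the destination bit, a 0 bit leaves it alone. */
uint32_t apply_xor(uint32_t dest, uint32_t mask) { return dest ^ mask; }

/* NEG: two's complement negation, computed modulo 2^32. */
uint32_t apply_neg(uint32_t x) { return (uint32_t)(0u - x); }

/* NOT: inverts every bit. */
uint32_t apply_not(uint32_t x) { return (uint32_t)~x; }

/* Hypothetical self-check: NOT x equals NEG x minus one for a few sample values. */
void check_not_vs_neg(void)
{
    const uint32_t samples[] = { 0u, 1u, 5u, 0x80000000u, 0xFFFFFFFFu };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(apply_not(samples[i]) == (uint32_t)(apply_neg(samples[i]) - 1u));
}
```

For instance, `apply_and(0xFF, 0x0F)` keeps only the low four bits, while `apply_xor(0xFF, 0x0F)` flips them and leaves the high four untouched.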
S.T.A.S. 11 Mar 2005, 15:16
XOR is a boolean operator when we speak about boolean arguments (i.e. bits), not integers.
All CPU commands like 'add', 'and', 'or', 'sub', 'xor' are executed by the Arithmetic Logic Unit. Bitwise operations are computed with the rules of boolean algebra. But in fact (that is, in hardware) arithmetical addition and subtraction are just combinations of some 'and' and 'or', so why don't we consider them boolean operators too?
Also, in HLLs there's a difference between bitwise and logical (boolean) operators:
1 & 2 = 0 (this is 'bitwise and'; it operates on integer types of data)
but:
1 && 2 = 1 (there's no direct x86 opcode; it's compiled into some Jcc/SETcc stuff)
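Both claims in this post can be tried out in C. The following is only a sketch with hypothetical function names: the first two functions contrast the bitwise and logical operators, and the third builds 32-bit addition out of nothing but XOR, AND, OR and shifts, one full adder per bit. It models the idea, not how any particular CPU wires its ALU.

```c
#include <stdint.h>

/* Bitwise AND works on each bit position of the integers independently. */
int bitwise_and(int a, int b) { return a & b; }

/* Logical AND treats each operand as a single truth value (zero or nonzero). */
int logical_and(int a, int b) { return a && b; }

/* Ripple-carry addition built only from XOR, AND, OR and shifts.
   The result wraps modulo 2^32, like a 32-bit ADD. */
uint32_t add_with_gates(uint32_t a, uint32_t b)
{
    uint32_t sum = 0, carry = 0;
    for (int i = 0; i < 32; ++i) {
        uint32_t x = (a >> i) & 1u;
        uint32_t y = (b >> i) & 1u;
        sum |= (x ^ y ^ carry) << i;          /* sum output of a full adder */
        carry = (x & y) | (carry & (x ^ y));  /* carry output of a full adder */
    }
    return sum;
}
```

`bitwise_and(1, 2)` and `logical_and(1, 2)` reproduce the two results quoted above, and `add_with_gates(5, 7)` gives the same value as `5 + 7`.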
Madis731 11 Mar 2005, 16:50
I think he meant that XOR uses at least 3 gates to accomplish the manipulation while mov uses only 1. Though when keeping in mind 32-bit processors and the 5-byte mov instruction, it is always two reads from memory, so too much gate logic for me, eh!?
My theory is that when you "align 4" and use "xor eax,eax \ xor ebx,ebx", each is two bytes and they issue in 1 clock because both take 1 µop (or one of the U/V pipes) and can fit in one read, WHILE "mov eax,0 \ mov ebx,0" are the only instructions not optimized for size (there is nothing like "BYTE mov eax,00h"), so there we have it: 10 bytes and 3 reads (best case) or 4 reads (worst case)... makes you think, doesn't it? XOR really is the most obvious optimization and it should be used unless there is a very good reason not to.
IMUL eax,eax,3 ;can be used, but only if you can hide 3+ clocks in the next few instructions.
LEA eax,[eax*3] ;is better, even if it holds you up with a one-clock penalty for an AGI stall.
Tomasz Grysztar 11 Mar 2005, 17:12
S.T.A.S. wrote:
The AND, OR, XOR and NOT are called "logical instructions" by Intel, and to quote Intel's manuals: they "perform the standard Boolean operations for which they are named". Bitwise, of course, since Intel processors do not use any way of encoding Boolean values other than single bits. But these instructions really form a different group than binary arithmetic or decimal arithmetic, which operate on different data encodings (x86 binary arithmetic operates on the two's complement encoding of integers, which could be different in other architectures, etc.).
S.T.A.S. 12 Mar 2005, 12:53
I've searched for the word 'Boolean' in these (latest?) Intel docs:
IA-32 Intel® Architecture Optimization Reference Manual (24896611.pdf)
IA-32 Intel® Architecture Software Developer’s Manual Instruction Set Reference (25366614.pdf, 25366714.pdf)
and found nothing. Only the System Programming Guide (25366814.pdf, Vol. 3, 7-31) contains the following string:
boolean MONITOR_MWAIT_works = TRUE;
I've read a few old books where XOR was called 'modulo-two congruence addition', with a nice functional diagram showing how to reduce a three-input adder (which executes the ADD command) to a modulo-two sum gate (which executes XOR).
Anyway, IMHO this is just a question of philosophy. The name 'ALU' itself symbolizes a very blurred edge between arithmetic and logic (they are the same for hardware, but humans are stuck with the old school).
Here's another example: ADD EAX,EAX does the same thing as SHL EAX,1 and SAL EAX,1 do, but these are all different instructions. BTW, SHL and SAL are called 'multiply by 2' (that is arithmetic), but in others' CPU manuals these are 'shifts' or even 'logical shifts'.
PS. Sorry for the off-topic here, but this is a rather interesting question, like 'why didn't ancient people know of negative numbers'. At first glance it's silly, but it helped me to understand subtraction.
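The two observations in this post, XOR as addition modulo two and ADD EAX,EAX matching SHL EAX,1, can be sketched in C under the assumption of 32-bit unsigned operands. The function names are hypothetical, and the sketch compares result values only, not the flags the instructions leave behind.

```c
#include <stdint.h>

/* XOR adds each bit position modulo 2, i.e. addition with every carry dropped. */
uint32_t add_without_carries(uint32_t a, uint32_t b) { return a ^ b; }

/* The dropped carries: positions where both bits are 1, moved one place left. */
uint32_t carries(uint32_t a, uint32_t b) { return (a & b) << 1; }

/* Counterpart of ADD EAX,EAX (wraps modulo 2^32). */
uint32_t double_by_add(uint32_t x) { return (uint32_t)(x + x); }

/* Counterpart of SHL EAX,1 (SAL EAX,1 is the same operation). */
uint32_t double_by_shift(uint32_t x) { return (uint32_t)(x << 1); }
```

Adding `add_without_carries(a, b)` and `carries(a, b)` gives back `a + b` modulo 2^32, which is the sense in which an adder reduces to a modulo-two sum gate once the carry input is removed.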
Tomasz Grysztar 12 Mar 2005, 13:06
You forgot to search Intel's Software Developer's Manual Vol. 1: Basic Architecture (25366514.pdf), and this is where my quotation comes from (section 7.2.4, page 7-12).
S.T.A.S. 12 Mar 2005, 17:47
Oh yes, my bad
But I don't think this volume can be considered a good argument. Have a look at the next section's name: 7.2.5. Shift and Rotate Instructions. It's rather strange; why not multiply & divide by 2 like in Vol. 2? And where are the *left* and *right* sides of a byte?! BTW, why not top and bottom?
The reason for all that funny stuff is obvious: it is a simplification. It's easy to say that XOR is a boolean instruction which deals with the separate bits of a byte, and that's all. But a byte is a single whole, not just a collection of bits. Each bit in a byte has its own position (or weight). So, to get the value of a byte we have to sum all the bits multiplied by the appropriate power of 2. Huh, everyone knows that, don't they? NO! This is the great secret. Oh, no, not for you. But for many people.
I'd like to give a classic example of such "arithmetics". How do you compute whether a number is a power of 2? Of course, it's simple: divide by 2 in a loop and check the result! Ugh! Let's use magic arithmetic instructions instead:
Code:
; eax = number
lea edx, [eax-1]
and edx, eax
; if edx = 0, then eax is a power of 2 (note: eax = 0 also gives edx = 0)
That's why I say that XOR is arithmetic too.
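The same trick can be written in C as a minimal sketch, assuming a 32-bit unsigned input. The function names are hypothetical. Unlike the three-instruction version, the sketch rejects zero explicitly, because zero has no set bit and would otherwise pass the test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors "lea edx,[eax-1]" followed by "and edx,eax":
   clears the lowest set bit of x (and returns 0 for x = 0). */
uint32_t clear_lowest_set_bit(uint32_t x) { return x & (x - 1u); }

/* A power of two has exactly one set bit, so clearing it leaves zero.
   Zero itself also yields zero, so it is excluded explicitly. */
bool is_power_of_two(uint32_t x)
{
    return x != 0u && clear_lowest_set_bit(x) == 0u;
}
```

For example, `clear_lowest_set_bit(12)` turns binary 1100 into 1000, so 12 is not a power of two, while 64 has a single set bit and passes.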
Copyright © 1999-2024, Tomasz Grysztar.