flat assembler
Message board for the users of flat assembler.
REP RETN
revolution 18 Feb 2010, 04:14
It is an optimisation for AMD CPUs. The branch predictor gets it wrong if the retn instruction is jumped to directly by a branch instruction, or if the retn directly follows a branch instruction.
You can safely ignore it.

asmmsa 18 Feb 2010, 09:53
F3 is not REP, it's a prefix. It's REP only for stos/movs/lods/ins/outs; for other instructions it has a different meaning.
For ret it's undefined and shouldn't be used. They might modify the ret instruction so that with F3 it does something other than just returning. Branch prediction applies only to conditional jumps. F3 + ret = reserved, which means don't use it, because in 10 years your code will crash. Branch hints are 2E and 3E btw, not F3.
revolution 18 Feb 2010, 09:59
asmmsa: AMD explicitly say to use 'rep ret' to solve a problem with the branch prediction. It only affects performance slightly, so putting it in, or leaving it out, makes almost no difference for most programs.
I seriously doubt that in 10 years the code will crash. Once the manufacturers have recommended a particular encoding, that usually means it will work on all past CPUs and on all future CPUs without issue, from both Intel and AMD (and others like VIA as well). Don't worry about it, it is minor and causes no problems.
asmmsa 18 Feb 2010, 10:03
If rep uses F3, then I didn't know that.
Perhaps I have an old version of the manuals. And how does adding it to ret solve anything? ret is unconditional, it always goes one way!
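For reference, a minimal fasm sketch of the encodings under discussion. The byte values in the comments are the standard x86 opcodes; `use32` is only an assumption to make the mode explicit, and the `db` line is a fallback for an assembler that rejects a prefix in front of `ret`.

```asm
use32
        rep stosb           ; F3 AA : F3 acting as a real repeat prefix
        ret                 ; C3    : single-byte near return
        rep ret             ; F3 C3 : the same F3 prefix byte in front of the return
        db 0F3h, 0C3h       ; F3 C3 : the same two bytes emitted by hand
```

Assembling this as a flat binary and opening the output in a hex viewer should show the byte F3 in front of both the `stosb` and the `ret` opcodes.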
revolution 18 Feb 2010, 10:12
It would seem that when AMD designed the branch predictors they made a mistake in the case where a branch leads to a 'ret'. So to solve the problem, they say to put a 'rep' in front so that the branch predictor will work correctly and make predictions.
Code:
    jcc .exit    ;In some AMD CPUs, this cannot be predicted correctly
    ;...
.exit:
    ret
Code:
    jcc .exit    ;Now can be predicted correctly
    ;...
.exit:
    rep ret
Fanael 18 Feb 2010, 11:02
Does ret imm16 suffer from branch misprediction too?
f0dder 18 Feb 2010, 12:15
Why recommend "rep ret", though? Wouldn't a nop have been just about as good, without the conceptual nastiness of "rep ret"?
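To make the comparison concrete, here is a rough sketch of the two variants. The label names are made up for illustration, and `jz` stands in for any conditional jump. Both epilogues are two bytes long; the difference is that the first is two separate instructions while the second is a single prefixed instruction.

```asm
use32
with_nop:                   ; hypothetical routine using a nop as padding
        test    eax, eax
        jz      .exit       ; branch straight to the epilogue
        inc     eax
.exit:
        nop                 ; 90 : separate one-byte instruction, so ret is no longer the branch target
        ret                 ; C3

with_rep:                   ; same routine using the prefixed return
        test    eax, eax
        jz      .exit
        inc     eax
.exit:
        rep ret             ; F3 C3 : one instruction, no longer the single-byte form
```

Whether the nop variant actually avoids the misprediction is not something this sketch can show; it only illustrates what each option looks like in code.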
chaoscode 18 Feb 2010, 12:54
Well, AMD can execute 3 (simple) instructions at the same time.
I think a prefix was chosen because it doesn't use execution resources and a nop does. But maybe I'm not right and am just talking weird crap. Edit: Why not ask AMD support?
f0dder 18 Feb 2010, 12:57
chaoscode wrote: well. AMD can execute 3 (simple) instructions at the same time,
revolution 18 Feb 2010, 13:06
Well, branch prediction is very important for achieving top performance in critical loops. A modern CPU with malfunctioning branch prediction would run really badly. In normal GUI or I/O type code it will make no difference, of course.
f0dder 18 Feb 2010, 13:10
Of course, revolution... but how often do you have "Jcc <location-of-ret>" inside a critical loop? Dunno how serious the flaw is, though. If it's just the ret that misses branch prediction I wouldn't say it's too bad; if the whole branch predictor "gets confused for a while" it would be more serious.
revolution 18 Feb 2010, 15:01
I guess AMD felt a little embarrassed and needed a way to fix it without people complaining about creating another overhead of extra instructions.
BTW: it also affects this:
Code:
.loop:
    ;...
    jcc .loop    ;<---- prediction problem
    ret
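Applying the same workaround to this second case would look roughly like the sketch below. The routine name and the countdown body are hypothetical, chosen only so the fragment assembles on its own.

```asm
use32
count_down:                 ; hypothetical routine; expects ecx > 0 on entry
.loop:
        dec     ecx
        jnz     .loop       ; conditional branch directly in front of the return
        rep ret             ; F3 C3 : two-byte return in place of the plain ret
```

As with the first case, the only change is the extra F3 byte in front of the return; the loop itself is untouched.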
asmmsa 18 Feb 2010, 15:34
AMD noticed it after they released the CPU on the market?
LOL, I want to replace one of those noobs. My skills are lower for now, but at least I wouldn't do such idiocy. How could they miss that? It's not some small company run by amateurs, it's the 2nd biggest one, they can't make such mistakes! By the way, fasm doesn't support branch prediction? Do I have to add prefixes with db before each conditional jump?
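On that last question: the 2E/3E bytes mentioned earlier are static branch hint prefixes, which are a separate thing from the 'rep ret' workaround. A minimal sketch of emitting one by hand with `db` follows, assuming the assembler offers no dedicated syntax for it. The routine and label names are hypothetical, and as far as I know many CPUs simply ignore these hints.

```asm
use32
abs_value:                  ; hypothetical routine: eax = |eax|, copy left in edx
        test    eax, eax
        db      3Eh         ; 3E : "branch taken" hint for the Jcc that follows
        jns     .done       ; assumption: inputs are usually non-negative
        neg     eax
.done:
        mov     edx, eax    ; keep a copy of the result
        ret
```

Using 2Eh in place of 3Eh would give the "branch not taken" hint. The prefix has to sit immediately in front of the conditional jump it applies to.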
revolution 18 Feb 2010, 15:38
asmmsa: Modern CPUs are extremely complex. The occasional mistake is to be expected. But AMD have done worse.
http://en.wikipedia.org/wiki/AMD_K10#TLB_Bug

Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.