flat assembler
Message board for the users of flat assembler.
Support MOVBE
revolution 01 Jul 2009, 08:01
A pity there is no support for MOVBE r16/32/64, r16/32/64
MazeGen 01 Jul 2009, 08:33
Yeah, it's weird that they didn't implement it. However, it is still useful because BSWAP works with registers only.
Tomasz Grysztar 01 Jul 2009, 08:50
Yes, looks like a nice instruction. How is it classified - is it SSE4 related?
MazeGen 01 Jul 2009, 09:12
It's classified among Miscellaneous Instructions (together with LEA, NOP, ...) in the Basic Architecture manual. It is indicated by CPUID.01H:ECX.MOVBE[bit 22].
Tomasz Grysztar 01 Jul 2009, 09:35
Hmm, perhaps that's why I missed it. Its introduction is not that much different from that of the POPCNT instruction (which is also indicated by its own bit, CPUID.01H:ECX.POPCNT[bit 23]), but while POPCNT is listed as an SSE4 instruction, MOVBE is not. Quite a mess.
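For reference, the two feature bits named above can be tested as in the following minimal C sketch. The helper names (`ecx_has_feature`, `has_movbe`, `has_popcnt`) are made up for illustration, and the code assumes the caller has already executed CPUID leaf 01H and passes in the resulting ECX value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in CPUID.01H:ECX, as quoted in the posts above. */
#define CPUID_ECX_MOVBE_BIT  22u
#define CPUID_ECX_POPCNT_BIT 23u

/* Hypothetical helper: ecx is the ECX value returned by CPUID leaf 01H. */
static bool ecx_has_feature(uint32_t ecx, unsigned bit)
{
    return ((ecx >> bit) & 1u) != 0;
}

/* MOVBE has its own bit; it is not implied by any SSE level. */
static bool has_movbe(uint32_t ecx)
{
    return ecx_has_feature(ecx, CPUID_ECX_MOVBE_BIT);
}

/* POPCNT also has its own bit, right next to MOVBE. */
static bool has_popcnt(uint32_t ecx)
{
    return ecx_has_feature(ecx, CPUID_ECX_POPCNT_BIT);
}
```

With GCC or Clang on x86, the ECX value can typically be obtained with `__get_cpuid(1, &eax, &ebx, &ecx, &edx)` from `<cpuid.h>`; other toolchains have their own intrinsics.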
MazeGen 02 Jul 2009, 09:34
Code:
movbe ax, dx
Quote:
flat assembler version 1.69.01 (1199842 kilobytes memory)
(both register operands are not allowed)
Tomasz Grysztar 02 Jul 2009, 09:58
Can someone test it? Maybe it's undocumented, but works?
MazeGen 02 Jul 2009, 10:04
Seems like the first steppings of Core 2 don't support this instruction, so it will be hard to find someone able to test it.
pal 02 Jul 2009, 18:53
Quote:
Maybe it's undocumented, but works?
Maybe I have misunderstood you, but it is in the Instruction Set Reference, A-M (Vol. 2A, 3-657), so it is documented.
LocoDelAssembly 02 Jul 2009, 19:08
Yes, you did. He meant that perhaps "movbe reg, reg" is supported even though the operand combination is undocumented. I tried to test it on my brother's computer, but it seems to be too old a Core 2, because "movbe [var], edx" crashed the application.
r22 02 Jul 2009, 19:09
With PSHUFB already added to SSE[?], MOVBE seems pointless, unless it is optimized (unlike the string instructions) to be faster than MOV/BSWAP.
Seems redundant.
LocoDelAssembly 02 Jul 2009, 19:25
r22, but PSHUFB needs either the FPU state or SSE state restored, while MOVBE can work with GPRs.
I don't know the real intent for this instruction, but perhaps it is to make networking drivers faster by storing in network order (big-endian) and reading with a direct conversion to native format (little-endian) in a single step?
Tomasz Grysztar 02 Jul 2009, 19:52
It seems like a very elegant instruction to generally operate on big-endian fields in data structures (including the network addresses, but also perhaps ASN.1/BER structures, smart card APDUs, etc.).
pal 02 Jul 2009, 20:42
Quote:
He meant that perhaps "movbe reg, reg" is supported even though the operand combination is undocumented.
I thought I must have, to be honest. That would have been too easy.
Madis731 05 Jul 2009, 12:15
LOAD+BSWAP will always be faster because there are many ways to throw it to the uop-schedulers. MOVBE is hindered with constant uops and therefore will penalize performance. Moreover, this instruction seems not to replace BSWAP but to add [mem] capability to it.
If it is this silent and you have to test a separate bit for it (i.e. SSE4 doesn't guarantee this bit being set), then this is the perfect formula for failure.
Summary:
-macroop instruction (at least 2 uops)
-already has an elegant replacement as MOV reg,mem+BSWAP reg
-is not guaranteed on newer CPUs
-is not made popular with advertising
I don't get this instruction ... it seems to be some instruction from the times of LOOP and AAA and such...
Tomasz Grysztar 05 Jul 2009, 15:12
Madis731 wrote:
-already has an elegant replacement as MOV reg,mem+BSWAP reg
But in the opposite direction you would have to do BSWAP reg + MOV mem,reg + BSWAP reg
bitRAKE 05 Jul 2009, 19:00
...or MOV rtemp,reg + BSWAP rtemp + MOV mem,rtemp
(methinks the Intel compiler fails to produce optimal code with BSWAP?)
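To make the load and store directions discussed above concrete, here is a minimal, portable C sketch of what the MOV+BSWAP (or MOVBE) sequences compute. The function names `load_be32` and `store_be32` are made up for illustration. Whether a given compiler turns these patterns into MOV+BSWAP or into MOVBE (GCC, for instance, has an `-mmovbe` option) depends on the compiler and target and is not guaranteed.

```c
#include <stdint.h>

/* Load direction: read a 32-bit big-endian field into a native value
   (the effect of MOV reg,mem + BSWAP reg, or MOVBE reg,mem). */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store direction: write a native value as a 32-bit big-endian field.
   The caller's value is not modified, which corresponds to the
   temporary-register variant (or to MOVBE mem,reg). */
static void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```

Because the sketch works byte by byte, it gives the same result on little-endian and big-endian hosts.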
Madis731 06 Jul 2009, 08:13
Tomasz and bitRAKE - you might be on to something - I tend to agree.
revolution 07 Jul 2009, 16:41
I think MOVBE would be most useful in text sorting. Network address computations are hardly a case for optimising your code, but sorting data can take a considerable time in some cases and a simple optimisation with MOVBE will help a lot.
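One plausible reading of the sorting idea, offered as an assumption rather than as what the poster necessarily had in mind: if a group of text bytes is loaded as a big-endian integer, a single unsigned integer comparison gives the same ordering as a byte-by-byte comparison. The C sketch below shows this for fixed 8-byte keys; `load_be64` and `compare_key8` are hypothetical names.

```c
#include <stdint.h>

/* Load 8 bytes as a big-endian 64-bit integer: the first byte in memory
   becomes the most significant byte of the result. */
static uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical comparator for fixed 8-byte keys. The result is negative,
   zero or positive, with the same sign as memcmp(a, b, 8). */
static int compare_key8(const unsigned char *a, const unsigned char *b)
{
    uint64_t x = load_be64(a);
    uint64_t y = load_be64(b);
    return (x > y) - (x < y);
}
```

Keys shorter than 8 bytes would have to be padded (for example with zero bytes) before this comparison applies.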
Copyright © 1999-2024, Tomasz Grysztar.