Message board for the users of flat assembler.
pool 06 Jan 2013, 05:14
Last edited by pool on 17 Mar 2013, 12:22; edited 1 time in total
hopcode 06 Jan 2013, 07:37
opcodes express the functionality of an ISA.
on x86, a few opcodes encode operations for a few registers and instructions.
the encoding length of an instruction is variable.
on x86-64, apart from those inherited from x86,
you have yet more opcodes, for example the REX byte or those for SIMD.
the encoding length here is variable too.
someone else can tell you more (and better) about ARM than me.
what i know about ARM at the moment is:
- instructions have a practically fixed length,
apart from 2 or 3 levels of code density (T1, T2)
- opcodes may evaluate to instructions + conditions (branching etc.)
- it is normal here to use multiple source registers
for these reasons, RISC opcoding is basically more complex than x86-64,
as pointed out indirectly in another thread.
i prefer lots of registers and opcodes dense with information.
x86 is very constrictive for me.
after 2 years of 64-bit i cannot go back to 32-bit and push/pop
all the time. this is a waste of time (cpu cycles and human time, i mean), because you get a "stacking way of thinking" ANYWAY
when programming in assembly, but once i am in the subroutine i would prefer to avoid mentally walking the whole stack backward to fetch a value just because the ISA lacks registers. in this case x64 is better than x32.
at the moment i write only opcodes directly on ARM
hardware. look at this post+videoclip to see how it works
that's only the beginning
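to make the variable-versus-fixed length point concrete, here is a small sketch in Python that just measures a few hand-written encodings. the instruction selection is my own and only illustrative; the byte strings are the standard encodings as far as I know them (A32 words are shown little-endian), so treat it as a sanity check rather than a reference table.

```python
# Hand-written encodings; the choice of instructions is illustrative only.
X86 = {
    "nop":                         bytes.fromhex("90"),
    "addps xmm0, xmm1":            bytes.fromhex("0F 58 C1"),
    "mov eax, 1":                  bytes.fromhex("B8 01 00 00 00"),
    # REX.W prefix (48h) plus a 64-bit immediate
    "mov rax, 1122334455667788h":  bytes.fromhex("48 B8 88 77 66 55 44 33 22 11"),
}

ARM_A32 = {
    "mov r0, #1":      bytes.fromhex("01 00 A0 E3"),  # E3A00001h
    "moveq r0, #1":    bytes.fromhex("01 00 A0 03"),  # same opcode, EQ condition
    "add r0, r1, r2":  bytes.fromhex("02 00 81 E0"),  # two source registers
}

THUMB_16 = {
    "movs r0, #1":  bytes.fromhex("01 20"),  # 2001h
    "bx lr":        bytes.fromhex("70 47"),  # 4770h
}

x86_lengths = sorted({len(b) for b in X86.values()})
a32_lengths = sorted({len(b) for b in ARM_A32.values()})
thumb_lengths = sorted({len(b) for b in THUMB_16.values()})

for name, table in (("x86", X86), ("A32", ARM_A32), ("Thumb", THUMB_16)):
    for text, code in table.items():
        print(f"{name:5} {len(code):2} bytes  {text}")
```

the x86 rows come out at four different lengths, while every A32 row is 4 bytes and the 16-bit Thumb rows are 2 bytes each.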
pool 06 Jan 2013, 08:06
Last edited by pool on 17 Mar 2013, 12:20; edited 1 time in total
hopcode 06 Jan 2013, 08:26
about mem-to-mem, see this link: https://groups.google.com/d/topic/comp.lang.asm.x86/_pnGfjh5O-o/discussion
but those are pure speculations, imo.
you may think to reprogram the Intel CPU using microcodes, and allow
revolution 06 Jan 2013, 17:23
x86 has movsb (and others)
Why x86/64 and ARM doesn't allow memory to memory operation?
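As a rough model of why movsb counts as a memory-to-memory operation, the sketch below mimics `rep movsb` in Python. It assumes the direction flag is clear and ignores segment registers entirely; the function name and the flat `bytearray` memory are made up for illustration.

```python
def rep_movsb(mem, si, di, cx):
    """Model of `rep movsb` with DF=0: copy cx bytes from mem[si] to mem[di].

    Both addresses live in implicit registers, so the instruction itself
    carries no address bytes. Returns the final (si, di, cx).
    """
    while cx:
        mem[di] = mem[si]   # one byte, memory to memory
        si += 1
        di += 1
        cx -= 1
    return si, di, cx


mem = bytearray(b"hello" + bytes(11))   # 16 bytes of toy memory
regs = rep_movsb(mem, si=0, di=8, cx=5)
print(mem, regs)
```

The copy needs no explicit addressing bytes because source and destination are fixed to SI and DI, which is how the string instructions fit within the one-memory-operand encoding.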
HaHaAnonymous 06 Jan 2013, 18:18
[ Post removed by author. ]
Last edited by HaHaAnonymous on 28 Feb 2015, 22:05; edited 1 time in total
AsmGuru62 07 Jan 2013, 11:42
Why throw stones?
These instructions would be very useful and save registers too.
I think the issue here is that the x86 micro-architecture allows a single
memory address per instruction, and that is somewhere in the core
of the CPU machinery, so it is unfeasible (or even impossible) to change it at this time.
It would be better to build a new CPU with the ability to address two locations per instruction.
Also, such a design will increase the instruction length and therefore
decrease the speed of prefetching instructions. Basically, what is needed is
a second "MOD-R-M"-like sequence and that may take ~10 bytes on 64-bit.
Or, simply limit the offset to data to 1 or 2 bytes. Most of the structures in
code will not be over these offsets.
Or use only registers, like in your code sample.
"rcx+rax*2" --> this can be encoded in a few bits.
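For a feel of the numbers, here is a minimal Python sketch of what one memory operand costs under the existing ModRM/SIB layout with 32/64-bit addressing. It is a simplification: the helper name is made up, and special cases (RSP/R12 as base, RBP/R13 without displacement, RIP-relative, REX bits) are ignored. How a real second operand would be encoded is anybody's guess, so this only shows the cost of the current shape.

```python
def mem_operand_bytes(has_index, disp_bytes):
    """Bytes used by one memory operand in the current layout:
    1 ModRM byte, 1 SIB byte when a scaled index is present,
    plus a displacement of 0, 1 or 4 bytes.
    """
    if disp_bytes not in (0, 1, 4):
        raise ValueError("32/64-bit addressing only has disp8 and disp32")
    return 1 + (1 if has_index else 0) + disp_bytes


registers_only = mem_operand_bytes(has_index=True, disp_bytes=0)  # [rcx+rax*2]
short_offset = mem_operand_bytes(has_index=True, disp_bytes=1)    # [rcx+rax*2+10h]
full_offset = mem_operand_bytes(has_index=True, disp_bytes=4)     # [rcx+rax*2+401000h]
print(registers_only, short_offset, full_offset)
```

Under these assumptions the registers-only form costs 2 bytes and a one-byte offset costs 3, so limiting the offset size is where most of the saving would come from.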
Madis731 18 Jan 2013, 14:19
The main problem with fast CPUs is that memory (RAM) is slow, so you make an effort to get everything done in registers. You only need to read once to load the data and, after the calculations are done in registers, you put the data back in memory.
[mem],[mem] is just a consequence of bad programming habits: why would you need to "just move memory around"? If you get your data structures right, you don't need to defragment RAM space and you don't need many [mem],[mem] operations.
For the cases where you absolutely need it, mov** will help you out.
The reason for x86 to disallow it is that 15 bytes is the maximum instruction length you can encode, and when you allow [mem],[mem] you must also allow something like: MOV [ESP+2*EBP+401000h],[ESP+2*EBP+400000h]. Taking into account the size of MOV EAX,[ESP+2*EBP+400000h] in 16-bit mode, which is 9 bytes, it can easily overflow 15 bytes, because you need an extra 4 bytes for another displacement, at least a byte to encode the 2 registers, and another byte to encode the multiplier. Oh, and 2 size prefixes for the other two registers. That is 18!
Another problem is of course that there are other instructions that would want to take advantage of the new addressing possibilities: shl [eax],[ebx]; add [eax],[ebx] etc. That will definitely pose a problem at some point.
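The tally can be checked with a few lines of Python. The 9-byte load below is the real encoding as I understand it (66h and 67h prefixes, opcode 8Bh, ModRM, SIB, 32-bit displacement); the extra items are the hypothetical costs listed in the post, each taken at its minimum, and their labels are mine.

```python
MAX_X86_INSN = 15  # architectural limit on instruction length, in bytes

# mov eax,[esp+2*ebp+400000h] assembled for 16-bit mode
load = bytes.fromhex("66 67 8B 84 6C 00 00 40 00")

# Hypothetical extras for a second memory operand, as listed in the post
extra = {
    "second displacement": 4,
    "byte for the second base/index pair": 1,
    "byte for the second multiplier": 1,
    "two more size prefixes": 2,
}

hypothetical = len(load) + sum(extra.values())
print(len(load), hypothetical, hypothetical > MAX_X86_INSN)
```

Taking each "at least" at its minimum the tally comes to 17, slightly under the 18 quoted, and past the 15-byte limit either way.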
@AsmGuru: 2 bytes is 64KB and when you think about a datastructure RECT(x1,x2,y1,y2,c) you rarely use mov [x2],[x1] or put some coordinate in colour's spot. What you usually do is copy one datastructure or some of its members to another datastructure and these can easily be >64KB apart.
I cannot think of a simpler way than ecx=5, esi=rect1, edi=rect2 and a rep movsd to copy the whole RECT structure.
If a lazy programmer gets their hands on mov [mem],[mem], it would look like:
mov [rect2.x1],[rect1.x1]
mov [rect2.y1],[rect1.y1]
mov [rect2.x2],[rect1.x2]
mov [rect2.y2],[rect1.y2]
mov [rect2.c],[rect1.c]
and would take at least 55 bytes using only immediate offsets or at least 40 bytes using reg+imm (in 16-bit mode).
movsd approach would take 21 bytes and one line less.
mov ecx,5
mov esi,rect1
mov edi,rect2
rep movsd
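The 21-byte figure for the movsd version can be reproduced by counting the encodings in Python. This assumes 16-bit mode, so each 32-bit operation carries a 66h operand-size prefix; the addresses of rect1 and rect2 are left as zero placeholders, and an assembler may emit the two prefixes of rep movsd in the other order.

```python
USE16 = {
    "mov ecx,5":     bytes.fromhex("66 B9 05 00 00 00"),
    "mov esi,rect1": bytes.fromhex("66 BE 00 00 00 00"),  # placeholder address
    "mov edi,rect2": bytes.fromhex("66 BF 00 00 00 00"),  # placeholder address
    "rep movsd":     bytes.fromhex("F3 66 A5"),
}

total_16bit = sum(len(code) for code in USE16.values())

# In 32-bit mode the 66h prefix is not needed on any of the four lines.
total_32bit = total_16bit - len(USE16)

for text, code in USE16.items():
    print(f"{len(code)} bytes  {text}")
print(total_16bit, total_32bit)
```

So the count is 6 + 6 + 6 + 3 = 21 in 16-bit mode, and it would drop to 17 in 32-bit mode.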
Copyright © 1999-2023, Tomasz Grysztar.