flat assembler
Message board for the users of flat assembler.

pool 06 Jan 2013, 05:14
..


Last edited by pool on 17 Mar 2013, 12:22; edited 1 time in total
hopcode 06 Jan 2013, 07:37
hallo pool,
opcodes express the functionality of an ISA.
on x86, relatively few opcodes encode operations for a small set of registers and instructions.
the encoded length of an instruction is variable.

on x86-64, apart from those inherited from x86,
you have still more opcodes, for example the REX byte or those for SIMD.
the encoded length here is variable too.

someone else can probably tell you more about ARM than i can.

what i know about ARM at the moment is:
- instructions have practically fixed length,
apart from 2 or 3 levels of code density (T1,T2)

- opcodes may encode an instruction + a condition (branching etc)
- it is normal here to use multiple source registers

for these reasons, RISC opcode encoding is basically more complex than x86-64,
as pointed out indirectly in another thread.

i prefer lots of registers and opcodes dense with information.

x86 is very constrictive for me.
after 2 years of 64-bit i cannot go back to 32-bit and push/pop
all the time. that is a waste of time, cpu cycles and human time i mean :) because you get a "stack way of thinking" ANYWAY
from assembly programming, but once i am in the subroutine i would prefer not to walk mentally back through the whole stack to fetch a value just because the ISA lacks registers. in this respect x64 is better than x32

at the moment i write only raw opcodes directly on ARM
hardware. look at this post + video clip to see how it works
http://board.flatassembler.net/topic.php?p=151837#151837

but that is only the beginning
Cheers,
:D

_________________
⠓⠕⠏⠉⠕⠙⠑
pool 06 Jan 2013, 08:06
..


Last edited by pool on 17 Mar 2013, 12:20; edited 1 time in total
hopcode 06 Jan 2013, 08:26
about mem-to-mem, see this link: https://groups.google.com/d/topic/comp.lang.asm.x86/_pnGfjh5O-o/discussion
but those are pure speculation, imo.
you could think of reprogramming the Intel CPU through microcode to allow
mem-to-mem operations.

revolution 06 Jan 2013, 17:23
pool wrote:
Why x86/64 and ARM doesn't allow memory to memory operation?
x86 has movsb (and others)
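
A minimal sketch of what that looks like in 32-bit code with flat segments (src and dst are hypothetical labels, not from the thread). movsb reads a byte from [esi] and writes it to [edi] in a single instruction, and push/pop with memory operands move a value between memory and the stack without naming a general-purpose register:
Code:
cld                 ; clear DF so esi/edi count upwards
mov esi,src         ; hypothetical source label
mov edi,dst         ; hypothetical destination label
movsb               ; byte [edi] = byte [esi], then esi and edi advance by 1

push dword [src]    ; memory -> stack
pop dword [dst]     ; stack -> memory
    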
HaHaAnonymous 06 Jan 2013, 18:18
[ Post removed by author. ]


Last edited by HaHaAnonymous on 28 Feb 2015, 22:05; edited 1 time in total
AsmGuru62 07 Jan 2013, 11:42
Why throw stones?
These instructions would be very useful and save registers too.

I think the issue here is that the micro-architecture of x86 allows only a single
memory operand per instruction, and that is somewhere in the core
of the CPU machinery, so it is unfeasible (or even impossible) to change it at this time.

It is better to build a new CPU with the ability to address two locations per instruction.

Also, such a design would increase the instruction length and therefore
decrease the speed of prefetching instructions. Basically, what is needed is
a second "MOD-R-M"-like sequence and that may take ~10 bytes on 64-bit.

Or, simply limit the offset to the data to 1 or 2 bytes. Most of the structures in
code will not go beyond these offsets.

Or use only registers, like in your code sample.
"rcx+rax*2" --> this can be encoded in a few bits.
Madis731 18 Jan 2013, 14:19
The main problem with fast CPUs is that memory (RAM) is slow, so you try to get everything done in registers. You only need to read once to load the data and, after the calculations are done in registers, you put the results back in memory.
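
As a sketch of that pattern (a, b and result are hypothetical dword variables): each value is read from memory once, the work happens in a register, and only the final result is written back.
Code:
mov eax,[a]         ; load once
add eax,[b]         ; one memory operand per instruction is allowed
shl eax,1           ; further work stays in the register
mov [result],eax    ; store once at the end
    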

[mem],[mem] is just a consequence of bad programming habits - why would you need to "just move memory around" :) If you get your data structures right, you don't need to defragment RAM space and you don't need many [mem],[mem] operations.

For the cases where you absolutely need it, mov** (movsb, movsw, movsd) will help you out.

The reason x86 disallows it is that 15 bytes is the maximum instruction length you can encode, and if you allow [mem],[mem] you must also allow something like: MOV [ESP+2*EBP+401000h],[ESP+2*EBP+400000h]. MOV EAX,[ESP+2*EBP+400000h] is already 9 bytes in 16-bit mode, so the two-memory form can easily overflow the limit: you need an extra 4 bytes for the other displacement, at least a byte to encode the 2 registers and another byte to encode the multiplier. Oh, and 2 size prefixes for the other two registers. That is at least 17!
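
For reference, a byte-by-byte sketch of where the 9 bytes of the register form come from (the prefix order shown is one valid choice; an assembler may emit the two prefixes the other way round):
Code:
use16
mov eax,[esp+ebp*2+400000h]
; 66           operand-size prefix (32-bit operand in 16-bit code)
; 67           address-size prefix (32-bit addressing in 16-bit code)
; 8B           opcode: mov r32,r/m32
; 84           ModRM: mod=10 (disp32), reg=eax, r/m=100 (SIB follows)
; 6C           SIB: scale=2, index=ebp, base=esp
; 00 00 40 00  disp32 = 400000h
    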

Another problem is of course that other instructions would want to take advantage of the new addressing possibilities too: shl [eax],[ebx]; add [eax],[ebx] etc. That would definitely pose a problem at some point.

@AsmGuru: 2 bytes is 64KB, and when you think about a data structure RECT(x1,x2,y1,y2,c) you rarely use mov [x2],[x1] or put some coordinate in the colour's spot. What you usually do is copy one data structure, or some of its members, to another data structure, and these can easily be >64KB apart.
I cannot think of a simpler way than ecx=5, esi=rect1, edi=rect2 and a rep movsd to copy the whole RECT structure.
If a lazy programmer got their hands on mov [mem],[mem] it would look like:
Code:
mov [rect2.x1],[rect1.x1]
mov [rect2.y1],[rect1.y1]
mov [rect2.x2],[rect1.x2]
mov [rect2.y2],[rect1.y2]
mov [rect2.c],[rect1.c]
    

and it would take at least 55 bytes using only immediate offsets, or at least 40 bytes using reg+imm (in 16-bit mode).

The movsd approach would take 21 bytes and one line less.
Code:
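mov ecx,5        ; RECT has 5 dword members: x1, x2, y1, y2, c
mov esi,rect1    ; source address
mov edi,rect2    ; destination address
rep movsd        ; copy ecx dwords from [esi] to [edi] (assumes DF=0)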
    