flat assembler
Message board for the users of flat assembler.

FASMARM v1.44 - Cross assembler for ARM CPUs

Kuemmel



Joined: 30 Jan 2006
Posts: 200
Location: Stuttgart, Germany
Kuemmel 08 Feb 2006, 13:12
@vid: Yep, as far as I can see the basic core instructions (Thumb and ARM) are unchanged and have been present in every ARM-based CPU for more than 5 years. Only the add-ons like Java acceleration (Jazelle), SIMD, etc. may or may not be there...in the end it would be the same as testing an x86 CPU for 3DNow! or SSE1/2/3 before you run the code.

The problem is the operating system. There are probably many of them that I don't even know of...as far as I know there's ARM-based Windows CE, but of course also a lot of device-specific ones like the Nintendo DS OS, or whatever phone OS...but since there's software running on these devices...there must be some kind of documentation or tools...

@revolution...by the way...on which device do you run your ARM code !?
Madis731



Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731 08 Feb 2006, 17:57
Kuemmel wrote:

Code:
cmp   r0, #10
movgt r0, #0
movle r0, #1
    

Code:
add r0, r0, r1, LSL #5
    

Code:
stmia r0!,{r1-r8}
    


To defend x86 a bit then:
1) It has conditional moves and exchanges
2) LEA instruction can be powerful in situations like:
ARM:
add r0, r0, r1, LSL #5
x86:
lea eax,[ebx*8+eax-14]
3) The SIMD is also possible in this way:
REP MOVSB (or compare, scan,...word, doubleword etc.)

It's not THAT bad, the IA32 :P

_________________
My updated idol :D http://www.agner.org/optimize/
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20302
Location: In your JS exploiting you and your system
revolution 08 Feb 2006, 19:56
Quote:
revolution...by the way...on which device do you run your ARM code !?
My company makes proprietary systems using a number of different CPUs. The latest ARM product has a four-CPU daughter board for high-speed data encryption using the Intel PXA27x.

As for the discussion about which is better, I think it is not a fair comparison. The development budgets for the two CPUs are considerably different and the design goals were very different.

Here are a few of the main differences I have noticed in no particular order (I'm not trying to say one or the other is better, just different):

ARM: 15 user registers + PC relative addressing
X86: 7 user registers + stack pointer

ARM: Register based addressing with limited immediate offsets or (almost) unlimited register shifting
X86: Memory addressing with unlimited immediate offsets and/or limited register shifts

ARM: 99% of user instructions are conditional
X86: Only a limited subset of MOV's are conditional

ARM: Strictly LOAD/OP/STORE architecture
X86: All arithmetic OP's can use memory directly

ARM: Large constants (>8 bits) are tricky to optimise
X86: Any constant can be placed in instruction stream

ARM: Low power consumption
X86: High power consumption

ARM: No hardware standard platform
X86: PC based platform is pseudo standard

ARM: Lack of software/tools/information for beginners
X86: Plenty of software/tools/information for beginners

ARM: Ability to address memory with negative register offsets
X86: Only additive register addressing supported

ARM: Basic RISC style instruction set
X86: CISC instruction set (DIV/DAA/ENTER/LEAVE etc.)

ARM: ALL I/O is memory based
X86: I/O using separate control bus

ARM: FLAG updates are selectable for each instruction
X86: FLAG updates are fixed into the instruction set

One of the things I have noticed while porting code from X86 to ARM is that having 15 general registers (not including PC) might seem good for avoiding stack accesses, but in reality the lack of direct memory-based arithmetic operations and the difficulty of optimising constants mean that many registers end up holding memory address offsets or constants, sometimes leaving fewer registers available to perform the desired function. Another is that the ability to make instructions conditional often greatly improves the ability to process variable data. A third is that the predictable clocking and instruction throughput make optimising for ARM more deterministic, with less guesswork. Overall though I feel that in terms of clocks per instruction the ARM keeps pace almost one-to-one with a P4 for general algorithmic processing.

For the product I mentioned above it is probably the case that a single P4/3GHz could have done the same job as the four ARM CPUs, but the power consumption per board would be about 27 times higher (80W compared to 3W). In terms of processing power per Watt (MIPS/mW) the P4 doesn't stand a chance.
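
To make the point about large constants concrete: an ARM data-processing instruction can only hold an 8-bit value rotated right by an even number of bits. The following is a minimal sketch in Python (not FASMARM code; the function name arm_immediate is made up for illustration) that checks whether a 32-bit constant fits this form.

```python
def arm_immediate(value):
    """Return (imm8, rotate_field) if value fits an ARM data-processing
    immediate (8-bit value rotated right by an even amount), else None."""
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotating left by rot undoes the rotate-right the CPU applies.
        imm8 = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if imm8 <= 0xFF:
            # The instruction stores rot/2 in its 4-bit rotate field.
            return imm8, rot // 2
    return None
```

For example 0xFF000000 fits (0xFF rotated right by 8) but 0x101 does not, so a constant like that needs a second instruction or a load from memory, which is where the extra registers tend to go.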
Madis731



Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731 08 Feb 2006, 21:47
Yeah, these two are not exactly comparable, because one is meant for the mainstream and the other for things like handhelds and other integrations.
ARM consumes less power because it has strict rules. Constant-length instructions make it easier for the processor to decode. Pentiums nowadays need to predict jumps and flags and fill/flush pipelines, and it all takes transistors and power. All for our convenience :) but it comes at a price.

All in all it's nice to see ARM-in-FASM but I'm still looking for a good platform to code on. I've been on Intel from my first assembly days :S, well, some Hitachis also but that was a long time ago...
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20302
Location: In your JS exploiting you and your system
revolution 09 Feb 2006, 04:53
Quote:
ARM consumes less power because it has strict rules. Constant-length instructions make it easier for the processor to decode. Pentiums nowadays need to predict jumps and flags and fill/flush pipelines, and it all takes transistors and power.
The XScale has two instruction sets (ARM and THUMB), branch prediction logic, store buffers, split 32k data and code caches, TLBs, boot ROM, SRAM, an integrated memory controller, numerous serial ports and GPIOs, and lots of other stuff all together in one package. When running at 500 MHz it still needs less than 1 Watt per processor.

I said this:
Quote:
... in terms of clocks per instruction the ARM keeps pace almost one-to-one with a P4 for general algorithmic processing.
That does not make clear what I was thinking at the time. What I meant was more like this:

Consider three metrics: 1) instruction count required to encode Function-X, 2) instruction bytes required to encode Function-X, and 3) clock count required to execute Function-X, where Function-X is some non-trivial function that needs to be performed. In general, on all three metrics I find ARM code is very similar to P4 code.
Quote:
I'm still looking for a good platform to code on.
Aren't we all, but unfortunately it all comes down to popularity and marketing. The old 8086 was popular (because IBM was backing it) and Motorola (in the Apples) was not. So Intel made lots of sales and poured money into marketing to keep it popular. That's why we have today's P4s.
Kuemmel



Joined: 30 Jan 2006
Posts: 200
Location: Stuttgart, Germany
Kuemmel 09 Feb 2006, 08:50
Quote:
For the product I mentioned above it is probably the case that a single P4/3GHz could have done the same job as the four ARM CPUs, but the power consumption per board would be about 27 times higher (80W compared to 3W). In terms of processing power per Watt (MIPS/mW) the P4 doesn't stand a chance.


Yep, and then I always wondered where the CPU power consumption goes in x86...only heat !? I remember some article that said that the x86 CISC instruction set is decoded into small RISC-like instructions on the cpu and only then executed...

I just wonder: if the same money had been spent pushing the ARM architecture or DEC Alpha to the same GHz and cache levels, how fast and low-power could today's devices be...

@revolution: I often found the plentiful registers quite useful when I was doing time-critical inner loops for graphics effects or things like 64-bit fractals. I saved, like you said, a lot of memory accesses...which is maybe not so relevant on x86, of course. Anyway...for which OS do you write your code? Or do you use a C compiler with inline assembler for easier development?

@madis: thanks for the code hints on x86...seems there are some enhancements ;)
MazeGen



Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia
MazeGen 09 Feb 2006, 11:07
Kuemmel wrote:
I remember some article that said that the x86 CISC instruction set is decoded into small RISC-like instructions on the cpu and only then executed...

Yes, on the P4 these instructions are decoded into micro-ops, which are then stored in the trace cache.
Kuemmel



Joined: 30 Jan 2006
Posts: 200
Location: Stuttgart, Germany
Kuemmel 09 Feb 2006, 11:46
Madis731 wrote:
Yeah, these two are not exactly comparable, because one is meant for the mainstream and the other for things like handhelds and other integrations.

It may be interesting to know that the first ARM processors were actually designed for desktop machines. Maybe some people remember the Acorn Archimedes A3000 running an ARM2; it had the same 'look' as the Amiga 500. I think this design and the ARM3 were very competitive at the time. IIRC the ARM was developed by the Acorn computer company, resulting in the first CPU, called ARM1, in 1985.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20302
Location: In your JS exploiting you and your system
revolution 09 Feb 2006, 12:31
Quote:
I often found the plentiful registers quite useful when I was doing time-critical inner loops for graphics effects or things like 64-bit fractals. I saved, like you said, a lot of memory accesses...which is maybe not so relevant on x86, of course.
It is easy to find examples that favour either X86 or ARM. I was talking more generally, in an overall sense.
Quote:
Anyway...for which OS do you write your code? Or do you use a C compiler with inline assembler for easier development?
Revolution OS :) Actually it has no official name, I was just being flippant. It is entirely an internal OS used in my company, written with Notepad and assembled with FASMARM. I can't release the code here though, company secrets etc. The development platform is PC based of course; my laptop has been very busy lately doing this project.
pierre



Joined: 07 Nov 2004
Posts: 10
pierre 10 Feb 2006, 12:50
Maybe you should have a look at catt, it translates x86 to arm.

http://www.sax.de/~adlibit/
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20302
Location: In your JS exploiting you and your system
revolution 10 Feb 2006, 14:24
Quote:
Maybe you should have a look at catt, it translates x86 to arm.
Thanks for the link. Personally I don't have LINUX installed to run it, but it would be interesting to run some code through it to see how well it does its job. I expect many optimisations would be required after it has done its job.
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 10 Feb 2006, 14:33
Or ask the author to add 'smart' optimization features in the program. :)
Giant



Joined: 10 Feb 2006
Posts: 14
Giant 10 Feb 2006, 22:15
Hello! I am extremely excited about my old friend fasm working on my gumstix board. Gumstix features an Intel ARM5 chip and Linux, for those who missed it.

I installed fasmarm on the cross-compiling machine and assembled the (only) ARMDWARF example that came with it. Object dump looks like a valid executable at a glance. As expected, the executable barfs, although in a very strange way:

#./ARMDWARF
Killed
#

Taking out the DWARF from the config lines makes it an invalid executable from the dump view, and looks like this in action:

#./ARMDWARF
./ARMDWARF: 1: Syntax Error: "(" unexpected
#

Any ideas on how to set up a very simple executable in this environment? It should be pretty easy by the smell of it.

Does anyone have working code? Any success stories on other ARM platforms?

Thank you in advance
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20302
Location: In your JS exploiting you and your system
revolution 11 Feb 2006, 04:32
Quote:
the executable barfs
I have currently only set FASMARM to support two formats, ELF DWARF and BINARY. Also, the example I gave uses the ARM semi-hosting system commands; for other systems (like LINUX) the SWI calls and parameters will most likely be different. If you are familiar with the calling procedures on your system you might be able to make another example program. Don't forget to post your results here. I will do what I can to help you out, but bear in mind that I don't have a similar system at hand to test with. Any extra debugging information that your system might produce would also be useful. If you have a working "hello world" executable example for your system, post it here and I might be able to reverse engineer it and make a compatible format, time permitting.

The *.AXF files work for what they were intended to do. I have never executed them directly because with my system any final runtime code is assembled with BINARY format and loaded into ROM once testing and debugging is finished.
Giant



Joined: 10 Feb 2006
Posts: 14
Giant 11 Feb 2006, 16:22
...supported formats are ELF DWARF and BINARY...

Just to clarify - are these 2 the only supported formats? Is it possible to strip the DWARF stuff out and be left with a clean ELF executable? If yes, can you give an example of well-formed directive line(s) to do so?

I am still trying to bring up an fasm "hello world" on a gumstix linux... will keep everyone posted if I have progress to report.

Thanks
Giant



Joined: 10 Feb 2006
Posts: 14
Giant 11 Feb 2006, 20:57
I am somewhat stumped in my attempt to bring up a simple executable on arm-linux for gumstix. This is a 2.6 kernel, a pretty straightforward linux port. I've reduced the test program to:

Code:
format ELF dwarf executable at 0
entry start
section 'one' executable readable writeable align 20h
start:
 mov r0,0 ;exit code 0
 swi 0x900001 ;syscall exit

Trying to run it results in the "Killed" message.

I guess there is a reason for 4K of setup/teardown code in the libraries after all. I was hoping for an easy fix. It's just too close to give up, but I don't have any good ideas as I am neither a Linux nor a fasm expert...
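
For reference, the 0x900001 in the listing follows the old ARM Linux calling convention (OABI), where the syscall number is added to a base of 0x900000 and encoded in the SWI instruction itself; exit is syscall 1. A tiny Python sketch of that arithmetic, assuming an OABI kernel (the helper name oabi_swi_operand is hypothetical):

```python
OABI_SYSCALL_BASE = 0x900000  # base added to every syscall number under OABI

# Two well-known Linux syscall numbers
SYS_EXIT = 1
SYS_WRITE = 4


def oabi_swi_operand(syscall_nr):
    """Return the SWI operand that selects a Linux OABI system call."""
    if syscall_nr < 0:
        raise ValueError("syscall number must not be negative")
    return OABI_SYSCALL_BASE + syscall_nr
```

So oabi_swi_operand(SYS_EXIT) gives 0x900001, the value used with swi above.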
pelaillo
Missing in inaction


Joined: 19 Jun 2003
Posts: 878
Location: Colombia
pelaillo 12 Feb 2006, 17:29
Giant, maybe a writeable .bss section is required?
Look here: http://board.flatassembler.net/topic.php?t=3689

btw, gumstix seems to be exactly what I was looking for. Could you give me the reference of your board? I need a small, ARM-based, low-consumption platform to play with FASM on.
Giant



Joined: 10 Feb 2006
Posts: 14
Giant 12 Feb 2006, 18:20
I will try .bss next... Thank you

For gumstix see http://www.gumstix.com/products.html I have no affiliation and am not pushing it. You will need to get either a waysmall package or some combination of boards, as the gumstix by itself is not very useful. If you go piecemeal, you will need an additional etherstix if you want to network (or a wireless card), and a serial board like the waysmall stuart to reflash... It's still a pretty good deal for a 200 or 400 MHz arm-linux system that is literally the size of a pack of gum.

I would love to hear from anyone who can compile fasm code to gumstix or any other ARM linux platform.

Thanks
Giant



Joined: 10 Feb 2006
Posts: 14
Giant 13 Feb 2006, 14:50
Hello. I am still struggling to get it going. I incorporated a writeable section. No luck.

1) format ELF DWARF executable. Seems to require sections to have names and alignment.
Code:
format ELF DWARF executable
entry start
section 'one' readable executable align 0x20
start:
 mov r0,0 ;success
 swi 0x900001 ;syscall exit
section 'two' readable writeable align 020h
 ddd dd 0
    

RESULT: Killed
2) format ELF (is this unsupported?). It seems not to like section names and alignment, so:
Code:
format ELF executable
entry start
section readable executable
start:
 mov r0,0 ;success
 swi 0x900001 ;syscall exit
section readable writeable
 ddd dd 0
    

RESULT: does not assemble. says:
swi 0x900001 ;syscall exit
error: Constant value out of range


In fact swi seems to accept only 8-bit constants as an operand.

What gives? Thanks
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20302
Location: In your JS exploiting you and your system
revolution 13 Feb 2006, 17:50
"Format ELF" has not been converted to ARM format; only "format ELF DWARF" is in proper ELF format at present. The ELF formatter you are using with the above code expects to make a file for X86 code. You get the error message because in this mode the code is still in THUMB mode, where SWI only takes an 8-bit constant. Switch to CODE32 after setting the format.
Code:
format ELF executable 
entry start 
section readable executable 
code32 ;<-- add this line
start: 
 mov r0,0 ;success 
 swi 0x900001 ;syscall exit 
section readable writeable 
 ddd dd 0    
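
The out-of-range error is consistent with the instruction encodings: the 32-bit ARM SWI carries a 24-bit comment field, while the 16-bit Thumb SWI has room for only 8 bits. A small Python sketch of both encodings (the helper names are made up for illustration, and this is not how FASMARM implements it):

```python
def encode_arm_swi(imm24, cond=0xE):
    """Encode a 32-bit ARM SWI: cond | 1111 | 24-bit comment field."""
    if not 0 <= imm24 <= 0xFFFFFF:
        raise ValueError("ARM SWI operand must fit in 24 bits")
    return (cond << 28) | (0xF << 24) | imm24


def encode_thumb_swi(imm8):
    """Encode a 16-bit Thumb SWI: 11011111 | 8-bit comment field."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("Thumb SWI operand must fit in 8 bits")
    return 0xDF00 | imm8
```

With the default "always" condition, encode_arm_swi(0x900001) gives 0xEF900001, whereas encode_thumb_swi(0x900001) is rejected, which matches the assembler's complaint before CODE32 is selected.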


However, you may still be able to use the normal ELF format for your purpose. In the ARMv6.INC file, around line 311, you will see this code:
Code:
        mov     dword[edx],7fh + 'ELF' shl 8
        mov     al,1
        mov     [edx+4],al
        mov     [edx+5],al
        mov     [edx+6],al
        mov     byte[edx+10h],2         ;e_type
        mov     byte[edx+12h],40        ;machine type ARM
        mov     [edx+14h],al            ;e_version
        mov     dword[edx+024h],02000016h ;e_flags
        mov     byte[edx+28h],34h       ;e_ehsize
        mov     byte[edx+2ah],20h       ;e_phentsize
        mov     byte[edx+2eh],28h       ;e_shentsize    
After making your "format ELF" (without the DWARF) executable, edit the executable file and change the first few bytes to match the code above (edx is a pointer to the beginning of the executable, so all offsets are relative to the start of the file). The most important is the machine type at file offset 12h; the value must be 28h (40 decimal) to indicate ARM code.

The next most important is the flags at offset 24h. Try the value 02000016h first. If that does not work, try 02000002h instead. Without a symbol table in the executable the flags might need to be zeroed, so maybe also try 02000000h or simply 0h.

Do you have an existing "hello world" or similar small working application? Compare the values in its first bytes; maybe the flags or some other setting needs to be different on your system.

My guess is that you will probably only need to change the two values at offsets 12h (ARM code identifier 28h) and 24h (flags). Everything else should be the same.
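
The manual edit described above could also be scripted instead of done in a hex editor. A minimal sketch in Python, assuming a little-endian 32-bit ELF file; the function name patch_elf_header and the default flags value are only illustrations taken from the offsets and values mentioned in this post, not part of FASMARM:

```python
import struct

E_MACHINE_OFFSET = 0x12  # 16-bit machine type field
E_FLAGS_OFFSET = 0x24    # 32-bit flags field
EM_ARM = 0x28            # machine type for ARM (40 decimal)
ELF32_HEADER_SIZE = 0x34


def patch_elf_header(data, flags=0x02000016):
    """Return a copy of an ELF32 image with the ARM machine type and the given flags."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if len(data) < ELF32_HEADER_SIZE:
        raise ValueError("file too short for an ELF32 header")
    patched = bytearray(data)
    struct.pack_into("<H", patched, E_MACHINE_OFFSET, EM_ARM)
    struct.pack_into("<I", patched, E_FLAGS_OFFSET, flags)
    return bytes(patched)
```

To use it, read the assembled file in binary mode, pass the bytes through patch_elf_header, and write the result to a new file. The other flag values suggested above (02000002h, 02000000h, 0h) can be tried by passing them as the flags argument.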

If you can get it to work then I will try to update the ELF format also when FASM v1.66 is ready.

[edit]Fixed my blunder with the ARM machine type[/edit]


Last edited by revolution on 14 Feb 2006, 03:47; edited 1 time in total

Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
