flat assembler - FASMARM v1.44 - Cross assembler for ARM CPUs




revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20798
Location: In your JS exploiting you and your system

revolution 10 Nov 2005, 03:58

I have uploaded version 1.04 at the top of this thread.

One bug fixed with code classification when using TIMES:

Code:

times 4 nop ;was previously tagged as data

I also reduced the size of the embedded line number information.


Tomasz Grysztar

Joined: 16 Jun 2003
Posts: 8494
Location: Kraków, Poland

Tomasz Grysztar 10 Nov 2005, 11:26

I am myself considering designing a dedicated parser for syntaxes like ARM, so that fasm could be adapted to use the original syntax of that architecture.
I'm also working now (though only in the design stage at the moment) on a new line of fasm's core (it will become 1.65.x when ready) that will support exporting any kind of debug information or listings in a much better way, without not-fully-reliable tricks like the one used by my listing extension. To work well and reliably, this needs some important changes in all the basic internal structures used by fasm, so it might still take me some time to finish.


MazeGen

Joined: 06 Oct 2003
Posts: 977
Location: Czechoslovakia

MazeGen 10 Nov 2005, 12:28

Yup! Any kind of debug information! :)


revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20798
Location: In your JS exploiting you and your system

revolution 11 Nov 2005, 01:30

I have used some very ugly patches to make the ARM code parser work properly. For the most part it does parse the standard ARM syntax, with two exceptions: the # that marks literal values, and labels starting with digits. But in my opinion the way ARM uses the hash is entirely unnecessary and verbose. The other thing that I find rather stupid about ARM syntax is forcing all labels to start in column 1 and not allowing any instructions to start in column 1. I feel more comfortable using FASMARM than the ARM ADS (but my opinion may be biased ;) ).

I used the listing code as my template to generate the labels, line numbers and code classification information. This means there are always three extra passes for the ELF DWARF format. For my current projects this usually means doubling from 3 to 6 passes (about 10 seconds now becomes 20 seconds) to assemble (but still much faster than ADS). If there is a way to improve this process then that would be a very good thing.


Tomasz Grysztar

Joined: 16 Jun 2003
Posts: 8494
Location: Kraków, Poland

Tomasz Grysztar 11 Nov 2005, 01:59

Yes, and actually any additional passes should be avoided. After the code has been resolved, doing one more pass might be risky: for example, the %t symbol might get a different value, and so, in the rare cases where the generated code depends on the value of %t, the additional passes would not exactly repeat the already successfully resolved code. That's why I'm designing another way to gather such information, though on the other hand it will slow down the required passes a bit.


vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7103
Location: Slovakia

vid 11 Nov 2005, 09:16

No problem, FASM is fast enough anyway, I think, for anyone, and this would be worth some slowdown.


revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20798
Location: In your JS exploiting you and your system

revolution 11 Nov 2005, 16:37

Quote:

actually any additional passes should be avoided

I see a simple way to avoid the %t problem mentioned above is to call the time only once at startup and just use that time throughout the assembly process. Of course this can mean the time used in the code may be 10 or 20 seconds (for big projects) earlier than the finish time, but I don't think that is an important thing to worry about.
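The idea of sampling the clock only once can be sketched as follows. This is a minimal Python model, not fasm source: the `Assembler` class, `run_pass`, and the plain text substitution of `%t` are hypothetical names and simplifications chosen for illustration.

```python
import time


class Assembler:
    """Toy multi-pass assembler that freezes the %t timestamp at startup."""

    def __init__(self, clock=time.time):
        # Read the clock exactly once; every later pass reuses this value.
        self.timestamp = int(clock())

    def run_pass(self, source_lines):
        # Substitute the frozen timestamp instead of re-reading the clock.
        return [line.replace("%t", str(self.timestamp)) for line in source_lines]


def make_ticking_clock(start):
    """Fake clock that advances one second per call, to mimic a slow assembly."""
    state = {"now": start}

    def clock():
        value = state["now"]
        state["now"] += 1
        return value

    return clock


asm = Assembler(clock=make_ticking_clock(1131726000))
source = ["dd %t"]
first = asm.run_pass(source)   # a normal resolving pass
extra = asm.run_pass(source)   # an additional listing/debug pass
print(first == extra)
```

Because the clock is read only in the constructor, an extra pass reproduces the resolved output exactly, at the cost of the timestamp being slightly older than the finish time.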

However, that aside, I do see the point that you are making about the extra passes and agree that avoiding them is a good thing.


comrade

Joined: 16 Jun 2003
Posts: 1150
Location: Russian Federation

comrade 21 Jan 2006, 19:25

Tomasz, are you going to make this official?


Tomasz Grysztar

Joined: 16 Jun 2003
Posts: 8494
Location: Kraków, Poland

Tomasz Grysztar 21 Jan 2006, 19:33

I'd prefer it to be a "fork", a separate project: the changes I'm going to apply (or have already applied) to fasm's core may not agree with what is used here.
Also, separating it from "regular" fasm would allow adapting the parser and preprocessor more to the needs of ARM syntax, something I would strongly recommend.

On the other hand, it would be better to make such forks after I finish the core changes for the 1.65 line, since some of them are going to make it easier to collect various information about symbols etc.


comrade

Joined: 16 Jun 2003
Posts: 1150
Location: Russian Federation

comrade 21 Jan 2006, 21:16

Well, maybe at least put a link on your page, so people are aware that such a "fork" exists.


Tomasz Grysztar

Joined: 16 Jun 2003
Posts: 8494
Location: Kraków, Poland

Tomasz Grysztar 23 Jan 2006, 12:33

This is part of a wider question: why I don't maintain any official links page on my website. It's mainly because I'm afraid of it quickly getting out of date, while the message board can provide huge collections of links where anyone can add or update an entry at any time.


Kuemmel

Joined: 30 Jan 2006
Posts: 200
Location: Stuttgart, Germany

Kuemmel 07 Feb 2006, 21:27

Hi guys,

I recently discovered FASM as a way into learning x86 assembler...but before that I only coded in ARM assembly language, for years, on my desktop Acorn Risc PC with an ARM610, later equipped with a StrongARM, the DEC-developed predecessor of the Intel XScale.

About a decade ago there was a nice book about assembly programming called "Archimedes Assembly Language" by Mike Ginns...I remember that it is now available for free somewhere on the net.

There's also a relatively new, still supported desktop computer running a 600 MHz XScale: http://www.iyonix.com/

...just for your information, and to do a little bit of advertising for a wonderfully programmable CPU (except for floating point...). If you want to know more or need other help with the assembly programming, tell me...


vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7103
Location: Slovakia

vid 08 Feb 2006, 01:21

Kuemmel: well, somebody who seems experienced with ARM? Is it really useful to code for it in assembly? E.g., isn't there a different processor "in each" mobile phone? How much do the processors differ?
And what about the instruction set? Is it real hardcore RISC, with only basic things (and so harder to code by hand)?

Maybe you could link some ARM code examples...


Kuemmel

Joined: 30 Jan 2006
Posts: 200
Location: Stuttgart, Germany

Kuemmel 08 Feb 2006, 09:34

Hi vid,

yep, basically it's true, the CPUs vary a lot across all these devices (PDAs, phones, etc.), but they are all based on the same architecture. Normally all these companies license the architecture/design from ARM (www.arm.com) and then build the CPUs, as ARM itself doesn't produce any hardware.

These architectures come in a great variety and are enhanced all the time, but normally, within some limits, compatibility is there. A long time ago (ARM600), for example, there was a 26-bit mode; nowadays the XScale only supports 32-bit mode, so when you go too far back in ARM history there are some differences. ARM numbers the architectures, so the XScale is based on the ARMv5TE architecture, where 5 is the evolution number, T stands for the added 'Thumb' instruction set (16-bit instructions to save memory) and E for a kind of DSP-like, MMX-style extension.

Recently more things have been added to the core, like SIMD instructions, or Jazelle, which supports Java byte code...
http://www.arm.com/products/CPUs/architecture.html
...but of course you always have to check what the CPU-producing company (Intel, Samsung,...) is actually using of these offers from ARM...

I mainly coded on my Risc PC, using the core 26/32-bit CPU; that code is, I think, valid for all the later CPUs, except for the 26-bit thing...the first thing to know is that all instructions have a length of 32 bits (in ARM coding, 32 bits was always 1 word, not a DWORD as in x86 coding, as ARM was always a 32-bit RISC).

What I like most is the following stuff of it:

- 16 registers (R0 to R12 general purpose, R13 stack pointer, R14 link register, R15 program counter)
- extensive use of conditional execution on all instructions (not only mov), like:

Code:

cmp   r0, #10
movgt r0, #0
movle r0, #1

- doing multiple stuff with one instruction like:

Code:

add r0, r0, r1, LSL #5

...which means r0 = r0 + (r1 << 5).

and storing/loading multiple words:

Code:

stmia r0!,{r1-r8}

...this one stores r1 to r8 (8 * 32 bits of data) at r0 and increments the address (r0) after each store (IA = increment after), so that you can easily do a memory copy loop without the need for an add r0,r0,#32.
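The three snippets above can be modelled in a few lines. This is only a sketch in Python, assuming 32-bit registers and a byte-addressed memory represented as a dictionary; every function name here is hypothetical and chosen for illustration. The GT and LE tests follow the usual ARM flag definitions (GT: Z clear and N equal to V; LE: Z set or N different from V).

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a signed two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def cmp_flags(a, b):
    """Return the (N, Z, C, V) flags that comparing a with b (a - b) would set."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(a >= b)  # carry set means no unsigned borrow
    v = int(to_signed(a) - to_signed(b) != to_signed(result))  # signed overflow
    return n, z, c, v


def cond_gt(n, z, c, v):
    return z == 0 and n == v


def cond_le(n, z, c, v):
    return z == 1 or n != v


def conditional_moves(r0):
    """Model of: cmp r0, #10 / movgt r0, #0 / movle r0, #1."""
    flags = cmp_flags(r0, 10)  # the moves do not change the flags
    if cond_gt(*flags):
        r0 = 0
    if cond_le(*flags):
        r0 = 1
    return r0


def add_shifted(r0, r1, shift):
    """Model of the shifted add: r0 = r0 + (r1 << shift), wrapped to 32 bits."""
    return (r0 + (r1 << shift)) & MASK32


def stmia_writeback(memory, base, registers):
    """Model of stmia base!, {...}: store words upwards, return the new base."""
    for value in registers:
        memory[base] = value & MASK32
        base += 4  # each register is one 32-bit word, i.e. 4 bytes
    return base


memory = {}
new_base = stmia_writeback(memory, 0x1000, list(range(1, 9)))
print(conditional_moves(11), conditional_moves(3), add_shifted(1, 2, 5), hex(new_base))
```

Storing eight registers moves the base by 8 words of 4 bytes each, so the write-back leaves it 32 bytes further on.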

The instruction set is really RISC-like and limited, but I never actually felt limited, except by the missing FPU; even the missing DIV can normally be worked around with a MUL by 1/x. The XScale/StrongARM also has a 64-bit MUL extension.
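The "MUL by 1/x" trick for a constant divisor can be sketched as a fixed-point reciprocal: multiply by a precomputed constant, then shift right. The Python below is an illustration under stated assumptions, not code from this thread; the function names are hypothetical. It rounds the reciprocal up and uses a shift of 32 plus ceil(log2(divisor)), which gives exact results for every unsigned 32-bit dividend.

```python
def reciprocal_constants(divisor):
    """Return (multiplier, shift) such that, for every unsigned 32-bit n,
    n // divisor == (n * multiplier) >> shift."""
    if divisor < 1:
        raise ValueError("divisor must be a positive integer")
    extra = (divisor - 1).bit_length()  # ceil(log2(divisor))
    shift = 32 + extra
    # Round the scaled reciprocal up: ceil(2**shift / divisor).
    multiplier = ((1 << shift) + divisor - 1) // divisor
    return multiplier, shift


def divide_by_reciprocal(n, multiplier, shift):
    """Divide using only a wide multiply and a right shift."""
    return (n * multiplier) >> shift


mul10, shift10 = reciprocal_constants(10)
print(divide_by_reciprocal(12345, mul10, shift10))
```

Note that the multiplier computed this way can need 33 bits, so a real ARM routine built on a 32x32 to 64-bit multiply would need an adjusted constant or an extra correction step; Python's unbounded integers hide that detail.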

Downsides of the CPUs:
- No FPU (except the new multimedia extensions)
- No DIV instruction
- Memory/Video Memory access slow compared to general desktop x86 architecture
- No second level cache

All this is only as far as I know...the ARM world is big ;)

...it's the best-selling 32-bit CPU in the world (think of all the Nintendo Game Boy Advance/DS units). Sometimes I've seen demo coding for the Game Boy, but I don't know about their development tools; for sure ARM itself has lots of tools, maybe commercial ones.

When it comes to OS coding, it may be difficult to get hold of a decent tool or information for all the mobile phones...I don't have any experience with any OS other than RISC OS 4.


vid
Verbosity in development

Joined: 05 Sep 2003
Posts: 7103
Location: Slovakia

vid 08 Feb 2006, 10:39

so you say that it is possible to write code which would be portable among most of today's PDA, mobiles etc?


Kuemmel

Joined: 30 Jan 2006
Posts: 200
Location: Stuttgart, Germany

Kuemmel 08 Feb 2006, 13:12

@vid: Yep, as far as I can see the basic core instructions (Thumb and ARM) have been unchanged and always present in any ARM-based CPU for more than 5 years. Only the add-ons like Jazelle, SIMD, etc. are either there or not...in the end it would be the same as testing an x86 CPU for 3DNow! or SSE1/2/3 before you run the code.

The problem is the operating system. Probably there are many of them that I don't even know...as far as I know there's an ARM-based Windows CE, but of course also a lot of device-specific ones like the Nintendo DS OS, or whatever phone OS...but as there's software running on these devices...there must be some kind of documentation or tools...

@revolution...by the way...on which device do you run your ARM code !?


Madis731

Joined: 25 Sep 2003
Posts: 2138
Location: Estonia

Madis731 08 Feb 2006, 17:57

Kuemmel wrote:

Code:

cmp   r0, #10
movgt r0, #0
movle r0, #1

Code:

add r0, r0, r1, LSL #5

Code:

stmia r0!,{r1-r8}

To defend x86 a bit, then:
1) It has conditional moves and exchanges
2) The LEA instruction can be powerful in situations like:
ARM:
add r0, r0, r1, LSL #5
x86:
lea eax,[ebx*8+eax-14]
3) SIMD is also possible in this way:
REP MOVSB (or compare, scan,...word, doubleword etc.)

It's not THAT bad, the IA32 :P

_________________
My updated idol :D

http://www.agner.org/optimize/


revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20798
Location: In your JS exploiting you and your system

revolution 08 Feb 2006, 19:56

Quote:

revolution...by the way...on which device do you run your ARM code !?

My company makes proprietary systems using a number of different CPUs. The latest ARM product has a four-CPU daughter board for high-speed data encryption using the Intel PXA27x.

As for the discussion about which is better, I think it is not a fair comparison. The development budgets for the two CPUs are considerably different, and the design goals were very different.

Here are a few of the main differences I have noticed in no particular order (I'm not trying to say one or the other is better, just different):

ARM: 15 user registers + PC relative addressing
X86: 7 user registers + stack pointer

ARM: Register based addressing with limited immediate offsets or (almost) unlimited register shifting
X86: Memory addressing with unlimited immediate offsets and/or limited register shifts

ARM: 99% of user instructions are conditional
X86: Only a limited subset of MOV's are conditional

ARM: Strictly LOAD/OP/STORE architecture
X86: All arithmetic OP's can use memory directly

ARM: Large constants (>8 bits) are tricky to optimise
X86: Any constant can be placed in instruction stream

ARM: Low power consumption
X86: High power consumption

ARM: No hardware standard platform
X86: PC based platform is pseudo standard

ARM: Lack of software/tools/information for beginners
X86: Plenty of software/tools/information for beginners

ARM: Ability to address memory with negative register offsets
X86: Only additive register addressing supported

ARM: Basic RISC style instruction set
X86: CISC instruction set (DIV/DAA/ENTER/LEAVE etc.)

ARM: ALL I/O is memory based
X86: I/O using separate control bus

ARM: FLAG updates are selectable for each instruction
X86: FLAG updates are fixed into the instruction set

One of the things I have noticed while porting code from X86 to ARM is that, while having 15 general registers (not including PC) might seem good for avoiding stack accesses, in reality the lack of direct memory-based arithmetic operations and the difficulty of optimising constants mean that many registers end up holding memory address offsets or constants, sometimes leaving fewer registers available to perform the desired function. Another is that the ability to make instructions conditional often greatly improves the ability to process variable data. A third is that the predictable clocking and instruction throughput make optimising for ARM more deterministic and involve less guesswork. Overall, though, I feel that in terms of clocks per instruction the ARM keeps pace almost one-to-one with a P4 for general algorithmic processing.
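The difficulty with constants larger than 8 bits comes from how classic ARM data-processing instructions encode an immediate: an 8-bit value rotated right by an even amount. A rough Python check for whether a constant fits that form is sketched below; the function names are hypothetical, and this models only the ARM-mode encoding, not Thumb.

```python
def rotate_left32(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def encode_arm_immediate(value):
    """Return (imm8, rotate_right) if value is an 8-bit constant rotated right
    by an even amount, otherwise None."""
    value &= 0xFFFFFFFF
    for rotation in range(0, 32, 2):
        # Undo a rotate-right by 'rotation' and see if 8 bits are enough.
        imm8 = rotate_left32(value, rotation)
        if imm8 < 0x100:
            return imm8, rotation
    return None


for constant in (0xFF, 0x100, 0x101, 0xFF000000, 0x12345678):
    print(hex(constant), encode_arm_immediate(constant))
```

A constant for which this returns None has to be built from several instructions or loaded from memory, which is one way registers end up tied up holding constants.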

For the product I mentioned above, it is probably the case that a single P4/3GHz could have done the same job as the four ARM CPUs, but the power consumption per board would be about 27 times higher (80W compared to 3W). In terms of processing power per Watt (MIPS/mW) the P4 doesn't stand a chance.


Madis731

Joined: 25 Sep 2003
Posts: 2138
Location: Estonia

Madis731 08 Feb 2006, 21:47

Yeah, these two are not exactly comparable, because one is meant for the mainstream and the other for things like handhelds and other integrations.
ARM consumes less power because it has strict rules. Constant-length instructions make it easier for the processor to decode. Pentiums nowadays need to predict jumps and flags and fill/flush pipelines, and it all takes transistors and power. All for our convenience :) but it all comes with a price.

All in all it's nice to see ARM-in-FASM, but I'm still looking for a good platform to code on. I've been on Intel since my first assembly days :S, well, some Hitachis too, but that was a long time ago...


revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20798
Location: In your JS exploiting you and your system

revolution 09 Feb 2006, 04:53

Quote:

Why ARM consumes less power is because it has strict rules. Constant length instructions make it easier for processor to decode. Pentiums nowadays need to predict jumps, flags, fill/flush pipelines and it all takes transistors and power.

The XScale has two instruction sets (ARM and THUMB), branch prediction logic, store buffers, separate 32k data and code caches, TLBs, boot ROM, SRAM, an integrated memory controller, numerous serial ports and GPIOs, and lots of other stuff, all together in one package. When running at 500 MHz it still needs less than 1 Watt per processor.

I said this:

Quote:

... in terms of clocks per instruction the ARM keeps pace almost one-to-one with a P4 for general algorithmic processing.

That does not make clear what I was thinking at the time. What I meant was more like this:

Consider three metrics: 1) the instruction count required to encode Function-X, 2) the instruction bytes required to encode Function-X, and 3) the clock count required to execute Function-X, where Function-X is some non-trivial function that needs to be performed. In general, on all three metrics I find ARM code is very similar to P4 code.

Quote:

I'm still looking for a good platform to code on.

Aren't we all, but unfortunately it all comes down to popularity and marketing. The old 8086 was popular (because IBM was backing it) and Motorola (in the Apples) was not. So Intel made lots of sales and poured money into marketing to keep it popular. That's why we have today's P4s.

