flat assembler
Message board for the users of flat assembler.

FASMARM v1.44 - Cross assembler for ARM CPUs

l_inc



Joined: 23 Oct 2009
Posts: 881
l_inc 24 May 2016, 14:51
revolution
Quote:
For example we cannot instruct it to use a fixed set of bits of our choosing for ARM32 ADR/ADD

Once again: the linker is out of play here. We need not instruct it to calculate with values beyond its normal evaluation of the r_info field. What we need is to calculate at assembly time and see whether the result of such calculations is encodable within the constraints imposed by the instruction encoding and the r_info field. Let me simplify the example I gave, because you're obviously missing the point. Assume foo is a relocatable label. Fasm can already compile the statement mov eax, foo+0x100. But that does not mean it instructs the linker or the OS loader to add 0x100 to foo, right? Instead it takes the absolute value of foo, adds 0x100, and then applies the relocatability property to the result, which is what it can encode into the instruction.

Quote:
Also note that the linker can change ADD to SUB and the assembler has no control over that.

Yes, this is something that needs to be considered as well, because a classical linker does not fiddle with already existing instruction opcodes, even though modern linkers have learned lots of tricks. However, I think this is not gonna be a problem, because the linker cannot go beyond what the relocation field allows, so whatever transformation it applies will remain correct for the type of relocation specified by the translator.

Quote:
Since using fixed bitmasks cannot work

I still think calculations, including but not limited to those with bitmasks, should work.

_________________
Faith is a superposition of knowledge and fallacy
Post 24 May 2016, 14:51
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 24 May 2016, 15:12
We can only tell the linker which symbol name we want to link to. We cannot know the value of the symbol, so the assembler cannot check any bits to see if they "fit". What would we calculate from?

For example the Gn relocs for ADR/ADD have this formula applied:
Quote:
A group, Gn, is formed by examining the residual value, Yn, after the bits for group Gn–1 have been masked off.
Processing for group G0 starts with the absolute value of X. For ALU-type relocations a group is formed by
determining the most significant bit (MSB) in the residual and selecting the smallest constant Kn such that

MSB(Yn) & (255 << 2Kn) != 0,
except that if Yn is 0, then Kn is 0. The value Gn is then

Yn & (255 << 2Kn),
and the residual, Yn+1, for the next group is

Yn & ~Gn.
Note that if Yn is 0, then Gn will also be 0.
For group relocations that access memory the residual value is examined in its entirety (i.e. after the appropriate
sequence of ALU groups have been removed): if the relocation has not overflowed, then the residual for such an
instruction will always be a valid offset for the indicated type of memory access.
Overflow checking is always performed on the highest-numbered group in a sequence. For ALU-type relocations
the result has overflowed if Yn+1 is not zero. For memory access relocations the result has overflowed if the
residual is not a valid offset for the type of memory access.
Note: The unchecked (_NC) group relocations all include processing of the Thumb bit of a symbol. However, the
memory forms of group relocations (eg R_ARM_LDR_G0) ignore this bit. Therefore the use of the memory
forms with symbols of type STT_FUNC is unpredictable.
So we have no control over which bits the linker chooses. The assembler cannot know if the distance to symbol X is positive or negative, or less than 0x3fff, or even if X can be constructed with three instructions.

I am not sure which part you are missing here. Perhaps you misunderstand that ARM code cannot use a single instruction to encode all values? Anyhow, no, we cannot tell the linker which bits to use for each instruction. It chooses them, and it rewrites the instructions (ADD/SUB) accordingly.
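To make the quoted rule concrete, here is a minimal Python sketch of the ALU group split. The function name alu_groups and the sample offset 0x12345 are made up for illustration; only the Kn / Gn / residual steps come from the quoted text, and this is not a description of how any particular linker is implemented.

```python
def alu_groups(x, count):
    """Split |x| into `count` ALU groups following the quoted Gn rule.

    Returns (groups, residual). A non-zero residual after the last
    group corresponds to the overflow condition in the quoted text.
    """
    residual = abs(x)          # processing for G0 starts with |X|
    groups = []
    for _ in range(count):
        if residual == 0:
            k = 0              # "if Yn is 0, then Kn is 0"
        else:
            msb = 1 << (residual.bit_length() - 1)
            k = 0
            # smallest Kn such that MSB(Yn) & (255 << 2Kn) != 0
            while msb & (255 << (2 * k)) == 0:
                k += 1
        g = residual & (255 << (2 * k))   # Gn = Yn & (255 << 2Kn)
        groups.append(g)
        residual &= ~g                    # Yn+1 = Yn & ~Gn
    return groups, residual


# Hypothetical distance to X, known only once the linker has placed things.
three, left3 = alu_groups(0x12345, 3)
two, left2 = alu_groups(0x12345, 2)
print([hex(g) for g in three], hex(left3))
print([hex(g) for g in two], hex(left2))
```

With three groups the sample offset splits with nothing left over. With only two groups the residual is non-zero, which is the case the linker would flag as an overflow on the checked relocation. Nothing here can run at assembly time for an unresolved external X, because the input value is simply not available yet.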
Post 24 May 2016, 15:12
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 24 May 2016, 16:45
I asked one of my colleagues to review this exchange and perhaps shed some light onto the misunderstanding. She suggested that the difficulty might arise from my usage of "overflow check". It should be noted that the assembler does not perform this overflow check (because it can't, even in principle, know what to check), it always assembles without error. And what we are in fact doing is instructing the linker at what point to check for overflow based upon the context of the instruction. It is the linker that is doing the checks. I hope this helps to clarify things and we can move on to discussing the original question of how to best represent this in the source code.
Post 24 May 2016, 16:45
l_inc



Joined: 23 Oct 2009
Posts: 881
l_inc 24 May 2016, 17:07
revolution
There's definitely some part of the overflow check that can be done at assembly time. As for the rest of the check, some types of relocations are exactly for the purpose of the link-time overflow check. The misunderstanding is IMHO that you think I want to explicitly instruct the linker to do some calculations ("Anyhow, no, we cannot tell the linker which bits to use for each instruction"). No, I don't suggest that. Moreover when you say: "We can only tell the linker which symbol name we want to link to" — then it seems you're mixing up unrelated things, that is external symbols and relocations. Relocatable symbols are symbols you do know the value of. It's just not absolute. And you can calculate with it. You cannot divide it by 7 and you cannot use it as a denominator of a quotient, but you still can add and subtract absolute values from it to form an addend.

Anyhow, just provide an example of a group of instructions that need a relocation of some type R_ARM_XXX_Gn, and later this week, equipped with the documentation, I'll tell you what the programmer should specify for these instructions and what calculations the translator should do to pick that type of relocation and encode the addend.

_________________
Faith is a superposition of knowledge and fallacy
Post 24 May 2016, 17:07
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 24 May 2016, 17:30
Use my example posted earlier:
Code:
adr r1,X ;use the top bits i.e. G0_NC
add r1,r1,X ;use the next 8 bits i.e. G1_NC
add r1,r1,X ;use the final 8 bits i.e. G2    
Also allow for the user to select only two instructions, or just one if they so desire. For those cases the "_NC" part needs to be adjusted accordingly.

BTW: I already know how to instruct the linker to make the address. I am only looking for a good way to instruct the assembler about what is intended. I didn't like the AS style, but it works fine in a technical sense.

But I do not understand your comment "There's definitely some part of the overflow check that can be done at assembly time". If the target address is unknown then what check(s) can be done? The target symbol might be the very next address, or it might be 2GB distant, either before or after. We can't know ahead of time what order the linker will be told to use for each of the modules. User A might link them A:B, and user B might link them B:A. If our code is in module A and links to a symbol defined in module B, then I don't see how we can know the distance during assembly. Only the linker knows.
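The link-order point can be shown with a toy layout. This is only a sketch under made-up assumptions: the module sizes, base address, offsets, and the helper names layout and distance are all hypothetical, and real linkers do far more than concatenate modules.

```python
def layout(order, sizes, base=0x8000):
    """Place modules back to back in the given order; return their base addresses."""
    bases, addr = {}, base
    for name in order:
        bases[name] = addr
        addr += sizes[name]
    return bases


def distance(order, sizes, ref_offset, sym_offset):
    """Distance from a reference inside module A to a symbol inside module B."""
    bases = layout(order, sizes)
    return (bases["B"] + sym_offset) - (bases["A"] + ref_offset)


sizes = {"A": 0x1000, "B": 0x200000}           # hypothetical module sizes
ab = distance(["A", "B"], sizes, 0x10, 0x40)   # user A links A:B
ba = distance(["B", "A"], sizes, 0x10, 0x40)   # user B links B:A
print(hex(ab), hex(ba))
```

The same source gives a small positive distance in one order and a large negative one in the other, so both the sign (ADD versus SUB) and the number of groups needed are decided only at link time.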
Post 24 May 2016, 17:30
l_inc



Joined: 23 Oct 2009
Posts: 881
l_inc 24 May 2016, 22:35
revolution
I'm guessing you mean X is defined as extrn X, right? While the ELF specification defines relocation as "the process of connecting symbolic references with symbolic definitions", this is not relocation in the classical sense. As you can see, you are rather talking about symbol resolution, not relocation. That's where the misunderstanding comes from. As for relocation, the symbol X could be defined like label X at $+0x100000. In this case it indeed needs relocation, not resolution, and in this case you also do not know the actual address before link time or even run time, but unlike for the unresolved X there's much more address-related information that you can work with, including some limited overflow checking.

_________________
Faith is a superposition of knowledge and fallacy
Post 24 May 2016, 22:35
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 25 May 2016, 02:24
So now that you understand, do you have a suggestion?

BTW: There is no mention of the word "resolution" anywhere in the ELF specs so I don't know where you get that terminology. It seems to be your own?
Post 25 May 2016, 02:24
l_inc



Joined: 23 Oct 2009
Posts: 881
l_inc 25 May 2016, 07:01
revolution
Quote:
So now that you understand, do you have a suggestion?

I'd still need some time to sort things out.
Quote:
BTW: There is no mention of the word "resolution" anywhere in the ELF specs so I don't know where you get that terminology. It seems to be your own?

Right. And the second reason for the misunderstandings is that you ignore what I write. :P The wording "as you can see" points to where it comes from. So if you had a look at the link I'd provided, you'd know it's the generally known and accepted term.

_________________
Faith is a superposition of knowledge and fallacy
Post 25 May 2016, 07:01
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 25 May 2016, 07:16
This is the ARM world here so I guess you'll just have to get used to the way people talk within these surroundings. :) Don't go confusing yourself by applying terminology from another field and expecting to be universally understood.
Post 25 May 2016, 07:16
l_inc



Joined: 23 Oct 2009
Posts: 881
l_inc 25 May 2016, 09:28
revolution
This has nothing to do with ARM. The term is just as common and universal as the term compilation. Redefining any of these in "the ARM world" makes no sense. And you not being familiar with the term "symbol resolution" is as astonishing to me as you not knowing the term "compilation". A newbie programmer does not need much experience to come across an "unresolved external symbol" error. It's one of the most common problems asked about on the Internet. For Windows primarily, but here's what the gcc linker for ARM knows about unresolved symbols:

Code:
$ arm-eabi-ld --help |grep unresolved
  --no-undefined              Do not allow unresolved references in object files
  --allow-shlib-undefined     Allow unresolved references in shared libraries
  --no-allow-shlib-undefined  Do not allow unresolved references in shared libs
  --unresolved-symbols=<method>
                              How to handle unresolved symbols.  <method> is:
  --warn-unresolved-symbols   Report unresolved symbols as warnings
  --error-unresolved-symbols  Report unresolved symbols as errors
  --ignore-unresolved-symbol SYMBOL
  -z defs                     Report unresolved symbols in object files.    


Conceptually image relocation and symbol resolution are completely different things. Technically these probably might have a partially common processing mechanism. I just need to read some more ELF documentation to understand to what extent, and if there's indeed at least a technical reason for mixing up these.

P.S. Btw. I've just searched through the ELF documentation and disregarding its senseless definition for relocation it also mentions symbol resolution separately from relocation.

_________________
Faith is a superposition of knowledge and fallacy
Post 25 May 2016, 09:28
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8358
Location: Kraków, Poland
Tomasz Grysztar 25 May 2016, 10:27
l_inc wrote:
Conceptually image relocation and symbol resolution are completely different things. Technically these probably might have a partially common processing mechanism. I just need to read some more ELF documentation to understand to what extent, and if there's indeed at least a technical reason for mixing up these.

P.S. Btw. I've just searched through the ELF documentation and disregarding its senseless definition for relocation it also mentions symbol resolution separately from relocation.
Yes, the confusion is probably caused by the fact that there is one common mechanism and data structure format for both symbol resolution and relocation, not only in ELF but also in other common object formats. So even though ELF documentation acknowledges the symbol resolution as a separate concept, you will not find any data structures related to it, because it is all done with the same structure - relocations*.

The most frequent type of relocation is the one that defines the final value as symbol+addend (or symbol+addend-$ when addresses are encoded as relative, not absolute). When the addend is 0, you have a plain symbol resolution. When the referenced symbol is your own section, this becomes a plain relocation (and the addend is then an address within the section that needs to be relocated). A hybrid is also possible: in fasm you can define things like "label Y at X + 123h" (with "extrn X"), and then if you use the Y symbol, in the object file you get a relocation entry that resolves symbol X with an addend 123h. As for the overflow checking, the assembler can (and should) check for an overflow in the addend, but only the linker/loader can check for an overflow in the final value.
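A small Python sketch of that split of responsibilities, for illustration only. The helper names (fits_signed, make_reloc, apply_reloc), the dictionary record, the signed 32-bit field, and the address 0x10000 for X are all assumptions of mine, not fasm or ELF structures.

```python
def fits_signed(value, bits):
    """True if value is representable in a signed two's-complement field."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def make_reloc(symbol, addend, bits=32):
    """Assembler side: only the addend is known, so only it can be checked."""
    if not fits_signed(addend, bits):
        raise OverflowError("addend does not fit the relocation field")
    return {"symbol": symbol, "addend": addend}


def apply_reloc(reloc, symbol_values, place=None, bits=32):
    """Linker side: final value is symbol+addend, minus the place if relative."""
    value = symbol_values[reloc["symbol"]] + reloc["addend"]
    if place is not None:
        value -= place
    if not fits_signed(value, bits):
        raise OverflowError("final value does not fit the relocation field")
    return value


# "label Y at X + 123h" with "extrn X": a use of Y becomes a reloc against X.
reloc_y = make_reloc("X", 0x123)
final_abs = apply_reloc(reloc_y, {"X": 0x10000})                # absolute form
final_rel = apply_reloc(reloc_y, {"X": 0x10000}, place=0x8000)  # relative form
print(hex(final_abs), hex(final_rel))
```

An addend of 0 against an external symbol is the plain resolution case, and a non-zero addend against your own section is the plain relocation case. Both go through the same record, which is the point made above.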

___
* Isn't it nice to have one simple entity that covers many use cases? It is just like some of fasm's directives. I wonder if it's something that mostly mathematicians like to do when programming, or a more universal inclination.
Post 25 May 2016, 10:27
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8358
Location: Kraków, Poland
Tomasz Grysztar 25 May 2016, 13:22
BTW, if you need to experiment with the assembly of ELF objects, you may take a look at ELFOBJ.INC I included with the examples in fasm g package. It is a relatively simple set of macros that emulate the ELF formatter from fasm 1 and should generate the same output, but it is much easier to tweak them than it is to tweak fasm 1 source. If you had some ARM instruction set macros (I have not written such macros myself, I thought that perhaps revolution would be the right person to do it) you could very quickly adjust these macros for the ARM purposes and then try adding new relocation types and experiment with them.
Post 25 May 2016, 13:22
l_inc



Joined: 23 Oct 2009
Posts: 881
l_inc 25 May 2016, 23:29
Tomasz Grysztar
That's a nice summary for the confusion clarification. However I still wouldn't call the hybrid case relocation (or a hybrid case). As for me, this is resolution with an offset, i.e. effectively resolution of an imaginary unnamed symbol that has a fixed offset from some known named symbol, and hence conceptually still resolution. Relocation can only happen if there is an original assumed address that needs to be changed because of violation of the assumption, and the change is an offset equally applied to every such symbol within a section.

Quote:
Isn't it nice to have one simple entity that covers many use cases?

It is, but only as long as all the potential use cases have a common part that fits the simple entity. If not, then attempting to stretch the entity over new use cases not considered before will result in Frankenstein-like solutions.

Quote:
I wonder if it's something that mostly mathematicians like to do when programming

Might I say, mathematicians often write the worst code in terms of readability and maintainability. They have little coding culture and hygiene, they hardcode constants, use short names and squash similar things together, so that any attempt to widen the domain of the things makes one suffer while separating back the flies and the rissoles.


As for your suggestion regarding fasm g, I actually was going to use its syntax to demonstrate what I had in mind for ARM relocation support. I'm not sure, though, whether that still makes sense. But revolution could definitely make use of it to play around with possible syntaxes for relocations without having to implement full instruction set support.

_________________
Faith is a superposition of knowledge and fallacy
Post 25 May 2016, 23:29
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 26 May 2016, 03:05
I've implemented my initial proposal but I am still unsure if it makes sense to do it that way. I am happy to hear all suggestions and thoughts.
Post 26 May 2016, 03:05
Andrew Martin



Joined: 30 Sep 2015
Posts: 29
Location: 404, Lugansk
Andrew Martin 08 Nov 2016, 17:26
FASMARM v1.4

This code

Code:
format BINARY
thumb
processor 0x00098280       ; cortex-m0


dw      _sti_handler + 1

db 250 dup (?)

        align   4
_sti_handler:

        beq     .continue
    .continue:
        nop
    


will produce

Code:
 beq     .continue
error: Requires CPU capability 7M, use directive "processor" to select.
    


Perhaps a new bug detected.
Post 08 Nov 2016, 17:26
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 09 Nov 2016, 03:07
Andrew Martin wrote:
Perhaps a new bug detected.
Yes it is. Thanks for the report.

For reference a minimal source example to trigger the bug is this:
Code:
thumb
processor 0x80
rb 256
beq @f
@@:    
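For context, a Python sketch of the range check for the 16-bit Thumb conditional branch (the form with an 8-bit immediate, which is the only conditional branch available on ARMv6-M parts such as the Cortex-M0). The function name is hypothetical. The error text suggests the assembler reached for a wider encoding, but that is an inference from the message, and this is not a description of fasmarm's internals.

```python
def encode_thumb_bcond_16(cond, branch_addr, target_addr):
    """Encode a 16-bit Thumb B<cond>, or return None if the target is out of range.

    cond is the 4-bit condition code (0 = EQ, 1 = NE, ...); values 14 and 15
    are not conditional branches in this encoding space and are rejected.
    """
    if not 0 <= cond <= 13:
        return None
    # In Thumb state the PC reads as the instruction address plus 4.
    offset = target_addr - (branch_addr + 4)
    # The immediate is 8 bits, shifted left by one and sign-extended.
    if offset % 2 != 0 or not -256 <= offset <= 254:
        return None
    imm8 = (offset >> 1) & 0xFF
    return 0xD000 | (cond << 8) | imm8


# beq to the label right after the branch, as in the minimal example above:
near = encode_thumb_bcond_16(0, 256, 258)
far = encode_thumb_bcond_16(0, 256, 1000)   # too far for the 16-bit form
print(hex(near), far)
```

A forward label whose final address is not yet known in an early pass can look out of range even when the settled offset is tiny, which is one plausible way a forward-reference bug could show up as a request for a larger encoding.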


Version 1.41 now available:
Quote:
v1.41 2016-Nov-09
  • Fix a bug with forward referencing of labels in thumb mode
Note that this version assembles against fasm v1.71.57
Post 09 Nov 2016, 03:07
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 24 Feb 2017, 21:02
The readme says
Code:
For 64-bit code only the binary format is
currently supported. ELF64 and PE64 formats have not yet been updated.    


I suppose this is the reason for this?
Code:
pc@pc-500-210qe:~/Desktop$ ./fasmarm armFish.asm new
flat assembler for ARM  version 1.41 (built on fasm 1.71.57)  (16384 kilobytes memory)
3 passes, 220 bytes.
pc@pc-500-210qe:~/Desktop$ qemu-aarch64 ./new
./new: Invalid ELF image for this architecture
pc@pc-500-210qe:~/Desktop$ file new
new: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped
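The "x86-64" in the file output comes from the e_machine field of the ELF header. A minimal Python sketch of reading it follows, using a synthetic header rather than the real output file. The function name elf_machine and the sample bytes are made up for illustration; the two machine constants are the standard ELF values.

```python
import struct

EM_X86_64 = 62     # standard e_machine value for x86-64
EM_AARCH64 = 183   # standard e_machine value for AArch64


def elf_machine(header):
    """Return e_machine from the first 20 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] is EI_DATA: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_ident is 16 bytes, e_type is 2 bytes, so e_machine sits at offset 18.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return machine


# Synthetic header prefix: ELF64, little-endian, ET_EXEC, machine = x86-64.
sample = (b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
          + struct.pack("<HH", 2, EM_X86_64))
print(elf_machine(sample) == EM_AARCH64)
```

To check a real file, pass it the first 20 bytes, for example open("new", "rb").read(20). A loader for AArch64 would expect EM_AARCH64 there, which would be consistent with qemu-aarch64 rejecting the image.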
    


BTW, if you could point out any mistakes in the following, that would be helpful.
Code:

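        format ELF64 executable
        entry start

        processor cpu64_v8
        code64

        segment readable executable

start:
                mov  w0, 0              ; file descriptor (0 is stdin; stdout is 1)
                adr  x1, hello          ; pointer to the string
                mov  w2, hello_len      ; number of bytes to write
                mov  x8, 64             ; AArch64 Linux syscall 64 = write
                svc  0
                mov  w0, 6              ; exit status
                mov  x8, 93             ; AArch64 Linux syscall 93 = exit
                svc  0

hello:  db      'Hello world',10
hello_len=$-hello

        ;dummy section for bss, see http://board.flatassembler.net/topic.php?t=3689
        segment writeable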
Post 24 Feb 2017, 21:02
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 25 Feb 2017, 02:09
Support for ELF64 (and perhaps PE64) is still a work in progress.
Post 25 Feb 2017, 02:09
tthsqe



Joined: 20 May 2009
Posts: 767
tthsqe 25 Feb 2017, 03:41
Is it hard? I would be so very happy if you could at least get the ELF64 working.
Post 25 Feb 2017, 03:41
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20445
Location: In your JS exploiting you and your system
revolution 25 Jul 2017, 09:52
Version 1.42 now available:
Quote:
v1.42 2017-Jul-23
  • Fix a bug with unclosed curly and square bracket parsing
Note that this version assembles against fasm v1.71.64
Post 25 Jul 2017, 09:52

Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
