[ARM] "Smart-Move" 32BIT, PC Relative Load, LDR =I




m3ntal 22 Jun 2014, 20:08
"Smart-Move" for FASMARM: load a 32-bit immediate either with a PC-relative load from a nearby literal table (ldr r, [pc, p-$-8]) or, optionally, with movw/movt. The =i prefix means "move 32-bit immediate" (compatible with GAS/GCC); the &i prefix requests a PC-relative load from the literal table.
Code:
; LITERAL.INC

literals equ  ; literal table
literals.i=-4 ; offset of last stored entry (-4 = empty)

; define literal 32BIT number. attach line/data
; definition: literals=itself+dw x

macro .literal x {
 literals equ literals, dw x
 literals.i=literals.i+4
}

; store literal table or initialize

macro .literals {
 local ..literals
 ?LITERALS \          ; address
  equ ..literals
 match j, literals \{ ; expand into a,b,c
  irp i, j \\{        ; expand each line
   i                  ; dw 1, dw 2, dw 3
  \\}
  restore ?LITERALS
 \}
 match , literals \{  ; initialize
  literals equ \      ; first lines...
  align 4,\
  ?LITERALS:,\        ; begin
  literals.i=-4       ; reset offset (ldr adds 4)
 \}
}

; move wide/top 32BIT

use.movwt? equ 0   ; 1 if CPU>=ARM.v6T2

macro movwt r, i {
 local lo16, hi16
 lo16=(i) and 0FFFFh  ; low 16BIT
 hi16=((i) shr 16) \  ; high 16BIT, taken from
  and 0FFFFh          ; the unmasked value
 movw r, lo16
 if hi16<>0           ; high 16BIT?
  movt r, hi16
 end if
}

; generic move 32BIT immediate (worst
; case scenario)

macro movi r, i {
 local n
 n=i and 0FFh
 mov r, n             ; mov r, i&FFh
 n=(i and 0FF00h)
 if n<>0              ; if 16+BIT...
  orr r, n            ; orr r, r, i&FF00h
 end if
 n=(i and 0FF0000h)
 if n<>0
  orr r, n            ; orr r, r, i&FF0000h
 end if
 n=(i and 0FF000000h)
 if n<>0
  orr r, n            ; orr r, r, i&FF000000h
 end if
}    
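To check the arithmetic behind movwt and movi outside the assembler, here is a minimal Python sketch. The function names split_movw_movt and movi_chunks are made up for illustration and are not part of LITERAL.INC. It only models which constants would be handed to movw/movt and to mov/orr, assuming 32-bit values.

```python
def split_movw_movt(value):
    """Split a 32-bit constant into the halves loaded by movw and movt."""
    value &= 0xFFFFFFFF
    low16 = value & 0xFFFF
    high16 = (value >> 16) & 0xFFFF  # taken from the unmasked value
    return low16, high16


def movi_chunks(value):
    """Byte-aligned pieces for a mov followed by up to three orr instructions."""
    value &= 0xFFFFFFFF
    chunks = [value & 0xFF]              # mov r, low byte (may be zero)
    for mask in (0xFF00, 0xFF0000, 0xFF000000):
        if value & mask:                 # bytes that are zero need no orr
            chunks.append(value & mask)
    return chunks


if __name__ == "__main__":
    print([hex(h) for h in split_movw_movt(0x12345678)])
    print([hex(c) for c in movi_chunks(0x123ABC)])
```

Each chunk produced by movi_chunks is an 8-bit value at a byte boundary, so it always fits an ARM rotated immediate. That is why the mov+orr fallback works for any 32-bit constant, at a cost of up to four instructions.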
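Upgrade the ldr instruction to recognise two extra operand forms. =i means "move 32-bit immediate": the macro emits a single mov (or mvn for -1) when the constant is an 8-bit value at one of the listed rotations, movw/movt when enabled (CPU>=ARM.v6T2), and a mov+orr sequence otherwise. &i requests an explicit PC-relative load from the literal table: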
Code:
; ldr r, [pc, p-$-8] ; ldr r, &77777777h
; .literal 77777777h

macro ldr [p] {
 common
  local i, x
  define ?s 0
  match r=, &n, p \{         ; ldr r, &i
   x=(?LITERALS+\            ; PC relative
    literals.i)-$-8+4        ; address
   ldr r, [pc, x]            ; load
   .literal n                ; store i
   define ?s 1               ; matched
  \}
  match =0 \                 ; else
   r=,==n, ?s p \{           ; ldr r, =i
   i=(n)
   if (i>=0 & i<=255)\       ; use mov?
    | (i=-1 | i=$FFFFFFFF)\  ; use mvn?
    | (i and $FFFFF00F)=0\   ; use mov+ror?
    | (i and $FFFF00FF)=0\   ; FF00h
    | (i and $FFF00FFF)=0\   ; FF000h
    | (i and $FF00FFFF)=0\   ; FF0000h
    | (i and $F00FFFFF)=0\   ; FF00000h
    | (i and $00FFFFFF)=0    ; FF000000h
    mov r, i
   else if use.movwt?        ; use movw/movt?
    movwt r, i               ; CPU>=ARM.v6T2
   else                      ; worst case
    movi r, i                ; scenario: mov+orr
   end if
   define ?s 1               ; matched
  \}
  if ?s eq 0                 ; else, use
   ldr p                     ; original ldr
  end if                     ; instruction
}

; TODO:

; * adr pseudo-instruction for relative address:
; add/sub x, pc, i
; * search table for existing 32BIT value
; * literal 'text': ldr r0, 'Text'    
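The =i branch above decides whether one mov is enough. As a cross-check, here is a small Python sketch of the general ARM rule (an 8-bit value rotated right by an even amount). The names encode_arm_immediate and pick_strategy are hypothetical helpers for illustration, not something FASMARM provides.

```python
MASK32 = 0xFFFFFFFF


def rol32(value, amount):
    """Rotate a 32-bit value left by amount bits (0..31)."""
    value &= MASK32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def encode_arm_immediate(value):
    """Return (imm8, rot) with imm8 ror (2*rot) == value, or None."""
    value &= MASK32
    for rot in range(16):
        imm8 = rol32(value, 2 * rot)   # undo a right rotation by 2*rot
        if imm8 <= 0xFF:
            return imm8, rot
    return None


def pick_strategy(value):
    """Model of the choice: one mov, one mvn, or a longer sequence."""
    if encode_arm_immediate(value) is not None:
        return "mov"
    if encode_arm_immediate(~value) is not None:
        return "mvn"
    return "movw/movt or mov+orr"


if __name__ == "__main__":
    for v in (0x80, 0x7F000, 0x80000000, 0xFFFFFFFF, 0x3FC, 0x1234):
        print(hex(v), pick_strategy(v))
```

The six masks in the macro cover byte values starting at bit 4, 8, 12, 16, 20 and 24. A constant such as 3FCh (0FFh rotated right by 30) is encodable but matches none of those masks, so the macro would take the longer path for it. The result is still correct, just not minimal.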
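Upgrade function/proc to insert the literal table automatically. Note: a fasm macro that reuses an existing name can still invoke the previous definition, which gives a custom form of inheritance; in my view this supersedes the C++/Java concept of OOP.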
Code:
macro function [p] {
 common
  function p ; function=itself/previous
  .literals  ; +this/new
}

macro endf {
 .literals   ; endf=this/new
 endf        ; +itself/previous
}    
Example (FASMARM):
Code:
include 'literal.inc'

.literals          ; initialize literal table

ldr r0, [r1]       ; original ldr
ldr r0, =-1        ; mvn
ldr r0, =80h       ; mov...
ldr r0, =7F0h
ldr r0, =7F00h
ldr r0, =7F000h
ldr r0, =7F0000h
ldr r0, =7F00000h
ldr r0, =7F000000h
ldr r0, =80000000h
ldr r0, =1234h     ; movw/movt or mov+orr
ldr r0, &11111111h ; ldr r, [pc, p-$-8]...
ldr r0, &22222222h
ldr r0, &33333333h
ldr r0, &44444444h
bkpt 7777h

.literals          ; store constants    
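The offsets that the four &i loads in this example need can be worked out by hand. Below is a minimal Python sketch of that calculation. The addresses are invented for illustration (the real ones depend on how many instructions the earlier =i lines expand to), and the helper names are hypothetical.

```python
ARM_PC_BIAS = 8  # in ARM state, pc reads as the instruction address + 8


def pc_relative_offset(instr_addr, literal_addr):
    """Offset x for 'ldr r, [pc, x]' placed at instr_addr."""
    return literal_addr - (instr_addr + ARM_PC_BIAS)


def table_offsets(load_addrs, table_addr):
    """Offsets when the k-th load reads the k-th 32-bit table entry."""
    return [pc_relative_offset(addr, table_addr + 4 * k)
            for k, addr in enumerate(load_addrs)]


if __name__ == "__main__":
    # Assumed layout: four loads at 100h..10Ch, bkpt at 110h, table at 114h.
    print(table_offsets([0x100, 0x104, 0x108, 0x10C], 0x114))
```

With consecutive loads reading consecutive entries, every load ends up with the same offset, because the instruction address and the entry address both advance by 4.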

Download updated Smart-Move+examples


Last edited by m3ntal on 25 Jun 2014, 20:38; edited 1 time in total


revolution 23 Jun 2014, 03:05
I've been considering doing automatic literal placement and optimising it to use both +4KB and -4KB when possible (so that means forward referencing constants before usage, and it means only one table per 8KB) by using the virtual address space feature. But I never found a nice solution that was clear and consistent. Plus the "behind your back" placement of literal pools is a little bit scary since the programmer has no good idea where such pools will end up.
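A rough way to picture the +4KB/-4KB window: an ARM-state ldr with an immediate offset can reach 4095 bytes either side of pc, and pc reads as the instruction address plus 8. The Python sketch below checks reachability under that assumption. The function names are illustrative only.

```python
LDR_MAX_OFFSET = 4095  # 12-bit offset magnitude, added or subtracted
ARM_PC_BIAS = 8        # pc reads as the instruction address + 8


def reachable(instr_addr, literal_addr):
    """True if a load at instr_addr can address literal_addr via pc."""
    return abs(literal_addr - (instr_addr + ARM_PC_BIAS)) <= LDR_MAX_OFFSET


def covered_span(literal_addr):
    """Lowest and highest instruction addresses that can reach a literal."""
    centre = literal_addr - ARM_PC_BIAS
    return centre - LDR_MAX_OFFSET, centre + LDR_MAX_OFFSET


if __name__ == "__main__":
    print(reachable(0x0000, 0x1007), reachable(0x0000, 0x1008))
    print([hex(a) for a in covered_span(0x2000)])
```

A pool placed in the middle of a block of code can therefore serve loads on both sides of it, which is where the "one table per 8KB" figure comes from.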

Another option is to place a pool once a full cache line of literals has accumulated, but that requires knowing the cache line size of the target CPU, which is not always known.

Do you have any ideas about this?

Also, regarding this line "else if use.movwt": You can use the %p variable to detect if the 6T2 instructions are enabled.

Also2, the usage of align 4 is not optimal. Most probably align 64 (or 32 for older CPUs) will be better to avoid the cache pollution problem of code and data contained in the same line. But if you don't care about performance then this won't matter.
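To put numbers on the alignment trade-off, here is a tiny Python sketch. It assumes the pool directly follows code and that the cache line size is 64 bytes (or 32), and the helper names are made up for illustration.

```python
def padding_to_align(addr, alignment):
    """Bytes of padding needed so that addr becomes a multiple of alignment."""
    return (-addr) % alignment


def shares_line_with_code(pool_addr, line_size):
    """True if a pool that directly follows code starts mid cache line."""
    return pool_addr % line_size != 0


if __name__ == "__main__":
    end_of_code = 0x1234  # assumed address where the pool would start
    for alignment in (4, 32, 64):
        pad = padding_to_align(end_of_code, alignment)
        start = end_of_code + pad
        print(alignment, pad, shares_line_with_code(start, 64))
```

With align 4 the pool here starts in the same 64-byte line as the preceding instructions. With align 64 it costs 12 bytes of padding but the pool starts on a fresh line.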



m3ntal 23 Jun 2014, 07:55
Quote:
I've been considering doing automatic literal placement... But I never found a nice solution that was clear and consistent. Plus the "behind your back" placement of literal pools is a little bit scary since the programmer has no good idea where such pools will end up.
First, load from literal table is optional with &i syntax. =i is for "move immediate" as in GAS/GCC. Second, I don't think including them before endf/endp is being secretive or hiding anything.
Quote:
Do you have any ideas about this?
My current implementation is fine for now. Here are ways I would improve it: adr pseudo-instruction for relative address: add/sub x, pc, i. Search table for existing 32BIT value. Literal 'text': ldr r0, 'Text'
Quote:
Also, regarding this line "else if use.movwt": You can use the %p variable to detect if the 6T2 instructions are enabled.
For compatibility with my own ARM assembler.
Quote:
Also2, the usage of align 4 is not optimal. Most probably align 64 (or 32 for older CPUs) will be better to avoid the cache pollution problem of code and data contained in the same line. But if you don't care about performance then this won't matter.
It's not that. It's just that I don't do what anyone else thinks is important. I do what I want. And I sure as Hell am not going to spend an eternity optimizing "Hello, World" or the message loop. I'm working on multiple projects: Micro-FASM (Raspberry PI), ZIDE, ZDS, etc.

New Web Page for this: Smart-Move


Last edited by m3ntal on 23 Jun 2014, 09:06; edited 1 time in total


revolution 23 Jun 2014, 08:03
m3ntal wrote:
Second, I don't think including [literal pools] before endf/endp is being secretive or hiding anything.
That assumes the use of procedure macros. I wanted to avoid the reliance on such constructs.



m3ntal 23 Jun 2014, 08:40
revolution: Do you have anything to contribute to this? Any variations?
Quote:
That assumes the use of procedure macros. I wanted to avoid the reliance on such constructs.
Do whatever you want in your code, whatever works for you. As in the example, .literals can be used without a function/procedure. I don't like the ugly mixed style either (proc, stdcall, etc). My function is simple and straightforward compared to proc.



m3ntal 25 Jun 2014, 20:29
New! Easy-Move+DP upgrade (ADD, SUB, CMP, ETC):

Supports immediate, [memory] and [s*i+b] syntax. Example:
Code:
include 'literal.inc'

mov r1, r2
mov r1, 80000000h
mov r1, [r2]
mov [r1], r2
mov r1, [r2+r3]
mov [r1+r2*4], r3
mov r1, [r2+123ABCh]

add r1, r2
add r1, 80000000h
add r1, [r2]
add [r1], r2
add r1, [r2+r3]
add [r1+r2*4], r3
add r1, [r2+123ABCh]

sub r1, r2
orr r1, 80000000h
and r1, [r2]
bic [r1], r2
sub r1, [r2+r3]
cmp [r1+r2*4], r3
mvn r1, [r2+123ABCh]    
Disassembly:
Code:
00000000 E1A01002 mov r1, r2
00000004 E3A01102 mov r1, 80000000h
00000008 E5921000 ldr r1, [r2]
0000000C E5812000 str r2, [r1]
00000010 E7921003 ldr r1, [r2, r3]
00000014 E7813102 str r3, [r1, r2, lsl 2]
00000018 E3A0C0BC mov r12, 0BCh
0000001C E38CCC3A orr r12, r12, 3A00h
00000020 E38CC812 orr r12, r12, 120000h
00000024 E792100C ldr r1, [r2, r12]
00000028 E0911002 adds r1, r1, r2
0000002C E2911102 adds r1, r1, 80000000h
00000030 E592B000 ldr r11, [r2]
00000034 E091100B adds r1, r1, r11
00000038 E591A000 ldr r10, [r1]
0000003C E09AA002 adds r10, r10, r2
00000040 E581A000 str r10, [r1]
00000044 E792B003 ldr r11, [r2, r3]
00000048 E091100B adds r1, r1, r11
0000004C E791A102 ldr r10, [r1, r2, lsl 2]
00000050 E09AA003 adds r10, r10, r3
00000054 E781A102 str r10, [r1, r2, lsl 2]
00000058 E3A0C0BC mov r12, 0BCh
0000005C E38CCC3A orr r12, r12, 3A00h
00000060 E38CC812 orr r12, r12, 120000h
00000064 E792B00C ldr r11, [r2, r12]
00000068 E091100B adds r1, r1, r11
0000006C E0511002 subs r1, r1, r2
00000070 E3811102 orr r1, r1, 80000000h
00000074 E592B000 ldr r11, [r2]
00000078 E011100B ands r1, r1, r11
0000007C E591A000 ldr r10, [r1]
00000080 E1DAA002 bics r10, r10, r2
00000084 E581A000 str r10, [r1]
00000088 E792B003 ldr r11, [r2, r3]
0000008C E051100B subs r1, r1, r11
00000090 E791A102 ldr r10, [r1, r2, lsl 2]
00000094 E15A0003 cmp r10, r3
00000098 E781A102 str r10, [r1, r2, lsl 2]
0000009C E3A0C0BC mov r12, 0BCh
000000A0 E38CCC3A orr r12, r12, 3A00h
000000A4 E38CC812 orr r12, r12, 120000h
000000A8 E792B00C ldr r11, [r2, r12]
000000AC E1E0100B mvn r1, r11    
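The immediates in this listing can be checked against the instruction words. The Python sketch below decodes the 12-bit immediate field of a data-processing instruction (a 4-bit rotation and an 8-bit value). The function names are illustrative, and only words with the immediate bit (bit 25) set are handled.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def decode_immediate(word):
    """Decode operand 2 of an ARM data-processing instruction word."""
    if not word & (1 << 25):
        raise ValueError("operand 2 is a register, not an immediate")
    rot = (word >> 8) & 0xF   # rotation field; the shift is 2*rot bits
    imm8 = word & 0xFF
    return ror32(imm8, 2 * rot)


if __name__ == "__main__":
    for word in (0xE3A01102, 0xE3A0C0BC, 0xE38CCC3A, 0xE38CC812):
        print(hex(word), hex(decode_immediate(word)))
```

The three words at 18h..20h decode to 0BCh, 3A00h and 120000h, which OR together to 123ABCh, the displacement used in mov r1, [r2+123ABCh].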
Low-level get/set helper macros. Usage:
Code:
@get r1, r2           ; r=r/[m]/i
@get r1, 10000h
@get r1, [r2]
@get r1, [20000h]
@get r1, [r2+30000h]
@get r1, [30000h+r2]
@get r1, [r2+r3*4]

@set [r1], r2         ; [m]=r
@set [20000h], r1
@set [r1+30000h], r2
@set [40000h+r1], r2
@set [r1+r2*4], r3    
Code:
; registers for calculations

@r fix r12  ; spare
@sr fix r11 ; source
@dr fix r10 ; destination

is.i? fix eqtype 0

is.r? fix in <r0,r1,r2,r3,r4,r5,r6,r7,\
 r8,r9,r10,r11,r12,r13,r14,r15>

macro @get r, x {
 define ?s 0
 match \
  [a], x \{
  match \
   [b+i], x \\{
   match \
    I*S, i \\\{              ; [a+b*c]
    if S eq 4
     ldr r, [b, I, lsl 2]
    else
     'Error'
    end if
    define ?s 1
   \\\}
   if ?s eq 0                ; [a+b]
    if b is.r? \             ; [r+r]
     & i is.r?
     ldr r, [b, i]
    else if b is.r? \        ; [r+i]
     & i is.i?
     ldr @r, =i
     ldr r, [b, @r]
    else if i is.r? \        ; [i+r]
     & b is.i?
     ldr @r, =b
     ldr r, [@r, i]
    else
     'Error'
    end if
   end if
   define ?s 1
  \\}
  if ?s eq 0
   if a is.r?                ; [r]
    ldr r, [a]
   else                      ; [m]
    ldr @r, =a
    ldr r, [@r]
   end if
  end if
  define ?s 1
 \}
 if ?s eq 0
  if x is.r?                 ; r
   mov r, x
  else                       ; i
   ldr r, =x
  end if
 end if
}

macro @set x, r {
 define ?s 0
 match \
  [a], x \{
  match \
   [b+i], x \\{
   match \
    I*S, i \\\{              ; [a+b*c]
    if S eq 4
     str r, [b, I, lsl 2]
    else
     'Error'
    end if
    define ?s 1
   \\\}
   if ?s eq 0                ; [a+b]
    if b is.r? \             ; [r+r]
     & i is.r?
     str r, [b, i]
    else if b is.r? \        ; [r+i]
     & i is.i?
     ldr @r, =i
     str r, [b, @r]
    else if i is.r? \        ; [i+r]
     & b is.i?
     ldr @r, =b
     str r, [@r, i]
    else
     'Error'
    end if
   end if
   define ?s 1
  \\}
  if ?s eq 0
   if a is.r?                ; [r]
    str r, [a]
   else                      ; [m]
    match [i j], x
     \\{ 'Error' \\}
    ldr @r, =a
    str r, [@r]
   end if
  end if
  define ?s 1
 \}
 if ?s eq 0
  'Error'
 end if
}    
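The nested match blocks in @get and @set amount to an operand classifier. Here is a loose Python sketch of the same decision tree. It is an illustration only: classify and its return strings are invented names, and anything that is not r0..r15 is simply treated as an immediate.

```python
REGISTERS = {"r%d" % n for n in range(16)}


def classify(operand):
    """Name the addressing form, following the order of tests in @get."""
    operand = operand.replace(" ", "")
    if not (operand.startswith("[") and operand.endswith("]")):
        return "r" if operand in REGISTERS else "i"
    inner = operand[1:-1]
    if "+" not in inner:
        return "[r]" if inner in REGISTERS else "[m]"
    base, index = inner.split("+", 1)
    if "*" in index:
        reg, scale = index.split("*", 1)
        if base in REGISTERS and reg in REGISTERS and scale == "4":
            return "[r+r*4]"
        raise ValueError("only a scale of 4 is handled")
    if base in REGISTERS and index in REGISTERS:
        return "[r+r]"
    if base in REGISTERS:
        return "[r+i]"
    if index in REGISTERS:
        return "[i+r]"
    raise ValueError("unsupported operand: " + operand)


if __name__ == "__main__":
    for op in ("r2", "10000h", "[r2]", "[20000h]",
               "[r2+30000h]", "[30000h+r2]", "[r2+r3*4]"):
        print(op, "->", classify(op))
```

The forms with an immediate part ([m], [r+i], [i+r]) are the ones that need the spare register r12, since the constant has to be materialised with ldr @r, =i first.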
Upgrade mov to support r/[r]/[m]/i:
Code:
macro mov [p] {
 common
  define ?s 0
  match r=,[m], p \{
   if m is.r?
    ldr r, [m]
   else
    @get r, [m]
   end if
   define ?s 1
  \}
  match [m]=,r, p \{
   if m is.r?
    str r, [m]
   else
    @set [m], r
   end if
   define ?s 1
  \}
  if ?s eq 0
   mov p
  end if
}    
Upgrade the data processing instructions (add, sub, cmp, and, tst, bic, etc.) to support r/[r]/[m]/i. Note that the arithmetic and logical wrappers map to the flag-setting forms (adds, subs, ands, ...):
Code:
macro @dp name, [p] {
 common
  define ?s 0
  match r=,[m], p \{
   mov @sr, [m]
   name r, @sr
   define ?s 1
  \}
  match [m]=,r, p \{
   mov @dr, [m]
   name @dr, r
   mov [m], @dr
   define ?s 1
  \}
  if ?s eq 0
   name p
  end if
}

macro and [p] { common @dp ands, p }
macro eor [p] { common @dp eors, p }
macro sub [p] { common @dp subs, p }
macro rsb [p] { common @dp rsbs, p }
macro add [p] { common @dp adds, p }
macro adc [p] { common @dp adcs, p }
macro sbc [p] { common @dp sbcs, p }
macro rsc [p] { common @dp rscs, p }
macro tst [p] { common @dp tst, p }
macro teq [p] { common @dp teq, p }
macro cmp [p] { common @dp cmp, p }
macro cmn [p] { common @dp cmn, p }
macro bic [p] { common @dp bics, p }
macro mvn [p] { common @dp mvn, p }    
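The expansion done by @dp can be sketched in a few lines of Python. This is a simplified model and not the macro itself: expand_dp is a hypothetical name, the scratch registers are hard-coded to match @sr (r11) and @dr (r10), and the emitted mov lines stand for the upgraded mov macro, which in turn becomes ldr/str.

```python
SOURCE_SCRATCH = "r11"  # @sr in the macros above
DEST_SCRATCH = "r10"    # @dr in the macros above


def is_memory(operand):
    return operand.strip().startswith("[")


def expand_dp(name, dst, src):
    """Return the macro-level lines that 'name dst, src' would expand to."""
    if not is_memory(dst) and is_memory(src):
        # r, [m]: load the source, then operate register to register
        return ["mov %s, %s" % (SOURCE_SCRATCH, src),
                "%s %s, %s" % (name, dst, SOURCE_SCRATCH)]
    if is_memory(dst) and not is_memory(src):
        # [m], r: load, operate, store back
        return ["mov %s, %s" % (DEST_SCRATCH, dst),
                "%s %s, %s" % (name, DEST_SCRATCH, src),
                "mov %s, %s" % (dst, DEST_SCRATCH)]
    return ["%s %s, %s" % (name, dst, src)]


if __name__ == "__main__":
    for line in expand_dp("adds", "[r1]", "r2"):
        print(line)
    for line in expand_dp("cmp", "[r1+r2*4]", "r3"):
        print(line)
```

As the disassembly earlier in this post shows (94h..98h), the store back to memory is emitted for cmp as well, even though a compare does not change its operand. Skipping the final store for cmp/cmn/tst/teq would be a possible refinement.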
Download updated code+examples