Hello,
after successfully solving an alignment question on my own, I came up with the following macros, which allow coding an instruction such as
@mov r0,0xE000ED88
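macro @a4nop {
  ; pad with one 16-bit nop when the current address is not word-aligned
  if $ and 2
    nop
  end if
}
macro @mov arg,argimm {
  local ..tmp
  @a4nop                ; the ldr must start on a word-aligned address
  ldr arg,[r15,#0]      ; loads from Align(PC,4)+0, the word right after the branch
  b ..tmp               ; skip over the inline literal
  dw argimm             ; the literal itself (4 bytes in the listing below)
  label ..tmp
}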
;--- example usage
.reset_h:
@mov r0,0xE000ED88
nop
nop
nop
B .reset_h
Output:
> arm disassemble 0x44 7
0x00000044 0x4800 LDR r0, [pc, #0] ; 0x00000048
0x00000046 0xe001 B 0x0000004c
0x00000048 0xed88e000 (32-bit Thumb2 ...)
0x0000004c 0xbf00 NOP
0x0000004e 0xbf00 NOP
0x00000050 0xbf00 NOP
0x00000052 0xe7f7 B 0x00000044
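Why the padding matters can be seen from the address arithmetic. In Thumb state a PC-relative LDR reads PC as the instruction address plus 4 and rounds it down to a multiple of 4 before adding the offset. In the listing above the LDR sits at 0x44, so it reads from Align(0x44+4, 4) + 0 = 0x48, which is the literal. The sketch below shows the other case, where the macro starts on an address that is not word-aligned. The label .demo is hypothetical and is assumed to fall on a word-aligned address.

```asm
; Sketch only: .demo is a made-up label, assumed to be word-aligned.
.demo:
  nop                   ; 2 bytes, so the macro below starts at .demo+2
  @mov r0,0xE000ED88    ; ($ and 2) <> 0 here, so @a4nop emits one nop
                        ; expected layout:
                        ;   .demo+2   nop               padding from @a4nop
                        ;   .demo+4   ldr r0,[r15,#0]   reads Align(.demo+4+4, 4) = .demo+8
                        ;   .demo+6   b ..tmp
                        ;   .demo+8   dw 0xE000ED88     literal, word-aligned
                        ;   .demo+12  ..tmp
```

Without the padding the LDR would sit at .demo+2 and read from Align(.demo+2+4, 4) = .demo+4, which would be the branch opcode and the low half of the literal instead of the literal itself.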
Now, I think there is a stall there, for one of two reasons that I cannot tell apart:
1) LDR is not listed on page 83 of the V7M manual.
2) Even though the Flash-AHB interface on the I-Code bus of my STM32F4 can read 64 cache lines of 128 bits, the D-bus has priority over it, because the literal is fetched from the executable section.
EDIT:
Reason 2) seems to be a consequence, but it is related. The stall is explained in chapter 3.3.2 of the Cortex-M4 manual.
LDR Rx,[PC,#imm] might add 1 cycle because of contention with the fetch unit.
Then add 2 cycles, because, as I read it, an LDR using PC is a blocking operation. The literal fetch should be the one using the D-bus, while the I-bus is already busy with a speculative read of the next 2 or 4 bytes of code in "normal" memory.
A simpler solution is to use MOVW+MOVT:
2 cycles using immediates, and it does not involve fetching a literal from memory.
0x00000044 0xf64e5088 MOVW r0, #60808 ; 0xed88
0x00000048 0xf2ce0000 MOVT r0, #57344 ; 0xe000
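The MOVW+MOVT pair can be wrapped in a macro in the same style as @mov. This is only a sketch: the name @mov32 is made up for illustration, and it assumes the assembler is in Thumb mode for a core that has MOVW/MOVT (such as the Cortex-M4) and accepts the immediates written without a leading #.

```asm
; Sketch only: @mov32 is a hypothetical name, not one of the macros above.
macro @mov32 reg,imm {
  movw reg,(imm) and 0xFFFF             ; low halfword, clears the upper 16 bits
  movt reg,((imm) shr 16) and 0xFFFF    ; high halfword, keeps the lower 16 bits
}

;--- example usage
.reset_h:
  @mov32 r0,0xE000ED88  ; movw r0,0xED88 followed by movt r0,0xE000
```

Both instructions are 32-bit encodings, so the pair takes 8 bytes, the same as LDR + B + literal without padding, and it needs no alignment nop.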
Any hints?
Cheers,
