flat assembler
Message board for the users of flat assembler.

Z80 jumps auto optimization




marste 05 Jan 2016, 14:46
I would like to discuss something specific to the ZX81 port, but the macros and suggestions may also apply to other platforms and ports.

The Z80 has two types of jumps:
1) JP - absolute - always possible
2) JR - relative - possible only when the displacement fits in the range -128..+127 and only with a limited set of conditions (NZ, Z, NC, C), but one byte shorter and in some cases even faster
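
To make the range rule concrete, here is a small Python model (not fasm code) of the check an assembler has to make for JR. It assumes the documented Z80 behaviour that the displacement byte is added to the address of the instruction following the 2-byte JR. The names `jr_displacement` and `can_use_jr` are made up for this illustration.

```python
from typing import Optional

# Condition codes accepted by each jump form on the Z80.
JR_CONDITIONS = {"nz", "z", "nc", "c"}
JP_CONDITIONS = JR_CONDITIONS | {"po", "pe", "p", "m"}

def jr_displacement(instr_addr: int, target: int) -> Optional[int]:
    """Signed displacement for a JR located at instr_addr, or None if out of reach."""
    disp = target - (instr_addr + 2)  # counted from the end of the 2-byte JR
    return disp if -128 <= disp <= 127 else None

def can_use_jr(instr_addr: int, target: int, cond: Optional[str] = None) -> bool:
    """True when JR can replace JP for this jump (hypothetical helper)."""
    if cond is not None and cond.lower() not in JR_CONDITIONS:
        return False
    return jr_displacement(instr_addr, target) is not None
```

Measured from the address of the JR itself, the reachable targets are therefore between 126 bytes before and 129 bytes after it.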

The point is that even if you want to use JR whenever possible, it is up to the programmer to check every time that the jump is within the allowed range (and if you have a lot of jumps and move pieces of code around, it is difficult to keep verifying and updating the best combination).

So I was wondering if we could build a macro (let's call it "jrp") that checks the address and opts for jp only if the address is out of range.

If the programmer wants this behaviour, the syntax should be something like "JRP address", or "JRP c,address" for conditionals (c is the condition code identifying the flag to check), and the result would be "JP address"/"JR address" or, analogously, "JP c,address"/"JR c,address".

Is it possible?



Tomasz Grysztar 05 Jan 2016, 17:43
One of the foundations of fasm's architecture is its code resolving ability, which from my point of view is so important for an assembler that I even based my own definition of what an assembler is on this feature. In simple terms, code resolving allows you to forward-reference various values (like label addresses) and thus create interdependent definitions, like an equation - and the assembler must find a solution.

Because fasm's code resolving applies universally to all kinds of constructions, a macro that utilizes it to optimize the instruction sizes may be as simple as this:
Code:
; fasm 1 syntax
macro jrp address {
        ; JR's displacement is counted from the end of the 2-byte instruction ($+2)
        if address-($+2) <= 127 & address-($+2) >= -128
                jr address
        else
                jp address
        end if
}
The label that you jump to may occur later in the source, and its value may thus depend on the size of the instruction chosen by this macro - this is where fasm's code resolving comes in. When fasm generates output without signalling an error, it guarantees that it has found a correct solution and that all the dependencies (like the "if" condition in the above macro) are fulfilled. Sometimes it may fail to find a solution and signal the "code cannot be generated" error - this can happen even if a solution exists, as in the case of the "oscillator problem". If that becomes a problem, you may try to employ some of the techniques for dealing with oscillation (see the linked thread), but with simple jump length optimization this almost never happens (as demonstrated by fasm's implementation of x86 jumps).
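
The idea of re-running passes until every size choice agrees with the addresses it produces can be sketched in Python. This is a deliberately simplified model, not fasm's actual algorithm: the tuple-based program format, the `resolve` function and the pass limit are all assumptions made for the illustration, and only the JRP choice (2-byte JR versus 3-byte JP) is modelled.

```python
from typing import Dict, List, Tuple

# ("label", name) | ("db", byte_count) | ("jrp", target_label)
Item = Tuple[str, object]

def resolve(program: List[Item], origin: int = 0, max_passes: int = 100) -> Dict[int, int]:
    """Return {item_index: size_in_bytes} for each jrp once the sizes are self-consistent."""
    # Optimistic start: assume every jrp fits in a 2-byte JR.
    sizes = {i: 2 for i, (kind, _) in enumerate(program) if kind == "jrp"}
    for _ in range(max_passes):
        # Lay out the program using the sizes predicted by the previous pass.
        addr, labels, jump_addr = origin, {}, {}
        for i, (kind, arg) in enumerate(program):
            if kind == "label":
                labels[arg] = addr
            elif kind == "db":
                addr += arg
            else:
                jump_addr[i] = addr
                addr += sizes[i]
        # Re-evaluate each choice against the addresses it produced.
        new_sizes = {}
        for i, at in jump_addr.items():
            disp = labels[program[i][1]] - (at + 2)
            new_sizes[i] = 2 if -128 <= disp <= 127 else 3
        if new_sizes == sizes:
            return sizes          # every "if" agrees with the layout
        sizes = new_sizes
    raise RuntimeError("code cannot be generated")

program = [
    ("jrp", "x"),      # item 0: just in reach of x while item 2 is short
    ("db", 125),
    ("jrp", "far"),    # item 2: must become a 3-byte JP
    ("label", "x"),
    ("db", 200),
    ("label", "far"),
]
result = resolve(program)
```

In this sample, growing the second jump pushes label `x` one byte further away, which in turn forces the first jump to become a JP as well; the model settles in its third pass.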

If you would like to see a real implementation of instruction encoding optimization through macros, please check out the new fasm g package. It contains an implementation of the x86 instructions in the form of macros, and they optimize opcodes (not only jumps, but also other instructions that have multiple encodings) the same way the native implementation in fasm 1 does. Also, the macro sets that implement the 8051 and AVR architectures contain JMP implementations that optimize jump lengths, because I wanted to demonstrate this basic ability of fasm wherever possible. AFAIK there is not yet an implementation of the Z80 instruction set for fasm g - but I hope we may get one in the future. And it should have opcode length optimization, of course. ;)



marste 05 Jan 2016, 19:38
GREAT STUFF Tomasz!!! :)
(referring to fasmg)


Last edited by marste on 24 Jul 2017, 17:15; edited 1 time in total



shutdownall 05 Jan 2016, 23:44
Thanks for this macro, Tomasz.
In fact I am not a fan of these automatic jumps; I would use short jumps only in a short context and absolute jumps (JP) more generally. That costs one byte more per jump but executes up to 20% faster. But maybe this is only my philosophy. In my programs I mostly use JR (short jump) together with anonymous labels, which is a really nice feature of flat assembler. :)



marste 09 Jan 2016, 01:21
As you know, I choose JR mostly to save space, but as far as I know it is also 30% faster when a conditional jump is not taken (7 T-states vs 10).
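
The percentages quoted in this thread follow from the documented Z80 timings. The table below is a small Python sketch using the T-state counts given in the Zilog Z80 CPU manual; the dictionary layout and the helper name are just for illustration, and machine-specific effects (such as ZX81 display generation) are ignored.

```python
# mnemonic: (size in bytes, T-states when taken, T-states when not taken)
JUMP_COST = {
    "jp nn":    (3, 10, None),  # unconditional, always taken
    "jp cc,nn": (3, 10, 10),
    "jr e":     (2, 12, None),  # unconditional, always taken
    "jr cc,e":  (2, 12, 7),
}

def percent_change(reference: int, other: int) -> float:
    """How much larger (+) or smaller (-) `other` is than `reference`, in percent."""
    return 100.0 * (other - reference) / reference

# A taken JR needs 20% more T-states than JP (12 vs 10) ...
taken = percent_change(JUMP_COST["jp cc,nn"][1], JUMP_COST["jr cc,e"][1])
# ... while a conditional JR that falls through needs 30% fewer (7 vs 10).
not_taken = percent_change(JUMP_COST["jp cc,nn"][2], JUMP_COST["jr cc,e"][2])
```

So which form is faster depends on how often the branch is taken, while JR is one byte shorter in every case.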



shoorick 09 Jan 2016, 06:53
IMHO, this kind of optimization should be controlled. The difference is not only in distance, size and clocks.
I have not coded for the Z80 specifically, but I know relative jumps are important for portable code (independent of memory location). It is better to get an error message than to unexpectedly get a non-portable binary.
++++++++++++++++
A variable like "optimize" could be used for this kind of control.
If it is not defined or is set to zero, there should be no optimization at all; if it is set, optimization is used. There could be more complex kinds of optimization, etc. (until all silicon Z80s disappear :D )



Tomasz Grysztar 09 Jan 2016, 09:35
The macro that marste proposed (JRP) would not interfere with the standard JR and JP instructions; you'd only use it where you wanted it.
On the other hand, whether one should care about leaving the code position-independent when it has a fixed loading address is a subject of debate on this board.



marste 09 Jan 2016, 10:03
I adapted the macro a bit. This version also handles conditional jumps:
Code:
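; jr if possible, else jp
; e.g.:
;   jrp address
;   jrp nz,address
; Note: jr only accepts the conditions nz, z, nc and c.
macro jrp op1,op2 {
  if op2 eq
    ; unconditional: op1 is the address; displacement is counted from $+2
    if op1-($+2) <= 127 & op1-($+2) >= -128
        jr op1
    else
        jp op1
    end if
  else
    ; conditional: op1 is the condition, op2 is the address
    if op2-($+2) <= 127 & op2-($+2) >= -128
        jr op1,op2
    else
        jp op1,op2
    end if
  end if
}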
    

I used it in 5 places and it saved one more byte compared to my manual work! lol

@Tomasz
If it is not urgent and you can give me some guidelines, I can try to adapt fasm g to the Z80. It would be nice to be able to compile optimal Z80 code under Linux!



Tomasz Grysztar 09 Jan 2016, 11:01
marste wrote:
@Tomasz
If it is not urgent and you can give me some guidelines, I can try to adapt fasm g to the Z80. It would be nice to be able to compile optimal Z80 code under Linux!
It is all about writing macros implementing each instruction. You can look at my 8051, AVR and x86/x87 samples for inspiration, I tried to keep them relatively simple.

For the basic instructions that take no operands, the macro can be as simple as:
Code:
macro NOP?
        db 0
end macro    
(The "?" character marks the macro name as case-insensitive.)

When an instruction has a fixed set of simple syntaxes, you can handle them all with the MATCH directive:
Code:
macro EX? first,second
        match (=SP?), first
                match =HL?, second
                        db 0E3h
                else match =IX?, second
                        db 0DDh,0E3h
                else match =IY?, second
                        db 0FDh,0E3h
                else
                        err "invalid second operand"
                end match
        else match =AF?, first
                match =AF'?, second
                        db 08h
                else
                        err "invalid second operand"
                end match
        else match =DE?, first
                match =HL?, second
                        db 0EBh
                else
                        err "invalid second operand"
                end match
        else
                err "invalid operand"
        end match
end macro

EX (SP),HL
EX (SP),IX
EX AF,AF'
EX DE,HL    


When registers from a larger set are allowed, you can use the ELEMENT directive and then extract the register number from the element's metadata, as is done in the samples in the fasmg package. But, to be honest, unless you need to process arithmetic expressions containing register names (as I do for x86 addressing modes), you may find ELEMENT-based processing unnecessarily complex, and it may be simpler to use just an assisted MATCH syntax based on some special constructions, like:
Code:
A? equ [:111b:]
B? equ [:000b:]
C? equ [:001b:]
D? equ [:010b:]
E? equ [:011b:]
H? equ [:100b:]
L? equ [:101b:]

macro INC? operand
        match [:r:], operand
                db 100b + r shl 3
        else match (=HL?), operand
                db 34h
        else match (=IX?+d), operand
                db 0DDh,34h,d
        else match (=IY?+d), operand
                db 0FDh,34h,d
        else
                err "invalid operand"
        end match
end macro

INC A
INC B
INC (HL)
INC (IX+2)    
This also shows a case where you might consider using ELEMENT: if you wanted to allow not only the exact (IX+d) syntax, but also variations like (d+IX) or (a+IX+b), you would need to define IX as an ELEMENT:
Code:
A? equ [:111b:]
B? equ [:000b:]
C? equ [:001b:]
D? equ [:010b:]
E? equ [:011b:]
H? equ [:100b:]
L? equ [:101b:]

element HL?
element IX?
element IY?

macro INC? operand
        match [:r:], operand
                db 100b + r shl 3
        else match (a), operand
                if a relativeto HL & a = HL
                        db 34h
                else if a relativeto IX
                        db 0DDh,34h,a-IX
                else if a relativeto IY
                        db 0FDh,34h,a-IY
                else
                        err "invalid operand"
                end if
        else
                err "invalid operand"
        end match
end macro

INC (3*8+IX+1)

x = IX+3
INC (x)    
When the text of a parameter is interpreted as an arithmetic expression, like in the above macro, you may also consider creating a "proxy" value to make the macro more bulletproof; I explained this in the other thread.
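
As a cross-check of the bytes the INC macros above should emit, here is a small Python model of the same encoding. It is only an illustration: `encode_inc`, its string parsing and the displacement range check are assumptions of this sketch and are not part of the fasm g macros. Note that the macro's `100b + r shl 3` is meant as 100b + (r shl 3), since fasm gives `shl` a higher priority than `+`; in Python `<<` binds more loosely than `+`, so the shift needs parentheses here.

```python
REG = {"a": 0b111, "b": 0b000, "c": 0b001, "d": 0b010,
       "e": 0b011, "h": 0b100, "l": 0b101}
INDEX_PREFIX = {"ix": 0xDD, "iy": 0xFD}

def encode_inc(operand: str) -> bytes:
    """Encode INC r, INC (HL), INC (IX+d) or INC (IY+d)."""
    op = operand.replace(" ", "").lower()
    if op in REG:
        return bytes([0b100 + (REG[op] << 3)])    # opcode layout 00 rrr 100
    if op == "(hl)":
        return bytes([0x34])
    if op.startswith("(") and op.endswith(")") and op[1:3] in INDEX_PREFIX:
        disp = int(op[3:-1] or "0", 0)             # "+2" -> 2, "-1" -> -1
        if not -128 <= disp <= 127:                # check added by this sketch
            raise ValueError("displacement out of range")
        return bytes([INDEX_PREFIX[op[1:3]], 0x34, disp & 0xFF])
    raise ValueError("invalid operand")
```

For example, `encode_inc("(IX+2)")` yields the three bytes DDh, 34h, 02h, matching the `db 0DDh,34h,d` line in the macro.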

With these samples in mind I think you should be able to implement a complete instruction set. Which syntax variants you allow in your macros is entirely up to you.

And when you have a complete instruction set ready, the next step would be to add some macros for generating the right output format. Please ask if you have any questions.



shoorick 09 Jan 2016, 15:51
Tomasz Grysztar wrote:
On the other hand, whether one should care about leaving the code position-independent when it has a fixed loading address is a subject of debate on this board.

he-he ;)
Exactly! Some kinds of applications on the Z80/i8080 were made position-independent (or self-relocating, or more correctly: self-updating for the current location) precisely so that they could be loaded and run at any location selected by the user. Those were mostly debuggers, compressors or other service tools, loaded into free space so as not to overlap the regular application loaded at a fixed address ;) The i8080 has no relative jumps, so tricks were used to let the application work out its current offset. 8)




shutdownall 09 Jan 2016, 16:07
If you want relocatable code, it is not an easy task when the operating system does not support it by default. Keep in mind that this affects not only jumps but also access to data. On the target in question (ZX81) there are even several registers (index registers) that are unusable because the system ROM uses them, which makes address-independent code even harder to write.

Anyway, this would conflict with the ZX81's restriction of only 1 kB for programs, including system variables and display content. The display file (if the full screen is used) takes up to 768 bytes, plus about 100 bytes for system variables - so relocatable code is not worth thinking about at all for these use cases with 1 kB of memory.

So in the end the macro squeezed out one more byte - I would have expected more. ;)

Yes - anybody might port fasmg to the Z80; this would be real fun, I think. The ZX-IDE (FASMW-ZX) supports not only Z80 code but also the BASIC structures of the ZX81 and ZX80 (I plan to do a direct port for the ZX Spectrum as well), and since I want to have one nice, all-in-one tool, I made this special version. It is an editor with some tools that can also compile, which is quite different from a tool chain, something I personally don't like at all.

We are in 2016 now, and people may try to force me to work as if we were in the 1980s - no thanks. That's why I don't use and don't like Linux, which is too much tinkering in my eyes. I now work under OS X and use FASMW-ZX with VirtualBox on my Apple Mac, together with other tools not available under OS X, like EPROM programmers, a logic analyzer, USB Blaster and so on.

So fasmg is quite nice, but for me personally it is not an option as a development tool. And I have put quite a lot of development into FASMW-ZX, so I am not really interested in starting again from zero. ;)



marste 09 Jan 2016, 19:44
shutdownall wrote:
So in the end the macro squeezed out one more byte - I would have expected more. ;)


It seems I was doing good work manually, too! :lol: ;)



shoorick 10 Jan 2016, 11:31
Well, then let me reduce my explanation to a shorter form: if I open a Z80 book and see that JP corresponds to opcode 0C3h, then I must always see 0C3h in the binary. Anything else, whether better, shorter or faster, must be optional and turned on explicitly, unless you write the macros/assembler for yourself only.



marste 10 Jan 2016, 12:29
Jrp is a completely new instruction (not an existing opcode) for programmers to use if they want (and need) it!...



shoorick 10 Jan 2016, 13:14
No, jrp is not an instruction, it is an optimizing macro, and I agree with this form, as it is an explicit extension.



marste 15 Feb 2016, 15:34
PS: the "little" project that kept me busy has been released: https://sourceforge.net/p/smmax/blog/2016/02/real-chess-for-the-zx81-1k/ ;)
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
