flat assembler
Message board for the users of flat assembler.

MBR: relocation minimization
bitRAKE



Joined: 21 Jul 2003
Posts: 3055
Location: vpcmipstrm
Seems like a valid target for optimization: How small can the relocation code of an MBR be? The only thing which can be assumed is that the BIOS has loaded a sector of 512 bytes at $7C00. Microsoft's MBR uses only 8086 instructions and requires 27 bytes to copy to $0:600. My current best is 24 bytes, but requires a word of stack space:
Code:
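; another byte for 8086 compatibility: xor ax,ax / mov ss,ax
        push 0                          ; 286+
        pop ss
        mov sp,$0600
        sti
        cld
        mov cx,(MBR - Relocation)/2
        push sp                         ; $0600 (286+ pushes the value before the decrement)
        push ss                         ; 0
        mov si,Relocation
        pop es                          ; ES = 0
        pop di                          ; DI = $0600, the copy destination
        push ss
        push di                         ; far return address 0:$0600
        rep es movsw                    ; copy ES:SI -> ES:DI
        retf                            ; continue at 0:$0600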
...additionally, much less code is copied and SI points to the first byte of MBR after relocation (helping to reduce MBR processing code - for a later post). Microsoft puts the stack pointer at $7C00 (no change in code size to do this above) and where the code fragment is relocated to is not really important - it just can't be at $7C00. So, I think there are some tricky solutions left to further reduce the size.

The boot sector can't really assume CS=0, so it might be advantageous to use the 512 bytes before $7C00? If it's small enough a short jump would be sufficient rather than doing the far return.

_________________
¯\(°_o)/¯ unlicense.org
Post 05 Mar 2009, 17:04
Tomasz Grysztar
Joined: 16 Jun 2003
Posts: 7802
Location: Kraków, Poland
Here's my try. It's 25 bytes (but fully 8086 compatible, so in this respect equivalent to yours), doesn't use the stack at all, sets all segment registers including DS to 0, and has SI pointing to the first byte after relocation as well.
Code:
org 7C00h
        xor cx,cx
        mov ds,cx
        mov ss,cx
        mov si,Relocation
        les di,[si-4]
        mov sp,di
        sti
        cld
        mov cl,(MBR-Relocation)/2 ; will always fit in CL, of course :) CH is already 0
        rep movsw
        jmp 0:600h
Relocation:    

Also, you can change the target address simply by changing the jump code. :)
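To spell out why `les di,[si-4]` works in the listing above: the four bytes just before `Relocation` are the operand of the far jump (offset 0600h, then segment 0), so a single instruction loads both the copy destination and ES from bytes that were needed anyway. The following is only an annotated sketch of that listing. The placeholder payload, the position of `MBR` (taken here to mark the end of the copied region), and the padding are assumptions, not part of the original post.

```asm
        org 7C00h

        xor  cx,cx                  ; 31 C9      CX = 0
        mov  ds,cx                  ; 8E D9      DS = 0
        mov  ss,cx                  ; 8E D1      SS = 0
        mov  si,Relocation          ; BE 19 7C   source = first byte after this stub
        les  di,[si-4]              ; C4 7C FC   reads 00 06 00 00: DI = 0600h, ES = 0
        mov  sp,di                  ; 89 FC      stack grows down from 0:0600h
        sti                         ; FB
        cld                         ; FC
        mov  cl,(MBR-Relocation)/2  ; B1 02      word count, CH is already 0
        rep  movsw                  ; F3 A5      copy DS:SI -> ES:DI
        jmp  0:600h                 ; EA 00 06 00 00  operand doubles as the LES data
Relocation:                         ; 7C19h = 7C00h + 25 bytes of stub
        nop                         ; hypothetical payload, 4 bytes so the word count is exact
.halt:  hlt
        jmp  .halt
MBR:                                ; assumed: end of the region to copy
        times 510-($-$$) db 0       ; pad up to the boot signature
        dw   0AA55h
```

Changing the operand of the final `jmp` changes DI, SP and the landing address together, which is the point made in the post.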
Post 05 Mar 2009, 18:02
revolution
When all else fails, read the source
Joined: 24 Aug 2004
Posts: 17716
Location: In your JS exploiting you and your system
bitRAKE wrote:
Microsoft puts the stack pointer at $7C00
I don't think MS write the BIOS. At least I hope they don't!
Post 05 Mar 2009, 23:58
bitRAKE
revolution wrote:
bitRAKE wrote:
Microsoft puts the stack pointer at $7C00
I don't think MS write the BIOS. At least I hope they don't!
I was referring to the MBR which MS installs on system drives. It is covered in more detail through the link above.

Post 06 Mar 2009, 01:53
bitRAKE
Tomasz Grysztar wrote:
Also, you can change the target address simply by changing the jump code. :)
Well done. I loaded the stack using LSS in one version, which was rejected for not being 8086 compatible, and failed to see the LES opportunity. A byte can no longer be saved by leaving DS unset and using the ES override on MOVSW, because LES needs DS set or an override. However, the MOV to SP should directly follow the change to SS to ensure the pair is atomic: interrupts are (always?) disabled, but an NMI could occur? Not using the stack at all allows these instructions to be moved around quite freely.

Today at work I was thinking about how to organize the code for FASM to dynamically select the target address, so the code would flow into $7C00 after loading a boot sector - eliminating the need for a branch.
Code:
        cld                     ; FC
        sti                     ; FB
        xor cx,cx               ; 31 C9
        mov di,$FFFF            ; BF FF FF
        mov ss,cx               ; 8E D1
        mov sp,di               ; 89 FC
        mov es,cx               ; 8E C1
        mov si,$FFFF            ; BE FF FF
        mov cl,$FF              ; B1 FF
        rep es movsw            ; F3 26 A5
        jmp $-126               ; EB 80
    
...leading to 23 bytes. Kind of foolish, because DS needs to be set to use SI to access the MBR later, and we're back at 25 with further restrictions. A byte could be saved with the following, though:
Code:
        xor cx,cx               ; 31 C9
        mov ds,cx               ; 8E D9
        mov si,Relocation-4     ; BE 14 7C
        les di,[si]             ; C4 3C
;       lss sp,[si]             ; 0F B2 24
        mov ss,cx               ; 8E D1
        mov sp,di               ; 89 FC
        sti                     ; FB
        cld                     ; FC
        mov cl,(MBR-(Relocation-4))/2   ; B1 D5
        rep movsw               ; F3 A5
        jmp 0:600h              ; EA 00 06 00 00
...only 23 bytes using the LSS instruction. Because the copy now starts at Relocation-4, the four bytes of the target address are the first thing executed after the jump, so they just need to decode as an instruction sequence that passes harmlessly over those four bytes, but we could probably get more out of it. Something like F8 05 00 00 works because it disassembles to CLC/ADD AX,0. My current code benefits from 74 07 00 00, which jumps into the middle of processing the MBR.
Code:
.found:
        mov cx,si
.skip:
        add si,16
.entry:
        shl byte [si],1
        jne .done
        jnc .skip

; found an active entry, save only if CX zero and ensure others are valid

        jcxz .found

; only one entry can be active

.mbr_invalid:                   ; error code one
        mov cx,1
.none_active:                   ; error code zero
        jmp .error
.done:
        jcxz .none_active
        ; proper exit occurs at end of partition table
        ; otherwise an invalid entry caused an early exit
        cmp si,$1FE
        jne .mbr_invalid
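The dual-use jump operand described above (F8 05 00 00 serving as the far-jump target and, once copied, decoding as CLC / ADD AX,0) can be sketched as a complete sector. This is an illustration under assumptions: the `Target` label, the explicit `db 0EAh` encoding of the jump, the halt-loop payload, and the position of `MBR` as the end of the copied region are all hypothetical and chosen only to make the byte layout visible.

```asm
        org 7C00h

        xor  cx,cx                  ; 31 C9
        mov  ds,cx                  ; 8E D9
        mov  si,Target              ; BE 14 7C
        les  di,[si]                ; C4 3C      DI = 05F8h, ES = 0
        mov  ss,cx                  ; 8E D1
        mov  sp,di                  ; 89 FC      SP follows SS directly
        sti                         ; FB
        cld                         ; FC
        mov  cl,(MBR-Target)/2      ; B1 04
        rep  movsw                  ; F3 A5      copies from Target to 0:05F8h
        db   0EAh                   ; far jump opcode, operand follows
Target: dw   05F8h,0                ; F8 05 00 00: the jump target, and after the
                                    ; copy also CLC / ADD AX,0 at 0:05F8h
Relocation:                         ; runs at 0:05FCh after the copy
        nop                         ; hypothetical payload
.halt:  hlt
        jmp  .halt
MBR:                                ; assumed: end of the region to copy
        times 510-($-$$) db 0
        dw   0AA55h
```

Execution lands on the copied operand bytes, runs the two harmless instructions, and falls through into the relocated code, so no separate skip is needed.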

Last edited by bitRAKE on 06 Mar 2009, 04:16; edited 1 time in total
Post 06 Mar 2009, 03:48
sinsi
Joined: 10 Aug 2007
Posts: 713
Location: Adelaide
For the smallest relocation code, why bother setting up a stack? Just ensure interrupts are off with cli and save yourself 4 bytes.
Post 06 Mar 2009, 04:02
bitRAKE
Maybe I've assumed in error, but my guide has been the other MBRs and my experience with register values from the BIOS. Two machines have put the stack pointer at $400, which is not good when the INT 13h handlers use a lot of stack space. Some INT 13h implementations require interrupts to be enabled, iirc. So the question becomes where it is best (in terms of size) to set these registers/flags. True, it isn't required for relocation.
Post 06 Mar 2009, 04:22
Tomasz Grysztar
bitRAKE wrote:
Something like F8 05 00 00 works because it disassembles to CLC/ADD AX,0.

So why not FB 05 00 00, and you can get rid of STI in relocation code. :)
Post 06 Mar 2009, 04:24
bitRAKE
Tomasz Grysztar wrote:
bitRAKE wrote:
Something like F8 05 00 00 works because it disassembles to CLC/ADD AX,0.

So why not FB 05 00 00, and you can get rid of STI in relocation code. :)
...and give the stack an odd address?

sinsi, if we look at it from that perspective, there isn't a need to relocate until much later in the process, and we could copy fewer bytes still. Only once we are certain a valid partition has been found would the sector loader need to be moved somewhere (onto the stack, even). Certainly an avenue to try optimizing.

Post 06 Mar 2009, 04:33
Tomasz Grysztar
bitRAKE wrote:
Tomasz Grysztar wrote:
bitRAKE wrote:
Something like F8 05 00 00 works because it disassembles to CLC/ADD AX,0.

So why not FB 05 00 00, and you can get rid of STI in relocation code. :)
...and give the stack an odd address?

Oh, right, I don't know how I thought that B is an even digit. :D
But if we decided not to set up the stack at all, then it might work. :)

On second thought... does an odd stack address cause any problems in real mode?
Post 06 Mar 2009, 04:40
bitRAKE
Tomasz Grysztar wrote:
On second thought... does an odd stack address cause any problems in real mode?
Hm...seems like it could be a problem (in a very rare and obscure kind of way):
Quote:
In the real-address mode, if the ESP or SP register is 1 when the PUSH instruction is executed, an #SS exception is generated but not delivered (the stack error reported prevents #SS delivery). Next, the processor generates a #DF exception and enters a shutdown state as described in the #DF discussion in Chapter 5 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.

Last edited by bitRAKE on 06 Mar 2009, 05:09; edited 1 time in total
Post 06 Mar 2009, 04:54
Tomasz Grysztar
When your SP goes down to 1, then you're already in lots of trouble, since you're overwriting IDT. ;)

EDIT: revolution, we think in parallel. :D


Last edited by Tomasz Grysztar on 06 Mar 2009, 04:58; edited 3 times in total
Post 06 Mar 2009, 04:56
revolution
But who sets SP to 1? Even coming to 1 from a higher value is going to overwrite your INT table. Somehow I don't think it will be such a big problem.

[edit]Tomasz beat me to it![/edit]
Post 06 Mar 2009, 04:57
Tomasz Grysztar
Or you can jump to 0:0BCFBh and have stack evened this way. :)
Post 06 Mar 2009, 05:02
sinsi
bitRAKE: From the title and your first lot of code the very first thing was relocation. Changing the goalposts? :)
I think that relocation first up is the thing to do, because if you don't find an active partition you basically stop - "Missing operating system".

Odd stack addresses happen all the time - e.g. if a local is a buffer 11 bytes long, although I imagine compilers would even it out.
Post 06 Mar 2009, 05:03
bitRAKE
Well, I did mention in the first post that additional code was coming... not that any defense is needed: it's clearly about MBR size optimization, and that hasn't changed. All the posts have been very productive in that direction.
Post 06 Mar 2009, 05:34
revolution
Tomasz Grysztar wrote:
Or you can jump to 0:0BCFBh and have stack evened this way. :)
If you do that then you can eliminate these three bytes from the relocation code:
Code:
        mov sp,di               ; 89 FC
        sti                     ; FB
And you can also move the SS assignment to be executed after the four bytes FB BC 00 00:
Code:
        mov ss,sp
Post 06 Mar 2009, 05:40
bitRAKE
Would interrupts be a problem? A real problem - not the unaligned stack kind of problem. :)
Post 06 Mar 2009, 05:43
revolution
What sort of problem are you thinking of?
Post 06 Mar 2009, 05:48
bitRAKE
sti
mov sp,0

; could something be written to SS:SP here?
; at this point we don't know what SS is.

mov ss,sp
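For comparison with the sequence above: x86 processors hold off interrupts for one instruction after a load of SS, and STI takes effect only after the instruction that follows it, so the conventional way to avoid a push landing at an unknown SS:SP is to load SS first and SP immediately after, then enable interrupts. A minimal sketch under those assumptions follows. The stack address 7C00h, the `hang` label and the halt loop are illustrative and not taken from the thread, and some very early 8088 parts are reported to lack the SS protection.

```asm
        org 7C00h

        xor  ax,ax
        mov  ss,ax                  ; interrupts are held off until the next instruction completes
        mov  sp,7C00h               ; so SS:SP changes as a pair
        sti                         ; the stack is known before interrupts are enabled
hang:   hlt
        jmp  hang

        times 510-($-$$) db 0       ; pad up to the boot signature
        dw   0AA55h
```

This ordering costs more bytes than the FB BC 00 00 trick, which is the trade-off the question is probing.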
Post 06 Mar 2009, 05:53
Copyright © 1999-2020, Tomasz Grysztar.