flat assembler
Message board for the users of flat assembler.

minimal pm boot sector

edfed



hey guys, now, i propose this very light code as an example for my future book.

just have a look at it, and try to make it fit in less than 64 bytes of code and data. the current version reports:
Quote:

040h bytes used

Code:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;boot and protected mode switch ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
        org 7C00h               ; where the binary should be loaded in ram
boot:                           ; boot = 7c00h (org)
        cli                     ; disable interrupts
        lgdt fword[cs:gdt.size] ; load the gdtr from the 6-byte pseudo descriptor at [cs:gdt.size]
        mov eax,cr0             ; equivalent to "or cr0,1" (no such instruction exists)
        or al,1                 ;   set the PE bit (bit 0)
        mov cr0,eax             ;   protected mode enabled
        jmp gdt.code:flush      ; equivalent to "mov cs,gdt.code" + "mov ip,flush"
flush:                          ;   the first instruction right after pm enable
        use32                   ; code below is 32 bits
        mov ax,gdt.data         ;
        mov ds,ax               ; make ds = .data entry in gdt, flat linear address space
        mov word[0b8000h],7441h ; put a red char 'A' in upper left corner, on grey background, just to show it works
        hlt                     ; halt the processor until the next interrupt, so it consumes less energy
        jmp $                   ; infinite loop
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
align 8                         ; align on 8 byte boundary for optimal performance
gdt:                            ;
dw 0                            ; in order to align dword part of pseudo descriptor on dword boundary
.size   dw @f-gdt-1             ; word part of pseudo descriptor, size of gdt in bytes minus 1
.linear dd gdt                  ; dword part of pseudo descriptor, linear base address
.code=$-gdt                     ; first entry in gdt (8*1)
dw 0ffffh,0                     ;   4Gbytes, start at linear 0
db 0,10011010b,11001111b,0      ;   granularity = 64Kbytes, code segment, ring 0, read only,etc...
.data=$-gdt                     ; second entry in gdt (8*2)
dw 0ffffh,0                     ;   4Gbytes, start at linear 0
db 0,10010010b,11001111b,0      ;   granularity = 64Kbytes, data segment, ring 0, read/write,etc...
@@:                             ; used for gdt.size calculation
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
free =  510-(padding-$$)        ; define "free" bytes count
padding rb free                 ; reserve "free" bytes to make line below at offset 510
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
        dw 0aa55h               ; magic number boot mark, used by bios to test if valid boot sector
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
d1='0'+free shr 8 and 0fh       ;
d2='0'+free shr 4 and 0fh       ;
d3='0'+free and 0fh             ;
if d1>'9'                       ;
        d1=d1+7                 ;
end if                          ;
                                ;
if d2>'9'                       ;
        d2=d2+7                 ;
end if                          ;
                                ;
if d3>'9'                       ;
        d3=d3+7                 ;
end if                          ;
                                ;
display d1,d2,d3,'h '           ;
display 'free bytes',13,10      ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
d1='0'+(510-free)shr 8 and 0fh  ;
d2='0'+(510-free)shr 4 and 0fh  ;
d3='0'+(510-free)and 0fh        ;
if d1>'9'                       ;
        d1=d1+7                 ;
end if                          ;
                                ;
if d2>'9'                       ;
        d2=d2+7                 ;
end if                          ;
                                ;
if d3>'9'                       ;
        d3=d3+7                 ;
end if                          ;
                                ;
display d1,d2,d3,'h '           ;
display 'used bytes',13,10      ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;     
    


test it and enjoy this minimal boot pm switch.
i hope it will help beginners to understand what is done...
Post 14 May 2010, 21:21

cod3b453
Managed 60 bytes:

Code:
        use16
        org 0x7C00

        cli

        lgdt [gdtr]

        mov eax,cr0
        or al,1
        mov cr0,eax

        xor ax,ax

        jmp pword 0x0008:@f

    gdt:

        dq 0x00CF92000000FFFF
        dq 0x00CF9A000000FFFF

        use32

    @@:

        mov ds,ax
        mov word [0xB8000],0x7441

        hlt
        jmp $

    gdtr:

        dw 15
        dd gdt

        rb (0x7DFE - $)
        dw 0xAA55     
Post 14 May 2010, 22:19

edfed
:D

but it is unreadable! and apparently, you use the null descriptor, which leads to an exception if interrupts are enabled.
then, we can hope for a total of 54 bytes (without the alignment and the null descriptor), but this is not the goal!
Post 15 May 2010, 08:11

sinsi
"or al,1" = 2 bytes
"inc ax" = 1 byte
Post 15 May 2010, 08:42

cod3b453
Obviously I was focussing on the less than 64 bytes :P
Post 15 May 2010, 09:56

edfed
then, you win! :D

a boot:
[image]
Post 15 May 2010, 10:08

ManOfSteel
Ernest Shackleton's? Nice boots.
Post 15 May 2010, 11:21

Teehee
Hi guys.

Some questions:

1. How does the lgdt instruction work? Why do I need it to enter PM?
2. How does the hlt instruction work? When should I use it?
3. Why do I need to use the cli instruction?
4. What does the line jmp $ mean?

Thanks.

_________________
Sorry if bad english.
Post 17 May 2010, 21:59

Coty
jmp $ stops the system, almost like

again:
jmp again
... sort of like that...

cli disables interrupts so that your code does not get interrupted in the process. That is good for this code snippet, as we don't need interrupts here because we're just writing to the screen :)

hlt halts the CPU until (E)CX is empty

about lgdt read this: http://wiki.osdev.org/Babystep6

And why do you need to set PM? So that you can use 32-bit code and access more than 1MB of RAM :) (Even 16-bit p-mode can access 16MB of RAM)

_________________
http://codercat.org/
Post 17 May 2010, 22:17

sinsi
Quote:
hlt halts the CPU until (E)CX is empty

HLT halts the CPU until it receives an interrupt, nothing to do with CX.
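
To make the difference concrete, here is a minimal sketch of the two ways of parking the CPU that appear in this thread. The label names are made up for illustration. With interrupts masked by cli, only non-maskable events can end the hlt, so the jmp after it is just a safety net.

```asm
        use16
        cli                     ; IF=0: maskable interrupts no longer wake the CPU
sleep:  hlt                     ; stop here until an NMI, SMI, INIT or reset arrives
        jmp     sleep           ; if something did wake us, go back to sleep

busy:   jmp     $               ; same as "jmp busy": spins at full speed, never sleeps
```

With interrupts enabled, hlt would instead resume at the next instruction as soon as the interrupt handler returns, which is why it is normally placed inside a loop.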
Post 18 May 2010, 01:15

baldr
edfed,

There are some tricks to shave off several bytes:

1. lgdt can be 16-bit (i.e. GDT base is 24-bit and pseudo-descriptor is one byte shorter).
2. Blinking red on grey is 0xF4, hlt opcode.
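
The second trick can be spelled out as follows. This is only a sketch: the constant names are invented here, and it assumes the standard VGA text-mode attribute layout (bit 7 blink, bits 6..4 background, bits 3..0 foreground).

```asm
        use32
BLINK   = 80h                   ; bit 7 (blink, or bright background if blinking is off)
BG_GREY = 7 shl 4               ; bits 6..4: background colour 7 (light grey)
FG_RED  = 4                     ; bits 3..0: foreground colour 4 (red)
attr    = BLINK or BG_GREY or FG_RED    ; = 0F4h, the same byte as the hlt opcode

        mov     word [0B8000h], (attr shl 8) or 'A'     ; 66 C7 05 00 80 0B 00 41 F4
        jmp     $-1             ; EB FD: jumps to the final F4 byte, which runs as hlt
```

Because the immediate is stored little-endian, the attribute byte is the last byte of the mov, so a separate hlt instruction is not needed.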

With the replacement proposed by sinsi, I came up with the following:
Code:
        org     0x7C00
        cli                     ; 0+1
        lgdt    [cs:gdtr]       ; 1+6
        mov     eax, cr0        ; 7+3
        inc     ax              ; 10+1
        mov     cr0, eax        ; 11+3
        mov     ax, data_selector; 14+3
        jmp     code_selector:protected_mode; 17+5
                                ; 22
        use32
protected_mode:
        mov     ds, ax          ; 22+2
        mov     word[0xB8000],0xF441; 24+9
        jmp     $-1             ; 33+2; lands on the 0xF4 byte above, which executes as hlt
                                ; 35

gdtr:   dw      gdt.limit       ; 35+2
        dw      gdt.base        ; 37+2
        db      0               ; 39+1
                                ; 40

gdt = $-8
gdt.base = gdt

struc descriptor [def] {
common
  local .base,.limit,.type
  .base = 0
  .type = 0x10
forward
  match =base==v, def \{ .base = v \}
  match =limit==v, def \{
    \local matched
    matched equ 0
    match vv*=4k,v \\{
      .limit = vv
      .type = .type or 0x8000
      restore matched
      matched equ 1
    \\}
    match =0, matched \\{
      .limit = v
    \\}
    restore matched
  \}
  match =w, def \{ .type = .type or 2 \}
  match =e, def \{ .type = .type or 4 \}
  match =code, def \{ .type = .type or 8 \}
  match =r, def \{ .type = .type or 2 \}
  match =c, def \{ .type = .type or 4 \}
  match =dpl==.dpl, def \{ .type = .type or (.dpl and 3) shl 5 \}
  match =p, def \{ .type = .type or 0x80 \}
  match =32bit, def \{ .type = .type or 0x4000 \}
common
  dw    .limit and 0xFFFF
  dw    .base and 0xFFFF
  db    .base shr 16 and 0xFF
  db    .type and 0xFF
  db    (.type shr 8 and 0xF0) or (.limit shr 16 and 0x0F)
  db    .base shr 24 and 0xFF
}

macro descriptor [def] {
common
  local .
  . descriptor def
}

code_selector = $-gdt
        descriptor code,32bit,base=0,limit=0xFFFFFF*4k,p,r
data_selector = $-gdt
        descriptor data,32bit,base=0,limit=0xFFFFFF*4k,p,w
gdt.limit = $-gdt-1

        rb      $$+510-$
        db      0x55, 0xAA    
56 bytes.

That big descriptor struc is made to improve readability.

The entire hlt / jmp business seems unnecessary: an NMI will almost surely shut down the CPU anyway, because the invalid descriptors in the IDT lead to a triple fault. A single hlt should be sufficient.

The G bit in a segment descriptor means 4 KiB granularity, not 64. And a code segment can be either execute-only or execute/read, so the "read-only" comment can be misleading.
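
For reference, the two magic bytes used in the flat descriptors can be built from named bits. The constant names below are hypothetical and chosen only for this sketch; the bit positions are the ones defined for code/data segment descriptors.

```asm
; access byte (byte 5 of a descriptor)
ACC_PRESENT  = 10000000b        ; P: segment is present
ACC_DPL0     = 00000000b        ; DPL: ring 0 (bits 6..5)
ACC_CODEDATA = 00010000b        ; S: code or data, not a system descriptor
ACC_EXEC     = 00001000b        ; set: code segment, clear: data segment
ACC_RW       = 00000010b        ; code: readable, data: writable

; upper nibble of byte 6
FLAG_G       = 10000000b        ; limit is counted in 4 KiB units
FLAG_D       = 01000000b        ; 32-bit default operand/address size

code_access = ACC_PRESENT or ACC_DPL0 or ACC_CODEDATA or ACC_EXEC or ACC_RW   ; 9Ah
data_access = ACC_PRESENT or ACC_DPL0 or ACC_CODEDATA or ACC_RW               ; 92h
flags_limit = FLAG_G or FLAG_D or 0Fh                                         ; CFh

        dw 0FFFFh,0                     ; limit 15..0, base 15..0
        db 0,code_access,flags_limit,0  ; execute/read code
        dw 0FFFFh,0
        db 0,data_access,flags_limit,0  ; read/write data
```

With G set, the 20-bit limit 0FFFFFh counts 4 KiB pages, so the segment spans (0FFFFFh+1) * 4 KiB = 4 GiB. With G clear the same limit would cover only 1 MiB.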
Post 18 May 2010, 02:48

sinsi
If you take out the CS: override you're down to 55 bytes for your code.
Here's mine
Code:
           org 7C00h
boot:      cli
           lgdt [pmgdt]
           mov eax,cr0
           inc ax
           mov cr0,eax
           mov ax,16
           jmp 8:flush

flush:
           use32
           mov ds,ax
           mov word[0b8000h],7441h
           hlt

pmgdt:  dw 23,pmgdt-3
        db 0
        dw 0ffffh,0,9a00h,0cfh
        dw 0ffffh,0,9200h,0cfh

;rb 510-($-$$)
;db 55h,0aah
;times (80*18*2*512)-($-boot) db 0f6h
    

54 bytes

Nice trick with gdtr.
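
For anyone who missed it, the gdtr trick boils down to two things: the 16-bit form of lgdt reads only five bytes (limit plus a 24-bit base), and the GDT base may point before the first real descriptor because entry 0 is never fetched. A stripped-down sketch, with label names invented for illustration and the mode switch itself left out:

```asm
        use16
        org     7C00h
        lgdt    [gdtr]          ; no size override: 16-bit form, 24-bit base
        ; (the rest of the mode switch would go here)

gdtr:   dw      gdt_end-gdt-1   ; limit
        dw      gdt             ; base bits 15..0
        db      0               ; base bits 23..16
gdt = $-8                       ; entry 0 overlaps the 8 bytes before this point
        dq      0x00CF9A000000FFFF      ; selector 08h: code
        dq      0x00CF92000000FFFF      ; selector 10h: data
gdt_end:
```

The null slot therefore costs nothing: it simply aliases the pseudo-descriptor and the last bytes of code in front of it.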

>That big descriptor struc is made to baffle us all.
Fixed that for you :)
Post 18 May 2010, 06:53

baldr
sinsi,

Heh, assumptions. BIOS Boot Specification 1.01 explicitly states only the values of dl and es:di when booting from BAIDs (and cs:ip, naturally). You may as well assume bx==0x7C00 (from int 0x13/02; many BIOSes leave it there, but not Bochs') and cut some more. ;)

cr0 seems to keep its pristine value 0x60000010, thus mov ax,16 can be reduced to dec ax (or dropped altogether if you don't mind a data segment with DPL 1 or higher).
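
The reasoning behind that remark can be checked with a little assembly-time arithmetic. This sketch assumes the low word of cr0 really is 0010h when the boot sector starts; the symbol names are made up for the example.

```asm
cr0_low  = 0010h                ; assumed cr0[15:0] after reset (only ET set)
selector = cr0_low + 1          ; value left in ax by "inc ax": 0011h

index = selector shr 3          ; 2: third GDT entry
ti    = (selector shr 2) and 1  ; 0: GDT rather than LDT
rpl   = selector and 3          ; 1: requested privilege level 1

display 'index=','0'+index,' ti=','0'+ti,' rpl=','0'+rpl,13,10
```

So after inc ax the register already holds a selector for GDT[2], but with RPL 1. Loading ds with it only succeeds if the data descriptor has DPL 1 or higher, whereas dec ax turns it into 0010h, the same entry with RPL 0.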

Here is an example of hardly maintainable and doubtfully reliable code that nevertheless works:
Code:
        org     0x7C00
        lgdt    [$+1]           ; 7c00+5; GDT limit: 1601; GDT base[15:0]: 7c01
        add     al, al          ; 7c05+2; GDT base[23:16]: 0
        cli                     ; 7c07+1
        mov     eax, cr0        ; 7c08+3
        inc     ax              ; 7c0b+1; ax: 0011
        mov     cr0, eax        ; 7c0c+3
        jmp     0x18:PM         ; 7c0f+5; overlaid with descriptor for selector 11 @7c11
        db      0               ; 7c14+1; base[15:8]
        dw             0xB200, 0x00CF; 7c15+4; DPL 1 data, limit: 0xF187cFFF
        dw      -1, 0, 0x9800, 0x00CF; 7c19+8; DPL 0 code

        use32
PM:     mov     ds, ax          ; 7c21+2
        mov     word[0xB8000],0x7441; 7c23+9
        hlt                     ; 7c2c+1
                                ; 7c2d
        rb      $$+510-$
        db      0x55, 0xAA    
45 bytes. Almost flat. It could be rearranged to be completely flat (at the price of 3 bytes AFAIK). ;)

Anyway, I don't think that edfed wants to confuse his readers with such code.
Post 18 May 2010, 12:33

edfed
Quote:

Anyway, I don't think that edfed wants to confuse his readers with such code.

not only the reader is confused. me too, i don't really understand this code. i suppose you use instructions to encode the first entry in gdt... it seems to be really hard to modify then.

anyway, congrats for your really light version. 45 bytes, wow!
Post 18 May 2010, 13:36

baldr
edfed wrote:
i suppose you use instructions to encode the first entry in gdt...
It goes further: lgdt uses its own bytes as part of the pseudo-descriptor (base 7c01, limit 1601), which in turn takes its last byte, 00, from the opcode of the following add al, al. That is why it's critical that lgdt is 16-bit and uses only a 24-bit GDT base: a 32-bit base would need a second 00 byte, and 00 00 is add [bx+si], al, and I don't want to ruin something.

The GDT is laid over that pseudo-descriptor: GDT[0] is unused (hence it's OK to use it for code), GDT[1] (selectors 8...b) contains PM setup code too (that's why it decodes to something weird like "base 21220f40, limit ac020fff, present DPL 2 reserved system descriptor"), and the last three bytes of that code slightly garble the limit of GDT[2].
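
Written out byte by byte, the first 33 bytes of the 45-byte version read as follows. This listing is a reconstruction, assuming fasm picks the encodings that the offsets in the listing imply (in particular 00 C0 for add al, al); each comment gives the role of the byte as code and as descriptor data.

```asm
        org     0x7C00
        db      0x0F            ; 7C00  first opcode byte of lgdt
; GDT base = 0x007C01, so GDT[0] starts here (the CPU never fetches it)
        db      0x01,0x16       ; 7C01  rest of the lgdt opcode = GDTR limit 0x1601
        db      0x01,0x7C       ; 7C03  lgdt operand 0x7C01     = GDTR base[15:0]
        db      0x00            ; 7C05  add al,al opcode        = GDTR base[23:16]
        db      0xC0,0xFA,0x0F  ; 7C06  add modrm, cli, first byte of mov eax,cr0
; GDT[1], selector 0x08: nothing but code bytes
        db      0x20,0xC0       ; 7C09  limit[15:0] = 0xC020
        db      0x40,0x0F       ; 7C0B  base[15:0]  = 0x0F40 (inc ax, start of mov cr0,eax)
        db      0x22            ; 7C0D  base[23:16] = 0x22
        db      0xC0            ; 7C0E  access: P=1, DPL=2, S=0, type 0
        db      0xEA            ; 7C0F  far jmp opcode: G=1, limit[19:16] = 0xA
        db      0x21            ; 7C10  base[31:24] = 0x21 (low byte of the jmp offset)
; GDT[2], selector 0x10 (0x11 with RPL 1): data
        db      0x7C,0x18       ; 7C11  limit[15:0] = 0x187C (rest of the jmp)
        db      0x00,0x00       ; 7C13  base[15:0]  = 0
        db      0x00            ; 7C15  base[23:16] = 0
        db      0xB2            ; 7C16  access: P=1, DPL=1, data, writable
        db      0xCF            ; 7C17  G=1, D=1, limit[19:16] = 0xF
        db      0x00            ; 7C18  base[31:24] = 0
; GDT[3], selector 0x18: execute-only code, DPL 0, flat
        db      0xFF,0xFF,0,0,0,0x98,0xCF,0
```

Reading GDT[1] back gives base 0x21220F40 and limit 0xAC020FFF, and GDT[2] gets limit 0xF187CFFF, which matches the values quoted in the thread.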

The code depends on the least-significant word of cr0 being 0010 (since P1 this holds true for its value after reset).

It's easy to shoot yourself in the foot with assembler. True perversion is to shoot with minimal lead/gunpowder usage. Could easily end up as a Tesla gun or Gauss rifle, pocket size. ;)
Post 18 May 2010, 14:34

rnop
neat code there sinsi

anyone have a, er, 64-bit sample?
Post 28 Jun 2010, 18:27

baldr
rnop,

Since 64-bit mode can only be reached from protected mode, you may extend any of the samples (they all have plenty of room to grow).
Post 29 Jun 2010, 09:16

sinsi
64-bit mode also needs paging tables, which is a bit more code...

baldr, you can go from real mode to long mode in 1 step. I started a thread about it somewhere.
Post 29 Jun 2010, 09:31

baldr
sinsi,

I prefer documented ways to handle mode transition. When it reads like "Paging can be enabled only if protection is enabled", I assume that CR0.PE must be set prior to setting CR0.PG (simultaneous mode change might work, but modern CPUs are too complex to predict something like that). It's better to be safe than sorry.
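
A rough sketch of that documented order, written as a fragment that would follow the 32-bit part of one of the samples above. It is not tested code from this thread. It assumes the CPU supports long mode (no CPUID check is done), that 0x1000..0x3FFF is free RAM, and that ds already holds a flat data selector. All labels and the table addresses are chosen only for illustration.

```asm
        use32
PML4 = 0x1000                   ; assumption: 0x1000..0x3FFF is free RAM
PDPT = 0x2000
PD   = 0x3000

enter_long_mode:
        lgdt    [gdtr64]        ; GDT that contains a 64-bit code descriptor
        cld
        mov     ax, ds          ; ds is the flat data selector loaded earlier
        mov     es, ax          ; stosd writes through es
        mov     edi, PML4
        xor     eax, eax
        mov     ecx, 3*4096/4
        rep     stosd           ; zero the three tables

        mov     dword [PML4], PDPT or 3     ; present + writable
        mov     dword [PDPT], PD or 3
        mov     dword [PD], 0x83            ; 2 MiB page at 0: present + writable + PS

        mov     eax, cr4
        or      eax, 1 shl 5                ; CR4.PAE
        mov     cr4, eax

        mov     eax, PML4
        mov     cr3, eax

        mov     ecx, 0xC0000080             ; IA32_EFER
        rdmsr
        or      eax, 1 shl 8                ; EFER.LME
        wrmsr

        mov     eax, cr0
        or      eax, 1 shl 31               ; CR0.PG, PE was already set
        mov     cr0, eax

        jmp     0x08:long_mode              ; load cs with the 64-bit descriptor

        use64
long_mode:
        mov     edi, 0xB8000                ; writing edi zero-extends into rdi
        mov     word [rdi], 0x7441
park:   hlt
        jmp     park

gdt64:  dq      0
        dq      0x00209A0000000000          ; selector 0x08: 64-bit code, L=1
gdt64_end:
gdtr64: dw      gdt64_end-gdt64-1
        dd      gdt64
```

The single 2 MiB identity-mapped page is enough to cover both the boot sector at 0x7C00 and the text buffer at 0xB8000, which keeps the page-table setup to three stores.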

----8<----
rnop,

Mentioned thread is here.
Post 29 Jun 2010, 10:33

l4m2
[Only one segment] allowed?
Post 16 Jan 2015, 11:34

Copyright © 1999-2020, Tomasz Grysztar.