flat assembler
Message board for the users of flat assembler.

Index > OS Construction > alignment filler

tantrikwizard



I'm working on a small kernel and ran into something unusual. When I pass a pointer to a declared array of bytes (a string), the generated address points a couple of bytes before the first character. I also noticed some junk opcodes being generated and executed when calling a function. Here are the executed instructions from Bochs:
<execution trace>
Code:
 call somefunction
   add ds:[eax], 0 ;<-- where does this come from?
    ;(the actual code-generated function pointer points here,
    ; not at the PUSH EBP below)
   push ebp
   mov ebp, esp    

</execution trace>
The ADD instruction was either randomly thrown into the compiled binary or the offset to the function was misinterpreted (or I'm missing something critical). After some head scratching I aligned my function to a paragraph boundary with align and noticed that the ghost ADD instructions were replaced with NOPs: <execution trace>
Code:
    push eax
    push ebx
    call function1
    nop
    nop
    push ebp
    mov ebp, esp    

</execution trace>
Again, EIP jumped to the assumed entry point of the function, but the mysterious ADD instructions were replaced with NOPs (I can live with a few NOPs on entry to a function).

What am I missing here? This is causing some problems with strings
Code:
    align 16
str db 'mystring',0    
The returned string pointer (as in mov esi, str) points a few bytes before the first character when the string is aligned. After mov esi, str executes, esi holds 0x10020; a memory dump at that address shows two 0x90 (NOP) bytes at 0x10020 preceding the first character of the string (the 'm' of 'mystring' is at 0x10022). Any ideas on this are appreciated. I'm fairly new to fasm but have extensive experience with tasm, masm and nasm. From a beginner's perspective it looks like a compiler bug.
Post 14 Dec 2006, 00:51
kohlrak



I had some weird instances, too, where some code of mine crashed for no apparent reason; I eventually isolated the crash to printf. If you're using code written entirely by you in fasm, then post that section of the original source so that we can see what might have caused it.
Post 14 Dec 2006, 01:51
tantrikwizard



kohlrak wrote:
I had some weird instances, too, where some code of mine crashed for no apparent reason; I eventually isolated the crash to printf. If you're using code written entirely by you in fasm, then post that section of the original source so that we can see what might have caused it.



Here's an example that skews offsets. A boot loader loads this code to 0x10000 (1000:0)

Code:
format binary
KERNEL_BASE_ADDR EQU 0x10000

USE16
ORG 0x0
jmp KernelEntry

STRUC GDTEntry {
    .limit_low DW ?
    .base_low DW ?
    .base_middle DB ?
    .access DB ?
    .granularity DB ?
    .base_high DB ?
}

KernelEntry:
        CLI
        CLD

        MOV AX, CS
        MOV DS, AX

        MOV [NULLDESC.limit_low], 0
        MOV [NULLDESC.base_low], 0
        MOV [NULLDESC.base_middle], 0
        MOV [NULLDESC.access], 0
        MOV [NULLDESC.granularity], 0
        MOV [NULLDESC.base_high], 0

        MOV [CODEDESC.limit_low], 0FFFFh
        MOV [CODEDESC.base_low], 0
        MOV [CODEDESC.base_middle], 0
        MOV [CODEDESC.access], 09Ah
        MOV [CODEDESC.granularity], 0CFh
        MOV [CODEDESC.base_high], 0

        MOV [DATADESC.limit_low], 0FFFFh
        MOV [DATADESC.base_low], 0
        MOV [DATADESC.base_middle], 0
        MOV [DATADESC.access], 092h
        MOV [DATADESC.granularity], 0CFh
        MOV [DATADESC.base_high], 0
        
        MOV AX, GDT_END - GDT_START-1
        MOV [GDTLIMIT], AX
        XOR EAX, EAX
        MOV AX, CS
        SHL EAX, 4
        ADD EAX, GDT_START
        MOV [GDTADDR], EAX

        MOV AX, CS
        MOV DS, AX

        CLI
        CLD
        LGDT [GDT_PTR]

        MOV EAX, CR0
        OR AL, 1
        MOV CR0, EAX

        JMP pword 08h:KERNEL_BASE_ADDR+PModeStart

GDT_PTR:
        GDTLIMIT DW GDT_END - GDT_START - 1
        GDTADDR DD $+2
GDT_START:      
NULLDESC GDTEntry
CODEDESC GDTEntry
DATADESC GDTEntry
GDT_END:


USE32
align 16
PModeStart:
org (KERNEL_BASE_ADDR + (PModeStart - KernelEntry))

        MOV [GS:0], BYTE 'Z'

        MOV AX, 10h
        MOV DS, AX
        MOV ES, AX
        MOV SS, AX
        
        MOV esi, tempstr
        
        JMP $

tempstr db 'my test string', 0
    


The last 3-4 instructions are what I'm interested in. Bochs carries out this execution with no problem until:

Code:
08:0x100d2 mov esi, 0x000100d7        ;bed7000100
08:0x100d7 jmp .+0xfffffffe (0x000100d7)     ;ebfe
    

At the first line above, esi is loaded with 0x100d7, the address of the next instruction (jmp $, opcode bytes 0xeb 0xfe), instead of the address of the string that follows that instruction. A dump of memory reveals:
Code:
0x100d7: 0xeb    0xfe    0x6d    0x79    0x20    0x74    0x65    0x73
0x100df: 0x74    0x20    0x73    0x74    0x72    0x69    0x6e    0x67
0x100e7: 0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00    

As you can see, mov esi, tempstr should load esi with 0x100d9, not 0x100d7.
Thanks everyone, any help is appreciated.
Post 14 Dec 2006, 02:51
kohlrak



I'm still new to assembly, so I'll have to ask you to explain this...

Code:
JMP $    
I'm not used to seeing a "$" after a jump.
Post 14 Dec 2006, 03:01
rugxulo



I think jmp $ just hangs/loops indefinitely ... it just jumps to its own offset. $ means current location.
Post 14 Dec 2006, 04:02
cod3b453



@kohlrak: $ is the value of the current memory address (and $$ is the base memory address of that bit of code). Here's what it means:

Code:
jmp $
; is the same as
@@:
jmp @b
    


This is an infinite loop that causes the system to hang on that instruction.
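
As a rough illustration of how $ and $$ are typically used together, here is a minimal sketch in fasm. The boot-sector layout, the 0x7C00 origin and the label name are assumptions chosen for the example and are not taken from the code in this thread.

```asm
; Hypothetical boot-sector-style fragment, only to illustrate $ and $$.
use16
org 0x7C00                      ; $$ is 0x7C00 for everything that follows

start:
        jmp     $               ; jumps to its own address, encoded as EB FE

        times 510-($-$$) db 0   ; $-$$ = bytes emitted so far; pad up to offset 510
        dw 0xAA55               ; boot signature in the last two bytes
```

Because $-$$ is a difference of two addresses, it gives the number of bytes assembled since the last org, whatever that origin happens to be.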

@tantrikwizard: I can't see why the value is 2 bytes out...
Post 14 Dec 2006, 04:14
kohlrak



Maybe one of us should compile it and try it...
Post 14 Dec 2006, 05:56
Tomasz Grysztar



Compiling not needed. ;)
Code:
USE16
ORG 0x0
jmp KernelEntry
; ...
KernelEntry:    

Here you have a short jump (2 bytes) BEFORE the KernelEntry label. Thus KernelEntry is 2, not 0.
Code:
USE32
align 16
PModeStart:
org (KERNEL_BASE_ADDR + (PModeStart - KernelEntry))    

...and here you assume that the segment starts from the KernelEntry label?
Since the first part uses "org 0", it would be simpler to write just "org KERNEL_BASE_ADDR+PModeStart".
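
For readers following along, the suggested change can be sketched as a trimmed-down version of the posted source. This is only an illustration: the GDT setup and the switch to protected mode are elided, and the first jmp $ is a placeholder added so the fragment assembles on its own.

```asm
format binary
KERNEL_BASE_ADDR equ 0x10000

use16
org 0x0
        jmp     KernelEntry             ; short jump: 2 bytes, so KernelEntry = 2
KernelEntry:
        ; ... GDT setup and switch to protected mode elided ...
        jmp     $                       ; placeholder for the elided code

use32
align 16
PModeStart:                             ; still an offset from the start of the file
org KERNEL_BASE_ADDR + PModeStart       ; labels below = load address + file offset
        mov     esi, tempstr            ; resolves to the address of the string itself
        jmp     $

tempstr db 'my test string', 0
```

With the original "org (KERNEL_BASE_ADDR + (PModeStart - KernelEntry))", every label after the org comes out KernelEntry = 2 bytes too low. That matches the dump above: the string sits at 0x100d9 in memory, but tempstr was assembled as 0x100d9 - 2 = 0x100d7. The same skew would plausibly account for the earlier symptoms, where a pointer landed two bytes early on whatever filler preceded the target.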
Post 14 Dec 2006, 07:23
tantrikwizard



Tomasz Grysztar wrote:
Here you have a short jump (2 bytes) BEFORE the KernelEntry label.


Excellent eye mate, thanks for the fix.
Post 14 Dec 2006, 14:06
Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
