flat assembler
Message board for the users of flat assembler.

Weird "neg" kernel bug?

Patrick_ 01 May 2006, 15:10
I've just discovered a very odd Linux bug. I fill a timespec structure with tv_sec as 0 and tv_nsec as -77, then use the Linux interrupt (int 0x80) to make the nanosleep call. The call works, but obviously returns with an error because tv_nsec is -77. Then I execute a "neg eax" instruction and test whether eax is 0. If it isn't, the code tries to sleep again.

Well, since it tests whether eax is 0, and eax is never 0, since we're always setting tv_nsec to -77, it performs an infinite loop. The odd thing is, however, that my process count soars way past the 10,000 mark.

However, when I remove the "neg eax" instruction from my code, an infinite loop just occurs, without any increase in my process count. This is pretty odd. Here's the code I used:

Code:
        format ELF executable
        entry _start

_start:
        mov eax, 162    ;nanosleep
        mov [timespec_const.tv_sec], 3
        mov [timespec_const.tv_nsec], -77
make_sleep_call:
        mov ebx, dword timespec_const
        mov ecx, dword timespec_mod     ;this will be modified if sleep was interrupted
        int 0x80 

        int 0x3
        neg eax ;problematic instruction

        test eax, eax
        jnz interrupted_sleep

interrupted_sleep:
        ;Call sleep again, with the "remainder" values
        mov ebx, [timespec_mod.tv_sec]
        mov ecx, [timespec_mod.tv_nsec]
        mov [timespec_const.tv_sec], ebx
        mov [timespec_const.tv_nsec], ecx
        jmp make_sleep_call

        struc timespec {
              .tv_sec rd 1
              .tv_nsec rd 1
        }

        timespec_const timespec
        timespec_mod timespec    



I continue through the code in my debugger (ald), with the breakpoint right before the "neg" instruction. Every third time I continue through (getting back to the breakpoint), a SIGCHLD signal occurs. Otherwise, it's of course a SIGTRAP.

I'm using kernel 2.6.16. Can anyone else confirm this? I sent the bug to the LKML, but it's not showing up (moderator refused it?)

Endre 01 May 2006, 17:58
I don't know exactly what you want to do, but I hope you've realized that when you jump back to 'make_sleep_call', eax no longer contains 162 (nanosleep) but some trash (the return value of nanosleep, or its negation).

So you ought to check what the kernel does with those values. I guess in one case you call some valid system function and in the other case you don't.

Madis731 03 May 2006, 07:39
The negation of the most negative value is that same value: for a byte, neg(80h)=80h, and for a dword, neg(80000000h)=80000000h. This is the one case where the sign doesn't change, and you might get unexpected behaviour.
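
A minimal sketch of that edge case, assuming a 32-bit Linux target and sys_exit being syscall number 1 (the labels "unexpected" and "quit" are made up for illustration). It negates 80000000h and exits with status 0 only if the value came back unchanged and the overflow flag was set, which is how the CPU signals that the result did not fit:

Code:
        format ELF executable
        entry _start

        segment readable executable
_start:
        mov eax, 80000000h      ;most negative 32-bit value
        neg eax                 ;result is 80000000h again, OF is set
        jno unexpected          ;overflow flag should be set here
        cmp eax, 80000000h
        jne unexpected
        xor ebx, ebx            ;exit status 0: value and sign unchanged
        jmp quit
unexpected:
        mov ebx, 1              ;exit status 1
quit:
        mov eax, 1              ;sys_exit
        int 0x80

Running it and then typing "echo $?" in the shell shows the exit status.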

Feryno 04 May 2006, 05:24
Hello, if you try
man nanosleep
you will read that the nanoseconds field must be in the range 0-999999999,
which is hexadecimal 0-3B9AC9FFh.
-77 is hexadecimal FFFFFFB3h, which violates the sys_nanosleep rules.
I have used nanosleep very frequently and never had any problems, because I have never passed a negative time. What do you want? Go back in time? Get younger? Build a machine for travelling back in time?
And please use this method to determine whether a syscall failed:
syscall
or eax,eax
js syscall_failed

Error return values look like FFFFFFFFh=-1, ... -11, ... maybe -200...
I have never seen an error return value like -200; there aren't that many error values defined. It is not very probable, but of course there may be a bug in the sys_nanosleep error return value. I won't test it, though; I will keep to the syscall rules.
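
The check above can be sketched as a complete program. This is only an illustration, assuming a 32-bit Linux target with nanosleep as syscall 162 and exit as syscall 1; the names "req", "syscall_failed" and "quit" are made up here. It passes the invalid tv_nsec of -77 on purpose and returns the positive error number as the exit status:

Code:
        format ELF executable
        entry _start

        segment readable executable
_start:
        mov eax, 162            ;sys_nanosleep
        mov ebx, req            ;requested time (deliberately invalid)
        xor ecx, ecx            ;no "remaining time" structure
        int 0x80
        or eax, eax             ;sets the sign flag if eax is negative
        js syscall_failed
        xor ebx, ebx            ;exit status 0: the call succeeded
        jmp quit
syscall_failed:
        neg eax                 ;turn -errno into errno
        mov ebx, eax            ;exit status = error number
quit:
        mov eax, 1              ;sys_exit
        int 0x80

        segment readable writeable
req     dd 0                    ;tv_sec
        dd -77                  ;tv_nsec, outside 0..999999999

With the invalid tv_nsec the exit status should be 22 (EINVAL). Note that Linux only reserves -4095..-1 for error codes, so for calls that can legitimately return values with the top bit set (addresses, for instance) a tighter test is "cmp eax, -4095" followed by "jae syscall_failed".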

Feryno 05 May 2006, 08:35
OK, today I have time to explain what your code did wrong. Endre already explained it.
Try:
strace ./your_program_name

What your program did:
The first int 0x80 returns with eax=-EINVAL=-22 because you violated the nanosleep rules by passing -77 as nanoseconds (as an unsigned value, -77d=FFFFFFB3h > 3B9AC9FFh=999999999d).
Then you did:
neg eax
so eax holds the value 22, and the next int 0x80 executes sys_umount.

Without neg eax, your code tries to execute syscall number -22d=FFFFFFEAh=4294967274d, which isn't defined (there are about 260 syscalls), so the kernel just returns an error (normally -ENOSYS) and the loop keeps spinning without doing anything.

correct code should be:
Code:
        format ELF executable
        entry _start

_start:
        mov [timespec_const.tv_sec], 3
        mov [timespec_const.tv_nsec],put_correct_value_here   ; placeholder: any value 0..999999999d
make_sleep_call:
        mov eax, 162    ;nanosleep; reloaded on every pass because int 0x80 overwrites eax
        mov ebx, dword timespec_const
        mov ecx, dword timespec_mod     ;this will be modified if sleep was interrupted
        int 0x80

        cmp eax, -4  ; -EINTR
        jz interrupted_sleep
        test eax, eax
        jz sleeping_done
        jmp interrupt_error

interrupted_sleep:
        ;Call sleep again, with the "remainder" values
        mov ebx, [timespec_mod.tv_sec]
        mov ecx, [timespec_mod.tv_nsec]
        mov [timespec_const.tv_sec], ebx
        mov [timespec_const.tv_nsec], ecx
        jmp make_sleep_call

sleeping_done:
        xor ebx, ebx    ;exit status 0: slept the full time
        jmp exit_program
interrupt_error:
        mov ebx, 1      ;exit status 1: nanosleep failed with something other than EINTR
exit_program:
        mov eax, 1      ;exit
        int 0x80

        struc timespec {
              .tv_sec rd 1
              .tv_nsec rd 1
        }

        timespec_const timespec
        timespec_mod timespec
    


Attachment: nanosleep.2.gz (/usr/share/man/man2/nanosleep.2.gz), 1.95 KB


Patrick_ 08 May 2006, 21:35
Haha, go back in time. :)

Thanks guys, I never thought about eax never being changed. That was indeed the problem.

Feryno 10 May 2006, 05:13
eax / rax MUST change because it is the return register: it holds the result and tells you whether the system call succeeded or failed.

Registers preserved across function calls in 64-bit:
rbx rsp rbp r12 r13 r14 r15

I can't remember exactly which registers are preserved in 32-bit;
maybe ebx, esi, edi, esp, ebp?
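
For the 32-bit case, here is a small sketch of what "preserved" means in practice, assuming the usual i386 convention in which ebx, esi, edi, ebp and esp must survive a function call while eax carries the result. The routine "add_two" and the labels "clobbered" and "quit" are hypothetical, made up for this example. The function uses ebx as scratch, so it saves and restores it, and the caller verifies that its own ebx is intact:

Code:
        format ELF executable
        entry _start

        segment readable executable
_start:
        mov ebx, 1234           ;value the caller expects to survive the call
        push 5                  ;second argument
        push 7                  ;first argument
        call add_two            ;result comes back in eax
        add esp, 8              ;caller removes the arguments
        cmp ebx, 1234
        jne clobbered
        mov ebx, eax            ;exit status = returned sum
        jmp quit
clobbered:
        mov ebx, 255            ;exit status 255: ebx was not preserved
quit:
        mov eax, 1              ;sys_exit
        int 0x80

;hypothetical helper: returns eax = first argument + second argument
add_two:
        push ebx                ;preserved register, save before using it
        mov ebx, [esp+8]        ;first argument (above saved ebx and return address)
        mov eax, [esp+12]       ;second argument
        add eax, ebx
        pop ebx                 ;restore the caller's ebx
        ret

The exit status is the sum computed by add_two (7 + 5), or 255 if ebx had been clobbered.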

Copyright © 1999-2024, Tomasz Grysztar.