flat assembler
Message board for the users of flat assembler.

AMD64 Multitasking

narada



Joined: 15 Feb 2008
Posts: 77
Location: Ukraine, Dnepropetrovsk
narada 17 Mar 2008, 14:01
So, I have code that loads and switches to long mode, then starts IRQ0 (the timer) and a main loop (the shell). But how do I implement multitasking without a TSS (there is no hardware multitasking in AMD64)? I searched for software multitasking examples but found nothing. I don't like C++, so working with the Linux/BSD sources is very hard for me. Please help with a simple implementation of SOFTWARE MULTITASKING (32-bit is fine, I will rewrite it for x64). Or maybe just some good docs...
Thanks for ANY information. ;)

_________________
http://www.omegicus.com
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 17 Mar 2008, 14:24
Multitasking or multithreading? I suggest you start with multithreading :)

The basics are simple: you need a CONTEXT structure where you store the per-thread register state (later you'll need more per-thread information, like a backlink to the PROCESS it belongs to). On a context switch, for simple threading, you simply select a new CONTEXT structure and load the register state from it. Voila!
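
A minimal sketch of such a CONTEXT structure, written in C only so the idea can be checked on a host machine. The field names, the `struct process` backlink type and `switch_context` are all assumptions for illustration; in a real kernel the copy would be done by an assembly stub working on the live registers, not on a `cpu` variable.

```c
#include <assert.h>
#include <stdint.h>

struct process;   /* hypothetical owner type, defined elsewhere in a real kernel */

/* Hypothetical per-thread register state for x86-64. */
typedef struct context {
    uint64_t rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp;
    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
    uint64_t rip, rflags;
    struct process *owner;   /* backlink to the PROCESS this thread belongs to */
} context_t;

/* Simulated switch: `cpu` stands in for the live register file.
   Save the running thread's state into `from`, then load `to`. */
static void switch_context(context_t *cpu, context_t *from, const context_t *to)
{
    assert(cpu && from && to);
    *from = *cpu;
    *cpu  = *to;
}
```

After `switch_context(&cpu, &a, &b)`, `a` holds whatever was running and `cpu` holds the state of `b`, which is all a simple thread switch amounts to.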
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20309
Location: In your JS exploiting you and your system
revolution 17 Mar 2008, 14:42
In a flat unprotected OS, multi-threading and multi-tasking are much the same thing. So if you want to start simple (which is recommended), don't go for a protected OS with task isolation etc. Just use a simple structure to save the CPU and FPU states, and have a scheduler select which state to run during each time slice. For any shared resource, use access locks to block other threads from entering.
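
One possible shape for such an access lock is a test-and-set spinlock. The sketch below uses C11 atomics so it can be tried on a host; the `spinlock_t` type and function names are my own, and a real kernel would more likely use `lock bts` or `xchg` directly in assembly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: the flag is set while some thread holds the lock. */
typedef struct { atomic_flag held; } spinlock_t;

/* Try once; returns true if the caller now owns the lock. */
static bool spin_try_lock(spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is free. */
static void spin_lock(spinlock_t *l)
{
    while (!spin_try_lock(l)) {
        /* spinning; a real kernel would pause or yield the time slice here */
    }
}

static void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

On a single CPU with timer preemption, a thread that spins just burns the rest of its time slice until the holder runs again, so yielding inside the loop is usually the better choice.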
edfed



Joined: 20 Feb 2006
Posts: 4330
Location: Now
edfed 17 Mar 2008, 17:57
Multitasking:
define the work space.
define the code and data segments.

Multitasking with the timer:
each timer0 tick = task switch.
Timer interrupt handler:
Code:
task list:
save context
load context
jmp to new context
loop task list


Multithreading on top of multitasking:
Code:
define interdependencies
define sections
load and execute
create tasks
put sections in tasks
return to main loop


This approach is not architecture dependent.
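
The "task list" loop above could look something like the following round-robin picker, as a rough sketch. `task_list_t`, `MAX_TASKS` and `next_task` are hypothetical names; the context save and load around the call are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8   /* arbitrary limit for the sketch */

typedef struct {
    bool   runnable[MAX_TASKS];  /* true if the task may be scheduled */
    size_t count;                /* number of slots in use */
    size_t current;              /* index of the running task */
} task_list_t;

/* Called on each timer tick: advance to the next runnable task after
   `current`, wrapping around the list. If nothing else is runnable,
   `current` is left unchanged. Returns the index to run. */
static size_t next_task(task_list_t *tl)
{
    assert(tl->count > 0 && tl->count <= MAX_TASKS);
    for (size_t step = 1; step <= tl->count; step++) {
        size_t candidate = (tl->current + step) % tl->count;
        if (tl->runnable[candidate]) {
            tl->current = candidate;
            break;
        }
    }
    return tl->current;
}
```

Nothing in it depends on the CPU, which matches the point that the scheme itself is not architecture dependent; only the save/load step is.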
narada



Joined: 15 Feb 2008
Posts: 77
Location: Ukraine, Dnepropetrovsk
narada 17 Mar 2008, 20:05
f0dder, thanks, but I need multitasking ))). In a 32-bit OS I already had hardware multitasking, but my question is about multitasking on AMD64...

OK: I can do a CPU context switch, that is not a problem. But the question is: with software multitasking, how do I get CPL 0..3 privilege levels for processes, isolation, and so on? I repeat: hardware multitasking is not a problem. I need software.
narada



Joined: 15 Feb 2008
Posts: 77
Location: Ukraine, Dnepropetrovsk
narada 17 Mar 2008, 20:07
OK. I downloaded the AMD tutorial and will study it now )))
But if someone has sources, please... )))
Dex4u



Joined: 08 Feb 2005
Posts: 1601
Location: web
Dex4u 17 Mar 2008, 20:27
I would suggest you look at this code to give you an idea, and convert it to long mode

http://www.dex4u.com/ASMcompo512b/nanoos2.zip
narada



Joined: 15 Feb 2008
Posts: 77
Location: Ukraine, Dnepropetrovsk
narada 17 Mar 2008, 20:37
It's not quite that, but the source is very interesting, with some good ideas.
BTW, Dex4u, I saw your OS: very fine! )

-- ------------------
Thanks, I already found something in AMD tutorial #24593.
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 18 Mar 2008, 00:31
narada, what do you mean by software vs. hardware multitasking?

Even on 32bit OS where you could use TSS, you generally wouldn't do that, because it's much slower than software context switching.

And for 64bit mode, you can still do ring3/ring0 split and use paging for isolation... what is it specifically you're having trouble with?
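
To make the ring3/ring0 split and paging isolation a bit more concrete, here is a small sketch of the two bit fields involved: the User/Supervisor bit in a page-table entry and the RPL in a segment selector. The macro and function names are mine, and the GDT indices in the usage note are only an assumed layout.

```c
#include <assert.h>
#include <stdint.h>

/* x86-64 page-table entry flag bits. */
#define PTE_PRESENT   (1ULL << 0)
#define PTE_WRITABLE  (1ULL << 1)
#define PTE_USER      (1ULL << 2)   /* clear: ring 3 access causes a page fault */
#define PTE_ADDR_MASK 0x000FFFFFFFFFF000ULL

/* Build an entry mapping a 4 KiB-aligned physical address with the given flags. */
static uint64_t make_pte(uint64_t phys, uint64_t flags)
{
    return (phys & PTE_ADDR_MASK) | flags;
}

/* Build a GDT selector: descriptor index in bits 3..15, RPL in bits 0..1. */
static uint16_t make_selector(uint16_t gdt_index, uint16_t rpl)
{
    return (uint16_t)((gdt_index << 3) | (rpl & 3));
}
```

With a hypothetical GDT that has user code at index 3, `make_selector(3, 3)` gives 0x1B for a ring-3 CS. Kernel pages are mapped without `PTE_USER`; for a user page the bit has to be set at every level of the paging hierarchy, and giving each process its own page tables (its own CR3) provides the isolation between processes.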
Feryno



Joined: 23 Mar 2005
Posts: 509
Location: Czech republic, Slovak republic
Feryno 18 Mar 2008, 12:58
Long mode doesn't support HW multitasking anymore. You must use SW multitasking (which is even faster than the old HW multitasking in the 32-bit world).
The principle is quite simple:
- the timer interrupt (part of the kernel code) ticks at some period
- when the timer fires, the running program is interrupted and its SS, RSP, RFLAGS, CS, RIP are saved on the kernel stack
- at this point, your OS decides whether to let the interrupted program continue (just execute IRETQ) or to change task
- if changing task, the OS saves the rest of the registers of the interrupted program and reloads the registers of the new program. The SS, RSP, RFLAGS, CS, RIP of the new program are placed on the kernel stack, and the new program resumes when the kernel executes IRETQ, which reloads those five registers with the new program's values.
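
The steps above can be sketched in C as a simulation of what the timer handler does to the kernel stack. The frame layout (RIP lowest, SS highest) is what the CPU pushes in long mode; everything else is an assumption for illustration: `task_state_t`, `switch_task`, and the idea that an assembly stub has already pushed the 15 general registers into `saved_gpr`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Interrupt frame as it lies on the kernel stack in long mode,
   lowest address first (the CPU pushes SS first and RIP last). */
typedef struct {
    uint64_t rip, cs, rflags, rsp, ss;
} iret_frame_t;

static_assert(sizeof(iret_frame_t) == 40, "five 8-byte slots");

/* Hypothetical per-task save area. */
typedef struct {
    uint64_t     gpr[15];   /* rax..r15 without rsp; the order is up to the stub */
    iret_frame_t frame;
} task_state_t;

/* Save the interrupted task from the stack into `prev`, then overwrite the
   stack copy with `next`, so the stub's register pops and the final IRETQ
   resume `next` instead. */
static void switch_task(uint64_t saved_gpr[15], iret_frame_t *frame,
                        task_state_t *prev, const task_state_t *next)
{
    memcpy(prev->gpr, saved_gpr, sizeof prev->gpr);
    prev->frame = *frame;
    memcpy(saved_gpr, next->gpr, sizeof next->gpr);
    *frame = next->frame;
}
```

If the OS decides not to change task, it simply leaves the frame alone and IRETQ returns to the interrupted program.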

to narada
Without a TSS: I think that is not a good way. The kernel (including the timer interrupt) needs to know how to switch to the kernel stack during an interrupt (ring3->ring0 switch), and this is defined in the TSS. The TSS can be much simpler than in HW multitasking. A minimal TSS requires perhaps only the RSP0 value (the RSP loaded when switching from higher rings into the ring0 kernel), but the TSS is also a really good place to store the rest of the registers. In the space you append after the CPU-defined fields, you can arrange the register positions as you like (HW multitasking had fixed positions for the registers).
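
For reference, a sketch of the 64-bit TSS layout as a packed C struct, with the offsets checked at compile time. The struct and function names are my own, and `__attribute__((packed))` assumes GCC or Clang. Only RSP0 is filled in, as in the minimal case described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Long-mode TSS: 104 bytes defined by the CPU. */
typedef struct __attribute__((packed)) {
    uint32_t reserved0;
    uint64_t rsp[3];      /* RSP0..RSP2: stack pointers for rings 0..2 */
    uint64_t reserved1;
    uint64_t ist[7];      /* IST1..IST7: optional interrupt stacks */
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iomap_base;  /* offset of the I/O permission bitmap */
} tss64_t;

static_assert(sizeof(tss64_t) == 104, "TSS size");
static_assert(offsetof(tss64_t, rsp) == 4, "RSP0 offset");
static_assert(offsetof(tss64_t, ist) == 36, "IST1 offset");
static_assert(offsetof(tss64_t, iomap_base) == 102, "I/O map base offset");

/* Set the stack the CPU loads on a ring3 -> ring0 transition. With one
   kernel stack per task, this would be updated on every task switch. */
static void tss_set_kernel_stack(tss64_t *tss, uint64_t stack_top)
{
    tss->rsp[0] = stack_top;
}
```

Per-task register storage, if kept in the same block, would go after these 104 bytes, where the layout is entirely up to the OS.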
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 18 Mar 2008, 13:40
Imho it's silly that hardware context switching isn't implemented, and implemented in a fast way. After all, it's something that happens all the bloody time on a modern OS. But that's a pretty common thing with x86, it lacks a bunch of useful features, and parts of the instruction set can be done faster with multiple simpler instructions - I don't really get that, since one complex instruction could easily be made to generate the same µops as several simple? :-s
narada



Joined: 15 Feb 2008
Posts: 77
Location: Ukraine, Dnepropetrovsk
narada 18 Mar 2008, 15:10
Feryno, why did you write "RFLAGS"? I studied the AMD64 tutorial, and EFLAGS is still 32-bit in AMD64 long mode...
Feryno



Joined: 23 Mar 2005
Posts: 509
Location: Czech republic, Slovak republic
Feryno 19 Mar 2008, 09:15
to narada - yeah, you are right, the flags register is a kind of "hybrid": it is 64 bits wide but uses only the low 32 bits (just like EFLAGS), and the upper 32 bits are all zeros. Perhaps sometime in the future the zeroed bits will become useful for something...

when you push this register (pushf instruction) in long mode, all 64 bits are pushed onto the stack

when you pop this register from the stack (popf), 64 bits are loaded into the flags register from the stack

there are only 2 instructions that reload the whole flags register from the stack: popf, iretq
(+ some that modify only 1 bit of the flags, like clc, sti, ...)
I think doing iret (db 0CFh) instead of iretq (db 48h, 0CFh) in long mode reloads only 32 bits of the flags register from the stack (and only 32 bits of the IP register, hence EIP instead of RIP)
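
As a small illustration of the RFLAGS image that ends up in the IRETQ frame, here is a sketch of a starting value for a new task. The bit positions are architectural; the helper names and the choice of "interrupts enabled, IOPL 0" are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define RFLAGS_RESERVED1 (1ULL << 1)    /* bit 1 always reads as 1 */
#define RFLAGS_IF        (1ULL << 9)    /* interrupt enable flag */
#define RFLAGS_IOPL_MASK (3ULL << 12)   /* I/O privilege level, bits 12..13 */

/* Hypothetical initial flags for a freshly created task:
   interrupts enabled, IOPL 0, everything else clear. */
static uint64_t initial_rflags(void)
{
    return RFLAGS_RESERVED1 | RFLAGS_IF;
}

/* The upper 32 bits of RFLAGS are reserved and read as zero. */
static int upper_half_clear(uint64_t rflags)
{
    return (rflags >> 32) == 0;
}
```

`initial_rflags()` evaluates to 0x202, which is the 64-bit value that would be written into the RFLAGS slot of a new task's frame.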
narada



Joined: 15 Feb 2008
Posts: 77
Location: Ukraine, Dnepropetrovsk
narada 19 Mar 2008, 15:09
Spent the whole day studying the Linux 2.6.9 kernel sources for x86_64... How I hate C/C++!!! )))
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20309
Location: In your JS exploiting you and your system
revolution 19 Mar 2008, 15:20
f0dder wrote:
Imho it's silly that hardware context switching isn't implemented, and implemented in a fast way. After all, it's something that happens all the bloody time on a modern OS. But that's a pretty common thing with x86, it lacks a bunch of useful features, and parts of the instruction set can be done faster with multiple simpler instructions - I don't really get that, since one complex instruction could easily be made to generate the same µops as several simple? :-s
I haven't looked into the task switch algorithm, but I imagine the reason it is slower is the same as for instructions like 'loop'. Naively one might expect 'loop' to generate two µops (dec ecx/jnz target), but the problem is that loop is restricted from altering the flags, registers (except ecx) or memory, so the normal µops can't simply be inserted into the code stream; the microcode has to use either more µops or slower µops. Scale that up to a complex task switch and things get slower.
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 20 Mar 2008, 00:29
revolution: hadn't thought of flag dependency thing for LOOP... as for context switching, it would make sense to use a little die space to add in fast hardware context switching, since it's used all the time.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20309
Location: In your JS exploiting you and your system
revolution 20 Mar 2008, 02:24
f0dder wrote:
revolution: hadn't thought of flag dependency thing for LOOP... as for context switching, it would make sense to use a little die space to add in fast hardware context switching, since it's used all the time.
Actually it is not used all that often. Most desktop OSes switch at most about once every 30 ms (IIRC), so a task switch instruction executed once per 30 ms (worst case) is a tiny overhead compared to the total of all the other instructions executed.
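
To put a rough number on that, the overhead is just the switch cost divided by the time slice. The helper below is a sketch; the 5 µs cost used in the note after it is a made-up figure for illustration, not a measurement.

```c
#include <assert.h>

/* Fraction of CPU time spent on task switching, assuming exactly one
   switch per time slice. Cost is in microseconds, slice in milliseconds. */
static double switch_overhead(double switch_cost_us, double slice_ms)
{
    assert(slice_ms > 0.0);
    return switch_cost_us / (slice_ms * 1000.0);
}
```

With a hypothetical 5 µs per switch and a 30 ms slice, `switch_overhead(5.0, 30.0)` is about 0.00017, i.e. under 0.02% of CPU time.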
Mac2004



Joined: 15 Dec 2003
Posts: 314
Mac2004 02 May 2008, 06:18
narada wrote:
.... why you wrote "RFLAGS" - I learned AMD64 tutorial, EFLAGS still 32bit in AMD64LM...


AMD64 documents refer to rflags instead of eflags.

regards,
Mac2004
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
