flat assembler
Message board for the users of flat assembler.

How are OS virtualization programs made?
Mino



Joined: 14 Jan 2018
Posts: 156
Good morning,
I'm sorry if this is the wrong forum, but it seemed suitable to me :)
So here's my question: how are operating system virtualization programs, such as VMware or VirtualBox, made?
I know they simulate an architecture (even one different from our PC, as long as the number of bits is not exceeded; a game console, for example), but I never managed to understand or find out how they do that...
If you have any ideas, I'm all ears :) Thanks!

_________________
The best way to predict the future is to invent it.
Post 16 Apr 2018, 18:47
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15971
Location: Spacecraft Planet Earth
Moved to Heap.
Post 16 Apr 2018, 22:15
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15971
Location: Spacecraft Planet Earth
I think it is more than OS virtualisation; it is machine virtualisation, hence the name VM.

There are some open source VMs around, e.g. QEMU. Have a look.
Post 16 Apr 2018, 22:23
sleepsleep



Joined: 05 Oct 2006
Posts: 7498
AFAIK,
you implement a copy of the registers and parse the program as usual, but values are loaded into your register variables instead of the machine's physical registers.

Calculations are processed however you decide to handle your register variables. If, let's say, you want to add 5 to every value loaded into EAX, you can do so.

If you decide to ignore some specific instructions, you can do that too.

Regarding the bits thing, AFAIK
the bitness is not much of an issue: you could simulate 64 bits on a 32-, 16- or 8-bit machine if you want. It only multiplies the work per instruction (for an addition, roughly 2, 4 or 8 steps instead of one).
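
To make the bitness point concrete, here is a minimal sketch of a 64-bit addition done with nothing wider than 32-bit arithmetic. The `u64_pair` type and the `add64` name are made up for illustration; a real emulator would also have to update the emulated flags.

```c
#include <stdint.h>

/* Hypothetical 64-bit emulated register, stored as two 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

/* Add two 64-bit values using only 32-bit operations:
 * one add for the low halves, one add-with-carry for the high halves. */
static u64_pair add64(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;                 /* wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);     /* 1 if the low half overflowed */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```

So one emulated 64-bit add costs two 32-bit adds plus the carry handling, which is the "double the processing" mentioned above.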
Post 16 Apr 2018, 23:07
Furs



Joined: 04 Mar 2016
Posts: 1225
Mino wrote:
Good morning,
I'm sorry if this is the wrong forum, but it seemed suitable to me :)
So here's my question: how are operating system virtualization programs, such as VMware or VirtualBox, made?
I know they simulate an architecture (even one different from our PC, as long as the number of bits is not exceeded; a game console, for example), but I never managed to understand or find out how they do that...
If you have any ideas, I'm all ears :) Thanks!
There's a difference between virtualization and emulation.

Emulation can run any CPU because it emulates the instructions, so you can emulate a different CPU than x86 or whatever the hypervisor (the emulator) runs on. This makes it slow, though.

Virtualization executes most of the code directly on the CPU.

It uses exception handling to deal with privileged and other instructions that the guest is not allowed to execute directly (e.g. operating system instructions inside the virtual machine). It has to emulate those instructions so that the virtual machine thinks it's running on real hardware. But most of the user code is not emulated; it's run directly on the CPU. With the specialized virtualization extensions in newer CPUs (e.g. Intel VT-x, AMD-V), even the handling of privileged instructions is fast (no longer done purely in software). I'm not very familiar with it though.

With virtualization you will rarely find it's noticeably slower than running "natively", because most of the code is run natively. Most people don't get this, sadly.

The exceptions are hardware devices, such as the graphics card, which will be slow because they have to be emulated (so playing games in a VM is much slower) unless the hypervisor pulls tricks like hardware acceleration (but that's "specialized").

Of course, you can "pass through" the GPU directly to the guest with some effort and then you can play a top end game in a VM just fine at almost native speeds... assuming you have 2 GPUs (because you can't use a single GPU on both the host and guest at the same time). Also you can pass through USB devices directly.

So if the reason you're using, say, Windows is due to a USB device and its drivers, think again: a VM will work just fine.
Post 17 Apr 2018, 11:50
Mino



Joined: 14 Jan 2018
Posts: 156
Okay, thank you very much for your answers, it's already much clearer to me :D
Post 17 Apr 2018, 13:27
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15971
Location: Spacecraft Planet Earth
Furs wrote:
There's a difference between virtualization and emulation.
I think that emulation can also be called virtualisation; the term encompasses both. You can create a virtual machine in an emulator. The process of virtualisation doesn't require using the native CPU instructions; that is merely an implementation detail.
Post 17 Apr 2018, 13:31
Mino



Joined: 14 Jan 2018
Posts: 156
But now that I think about it: how are the architectures simulated (or emulated, it's up to you ^^)?
How are the registers, memory, instructions, ... simulated?
I already tried to reproduce in C the instruction set of a small board, with one zone reserved for allocations and others in the real RAM, and then to launch this program with a fake .iso (an asm file which, once assembled into an .iso, would display "Hello, world!"); but I never succeeded...
Post 17 Apr 2018, 15:35
Furs



Joined: 04 Mar 2016
Posts: 1225
Mino wrote:
But now that I think about it: how are the architectures simulated (or emulated, it's up to you ^^)?
How are the registers, memory, instructions, ... simulated?
I already tried to reproduce in C the instruction set of a small board, with one zone reserved for allocations and others in the real RAM, and then to launch this program with a fake .iso (an asm file which, once assembled into an .iso, would display "Hello, world!"); but I never succeeded...
You can emulate them however you want, as long as they produce the same states and output as the emulated CPU.

In C, the simplest way is to just use a CPU state with all registers and other stuff, and use a big switch statement to "execute" the emulated instructions' opcodes. You have to know what every opcode does and then simply do the same thing in your C program (basically reimplementing an x86 CPU's logic in software). Of course, the emulated "registers" will probably live on the stack (in memory); that's one reason this kind of emulation is so slow.

Faster (but MUCH more complex) emulation tends to "translate" code on the fly: it looks at a large sequence of instructions and "converts" them to an equivalent sequence for the native CPU. When emulating x86 you have to be super careful since self-modifying code is a thing, so binary translation is more complex.

Maybe you should look at byte-code languages, which are typically more basic since they're made to be run like this, unlike a real CPU's instruction set. E.g. look at Lua bytecode, which has a simple instruction set (it's a "virtual CPU instruction set", if you will). It has source code for the interpreter (the part that "executes" the byte code) with a switch statement (so it's slow, but simple).


EDIT: I think I get your confusion. You need to literally emulate the whole CPU. You can't cut corners. This includes the instruction pointer. You can see where this is going.

Set a variable to be the emulated instruction pointer (rip/eip), then check the opcode(s) at *eip, handle them in switch, then increment eip or change it (if jump) etc. And simply loop again.
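
The loop described above can be sketched like this. It is only a toy: the four opcodes, their encodings, and the names `cpu_state` and `run` are invented for the example and have nothing to do with real x86 encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes; these are NOT real x86 encodings. */
enum { OP_HALT = 0x00, OP_LOAD = 0x01, OP_ADD = 0x02, OP_JMP = 0x03 };

typedef struct {
    uint32_t reg[4];  /* emulated general-purpose registers */
    size_t   ip;      /* emulated instruction pointer (index into mem) */
} cpu_state;

/* Fetch, decode and execute until OP_HALT, an unknown opcode, or the end
 * of memory. Returns the number of instructions executed before stopping. */
static size_t run(cpu_state *cpu, const uint8_t *mem, size_t mem_size)
{
    size_t steps = 0;
    while (cpu->ip < mem_size) {
        uint8_t opcode = mem[cpu->ip];              /* fetch */
        switch (opcode) {                           /* decode */
        case OP_LOAD:                               /* LOAD reg, imm8 */
            if (cpu->ip + 2 >= mem_size) return steps;
            cpu->reg[mem[cpu->ip + 1] & 3] = mem[cpu->ip + 2];
            cpu->ip += 3;
            break;
        case OP_ADD:                                /* ADD dst, src */
            if (cpu->ip + 2 >= mem_size) return steps;
            cpu->reg[mem[cpu->ip + 1] & 3] += cpu->reg[mem[cpu->ip + 2] & 3];
            cpu->ip += 3;
            break;
        case OP_JMP:                                /* JMP addr8 */
            if (cpu->ip + 1 >= mem_size) return steps;
            cpu->ip = mem[cpu->ip + 1];             /* change ip instead of incrementing it */
            break;
        case OP_HALT:
        default:                                    /* stop on halt or unknown opcode */
            return steps;
        }
        steps++;
    }
    return steps;
}
```

To use it, put the program bytes in an array, zero a `cpu_state`, and call `run(&cpu, program, sizeof program)`; afterwards the emulated registers hold whatever the program computed. A real x86 emulator has the same shape, just with far more cases, plus flags, memory, and variable-length decoding.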
Post 17 Apr 2018, 15:40
Mino



Joined: 14 Jan 2018
Posts: 156
Great ;) It sounds simpler when put that way :D
But I imagine that with this emulation method (with the switch statement and everything...) it is not possible to emulate a system like Windows (even setting aside the slowness, which would be really noticeable).

Thanks for your answers anyway!
Post 17 Apr 2018, 18:05
Furs



Joined: 04 Mar 2016
Posts: 1225
You don't have to emulate Windows, you emulate the processor and hardware. Then you can just install normal Windows onto it. VMWare and VirtualBox don't emulate Windows either (they do provide drivers for better integration in the guest though).

Why would it not be possible? It's possible to emulate any computer whatsoever with this method -- except it will be super slow of course.
Post 17 Apr 2018, 22:20
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15971
Location: Spacecraft Planet Earth
Furs wrote:
EDIT: I think I get your confusion. You need to literally emulate the whole CPU. You can't cut corners. This includes the instruction pointer.
Yes. And also the hardware: keyboard, interrupts, ports, buses, RAM, display, etc.
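
As a rough illustration of the hardware side, here is a sketch of how an emulated port write might be routed to an emulated device. Everything here (`machine`, `serial_dev`, `port_write`) is a made-up name for the example; 0x3F8 is the conventional PC COM1 data port and is used only as a familiar number. A real serial chip has many more registers than this.

```c
#include <stddef.h>
#include <stdint.h>

#define SERIAL_PORT 0x3F8u  /* conventional COM1 data port, example only */

/* Hypothetical emulated serial device: it just collects the bytes written. */
typedef struct {
    char   out[64];
    size_t len;
} serial_dev;

/* Hypothetical machine state: one device plus a counter for stray writes. */
typedef struct {
    serial_dev serial;
    unsigned   unclaimed_writes;
} machine;

/* Called by the CPU emulation whenever the guest executes a port write. */
static void port_write(machine *m, uint16_t port, uint8_t value)
{
    switch (port) {
    case SERIAL_PORT:
        /* Keep room for the terminating NUL. */
        if (m->serial.len + 1 < sizeof m->serial.out) {
            m->serial.out[m->serial.len++] = (char)value;
            m->serial.out[m->serial.len] = '\0';
        }
        break;
    default:
        /* No emulated device claims this port; this sketch only counts it. */
        m->unclaimed_writes++;
        break;
    }
}
```

The CPU loop would call `port_write` from its handler for the emulated OUT instruction, and a matching `port_read` would serve IN. Keyboard, timer, display and so on follow the same pattern, each with its own state.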
Post 18 Apr 2018, 00:51
sleepsleep



Joined: 05 Oct 2006
Posts: 7498
input -> process -> output

emulation:
producing the exact same output by mimicking how the processor processes its input,

virtualization:
the process of running computer instructions on a mimicked processor
Post 18 Apr 2018, 01:13
rugxulo



Joined: 09 Aug 2005
Posts: 2302
Location: Usono (aka, USA)
Mino wrote:

But I imagine that with this emulation method (with the switch statement and everything...) it is not possible to emulate a system like Windows (even setting aside the slowness, which would be really noticeable).


random Google search (hopefully accurate) wrote:

Standard C specifies that a switch can have at least 257 case statements.


So it depends on how many instructions (and prefixes? or whatever) that your particular version number of Windows (XP? 7?) has a hard requirement on. (Although maybe you can nest or just use multiple switches?)
Post 23 Apr 2018, 05:01
Furs



Joined: 04 Mar 2016
Posts: 1225
Well usually you handle only 1 byte in a switch, since the opcode is in 1 byte (some of them extend it of course, those are handled in a nested switch).
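
A small sketch of that nesting, assuming only a handful of opcodes. The `insn` enum and `decode` function are invented for the example; the byte values themselves are real x86 (0x90 is NOP, 0xC3 is near RET, 0x0F is the two-byte escape, 0x0F 0x31 is RDTSC, 0x0F 0xA2 is CPUID). Prefixes, ModRM bytes and operands are ignored entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef enum {
    INSN_TRUNCATED, INSN_UNKNOWN, INSN_NOP, INSN_RET, INSN_CPUID, INSN_RDTSC
} insn;

/* Decode one instruction at code[0..len). *used receives the opcode length. */
static insn decode(const uint8_t *code, size_t len, size_t *used)
{
    *used = 0;
    if (len < 1) return INSN_TRUNCATED;
    *used = 1;
    switch (code[0]) {                  /* outer switch: first opcode byte */
    case 0x90: return INSN_NOP;
    case 0xC3: return INSN_RET;
    case 0x0F:                          /* escape byte: opcode continues */
        if (len < 2) return INSN_TRUNCATED;
        *used = 2;
        switch (code[1]) {              /* nested switch: second opcode byte */
        case 0x31: return INSN_RDTSC;
        case 0xA2: return INSN_CPUID;
        default:   return INSN_UNKNOWN;
        }
    default:
        return INSN_UNKNOWN;
    }
}
```

Each switch covers at most 256 values, so neither level gets near any case-count limit.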
Post 28 Apr 2018, 23:07
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 15971
Location: Spacecraft Planet Earth
If your switch table is 256 cases long and you need more, then it is probably time to consider a different approach, e.g. a simple jump table could suffice.
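
One portable way to build such a table in C is an array of function pointers indexed by the opcode byte. This is only a sketch with a made-up three-instruction machine (`vm`, `op_inc`, `op_dbl`, `op_halt` and `run_table` are all invented names), and each dispatch is an indirect call rather than a direct jump.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical machine state: one accumulator and a halt flag. */
typedef struct {
    uint32_t acc;
    int      halted;
} vm;

typedef void (*handler)(vm *);

static void op_halt(vm *m) { m->halted = 1; }
static void op_inc(vm *m)  { m->acc += 1; }
static void op_dbl(vm *m)  { m->acc *= 2; }
static void op_bad(vm *m)  { m->halted = 1; }   /* unknown opcode: stop */

static handler table[256];                      /* one slot per opcode byte */

static void init_table(void)
{
    for (int i = 0; i < 256; i++)
        table[i] = op_bad;
    table[0x00] = op_halt;
    table[0x01] = op_inc;
    table[0x02] = op_dbl;
}

/* Execute the byte stream and return the final accumulator value. */
static uint32_t run_table(const uint8_t *code, size_t len)
{
    vm m = { 0, 0 };
    init_table();
    for (size_t i = 0; i < len && !m.halted; i++)
        table[code[i]](&m);                     /* indexed indirect call */
    return m.acc;
}
```

Adding an instruction means writing one handler and filling one slot; no switch is involved at all.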
Post 29 Apr 2018, 01:32
Furs



Joined: 04 Mar 2016
Posts: 1225
Well, a switch in C (or C++) usually compiles to a jump table anyway, especially when the case values are close together (compilers usually won't do the same for if/else chains, though). There's no "standard" way to get a jump table of direct jumps in C or C++ without relying on the optimizer via switch (which is usually good enough).

But GCC has some extensions: raw "asm goto" to use an indirect jump from inline asm (but max 31 operands/labels IIRC), or "indirect goto" with a manual table (e.g. goto *ptr). The former is good, but the operand limit can be a pain; you might be able to chain them, though, with a "fake" one that emits no instructions (it's needed so GCC knows where the jump can end up).

The problem with the latter is that GCC usually won't know what the possible jump targets are, so it will flush registers to memory etc. and not optimize across the jump. I don't know if it can analyze this statically; maybe it can if you use a constant table (an array of pointers to labels).

Well also at least with these methods you can optimize the jump tables if you know they're, say, less than 65536 bytes apart, so use 16-bit offsets instead of addresses. You can't do that with switch, since it will probably use pointers. A good gain for cache use. Especially on x64 since it has 8-byte pointers (that's 4 times larger than 16-bit offsets, so 4 times the amount of cache wasted).
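
For reference, the "indirect goto" with a constant table of label pointers looks roughly like this. It relies on the GNU C "labels as values" extension (`&&label` and `goto *ptr`), so it is not standard C and needs GCC or a compatible compiler. The toy opcodes and the `run_goto` name are invented for the example, and this version uses full pointers rather than 16-bit offsets.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy interpreter: opcode 1 increments, 2 doubles, anything else halts.
 * Requires the GNU C "labels as values" extension. */
static uint32_t run_goto(const uint8_t *code, size_t len)
{
    /* Constant table of label addresses, indexed by (opcode & 3). */
    static void *const targets[4] = { &&do_halt, &&do_inc, &&do_dbl, &&do_halt };
    uint32_t acc = 0;
    size_t pc = 0;

/* Fetch the next opcode and jump straight to its handler. */
#define DISPATCH()                                  \
    do {                                            \
        if (pc >= len) goto do_halt;                \
        goto *targets[code[pc++] & 3];              \
    } while (0)

    DISPATCH();
do_inc:
    acc += 1;
    DISPATCH();
do_dbl:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;
#undef DISPATCH
}
```

Each handler ends with its own indirect jump instead of going back through one central switch, which is the usual reason people reach for this extension.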
Post 29 Apr 2018, 12:16
Copyright © 1999-2018, Tomasz Grysztar.