flat assembler
Message board for the users of flat assembler.

Hypervisors - Challenges in Building Virtualization Software

HyperVista



Joined: 18 Apr 2005
Posts: 691
Location: Virginia, USA
HyperVista 05 Sep 2006, 13:35
Hypervisors are an interesting and exciting area of software development that is getting quite a bit of attention lately. There are many technical challenges in building hypervisors; the most demanding are 17 problematic instructions in the Intel ISA. However, with the new processors from Intel (VMX) and AMD (SVM), the problems presented by these 17 instructions are solved in hardware, paving the way for new kinds of effective and efficient virtualization software. It would be great for the FASM community to build some software using these new processors to show off the power of FASM. I'm currently writing my hypervisor using FASM (mostly C language with some FASM thrown in at the right places ;) ). Once I get further along, I hope to publish some articles in US software developer magazines about my hypervisor and the benefits of using FASM for such work.

Based on a question vid posed about the problematic instructions, the following was my response. Because of its length, vid asked that I consolidate it into a new thread. So here it is ...

There are 17 instructions in the Intel ISA that need to be addressed in some fashion in order to achieve true virtualization. Until hardware support for virtualization (VMX and SVM) became a reality recently, the preferred method of dealing with these 17 sensitive instructions was to modify the guest OS (a method known as "para-virtualization"). Naturally, to make these modifications you need the OS source code and you need to recompile the OS. This is not a problem for Linux, but a huge problem with Windows since we don't have access to the source code. So, the only method left to us for virtualizing Windows is dynamic binary translation of those 17 sensitive instructions: scan all code for these instructions and patch them at run-time. This costs a lot of performance and is, in part, the cause of the "lag" you see when running Windows in Bochs or VMWare.
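To make the scan-and-patch idea concrete, here is a deliberately naive sketch. It only looks for the single-byte opcodes of PUSHF (0x9C) and POPF (0x9D) and overwrites them with INT3 (0xCC) so that they would trap. The function name `patch_sensitive` is made up for illustration, and treating every byte as an opcode is a simplifying assumption; a real translator has to decode instruction boundaries first.

```python
# Single-byte x86 opcodes: PUSHF, POPF, and INT3 (breakpoint).
PUSHF, POPF, INT3 = 0x9C, 0x9D, 0xCC


def patch_sensitive(code):
    """Replace every PUSHF/POPF byte with INT3; return (patched, offsets).

    Simplifying assumption: every byte is treated as an opcode. Operand
    and immediate bytes that happen to equal 0x9C or 0x9D would be
    patched by mistake, which is why real tools decode instructions.
    """
    patched = bytearray(code)
    offsets = []
    for i, byte in enumerate(code):
        if byte in (PUSHF, POPF):
            patched[i] = INT3
            offsets.append(i)  # the trap handler needs to know what was here
    return bytes(patched), offsets
```

The recorded offsets are what a trap handler would use to look up and emulate the original instruction. Doing this for all guest code at run-time is where the performance cost comes from.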

In order to support a Type I VMM (hypervisor), a processor must meet three virtualization requirements:

1. The method of executing non-privileged instructions must be roughly equivalent in both privileged and user mode. A processor must not use an additional bit in an instruction word or in the address portion of an instruction when in privileged mode.

2. There must be a method such as a protection system or an address translation system to protect the real system and any other VMs from the active VM.

3. There must be a way to automatically signal the VMM (hypervisor) when a VM attempts to execute one of the 17 sensitive instructions. It must also be possible for the hypervisor to simulate the effect of the instruction. Sensitive instructions include:

3A. Instructions that attempt to change or reference the mode of the VM or the state of the machine.

3B. Instructions that read or change sensitive registers and/or memory locations such as the clock register and interrupt registers.

3C. Instructions that reference the storage protection system, memory system, or address relocation system. This class of instruction includes instructions that would allow the VM to access any location not in its virtual memory.

3D. All I/O instructions.

The 17 sensitive instructions I mentioned each fall into one of the classes 3A - 3D above, yet they do not trap when executed outside ring 0, so they violate requirement 3.
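Requirement 3 boils down to a set comparison: every sensitive instruction has to be one that traps. The sketch below uses a small sample of instructions, not the full list, and the set contents and the function name `non_trapping` are chosen here only for illustration.

```python
# A sample of x86 instructions that fault when executed outside ring 0.
PRIVILEGED = {"LGDT", "LIDT", "LLDT", "LTR", "LMSW", "CLTS", "HLT"}

# Sensitive instructions: the privileged sample plus a few of the
# unprivileged ones discussed in this post.
SENSITIVE = PRIVILEGED | {"SGDT", "SIDT", "SLDT", "SMSW", "PUSHF", "POPF", "STR"}


def non_trapping(sensitive, privileged):
    """Return the sensitive instructions the hypervisor is never told about."""
    return sorted(set(sensitive) - set(privileged))
```

With these sample sets, `non_trapping(SENSITIVE, PRIVILEGED)` lists the store-side instructions (SGDT, SIDT and so on) while their load-side counterparts (LGDT, LIDT) drop out because they already fault.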

Several of the 17 violate requirement 3B (sensitive register instructions), namely: SGDT (Store Global Descriptor Table), SIDT (Store Interrupt Descriptor Table), and SLDT (Store Local Descriptor Table). These instructions are normally only used by the OS but are NOT privileged in the Intel Architecture. Since Intel processors only have one LDTR, IDTR and GDTR, a problem arises when multiple operating systems try to use the same registers: a guest that stores one of them sees whatever the hypervisor loaded, not the value the guest believes it set.
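A toy model of the single-GDTR problem, assuming 32-bit mode where SGDT writes a 6-byte image (16-bit limit followed by 32-bit base). The class `SharedCpu`, the helper `parse_sgdt`, and all addresses are invented for the example.

```python
import struct


def parse_sgdt(image):
    """Decode the 6-byte image SGDT writes in 32-bit mode: limit, then base."""
    limit, base = struct.unpack("<HI", image[:6])
    return base, limit


class SharedCpu:
    """Toy CPU with one physical GDTR and an SGDT that never traps."""

    def __init__(self, base, limit):
        self.gdtr = (base, limit)  # value loaded by the hypervisor

    def sgdt(self):
        # Unprivileged: any guest gets the physical register contents.
        base, limit = self.gdtr
        return struct.pack("<HI", limit, base)
```

If the hypervisor's table lives at, say, 0xFFC07000 while the guest believes it loaded a table at 0x00090000, `parse_sgdt(cpu.sgdt())` hands the guest the hypervisor's base and limit, and nothing gives the hypervisor a chance to substitute the guest's virtual value.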

The next sensitive instruction is the SMSW (Store Machine Status Word) instruction. SMSW stores the machine status word (bits 0 - 15 of CR0) into a general purpose register or memory location. Bits 6 - 15 of CR0 are reserved and not to be modified. However, bits 0 - 5 contain system flags that control the operating mode and state of the processor. Although SMSW only stores the machine status word, it is sensitive and unprivileged. You can see the problem if a guest OS (VM) is running in real mode within a hypervisor (VMM) running in protected mode. If the VM checked the MSW to see if it was in real mode, it would incorrectly see that it was in protected mode (PE bit set) and could halt or shut down and not be able to run successfully.
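A small sketch of what the guest sees, using the documented names of CR0 bits 0 - 5 (PE, MP, EM, TS, ET, NE). The helper names and the sample MSW values are assumptions made for illustration.

```python
# Names of the system flags in bits 0 - 5 of CR0 (the low part of the MSW).
CR0_FLAGS = {0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE"}


def decode_msw(msw):
    """Return the set of flag names that are set in a machine status word."""
    return {name for bit, name in CR0_FLAGS.items() if msw & (1 << bit)}


def guest_sees_expected_mode(msw, expects_real_mode):
    """True if the PE bit read via SMSW matches what the guest believes."""
    in_protected_mode = "PE" in decode_msw(msw)
    return in_protected_mode != expects_real_mode
```

A guest that believes it is in real mode but reads an MSW such as 0x0011 (PE and ET set, because the hypervisor runs in protected mode) gets `False` from `guest_sees_expected_mode`, which is exactly the mismatch described above.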

The next two sensitive instructions are PUSHF and POPF (and their 32-bit versions PUSHFD and POPFD). The issue with these instructions is similar to SMSW because pushing the EFLAGS register onto the stack allows examination of operating mode and state. POPF allows some of the EFLAGS bits to be changed; which ones depends on the processor's current operating mode. In real mode, or when operating at CPL 0, all non-reserved flags in the EFLAGS register can be modified except for the VM, VIP, and VIF flags. In virtual-8086 mode, the IOPL must equal 3 to use the POPF instruction, and the key flags (VM, RF, IOPL, VIP, VIF) are not affected by it. (The IOPL allows an OS to set the privilege level needed to perform I/O.) In protected mode, there are several conditions based on privilege levels. For example, if CPL is greater than 0 and less than or equal to the IOPL, all flags can be modified except IOPL. If POPF/POPFD is executed without enough privilege, an exception is NOT generated; the protected flags are simply left unchanged, so the hypervisor never gets a chance to intervene.
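The silent behaviour of POPF in protected mode can be sketched as follows. This toy model only tracks IF (bit 9) and the IOPL field (bits 12 - 13) and ignores every other flag; the function name `popf_protected` is made up.

```python
IF_MASK = 1 << 9        # interrupt enable flag
IOPL_MASK = 0b11 << 12  # I/O privilege level field


def popf_protected(eflags, popped, cpl):
    """Return EFLAGS after POPF in protected mode (IF and IOPL only).

    Bits the caller may not change keep their old value. No exception
    is raised, which is what hides the attempt from a hypervisor.
    """
    iopl = (eflags & IOPL_MASK) >> 12
    protected = 0
    if cpl > 0:
        protected |= IOPL_MASK  # only ring 0 may change IOPL
    if cpl > iopl:
        protected |= IF_MASK    # the IF change is silently dropped
    return (popped & ~protected) | (eflags & protected)
```

A deprivileged guest kernel at CPL 3 with IOPL 0 that tries to disable interrupts by popping a value with IF clear ends up with IF still set: `popf_protected(0x200, 0x000, 3)` returns `0x200`, whereas the same call at CPL 0 returns `0`.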

The next set of the 17 sensitive instructions violates requirement 3C above (protection system references): LAR (Load Access Rights byte), LSL (Load Segment Limit), and VERR/VERW (Verify a segment for reading or writing). The problem with these instructions is that they all perform the following check during their execution: (CPL > DPL) OR (RPL > DPL). This condition tests whether the current privilege level (located in bits 0 and 1 of the CS and SS registers) or the requested privilege level (bits 0 and 1 of the segment selector) is numerically greater than the descriptor privilege level (the privilege level of the segment); if either is, the segment is reported as not accessible. This is a problem because prior to VMX and SVM, VMs don't normally execute at the highest privilege level (CPL 0). For example, in 32-bit paravirtualized Xen, guest kernels run at CPL 1 (ring 1) - they only "think" they are running at CPL 0. Therefore, if a VM running at a CPL greater than 0 executes any of LAR, LSL, VERR or VERW to examine a segment descriptor with a DPL lower than that CPL, it is likely that the instruction will not execute properly.
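Reading the check as numeric comparisons, (CPL > DPL) OR (RPL > DPL), it fits in one line. The sketch assumes a nonconforming segment (conforming code segments are treated differently), and the name `segment_visible` is invented here.

```python
def segment_visible(cpl, rpl, dpl):
    """True if LAR/LSL/VERR/VERW would report a nonconforming segment
    as accessible; False if (CPL > DPL) or (RPL > DPL)."""
    return not (cpl > dpl or rpl > dpl)
```

A guest kernel that really runs at CPL 0 gets `segment_visible(0, 0, 0) == True` for its own ring-0 segment. The same kernel deprivileged to CPL 3 gets `segment_visible(3, 0, 0) == False`, with no trap that would let a hypervisor correct the answer.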

POP and PUSH are also included in this category of problematic instructions for similar reasons. POP cannot be used to load the CS register since it contains the CPL. A value that is loaded into a segment register must be a valid segment selector. The reason that POP is one of the problematic 17 instructions is that it depends on the value of CPL. If the SS register is being loaded and the segment selector's RPL and the segment descriptor's DPL are not equal to the CPL, a general protection exception is raised. Furthermore, if the DS, ES, FS, or GS register is being loaded, the segment being pointed to is a data or nonconforming code segment, and the RPL and CPL are greater than the DPL, a general protection exception is raised. Therefore, as in the case with LAR, LSL, VERR and VERW, if the VM is at CPL 3 (ring 3) and did a privilege level check it would likely fail because it thinks it's running at CPL 0. PUSH is a problem in the other direction: if a process that thinks it's running at CPL 0 pushes CS onto the stack and checks its CPL, it will see that it's running at CPL 3 and may crash.
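The SS case is the easiest one to model: both the selector's RPL and the descriptor's DPL have to equal the CPL. The sketch leaves out the other checks (table limits, writable data segment, present bit), and `load_ss` and `GeneralProtectionFault` are illustrative names.

```python
class GeneralProtectionFault(Exception):
    """Stands in for #GP in this toy model."""


def load_ss(cpl, selector_rpl, descriptor_dpl):
    """Model only the privilege part of loading SS via POP or MOV."""
    if selector_rpl != cpl or descriptor_dpl != cpl:
        raise GeneralProtectionFault(
            f"RPL={selector_rpl} DPL={descriptor_dpl} CPL={cpl}"
        )
    return selector_rpl
```

A guest kernel that pops its ring-0 stack selector (RPL 0, DPL 0) succeeds when it truly runs at CPL 0, but `load_ss(3, 0, 0)` raises when the same code has been pushed down to ring 3.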

The next set of problematic instructions are CALL, JMP, INT n, and RET. CALL saves procedure linking information on the stack and branches to the procedure given in its destination argument. There are four types of calls: near calls, far calls to the same privilege level, far calls to a different privilege level, and task switches. Task switches and far calls to different privilege levels are a problem for virtualization because they involve CPL, DPL, and RPL. If a far call is executed to a different privilege level, the code segment for the procedure being called has to be accessed through a call gate. A task uses a different stack for every privilege level. Therefore, when a far call is made to another privilege level, the processor switches to a stack corresponding to the new privilege level of the called procedure. A task switch operates in a similar manner to a call gate. (The main difference is that the target operand of the call instruction specifies the segment selector of a task gate instead of a call gate.) Both call gates and task gates involve many privilege level checks that compare the CPL and RPL to DPLs. Since the VM is running at a less privileged level than it believes, these checks won't work properly when the guest OS tries to access call gates or task gates at CPL 0.

The JMP and INT n instructions have similar problems for virtualization. (The INT n instruction references the protection system many times during its execution.)

The RET instruction has the opposite effect of CALL in that it transfers control to a return address placed on the stack (normally by CALL). RET can be used for three different types of returns: near, far, and inter-privilege-level returns. Much like the CALL instruction, the inter-privilege-level far return examines the privilege levels and access rights of the code and stack segments being returned to in order to determine if the operation should be allowed. The DS, ES, FS and GS segment registers are cleared by the RET instruction if they refer to segments that cannot be accessed at the new privilege level. Therefore, RET is problematic for virtualization because a VM running at CPL 3 could cause the DS, ES, FS and GS segment registers to not be cleared when they should be.
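Two of the rules above can be sketched as plain comparisons. Both functions are toy models with invented names; they assume nonconforming code and ordinary data segments and skip every check that is not about privilege levels.

```python
def call_gate_allows(cpl, rpl, gate_dpl, target_dpl):
    """Far CALL through a call gate to a nonconforming code segment.

    The gate must be reachable (CPL and RPL no greater than the gate's
    DPL) and the target must be at least as privileged as the caller.
    """
    return max(cpl, rpl) <= gate_dpl and target_dpl <= cpl


def cleared_on_ret(old_cpl, new_cpl, segment_dpls):
    """Names of data segment registers RET would clear.

    segment_dpls maps a register name to the DPL of the segment it
    currently refers to. Clearing only happens on a return to an outer
    (numerically greater) privilege level.
    """
    if new_cpl <= old_cpl:
        return []
    return sorted(reg for reg, dpl in segment_dpls.items() if dpl < new_cpl)
```

The RET problem shows up when comparing what the guest expects with what the hardware does. A guest that believes it returns from ring 0 to ring 3 expects `cleared_on_ret(0, 3, {"DS": 0, "ES": 3})` to give `["DS"]`. If it actually runs at CPL 3 there is no privilege change, and `cleared_on_ret(3, 3, {"DS": 0, "ES": 3})` gives `[]`.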

The next problematic instruction of the 17 is STR (Store Task Register), because it references the protection system. STR stores the segment selector from the task register into a general purpose register or memory location. The segment selector that is stored with this instruction points to the task state segment of the currently executing task. This instruction is problematic for virtualization because it allows a task to examine its requested privilege level (RPL).

The last problematic instruction for virtualization is (believe it or not) MOV. The MOV opcode that stores segment registers allows all six of the segment registers to be stored to either a general purpose register or memory location. This is a problem because the CS and SS registers both contain the CPL in bits 0 and 1. Thus, a task could store the CS or SS in a general purpose register to find that it's not running at the expected CPL. The MOV opcode that loads segment registers does offer some protection because it won't allow the CS register to be loaded at all. However, if a task tries to load the SS register, several privilege level checks occur that become problematic for the reasons already explained.
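What a guest can learn from a stored selector is just a matter of bit fields: bits 0 - 1 hold the privilege level, bit 2 selects GDT or LDT, and bits 3 - 15 are the table index. The function names and the sample selector values below are illustrative only.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into its three fields."""
    return {
        "index": selector >> 3,       # descriptor table index
        "ti": (selector >> 2) & 1,    # 0 = GDT, 1 = LDT
        "rpl": selector & 0b11,       # privilege level bits
    }


def observed_cpl(cs_selector):
    """The CPL a task sees after storing CS with MOV or PUSH."""
    return cs_selector & 0b11
```

A kernel that expects `observed_cpl(0x08) == 0` but finds its CS is really something like 0x1B (index 3, GDT, level 3) has just discovered that it is not running in ring 0.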

So ..... those are the 17 sensitive, or problematic, instructions that make virtualization of Windows difficult. The current version of Xen (and Parallels, for that matter) handles this situation by taking advantage of the new virtualization features in the latest Intel and AMD processors (VMX and SVM, respectively). These hardware based virtualization features permit multiple true CPL 0 levels. Therefore, VMs (guests) don't need to be "tricked" into "thinking" they are running at ring 0 when they are actually running in a less privileged ring. With these new processors, they really are running at ring 0.

This is one reason I'm very excited about VMX and SVM. It opens up many new possibilities for effective and efficient virtualization software. Hardware supported virtualization (VMX and SVM) is a relatively new software development space and I think FASM can show its power here.
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid 05 Sep 2006, 13:41
Quote:
and I think FASM can show its power here

mostly agree with that, especially because of its ability to generate "unstandard" code formats.

PS: i'm moving this thread to Main, and linking from important/interesting topics
halyavin



Joined: 21 Aug 2004
Posts: 42
halyavin 10 Sep 2006, 07:45
But what happens if Windows starts to use these technologies itself? You will again have a set of N problematic instructions.
HyperVista



Joined: 18 Apr 2005
Posts: 691
Location: Virginia, USA
HyperVista 10 Sep 2006, 14:12
halyavin wrote:
Quote:
But what happens if Windows starts to use these technologies itself? You will again have a set of N problematic instructions.

Microsoft is very busy now writing their own hypervisor for inclusion in Windows (they are about two years away from completing it ... they are very behind schedule).

In the absence of VMX and SVM support in the processor, you are correct, these instructions continue to be problematic. The workarounds developed thus far by VMWare, Microsoft (Virtual PC), and others are effective, but slow and performance draining.

With VMX and SVM support in the processor, these issues go away because the loaded virtual machines actually run in multiple true ring0 and ring3 configurations. You will note that the common theme of most of the problematic instructions revolves around the VM being able to determine it's not running at ring0. VMX and SVM permit multiple ring0 and ring3 configurations simultaneously. The result is that virtualization software doesn't have to perform dynamic run-time trapping and binary translation of these problematic instructions.

Right now, there are very few software products or applications that support VMX and SVM (Xen 3.0 does and so does Parallels). These two products are strictly hypervisors in that they facilitate launching of multiple OSes.

The VMX and SVM processors are relatively new and software hasn't caught up with this new technology just yet. Look at the sensational splash in the IT news recently over the Bluepill project (a hypervisor based rootkit utilizing AMD's SVM). http://theinvisiblethings.blogspot.com/2006/06/introducing-blue-pill.html

Intel has quite a few processors already on the market that support VMX. http://www.intel.com/products/processor_number/proc_info_table.pdf. I suspect most new processors from Intel and AMD will have support for virtualization from this point forward. Any new computers sold by Dell, Gateway, Toshiba, etc. will likely have virtualization support (VMX or SVM) built-in. Many users won't even know it's there because there are no application suites that take advantage of it .... yet. :D
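If you want to find out whether a machine already has the feature, one low-effort route on Linux is the `flags` line of `/proc/cpuinfo`, where the kernel reports `vmx` or `svm`. The sketch below assumes that text format; the function name `virtualization_flags` is made up. (At the hardware level the same information comes from CPUID: leaf 1, ECX bit 5 for VMX, and leaf 80000001h, ECX bit 2 for SVM.)

```python
def virtualization_flags(cpuinfo_text):
    """Return the subset of {"vmx", "svm"} listed in /proc/cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found
```

On a Linux box this would be called as `virtualization_flags(open("/proc/cpuinfo").read())`. An empty result can also mean the feature is present but hidden or disabled by firmware, so treat it as a hint rather than proof.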

If anyone on this board is interested in this new technology and in writing FASM applications to demonstrate the power of hypervisor support in the new processors from Intel and AMD, I strongly urge you to do so. A few of us on this board are putting together a project that will showcase FASM's power in this area. More on that as soon as we have something to show. 8)

Can you tell this is a passion of mine?? ;)
okasvi



Joined: 18 Aug 2005
Posts: 382
Location: Finland
okasvi 10 Sep 2006, 15:50
HyperVista wrote:
Can you tell this is a passion of mine?? ;)

I think your nick already does tell us something :)

_________________
When We Ride On Our Enemies
support reverse smileys |:
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 22 Sep 2006, 15:09
Like I always said -- good old DOS, being simple and no protection, has many more possibilities than Windows. ;)

If I understood correctly, this "virtualization in hardware" is actually needed to bypass the ring3 protection of the OS, no? It's cool, but it's kinda same as going back to DOS (I mean, it's no-protection)... finally people see that "too much protection is bad". Or did I understand something wrong? Sorry if so.

Don't underestimate viruses -- they will take advantage of this soon :( And Microsoft... they will employ super-ultra-mega protection to make this "multiple ring0" useless... don't tell me how, I know (just kiddin').

Think about it: we (humans) had to develop this virtualization thing just 'cause we (m$) are too greedy to share the source code... crappy capitalism >:(
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 22 Sep 2006, 15:21
The_Grey_Beast: you got things wrong :). This is not "multiple ring0", it's "multiple faked ring0"... and it is lots of protection, the hypervisor is the dictator in control.

And the virtualization has nothing to do with "too greedy to share the source code".
Borsuc



Joined: 29 Dec 2005
Posts: 2465
Location: Bucharest, Romania
Borsuc 22 Sep 2006, 15:26
f0dder wrote:
And the virtualization has nothing to do with "too greedy to share the source code".
No, but Linux is fast without virtualization hardware :)

What if the hypervisor is a virus?
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 22 Sep 2006, 15:30
Quote:

No, but Linux is fast without virtualization hardware :)

So is the NT kernel. If you're referring to XEN, that's something different. A pretty good idea IMHO, but theoretically it should be even harder to "break out" of a VMx. And, when the technology matures, it should be pretty fast as well (although XEN will probably remain faster, since it doesn't need to virtualize in the same way).

Quote:

What if the hypervisor is a virus?

That would be nasty. It would be a major undertaking to make one, though... and there still aren't that many VMx enabled machines there yet.

But it's a pretty good reason why any OS ought to either disable VMx (yep, can be done, and can't be turned on without reboot), or include a hypervisor.
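On Intel processors the "disable until reboot" switch is the IA32_FEATURE_CONTROL MSR (index 0x3A): bit 0 is a lock bit and bit 2 permits VMXON outside SMX operation. The sketch below only interprets a value that has already been read from that MSR; reading or writing it needs ring 0, and the function names are made up.

```python
LOCK_BIT = 1 << 0             # once set, the MSR cannot be written until reset
VMX_OUTSIDE_SMX_BIT = 1 << 2  # permits VMXON outside SMX operation


def vmxon_permitted(feature_control):
    """VMXON needs the lock bit set and VMX enabled; otherwise it faults."""
    return bool(feature_control & LOCK_BIT) and bool(
        feature_control & VMX_OUTSIDE_SMX_BIT
    )


def can_still_enable(feature_control):
    """While the lock bit is clear, ring 0 code can still rewrite the MSR."""
    return not (feature_control & LOCK_BIT)
```

An OS (or BIOS) that writes the value 0b001 early in boot gets `vmxon_permitted(0b001) == False` and `can_still_enable(0b001) == False`: VMX stays off for everyone, including malware, until the next reset.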
HyperVista



Joined: 18 Apr 2005
Posts: 691
Location: Virginia, USA
HyperVista 22 Sep 2006, 15:31
in a way, you are both right. a malicious, or virus, hypervisor is a very, very big concern because the hypervisor is the "dictator" of the system..

M$ is being greedy here too because they are "para-virtualizing" windows vista to provide support for SVM and VMX and they will "license" what they are calling "Windows Enlightenments" for 3rd party virtualization solution providers like vmware, parallels, (and hypervista).... "enlightenments" my ass!!


f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 22 Sep 2006, 15:37
HyperVista: that does sound pretty nasty :/. In a way it's understandable though (let's forget the money motive for now). If they let just everybody have the needed info, it might be just as bad as not running a hypervisor at all.

Of course if "enlightened" modules don't need some crypto certificate, it will just be reverse engineered and only the good guys will suffer.
HyperVista



Joined: 18 Apr 2005
Posts: 691
Location: Virginia, USA
HyperVista 22 Sep 2006, 15:47
f0dder - your comments are absolutely correct and insightful. hypervisors do provide an extreme level of security, precisely for the reason you stated; "breaking out" of a VMX environment will be tremendously difficult. it's the ultimate code and process "sandbox".

there are quite a few VMX capable processors out there now (at last count, i think there were 15 or so separate Intel products, including their mobile centrino line and a few AMD products that support SVM). more than you imagine. many users do or will have VMX or SVM capable processors and don't even know/realize hypervisor capabilities are there. a malicious hypervisor would definitely be a nasty turn of events.
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 22 Sep 2006, 16:17
I'm quite aware of the number of *models* that support VMX - what I was referring to was the number of deployed computers, especially end-user wise, that have such a CPU. Remember that the dangerous malware writers target the platforms with broadest availability, which is why we don't see mass-infection of linux and os-x (and why many people running linux servers don't know they've been backdoored :) ).

But of course there will be some proof-of-concept stuff, and eventually we might see ms-exploit worms utilizing VMX. It's going to take a lot of effort not to be detectable though. And you'll always be able to boot from a cd/dvd to check/clean (I don't see a generic bios flash infector as more than a curiosity).
HyperVista



Joined: 18 Apr 2005
Posts: 691
Location: Virginia, USA
HyperVista 22 Sep 2006, 16:27
imho, absolutely correct. i could not agree more! uefi may change the landscape wrt bios types of attacks, though.

are you in the software security business?? just curious...
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 22 Sep 2006, 16:31
Hm, haven't looked into uefi - got any links?

Quote:

are you in the software security business?? just curious...

Nope, but I've been reverse engineering for about 10 years ;) - at the moment I have a mail-OCR-related job at www.post.dk until I finish some education and find some computer related work.
HyperVista



Joined: 18 Apr 2005
Posts: 691
Location: Virginia, USA
HyperVista 22 Sep 2006, 16:44
re: uefi - this is a good place to start: http://www.uefi.org/index.php?pg=1

i find the pre-boot app capabilities very interesting ;)

Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.

Website powered by rwasa.