| Author |
| Thread |
 |
|
mhanor
Joined: 17 Nov 2010
Posts: 7
|
BIOS debugging
Hello. I don't know if this is the best place to ask, but I'll do it anyway.
Does anyone know whether BIOS code (or at least part of it) ever executes concurrently on a dual-core or multi-core system after power-up or reset? In my case it's a fairly standard Phoenix-Award BIOS (no EFI) running on an Intel P35 motherboard (Abit IP35-E). Two pieces of code trouble me: one is in the system BIOS module, the other in awardext.rom (XGROUP CODE). Both deal with CPU initialization (setting MSRs). I want to know whether my problem could be caused by non-atomic OR operations from at least two cores on a shared RAM location (a byte in the E000 segment that stores the state of some CPU features: VT-x, NX, etc.). I have the Intel manuals right on my desktop, but my skills are limited (I'm not a professional) and I can't set breakpoints in or single-step the BIOS code, at least not without expensive hardware (that would be cool). I have read about MP initialization (BSP and APs) and about atomic operations in Intel's manual, vol. 3A, the MP management chapter. If I understand correctly, an OR on a RAM location (a read-modify-write) is not atomic without the LOCK prefix, but I don't know whether the code runs concurrently.
Any advice is welcome. Thank you.
Last edited by mhanor on 18 Nov 2010, 09:59; edited 1 time in total
|
17 Nov 2010, 20:04 |
|
bitRAKE
Joined: 21 Jul 2003
Posts: 2306
Location: dank orb
|
A BIOS that supports multiple cores (MP Spec) must have a small part that is executed by all cores. The Abit IP35-E most certainly does support multiple cores. I've heard some BIOSes use multiple cores to speed up diagnostics, whereas an older BIOS would just configure and halt the AP cores.
Without a fuller explanation of the problem it's difficult to advise specifically. Your understanding as stated so far seems on track.
|
18 Nov 2010, 05:37 |
|
mhanor
Joined: 17 Nov 2010
Posts: 7
|
As I've said, the byte is used to save the state of some CPU features. Sometimes the board doesn't re-enable the VT-x bit in the IA32_FEATURE_CONTROL MSR (0x3A) after resuming from S3 sleep (suspend to RAM). That's because the required bit of that byte is not set. The bit is set to 1 by the piece of code that checks the corresponding BIOS setting and enables VT-x accordingly. Another piece of code enables VT-x if that bit is set; I assume this is the code that re-enables the feature after resuming from S3.
The byte also saves the state of two bits in the IA32_MISC_ENABLE MSR (0x1A0): bit 34 (disables NX/DEP) and bit 25 (not documented).
I have managed to disassemble about 99% of the code in the system module and the XGROUP module; the exceptions are mostly big data structures. I know all references to that byte (found by searching for its offset). The byte is manipulated only with bitwise OR and AND operations: most are ORs, and there is one AND that masks out a bit (other than my VT-x bit).
I understand that MSRs must be set on each core, meaning the code runs on each core, but does it run at the same time? Suppose both cores read the byte (containing 0x0) at the same time, each modifies a different bit, and each writes it back: the first core's result is then overwritten by the second core's result.
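The lost-update scenario described above can be replayed deterministically. This is only a sketch in Python, not BIOS code: `run`, `MASKS`, and the schedules are hypothetical names made up for illustration, and the model assumes an unlocked OR behaves as a separate read, modify, and write. Whether two cores ever actually execute this code at the same time is exactly what is still unknown at this point in the thread.

```python
# Deterministic model of two cores doing an unlocked "or byte ptr [mem], imm".
# Each OR is split into its three micro-steps: read, modify, write.

def run(schedule, masks, initial=0x00):
    """Replay a schedule of (core, step) pairs against one shared byte."""
    shared = initial
    regs = {}                       # per-core temporary holding the value read
    for core, step in schedule:
        if step == "read":
            regs[core] = shared
        elif step == "modify":
            regs[core] |= masks[core]
        elif step == "write":
            shared = regs[core]
    return shared

MASKS = {0: 0x08, 1: 0x20}          # core 0 sets bit 3, core 1 sets bit 5

# One core finishes its read-modify-write before the other starts.
serial = [(0, "read"), (0, "modify"), (0, "write"),
          (1, "read"), (1, "modify"), (1, "write")]

# Both cores read the byte before either one writes it back.
racy = [(0, "read"), (1, "read"), (0, "modify"), (1, "modify"),
        (1, "write"), (0, "write")]

serial_result = run(serial, MASKS)
racy_result = run(racy, MASKS)
print(hex(serial_result), hex(racy_result))   # 0x28 0x8
```

In the racy schedule the last writer wins, so one of the two bits is lost. A LOCK prefix on the OR would rule out that interleaving.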
I couldn't find a reference to that byte's address in the disassembled ACPI tables or in any other BIOS module. Also, most of the time the E000 RAM segment is "protected" by the two PAM registers (in the DRAM controller, D0:F0), which make it read-only (writes go to DMI) or disabled (reads and writes are redirected to DMI). The PAMs are set to read/write to RAM when the BIOS needs to modify something in that segment. I'm also considering the possibility that one core, having just finished setting a bit, disables DRAM writes right after another core has enabled them to set another bit, but before that core gets the chance to do it.
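The PAM window race from the previous paragraph can be sketched the same way. This is a toy Python model under stated assumptions: there is one system-wide write window, a write issued while the window is closed never reaches DRAM, and the step names `enable`, `or`, and `protect` stand in for `enable_read_write_DRAM_E000`, the OR instruction, and `read_only_DRAM_E000`. The helper `run` and the core labels are hypothetical.

```python
# Toy model of the PAM write window around byte_EE71A.
# Writes to the E000 segment only reach DRAM while the window is open;
# otherwise they are forwarded to DMI and the byte in RAM is unchanged.

MASKS = {"A": 0x08, "B": 0x20}   # core A sets bit 3, core B sets bit 5

def run(schedule):
    window_open = False
    byte = 0x00
    for core, step in schedule:
        if step == "enable":       # enable_read_write_DRAM_E000
            window_open = True
        elif step == "or":         # or byte ptr es:[si], mask
            if window_open:
                byte |= MASKS[core]
        elif step == "protect":    # read_only_DRAM_E000
            window_open = False
    return byte

serial = [("A", "enable"), ("A", "or"), ("A", "protect"),
          ("B", "enable"), ("B", "or"), ("B", "protect")]

# A closes the window after B opened it but before B's OR executes.
overlapped = [("A", "enable"), ("A", "or"), ("B", "enable"),
              ("A", "protect"), ("B", "or"), ("B", "protect")]

serial_byte = run(serial)
overlapped_byte = run(overlapped)
print(hex(serial_byte), hex(overlapped_byte))   # 0x28 0x8
```

In the overlapped schedule core B's write is silently dropped, so this mechanism would lose a bit even if the OR itself were atomic.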
I'm going to paste the pieces of code that work with that byte.
LE:
Code: |
|
_1000:5A6E sub_15A6E proc near ; CODE XREF: sub_14377p
_1000:5A6E test byte ptr [bp+0Eh], 40h
_1000:5A72 jnz short locret_15AA1
_1000:5A74 call sub_15A46
_1000:5A77 jnb short locret_15AA1
_1000:5A79 mov ecx, 1A0h
_1000:5A7F rdmsr
_1000:5A81 btr eax, 19h
_1000:5A86 wrmsr
_1000:5A88 push es
_1000:5A89 push si
_1000:5A8A call enable_read_write_DRAM_E000
_1000:5A8F push seg _E000
_1000:5A92 pop es
_1000:5A93 assume es:_E000
_1000:5A93 mov si, offset byte_EE71A
_1000:5A96 and byte ptr es:[si], 0F7h
_1000:5A9A call read_only_DRAM_E000
_1000:5A9F pop si
_1000:5AA0 pop es
_1000:5AA1 assume es:nothing
_1000:5AA1
_1000:5AA1 locret_15AA1: ; CODE XREF: sub_15A6E+4j
_1000:5AA1 ; sub_15A6E+9j
_1000:5AA1 retn
_1000:5AA1 sub_15A6E endp
///////////////////////////////////////////////////////////////////////////////////////
_1000:628F loc_1628F: ; CODE XREF: sub_1620A+72j
_1000:628F push es
_1000:6290 push si
_1000:6291 call enable_read_write_DRAM_E000
_1000:6296 push seg _E000
_1000:6299 pop es
_1000:629A assume es:_E000
_1000:629A mov si, offset byte_EE71A
_1000:629D or byte ptr es:[si], 8
_1000:62A1 call read_only_DRAM_E000
_1000:62A6 pop si
_1000:62A7 pop es
_1000:62A8 assume es:nothing
_1000:62A8 jmp short locret_162B6 ; jumps to a retf
///////////////////////////////////////////////////////////////////////////////////////
_E000:4104 sub_E4104 proc near ; CODE XREF: _E000:4508p
_E000:4104 call enable_read_write_DRAM_E000
_E000:4109 call sub_E4162
_E000:410C jb short loc_E4156
_E000:410E mov cs:word_EE718, 8
_E000:4115 mov si, 14F2h
_E000:4118 call sub_E8930
_E000:411B or cs:byte_EE71A, al
_E000:4120 or al, al
_E000:4122 jz short loc_E4156
_E000:4124 mov eax, 1
_E000:412A cpuid
_E000:412C
_E000:412C loc_E412C:
_E000:412C cmp ax, 43Fh
_E000:412F jb short loc_E4156
_E000:4131 cmp ax, 6E8h
_E000:4134 jz short loc_E4156
_E000:4136 cmp ax, 6ECh
_E000:4139 jz short loc_E4156
_E000:413B bt ecx, 8
_E000:4140 jnb short loc_E4156
_E000:4142 call sub_E3F8D
_E000:4147 jnb short loc_E4156
_E000:4149 and cs:word_EE718, 0FFF7h
_E000:414F or cs:word_EE718, 2000h
_E000:4156
_E000:4156 loc_E4156: ; CODE XREF: sub_E4104+8j
_E000:4156 ; sub_E4104+1Ej ...
_E000:4156 call sub_E418A
_E000:4159 call sub_E41A0
_E000:415C call read_only_DRAM_E000
_E000:4161 retn
_E000:4161 sub_E4104 endp
/////////////////////////////////////////////////////////////////////////////////////////////
_E000:41A0 sub_E41A0 proc near ; CODE XREF: sub_E4104+55p
_E000:41A0 mov eax, 0
_E000:41A6 cpuid
_E000:41A8 cmp al, 3
_E000:41AA jbe short loc_E41BE
_E000:41AC mov si, 153Dh
_E000:41AF call sub_E8930
_E000:41B2 or al, al
_E000:41B4 jz short locret_E41E2
_E000:41B6 or cs:byte_EE71A, 4
_E000:41BC jmp short locret_E41E2 ; jumps to retn
//////////////////////////////////////////////////////////////////////////////////
_E000:4203 loc_E4203: ; CODE XREF: sub_E41E3+16j
_E000:4203 mov si, 156Fh
_E000:4206 call sub_E8930
_E000:4209 cmp al, 0
_E000:420B jz short loc_E4232
_E000:420D mov ecx, 1A0h
_E000:4213 rdmsr
_E000:4215 or edx, 4
_E000:4219 wrmsr
_E000:421B push es
_E000:421C call enable_read_write_DRAM_E000
_E000:4221 push seg _E000
_E000:4224 pop es
_E000:4225 assume es:_E000
_E000:4225 mov si, offset byte_EE71A
_E000:4228 or byte ptr es:[si], 10h
_E000:422C call read_only_DRAM_E000
_E000:4231 pop es
_E000:4232 assume es:nothing
_E000:4232
_E000:4232 loc_E4232: ; CODE XREF: sub_E41E3+1Ej
_E000:4232 ; sub_E41E3+28j
_E000:4232 pop si
_E000:4233 push cs
_E000:4234 push offset loc_E423F
_E000:4237 push offset sub_F8443
_E000:423A jmp far ptr loc_E8000
///////////////////////////////////////////////////////////////////////////////////////
_E000:42B3 loc_E42B3: ; CODE XREF: sub_E428C+1Dj
_E000:42B3 mov ecx, 3Ah
_E000:42B9 rdmsr
_E000:42BB test al, 1
_E000:42BD jnz short loc_E42EE
_E000:42BF mov si, 1588h
_E000:42C2 call sub_E8930 ; I'm assuming this checks the BIOS setting
_E000:42C5 cmp al, 1
_E000:42C7 jz short loc_E42FD
_E000:42C9 mov ecx, 3Ah
_E000:42CF or al, 4
_E000:42D1 wrmsr
_E000:42D3 or al, 1
_E000:42D5 wrmsr
_E000:42D7 push es
_E000:42D8 call enable_read_write_DRAM_E000
_E000:42DD push seg _E000
_E000:42E0 pop es
_E000:42E1 assume es:_E000
_E000:42E1 mov si, offset byte_EE71A
_E000:42E4 or byte ptr es:[si], 20h
_E000:42E8 call read_only_DRAM_E000
_E000:42ED pop es
_E000:42EE assume es:nothing
_E000:42EE
_E000:42EE loc_E42EE: ; CODE XREF: sub_E428C+25j
_E000:42EE ; sub_E428C+31j
_E000:42EE pop si
_E000:42EF push cs
_E000:42F0 push offset loc_E42FB
_E000:42F3 push offset sub_F8443
_E000:42F6 jmp far ptr loc_E8000
_E000:42FB ; ---------------------------------------------------------------------------
_E000:42FB
_E000:42FB loc_E42FB: ; DATA XREF: sub_E428C+64o
_E000:42FB pop ds
_E000:42FC assume ds:nothing
_E000:42FC retn
_E000:42FD ; ---------------------------------------------------------------------------
_E000:42FD
_E000:42FD loc_E42FD: ; CODE XREF: sub_E428C+3Bj
_E000:42FD mov ecx, 3Ah
_E000:4303 or al, 1
_E000:4305 wrmsr
_E000:4307 pop si
_E000:4308 push cs
_E000:4309
_E000:4309 loc_E4309:
_E000:4309 push offset loc_E4314
_E000:430C
_E000:430C loc_E430C:
_E000:430C push offset sub_F8443
_E000:430F
_E000:430F loc_E430F:
_E000:430F jmp far ptr loc_E8000
//////////////////////////////////////////////////////////////////////////////////////////////
_E000:9BA4 sub_E9BA4 proc near ; CODE XREF: sub_E9B9C:loc_E9B9Ep
_E000:9BA4 ; sub_E9BE7+59p
_E000:9BA4 push es
_E000:9BA5 push seg _E000
_E000:9BA8 pop es
_E000:9BA9 assume es:_E000
_E000:9BA9 mov eax, 1
_E000:9BAF cpuid
_E000:9BB1 test cl, 20h
_E000:9BB4 jz short loc_E9BD9
_E000:9BB6 mov ecx, 3Ah
_E000:9BBC rdmsr
_E000:9BBE test al, 1
_E000:9BC0 jnz short loc_E9BD9
_E000:9BC2 mov si, offset byte_EE71A
_E000:9BC5 test byte ptr es:[si], 20h
_E000:9BC9 jz short loc_E9BDB
_E000:9BCB mov ecx, 3Ah
_E000:9BD1 or al, 4
_E000:9BD3 wrmsr
_E000:9BD5 or al, 1
_E000:9BD7 wrmsr
_E000:9BD9
_E000:9BD9 loc_E9BD9: ; CODE XREF: sub_E9BA4+10j
_E000:9BD9 ; sub_E9BA4+1Cj
_E000:9BD9 pop es
_E000:9BDA assume es:nothing
_E000:9BDA retn
_E000:9BDB ; ---------------------------------------------------------------------------
_E000:9BDB
_E000:9BDB loc_E9BDB: ; CODE XREF: sub_E9BA4+25j
_E000:9BDB mov ecx, 3Ah
_E000:9BE1 or al, 1
_E000:9BE3 wrmsr
_E000:9BE5 pop es
_E000:9BE6 retn
_E000:9BE6 sub_E9BA4 endp
|
|
I haven't included all the code sections that read from that byte (tests of various bits).
Last edited by mhanor on 18 Nov 2010, 10:44; edited 1 time in total
|
18 Nov 2010, 10:17 |
|
sinsi
Joined: 10 Aug 2007
Posts: 600
Location: Adelaide
|
Is it a case of "AND [mem],x" or a load to register, AND, then write to memory? If it is a direct AND to memory it is guaranteed to be atomic, other CPUs will stall.
edit: are we looking at byte_EE71A ?
edit2: that's why they directly access it or indirectly via SI, guaranteed atomic.
Last edited by sinsi on 18 Nov 2010, 10:48; edited 1 time in total
|
18 Nov 2010, 10:43 |
|
mhanor
Joined: 17 Nov 2010
Posts: 7
|
yes, it's byte_EE71A
|
18 Nov 2010, 10:48 |
|
sinsi
Joined: 10 Aug 2007
Posts: 600
Location: Adelaide
|
Sorry, didn't want to post 3 times in a row
Even if all CPUs try to access the same byte, by the way they address it ([mem] or [si]) each CPU is guaranteed access by itself, the others wait.
The only thing is what read_only_DRAM_E000 does.
|
18 Nov 2010, 11:31 |
|
baldr
Joined: 19 Mar 2008
Posts: 1391
|
sinsi wrote: |
|
If it is a direct AND to memory it is guaranteed to be atomic, other CPUs will stall.
|
|
Are you sure that and mem,reg/imm is atomic without lock prefix?
sinsi wrote: |
|
Even if all CPUs try to access the same byte, by the way they address it ([mem] or [si]) each CPU is guaranteed access by itself, the others wait.
|
|
Without locking simultaneous accesses give predictable result probably only if all of them are reads.
|
18 Nov 2010, 11:44 |
|
sinsi
Joined: 10 Aug 2007
Posts: 600
Location: Adelaide
|
baldr wrote: |
|
Are you sure that and mem,reg/imm is atomic without lock prefix?
|
|
Is it considered a read/modify/write? I thought not, but now am not so sure.
baldr wrote: |
|
Without locking simultaneous accesses give predictable result probably only if all of them are reads.
|
|
Hm, agreed.
I would like to see the BIOS code for MP init; when I delved into my computer's BIOS, each AP waited its turn for the init code.
|
18 Nov 2010, 12:06 |
|
bitRAKE
Joined: 21 Jul 2003
Posts: 2306
Location: dank orb
|
Would it matter if multiple cores execute,
and byte ptr es:[si], 0F7h
concurrently?
No, the result is the same. It is a read-modify-write, yet all arrangements lead to the same result. The exception would be if another core were to attempt some other operation before all cores had completed this one.
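Both halves of that claim can be checked by brute force over every ordering of two unlocked read-modify-write operations. The following is a Python sketch with hypothetical helper names (`interleavings`, `outcomes`); it assumes each operation is one read followed by one write, and that the byte starts at 28h for the AND case and at 0 for the OR case.

```python
from itertools import permutations

def interleavings(n_cores=2):
    """Yield every step ordering that keeps each core's read before its write."""
    steps = [(core, step) for core in range(n_cores) for step in ("read", "write")]
    for order in permutations(steps):
        if all(order.index((c, "read")) < order.index((c, "write"))
               for c in range(n_cores)):
            yield order

def outcomes(ops, initial):
    """Set of final byte values over all interleavings.
    ops[c] is what core c applies to the value it read before writing it back."""
    results = set()
    for order in interleavings(len(ops)):
        shared, regs = initial, {}
        for core, step in order:
            if step == "read":
                regs[core] = ops[core](shared)
            else:
                shared = regs[core]
        results.add(shared)
    return results

# Every core runs the same "and byte ptr es:[si], 0F7h".
same_and = outcomes([lambda v: v & 0xF7, lambda v: v & 0xF7], initial=0x28)

# One core ORs in 08h while the other ORs in 20h.
mixed_or = outcomes([lambda v: v | 0x08, lambda v: v | 0x20], initial=0x00)

print(sorted(same_and), sorted(mixed_or))   # [32] [8, 32, 40]
```

The identical AND always ends at 20h, while the two different ORs can end at 08h, 20h, or 28h depending on the order.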
What does,
Code: |
|
_1000:5A6E test byte ptr [bp+0Eh], 40h
_1000:5A72 jnz short locret_15AA1
_1000:5A74 call sub_15A46
_1000:5A77 jnb short locret_15AA1
|
|
do?
At this point I'm assuming all cores are running concurrently.
|
18 Nov 2010, 15:48 |
|
mhanor
Joined: 17 Nov 2010
Posts: 7
|
I'm more concerned about whether a section containing:
Code: |
|
mov si, offset byte_EE71A
; set the 3rd bit, remembers to set a reserved bit (25th) from IA32_MISC_ENABLE
_1000:629D or byte ptr es:[si], 8
|
|
runs concurrently with the section that sets the bit I'm interested in:
Code: |
|
_E000:42E1 mov si, offset byte_EE71A
; set the 5th bit, remembers to enable VT-x bit
_E000:42E4 or byte ptr es:[si], 20h
|
|
Another situation would be that the section runs concurrently with another piece of code that sets the PAMs to make DRAM E000 read-only before my code gets to set the 5th bit. On my system, the usual value of this byte is 28h. When the problem appears, the byte is 08h.
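To make the 28h and 08h values easier to read, here is a small decoder sketch in Python. The bit meanings are only what can be inferred from the disassembly and comments posted in this thread, not from any BIOS documentation, and `FLAGS` and `decode` are hypothetical names.

```python
# Bit meanings for byte_EE71A as inferred from the disassembly in this thread.
FLAGS = {
    0x04: "set in sub_E41A0 (purpose not identified in the thread)",
    0x08: "reserved bit 25 of IA32_MISC_ENABLE",
    0x10: "bit 34 of IA32_MISC_ENABLE (NX disable)",
    0x20: "VT-x enable in IA32_FEATURE_CONTROL",
}

def decode(value):
    """Return the descriptions of all known bits set in value, lowest bit first."""
    return [name for mask, name in sorted(FLAGS.items()) if value & mask]

for value in (0x28, 0x08):
    print(f"{value:02X}h:", "; ".join(decode(value)))
```

Read this way, the only difference between the usual 28h and the faulty 08h is the VT-x bit (20h).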
The Intel manuals speak of a BIOS initialization semaphore that each AP must acquire before executing the initialization code. My CPU is dual core, so one core is the BSP and there's only one AP. I don't know whether the BSP and the AP ever get the chance to run concurrently through pieces of code that work with that byte. Finding the MP init section would help me understand it, but that seems impossible for me. Still, I don't see how else that bit could fail to be set to 1 when MSR 0x3A is set to 0x5 (VT-x enabled) after power-up (no S3), which proves that the code gets executed. Others have reported this problem (no VT-x after resuming from S3) with other mainboards that use Intel chipsets (usually P35).
About the meaning of the code you asked about: it seems to check some conditions. If the conditions are not met, it exits the current code section (jumps to the retn at locret_15AA1). But you already knew that. I don't know exactly what it does because I'm not able to follow it, especially since that code uses data saved on some stack (referenced by BP, in some DS segment, where I don't know what's going on). Also, my programming skills are limited, especially when it comes to assembly language. At least, sub_15A46 seems to check some NVRAM data and some reserved (undocumented) bits from the IA32_PLATFORM_ID MSR (0x17). If you want to look at the code, I can upload the IDA Pro database somewhere (I'm using 4.9, the free version).
Choosing another (unused) byte to save and restore VT-x, one that isn't shared with other code sections, could be the easiest solution, but I wanted to try to understand the problem.
PS: I'm sorry if you're getting scratches on the brain from reading my English.
|
18 Nov 2010, 18:12 |
|
bitRAKE
Joined: 21 Jul 2003
Posts: 2306
Location: dank orb
|
Find the CPU init code that determines the BSP core, and then you'll know what the other cores are doing. It's more likely that they are halted and used only in a very limited capacity, unless you are seeing some locks throughout the BIOS?
|
19 Nov 2010, 02:33 |
|
mhanor
Joined: 17 Nov 2010
Posts: 7
|
There are 2 very small and very similar code sections. Each has 2 spinlocks, and each section has its own lock (one byte per section, accessed with LOCK-prefixed OR and DEC instructions). Neither code section is referenced; I have yet to find how they get called.
Thank you for your help. I'll return after I've made more progress.
edit:
I think I have found the cause of my original problem (no VT-x after S3 resume). The code that sets the MSR and the bit after power-up first checks whether the lock bit is set in the MSR, in which case it skips everything else. That means it will never set the EE71A bit again during the current power session: the CPU retains its MSR value across a reset, and the MSR is locked, so it can't be changed manually either. There's one more condition: after power-up, if you restart/reset the system, the BIOS modules are decompressed into RAM again, which resets the EE71A byte to 0x0. And that's all it takes: power up with VT-x enabled and the MSR locked, then restart, which clears the EE71A byte while the code fails to set the bit again. If you then enter S3 sleep, VT-x will be disabled and locked the next time you resume.
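The sequence described above can be walked through with a toy model. This is a Python sketch, not the BIOS logic itself: the `Board` class and its method names are hypothetical, the BIOS setup option is assumed to have VT-x enabled, and the model tracks only MSR 0x3A and bit 5 of the EE71A byte (the real byte carries other bits as well).

```python
LOCK, VMX = 0x01, 0x04   # bits 0 and 2 of IA32_FEATURE_CONTROL (MSR 0x3A)
SAVED_VTX = 0x20         # bit 5 of byte_EE71A

class Board:
    """Toy model: one MSR value and one shadow byte, nothing else."""

    def __init__(self, vtx_enabled_in_setup=True):
        self.setup = vtx_enabled_in_setup
        self.msr = 0      # cleared only when the CPU loses power
        self.flags = 0    # cleared whenever the BIOS modules are unpacked

    def post(self):
        """Original POST path: skips everything if the MSR is already locked."""
        self.flags = 0                    # modules decompressed into RAM again
        if self.msr & LOCK:
            return
        if self.setup:
            self.msr |= VMX
            self.flags |= SAVED_VTX
        self.msr |= LOCK

    def power_up(self):
        self.msr = 0
        self.post()

    def warm_reset(self):
        self.post()                       # MSR keeps its value across reset

    def s3_cycle(self):
        self.msr = 0                      # CPU power is removed in S3
        if self.flags & SAVED_VTX:        # RAM, and so the byte, is preserved
            self.msr |= VMX
        self.msr |= LOCK

good = Board()
good.power_up()
good.s3_cycle()

bad = Board()
bad.power_up()
bad.warm_reset()
bad.s3_cycle()

print(hex(good.msr), hex(bad.msr))   # 0x5 0x1
```

Without the warm reset the MSR comes back as 0x5 after S3; with a reset in between it comes back as 0x1, which is VT-x disabled and locked.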
I hope it makes sense. Thanks again for helping me.
|
19 Nov 2010, 22:21 |
|
bitRAKE
Joined: 21 Jul 2003
Posts: 2306
Location: dank orb
|
Are you suggesting that the S3 resume vector is set to code which disables VT-x and locks the MSR?
Related post: http://board.flatassembler.net/topic.php?p=116504#116504
(Some days were spent with my Phoenix BIOS code in IDA, but I can't find it presently.)
|
20 Nov 2010, 03:45 |
|
mhanor
Joined: 17 Nov 2010
Posts: 7
|
There's a piece of code (part of the BIOS resume code path) that just tests one bit of the EE71A byte; if it's set, the code sets bit 2 of IA32_FEATURE_CONTROL (which enables VMX, hardware virtualization). It also sets bit 0, the lock bit, of the same MSR to complete its configuration, regardless of whether the VMX bit is set. Locking the MSR is required (according to the Intel docs) and prevents any further writes to this MSR (you need to cut CPU power to reset it). You can read that piece of code in the code examples above; it starts at _E000:9BA4.
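The branch structure of that resume routine, and the effect of the lock bit, can be modelled in a few lines. This is a Python sketch with hypothetical names (`FeatureControl`, `resume_path`); it omits the CPUID feature check and represents the fault that a write to a locked MSR causes on real hardware as a Python exception.

```python
LOCK = 0x01        # bit 0 of IA32_FEATURE_CONTROL
VMX = 0x04         # bit 2 of IA32_FEATURE_CONTROL
SAVED_VTX = 0x20   # bit 5 of byte_EE71A

class FeatureControl:
    """Stand-in for MSR 0x3A: once locked, it rejects writes until power is cut."""

    def __init__(self):
        self.value = 0

    def write(self, value):
        if self.value & LOCK:
            raise PermissionError("MSR 0x3A is locked")
        self.value = value

    def power_cycle(self):
        self.value = 0

def resume_path(msr, saved_flags):
    """Mirror of the branch structure at _E000:9BA4 (CPUID check omitted)."""
    value = msr.value
    if value & LOCK:
        return                      # already locked, nothing to do
    if saved_flags & SAVED_VTX:
        value |= VMX
        msr.write(value)            # enable VMX first
    value |= LOCK
    msr.write(value)                # then lock the MSR

msr = FeatureControl()
resume_path(msr, saved_flags=0x28)
with_bit = msr.value

msr.power_cycle()
resume_path(msr, saved_flags=0x08)
without_bit = msr.value

print(hex(with_bit), hex(without_bit))   # 0x5 0x1
```

Once the routine has run, any later attempt to turn VMX on goes through `write` and is rejected, which is why the saved bit has to be right before the resume path executes.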
Last edited by mhanor on 21 Nov 2010, 12:13; edited 2 times in total
|
20 Nov 2010, 07:15 |
|
bitRAKE
Joined: 21 Jul 2003
Posts: 2306
Location: dank orb
|
I don't know how the BIOS detects S3 resume, but the CPU can be shut down in S3, per the ACPI spec.
|
20 Nov 2010, 08:43 |
|
mhanor
Joined: 17 Nov 2010
Posts: 7
|
S3 sleep means no power to the CPU, I'm not saying otherwise.
edit: Here's the new code, it works:
Code: |
|
_E000:428C ; ---------------------------------------------------------------------------
_E000:428C push ds
_E000:428D push cs
_E000:428E push offset loc_E4299
_E000:4291 push offset unk_F8435 ; set PAM for F000 seg
_E000:4294 jmp far ptr loc_E8000
_E000:4299 ; ---------------------------------------------------------------------------
_E000:4299
_E000:4299 loc_E4299: ; DATA XREF: _E000:428Eo
_E000:4299 push 2000h
_E000:429C pop ds
_E000:429D assume ds:nothing
_E000:429D push si
_E000:429E mov eax, 1
_E000:42A4 cpuid
_E000:42A6 test cl, 20h
_E000:42A9 jnz short loc_E42B3
_E000:42AB mov si, 1588h
_E000:42AE or word ptr [si], 8
_E000:42B1 jmp short loc_E42F4
_E000:42B3 ; ---------------------------------------------------------------------------
_E000:42B3
_E000:42B3 loc_E42B3: ; CODE XREF: _E000:42A9j
_E000:42B3 mov ecx, 3Ah
_E000:42B9 rdmsr
_E000:42BB test al, 1
_E000:42BD jnz short loc_E42D9
_E000:42BF mov si, 1588h
_E000:42C2 call sub_E8930 ; checks BIOS
_E000:42C5 cmp al, 1
_E000:42C7 mov ecx, 3Ah
_E000:42CD rdmsr
_E000:42CF jz short loc_E42D5
_E000:42D1 or al, 4
_E000:42D3 wrmsr
_E000:42D5
_E000:42D5 loc_E42D5: ; CODE XREF: _E000:42CFj
_E000:42D5 or al, 1
_E000:42D7 wrmsr
_E000:42D9
_E000:42D9 loc_E42D9: ; CODE XREF: _E000:42BDj
_E000:42D9 test al, 4
_E000:42DB jz short loc_E42F4
_E000:42DD push es
_E000:42DE call sub_F7ABC
_E000:42E3 push seg _E000
_E000:42E6 pop es
_E000:42E7 assume es:_E000
_E000:42E7 mov si, offset unk_EE71A
_E000:42EA or byte ptr es:[si], 20h
_E000:42EE call sub_F7AC0
_E000:42F3 pop es
_E000:42F4 assume es:nothing
_E000:42F4
_E000:42F4 loc_E42F4: ; CODE XREF: _E000:42B1j
_E000:42F4 ; _E000:42DBj
_E000:42F4 pop si
_E000:42F5 push cs
_E000:42F6 push offset loc_E4301
_E000:42F9 push offset unk_F8443 ; set PAM for F000 seg
_E000:42FC jmp far ptr loc_E8000
_E000:4301 ; ---------------------------------------------------------------------------
_E000:4301
_E000:4301 loc_E4301: ; DATA XREF: _E000:42F6o
_E000:4301 pop ds
_E000:4302 assume ds:nothing
_E000:4302 retn
|
|
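The difference between the original and the patched POST path can be compared side by side. This is a Python sketch of the control flow only: `post`, `s3_resume`, and `power_up_reset_then_s3` are hypothetical names, the BIOS setup option is assumed to have VT-x enabled, and the shadow byte is assumed to start at 0 on every boot or reset.

```python
LOCK, VMX = 0x01, 0x04   # bits 0 and 2 of IA32_FEATURE_CONTROL (MSR 0x3A)
SAVED_VTX = 0x20         # bit 5 of byte_EE71A

def post(msr, vtx_enabled_in_setup, patched):
    """Return (msr, flags) after POST; flags starts at 0 because the
    BIOS modules are unpacked into RAM again on every boot or reset."""
    flags = 0
    if not msr & LOCK:
        if vtx_enabled_in_setup:
            msr |= VMX
            if not patched:
                flags |= SAVED_VTX     # original: recorded only on this path
        msr |= LOCK
    if patched and msr & VMX:
        flags |= SAVED_VTX             # patched: recorded from the MSR itself
    return msr, flags

def s3_resume(flags):
    """MSR value after resume: the CPU lost power, so it starts from 0."""
    msr = 0
    if flags & SAVED_VTX:
        msr |= VMX
    return msr | LOCK

def power_up_reset_then_s3(patched):
    msr, flags = post(0, True, patched)        # cold boot
    msr, flags = post(msr, True, patched)      # warm reset, MSR still locked
    return s3_resume(flags)

original_result = power_up_reset_then_s3(patched=False)
patched_result = power_up_reset_then_s3(patched=True)
print(hex(original_result), hex(patched_result))   # 0x1 0x5
```

The patched path derives the saved bit from what the MSR actually contains (the `test al, 4` at `_E000:42D9`), so it no longer matters whether the MSR was already locked when POST ran.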
____________________________________________________________________
edit2:
Here's a generic fix for affected motherboards that use Phoenix-Award BIOSes.
Examples of such mainboards: Abit IP35-E, Abit IP35 Pro, Abit I-G31, Abit I-N73, EVGA 680i SLI, Foxconn MARS
Read the README before using it!
External link:
http://sites.google.com/site/quake2iasi/files/generic_bios_virtualization_fix1.zip?attredirects=0&d=1
|
20 Nov 2010, 09:09 |
|