flat assembler
Message board for the users of flat assembler.
Sempron 64 MMU bug?
revolution 18 Aug 2008, 02:14
My first thought is that the motherboard has a memory hole at your stated address. Are you using the same everything and just changing the CPU in the socket? I strongly doubt that the Sempron has the bug you suggest.
Madis731 18 Aug 2008, 06:27
You seem to be hanging on to the value 30F000h, but what about 100000h or 5F0000h? What minimum alignment should the table have and is it different when beyond 1MB?
lazer1 18 Aug 2008, 16:55
revolution wrote: My first thought is that the motherboard has a memory hole at your stated address. Are you using the same everything and just changing the CPU in the socket? I strongly doubt that the Sempron has the bug you suggest.

Nothing at all is wrong with the memory. I echo the contents and EVERYTHING is fine. The IDENTICAL code functions fine with the Turion X2, and the IDENTICAL code but with the memory within the first 1mb functions fine. Perhaps there is something wrong with how cr3 connects with the memory.
revolution 18 Aug 2008, 17:04
Are you using the same motherboard with just a change of CPU for each test?
lazer1 18 Aug 2008, 17:19
Madis731 wrote: You seem to be hanging on to the value 30F000h, but what about 100000h or 5F0000h? What minimum alignment should the table have and is it different when beyond 1MB?

It's because the memory setup is VERY COMPLICATED. I obtain memory via my own memory allocation code, so I cannot just select any memory, as that may be in use by my system. I would have to reserve it earlier, and that is too much effort considering the Turion X2 has no problem. Also, I may run out of space as the system boots from a floppy, e.g. I have less than 300 bytes of space left for one of the early phases.

30f000h is the memory which happens to be allocated when I allocate 4K aligned to 4K, i.e. ???????????000h. If I allocate AGAIN, that also fails just on the Sempron. If I allocate AGAIN, it probably allocates 310000h, as the memory is relatively clean.

The memory itself is fine: I loop through all the entries and echo all nonzero ones, and they are identical to the original table. I even use clflush and mfence to guarantee the physical memory is coherent with the caches, and I also used variable-range MTRRs to guarantee that the expansion memory is all uncached.

The problem is not just cr3, it's all levels of the table on the Sempron. It's just the MMU which is problematic, the code itself isn't. If the MMU table is entirely within the first 1mb, then code it maps outside the first 1mb is fine to access; that is how I access the 30f000h table. But if a table OF the MMU table is outside the first 1mb, then the Sempron MMU doesn't function.

I set up 100MB of virtual memory; when that is referenced, the page fault handler is supposed to connect it to physical memory. No problem with the Turion X2, but with the Sempron the page fault keeps repeating infinitely, and each time after the first the code determines there is no page fault. The first page fault supplies the page which is at virtual address ffff8080063feff8h, done by allocating and connecting the missing table entries for that address. When the code continues, it immediately faults again at the same instruction, the fault handler determines there is no fault and returns, and it faults again. The problem is that the fault handler supplies the missing table pages from beyond 1mb. All the tables are there for the faulting address, and the Turion X2 is fine, i.e. the code must be right.

This is a Sempron from PROBABLY 2006 (I forget); it is from long before Vista was released.

To allocate within the first 1mb I have to allocate before I bootstrap to long mode. If I allocate 4K aligned to 4K then and use it later, there is no problem. That allocates at e.g. 65000h to replace the allocation at 66000h. By coincidence, on both machines all the allocations are identical, and on the Turion X2 everything is fine.

The alignment constraints on the memory for tables are that bits 0 to 11 must be 0, and bits 12 to 51 can be any PHYSICAL memory, NO OTHER CONSTRAINTS. I read somewhere that some Intel MMUs have a bug that the MMU tables need to be 32 bit. The tables need to be physical memory. The PCD + PWT flags mean you can choose whether the tables are cached or not. The fully cached tables, i.e. PCD==0 and PWT==0, function fine even with the 2 cpus of the Turion X2, but I also tried PCD==1, PWT==1 in all tables and it still fails on the Sempron.

The only thing I can think of is that there is some MSR which limits what addresses can be used for tables, but I haven't seen any mention of such. What would be the point of limiting the MMU table addresses anyway?
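As a rough illustration of which table entries the fault handler has to supply for ffff8080063feff8h, the sketch below splits a long-mode virtual address into its four table indices and checks the table alignment rule stated in the post. It assumes 4 KB pages and 48-bit canonical addresses; the function names are hypothetical and not taken from the poster's system.

```python
# Illustrative sketch only; assumes 4 KB pages and 4-level long-mode paging.
# Function names are hypothetical, not part of the poster's code.

def split_virtual_address(va):
    """Return (pml4, pdpt, pd, pt, offset) indices for a virtual address."""
    offset = va & 0xFFF          # bits 0-11: byte offset inside the page
    pt = (va >> 12) & 0x1FF      # bits 12-20: page table index
    pd = (va >> 21) & 0x1FF      # bits 21-29: page directory index
    pdpt = (va >> 30) & 0x1FF    # bits 30-38: page directory pointer index
    pml4 = (va >> 39) & 0x1FF    # bits 39-47: top-level table index
    return pml4, pdpt, pd, pt, offset


def is_valid_table_address(phys):
    """Bits 0-11 must be zero and nothing may be set above bit 51."""
    return (phys & 0xFFF) == 0 and (phys >> 52) == 0


if __name__ == "__main__":
    indices = split_virtual_address(0xFFFF8080063FEFF8)
    print([hex(i) for i in indices])
    print(is_valid_table_address(0x30F000))
```

Each index selects one 8-byte entry in a 4 KB table, so a handler that finds any level missing must allocate a fresh 4 KB table for it, which is where addresses such as 30f000h come from.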
revolution 18 Aug 2008, 17:44
You didn't answer my question. If you are using different MBs, then it is likely that the way the memory is laid out is causing the problem you see. As I mentioned above, I strongly doubt the Sempron is the cause of the problem; you need to look further into the system board that the Sempron is installed in to see why you have the problem.
lazer1 18 Aug 2008, 18:08
revolution wrote: You didn't answer my question. If you are using different MBs, then it is likely that the way the memory is laid out is causing the problem you see. As I mentioned above, I strongly doubt the Sempron is the cause of the problem; you need to look further into the system board that the Sempron is installed in to see why you have the problem.

You are saying the mobo is bugged? It's beyond my scope to look at the mobo; I am a programmer, not an electrical engineer. If you explain HOW I can do what you are suggesting, I can try. The mobo has an "Award modular BIOS".
revolution 18 Aug 2008, 18:20
I'm not saying your MB is bugged. Just confirm that the physical memory layout is what you expect.
tom tobias 18 Aug 2008, 18:47
lazer1 18 Aug 2008, 20:20
revolution wrote: I'm not saying your MB is bugged. Just confirm that the physical memory layout is what you expect.

I know the physical memory from the BIOS functions, and the physical memory DOES exist, because I filled the table and echoed the results without problem. Steps:

1. alloc 4K, 30f000h
2. fill that with the same entries as the current MMU table at 66000h
3. echo the contents of the table at 30f000h, no problem.
4. "mov cr3, reg" where reg contains 30f000h
5. crash on Sempron, no problem with Turion X2

I couldn't do steps 1 to 3 if the memory didn't exist, and I have also both tried clflush without problem and set the memory to uncacheable without problem. The easiest way to prove some memory rax exists is:

    clflush byte [ rax ] ; flush all levels of caches to physical memory.
    mfence ; force all memory ops to complete

No crash when I do this, but a crash when I mov the memory to cr3. I even echoed all the MTRRs, e.g.:

    MTRRphysBase[ 0 ] = 6
    MTRRphysMask[ 0 ] = ffc0000800

which means that the first 1G of memory is mapped as Writeback by the BIOS.

The Sempron is bugged, there is no other explanation. If you google for "mmu bugs" there are plenty of pages about x86 mmu bugs.
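The reading of the MTRR pair quoted above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes a 40-bit physical address width (inferred from the width of the mask value in the post) and uses a hypothetical function name.

```python
# Illustrative sketch only; decodes one variable-range MTRR base/mask pair.
# Assumption: 40-bit physical addresses, inferred from the mask ffc0000800h.

MTRR_TYPES = {0: "UC", 1: "WC", 4: "WT", 5: "WP", 6: "WB"}


def decode_variable_mtrr(phys_base, phys_mask, phys_bits=40):
    """Return the memory type, valid flag, base and size of one MTRR range."""
    addr_mask = ((1 << phys_bits) - 1) & ~0xFFF   # address bits 12..phys_bits-1
    mask = phys_mask & addr_mask
    return {
        "type": MTRR_TYPES.get(phys_base & 0xFF, "reserved"),
        "valid": bool(phys_mask & (1 << 11)),     # bit 11 of the mask register
        "base": phys_base & addr_mask,
        # For a contiguous mask, the range covers every address whose
        # masked bits equal the masked base.
        "size": ((~mask) & addr_mask) + 0x1000,
    }


if __name__ == "__main__":
    info = decode_variable_mtrr(0x6, 0xFFC0000800)
    print(info["type"], info["valid"], hex(info["base"]), hex(info["size"]))
```

Under that assumption the pair decodes to a valid Writeback range starting at 0 and covering 40000000h bytes, which agrees with the "first 1G" reading in the post.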
LocoDelAssembly 18 Aug 2008, 20:26
Perhaps it has nothing to do with it, but do you wait a few seconds before echoing the table at the problematic address space?
Have you checked the AMD errata?

[edit]Don't hesitate to post your code **IF YOU CAN**. If you cannot, would it be possible for you to prepare a minimal piece of code that crashes the system? I have an Athlon64 (Venice, single core) here if you want me to test some code.[/edit]

[edit2]AMD revision guide: http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf . I haven't found anything explicit. There are some bugs related to the TLB, but I don't think they relate to your problem. Anyway, check the paper yourself too; maybe you can spot something that I missed (because frankly, I haven't read it word by word).[/edit2]
lazer1 19 Aug 2008, 13:59
LocoDelAssembly wrote: Perhaps it has nothing to do with it, but do you wait a few seconds before echoing the table at the problematic address space?

It's not feasible to send the code at the moment, as this is code I have been working on since 2006! Eventually I will make the system available, but what you can do, if you have access to supervisor-level long mode code and access to physical memory, is:

1. Read cr3 to a register and zero out the PCD + PWT bits to make it into a PHYSICAL address. (A)
2. Allocate 4K of physical memory aligned to 4K (B), i.e. 4K at address ????????????000h, and ensure it is beyond the first 1mb.
3. Copy the table (A) verbatim to (B) (via the virtual addresses).
4. Set the same PCD + PWT flags as the original cr3 on (B) and copy (B) to cr3 via "mov".

The thing is that your CPU is probably fine, e.g. my Turion X2 is fine. Note that your code will work with virtual addresses, but cr3 and the MMU table work with physical addresses, thus you need to know the mapping between virtual and physical. You cannot do these things from user code; it needs to be supervisor privilege level.

Or even simpler: read cr3, zero out PCD + PWT and see if it is outside the first 1mb. If it is, then your CPU doesn't have the bug.
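The bit manipulation in this test can be sketched as follows. This only models the arithmetic on a cr3 value; actually reading or writing cr3 needs supervisor-level code, and the function names here are hypothetical.

```python
# Illustrative sketch only; models the cr3 bit arithmetic described above.
# Names are hypothetical. Real code must run at supervisor level to touch cr3.

CR3_PWT = 1 << 3                       # page-level write-through flag
CR3_PCD = 1 << 4                       # page-level cache-disable flag
CR3_BASE_MASK = 0x000FFFFFFFFFF000     # bits 12-51: physical table address


def split_cr3(cr3):
    """Return (physical table base, PCD/PWT flag bits) of a cr3 value."""
    return cr3 & CR3_BASE_MASK, cr3 & (CR3_PWT | CR3_PCD)


def table_is_above_1mb(cr3):
    """The 'even simpler' check: is the top-level table beyond the first 1mb?"""
    base, _flags = split_cr3(cr3)
    return base >= 0x100000


def rebuild_cr3(new_base, flags):
    """Combine a new 4K-aligned table address with the original flags."""
    if new_base & 0xFFF:
        raise ValueError("table address must be aligned to 4K")
    return new_base | flags


if __name__ == "__main__":
    base, flags = split_cr3(0x66018)
    print(hex(base), hex(flags), hex(rebuild_cr3(0x30F000, flags)))
```

With the addresses from this thread, a table at 66000h fails the above-1mb check and one at 30f000h passes it.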
f0dder 19 Aug 2008, 14:29
Have you verified that Gate A20 is set correctly? And have you checked the E820 memory map to make sure you're not writing your table to reserved areas?
lazer1 19 Aug 2008, 15:25
f0dder wrote: Have you verified that Gate A20 is set correctly? And have you checked the E820 memory map to make sure you're not writing your table to reserved areas?

No idea about Gate A20, can you explain what that is?

Looking at code I wrote in March 2007, I use BIOS function e820h to get the system memory map, which is used to initialise the memory system:

    eax: 0e820h
    edx: SMAP
    ecx: 20
    int 15h

The returned results are then processed, and my memory system only allocates from this memory. I will see if I can echo out the memory map for the Sempron, if you can explain the Gate A20 idea, as I don't recall that from my project.
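For readers unfamiliar with the 20-byte records that this BIOS call returns (the ecx value above is the record size), here is a small sketch of how a buffer of such records could be decoded. The layout used is the standard one: a 64-bit base, a 64-bit length and a 32-bit type, where type 1 means usable RAM. The function names are hypothetical, and the buffer is assumed to have been collected already by repeated int 15h calls.

```python
# Illustrative sketch only; decodes already-collected E820 records.
# Names are hypothetical. Each record is 20 bytes, little-endian.
import struct

E820_ENTRY = struct.Struct("<QQI")   # base (8 bytes), length (8), type (4)
E820_USABLE = 1                      # type 1: memory available to the OS


def parse_e820(buf):
    """Split a byte buffer into (base, length, type) tuples."""
    entries = []
    for off in range(0, len(buf) - E820_ENTRY.size + 1, E820_ENTRY.size):
        entries.append(E820_ENTRY.unpack_from(buf, off))
    return entries


def usable_ranges(entries):
    """Return half-open (start, end) ranges for the usable entries only."""
    return [(base, base + length)
            for base, length, kind in entries
            if kind == E820_USABLE and length > 0]


if __name__ == "__main__":
    sample = E820_ENTRY.pack(0x0, 0x9F800, 1) + E820_ENTRY.pack(0x9F800, 0x800, 2)
    print([(hex(a), hex(b)) for a, b in usable_ranges(parse_e820(sample))])
```

An allocator that only hands out memory from the usable ranges, as the post describes, never touches reserved areas.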
LocoDelAssembly 19 Aug 2008, 15:34
haha, excellent shot f0dder
http://en.wikipedia.org/wiki/A20_line . On some mobos you don't need to care about this, on others you can enable/disable A20 in the BIOS setup, and on others you must always take care of it. It is still unclear why you can echo the table, though. Programming the A20 gate is mandatory anyway, so do it and tell us if it fixes the problem.
lazer1 19 Aug 2008, 15:48
ok, the memory map for the Sempron appears to be:
    0h <= mem < 9f800h
    100000h <= mem < 3fff0000h

AFAICT PC h/w remaps physical mem to be contiguous; the only holes are the absolute memory ranges taken up by things documented in the h/w manuals, e.g. the BIOS is a hole, as are the B8000h etc. gfx addresses.
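To see that the failing table address lies inside this map, a containment check is enough. The sketch below hard-codes the two ranges from the post, assuming both bounds of the second range are hexadecimal; the names are hypothetical.

```python
# Illustrative sketch only; the ranges are copied from the post above,
# assuming the second range is hexadecimal like the first.

SEMPRON_USABLE = [(0x0, 0x9F800), (0x100000, 0x3FFF0000)]   # half-open ranges


def in_usable_ram(addr, size, ranges=SEMPRON_USABLE):
    """True if [addr, addr + size) fits entirely inside one usable range."""
    return any(start <= addr and addr + size <= end for start, end in ranges)


if __name__ == "__main__":
    print(in_usable_ram(0x30F000, 0x1000))   # the 4K table from the thread
    print(in_usable_ram(0x9F000, 0x1000))    # crosses the end of low memory
```

So the memory map alone does not rule out 30f000h; the page is reported as ordinary usable RAM.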
lazer1 19 Aug 2008, 16:03
LocoDelAssembly wrote: haha, excellent shot f0dder

That looks like it probably is the explanation, as 30f000h has bit 20 set to 1. I'm going to try it now. For multiprocessors, does this have to be done for each cpu? For the Turion X2 it looks like A20 must be set for all the cpus by the mobo.
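The bit-20 observation can be made concrete with a small model of the A20 gate: when the gate is disabled, address line 20 is forced to 0. This sketch only models that masking with hypothetical names; it does not explain why echoing the table still worked, which the thread leaves open.

```python
# Illustrative sketch only; models address line 20 being forced low.
# Names are hypothetical.

A20_BIT = 1 << 20   # 100000h, the first address beyond the first 1mb


def effective_address(phys, a20_enabled):
    """Address that reaches memory, given the state of the A20 gate."""
    return phys if a20_enabled else phys & ~A20_BIT


def affected_by_a20(phys):
    """True if the address changes when A20 is disabled."""
    return bool(phys & A20_BIT)


if __name__ == "__main__":
    for addr in (0x66000, 0x30F000, 0x310000):
        print(hex(addr), "->", hex(effective_address(addr, False)))
```

In this model the tables below 1mb (65000h, 66000h) are untouched, while 30f000h and 310000h both alias to addresses 1mb lower, which matches the pattern of failures reported on the Sempron.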
revolution 19 Aug 2008, 16:23
Glad you could finally solve your problem.
lazer1 19 Aug 2008, 16:49
f0dder wrote: Have you verified that Gate A20 is set correctly?

That has done the trick! Everything is functioning now with the Sempron, not just "mov cr3, reg" with 30f000h but also the later page fault, all the way to the code that was originally meant to run.

I went via the wiki link given, and then the link from there: http://www.win.tue.nl/~aeb/linux/kbd/A20.html

I tried the BIOS function given but it still crashed (I didn't check the return value either). I then used the code fragment right at the top of the above URL, and that has REMOVED ALL problems. The fragment, modified a bit, is:

Quote:
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.