flat assembler
Message board for the users of flat assembler.

Processor fuzzing
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8349
Location: Kraków, Poland
Tomasz Grysztar 09 Sep 2017, 22:42
This is an interesting talk I found in the recommendations on YouTube: Breaking the x86 Instruction Set.
In some ways it brings back memories of the 90s for me, especially of writing code that runs differently depending on which CPU and/or mode it is executed in. I loved playing with such tricks.

I should note, however, that the DB E0 and DB E1 instructions he listed as undocumented consistently across CPUs are actually the 8087's FNENI and FNDISI. You can find them defined in 8087.INC in my fasmg package (well, fasm 1 knows them too). They were used for enabling and disabling the coprocessor's interrupts. It is probably for compatibility that they are still recognized, even though I'd guess they no longer do anything meaningful.

And F1 is the good old ICEBP, implemented by fasm as INT1.

{C0-C1}{30-37, 70-77, b0-b7, f0-f7} etc. is the family of opcodes for the SAL instruction. Intel officially defines SAL and SHL as alternative names for the same instruction, but SAL actually had a separate hidden code, which does the same thing as SHL anyway. I think I initially had these separate opcodes implemented in fasm (I was using OPCODES.LST, packaged with Ralf Brown's Interrupt List, as my reference, and it lists a separate instruction code for SAL), but later I changed it to be in line with what is officially documented. Having a separate code for SAL actually gave the system a nice symmetry, since there were then 8 shift/rotation instructions enumerated on 3 bits (0 - ROL, 1 - ROR, 2 - RCL, 3 - RCR, 4 - SHL, 5 - SHR, 6 - SAL, 7 - SAR). All the opcodes listed in the three rows in the video have the code 6 in the field that selects the shift/rotation instruction.
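To make the field layout concrete, here is a minimal Python sketch (not fasm code) that splits the second opcode byte into its mod/reg/rm fields and checks that the listed ranges are exactly the bytes with 6 in the 3-bit selector. The names `SHIFT_OPS`, `modrm_fields` and `shift_op` are made up for this illustration.

```python
# Mnemonics for the 3-bit "reg" field of the ModR/M byte in the shift/rotate
# group, in the order given in the post. Code 6 is the hidden SAL encoding.
SHIFT_OPS = ["ROL", "ROR", "RCL", "RCR", "SHL", "SHR", "SAL", "SAR"]

def modrm_fields(byte):
    """Split a ModR/M byte into its (mod, reg, rm) fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111

def shift_op(modrm):
    """Return the shift/rotate mnemonic selected by a ModR/M byte."""
    return SHIFT_OPS[modrm_fields(modrm)[1]]

# Every possible second byte whose reg field equals 6 ...
sal_bytes = [b for b in range(256) if modrm_fields(b)[1] == 6]

# ... compared with the ranges quoted in the post.
ranges = [(0x30, 0x37), (0x70, 0x77), (0xB0, 0xB7), (0xF0, 0xF7)]
expected = [b for lo, hi in ranges for b in range(lo, hi + 1)]

print(sal_bytes == expected)  # True
print(shift_op(0xF0))         # SAL
```

The four ranges correspond to the four values of the mod field (00, 01, 10, 11) combined with reg = 110.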

Finally for F6 /1 and F7 /1 we can also get a hint in the old reference:
OPCODES.LST wrote:
Code Extention # 16
(First byte(s) = F6h)
Note: Usually 001 do same thing as 000, TEST mem8,imm8
OPCODES.LST wrote:
Code Extention # 17
(First byte(s) = F7h)
Note: Usually 001 do same thing as 000, TEST mem,imm16
I'd guess this is one of the original 8086's contractions of the opcode space (two codes decoded as the same instruction) that was preserved for compatibility, perhaps because some software might have been using it?
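For anyone who wants to compare the two encodings by hand, this Python sketch builds the byte sequences for the documented F6 /0 form and its /1 twin with a register operand. The helper names (`make_modrm`, `group_f6`) are hypothetical, and whether /1 really behaves like TEST on a given CPU is only what the OPCODES.LST note says "usually" happens; the sketch does not verify it.

```python
def make_modrm(mod, reg, rm):
    """Pack the mod (2 bits), reg (3 bits) and rm (3 bits) fields into a ModR/M byte."""
    if not (0 <= mod <= 3 and 0 <= reg <= 7 and 0 <= rm <= 7):
        raise ValueError("field out of range")
    return (mod << 6) | (reg << 3) | rm

def group_f6(ext, rm, imm8):
    """Bytes for opcode F6 with extension /ext, a register operand and an 8-bit immediate."""
    return bytes([0xF6, make_modrm(0b11, ext, rm), imm8 & 0xFF])

AL = 0  # register number of AL in the rm field

documented = group_f6(0, AL, 0x01)  # F6 /0 ib: TEST AL, 1
alias = group_f6(1, AL, 0x01)       # F6 /1 ib: the undocumented twin

print(documented.hex(" "), "|", alias.hex(" "))  # f6 c0 01 | f6 c8 01
```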



Tomasz Grysztar 09 Sep 2017, 23:35
Oh, and since the F00F bug was also mentioned, I highly recommend reading Robert R. Collins's treatment of this topic from 1998. The workarounds did amaze me back in the day when I first read about them.
Furs



Joined: 04 Mar 2016
Posts: 2493
Furs 12 Sep 2017, 10:46
I can't believe some of these bugs are actually real. I mean, locking up the CPU from ring 3, seriously? Do they not do any sort of QA or what? It would be even more ridiculous if this happened again in the future, now that this guy has made his code open source and Intel/AMD can actually use it to test their products.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20292
Location: In your JS exploiting you and your system
revolution 12 Sep 2017, 10:55
CPUs are complex. They have bugs. They will continue to get more complex. So they will continue to have bugs.



Tomasz Grysztar 12 Sep 2017, 10:58
Also, the method he presented is very good for finding undocumented instructions, but it definitely cannot find all the potential hardware bugs. Even if a bug does not depend on additional context (like the contents of some GPRs or MSRs), it may still be triggered only by a very specific combination of fields in the opcode. Looking for such combinations would require an exhaustive search that is not realistically possible.
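A rough back-of-the-envelope calculation in Python shows why a naive exhaustive search is out of reach. The 15-byte limit is the architectural maximum x86 instruction length; the throughput of one billion candidates per second is purely an assumption chosen for illustration, and the function names are made up.

```python
MAX_INSN_BYTES = 15      # architectural limit on x86 instruction length
RATE_PER_SECOND = 10**9  # assumed test throughput, purely illustrative

def candidates(max_len=MAX_INSN_BYTES):
    """Number of distinct byte strings of exactly max_len bytes."""
    return 256 ** max_len

def years_to_enumerate(max_len=MAX_INSN_BYTES, rate=RATE_PER_SECOND):
    """Years needed to try every byte string of max_len bytes at the given rate."""
    seconds = candidates(max_len) / rate
    return seconds / (365.25 * 24 * 3600)

print(candidates() == 2 ** 120)  # True
print(f"{years_to_enumerate():.1e} years")
```

This counts raw byte strings only. It ignores any dependence on register or MSR contents, which would multiply the space further.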



Furs 12 Sep 2017, 12:21
revolution wrote:
CPUs are complex. They have bugs. They will continue to get more complex. So they will continue to have bugs.
That doesn't mean they need to have bugs. Testing your own instructions is a no-brainer. How difficult can it be to have a proper test-case QA for every instruction and prefix available to see it performs as you want it to? It wouldn't even need to be an exhaustive search since Intel/AMD know all of the instructions they use so they can just test them all.

Of course, instructions that lock the CPU only under very special states are a different matter. Those bugs are harder to find with simple test-cases. I'm not complaining about those kinds of bugs here, but about the obvious ones.

Having an instruction that always locks the CPU due to a bug is stupid and not excusable no matter how complex the CPU is. This should never pass the test-case.


revolution 12 Sep 2017, 13:12
Someone once told me: If you are not making mistakes then you are not making anything.

And let's not forget the Pentium FDIV bug.



Furs 13 Sep 2017, 10:43
Well, if nobody made mistakes, test-cases would be useless. However, just because you make a mistake and have bugs doesn't mean they have to be let out into the wild, and that's the problem. Testing reveals simple mistakes that can be corrected before it's too late.

Like I said, bugs that happen only under very specific states/circumstances are a different thing, since those are hard to track down or test. But something as obvious as "this instruction always locks up the CPU" is inexcusable to me...


revolution 13 Sep 2017, 11:43
People can also make mistakes in the test cases. Just because something has a test case doesn't mean the test is complete, comprehensive, accurate or working.

I'm not sure how we should punish Intel for the inexcusable mistake. Stop buying from them maybe?

The five billion dollar Hubble Space Telescope was tested and verified and sent into space with a flawed mirror.



Furs 13 Sep 2017, 14:36
He didn't disclose the identity of the processor or its maker, to protect the people using it, so I'm not sure who to blame. The FDIV and F00F bugs are a bit more excusable since they are so old. Back then, testing required (comparatively speaking) more resources.


revolution 15 Sep 2017, 21:31
Another consideration is that there can be secret modes that fuzzing like this can't find. Setting a particular set of bits in particular registers (like, say, CR1 or something) could enable the CPU to perform other functions, disable certain functions, copy data directly from memory, zero the RNG seed, or do pretty much anything. It could even be masked by generating the expected #UD exception, making the user think everything is normal.



Tomasz Grysztar 16 Sep 2017, 09:33
revolution wrote:
Another consideration is that there can be secret modes that fuzzing like this can't find.
And even the regular and well-known modes may still require attention. Long mode especially is known for redefining many instruction codes, so the results may vary substantially from 32-bit mode. As for the 16-bit modes, this method could only work for V86; testing instruction codes in real mode would be a bit more difficult. But people were doing it back in the 80s and 90s, as far as I know.
l4m2



Joined: 15 Jan 2015
Posts: 674
l4m2 15 Dec 2017, 10:37
The documentation doesn't have
Code:
mov CR0, [eax]
but when running
Code:
C:\DOCUME~1\ADMINI~1>debug
-e100 f 22 0
-t
there's no #UD (but some other fault)



Tomasz Grysztar 15 Dec 2017, 12:04
l4m2 wrote:
The documentation doesn't have
Code:
mov CR0, [eax]
but when running
Code:
C:\DOCUME~1\ADMINI~1>debug
-e100 f 22 0
-t
there's no #UD (but some other fault)
Originally this encoding was not documented, but since around 2008 the Intel manuals have stated the following in the description of the "Move to/from Control Registers" instructions:
Quote:
At the opcode level, the reg field within the ModR/M byte specifies which of the control registers is loaded or read. The 2 bits in the mod field are ignored. The r/m field specifies the general-purpose register loaded or read.
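As an illustration of the quoted rule, the following Python sketch decodes the three bytes entered in the debug session (0F 22 00) while ignoring the mod field. It is a simplified assumption-laden decoder (no prefixes, 32-bit register names only), and `decode_mov_cr` is a name invented for this example.

```python
GPR32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

def decode_mov_cr(code):
    """Decode a 3-byte 0F 20 / 0F 22 encoding, ignoring the mod field
    as the quoted manual text describes."""
    code = bytes(code)
    if len(code) != 3 or code[0] != 0x0F or code[1] not in (0x20, 0x22):
        raise ValueError("not a 0F 20 / 0F 22 encoding")
    modrm = code[2]
    reg = (modrm >> 3) & 0b111  # selects the control register
    rm = modrm & 0b111          # selects the general-purpose register
    cr, gpr = f"CR{reg}", GPR32[rm]
    # 0F 22 moves to the control register, 0F 20 moves from it.
    return f"mov {cr}, {gpr}" if code[1] == 0x22 else f"mov {gpr}, {cr}"

print(decode_mov_cr([0x0F, 0x22, 0x00]))  # mov CR0, EAX
print(decode_mov_cr([0x0F, 0x22, 0xC0]))  # mov CR0, EAX
```

Under this reading, the bytes with mod = 00 are not a memory form at all but the same register move as the mod = 11 encoding, which would explain a fault other than #UD when it is executed without sufficient privilege.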
Enko



Joined: 03 Apr 2007
Posts: 676
Location: Mar del Plata
Enko 15 Dec 2017, 21:50
There is another interesting talk from the same guy about "ring -2", or System Management Mode:

https://www.youtube.com/watch?v=lR0nh-TdpVg
Potato



Joined: 18 Apr 2020
Posts: 11
Potato 23 Apr 2020, 05:54
The Breaking the x86 Instruction Set video got recommended to me today by YouTube. Funny how the algorithm works. Did the ring 3 processor lock instruction that's discussed ever get documented? Given it's been years, I figure there must have been some movement on this by now.
Ali.Z



Joined: 08 Jan 2018
Posts: 712
Ali.Z 23 Apr 2020, 07:20
https://www.youtube.com/watch?v=_eSAF_qT_FY
_________________
Asm For Wise Humans


revolution 23 Apr 2020, 08:20
Ali.Z wrote:
https://www.youtube.com/watch?v=_eSAF_qT_FY
Good talk.



l4m2 27 Sep 2021, 00:25
According to http://www.os2museum.com/wp/undocumented-8086-opcodes-part-i/ , {D0-D3}{30-37, 70-77, b0-b7, f0-f7} was (C)SETMO(CL). It seems they didn't wire anything specifically for this opcode, and if an 80C86 exists (I'm not sure it does), it is unlikely to do the same thing.
Copyright © 1999-2024, Tomasz Grysztar.