flat assembler
Message board for the users of flat assembler.
anybody here have a computer without mmx support?
Tomasz Grysztar 02 May 2018, 17:06
I still sometimes use my Pentium 1 machine, and that one has no MMX, though nowadays I have reverted it to being DOS-only; there is no Win9x on it anymore. My most "everyday" machine is at AVX; I hope to get something with AVX-512 in the not too distant future.
rugxulo 02 May 2018, 18:01
vivik wrote: mmx is around since pentium 2, that is, since 1997.
There were late-model P1s with MMX, too. Usually (but not always) the 166 or 200 MHz ones.
Quote:
Presumably yes, mostly due to clones (e.g. Vortex86). Not everything is made by Intel, nor is everything a Core i9. But remember that you have to check that CPUID itself is available before using it! It was first introduced in (some) 486s and is always available on the Pentium; I'm not so sure about the various clones. IIRC, DOSBox (emulator) is a "fast" 486 DX2 which does support CPUID (originally I don't think the CPUID check itself worked, can't remember, but I think they have fixed that by now).
Quote:
If relying on Windows, you're basically forced to match its requirements, too. So if Win10 requires a late-model P4, then you can't go below that anyway. Similarly, MSVC (since, what, 2010?) targets SSE2 by default, so beware of similar settings. It's safe to say that most people aren't sympathetic to anything less than a P4 / Athlon64 these days (SSE2). Many aren't even sympathetic to native 32-bit cpus or OSes anymore either (only barely 32-bit userland/apps under 64-bit OSes atop 64-bit capable cpus). I don't have any AVX machines, but my semi-modern ones (desktop, laptop) all support at least SSSE3 [sic]. Basically, using anything beyond SSE2 probably needs a CPUID check. But you're right, hardware is obsoleted so quickly these days that it matters less (although, in principle, I think it's better to always check).
P.S. You need OS-level support to enable anything beyond MMX (e.g. SSE). You can only run such code once the OS has enabled it, so you have to make sure your OS supports it first. But FPU/MMX was "deprecated" in favor of SSE long ago, so it's probably not wise to rely too much on it.
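A minimal sketch (not from the original post) of the two-step check described above: first what the CPU reports, then whether the OS has enabled the extra register state. The bit positions are the documented CPUID leaf 1 and XCR0 bits; the function names `cpu_reports_sse2` and `avx_usable` are made up for illustration. The register values are passed in as plain integers, on the assumption that the caller obtained them from CPUID (for example via GCC's `__get_cpuid` from `<cpuid.h>`) and from XGETBV.

```c
#include <stdint.h>

/* Feature bits reported by CPUID leaf 1 (EAX = 1). */
#define EDX_MMX     (1u << 23)
#define EDX_SSE     (1u << 25)
#define EDX_SSE2    (1u << 26)
#define ECX_SSE3    (1u << 0)
#define ECX_SSSE3   (1u << 9)
#define ECX_OSXSAVE (1u << 27) /* OS has enabled XSAVE, so XGETBV is usable */
#define ECX_AVX     (1u << 28)

/* XCR0 bits as returned by XGETBV with ECX = 0. */
#define XCR0_SSE (1u << 1) /* OS saves/restores XMM state */
#define XCR0_AVX (1u << 2) /* OS saves/restores upper YMM state */

/* Hypothetical helper: does the CPU itself report SSE2? */
int cpu_reports_sse2(uint32_t leaf1_edx)
{
    return (leaf1_edx & EDX_SSE2) != 0;
}

/* Hypothetical helper: AVX is only safe to execute if the CPU reports it,
   the OS has set OSXSAVE, and XCR0 shows both XMM and YMM state enabled. */
int avx_usable(uint32_t leaf1_ecx, uint64_t xcr0)
{
    const uint64_t need = XCR0_SSE | XCR0_AVX;

    if ((leaf1_ecx & ECX_AVX) == 0 || (leaf1_ecx & ECX_OSXSAVE) == 0)
        return 0;
    return (xcr0 & need) == need;
}
```

The point of keeping the decoding separate from the CPUID/XGETBV calls is only that it can be exercised with fixed values; a real program would read the registers once at startup.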
vivik 02 May 2018, 19:17
Hm, so it's probably OK to remove the FPU and MMX code from libjpeg, for those algorithms that also have an SSE2 implementation. I'll assume everyone has SSE2 at least, not sure about SSE3 though.
DOS probably isn't fit for JPG images, not enough colors. Same for Win9x, I guess. Anyway, I'm sure they already have image viewers by now, no job for me. It feels like MMX and SSE(2) were made specifically for JPG and MP3, to speed up the discrete cosine transform used in both.
DimonSoft 02 May 2018, 22:12
vivik wrote: Dos probably isn't fit for jpg images, not enough colors. Same for win9x, i guess. Anyway, I'm sure they already have image viewers by now, no job for me.
It is not a matter of the OS; it's a matter of the video adapter and its supported modes.
revolution 03 May 2018, 00:13
I think the best way to check for instruction set support is to use exception handling. Using CPUID is not always the final answer about whether or not a particular instruction is available. This is because the OS also has to support the usage of instructions that use extra register sets.
So set up an exception handler (which one should be doing anyway) and simply start executing your MMX instructions. If no exception is thrown, then all is okay. Otherwise you will get an exception, and you can then restart the code on another path that computes things the non-MMX way. Note that executing CPUID itself can cause an exception. So if you don't catch it with an exception handler, or use some other detection method, then your code will crash.
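A rough sketch of the "execute it and see if it faults" idea, assuming a POSIX system where an unsupported instruction is delivered as SIGILL (on Windows the equivalent would be structured exception handling). The names `runs_without_sigill`, `probe_harmless` and `probe_faulting` are hypothetical. To keep the sketch portable, the faulting probe simply raises SIGILL itself; a real probe would execute one instruction of the class being tested, for instance via inline assembly.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

/* SIGILL handler: abandon the probe and jump back to the caller. */
static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(probe_env, 1);
}

/* Runs candidate() with a temporary SIGILL handler installed.
   Returns 1 if it completed, 0 if it raised SIGILL. */
int runs_without_sigill(void (*candidate)(void))
{
    struct sigaction sa, old;
    volatile int ok = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return 0;

    /* savemask = 1 so the signal mask is restored after the jump. */
    if (sigsetjmp(probe_env, 1) == 0) {
        candidate();
        ok = 1;
    }

    sigaction(SIGILL, &old, NULL); /* put the previous handler back */
    return ok;
}

/* Stand-ins for real instruction probes (assumptions for this sketch). */
void probe_harmless(void) { }
void probe_faulting(void) { raise(SIGILL); }
```

With these stand-ins, `runs_without_sigill(probe_harmless)` reports success and `runs_without_sigill(probe_faulting)` reports failure, and the previous SIGILL disposition is restored in both cases.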
Furs 03 May 2018, 20:20
revolution wrote: I think the best way to check for instruction set support is to use exception handling. Using CPUID is not always the final answer about whether or not a particular instruction is available. This is because the OS also has to support the usage of instructions that use extra register sets.
You should place the handler at the beginning of the program to do the checks, and then update some global variable based on the exceptions thrown (or none), not throw that exception every time you need to do some MMX stuff, as you implied.
revolution 04 May 2018, 00:40
Well I meant that the program should execute an instruction and see if it faults. Then use that information to decide which execution path to follow. Setting a global variable would probably be the easiest way to record the result.
This can be done once for each instruction class you want to use: CMOVcc, BSWAP, FPU, MMX, SSE, SSE2, AVX, AVX2, AVX512, etc. So you build up a set of flags to say what is executable and what is not executable. You would have to do this anyway if you used CPUID.
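One possible way to build that set of flags once at startup, sketched under assumptions: each instruction class gets a probe function that returns nonzero if the class is usable (how it decides, by faulting or by CPUID, is left open), and the results are collected into a bitmask. All names here (`FEAT_*`, `probe_entry`, `collect_features`, `demo_table`) are hypothetical, and the demo probes are fixed stubs rather than real checks.

```c
#include <stddef.h>

/* One flag per instruction class of interest. */
enum {
    FEAT_CMOV = 1 << 0,
    FEAT_MMX  = 1 << 1,
    FEAT_SSE2 = 1 << 2,
    FEAT_AVX  = 1 << 3
};

/* A probe returns nonzero if its instruction class can be executed. */
typedef int (*probe_fn)(void);

struct probe_entry {
    unsigned flag;
    probe_fn probe;
};

/* Runs every probe once and returns the combined flag set. */
unsigned collect_features(const struct probe_entry *table, size_t count)
{
    unsigned mask = 0;

    for (size_t i = 0; i < count; i++) {
        if (table[i].probe != NULL && table[i].probe())
            mask |= table[i].flag;
    }
    return mask;
}

/* Stub probes standing in for real checks (assumptions for this sketch). */
static int probe_yes(void) { return 1; }
static int probe_no(void)  { return 0; }

const struct probe_entry demo_table[] = {
    { FEAT_CMOV, probe_yes },
    { FEAT_MMX,  probe_no  },
    { FEAT_SSE2, probe_yes }
};
```

The resulting mask would be stored in a global and tested with `mask & FEAT_SSE2` and so on, so that no probe has to run more than once.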
Furs 04 May 2018, 11:41
Ah yeah then fully agreed, I'd do the same.
rugxulo 05 May 2018, 01:13
revolution wrote: Well I meant that the program should execute an instruction and see if it faults. Then use that information to decide which execution path to follow. Setting a global variable would probably be the easiest way to record the result.
Function pointer? That's what I'd do. I have some example code I could post about detecting such things. Though GCC (since 4.4? -mtune=native etc.) probably already has built-ins (and/or intrinsics) for it, so it's probably not necessary. See Wikipedia about CPUID for some examples. BTW, DJGPP's libc uses RDTSC in its uclock() via a signal handler (by catching SIGILL). Ironically, that source file was last updated fifteen years ago today. (Didn't RDTSC have an overflow bug in early Pentium models? I recently read that there was another bug where it wouldn't work under V86 mode, but that site seems temporarily down.)
revolution wrote:
CMOV is from 686/PPro circa 1995, so it's quite ancient. There were some clones (VIA C3?) that didn't have it, but most Linux distros went "686 only" long ago. BSWAP is 486 (1989), so I doubt anyone doesn't have that by now! An FPU is built into every Pentium (1993) and later (and already into the 486DX), so it is almost always available. MMX is deprecated and should mostly be avoided. SSE1 is P3/Athlon XP circa 2001 while SSE2 is P4/Athlon64 circa 2003. AVX is 2011, which I don't have, but I don't care either. And even AMD's Ryzen (Zen+) doesn't have AVX-512 (yet?).
Last edited by rugxulo on 06 May 2018, 01:14; edited 2 times in total
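The function-pointer approach mentioned above could look roughly like this. It is only a sketch: `sum_bytes`, `select_sum_bytes` and the two implementations are invented names, and the "fast" variant is an unrolled plain-C stand-in rather than real SSE2 code. The `have_sse2` argument is assumed to come from whatever detection the program already did (a fault probe, CPUID, or GCC's `__builtin_cpu_supports("sse2")`).

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*sum_fn)(const uint8_t *data, size_t len);

/* Baseline version that runs on any CPU. */
uint32_t sum_bytes_scalar(const uint8_t *data, size_t len)
{
    uint32_t total = 0;

    for (size_t i = 0; i < len; i++)
        total += data[i];
    return total;
}

/* Stand-in for an SSE2 version; a real one would use intrinsics or asm. */
uint32_t sum_bytes_unrolled(const uint8_t *data, size_t len)
{
    uint32_t total = 0;
    size_t i = 0;

    for (; i + 4 <= len; i += 4)
        total += (uint32_t)data[i] + data[i + 1] + data[i + 2] + data[i + 3];
    for (; i < len; i++)
        total += data[i];
    return total;
}

/* Callers always go through this pointer; it defaults to the safe path. */
sum_fn sum_bytes = sum_bytes_scalar;

/* Called once at startup with the result of the feature detection. */
void select_sum_bytes(int have_sse2)
{
    sum_bytes = have_sse2 ? sum_bytes_unrolled : sum_bytes_scalar;
}
```

Compared with testing a global flag on every call, the pointer is set once and the hot path has no branch on the feature; the cost is an indirect call.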
rugxulo 05 May 2018, 01:16
Also, I vaguely recall hearing that Google wrote their own detection library, but I don't remember where or what it's called. Maybe libcpuid?
rugxulo 05 May 2018, 01:36
BTW, here's one person's notes about such things including 186+ INT6 handling. (I'm not aware of a lot of DOS programs utilizing that, but at least shareware EMU386 did, but only for the real-mode 386 instructions.)
Later versions of Turbo Pascal for DOS/Windows had CPU detection for their own runtime routines (if 286? 386?), and you could read (or disable!) that via the "test8086" variable.
revolution 05 May 2018, 02:00
Just because it is old doesn't mean it is bad. Old is gold.
And I forgot to include 3DNow.
rugxulo 05 May 2018, 02:13
revolution wrote: And I forgot to include 3DNow.
Actually, AMD removed support for that from recent processors!
EDIT: And just to add to this (but not unnecessarily clutter with more posts), Free Pascal has some detection in its RTL, but I'm not 100% familiar with it, so I don't know all the quirks. Also see unit cpu (which is used by unit mmx).
Last edited by rugxulo on 05 May 2018, 02:25; edited 1 time in total
rugxulo 05 May 2018, 02:24
DimonSoft wrote:
DOS is irrelevant to everyone here (except me). Just for completeness, you're normally stuck with VESA 1.2/2/3, which is good enough for most things. Octavio (user here) once wrote/modified/ported a simple 486-ish JPEG viewer (320x200?) years ago in assembly. But other than that, there were many JPG viewers: ShowJPG, PictView, PacePlayer, Display/dispt5, Blocek, Lxpic, etc. (Just off the top of my head, and I'm sure there were dozens of others. EDIT: QPV/386, CompuShow)
EDIT: Just to add more details, since we're talking DOS, there are several DPMI hosts / DOS extenders that will enable SSE for you: HX's HDPMI32, Causeway (CWSTUB) 4.x, DOS/32A 9.1.2, CWSDPMI (r5 2008 or r7 2010).
Melissa 06 May 2018, 22:10
Tomasz Grysztar wrote: I still sometimes use my Pentium 1 machine and this one has no MMX, though nowadays I reverted it to be DOS-only, no Win9x there anymore. My most "everyday" machine is at AVX, I hope to get something with AVX-512 in not too distant future.
Intel still considers AVX-512 a high-end feature. They have been promising it on mainstream parts since 2015, but alas, no. I guess that sometime in 2020 AVX-512 will be common on desktop CPUs.
Melissa 06 May 2018, 22:13
vivik wrote: mmx is around since pentium 2, that is, since 1997. Is there even a point in doing cpuid check anymore?
AVX2 here (Haswell), and I am waiting for AVX-512 to become mainstream before I replace the CPU.
rugxulo 07 May 2018, 17:25
According to Wikipedia, ....
Wikipedia wrote:
Worse is that it has performance/power tradeoffs, see here.
Cloudflare wrote:
Also, AMD seems to (usually) need "-mprefer-avx128" with GCC. So, like everything else: "Should you use it?" ... "It depends".
Furs 07 May 2018, 20:12
I think AMD doesn't do "true" AVX2 and just splits each 256-bit operation into two 128-bit (SSE-width) operations.
Also, AVX-512 to me seems like a Frankenstein of bloat: too many registers/operands and the "mask registers"... oh well, I very much prefer writing "normal" vector code (if I have to, not that anyone likes writing vectorized code tbh). (Well, the encoding is bloated too, but I'm more referring to the "specification" bloat here, for humans...)
Melissa 08 May 2018, 00:26
"Also, AMD seems to (usually) need "-mprefer-avx128" with GCC."
Yeah, AMD has four 128-bit units, Intel two 256-bit ones. That means AMD can execute four 128-bit instructions, while Intel can execute only two. But Intel can execute two 256-bit FMA instructions and AMD only one.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.