flat assembler
Message board for the users of flat assembler.

anybody here have a computer without mmx support?

vivik 02 May 2018, 16:49
MMX has been around since the Pentium II, that is, since 1997. Is there even a point in doing a CPUID check anymore?

What is your highest supported SIMD instruction set? I need to know how many computers like that are left. Also, I need testers: I'd like to test my WINDOWS programs on older hardware.

Tomasz Grysztar 02 May 2018, 17:06
I still sometimes use my Pentium 1 machine and this one has no MMX, though nowadays I reverted it to be DOS-only, no Win9x there anymore. My most "everyday" machine is at AVX, I hope to get something with AVX-512 in the not too distant future.

rugxulo 02 May 2018, 18:01
vivik wrote:
MMX has been around since the Pentium II, that is, since 1997.


There were late-model P1s with MMX, too. Usually (but not always) the 166 or 200 MHz ones.

Quote:

Is there even a point in doing a CPUID check anymore?


Presumably yes, mostly due to clones (e.g. Vortex86). Not everything is made by Intel, nor is everything a Core i9.

But remember that you also have to check for CPUID availability before using it! :lol: CPUID was first introduced in (some) 486s and is always available on the Pentium; I'm not so sure about the various clones. IIRC, DOSBox (the emulator) presents a "fast" 486 DX2 that does support CPUID (originally I don't think the CPUID check itself worked there, but I think they have fixed that by now).

Quote:

What is your highest supported SIMD instruction set? I need to know how many computers like that are left. Also, I need testers: I'd like to test my WINDOWS programs on older hardware.


If relying on Windows, you're basically forced to match its requirements, too. So if Win10 requires late-model P4, then you can't go below that anyways.

Similarly, MSVC (since Visual Studio 2012, IIRC) targets SSE2 by default for 32-bit builds (/arch:SSE2), so beware of similar settings.

It's safe to say that most people aren't sympathetic to anything less than a P4 / Athlon64 these days (SSE2). Many aren't even sympathetic to native 32-bit cpus or OSes anymore either (only barely 32-bit userland/apps under 64-bit OSes atop 64-bit capable cpus).

I don't have any AVX machines, but my semi-modern ones (desktop, laptop) all support at least SSSE3 [sic].

Basically, using anything beyond SSE2 probably needs a CPUID check. But you're right, hardware is obsoleted so quickly these days that it matters less (although, in principle, I think it's better to always check).

P.S. You need OS-level support to enable anything beyond MMX (e.g. SSE), because the OS has to save and restore the extra registers. You can only run such code once the OS has enabled it, so you have to make sure your OS is supported first. But FPU/MMX was "deprecated" in favor of SSE long ago, so it's probably not wise to rely on it too much.
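
To illustrate the OS-support point for the AVX case: the AVX feature bit only says that the CPU has the instructions, while the OSXSAVE bit and the XCR0 register say whether the OS saves the wider registers. Below is a minimal C sketch, assuming the raw register values were already read (ECX from CPUID leaf 1, and XCR0 from XGETBV with ECX=0). The name avx_usable is made up for illustration. As far as I know there is no equivalent user-visible bit for plain SSE.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 only if the CPU reports AVX *and* the OS
   has enabled saving of the XMM and YMM register state.
   ecx  = ECX result of CPUID leaf 1
   xcr0 = result of XGETBV with ECX=0 (only meaningful if OSXSAVE is set) */
int avx_usable(uint32_t ecx, uint64_t xcr0)
{
    const uint32_t osxsave = (uint32_t)1 << 27;  /* OS uses XSAVE/XRSTOR   */
    const uint32_t avx     = (uint32_t)1 << 28;  /* CPU implements AVX     */
    const uint64_t xmm_ymm = 0x6;                /* XCR0 bit 1 (SSE state) */
                                                 /* and bit 2 (AVX state)  */
    if ((ecx & (osxsave | avx)) != (osxsave | avx))
        return 0;
    return (xcr0 & xmm_ymm) == xmm_ymm;
}
```

On a real machine XGETBV must only be executed after the OSXSAVE bit has been seen, otherwise it faults.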

vivik 02 May 2018, 19:17
Hm, so it's probably OK to remove the FPU and MMX code from libjpeg, for those algorithms that also have an SSE2 implementation. I'll assume everyone has SSE2 at least, not sure about SSE3 though.

Dos probably isn't fit for jpg images, not enough colors. Same for win9x, i guess. Anyway, I'm sure they already have image viewers by now, no job for me. It feels like mmx and sse(2) were made specifically for jpg and mp3, to speed up discrete cosine transform used in both.

DimonSoft 02 May 2018, 22:12
vivik wrote:
Dos probably isn't fit for jpg images, not enough colors. Same for win9x, i guess. Anyway, I'm sure they already have image viewers by now, no job for me.

It is not a matter of an OS, it’s a matter of video adapter and its supported modes.

revolution 03 May 2018, 00:13
I think the best way to check for instruction set support is to use exception handling. Using CPUID is not always the final answer about whether or not a particular instruction is available. This is because the OS also has to support the usage of instructions that use extra register sets.

So set up an exception handler (which one should be doing anyway) and just simply start executing your MMX instructions. If you don't get any exceptions thrown, then all is okay. Else you will get an exception, then you can start the code on another path to compute things the non-MMX way.

Note that executing CPUID itself can cause an exception. So if you don't catch it with an exception handler, or use some other detection method, then your code would crash.

Furs 03 May 2018, 20:20
revolution wrote:
I think the best way to check for instruction set support is to use exception handling. Using CPUID is not always the final answer about whether or not a particular instruction is available. This is because the OS also has to support the usage of instructions that use extra register sets.

So set up an exception handler (which one should be doing anyway) and just simply start executing your MMX instructions. If you don't get any exceptions thrown, then all is okay. Else you will get an exception, then you can start the code on another path to compute things the non-MMX way.

Note that executing CPUID itself can cause an exception. So if you don't catch it with an exception handler, or use some other detection method, then your code would crash.
That's terribly inefficient, the way you worded it, but I agree with the point about using exceptions.

You should place the handler at the beginning of the program to do the checks, and then update some global variable based on the exceptions thrown (or none), not throw that exception every time you need to do some MMX stuff as you implied. :?

revolution 04 May 2018, 00:40
Well I meant that the program should execute an instruction and see if it faults. Then use that information to decide which execution path to follow. Setting a global variable would probably be the easiest way to record the result.

This can be done once for each instruction class you want to use: CMOVcc, BSWAP, FPU, MMX, SSE, SSE2, AVX, AVX2, AVX512, etc. So you build up a set of flags to say what is executable and what is not executable. You would have to do this anyway if you used CPUID.
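
For comparison, here is roughly what building the same kind of flag set from CPUID could look like. This is a minimal C sketch, assuming the EDX and ECX values of CPUID leaf 1 have already been read. The FEAT_* names and decode_leaf1 are made up for illustration. These bits only say what the CPU implements, not whether the OS has enabled it.

```c
#include <stdint.h>

/* Illustrative flag set, filled in once at startup. */
enum {
    FEAT_FPU   = 1u << 0,
    FEAT_CMOV  = 1u << 1,
    FEAT_MMX   = 1u << 2,
    FEAT_SSE   = 1u << 3,
    FEAT_SSE2  = 1u << 4,
    FEAT_SSE3  = 1u << 5,
    FEAT_SSSE3 = 1u << 6
};

/* Decode the EDX/ECX results of CPUID leaf 1 into the flags above. */
unsigned decode_leaf1(uint32_t edx, uint32_t ecx)
{
    unsigned f = 0;
    if (edx & ((uint32_t)1 << 0))  f |= FEAT_FPU;    /* EDX bit 0  */
    if (edx & ((uint32_t)1 << 15)) f |= FEAT_CMOV;   /* EDX bit 15 */
    if (edx & ((uint32_t)1 << 23)) f |= FEAT_MMX;    /* EDX bit 23 */
    if (edx & ((uint32_t)1 << 25)) f |= FEAT_SSE;    /* EDX bit 25 */
    if (edx & ((uint32_t)1 << 26)) f |= FEAT_SSE2;   /* EDX bit 26 */
    if (ecx & ((uint32_t)1 << 0))  f |= FEAT_SSE3;   /* ECX bit 0  */
    if (ecx & ((uint32_t)1 << 9))  f |= FEAT_SSSE3;  /* ECX bit 9  */
    return f;
}
```

The rest of the program would then test something like (flags & FEAT_SSE2) instead of re-running the detection.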

Furs 04 May 2018, 11:41
Ah yeah then fully agreed, I'd do the same.

rugxulo 05 May 2018, 01:13
revolution wrote:
Well I meant that the program should execute an instruction and see if it faults. Then use that information to decide which execution path to follow. Setting a global variable would probably be the easiest way to record the result.


Function pointer? That's what I'd do.
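
A minimal C sketch of the function-pointer approach, assuming a flag such as have_sse2 was already determined once at startup. All names here (sum_fn, sum_scalar, sum_unrolled, pick_sum) are hypothetical, and sum_unrolled is plain C standing in for a real SSE2 routine.

```c
#include <stddef.h>

typedef long (*sum_fn)(const int *, size_t);

/* Baseline path: works on any CPU. */
long sum_scalar(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Stand-in for the "fast" path; a real one would use SSE2 code. */
long sum_unrolled(const int *v, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; i++)          /* leftover elements */
        s0 += v[i];
    return s0 + s1 + s2 + s3;
}

/* Called once after detection; the result is stored and reused. */
sum_fn pick_sum(int have_sse2)
{
    return have_sse2 ? sum_unrolled : sum_scalar;
}
```

The point of the pointer is that the hot code calls through it directly, so there is no feature test on every call.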

I have some example code I could post about detecting such things. Though GCC probably has built-ins for it already (e.g. __builtin_cpu_supports, since 4.8 IIRC) and/or intrinsics, so it's probably not necessary. See Wikipedia about CPUID for some examples.

BTW, DJGPP's libc uses RDTSC in its uclock() via signal handler (by catching SIGILL). Ironically, that source file was last updated fifteen years ago today.
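
The probe-once idea from earlier in the thread can be sketched with a POSIX-style SIGILL handler (on Windows the equivalent would be structured exception handling, not shown here). This is only a sketch and not DJGPP's actual code: probe, works and faults are hypothetical names, and faults simply raises SIGILL to stand in for an unsupported instruction.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

static void on_sigill(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);       /* jump back into probe() */
}

/* Runs fn() once. Returns 1 if it completed, 0 if it raised SIGILL. */
int probe(void (*fn)(void))
{
    struct sigaction sa, old;
    int ok;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGILL, &sa, &old);   /* install handler, remember old one */

    if (sigsetjmp(probe_env, 1) == 0) {
        fn();                       /* would contain the test instruction */
        ok = 1;
    } else {
        ok = 0;                     /* we got here via the handler */
    }

    sigaction(SIGILL, &old, NULL);  /* restore the previous handler */
    return ok;
}

/* Stand-ins for real instruction snippets. */
void works(void)  { }
void faults(void) { raise(SIGILL); }
```

The result of each probe would be stored in a global flag (or used to pick a function pointer) so that the fault is taken at most once per instruction class.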

(Didn't RDTSC have an overflow bug in early Pentium models? I recently read that there was another bug where it wouldn't work under V86 mode, but that site seems temporarily down.)

revolution wrote:

This can be done once for each instruction class you want to use: CMOVcc, BSWAP, FPU, MMX, SSE, SSE2, AVX, AVX2, AVX512, etc. So you build up a set of flags to say what is executable and what is not executable. You would have to do this anyway if you used CPUID.


CMOV is from the 686/PPro circa 1995, so it's quite ancient. There were some clones (VIA C3?) that didn't have it, but most Linux distros went "686 only" long ago. BSWAP is 486 (1989), so I doubt anyone doesn't have that by now! An FPU is built into every Pentium (1993) and later (and already into the 486DX), so it is almost always available. MMX is deprecated and should mostly be avoided. SSE1 is P3 (1999) / Athlon XP (2001), while SSE2 is P4 (2000) / Athlon64 (2003). AVX is 2011, which I don't have, but I don't care either. And even AMD's Ryzen (Zen+) doesn't have AVX-512 (yet?).



rugxulo 05 May 2018, 01:16
Also, I vaguely recall hearing that Google wrote their own detection library. I think it's called cpu_features (libcpuid is a separate, unrelated project).

rugxulo 05 May 2018, 01:36
BTW, here's one person's notes about such things including 186+ INT6 handling. (I'm not aware of a lot of DOS programs utilizing that, but at least shareware EMU386 did, but only for the real-mode 386 instructions.)

Later versions of Turbo Pascal for DOS/Windows had cpu detection for its own runtime routines (if 286? 386?), and you could read (or disable!) that via the "test8086" variable.

revolution 05 May 2018, 02:00
Just because it is old doesn't mean it is bad. Old is gold. ;)

And I forgot to include 3DNow.

rugxulo 05 May 2018, 02:13
revolution wrote:
And I forgot to include 3DNow.


Actually, AMD removed support of that from recent processors!

EDIT: And just to add to this (but not unnecessarily clutter with more posts), Free Pascal has some detection in its RTL, but I'm not 100% familiar with it, so I don't know all the quirks. Also see unit cpu (which is used by unit mmx).



rugxulo 05 May 2018, 02:24
DimonSoft wrote:
vivik wrote:
Dos probably isn't fit for jpg images, not enough colors. Same for win9x, i guess. Anyway, I'm sure they already have image viewers by now, no job for me.

It is not a matter of an OS, it’s a matter of video adapter and its supported modes.


DOS is irrelevant to everyone here (except me). :P

Just for completeness, you're normally stuck with VESA 1.2/2/3, which is good enough for most things.

Octavio (user here) once wrote/modified/ported a simple 486-ish JPEG viewer (320x200?) years ago in assembly. But other than that, there were many JPG viewers: ShowJPG, PictView, PacePlayer, Display/dispt5, Blocek, Lxpic, etc. (Just off the top of my head, and I'm sure there were dozens of others. EDIT: QPV/386, CompuShow)

EDIT: Just to add more details, since we're talking DOS, there are several DPMI hosts / DOS extenders that will enable SSE for you: HX's HDPMI32, Causeway (CWSTUB) 4.x, DOS/32A 9.1.2, CWSDPMI (r5 2008 or r7 2010).

Melissa 06 May 2018, 22:10
Tomasz Grysztar wrote:
I still sometimes use my Pentium 1 machine and this one has no MMX, though nowadays I reverted it to be DOS-only, no Win9x there anymore. My most "everyday" machine is at AVX, I hope to get something with AVX-512 in the not too distant future.


Intel still considers AVX-512 a high-end feature. They have been promising it for mainstream parts since 2015, but alas, no.
I guess that sometime in 2020 AVX-512 will be common on desktop CPUs.

Melissa 06 May 2018, 22:13
vivik wrote:
MMX has been around since the Pentium II, that is, since 1997. Is there even a point in doing a CPUID check anymore?

What is your highest supported SIMD instruction set? I need to know how many computers like that are left. Also, I need testers: I'd like to test my WINDOWS programs on older hardware.


AVX2 (Haswell), and I'm waiting for AVX-512 to become mainstream before I replace the CPU.

rugxulo 07 May 2018, 17:25
According to Wikipedia, ....

Wikipedia wrote:

AVX-512 consists of multiple extensions that are not all meant to be supported by all processors implementing them
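
In practice that means testing for the specific subsets you need rather than for "AVX-512" as a whole. Below is a minimal C sketch, assuming the EBX value of CPUID leaf 7 (subleaf 0) has already been read. The function has_avx512_subsets and the macro names are made up for illustration, and only a few of the subsets are listed. The OS must also have enabled the opmask/ZMM register state, which this sketch does not check.

```c
#include <stdint.h>

/* A few AVX-512 subset bits in EBX of CPUID leaf 7, subleaf 0. */
#define AVX512_F  ((uint32_t)1 << 16)   /* Foundation            */
#define AVX512_DQ ((uint32_t)1 << 17)   /* Doubleword/Quadword   */
#define AVX512_CD ((uint32_t)1 << 28)   /* Conflict Detection    */
#define AVX512_BW ((uint32_t)1 << 30)   /* Byte/Word             */
#define AVX512_VL ((uint32_t)1 << 31)   /* Vector Length (128/256-bit forms) */

/* Returns 1 only if every subset in 'wanted' is reported in 'ebx7'. */
int has_avx512_subsets(uint32_t ebx7, uint32_t wanted)
{
    return (ebx7 & wanted) == wanted;
}
```

For example, code that uses byte/word operations on 256-bit registers would ask for AVX512_F | AVX512_BW | AVX512_VL and fall back to AVX2 if the check fails.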


Worse is that it has performance/power tradeoffs, see here.

Cloudflare wrote:

If you do not require AVX-512 for some specific high performance tasks, I suggest you disable AVX-512 execution on your server or desktop, to avoid accidental AVX-512 throttling.


Also, AMD seems to (usually) need "-mprefer-avx128" with GCC.

So, like everything else: "Should you use it?" ... "It depends".

Furs 07 May 2018, 20:12
I think AMD doesn't do "true" AVX2 and just splits each 256-bit operation into two 128-bit halves.

Also, AVX-512 to me seems like a frankenstein of bloat: too many registers/operands and the "mask registers"... oh well :? I very much prefer writing "normal" vector code (if I have to, not that anyone likes writing vectorized code tbh).

(well encoding is bloated, but I'm more referring to the "specification" bloat here, for humans...)

Melissa 08 May 2018, 00:26
"Also, AMD seems to (usually) need "-mprefer-avx128" with GCC. "

Yeah, AMD has four 128-bit units, Intel has two 256-bit ones. That means AMD can execute four 128-bit instructions per cycle, while Intel can execute only two. But Intel can execute two 256-bit FMA instructions per cycle and AMD only one.

Copyright © 1999-2024, Tomasz Grysztar.