flat assembler
Message board for the users of flat assembler.
LocoDelAssembly 10 Nov 2006, 16:07
LocoDelAssembly wrote: smiddy, do you mean http://softpixel.com/~cwright/programming/simd/cpuid.php ? I think that site assumes the code runs under an OS that supports SSE, so doing what you suggest in boot code may not work so well. Also, the OS might not support SSE, and as a user-mode program you can't modify CR4, so you have to use some tricks to be sure you won't get a #UD exception even if the processor supports SSE. Look at smiddy's posts.
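The difference between "the processor supports SSE" and "the OS has enabled SSE" can be sketched in C. The bit positions below (CPUID leaf 1, EDX bit 24 = FXSR, bit 25 = SSE, bit 26 = SSE2) are the documented ones; the helper names are made up for illustration, and `__get_cpuid` from `<cpuid.h>` is assumed to be available (GCC or Clang targeting x86). This is only a minimal sketch of the processor-side check, not the code from the linked page.

```c
#include <stdint.h>

#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
#include <cpuid.h>   /* GCC/Clang helper header; an assumption about the toolchain */
#define HAVE_CPUID_H 1
#endif

/* CPUID leaf 1, EDX feature bits. */
#define CPUID_EDX_FXSR  (1u << 24)  /* FXSAVE/FXRSTOR supported */
#define CPUID_EDX_SSE   (1u << 25)  /* SSE supported by the processor */
#define CPUID_EDX_SSE2  (1u << 26)  /* SSE2 supported by the processor */

/* Hypothetical helpers: decode an EDX value returned by CPUID leaf 1. */
int cpu_has_sse(uint32_t edx)  { return (edx & CPUID_EDX_SSE)  != 0; }
int cpu_has_sse2(uint32_t edx) { return (edx & CPUID_EDX_SSE2) != 0; }

/* Returns EDX of CPUID leaf 1, or 0 when CPUID cannot be queried here. */
uint32_t read_cpuid_leaf1_edx(void)
{
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return edx;
#endif
    return 0;
}
```

A caller would pass `read_cpuid_leaf1_edx()` to `cpu_has_sse()`. A set bit only means the processor implements SSE. Whether an SSE instruction then raises #UD still depends on the OS having set CR4.OSFXSR, which a user-mode program cannot change.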
Mark Larson 10 Nov 2006, 16:19
smiddy wrote: I think all you need to do is:

I've been doing SSE/SSE2 programming for many years. You do not have to init the FPU. I posted this a while ago on c.l.a.x when someone was asking about SSE/SSE2 programs.

I have an optimization website that goes over some basic tricks to speed up code with SSE/SSE2 (along with other tricks): http://www.mark.masmcode.com/

P4s and up on the Intel side run SSE/SSE2 code very fast, and I've used that advantage a lot to make code run extremely fast.

Converting a string to a qword using SSE2: http://www.oldboard.assemblercode.com/index.php?topic=4253.msg28940#m...
SSE2 quaternion multiply: http://www.oldboard.assemblercode.com/index.php?topic=3469.0
Mersenne Twister random number generator in SSE2: http://www.oldboard.assemblercode.com/index.php?topic=3565.0

My account on masmforum got messed up (all these links are for masmforum), so some messages will say they are from hutch-- instead of marklarson. The way you can tell it's really me is that it'll say "guest" under "hutch--".

Counting the number of lines in a file using SSE2: http://www.oldboard.assemblercode.com/index.php?topic=2692.msg18800#m...
String copy using SSE2: http://www.oldboard.assemblercode.com/index.php?topic=2632.msg18047#m...
Computing MD5 using SSE2: http://www.oldboard.assemblercode.com/index.php?topic=2921.0

I am working on a raytracer that I haven't finished yet. You can use scalar SSE code just like FP code (you don't do stuff in parallel; each operation works on a single floating-point value). Scalar code is faster on a P4 (not sure about AMD). http://www.masm32.com/board/index.php?topic=1140.0

Line counting again, but this time I have 2 different versions using 2 different algorithms. If you scroll down, the second one posted is done in a non-intuitive manner. http://www.masm32.com/board/index.php?topic=5434.msg40666#msg40666

_________________
BIOS programmers do it fastest!
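The scalar versus packed distinction mentioned in the post can be illustrated with a small C sketch. It uses the `_mm_add_ss` (ADDSS) and `_mm_add_ps` (ADDPS) intrinsics from `<xmmintrin.h>`; the function names `add_scalar` and `add_packed` are hypothetical, and a plain C fallback with the same results is included for compilers where SSE is not enabled. It says nothing about relative speed on any particular processor.

```c
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics */
#endif

/* Scalar add: only element 0 is summed; elements 1..3 are copied from a. */
void add_scalar(const float a[4], const float b[4], float out[4])
{
#if defined(__SSE__)
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ss(va, vb));   /* ADDSS */
#else
    out[0] = a[0] + b[0];
    out[1] = a[1];
    out[2] = a[2];
    out[3] = a[3];
#endif
}

/* Packed add: all four elements are summed in one operation. */
void add_packed(const float a[4], const float b[4], float out[4])
{
#if defined(__SSE__)
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));   /* ADDPS */
#else
    for (int i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

With a = {1, 2, 3, 4} and b = {10, 20, 30, 40}, `add_scalar` yields {11, 2, 3, 4} and `add_packed` yields {11, 22, 33, 44}.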
Mark Larson 10 Nov 2006, 16:23
Madis731 wrote:
This isn't entirely correct. Windows XP (and I think 2000) automatically turns on SSE2 support, and Linux does the same. SSE2 has to be enabled or it won't work; it's not the same as the FPU or ALU registers. I wrote some code in the BIOS to do a memory test using SSE2. I had forgotten to turn on SSE2 support, and it didn't work. Once I flipped the magic switch it worked great. Older Windows OSes won't turn it on, because they don't know about it. Same with DOS. So if you have one of those, you will have to turn it on manually if you want to use it. I am not sure when it first got turned on under Windows, so Win98 might support it.
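The post does not say which switch was flipped. As a sketch, assuming it refers to the control-register bits documented for enabling SSE (CR4.OSFXSR and CR4.OSXMMEXCPT set, CR0.EM cleared, CR0.MP set), the new register values can be computed as below. The function names are hypothetical, and the code only computes values: actually reading and writing CR0 and CR4 requires ring 0, so that part appears as a comment.

```c
#include <stdint.h>

#define CR0_MP          (1u << 1)   /* monitor coprocessor */
#define CR0_EM          (1u << 2)   /* x87 emulation; must be 0 for SSE */
#define CR4_OSFXSR      (1u << 9)   /* OS uses FXSAVE/FXRSTOR; enables SSE */
#define CR4_OSXMMEXCPT  (1u << 10)  /* OS handles SIMD floating-point exceptions */

/* New CR0 value: clear EM, set MP, leave every other bit unchanged. */
uint32_t cr0_for_sse(uint32_t cr0)
{
    return (cr0 & ~CR0_EM) | CR0_MP;
}

/* New CR4 value: set OSFXSR and OSXMMEXCPT, leave every other bit unchanged. */
uint32_t cr4_for_sse(uint32_t cr4)
{
    return cr4 | CR4_OSFXSR | CR4_OSXMMEXCPT;
}

/*
 * In ring 0 (boot code, BIOS, kernel) the sequence would be roughly:
 *     read CR0 -> cr0_for_sse() -> write CR0
 *     read CR4 -> cr4_for_sse() -> write CR4
 * The MOV to/from control registers is privileged, which is why a
 * user-mode program has to rely on the OS having done this already.
 */
```

For example, `cr4_for_sse(0)` gives 0x600 (bits 9 and 10 set), and `cr0_for_sse(0x11)` gives 0x13.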
smiddy 10 Nov 2006, 17:13
Mark, awesome post! Thanks for the wealth of information. Once I get the opportunity to look at all these links I will.
rugxulo 10 Nov 2006, 18:58
The latest stable version of OpenWatcom is 1.6.
Copyright © 1999-2023, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.