flat assembler
Message board for the users of flat assembler.

Vector libraries for C++ on FASM in x86-64 Linux

jackblack



Joined: 04 Feb 2011
Posts: 12
jackblack 23 Aug 2012, 20:16
I recently started a project to develop SIMD libraries for C++, written in FASM, for x86-64 Linux.
I would be glad to hear any opinions or feedback about the project, the cleanliness of the code, and the documentation. Here is the project's web site on SourceForge.
Biterider



Joined: 10 Apr 2005
Posts: 5
Location: Switzerland
Biterider 24 Aug 2012, 06:36
Hi,
I'm impressed. I started a similar library for a signal analyser project many years ago but never got so far. Congratulations, it is really good work.

Biterider

_________________
ObjAsm32
typedef



Joined: 25 Jul 2010
Posts: 2909
Location: 0x77760000
typedef 24 Aug 2012, 09:51
Nice. Seems like the math library was your favorite, huh?
jackblack



Joined: 04 Feb 2011
Posts: 12
jackblack 24 Aug 2012, 18:06
Not only the math, but the math lib is the right place for optimization. My libraries were developed mostly for data analysis, and for some fun for myself :)
typedef



Joined: 25 Jul 2010
Posts: 2909
Location: 0x77760000
typedef 24 Aug 2012, 21:51
They seem to fit too.

I like the ones that have to do with numbers. By the way, have you thought about a graphics library? You could reuse the same libraries (math, Angle, statistics).

That would also be a nice way of showcasing what these libraries can do. But I guess that part is left for someone else, ha ha.
jackblack



Joined: 04 Feb 2011
Posts: 12
jackblack 25 Aug 2012, 10:32
>By the way did you think of a graphics library. You could use the same libraries (math, Angle, statistics)

It's a good idea, but it has already been done by someone.
There is a project called libjpeg-turbo: http://libjpeg-turbo.virtualgl.org/. It is now part of the Arch Linux release, so this job is already done.

>That would also be a nice way of showcasing what these libraries can do.
This is a good idea. I will think about it.
jackblack



Joined: 04 Feb 2011
Posts: 12
jackblack 25 Aug 2012, 10:35
>I myself started a similar library for a signal analyser project many years before but never went so far.
If you have any fresh ideas for my project, I would be glad of any kind of help.
jackblack



Joined: 04 Feb 2011
Posts: 12
jackblack 25 Aug 2012, 11:18
By the way, here is a speed test of the Numbers conversion library.
Code:
################################################################################
#       Numbers conversion library speed test                                  #
################################################################################
This test converts 1000000 numbers in 100 rounds.

Integer numbers conversion:
===========================

Octal numbers conversion:
    'sscanf' time: 36.097842 sec
    'strtoul' time: 9.514678 sec
    'LinAsm' time: 6.387945 sec

Hexadecimal numbers conversion:
    'sscanf' time: 34.354336 sec
    'strtoul' time: 10.910184 sec
    'LinAsm' time: 9.501984 sec

Decimal numbers conversion:
    'sscanf' time: 35.352888 sec
    'strtoul' time: 9.828523 sec
    'LinAsm' time: 7.790533 sec

Floating-point numbers conversion:
==================================

Hexadecimal numbers conversion:
    'sscanf' time: 52.327102 sec
    'strtod' time: 20.347512 sec
    'LinAsm' time: 12.780951 sec

Decimal numbers conversion:
    'sscanf' time: 54.220025 sec
    'strtod' time: 30.267146 sec
    'LinAsm' time: 7.706974 sec    

It was very interesting to compare the floating-point variants with the "strtod" function from the GNU C library. By the figures above, LinAsm is about 1.6 times faster on hexadecimal input and about 3.9 times faster on decimal input.

P.S. All tests were run on an overclocked Pentium 4.
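
For readers who want to reproduce the libc half of this comparison, a minimal sketch of such a timing loop might look like the one below. It covers only `sscanf` and `strtoul` on decimal input. The helper names (`parse_sscanf`, `parse_strtoul`, `time_parser`) are illustrative assumptions; they are not part of LinAsm or of the original test program, whose source is not shown in this post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Parse one decimal string with sscanf; returns 0 if nothing was parsed. */
static unsigned long parse_sscanf(const char *s)
{
    unsigned long value = 0;
    if (sscanf(s, "%lu", &value) != 1)
        return 0;
    return value;
}

/* Parse one decimal string with strtoul (base 10). */
static unsigned long parse_strtoul(const char *s)
{
    return strtoul(s, NULL, 10);
}

/*
 * Run `rounds` passes over `count` strings with the given parser.
 * Returns the elapsed CPU time in seconds as measured by clock().
 * The sum of all parsed values is stored in *checksum so the compiler
 * cannot discard the parsing work.
 */
static double time_parser(unsigned long (*parse)(const char *),
                          const char *const *strs, size_t count,
                          int rounds, unsigned long *checksum)
{
    assert(parse != NULL && strs != NULL && checksum != NULL);

    unsigned long sum = 0;
    clock_t start = clock();
    for (int r = 0; r < rounds; r++)
        for (size_t i = 0; i < count; i++)
            sum += parse(strs[i]);
    clock_t end = clock();

    *checksum = sum;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_parser(parse_sscanf, strs, count, 100, &sum)` and then the same with `parse_strtoul` over an array of 1000000 strings would mirror the "1000000 numbers in 100 rounds" setup. The absolute times depend heavily on the CPU, compiler flags, and libc version, so only the ratios are worth comparing.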
TmX



Joined: 02 Mar 2006
Posts: 841
Location: Jakarta, Indonesia
TmX 27 Aug 2012, 10:30
Very nice!
BTW, do you plan to support 32-bit OSes, at least Linux... or Windows?

:)
jackblack



Joined: 04 Feb 2011
Posts: 12
jackblack 28 Aug 2012, 13:54
No, it is not in my plans. 32-bit OSes lose popularity from year to year. All current CPUs are 64-bit, so many users install a 64-bit OS on them, and 64-bit code is becoming the dominant trend in software development. Why support platforms that will be legacy in a few years?

But I think it is a good idea to add support for the C language besides C++, instead of 32-bit OSes. So I added C prototypes to the libraries (release v0.97). They can now be used in both C and C++ programs.
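
As a rough illustration of how one header can serve both C and C++ callers, here is a minimal sketch of the usual `extern "C"` guard. The function `demo_sum_u32` is a hypothetical name invented for this example and is not taken from the LinAsm headers. The C body is only a stand-in so the sketch links on its own; in a library like this the body would be an assembly routine exporting the same unmangled symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* With a C++ compiler, disable name mangling so the symbol matches the
 * plain name exported by the assembly (or C) object file. */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical prototype: sum of `count` unsigned 32-bit values. */
uint64_t demo_sum_u32(const uint32_t *data, size_t count);

#ifdef __cplusplus
}
#endif

/* Stand-in implementation in C (assumption: the real routine would be
 * written in assembly and follow the System V AMD64 calling convention). */
uint64_t demo_sum_u32(const uint32_t *data, size_t count)
{
    assert(data != NULL || count == 0);

    uint64_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += data[i];
    return sum;
}
```

The same prototype compiles unchanged in a C translation unit, where the `extern "C"` lines are skipped by the preprocessor, and in a C++ one, where they keep the linker name unmangled.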
randall



Joined: 03 Dec 2011
Posts: 155
Location: Poland
randall 28 Aug 2012, 14:18
Windows 9 will be 64 bit only. There won't be 32 bit version. I am afraid fasm will not work there...
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 28 Aug 2012, 15:45
randall wrote:
Windows 9 will be 64 bit only. There won't be 32 bit version. I am afraid fasm will not work there...
Says who?

And even if they don't do a 32bit version, I'd be very surprised if they didn't include WoW64.

_________________
Image - carpe noctem
randall



Joined: 03 Dec 2011
Posts: 155
Location: Poland
randall 28 Aug 2012, 16:11
"Microsoft is planning to drop their 32-bit flavor of Windows beginning with the next release, Windows 9. Microsoft has already shared that Windows 8 will be their last 32-bit release and then Windows 9 will only support "x64" when it comes to the x86 architecture."

From www.phoronix.com.
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 28 Aug 2012, 16:23
You could have linked to the article in question (ugh, those phoronix URLs are horrible!), which links to a mailing list post that links to some third-party article. Yeah, sure, there might be some official Microsoft statement somewhere, but the article certainly reads as hearsay. (And I'm too lazy to dig any deeper into this at the moment :)

It would make sense to drop the 32-bit support, though... but this doesn't mean WoW64 would be dropped (even if that hearsay article implies it would). I expect we'll have 32-bit legacy programs for a whole lot longer than we had 16-bit legacy... most software works just fine as 32-bit, and isn't going to benefit (heck, might even be ever so slightly penalized) from a 64-bit port.

There might be a remote possibility of WoW64 being ripped out and delegated to "XP Mode", but that would be rather whacky.
jackblack



Joined: 04 Feb 2011
Posts: 12
jackblack 28 Aug 2012, 21:00
Anyway, 32-bit platforms are becoming less and less popular, whether through WoW64 or native support. LinAsm won't support them because tons of patches would have to be applied to the code, and we would be trying to catch a train that has already gone and won't come back.
jackblack



Joined: 04 Feb 2011
Posts: 12
jackblack 29 Aug 2012, 20:23
For people who would like to see the library's speed, I made a simple performance testing page with charts. It compares LinAsm algorithms with their GNU libc analogs.
Here is the link to the LinAsm performance tests:

http://linasm.sourceforge.net/about/performance.php
r22



Joined: 27 Dec 2004
Posts: 805
r22 30 Aug 2012, 19:23
@jackblack - nice work. Taking the time to implement and test all those functions is an admirable feat.

I'm a big fan of radix sort
http://board.flatassembler.net/topic.php?t=5081
http://stackoverflow.com/questions/8082425/fastest-way-to-sort-32bit-signed-integer-arrays-in-javascript

You should use ALIGN for your functions and NOP padding for your loop labels. This will give you a noticeable speed boost.

Code:
align 16
Function_Name:
...
    

Code:
...
AMDPad16
.loop_label:
...
jmp .loop_label
    

AMDPad16 is a macro I made that uses the optimal combination and pattern of NOP opcodes.
http://board.flatassembler.net/topic.php?t=4445

edit: Proper code alignment is the lowest of low hanging fruit for assembly level optimization.
jackblack



Joined: 04 Feb 2011
Posts: 12
jackblack 30 Aug 2012, 19:53
Good idea. I will try it and compare the results. Do you have any idea how much of a performance improvement it can give? Just wondering.
r22



Joined: 27 Dec 2004
Posts: 805
r22 31 Aug 2012, 15:49
5-10% depending on the processor.
jackblack



Joined: 04 Feb 2011
Posts: 12
jackblack 02 Sep 2012, 18:38
Hi, r22
I did some testing with unaligned and aligned code.
It seems this is not so relevant for modern CPUs; they run at the same speed in both cases.
'LinAsm' time (aligned code): 7.056060 sec
'LinAsm' time (unaligned code): 7.050908 sec

It looks like the speed gain from aligned code shows up only on much older CPUs: PIII, Pentium MMX, and so on.

I used a Pentium 4 NetBurst overclocked to 3.7 GHz for the testing.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.