flat assembler
Message board for the users of flat assembler.
Main > Fastest Memory Copying Algorithms
MCD 29 Nov 2004, 17:13
Quote:
Sorry, you must understand: I was in a great hurry when writing this, because I'm never online at home, only in some kind of Internet cafe with time-limited access. But I've fixed them now, and I hope there aren't any bugs left! In the first part, I forgot to change esi to edi in the target movntps, and in the second TSC example for DOS, I forgot to change some register from my abbreviation to the regular register name!
_________________
MCD - the inevitable return of the Mad Computer Doggy
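For readers without the corrected listing at hand, here is a minimal C sketch of the pattern that fix concerns: loads read through the source pointer (the esi role) and the non-temporal store writes through the destination pointer (the edi role). It is not MCD's code. The name `stream_copy`, the use of an aligned load, and the `memcpy` fallback are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper (not from the thread): copies n bytes with
   streaming stores when both pointers are 16-byte aligned and n is a
   multiple of 16; otherwise it falls back to plain memcpy. */
static void *stream_copy(void *dst, const void *src, size_t n)
{
#if defined(__SSE__)
    if ((((uintptr_t)dst | (uintptr_t)src | n) & 15) == 0) {
        const float *s = src;   /* source pointer: the esi role */
        float *d = dst;         /* destination pointer: the edi role */
        for (size_t i = 0; i < n / sizeof(float); i += 4) {
            __m128 v = _mm_load_ps(s + i);  /* aligned load from the source */
            _mm_stream_ps(d + i, v);        /* movntps-style store to the target */
        }
        _mm_sfence();  /* order the streaming stores before later stores */
        return dst;
    }
#endif
    return memcpy(dst, src, n);
}
```

Writing the streaming store through the source pointer instead of the destination would overwrite the input and leave the target untouched, which is the kind of slip the correction above describes.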
MCD 29 Nov 2004, 17:17
And yes, I read most, but not all, of the code you posted previously.
vid 15 Feb 2006, 15:27
Also look here for some memcopy algos:
http://board.flatassembler.net/topic.php?t=4467
r22 16 Feb 2006, 02:17
Single-case algorithms aren't the best way to measure memcopy speed, unless you're building the algo for a custom task where, say, the data will ALWAYS be 16-byte aligned or the data size will ALWAYS be a multiple of 8.
For 32-bit systems, the code doesn't know (without some overhead) whether the CPU has SSE extensions. In the same sense, the code doesn't know (without overhead) whether the source, the destination, or both are aligned or unaligned. Here's the ASM of WinXP SP2's memmove API in ntdll.dll. I left the disassembly addressing in (you can delete it with fasm's IDE block select). On Win XP 64, the 64-bit memcopy is by far the fastest for random-alignment, random-size copying. But the 32-bit version doesn't seem to use any MMX or SSE, which means no prefetching, so it'll probably only be faster for unaligned copies of less than, say, 4096 bytes.
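To make the overhead argument concrete, here is a minimal C sketch of the runtime checks a general-purpose copy has to perform before it can pick a fast path. This is not the ntdll.dll routine. The names `choose_path` and `dispatch_copy`, the three paths, and the thresholds are all hypothetical, and CPU feature detection (the SSE check) is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical copy paths, ordered from most general to most specialized. */
enum copy_path { PATH_BYTES, PATH_WORDS, PATH_BLOCKS16 };

/* Decide at run time which path is safe. OR-ing both addresses lets a
   single mask test cover source and destination alignment together. */
static enum copy_path choose_path(const void *dst, const void *src, size_t n)
{
    uintptr_t bits = (uintptr_t)dst | (uintptr_t)src;
    if (n >= 16 && (bits & 15) == 0) return PATH_BLOCKS16;
    if (n >= 4 && (bits & 3) == 0) return PATH_WORDS;
    return PATH_BYTES;
}

/* Copy in the chunk size the chosen path allows, then finish the tail
   byte by byte. Each fixed-size memcpy stands in for a specialized
   inner loop body. */
static void *dispatch_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t step = 1;
    switch (choose_path(dst, src, n)) {
    case PATH_BLOCKS16: step = 16; break;
    case PATH_WORDS:    step = 4;  break;
    case PATH_BYTES:    step = 1;  break;
    }
    for (; n >= step; n -= step, d += step, s += step)
        memcpy(d, s, step);
    while (n--) *d++ = *s++;
    return dst;
}
```

The branches in `choose_path` are the overhead in question: a routine written for one known case (always 16-byte aligned, always a multiple of 8) can skip them, which is why a benchmark of that single case says little about a general memcopy.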
viki 06 Jul 2006, 13:04
I have an additional question about the move32 proc. Is it OK that we execute this movsb code first? I'm afraid that if the data is aligned, this instruction will cause the following looped instructions to work on unaligned data. Am I right?
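The move32 listing is not reproduced on this page, so the following is only a C sketch of how a typical alignment prologue behaves, assuming the byte-copy count is computed from the destination address. Under that assumption the count is zero for already-aligned data, so the byte step copies nothing and the word loop stays aligned. The names `head_bytes` and `copy_align_dst` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of leading bytes to copy one at a time so that p reaches the
   next align-byte boundary (align must be a power of two).
   Returns 0 when p is already aligned. */
static size_t head_bytes(const void *p, size_t align)
{
    return (size_t)(-(uintptr_t)p & (align - 1));
}

/* Hypothetical routine: a byte loop (the movsb part) up to a 4-byte
   aligned destination, then 32-bit words (the movsd part), then the
   leftover tail bytes. */
static void *copy_align_dst(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t head = head_bytes(d, 4);
    if (head > n) head = n;          /* short copies may end in the prologue */
    n -= head;
    while (head--) *d++ = *s++;      /* runs zero times if dst is aligned */
    for (; n >= 4; n -= 4, d += 4, s += 4) {
        uint32_t w;
        memcpy(&w, s, 4);            /* the source may still be unaligned */
        memcpy(d, &w, 4);            /* the destination is aligned here */
    }
    while (n--) *d++ = *s++;         /* tail bytes */
    return dst;
}
```

If a prologue instead copied a fixed number of bytes regardless of the address, the concern would be valid: aligned input would be pushed off its boundary before the main loop.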
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.