flat assembler
Message board for the users of flat assembler.
Main > memcpy
vid 02 Jan 2008, 16:57
wow... does that code work?
are you sure about "imul ecx, sz" there?
revolution 02 Jan 2008, 17:01
Think about using 'rep movsb' or similar instructions.
packet_50071 02 Jan 2008, 17:35
of course the code works
imul ecx, sz -- yeah, I know it's only for recent processors
Edit: after following revolution's advice, is this code faster?
Code:
void memcpy2(void *dest, const void *src, unsigned long count, unsigned int sz)
{
    count = count * sz;            /* total number of bytes to copy */
    __asm {
        push ecx
        push esi
        push edi                   ; save the old values
        mov esi, [src]
        mov edi, [dest]
        mov ecx, count             ; load the new values
        cld                        ; make sure the copy runs upward
        rep movsb                  ; copy ecx bytes from [esi] to [edi]
        pop edi
        pop esi
        pop ecx                    ; restore in reverse order of the pushes
    }
}
Why are the other codes so complicated anyway? Or am I missing something?
Borsuc 02 Jan 2008, 18:05
packet_50071 wrote:
imul ecx, sz -- yeah, I know it's only for recent processors
Your second code should be faster. The others are so "complicated" because they use MMX or SSE to copy multiple bytes (more than 32 bits) in parallel. With MMX you move 64 bits (8 bytes) at a time, and with SSE 128 bits (16 bytes). Not only that, but there are other factors involved: cache, prefetch, etc. I hope you know about those? (They are related to how the processor works.)
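For illustration, the 16-bytes-at-a-time idea can be sketched in C with SSE2 intrinsics. This is only a sketch: the name `copy_sse2_sketch` is made up, the regions are assumed not to overlap, and it uses unaligned loads and stores with none of the cache or prefetch tuning mentioned above. It falls back to a plain byte loop when the compiler does not report SSE2. Note that in a hobby kernel, SSE instructions fault until the kernel has enabled them.

```c
#include <stddef.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical name. Copies n bytes from src to dst; regions must not overlap. */
void *copy_sse2_sketch(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

#if HAVE_SSE2
    /* Main loop: one unaligned 16-byte load and one 16-byte store per pass. */
    while (n >= 16) {
        __m128i chunk = _mm_loadu_si128((const __m128i *)s);
        _mm_storeu_si128((__m128i *)d, chunk);
        s += 16;
        d += 16;
        n -= 16;
    }
#endif

    /* Tail (or the whole copy when SSE2 is unavailable): one byte at a time. */
    while (n != 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

The extra code compared with a bare `rep movsb` is mostly the tail handling; real library routines add alignment and size checks on top of that.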
packet_50071 02 Jan 2008, 18:28
So using SSE is obviously much faster than mine, right?
revolution 02 Jan 2008, 18:33
The answer is 'it depends'. Sorry but it really does. There is no universal copy routine that is always best. SSE and cache blocking etc. are great if the amount of data to transfer is large, but fail miserably if the data sizes are small. rep movsd (and variants) are pretty good and simple to use but do have shortcomings with certain data sizes also.
If you can accurately profile your data copying requirements it would help to make a better judgement of what will work best for you.
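One rough way to profile copying requirements is to count how often each size range shows up before choosing a routine. The sketch below is an assumption-laden example, not a recommendation: the names `profiled_copy`, `bucket_for` and `copy_histogram` are invented, the bucket limits are arbitrary powers of two, and the standard `memcpy` stands in for whatever routine is under test.

```c
#include <stddef.h>
#include <string.h>

#define SIZE_BUCKETS 8   /* assumed bucket count, chosen for illustration */

/* Bucket 0 counts copies under 16 bytes, bucket 1 counts 16..31,
   bucket 2 counts 32..63, and so on; the last bucket takes 1024 and up. */
unsigned long copy_histogram[SIZE_BUCKETS];

/* Map a copy size to its bucket index. */
size_t bucket_for(size_t n)
{
    size_t i = 0;
    size_t limit = 16;
    while (i < SIZE_BUCKETS - 1 && n >= limit) {
        limit <<= 1;
        i++;
    }
    return i;
}

/* Record the size, then do the copy with the routine being studied. */
void *profiled_copy(void *dst, const void *src, size_t n)
{
    copy_histogram[bucket_for(n)]++;
    return memcpy(dst, src, n);
}
```

If nearly all the counts land in the first buckets, a simple routine is probably enough; if the last bucket dominates, the wider copies start to matter.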
packet_50071 02 Jan 2008, 18:42
I want something that won't fail no matter what the data size is and does the copying as fast as it can. Therefore I need to have both, right?
'data sizes are small': what would you think the small number is?
revolution 02 Jan 2008, 18:58
You want to have your cake and eat it too. Sorry, you can't have both. Just use the OS memcpy. It is pretty good for most purposes. If you have some particularly nasty edge case then you can look at optimising your own routine specifically for your needs.
packet_50071 02 Jan 2008, 19:33
LOL, I am making a simple OS myself, so I cannot use other resources.
revolution 02 Jan 2008, 19:48
Okay, can I assume you are using a modern CPU like P4 or later?
'rep movsb' is hard to beat. The CPU has some stuff in there to make it faster under certain conditions. I think for you this might be the best solution. Try not to spend too much time worrying about optimisation yet. Standard advice is to 'get it working first then optimise if necessary'. It is good advice with many experienced programmers across the years behind it.
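As a "get it working first" baseline, here is a minimal C sketch of a forward byte copy, which has the same effect as 'rep movsb' with the direction flag clear. The name `kmemcpy` is hypothetical, and the regions are assumed not to overlap. It needs nothing from an OS or C library beyond the `size_t` type.

```c
#include <stddef.h>

/* Hypothetical name. Forward byte copy, modelled on 'rep movsb':
   d plays the role of edi, s of esi, and count of ecx. */
void *kmemcpy(void *dest, const void *src, size_t count)
{
    unsigned char *d = (unsigned char *)dest;
    const unsigned char *s = (const unsigned char *)src;

    while (count != 0) {
        *d++ = *s++;   /* copy one byte and advance both pointers */
        count--;
    }
    return dest;
}
```

A routine like this can be swapped for an assembly or SSE version later without touching the callers, provided the signature stays the same.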
packet_50071 02 Jan 2008, 20:08
thx revolution - I need this function to make the screen move up
edfed 02 Jan 2008, 20:41
I'll stop you right now.
To make a screen_move_up, you need a specialised function named screen_move_up, and nothing else. Screen pixels are 1, 2, 3 or 4 bytes long. Optimising this code means making it complete, with as few calls as possible and a really short loop. And where are the two offsets? In kernel memory? In program memory? Or in some other memory?
packet_50071 03 Jan 2008, 00:17
Nah, I am following this as a guide, and I don't think it's wrong:
http://www.osdever.net/bkerndev/
revolution 03 Jan 2008, 08:59
packet_50071: for text mode I think a basic movsd would be quite adequate. it is only moving a few kB at a low update rate.
I think edfed assumed you were doing graphics mode moving. For that he is correct, it requires very different techniques, especially if you want to use the GPU to assist the CPU.
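For the text-mode case, scrolling up one line could look roughly like the C sketch below. The assumptions: a standard 80x25 text screen with one 16-bit cell (character plus attribute) per position, which comes to 4000 bytes in total; the function name `scroll_up_one_line` is made up; and the caller passes the buffer, which in a kernel would typically be the VGA text buffer at 0xB8000, plus the blank cell to fill with (for example 0x0720, a space in grey on black).

```c
#include <stddef.h>
#include <stdint.h>

#define TEXT_COLS 80
#define TEXT_ROWS 25

/* Hypothetical name. 'screen' points at TEXT_ROWS * TEXT_COLS 16-bit cells;
   'blank' is the cell value written into the new bottom row. */
void scroll_up_one_line(volatile uint16_t *screen, uint16_t blank)
{
    size_t moved = (size_t)(TEXT_ROWS - 1) * TEXT_COLS;
    size_t total = (size_t)TEXT_ROWS * TEXT_COLS;
    size_t i;

    /* Move rows 1..24 up to rows 0..23. The destination sits below the
       source in memory, so copying forward never overwrites a cell
       before it has been read. */
    for (i = 0; i < moved; i++) {
        screen[i] = screen[i + TEXT_COLS];
    }

    /* Blank the last row. */
    for (i = moved; i < total; i++) {
        screen[i] = blank;
    }
}
```

Because the source and destination overlap here, a copy routine that is only specified for non-overlapping regions is not a safe drop-in; a forward copy such as 'rep movsd' with the direction flag clear behaves like the first loop.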
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.