Message board for the users of flat assembler.
Add stream buffer to another
Assuming you are running this on a system produced in the last 10 or 15 years, and given that the operation is very simple (it's just an add), I think that implementing a cache management strategy would give all the speed-up that is available.
I'm looking for FPU instructions to do multiple add operations in the same clock cycle.
As long as the operation remains simple, once you have it streaming through the cache then it almost won't matter which instructions you use because the CPU will outpace the external memory bus by a good margin.
Compare option 1:
operate on data (each operation has to wait for data to arrive)
store data (the data bus has to switch from read mode to write mode each time)
loop a billion times
With option 2:
initiate cache streaming of multiple data (say 1000 items)
initiate cache streaming of multiple data (say 1000 items) again (second buffer)
operate on first buffer data
initiate writing first buffer data to memory
loop a million times
Notice for option 2 that there is less waiting, and there are many fewer bus direction switches.
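To make option 2 concrete, here is a minimal sketch in C rather than assembly. It assumes float buffers, a 64-byte cache line, and the GCC/Clang `__builtin_prefetch` hint as the way to "initiate cache streaming"; the names `add_stream_blocked`, `prefetch_block`, `BLOCK_ITEMS` and `CACHE_LINE` are made up for illustration. Hardware prefetchers may already handle a linear access pattern like this one, so treat it as something to measure, not a guaranteed win.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block size, taken from the "say 1000 items" in the post. */
#define BLOCK_ITEMS 1000
/* Assumed cache line size in bytes; 64 is common on x86 but not guaranteed. */
#define CACHE_LINE 64

/* Use the GCC/Clang prefetch hint when available, otherwise do nothing. */
#if defined(__GNUC__)
#define PREFETCH_R(p) __builtin_prefetch((p), 0, 3)
#define PREFETCH_W(p) __builtin_prefetch((p), 1, 3)
#else
#define PREFETCH_R(p) ((void)(p))
#define PREFETCH_W(p) ((void)(p))
#endif

/* Ask the CPU to start loading one block of both buffers,
   issuing one hint per assumed cache line. */
static void prefetch_block(const float *src, const float *dst, size_t count)
{
    const char *s = (const char *)src;
    const char *d = (const char *)dst;
    size_t bytes = count * sizeof(float);
    for (size_t off = 0; off < bytes; off += CACHE_LINE) {
        PREFETCH_R(s + off);
        PREFETCH_W(d + off);
    }
}

/* dst[i] += src[i], processed block by block: request the next block
   first, then do the adds on the block that was requested earlier. */
void add_stream_blocked(float *dst, const float *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    size_t first = n < BLOCK_ITEMS ? n : BLOCK_ITEMS;
    prefetch_block(src, dst, first);

    for (size_t start = 0; start < n; start += BLOCK_ITEMS) {
        size_t count = n - start < BLOCK_ITEMS ? n - start : BLOCK_ITEMS;
        size_t next = start + count;

        if (next < n) {
            size_t next_count = n - next < BLOCK_ITEMS ? n - next : BLOCK_ITEMS;
            prefetch_block(src + next, dst + next, next_count);
        }

        for (size_t i = 0; i < count; i++)
            dst[start + i] += src[start + i];
    }
}
```

The inner loop is still a plain add. The only change from option 1 is that the loads for the next block are requested before the current block is processed, so the adds spend less time waiting on memory.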
If you don't try it then you won't see how much faster things can really be.
01 May 2021, 10:49
I'll try, but it seems better to switch my buffers to FLOAT anyway, to avoid an additional conversion step by the Windows audio engine. I'm working on my own engine's strategy at the moment, and then I'll see where each type of data works better. Thanks for the advice.
01 May 2021, 11:24
Nice site for describing FPU instructions:
01 May 2021, 11:30
Copyright © 1999-2020, Tomasz Grysztar.