flat assembler
Message board for the users of flat assembler.
Multiply to ZERO is slower than to value
bitRAKE 10 Jul 2021, 03:39
That is strange. What CPU?
Maybe create a constant for the smallest non-zero float. I've seen that done in other code.
_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup
Overclick 10 Jul 2021, 05:31
There are more factors than zero or non-zero. The magnitude of the operands matters too: operands of similar magnitude calculate faster. As you know, single precision (sp) holds only 7 to 8 significant digits, shifted by zeros after the point, that can be calculated correctly. It seems that a difference in magnitude takes some extra time or something. When I rebalance my project to keep values closer to zero it runs faster, no matter whether they are zero or not. I have to investigate it deeper.
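To illustrate the precision point only (not the timing claim): a single-precision float has a 24-bit significand, so when two operands differ greatly in magnitude the smaller one can vanish in an addition. A minimal C sketch; the helper name `is_absorbed` is made up for this illustration.

```c
/* Hypothetical helper: returns 1 when adding `small_val` to `big_val`
   leaves `big_val` unchanged after rounding to single precision. */
int is_absorbed(float big_val, float small_val)
{
    /* volatile forces the sum to be rounded and stored as a float */
    volatile float sum = big_val + small_val;
    return sum == big_val;
}
```

For example, `is_absorbed(16777216.0f, 1.0f)` returns 1, because 2^24 + 1 is not representable in single precision and rounds back to 2^24.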
revolution 10 Jul 2021, 05:46
Check for denormal numbers in the pipeline and see if the flush-to-zero (FTZ) flag is unset, making the FPU do extra processing.
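One way to check a buffer for denormals is the standard C classification macro `fpclassify`. This is only a sketch: the function name `count_denormals` and the idea of scanning a whole buffer are assumptions for illustration, not something taken from the engine discussed here.

```c
#include <math.h>    /* fpclassify, FP_SUBNORMAL */
#include <stddef.h>  /* size_t */

/* Hypothetical helper: counts denormal (subnormal) values in a buffer. */
size_t count_denormals(const float *buf, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        /* zero is classified as FP_ZERO, so it is not counted here */
        if (fpclassify(buf[i]) == FP_SUBNORMAL)
            count++;
    }
    return count;
}
```

Run a check like this with the default MXCSR settings; once denormals-are-zero is enabled, comparisons may treat such inputs as zero and the count can come out lower.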
Overclick 10 Jul 2021, 13:25
As the wiki says, zero is a denormal number too...
revolution 10 Jul 2021, 13:33
In most FPU implementations zero is not treated in a special execution path, so it should give normal performance.
Trouble comes when denormals (non-zero very small numbers) are there, either as an input or an output. Check your FTZ flag. If that is already set, then another thing to look for is infinity (or divide by zero).
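For reference, FTZ is bit 15 of the SSE control register MXCSR, and the related DAZ (denormals-are-zero, for inputs) is bit 6. A hedged C sketch of reading and setting them with the `_mm_getcsr` / `_mm_setcsr` intrinsics follows; the function names and the `HAVE_MXCSR` guard are made up for this example, and it assumes an x86 build with SSE.

```c
#if defined(__SSE__) || defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>   /* _mm_getcsr, _mm_setcsr */
#define HAVE_MXCSR 1
#else
#define HAVE_MXCSR 0     /* not an x86 SSE build: nothing to read */
#endif

#define MXCSR_DAZ 0x0040u  /* bit 6: treat denormal inputs as zero */
#define MXCSR_FTZ 0x8000u  /* bit 15: flush denormal results to zero */

/* Returns 1 if FTZ is set, 0 if clear, -1 if there is no MXCSR. */
int ftz_is_set(void)
{
#if HAVE_MXCSR
    return (_mm_getcsr() & MXCSR_FTZ) != 0;
#else
    return -1;
#endif
}

/* Sets FTZ, and optionally DAZ, for the calling thread.
   Returns 0 on success, -1 if there is no MXCSR. */
int enable_flush_modes(int with_daz)
{
#if HAVE_MXCSR
    unsigned int csr = _mm_getcsr() | MXCSR_FTZ;
    if (with_daz)
        csr |= MXCSR_DAZ;  /* assumption: the CPU supports DAZ */
    _mm_setcsr(csr);
    return 0;
#else
    (void)with_daz;
    return -1;
#endif
}
```

MXCSR is per thread, so a call like `enable_flush_modes(1)` would have to happen in the thread that actually runs the audio processing.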
Overclick 10 Jul 2021, 20:26
I've found a new interesting thing. Multiplying by zero takes some time to calculate, BUT multiplying by NaN just passes the NaN through. It can be used as a speed-up trick.
I'm seriously thinking of turning empty values into NaN and back to zero after everything is done. My engine takes 8 channels of sp values at a 192k frame rate from the stream. It turns stereo into a 7.1 surround effect, with a 31-line EQ for each channel, a delay sub-engine, etc. It uses 248 virtual channels for the work process and 31 mulps instructions for each line of the EQ. That is very costly, as you see. Optimisation is the only way I have. My old CPU spends up to 30% of a core on that job. I need any idea to speed it up.
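The NaN behaviour described above follows from IEEE 754: any multiplication with a NaN operand yields NaN. A minimal C sketch of the "mark as NaN, turn back to zero at the end" idea; both function names are hypothetical, and whether this is actually faster is the poster's observation, not something this code measures.

```c
#include <math.h>  /* NAN, isnan */

/* Hypothetical stage: apply a gain. A NaN sample stays NaN for any gain. */
float apply_gain(float sample, float gain)
{
    return sample * gain;
}

/* Hypothetical final step: turn the NaN "empty" marker back into silence. */
float restore_zero(float sample)
{
    return isnan(sample) ? 0.0f : sample;
}
```

Two caveats: a NaN that gets added to a live sample turns the whole sum into NaN, so this only works for values that are never mixed with real audio before `restore_zero`; and `isnan` may be optimised away under options such as `-ffast-math`.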
Overclick 13 Jul 2021, 12:28
Quote:
Denormals happen sometimes at engine start, but they flush out during processing and there are no more denormals after that. No flag gets set except PE, but that's expected with huge low-frequency counters. Can PE slow down my process, as something difficult to calculate? I'll try switching to dp, but I believe it is slower anyway.
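On the PE question: PE is the precision (inexact) flag, bit 5 of MXCSR, and it is set whenever a result had to be rounded, which is true of most float operations. With the exception masked (the default) it should not by itself be a sign of slow processing, though that is stated here as general expectation rather than a measurement. A small C sketch that shows when the flag appears; the helper name and the x86-64 guard are assumptions for illustration.

```c
#if defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>   /* _mm_getcsr, _mm_setcsr */
#define HAVE_SSE_MATH 1
#else
#define HAVE_SSE_MATH 0
#endif

#define MXCSR_PE         0x0020u  /* bit 5: precision (inexact) flag */
#define MXCSR_FLAGS_MASK 0x003Fu  /* bits 0-5: sticky exception flags */

/* Hypothetical helper: returns 1 if a / b set PE, 0 if it did not,
   -1 when not built for x86-64 (no SSE scalar math to observe). */
int division_sets_pe(float a, float b)
{
#if HAVE_SSE_MATH
    volatile float x = a, y = b, q;
    _mm_setcsr(_mm_getcsr() & ~MXCSR_FLAGS_MASK);  /* clear sticky flags */
    q = x / y;                                     /* the operation under test */
    (void)q;
    return (_mm_getcsr() & MXCSR_PE) != 0;
#else
    (void)a; (void)b;
    return -1;
#endif
}
```

On x86-64, `division_sets_pe(1.0f, 3.0f)` reports the flag because 1/3 must be rounded, while `division_sets_pe(1.0f, 2.0f)` does not because 0.5 is exact.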
bitRAKE 13 Jul 2021, 13:39
Calculate what your data bandwidth usage (in and out of the core(s)) is, and that will give you an idea of how close you are getting to ideal conditions.
192 kHz * 8 * 4 bytes ≈ 6 MB/s, but the virtual channels are probably consuming bandwidth as well. I imagine a scenario where all the virtual channels stay in cache while the audio data streams through - coercing the processor to play along might be tricky, or maybe the buffers are too large to cache?
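The bandwidth estimate above is a plain product of frame rate, channel count and sample size. A small C sketch of that arithmetic; the function name is made up, and the figure for the 248 virtual channels assumes each one carries a single 4-byte value per frame, which the thread does not state.

```c
#include <stdint.h>

/* Bytes per second for a stream of fixed-size samples. */
uint64_t stream_bytes_per_second(uint64_t frame_rate_hz,
                                 uint64_t channels,
                                 uint64_t bytes_per_sample)
{
    return frame_rate_hz * channels * bytes_per_sample;
}
```

With these inputs, `stream_bytes_per_second(192000, 8, 4)` gives 6,144,000 bytes/s (about 6.1 MB/s) for the audio stream, and `stream_bytes_per_second(192000, 248, 4)` gives 190,464,000 bytes/s (about 190 MB/s) under the one-value-per-frame assumption for the virtual channels.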
Overclick 13 Jul 2021, 15:15
Yeah, I do it in one thread only. There's no point splitting it into threads because the data dependence is very heavy.
It is all about heavy filter design, but I do the calculations part by part over the data to make sure it fits the cache. And every single register is in use too...
Overclick 13 Jul 2021, 15:33
Maybe it is not a big problem. That's how it looks on my old 6-core CPU. New processors will not even feel it.
bitRAKE, you asked me what processor I use. Sorry, I was shy to answer. It is a Phenom II X6 1100T at 4 GHz. Old, I know.
bitRAKE 13 Jul 2021, 21:10
Given what you've said, that looks excellent.
Overclick 13 Jul 2021, 21:31
Quote:
Thanks, friend. I'm trying to make the best converter on the market. It is difficult... And it is not a commissioned job. It is my f..n hobby, first of all for myself.
Overclick 13 Jul 2021, 21:36
My old version is here:
https://sourceforge.net/projects/stereo-to-7-1-converter/
But I already hate it since I started using the new engine.
Melissa 18 Sep 2021, 11:45
Overclick wrote:
Hi. You are not multiplying by zero here, but rather by a small number.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.