It isn't just the CPU though. Even if you knew every transistor exactly, the code you are running still affects things. On some CPUs the out-of-order (OOO) buffer is more than 100 instructions long. So you also have to know every one of those 100+ instructions ahead of your snippet, which port each will go into, what instructions are currently in each port, how many ports you have, whether the memory read/write buffers are full, the current state of the BTB and caches, whether another SMT (Hyperthreading) instruction stream is interleaved with yours, etc. etc. etc. It's mind-bogglingly complex.
Sometimes it's good for a thread to leave ports or units unused, because with Hyperthreading they are then free for another thread to use.
It's not often you see Hyperthreading double the performance, but it does happen, and it did for me when I ran a brute-force test of a very long, latency-bottlenecked algorithm (each input is latency-bound on its own, but the individual inputs can of course be processed in parallel). Running 8 threads on 4 cores with Hyperthreading finished in almost half the time I had estimated for 4 threads, which was a 2 hour gain. And that's with me still using my PC for lightweight stuff (I lowered the priority of the test program to minimum).
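For anyone who wants to try this on their own machine, here is a minimal sketch of that kind of test in C++. It is not the program I ran: `latency_bound_chain`, `run_inputs` and `time_run` are hypothetical names, and the multiply/xor chain is just an assumed stand-in for any algorithm where each step needs the previous result. How much 8 threads gain over 4 depends entirely on your CPU and on what else is running.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for a latency-bound algorithm: every step depends on
// the previous value of x, so a single thread cannot overlap the multiplies.
std::uint64_t latency_bound_chain(std::uint64_t seed, std::uint64_t steps) {
    std::uint64_t x = seed;
    for (std::uint64_t i = 0; i < steps; ++i) {
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;
        x ^= x >> 29;
    }
    return x;
}

// The inputs are independent, so they can be split across threads:
// worker t handles inputs t, t + threads, t + 2 * threads, ...
std::vector<std::uint64_t> run_inputs(const std::vector<std::uint64_t>& inputs,
                                      std::uint64_t steps, unsigned threads) {
    assert(threads > 0);
    std::vector<std::uint64_t> results(inputs.size());
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&, t] {
            for (std::size_t i = t; i < inputs.size(); i += threads)
                results[i] = latency_bound_chain(inputs[i], steps);
        });
    }
    for (auto& worker : pool) worker.join();
    return results;
}

// Wall-clock seconds for one run; the results are handed back through `out`
// so the work cannot be discarded as unused.
double time_run(const std::vector<std::uint64_t>& inputs, std::uint64_t steps,
                unsigned threads, std::vector<std::uint64_t>& out) {
    auto start = std::chrono::steady_clock::now();
    out = run_inputs(inputs, steps, threads);
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

To compare, call `time_run` with the same inputs and a large `steps` value, once with as many threads as you have physical cores and once with twice that, and check that both runs return identical results. Build with optimizations and thread support enabled (for example `-O2 -pthread` with GCC or Clang).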