flat assembler
Message board for the users of flat assembler.
Modern CPU about registers
|
revolution 09 Apr 2021, 06:26
In most cases the stack is only backed by RAM. Stack data mostly resides in the cache, and the RAM never sees it. So it isn't actually "slow".
|
|
Roman 09 Apr 2021, 06:27
Cache is slower than registers! Read the Intel documentation!
Reading from a register has a 0 or 1 cycle latency. Writing to a register has a 0 or 1 cycle latency. Reading or writing the L1 cache has a 3 to 5 cycle latency (varies by architecture age).
Last edited by Roman on 09 Apr 2021, 07:02; edited 2 times in total
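To make the quoted latency figures concrete, here is a rough arithmetic sketch. The function name `serialized_cycles` and the access count are made up for illustration, and the model assumes every access waits for the previous one. Real out-of-order cores overlap accesses and forward stores, so the totals are pessimistic upper bounds rather than measurements.

```python
# Toy latency arithmetic using the figures quoted in the post above.
# Assumption: every access is fully serialized. Real CPUs overlap accesses,
# so these totals are upper bounds, not predictions.

REGISTER_LATENCY = (0, 1)  # cycles (best, worst), as quoted in the post
L1_LATENCY = (3, 5)        # cycles (best, worst), as quoted in the post


def serialized_cycles(accesses, latency_range):
    """Return (best, worst) total cycles for `accesses` dependent accesses."""
    low, high = latency_range
    return accesses * low, accesses * high


if __name__ == "__main__":
    n = 4  # hypothetical number of parameter reads inside one procedure
    print("registers:", serialized_cycles(n, REGISTER_LATENCY))
    print("L1 cache: ", serialized_cycles(n, L1_LATENCY))
```

Whether the gap matters in practice depends on how much of that latency the core can hide, which is the point under dispute in this thread.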
|
revolution 09 Apr 2021, 06:28
Roman wrote: Cache is slower than registers! Read the Intel documentation!
|
Roman 09 Apr 2021, 06:31
You too
|
|
revolution 09 Apr 2021, 06:34
I can see that the Itanium, with 128 addressable registers, was too slow and power hungry. And it was difficult to code for. So I use that as a basis for comparison.
More registers doesn't automatically mean faster!
Last edited by revolution on 09 Apr 2021, 07:31; edited 1 time in total
|
Roman 09 Apr 2021, 06:35
Did Itanium have special pr registers for Call and CallLoop?!
I think not!
PS: For example, in the 18th century there were steam cars. By your reasoning, the car is slow! But we all know that by the 21st century the car engine has undergone a lot of technical and more complex changes, and we all see that 21st-century cars are not so slow! The Itanium 2 processor was released in 2002, 19 years ago! For electronics and CPUs that is like centuries!
|
revolution 09 Apr 2021, 07:00
Itanium had a better scheme called register rotation.
I think that trying to bolt the changes onto x86 is not going to help. You need a new design. So, you can design your own CPU with 1024 registers and show the world what we are all doing wrong.
|
Roman 09 Apr 2021, 07:04
I don't have a factory to build my own CPU.
That is my main problem.
Quote: Itanium had a better scheme called register rotation.
If it had a better scheme called register rotation, why don't Intel or AMD use this scheme?! Very strange, don't you think?
|
revolution 09 Apr 2021, 07:10
Roman wrote: I don't have a factory to build my own CPU. Roman wrote:
|
|
Roman 09 Apr 2021, 07:18
I showed my idea and the benefit of using pr regs and CallLoop.
Why do you ignore this and keep saying that many regs are not good? I propose only 18 new regs, not 1024! And 10 new CPU instructions. Implementing this is not a problem for Intel or AMD, just as implementing the new AVX or SSE instructions was not a problem.
Last edited by Roman on 09 Apr 2021, 07:26; edited 2 times in total
|
revolution 09 Apr 2021, 07:23
I'm not saying it isn't good. I'm saying you have to prove it is faster, you can't just assume it will be and turn that into a claim.
If someone makes a claim, then it is up to them to prove it. That is how science works. So it is up to you to show that it will be faster. If you design a CPU with N registers and show a simulation where the speed is better, then you can make history, and people will want to make the CPU. But so far, your ideas and claims are empty, and have nothing to prove the claims.
|
Roman 09 Apr 2021, 07:25
revolution wrote:
I showed this. I have written this many times!
PS: I started writing asm programs in 1995! My experience lets me see a more efficient mechanism. And I have not forgotten that in 1995 a CPU with many regs would have been very expensive. That leads people to think that a modern processor cannot have many registers either. But that is a very big mistake!
Last edited by Roman on 09 Apr 2021, 07:35; edited 2 times in total
|
revolution 09 Apr 2021, 07:30
You haven't shown any design yet.
What you have posted is only ideas. Ideas are fine by themselves, but they have to be converted into a design to measure the results, before we know if they work.
|
Roman 09 Apr 2021, 07:36
Quote:
Hm. Have you not read my posts? And what do you mean by design?
Last edited by Roman on 09 Apr 2021, 07:39; edited 1 time in total
|
revolution 09 Apr 2021, 07:38
Roman wrote:
You need to show encoding space bit assignments at a minimum. After that come the logic gates to implement it and merge it into the existing design.
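As an illustration of what "bit assignments" could look like, here is a minimal sketch of a made-up 16-bit instruction word. Everything in it is an assumption: the opcode value `0xA5`, the field layout, and the meaning of the register field are invented for the example and do not correspond to any real or free x86 encoding space. Only the register count of 18 comes from the thread.

```python
# Toy bit layout for a hypothetical CallPr / CallPrLoop instruction word.
#   bits 15..8  opcode   (made-up value, NOT a real or free x86 opcode)
#   bits  7..3  pr field (5 bits, enough to name 18 registers)
#   bit   2     loop flag (0 = CallPr, 1 = CallPrLoop)
#   bits  1..0  reserved, must be zero

PR_COUNT = 18          # the thread proposes 18 new registers
OPCODE_CALLPR = 0xA5   # hypothetical placeholder


def encode(pr_index, loop):
    """Pack the fields into one 16-bit word."""
    if not 0 <= pr_index < PR_COUNT:
        raise ValueError("pr index out of range")
    return (OPCODE_CALLPR << 8) | (pr_index << 3) | (int(loop) << 2)


def decode(word):
    """Unpack a word, rejecting anything outside this toy encoding."""
    opcode = (word >> 8) & 0xFF
    pr_index = (word >> 3) & 0x1F
    loop = bool((word >> 2) & 1)
    reserved = word & 0x3
    if opcode != OPCODE_CALLPR or pr_index >= PR_COUNT or reserved:
        raise ValueError("not a valid word in this toy encoding")
    return pr_index, loop


if __name__ == "__main__":
    word = encode(17, True)
    print(hex(word), decode(word))
```

A real proposal would also have to show where such a word fits among the existing x86 prefixes and opcode maps, which this sketch does not attempt.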
|
Roman 09 Apr 2021, 07:40
Quote:
I propose creating new CPU instructions and a new mechanism, like Intel did for the AVX instructions. Let's say the new CPU instructions are CallPr and CallPrLoop. And we could use the old Call as usual. How to do this correctly, and for which processor architecture, I think Intel or AMD engineers know better.
|
revolution 09 Apr 2021, 08:45
Show your design to them. Or post it here first if you want to.
It is easy to say "do this and do that", but without a design to show how it works, it is just words.
|
DimonSoft 09 Apr 2021, 16:06
Roman wrote: The stack is slow and sometimes not comfortable to work with (because esp changes)!
It will be needed as soon as a program passes 16 parameters, not necessarily in a single call. And even before that! Every time a procedure wants to call another procedure that has a different parameter set and then continue its work with its own parameters, it will have to save pr0, pr1, etc. somewhere. Guess where?
Roman wrote: Because in procs we could change parameters on the fly!
I see no reason why we can’t now. In fact, we’ve been able to do so for ages. And, since they’re generally on the stack, they’re not subject to a fixed register size.
Roman wrote: I showed my idea and the benefit of using pr regs and CallLoop.
How often does one need CallLoop? Is that often enough to make everyone pay for a more complex and expensive processor?
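The nested-call point can be sketched with a toy model. The `Machine` class, the `outer` and `inner` procedures, and the way parameters map onto pr0 and pr1 are all hypothetical, since the proposal in this thread does not define them. The sketch only shows that a procedure which still needs its own pr0 after calling another procedure has to park the value somewhere, here on a simulated stack.

```python
# Toy model: parameters live in hypothetical registers pr0..pr17.
# A caller that needs its own parameter after a nested call must save it.

class Machine:
    def __init__(self):
        self.pr = [0] * 18   # hypothetical pr0..pr17
        self.stack = []      # simulated stack
        self.spills = 0      # how many values had to be saved

    def push(self, value):
        self.stack.append(value)
        self.spills += 1

    def pop(self):
        return self.stack.pop()


def inner(m):
    """Takes one parameter in pr0 and returns twice its value."""
    return m.pr[0] * 2


def outer(m):
    """Takes parameters in pr0 and pr1, calls inner, then reuses pr0."""
    m.push(m.pr[0])            # pr0 is about to be overwritten
    m.pr[0] = m.pr[1] + 1      # set up inner's parameter
    result = inner(m)
    m.pr[0] = m.pr0_restored = m.pop()  # bring outer's own pr0 back
    return m.pr[0] + result


if __name__ == "__main__":
    m = Machine()
    m.pr[0], m.pr[1] = 10, 4
    print(outer(m), "spills:", m.spills)
```

With a stack-based convention the caller's parameters stay where they are during the nested call, so the save step in `outer` has no counterpart there.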
|
Roman 09 Apr 2021, 16:43
Quote: And even before that! Every time a procedure wants to call another procedure that has a different parameter set and then continue its work with its own parameters it will have to save pr0, pr1, etc. somewhere. Guess where?
Very simple. If you need to, push the regs. If you don't need to, don't push. A simple Call works as usual. CallPr and CallPrLoop work a little differently. CallPr and CallPrLoop use only 4 or 8 bytes on the stack, for the return address. I think CallPrLoop alone is enough. The proc could change the number of loops (if needed) or set a break reg to stop CallPrLoop.
Quote: Is that often enough to make everyone pay for a more complex and expensive processor?
Do you think AVX 512 is cheap? Yet AMD put AVX 512 into its new CPU. DimonSoft, how often do you use AVX 512?
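One possible reading of the CallPrLoop behaviour described above can be modelled in a few lines. The semantics here are a guess: `call_pr_loop`, `LoopState`, and the `stop` flag standing in for the "break reg" are names invented for this sketch, and the original proposal may differ in its details.

```python
# Toy model of a hypothetical CallPrLoop: call a procedure repeatedly until
# the loop counter runs out or the procedure sets the break flag.

class LoopState:
    def __init__(self, count):
        self.remaining = count  # the procedure may change this
        self.stop = False       # stands in for the proposed "break reg"


def call_pr_loop(proc, count, pr):
    """Call proc(pr, state) in a loop and return how many calls were made."""
    state = LoopState(count)
    calls = 0
    while state.remaining > 0 and not state.stop:
        state.remaining -= 1
        proc(pr, state)
        calls += 1
    return calls


def accumulate(pr, state):
    """Example procedure: add pr1 to pr0 and stop once pr0 reaches 10."""
    pr[0] += pr[1]
    if pr[0] >= 10:
        state.stop = True


if __name__ == "__main__":
    pr = [0, 3]
    print(call_pr_loop(accumulate, 100, pr), pr[0])
```

The model says nothing about speed. It only pins down behaviour that a hardware design would then have to implement and measure.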
|
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.