flat assembler
Message board for the users of flat assembler.

Modern CPU about registers.

Roman



Joined: 21 Apr 2012
Posts: 1082
Roman
revolution wrote:

But why do you want it? Why do you think it would be "faster"?

Because the parameters are not duplicated for the call, and you don't push many parameters onto the stack!
On a modern CPU, pushing parameters onto the stack costs some CPU ticks!
Because the stack is RAM! We know that reading and writing RAM is not a fast operation!

Second benefit: the special pr registers make a new asm command, CallLoop, possible!

Third: inside a proc we could change some of the pr registers! That changes how CallLoop behaves,
because procs could change parameters on the fly!
This is more powerful than the stack!


Last edited by Roman on 09 Apr 2021, 06:34; edited 3 times in total
Post 09 Apr 2021, 06:20
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 18222
Location: In your JS exploiting you and your system
revolution
In most cases the stack is only backed by RAM. Stack data mostly resides in the cache and the RAM never sees it. So it isn't actually "slow".
Post 09 Apr 2021, 06:26
Roman



Joined: 21 Apr 2012
Posts: 1082
Roman
Cache is slower than registers! Read the Intel documentation!

Reading from a register has 0 or 1 cycle latency.
Writing to a register has 0 or 1 cycle latency.

Reading/writing L1 cache has a 3 to 5 cycle latency (varies by architecture age).


Last edited by Roman on 09 Apr 2021, 07:02; edited 2 times in total
Post 09 Apr 2021, 06:27
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 18222
Location: In your JS exploiting you and your system
revolution
Roman wrote:
Cache is slower than registers! Read the Intel documentation!
You don't know that it would still be true with more registers.
Post 09 Apr 2021, 06:28
Roman



Joined: 21 Apr 2012
Posts: 1082
Roman
You too :D
Post 09 Apr 2021, 06:31
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 18222
Location: In your JS exploiting you and your system
revolution
I can see that the Itanium, with 128 addressable registers, was too slow and power hungry. And it was difficult to code for. So I use that as a basis for comparison.

More registers doesn't automatically mean faster!


Last edited by revolution on 09 Apr 2021, 07:31; edited 1 time in total
Post 09 Apr 2021, 06:34
Roman



Joined: 21 Apr 2012
Posts: 1082
Roman
Did Itanium have special pr registers for Call and CallLoop?!
I think not!

PS: For example, in the 18th century there were steam cars. By your reasoning, the car is slow! But we all know that by the 21st century the car engine has undergone a lot of technical and more complex changes!
And we all see that cars in the 21st century are not so slow!

The Itanium 2 processor was released in 2002! 19 years ago!
For electronics and CPUs that is like centuries!
Post 09 Apr 2021, 06:35
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 18222
Location: In your JS exploiting you and your system
revolution
Itanium had a better scheme called register rotation.
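
For readers who have not met register rotation, here is a toy sketch of the general idea in Python. It assumes a small rotating register file in which a base counter shifts the mapping from logical to physical registers. The class and method names are made up for illustration, and the model deliberately ignores the real Itanium details (which registers rotate, predicate rotation, the loop branches that trigger the rotation).

```python
class RotatingRegisterFile:
    """Toy model of a rotating register file (hypothetical, simplified)."""

    def __init__(self, size):
        self.size = size
        self.physical = [0] * size
        self.base = 0  # rotating register base, shifts the logical mapping

    def _index(self, logical):
        # Logical register number -> physical slot, modulo the file size.
        return (logical + self.base) % self.size

    def read(self, logical):
        return self.physical[self._index(logical)]

    def write(self, logical, value):
        self.physical[self._index(logical)] = value

    def rotate(self):
        # Decrementing the base renames every register by one:
        # what was visible as r[n] is now visible as r[n + 1].
        self.base = (self.base - 1) % self.size


regs = RotatingRegisterFile(4)
regs.write(0, 10)   # iteration 1 produces a value in r0
regs.rotate()
regs.write(0, 20)   # iteration 2 produces a value in r0 again
regs.rotate()
# r2 now holds the value from two iterations ago, r1 the one from the last.
print(regs.read(2), regs.read(1))  # prints "10 20"
```

No data is copied on a rotation, only the mapping changes, which is what makes the scheme attractive for software-pipelined loops.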

I think that trying to bolt on the changes to x86 is not going to help. You need a new design. So, you can design your own CPU with 1024 registers and show the world what we are all doing wrong.
Post 09 Apr 2021, 07:00
Roman



Joined: 21 Apr 2012
Posts: 1082
Roman
I don't have a factory to produce my CPU.
That is my main problem.

Quote:
Itanium had a better scheme called register rotation.

If it had a better scheme called register rotation, why don't Intel or AMD use that scheme?!
Very strange, don't you think?
Post 09 Apr 2021, 07:04
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 18222
Location: In your JS exploiting you and your system
revolution
Roman wrote:
I don't have a factory to produce my CPU.
That is my main problem.
You don't need to produce the CPU, only design it. You can design it on any modern system with no problem. All you need is the ideas and the motivation.
Roman wrote:
Quote:
Itanium had a better scheme called register rotation.

If it had a better scheme called register rotation, why don't Intel or AMD use that scheme?!
Very strange, don't you think?
It is no mystery. It takes too much encoding space, too much power, and too much thinking to program with it. That's why I keep saying, simply looking at the count of registers is no indication of how performant it would be. There are other considerations. Things don't exist in isolation.
Post 09 Apr 2021, 07:10
Roman



Joined: 21 Apr 2012
Posts: 1082
Roman
I showed my idea and the benefit of using the pr registers and CallLoop.
Why do you ignore this and keep saying that many registers are not good?

I propose only 18 new registers, not 1024!
And 10 new CPU commands.

Implementing this is no problem for Intel or AMD,
just as implementing the new AVX or SSE instructions was no problem.


Last edited by Roman on 09 Apr 2021, 07:26; edited 2 times in total
Post 09 Apr 2021, 07:18
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 18222
Location: In your JS exploiting you and your system
revolution
I'm not saying it isn't good. I'm saying you have to prove it is faster, you can't just assume it will be and turn that into a claim.

If someone makes a claim, then it is up to them to prove it. That is how science works. So it is up to you to show that it will be faster.

If you design a CPU with N registers and show a simulation where the speed is better, then you can make history, and people will want to make the CPU.

But so far, your ideas and claims are empty, and have nothing to prove the claims.
Post 09 Apr 2021, 07:23
Roman



Joined: 21 Apr 2012
Posts: 1082
Roman
revolution wrote:

If someone makes a claim, then it is up to them to prove it. That is how science works. So it is up to you to show that it will be faster.

I showed this. I have written it many times!

PS: I have been writing asm programs since 1995! My experience lets me see a more efficient mechanism.
And don't forget that in 1995 a CPU with many registers would have been very expensive.
That leads people to think that a modern processor cannot have many registers either.
But that is a very big mistake!


Last edited by Roman on 09 Apr 2021, 07:35; edited 2 times in total
Post 09 Apr 2021, 07:25
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 18222
Location: In your JS exploiting you and your system
revolution
You haven't shown any design yet.

What you have posted is only ideas.

Ideas are fine by themselves, but they have to be converted into a design to measure the results, before we know if they work.
Post 09 Apr 2021, 07:30
Roman



Joined: 21 Apr 2012
Posts: 1082
Roman
Quote:

You haven't shown any design yet.

Hm.
Haven't you read my posts?

And what does "design" mean?


Last edited by Roman on 09 Apr 2021, 07:39; edited 1 time in total
Post 09 Apr 2021, 07:36
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 18222
Location: In your JS exploiting you and your system
revolution
Roman wrote:
Quote:

You haven't shown any design yet.

Hm.
Haven't you read my posts?
I think you misunderstand what a design is.

You need to show encoding space bit assignments at a minimum.

After that come the logic gates to implement it and merge it into the existing design.
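
To make "encoding space" concrete, here is a minimal sketch in Python of the first question such a bit assignment has to answer: how many bits each register operand needs. The helper name `register_field_bits` is made up for illustration, and the sketch assumes the new registers would be named by a plain binary field, which the proposal in this thread does not actually specify.

```python
def register_field_bits(register_count):
    """Minimum number of bits needed to name one of `register_count` registers."""
    # (n - 1).bit_length() is ceil(log2(n)) for n >= 2, computed in integers.
    return (register_count - 1).bit_length()


# 8 and 16: classic x86 and x86-64 general-purpose register counts.
# 18: the number of new registers proposed in this thread.
# 128 and 1024: the Itanium count and the figure mentioned above.
for count in (8, 16, 18, 32, 128, 1024):
    print(count, "registers ->", register_field_bits(count), "bits per operand")
```

The 3-bit case is what the x86 ModRM register field provides, and x86-64 obtains the fourth bit from the REX prefix. Going from 16 to 18 registers already crosses into 5 bits per operand, so those bits have to be found somewhere in the instruction encoding.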
Post 09 Apr 2021, 07:38
Roman



Joined: 21 Apr 2012
Posts: 1082
Roman
Quote:

You need to show encoding space bit assignments at a minimum.

After that is the logic gates to implement it and merge it into the existing design.

I propose creating new CPU commands and a new mechanism,
as Intel did for the AVX commands. Let's say the new CPU commands are CallPr and CallPrLoop.
And we could still use the old Call as usual.

How to do this correctly, and for which processor architecture, I think the Intel or AMD engineers know better.
Post 09 Apr 2021, 07:40
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 18222
Location: In your JS exploiting you and your system
revolution
Show your design to them. Or post it here first if you want to.

It is easy to say "do this and do that", but without a design to show how it works, then it is just words.
Post 09 Apr 2021, 08:45
DimonSoft



Joined: 03 Mar 2010
Posts: 958
Location: Belarus
DimonSoft
Roman wrote:
The stack is slow, and sometimes it is not comfortable to work with the stack (because esp changes)!
I showed this in a previous post.

PS: And I don't forbid the stack. Sometimes a program needs the stack.

It will be needed as soon as a program passes 16 parameters, not necessarily in a single call.

And even before that! Every time a procedure wants to call another procedure that has a different parameter set and then continue its work with its own parameters it will have to save pr0, pr1, etc. somewhere. Guess where?
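
The point about nested calls can be sketched with a toy model in Python. It assumes four hypothetical parameter registers `pr0..pr3` modelled as a list and a memory-backed stack modelled as another list; `outer`, `inner`, `save` and `restore` are illustrative names, not part of any real proposal or instruction set.

```python
pr = [0] * 4        # hypothetical parameter registers pr0..pr3
stack = []          # stands in for the memory-backed stack
stack_writes = 0    # how many times a register had to be spilled


def save(index):
    """Spill one parameter register to the stack."""
    global stack_writes
    stack.append(pr[index])
    stack_writes += 1


def restore(index):
    """Reload one parameter register from the stack."""
    pr[index] = stack.pop()


def inner():
    # inner expects its own two parameters in pr0 and pr1
    return pr[0] + pr[1]


def outer():
    # outer received its parameters in pr0/pr1 and still needs them
    # after calling inner, so they must be saved somewhere first.
    save(0)
    save(1)
    pr[0], pr[1] = 100, 200   # load inner's parameters
    result = inner()
    restore(1)                # restore in reverse order of saving
    restore(0)
    return result + pr[0] + pr[1]


pr[0], pr[1] = 1, 2
total = outer()
print(total, stack_writes)  # prints "303 2"
```

Even in this two-level example the parameter registers end up on the stack twice, so the stack traffic is moved into the caller, not removed.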

Roman wrote:
Because procs could change parameters on the fly!

I see no reason why we can’t now. In fact, we’ve been able to do so for ages. And, since they’re generally on the stack, they’re not subject to fixed register size.

Roman wrote:
I showed my idea and the benefit of using the pr registers and CallLoop.

How often does one need CallLoop? Is that often enough to make everyone pay for a more complex and expensive processor?
Post 09 Apr 2021, 16:06
Roman



Joined: 21 Apr 2012
Posts: 1082
Roman
Quote:
And even before that! Every time a procedure wants to call another procedure that has a different parameter set and then continue its work with its own parameters it will have to save pr0, pr1, etc. somewhere. Guess where?

Very simple. If you need to, push the registers. If you don't need to, don't push.

The plain Call works as usual. CallPr and CallPrLoop work a little differently.

CallPr and CallPrLoop use only 4 or 8 bytes of stack, for the return address.
I think CallPrLoop alone is enough.
The proc could change the number of loops (if needed) or set a break register to stop CallPrLoop.


Quote:
Is that often enough to make everyone pay for a more complex and expensive processor?

Do you think AVX-512 is cheap?
But AMD put AVX-512 in its new CPU.

DimonSoft, how often do you use AVX-512?
Post 09 Apr 2021, 16:43

Copyright © 1999-2020, Tomasz Grysztar.