flat assembler
Message board for the users of flat assembler.

How fast does REPNE SCASB work on a RISC CPU?

Roman



Joined: 21 Apr 2012
Posts: 467
Back in the DOS programming days (around 1998), REPNE SCASB was commonly used, along with REPE CMPSB for comparing two strings.

My question: how fast are REPNE SCASB and REPE CMPSB on a modern RISC CPU?

I read about the LOOP instruction. It is said that on a RISC CPU the pair dec reg / jnz label runs faster.

My CPU is an Intel i5-2320.


Last edited by Roman on 30 Dec 2018, 13:51; edited 1 time in total
Post 30 Dec 2018, 12:34
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 16455
Location: Crab Nebula
There is more to it than just which instructions you use.

But firstly, keep in mind that a true RISC CPU won't have instructions equivalent to CMPSB etc. You would have to use a sequence of more basic instructions to do the memory reads, the comparison, the pointer updates, and the loop counter updates. So there is more work for the programmer to get it working the RISCy way.
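
As a rough illustration of that sequence of basic steps, here is a sketch in C that models what REPNE SCASB and REPE CMPSB do with the direction flag clear. The `StringRegs` struct and both function names are made up for this example, and the model is simplified: when the count is zero a real CPU leaves ZF unchanged, while this sketch just returns a default value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical struct modelling the registers the string instructions update. */
typedef struct {
    const unsigned char *di; /* models EDI/RDI after the instruction */
    const unsigned char *si; /* models ESI/RSI (used by cmpsb only) */
    size_t cx;               /* models ECX/RCX: iterations remaining */
    bool zf;                 /* models ZF from the last comparison */
} StringRegs;

/* Model of REPNE SCASB (DF=0): scan up to cx bytes at di for the byte al. */
static StringRegs repne_scasb(const unsigned char *di, size_t cx,
                              unsigned char al)
{
    StringRegs r = { di, NULL, cx, false }; /* zf default is an assumption */
    while (r.cx != 0) {
        unsigned char b = *r.di; /* memory read */
        r.di++;                  /* pointer update */
        r.cx--;                  /* loop counter update */
        r.zf = (b == al);        /* comparison sets ZF */
        if (r.zf)
            break;               /* REPNE stops once ZF = 1 */
    }
    return r;
}

/* Model of REPE CMPSB (DF=0): compare up to cx bytes at si and di. */
static StringRegs repe_cmpsb(const unsigned char *si, const unsigned char *di,
                             size_t cx)
{
    StringRegs r = { di, si, cx, true }; /* zf default is an assumption */
    while (r.cx != 0) {
        r.zf = (*r.si == *r.di); /* two memory reads and a comparison */
        r.si++;                  /* pointer updates */
        r.di++;
        r.cx--;                  /* loop counter update */
        if (!r.zf)
            break;               /* REPE stops once ZF = 0 */
    }
    return r;
}
```

After a successful scan, `di` points one byte past the match and `cx` holds the count that was left, which mirrors how the registers look after the x86 instruction. On a load/store architecture each commented step would be roughly one instruction.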

However, as I said in the first sentence, it is more than just which instructions you choose. Modern CPUs run much faster than the external DRAM interface, so when the byte count is large the speed is probably limited by the DRAM interface. If the data is already in the internal CPU caches then it might be faster; it depends upon what you actually do, and how many bytes you compare/transfer on average.

So what is the answer to your question "is it faster"? The answer is we don't know. It depends upon your application code and your data flows and which CPU you use. It might be faster the RISC way, it might be faster the CISC way. You'd need to test it in your use case to know.
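
One way to run such a test is a small timing harness. The sketch below is in C and only assumes the standard library; the names `scan_fn`, `scan_byte_loop`, `scan_memchr` and `time_scan` are invented for the example. It times a plain byte loop against the library `memchr`, which may or may not use string instructions or SIMD depending on the C library.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical type: returns the index of the first c, or n if absent. */
typedef size_t (*scan_fn)(const unsigned char *buf, size_t n, unsigned char c);

/* One byte per iteration: load, compare, advance. */
static size_t scan_byte_loop(const unsigned char *buf, size_t n,
                             unsigned char c)
{
    size_t i = 0;
    while (i < n && buf[i] != c)
        i++;
    return i;
}

/* Whatever the C library chose for this CPU. */
static size_t scan_memchr(const unsigned char *buf, size_t n, unsigned char c)
{
    const unsigned char *p = memchr(buf, c, n);
    return p ? (size_t)(p - buf) : n;
}

/* Calls fn reps times; returns elapsed CPU seconds, last result in *out. */
static double time_scan(scan_fn fn, const unsigned char *buf, size_t n,
                        unsigned char c, int reps, size_t *out)
{
    volatile size_t sink = 0; /* discourages removing the calls entirely */
    clock_t t0 = clock();
    for (int i = 0; i < reps; i++)
        sink = fn(buf, n, c);
    clock_t t1 = clock();
    *out = sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Call `time_scan` once with each function on a buffer sized like your real data, and try both a buffer that fits in cache and one that does not. `clock()` is coarse and an optimizing compiler may still hoist work out of the loop, so treat the numbers as a rough guide only.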
Post 30 Dec 2018, 12:49
Furs



Joined: 04 Mar 2016
Posts: 1388
Generally, scasb/cmpsb are slow (but small in code size!), while movsb and stosb (with a rep prefix) are the fastest for large data on modern CPUs, as they have special circuitry for them.

IMO all of these operations should be done by the CPU, as the CPU knows best its own clocks, bandwidth, cache contents, and other factors. Such details should be abstracted away from applications or libraries so that the same code works optimally on all CPUs.

That's why CISC instructions are the best way to go.
Post 30 Dec 2018, 15:31

Copyright © 1999-2019, Tomasz Grysztar.
