flat assembler
Message board for the users of flat assembler.
revolution
Because they are easily scanned: 'just keep loading bytes until you find a zero' is easy to code and easy to understand. Also, if you come into a comms channel halfway through a string, you can synchronise by waiting for the next zero and then start processing from there.
I think the use is now mostly traditional. Old habits die hard.
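Both points can be sketched in C. This is only a minimal illustration, assuming byte-oriented strings; the names `scan_to_zero` and `resync_offset` are made up for the example and do not come from any library.

```c
#include <stddef.h>

/* Count bytes up to (not including) the terminating zero:
   "just keep loading bytes until you find a zero". */
size_t scan_to_zero(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Resynchronise on a stream of zero-terminated messages.
   buf[0..len) was captured starting at an arbitrary point, possibly in the
   middle of a string. Returns the offset of the first byte after the next
   zero (the start of the first complete message), or len if no zero byte
   was seen. */
size_t resync_offset(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0)
            return i + 1;
    }
    return len;
}
```

A receiver that joins mid-stream would discard everything before `resync_offset` and treat the rest as whole strings. With a length prefix there is no such marker to wait for, because a missed prefix cannot be told apart from ordinary data.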
Azu
revolution wrote: Because they are easily scanned, 'just keep loading bytes until you find a zero' is easy to code and easy to understand.
    mov ecx,[esi]
    rep movsd
Easier than
    @@:
    mov eax,[esi]
    test eax,eax
    jz @f
    movsd
    jmp @b
    @@:
???
revolution wrote: Also, if you come in to a comms channel half way within a string you can synchronise by waiting for the next zero and then start processing future information.
buzzkill
Personally, I really prefer C-strings (null-terminated) over Pascal-strings (length byte). For one thing, you're not limited to an arbitrary number of characters, and I also think they're faster to work with, i.e. "just go on until you hit a 0" is easier than "read 1 byte, then read that many following bytes" (the latter would require an extra helper variable).
The Linux write syscall, for instance, doesn't use null-terminated strings, so one of the first things I do is create a wrapper around it that does. Also, it's handy (if you're on a *nix platform) that you can interface with libc, which does of course use C-strings (and as far as I know, so does most library code).
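Such a wrapper might look roughly like this in C. It is a sketch only: `write_cstr` is a hypothetical name, and it goes through the POSIX `write` function rather than issuing the raw syscall.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical wrapper: write a null-terminated string to fd.
   write() wants an explicit byte count, so the length is measured first
   with strlen(). Returns whatever write() returns: the number of bytes
   written (which may be fewer than requested) or -1 on error. */
ssize_t write_cstr(int fd, const char *s)
{
    return write(fd, s, strlen(s));
}
```

For example, `write_cstr(1, "hello\n")` sends the six bytes to standard output. Note that the wrapper has to scan the string once to find its length before the kernel ever sees it.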
Azu
buzzkill wrote: Personally, I really prefer C-strings (null-terminated) over Pascal-strings (length byte). For one thing, you're not limited to any arbitrary amount of characters
buzzkill wrote: , and also I think they're faster to work with, ie. "just go on until you hit a 0" is easier than "read 1 byte, then read that many following bytes" (the latter would require an extra helper variable).
buzzkill wrote: Also, it's handy (if you're on a *nix platform) that you can interface with libc, which does of course use C-strings (and as far as I know, so does most (library)code).
Azu wrote: (besides interacting with other code that uses them for no reason)
Last edited by Azu on 30 Mar 2009, 17:04; edited 1 time in total
revolution
Azu wrote:
Azu wrote:
Azu
revolution wrote:
It seems to me like it's easier/faster/smaller to make code that uses prepended-length strings rather than null-terminated strings, in all cases I could think of. I think that I am missing something, or else C wouldn't have been made to use null termination.
revolution
Azu wrote: Isn't it slower since you have to keep iterating through the whole string just to find how long it is?
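The cost the question refers to can be sketched in C as follows. The layout of the length-prefixed string (a 32-bit count in host byte order, followed by the bytes) is an assumption made for this example, as are the names `cstr_length` and `lp_length`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Null-terminated: the length is found by walking every byte, so the cost
   grows with the length of the string. */
size_t cstr_length(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Length-prefixed (assumed layout: 32-bit count, then the bytes): the
   length is a single load, however long the string is. */
size_t lp_length(const unsigned char *s)
{
    uint32_t n;
    memcpy(&n, s, sizeof n);   /* memcpy sidesteps unaligned-access issues */
    return n;
}
```

Whether the difference matters depends on how often the length is needed and how long the strings are.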
Azu
revolution wrote:
Last edited by Azu on 30 Mar 2009, 17:10; edited 1 time in total
revolution
C probably started out on very old processors where registers were a limited resource. The extra loop counter may have been a problem? I don't expect anyone really knows, but it doesn't matter: if you write your own code, you can do what suits you best.
Azu
revolution wrote: I don't expect anyone really knows but it doesn't matter if you write your own code, you can do what suits you best.
revolution wrote: Probably C started from the old old processor where registers were a limited resource. The extra loop counter may have been a problem?
revolution
Performance metrics are very application specific; I doubt that you can find any sensible advice that works in everyone's situation.
What is your usage model? How many strings are you dealing with each microsecond in your application? What percentage of processor time will be spent dealing with strings? Which string operations do you do most? These are the sort of questions you need to ask yourself in order to decide what the best solution will be.
Azu
revolution wrote: Performance metrics are very application specific, I doubt that you can find any sensible advice that can work in everyone's situation.
Which? P.S. the operations would mainly be scanning, copying, or combinations of the two.
Last edited by Azu on 30 Mar 2009, 17:19; edited 1 time in total
buzzkill
It may also matter whether you want your strings mutable or immutable (as in some newer HLLs). If you, e.g., append to your string, you have to go back to the length byte/word and change that. With a null-terminated string, your algorithms/functions don't change; it's still "keep going until 0".
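The difference when appending can be sketched in C like this. Both functions are illustrative only: the names `cstr_append` and `pstr_append` are invented, and the Pascal-style layout (one length byte at index 0, text after it) is an assumption for the example.

```c
#include <stddef.h>
#include <string.h>

/* C-string append: find the terminator, then copy the new bytes plus their
   terminator after it. Nothing before the old end is touched.
   cap is the total size of dst in bytes. Returns 0 on success, -1 if the
   result does not fit. */
int cstr_append(char *dst, size_t cap, const char *src)
{
    size_t dlen = strlen(dst);
    size_t slen = strlen(src);
    if (dlen + slen + 1 > cap)
        return -1;
    memcpy(dst + dlen, src, slen + 1);
    return 0;
}

/* Pascal-style append (assumed layout: dst[0] is the length, text follows):
   write the new bytes after the old text, then go back and update dst[0].
   Returns 0 on success, -1 if the result exceeds 255 bytes or cap. */
int pstr_append(unsigned char *dst, size_t cap, const unsigned char *src)
{
    size_t dlen = dst[0];
    size_t slen = src[0];
    if (dlen + slen > 255 || 1 + dlen + slen > cap)
        return -1;
    memcpy(dst + 1 + dlen, src + 1, slen);
    dst[0] = (unsigned char)(dlen + slen);
    return 0;
}
```

The Pascal-style version writes in two places, but it knows where the old text ends without scanning; the C-string version writes in one place after first walking to the terminator.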
revolution
Azu wrote: Which?
Azu
buzzkill wrote: It may also matter whether you want your strings mutable or immutable (like some newer HLLs). If you eg add to your string, you have to go back to the length byte/word and change that. With a null-terminated string, your algorithms/functions don't change, it's still "keep going until 0".
revolution wrote:
buzzkill
This just occurred to me: performance-wise, if you calculate the length of a C-string once, the entire string will be in your (L1) cache, so future operations on the string could be sped up. Is this a real advantage, or am I just onto nothing here?
Azu
buzzkill wrote: This just occurred to me: performance-wise, if you calculate the length of a C-string once, the entire string will be in your (L1) cache, so future operations on the string could be sped up. Is this a real advantage, or am I just onto nothing here?
And if the L1 cache is too valuable to store a length in, isn't it too valuable to store a string in, anyway?
buzzkill
Quote:
No, with Pascal-strings you have to write in two places: after the original string, and before it to the length byte. With C-strings you only write the part after the original string.
Now, with Pascal-strings you could build in more safety when accessing the string like an array, because you could check the subscript against the length byte (i.e., string[10] is allowed only if length(string) >= 10, if you count your subscripts from 1, which I believe is the Pascal way). But since C has always been about speed and control, and not so much about safety, I would guess that the C inventors chose the null-terminated string for those reasons (though I have no literature at hand to back that up).
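The subscript check described above can be sketched in C. The function name `pstr_char_at` and the layout (length byte at index 0, text after it) are assumptions for illustration, not how any particular Pascal compiler does it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Pascal-style string (assumed layout: s[0] is the length, text follows).
   Fetch character number i, counting from 1 as Pascal does. The range
   check is a comparison against the stored length, with no scan needed.
   Returns false when i is out of range, leaving *out untouched. */
bool pstr_char_at(const unsigned char *s, size_t i, unsigned char *out)
{
    if (i < 1 || i > s[0])
        return false;
    *out = s[i];   /* with the length at s[0], character i sits at s[i] */
    return true;
}
```

A C-string could only make the same check by first scanning for the terminator, which is presumably part of why plain C array indexing does no checking at all.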
Azu
buzzkill wrote:
I meant writing the length of the string to the beginning, instead of writing a null to the end and not being able to use nulls in the string. I think it would make comparisons, scanning, and copying faster. I just want to know if I'm missing anything (besides it taking a register).
Last edited by Azu on 30 Mar 2009, 17:37; edited 1 time in total
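A comparison along these lines might be sketched in C as follows. The layout (a 32-bit count in host byte order, then the bytes) and the names `lp_len` and `lp_equal` are assumptions for this example only.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Read the assumed 32-bit length prefix. */
static uint32_t lp_len(const unsigned char *s)
{
    uint32_t n;
    memcpy(&n, s, sizeof n);
    return n;
}

/* Equality test for two length-prefixed strings. Different lengths mean
   different strings, so the bytes are only examined when the lengths
   match. memcmp compares exactly that many bytes and does not stop at a
   zero byte, so embedded nulls are handled like any other value. */
bool lp_equal(const unsigned char *a, const unsigned char *b)
{
    uint32_t na = lp_len(a);
    uint32_t nb = lp_len(b);
    if (na != nb)
        return false;
    return memcmp(a + sizeof na, b + sizeof nb, na) == 0;
}
```

By contrast, `strcmp` would treat two strings that share a prefix up to an embedded zero as equal, since it stops at the first zero byte.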
Copyright © 1999-2020, Tomasz Grysztar. Also on GitHub, YouTube, Twitter.
Website powered by rwasa.