flat assembler
Message board for the users of flat assembler.
Clear top byte of 64bit GPR?
sinsi 28 Aug 2009, 08:56
Code:
    x dq 00ffffffffffffffh
    ...
    and rax,[x]
    ...
Like this?
Azu 28 Aug 2009, 09:01
Thanks.. it will be much slower when x isn't in L1, though. Is there a way that is faster across the board?
Also, what about working on a memory location? An instruction can only have 1 memory operand..
sinsi 28 Aug 2009, 09:20
>Thanks.. it will be much slower when x isn't in L1, though. Is there a way that is faster across the board?
Code:
    jmp @f
    x dq 00ffffffffffffffh
    @@:
    and rax,[x]
>Also, what about for working on a memory location? Can only have 1 memory arg..
Code:
    x rq 1
    ...
    mov byte[x+7],0 ;and [x],00ffffffffffffffh
Azu 28 Aug 2009, 09:24
sinsi wrote:
>Thanks.. it will be much slower when x isn't in L1, though. Is there a way that is faster across the board?
sinsi wrote:
>Also, what about for working on a memory location? Can only have 1 memory arg..
But.. what about other masks, like 01ffffffffffffffh?
neville 28 Aug 2009, 09:32
Quote: A branch and memory operation are faster than two shifts?
I guess sinsi is saying that "embedding" the data within the code ensures it will be in L1 cache.
_________________
FAMOS - the first memory operating system
sinsi 28 Aug 2009, 09:39
>I guess sinsi is saying that "embedding" the data within the code ensures it will be in L1 cache.
Yeah, I was trying to be clever... Shifting might be good if you put another instruction between the two shifts.
>But.. what about other masks, like 01ffffffffffffffh?
and byte[x+7],1 - if it's memory just treat it as 8 bytes.
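To see why touching only byte[x+7] is enough, here is a small Python model (not fasm). It assumes x86's little-endian layout, where the most significant byte of a qword sits at offset 7. The helper name and the sample value are made up for illustration.

```python
# Model of a qword stored in little-endian memory, as on x86.
# and_top_byte() is a hypothetical helper that mimics `and byte[x+7],mask8`.

def and_top_byte(mem: bytearray, offset: int, mask8: int) -> None:
    """AND only the most significant byte of the qword at `offset`."""
    mem[offset + 7] &= mask8

value = 0xABCDEF0123456789                 # arbitrary sample value
x = bytearray(value.to_bytes(8, "little"))

and_top_byte(x, 0, 0x01)                   # like: and byte[x+7],1
after_byte_and = int.from_bytes(x, "little")

# Same result as a full-width AND with 01ffffffffffffffh
assert after_byte_and == value & 0x01FFFFFFFFFFFFFF

x[7] = 0                                   # like: mov byte[x+7],0
assert int.from_bytes(x, "little") == value & 0x00FFFFFFFFFFFFFF
```

The other seven bytes are never read or written, which is why the byte-sized instruction needs no 64-bit mask constant at all.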
Azu 28 Aug 2009, 10:09
Thanks guys.
One more question: why doesn't shrd rdx,rax,72 work the same as the following?
Code:
    mov rdx,rax
    and rdx,qword[mem00FFFFFFFFFFFFFF]
sinsi 28 Aug 2009, 11:02
The shift count is limited - with 'shrd rdx,rax,72' on a 64-bit operand, the count actually used is '72 and 63', which is 8.
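A rough Python model of that count masking, as a sketch only: shrd64 is a made-up helper name, the register values are arbitrary, and flags are ignored.

```python
MASK64 = (1 << 64) - 1

def shrd64(dest: int, src: int, count: int) -> int:
    """Model of `shrd r64,r64,imm8`: only the low 6 bits of the count are used."""
    count &= 63                          # 72 and 63 = 8
    if count == 0:
        return dest                      # a masked count of 0 leaves dest unchanged
    # dest moves right; the low bits of src fill the vacated high bits
    return ((dest >> count) | (src << (64 - count))) & MASK64

rax = 0x1122334455667788                 # arbitrary sample values
rdx = 0xAABBCCDDEEFF0011

# A count of 72 behaves exactly like a count of 8 ...
assert shrd64(rdx, rax, 72) == shrd64(rdx, rax, 8)
# ... which is not the same as rax with its top byte cleared.
assert shrd64(rdx, rax, 72) != rax & 0x00FFFFFFFFFFFFFF
```

With these sample values the result keeps the upper seven bytes of the old rdx and takes only the lowest byte of rax as its new top byte.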
Azu 28 Aug 2009, 11:06
But isn't the imm 8 bits? It should go up to 255 then.. (or 127 if it's signed)
sinsi 28 Aug 2009, 11:13
For 32-bit operands the count mask is 5 bits (00011111), for 64-bit operands it is 6 bits (00111111). Even a count in CL is masked (except with an 8088?)
Azu 28 Aug 2009, 11:15
So what are the other bits used for?
sinsi 28 Aug 2009, 11:30
Those masks are applied by the CPU, not a compiler.
Think of it this way: without the mask, 'shrd rdx,rax,64' would just be 'mov rdx,rax', and you'll get the idea. What is the point of shifting a register beyond its size? Maybe you are thinking of a rotation rather than a shift?
Azu 28 Aug 2009, 11:45
Because it should be faster than
Code:
    mov rdx,rax
    and rdx,qword[mem00FFFFFFFFFFFFFF]
edit: Er.. actually never mind. Even if it worked it wouldn't do the same thing. I'd need a shl after it.. and then they are both two instructions anyway..
sinsi 28 Aug 2009, 11:57
There's probably a real easy way to do it with ss*e**
Tomasz Grysztar 28 Aug 2009, 12:08
Azu wrote: Because it should be faster than
And why not
Code:
    mov rdx,00FFFFFFFFFFFFFFh
    and rdx,rax
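As a quick sanity check, the immediate mask and the two-shift approach mentioned earlier in the thread should give the same result. This is a Python model with made-up helper names, not a timing comparison.

```python
MASK64 = (1 << 64) - 1

def clear_top_byte_and(rax: int) -> int:
    rdx = 0x00FFFFFFFFFFFFFF       # mov rdx,00FFFFFFFFFFFFFFh
    return rdx & rax               # and rdx,rax

def clear_top_byte_shifts(rax: int) -> int:
    rdx = (rax << 8) & MASK64      # shl: the top byte falls off the register
    return rdx >> 8                # shr: zeros come in from the top

# Arbitrary sample values, including the all-zero and all-one cases
for sample in (0, MASK64, 0x8877665544332211, 0x0123456789ABCDEF):
    assert clear_top_byte_and(sample) == clear_top_byte_shifts(sample)
```

Both forms take two instructions. They differ in whether a 64-bit constant has to be loaded and in how the instructions depend on each other.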
Azu 28 Aug 2009, 12:14
Basically I want to compare a bunch of strings of different lengths.
e.g.
Code:
    mov rax,qword[memory]
    cmp rax,"abcdefgh"
    mov rdx,rax
    je match1
    cmp rax,"12345678"
    je match2
    and rdx,$00FFFFFFFFFFFFFF
    cmp rdx,"qwertyu"
    je match3
    and rdx,$0000FFFFFFFFFFFF
    cmp rdx,"barfoo"
    je match4
    cmp rdx,"foobar"
    je match5
I guess I should have just outlined the whole thing from the beginning. Sorry about that.
LocoDelAssembly 28 Aug 2009, 15:18
Code:
    mov rax,qword[memory]
    cmp rax, [abcdefgh]
    mov rdx,rax
    je match1
    cmp rax, [_12345678]
    je match2
    shl rdx, 8
    cmp rdx, [qwertyu]
    je match3
    shl rdx, 8
    cmp rdx, [barfoo]
    je match4
    cmp rdx, [foobar]
    je match5

    align 64 ; AMD's cache line size
    abcdefgh dq "abcdefgh"
    _12345678 dq "12345678"
    qwertyu dq "qwertyu" shl 8
    barfoo dq "barfoo" shl 16
    foobar dq "foobar" shl 16
But probably this will be slower if used many times. Try to compare what happens when a string table is used and what when the strings are moved to a register via mov reg, imm64 first.
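The shifting trick can be checked with a small Python model. It assumes, as in fasm, that a quoted string used as a number puts its first character in the least significant byte. The function names and test inputs are made up for illustration.

```python
MASK64 = (1 << 64) - 1

def qword(text: bytes) -> int:
    """Value of `dq "text"`: first character in the lowest byte."""
    assert len(text) <= 8
    return int.from_bytes(text, "little")

def match(memory: bytes) -> str:
    rax = int.from_bytes(memory[:8], "little")    # mov rax,qword[memory]
    if rax == qword(b"abcdefgh"):
        return "match1"
    if rax == qword(b"12345678"):
        return "match2"
    rdx = (rax << 8) & MASK64                     # shl rdx,8 drops the 8th character
    if rdx == qword(b"qwertyu") << 8:
        return "match3"
    rdx = (rdx << 8) & MASK64                     # shl rdx,8 drops the 7th character
    if rdx == qword(b"barfoo") << 16:
        return "match4"
    if rdx == qword(b"foobar") << 16:
        return "match5"
    return "no match"

assert match(b"qwertyuX") == "match3"             # 8th byte is ignored
assert match(b"foobar!!") == "match5"             # 7th and 8th bytes are ignored
```

Shifting left pushes the unwanted trailing characters out of the top of the register, so no mask constant is needed. The table entries are pre-shifted by the same amount to line up.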
Azu 28 Aug 2009, 15:27
Thanks.. is there a way to automatically define constants for that?
Something like
Code:
    blah = place
    macro autodefine const{
      if ~ defined const
        virtual at blah
          label const const.size at $
          if const.size eq byte
            db `const
          elseif const.size eq word
            dw `const
          elseif const.size eq dword
            dd `const
          elseif const.size eq qword
            dq `const
          elseif const.size eq dqword
            ddq `const
          end
        end virtual
        blah = blah + const.size
      end if
    }
    cmp rax,autodefine(qword[abcdefgh])
    align 64
    place:
It would save much time, I think.
Borsuc 28 Aug 2009, 15:39
No that's bad size-wise and maybe performance-wise (caching), you should use a loop instead and have a "data structure" with strings & offsets where to jump to. (depending how many strings you have -- if you plan to have many, it's certainly NOT a good idea).
Also putting data in between code is kinda nasty (read: slow) for micro-ops and caching. Here is an example of defining such a data structure, from one of my programs that uses it. Of course, if the strings are not variable-length (like in your case) you won't even NEED the prefix-size for the string (you can use null-terminators too if you want).
Code:
    Strings:
    irps arg, string1 string2 blah whatever
    {
      local str, strsize
      db strsize
      str db `arg
      strsize = $-str
      dd JumpLabel_#arg
    }
    db 0 ; terminator (so we know when the data structure ended)
The format is:
Code:
    <Length Of String><String><4-bytes Jump Label Address for string>
Then you define labels like:
Code:
    JumpLabel_string1:
    ; String 1 code
Note that this is for 32-bit but you get the idea (I think it should work in 64-bit no problem). Then you'll need to loop through this structure. I'm not sure about 64-bit, but in 32-bit you can easily do this with string operations like cmpsd or cmpsb (maybe with a rep prefix).
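A sketch of walking such a structure, written in Python to show the record layout and the loop rather than the actual cmpsb-based code. The addresses are hypothetical stand-ins for the JumpLabel_ values, and build_table and lookup are made-up helper names.

```python
import struct

def build_table(entries):
    """Each record: <1-byte length><string><4-byte little-endian address>,
    followed by a single 0 byte that terminates the structure."""
    table = bytearray()
    for name, address in entries:
        data = name.encode("ascii")
        table += bytes([len(data)]) + data + struct.pack("<I", address)
    table.append(0)                       # db 0 ; terminator
    return bytes(table)

def lookup(table, key):
    """Walk the records; return the stored address for `key`, or None."""
    pos = 0
    while table[pos] != 0:                # a length byte of 0 ends the table
        length = table[pos]
        text = table[pos + 1 : pos + 1 + length]
        (address,) = struct.unpack_from("<I", table, pos + 1 + length)
        if text == key:
            return address                # the assembly version would jump here
        pos += 1 + length + 4             # skip length byte, string, address
    return None

# Hypothetical addresses standing in for JumpLabel_string1 and friends
table = build_table([("string1", 0x401000), ("string2", 0x401020),
                     ("blah", 0x401040), ("whatever", 0x401060)])

assert lookup(table, b"blah") == 0x401040
assert lookup(table, b"nope") is None
```

Because every record starts with its own length, the loop can skip a non-matching entry without scanning its characters. With a terminator-based layout it would have to scan each string to find the next record.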
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.