flat assembler
Message board for the users of flat assembler.

flat assembler 1.69.45-47

Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8351
Location: Kraków, Poland
Tomasz Grysztar 17 Mar 2012, 17:43
Today's release comes with a major expansion of the supported instruction sets. The newly implemented instructions include AVX2, BMI, HLE and RTM from Intel, and TBM from AMD. It was all written very quickly, so there may be some bugs - please let me know if you find anything wrong with the opcode generation.

AVX2 includes a subset of GATHER instructions, which have a new and unique syntax. It is possible to use an XMM/YMM register in the address of the memory operand (it is then treated as a set of packed addresses). Such a register acts as the index register, so it may also be scaled by a factor of 2, 4 or 8. Here is a sample of how it looks in practice:
Code:
vgatherdps xmm1,[eax+xmm0],xmm3      ; xmm1,vm32x,xmm2
vgatherdps ymm1,[ecx+ymm7*2],ymm3    ; ymm1,vm32y,ymm2

vgatherqps xmm1,[ebp+xmm2],xmm3      ; xmm1,vm64x,xmm2
vgatherqps xmm1,[ebx+ymm2],xmm3      ; xmm1,vm64y,xmm2

vpgatherdd xmm1,[xmm0],xmm3          ; xmm1,vm32x,xmm2
vpgatherdd ymm1,[ymm0+3],ymm3        ; ymm1,vm32y,ymm2    
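For readers new to this form, here is a rough sketch of what one such gather does, written as comments around a single instruction. The per-element description follows the Intel documentation as I understand it; the registers are arbitrary and chosen only for illustration.

```asm
; Illustrative only: the registers are arbitrary.
; xmm0 holds four signed dword indexes, xmm3 holds the mask.
; For each element i = 0..3 whose mask dword has its highest bit set:
;     dword i of xmm1 = dword at [eax + 4 * (dword i of xmm0)]
; Elements whose mask bit is clear are left unchanged in xmm1,
; and the mask register is cleared once the instruction completes.
vgatherdps xmm1,[eax+xmm0*4],xmm3
```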
There is one detail concerning the GATHER instructions that I still have to decide what to do with. The Intel manual states that "if any pair of the index, mask, or destination registers are the same, instruction results in a UD fault". Now I'm not sure what to choose - should fasm check whether each of the three XMM/YMM registers is different, and signal an error otherwise, to prevent generating an invalid opcode? Perhaps yes. I have not implemented such a check yet, however.
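To make the rule concrete, here is a small sketch (the register numbers are arbitrary). The first line uses three different registers. Each commented-out line repeats one of them and so, going by the quoted sentence, would be an encoding that always faults.

```asm
vgatherdps xmm1,[eax+xmm0*4],xmm3     ; destination, index, mask all differ

; Each of these reuses a register and so falls under the quoted rule:
; vgatherdps xmm1,[eax+xmm1*4],xmm3   ; destination = index
; vgatherdps xmm1,[eax+xmm0*4],xmm1   ; destination = mask
; vgatherdps xmm1,[eax+xmm0*4],xmm0   ; index = mask
```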

Now, since everything on my to-do list for fasm 1.70 is done (I did remove Mach-O from that list - it's on the list for 1.72 now ;)), the days of the 1.69.x line are near their end. When I get a stable version and no new bug reports are coming, I will release the 1.70 milestone - I hope that will happen quite soon, though I still have to write the documentation for AVX instructions first. I may not have this much time for fasm development later this year, so I'd love to have 1.70 finished ASAP.
typedef



Joined: 25 Jul 2010
Posts: 2909
Location: 0x77760000
typedef 17 Mar 2012, 18:30
Nice.

Another request, if possible, for the next release: can you implement a recent files list?
Don't sweat it if you have higher priorities to attend to. I just thought that'd help too.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20300
Location: In your JS exploiting you and your system
revolution 18 Mar 2012, 00:53
Tomasz Grysztar wrote:
There is one detail concerning the GATHER instructions that I still have to decide what to do with. The Intel manual states that "if any pair of the index, mask, or destination registers are the same, instruction results in a UD fault". Now I'm not sure what to choose - should fasm check whether each of the three XMM/YMM registers is different, and signal an error otherwise, to prevent generating an invalid opcode? Perhaps yes. I have not implemented such a check yet, however.
My thought is yes, do the check. I did this with all of the ARM/THUMB code in fasmarm. It is especially helpful with macros since sometimes the actual registers being used are hidden and/or unknown during the coding stage. My viewpoint on #UD is that only the properly defined UD2 should be compilable to generate such a runtime error. When I assemble my code I like to think that it will run without unexpected faults generated by "bad" code generation.
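A hypothetical macro of the kind I mean, where the caller never sees the gather itself. The macro name and its parameters are made up for illustration and are not taken from any real code.

```asm
; Hypothetical wrapper: the name and parameters are invented for this example.
macro gather_floats dest,base,index,mask
{
    vgatherdps dest,[base+index*4],mask
}

gather_floats xmm1,eax,xmm0,xmm3      ; three distinct registers
; gather_floats xmm1,eax,xmm3,xmm3    ; index = mask, easy to miss at the call site
```

With an assemble-time check, the second call would be reported as an error instead of silently producing an opcode that faults at run time.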
Tomasz Grysztar 18 Mar 2012, 09:03
revolution wrote:
My viewpoint on #UD is that only the properly defined UD2 should be compilable to generate such a runtime error.
This may not be perfectly attainable. For example, LAR will generate #UD in V86 mode but not in 16-bit PM, and many other instructions will #UD depending on specific modes being set or not, like MOV to debug registers when CR4.DE=1, etc.
But this can at least be applied to the opcodes that will ALWAYS generate a #UD, and that's why I also thought this check for the gather instructions should be implemented (I think I need a new error message for this one).
revolution 18 Mar 2012, 09:05
I noticed this discrepancy:
Code:
virtual at eax*2  ;<--- Okay
 label x2
end virtual

mov ebx,[x2]

virtual at ymm7
 label y
end virtual

virtual at ymm7*2  ;<--- error: invalid address.
 label y2
end virtual

vgatherdps ymm1,dword[ecx+ymm7*2],ymm3
vgatherdps ymm1,dword[ecx+y*2],ymm3
vgatherdps ymm1,dword[ecx+y2],ymm3    
Tomasz Grysztar 18 Mar 2012, 10:13
I'm fixing this in the 1.69.46 release, along with an added check for disallowed combinations of registers in V(P)GATHERxxx.
revolution 18 Mar 2012, 12:33
This no longer assembles with v1.69.46:
Code:
use64
virtual at rip  ;<--- error: invalid address.
 label z
end virtual    
Is this a deliberate change?
Tomasz Grysztar 18 Mar 2012, 13:56
No, and it's already corrected.
revolution 18 Mar 2012, 14:04
The latest v1.69.46 accepts this:
Code:
virtual at rip+(-1) shl 64
 label z
end virtual
mov rax,[z]    
Tomasz Grysztar 18 Mar 2012, 14:36
Oh, you found a place where I forgot to add support for the new 65th value bit. I hope there aren't many such places left. ;) I'm uploading another correction.
revolution 18 Mar 2012, 14:55
It still accepts this. It seems the .exe files have not been updated, although the source has been altered.
Tomasz Grysztar 18 Mar 2012, 14:56
Oh, perhaps I forgot to re-assemble. :D
revolution 18 Mar 2012, 15:30
Better now.
peter



Joined: 09 May 2006
Posts: 63
peter 19 Mar 2012, 08:16
Tomasz, thank you so much for implementing AVX2 instructions.
revolution 19 Mar 2012, 12:11
v1.69.47:
Code:
use64
label k6 at rip+eip ; <--- Okay! Expected an error here
label k7 at rip+eax ; <--- error: invalid address.    
[RIP+EIP] is an interesting address to read ;)
Tomasz Grysztar 19 Mar 2012, 12:21
Yeah, a very interesting address indeed. :) A fix is on the way.
Tomasz Grysztar 19 Mar 2012, 13:11
One more thing: should I disallow putting a segment prefix in the V(P)GATHERxxx address? The Intel documentation states that the computed addresses are linear ones, so segment prefixes will probably do nothing there.
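For reference, the form in question would look something like this in fasm's usual segment-override syntax. This is only a sketch with arbitrary registers. Whether it should assemble at all is exactly what is undecided here.

```asm
; Sketch only: an FS override on a gather's memory operand.
vgatherdps xmm1,[fs:eax+xmm0*4],xmm3
```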
revolution 19 Mar 2012, 14:34
Tomasz Grysztar wrote:
One more thing: should I disallow putting a segment prefix in the V(P)GATHERxxx address? The Intel documentation states that the computed addresses are linear ones, so segment prefixes will probably do nothing there.
I can't find where that is stated. In which manual are you reading that? But if that is true then it would break 32-bit code.

Are you referring to this?
Section 4.2 - VECTOR SIB (VSIB) MEMORY ADDRESSING wrote:
In AVX2, an SIB byte that follows the ModR/M byte can support VSIB memory addressing to an array of linear addresses.
If so, then I don't think "linear" is meant to imply that no segment base is applied.
Tomasz Grysztar 19 Mar 2012, 20:18
I think they should be more precise about these things. Well - in the old days the meaning of the phrase "linear address" in Intel manuals was quite strict.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
