flat assembler
Message board for the users of flat assembler.

Index > Linux > converting 32 bit linked fasm to 64 bit g++ inline asm

fpga



Joined: 22 Sep 2009
Posts: 36
fpga 24 Jan 2010, 15:02
I've fallen at the first hurdle and wonder if someone could put me straight
re what I'm doing wrong to get a file descriptor of -14 when I open /dev/vcsa1

Here's my code
Code:
//My amd64 inline asm version 
#include <iostream>
using namespace std;

void my_fn(){
   const char * fl =       "/dev/vcsa1";
   const char * test_str = "          "; //make same size as fl
   int test_int = -1; //ie initialised to fail unless replaced by value 

   //1 open /dev/vcsa1 & return file descriptor in rax  
   __asm__ __volatile__(
   ".equ sys_open, 2\n"           //changed for 64bit
   ".equ O_RDWR, 02\n"          //same for 32 and 64 bit
   "mov $sys_open, %%rax\n" 
   "mov $O_RDWR, %%rcx\n" 
   "mov $0600, %%rdx\n" //read/write for user in x86. Not sure for AMD64?
   "syscall\n"                         //changed from int $0x80
   :"=b"(test_str), "=a"(test_int)
   :"b"(fl)
   ); 
  
   //test file name in rbx and returned file descriptor
   cout << test_str << endl; // /dev/vcsa1                     pass
   cout << test_int << endl; // 2 before syscall -14 after ie  fail  
}

int main(){
   my_fn();
   return 0;
}

    

Any help much appreciated

EDIT:
Here is the fasm code
I converted it from konsola.asm (in nasm)
found at http://rudy.mif.pg.gda.pl/~bogdro/linux/tryb_txt_linux_en.html
Hope this helps

Code:
;format  ELF executable
;commented out and replaced with...
format  ELF64

;entry   start
;commented out and replaced with...
public _print_str as 'print_str'

sys_exit       equ      1
sys_read       equ 3
sys_write      equ 4
sys_open       equ 5
sys_close      equ 6
sys_lseek      equ 19
SEEK_SET       equ        0
O_RDWR                equ    02o

; position to display
nasz_wiersz          equ  10
nasza_kolumna     equ     10

;start:
;commented out and replaced with...
_print_str: 

        mov     eax, sys_open           ; open the file
        mov     ebx, plik               ; file name
        mov     ecx, O_RDWR             ; read and write
        mov     edx, 600o               ; read and write for the user
        int     80h                     ; open

        cmp     eax, 0
        jl      .koniec

        mov     ebx, eax                ; file handle
        mov     eax, sys_read           ; read the file (console attributes first)
        mov     ecx, konsola            ; where to read
        mov     edx, 4                  ; how many bytes to read
        int     80h


        mov     eax, sys_lseek          ; seek to the correct position
        movzx   ecx, byte [l_kolumn]    ; zero-extends the byte source to the 32-bit destination
        imul    ecx, nasz_wiersz
        add     ecx, nasza_kolumna      ; ECX = row * row length + column
        shl     ecx, 1                  ; ECX *= 2, because there are 2 bytes
                                        ; on the screen for each character:
                                        ; the character and its attribute
        add     ecx, 4                  ; +4, because we're moving from
                                        ; the beginning of the file
        mov     edx, SEEK_SET           ; start at the beginning of the file
        int     80h


        mov     eax, sys_write          ; writing to the file
        mov     ecx, znak               ; what to write
        mov     edx, 2                  ; how many bytes to write
        int     80h

        mov     eax, sys_close          ; close the file
        int     80h


        xor     eax, eax                ; EAX = 0 = no errors

.koniec:
        mov     ebx, eax
        mov     eax, sys_exit
        int     80h                     ; exit with zero code or the error,
                                        ; from opening the file

plik      db      "/dev/vcsa1", 0               ; first text console file
                                                  ; attributes of the console:
konsola:
l_wierszy    db      0
l_kolumn   db      0
kursor_x   db      0
kursor_y   db      0

                       ; character with the attribute we're going to display:
znak             db      "*"                           ;<<<<<<HERE
atrybut                db      43h             ; skyblue on red  ;<<<<<<HERE

    


Code:
/*
this file is a .cpp skeleton just to link object file of fasm konsola.asm
*/
using namespace std;
extern "C" void print_str();
int main(){
   print_str();
   return 0;
} 
    


Just linked the two like this
Code:
#!/bin/sh
#create the exe "linked_konsola"
./fasm linked_konsola_fasm.asm linked_konsola_fasm.o
g++ -c linked_konsola_cpp.cpp
g++ -o linked_konsola linked_konsola_fasm.o linked_konsola_cpp.o
    


To run it I did...
ctrl/alt/F1
su'd to get root permissions
./linked_konsola

It prints a red star on a blue background at position 10,10


Last edited by fpga on 24 Jan 2010, 15:49; edited 5 times in total
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20299
Location: In your JS exploiting you and your system
revolution 24 Jan 2010, 15:26
Erm, where is the fasm code?
fpga



Joined: 22 Sep 2009
Posts: 36
fpga 24 Jan 2010, 15:36
I've included it in the 1st post now. Hope it helps.
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 24 Jan 2010, 16:25
Any reason you want to use inline assembly instead of linking against an external object assembled by FASM?

Some general observations:
  • you really only need to do "const char * test_str;" - memory pointed to by test_str isn't written to in your example, only the pointer is. Also, see note below.
  • you should add at least RCX and RDX to the clobber list (and check what's considered clobberable when using the SYSCALL interface; there might be more). RAX and RBX needn't be added since you specify those as output arguments, and they're thus considered clobbered implicitly.
  • if SYSCALL sets flags, you should add "cc" to the clobber list.
  • when you decide to do a sys_read (or any other call that causes a memory write not specified by an output parameter) you should add "memory" to the clobber list.

As for why you're getting an unexpected return value, dunno - never used syscalls directly in linux programs, and I think it's a pretty bad idea to do so. Try tracing your program with gdb, then trace a C version calling open(), and see what the differences are.
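One detail that may help when reading such return values: a raw Linux system call reports failure by returning the negated errno value in rax, so -14 corresponds to EFAULT ("Bad address"). Below is a minimal sketch that decodes a raw result; `describe_raw_result` is a hypothetical helper name used only for illustration, not part of any library.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper: turn the value a raw syscall left in rax into text.
// Raw Linux syscalls signal failure with -errno, i.e. a value in -4095..-1.
std::string describe_raw_result(long ret) {
    if (ret < 0 && ret >= -4095) {
        return std::string("error: ") + std::strerror(static_cast<int>(-ret));
    }
    return "ok: " + std::to_string(ret);
}
```

Printing `describe_raw_result(test_int)` after the asm block would give the errno text rather than a bare negative number.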

Note on const: "const char*" means the contents are const, but the pointer isn't... If you expected the contents to change, this is obviously wrong. If you consider fl to be entirely immutable (ie, the string it points to as well as the pointer), that would be "const char * const fl = "/dev/vcsa1";". The C syntax for pointers and const is probably one of the trickier things in the language... Here's a sample:
Code:
//const      char *      fasm = "fasm";    // memory:read, pointer:read-write
//      char *const fasm = "fasm";      // memory:read-write, pointer:read
const   char *const fasm = "fasm";      // memory:read, pointer:read

const char *readptr;
char *writeptr;

int main()
{
//  fasm = "masm";      // can't reassign pointer
//  fasm[0] = 'm';      // can't write to read-only

    readptr = fasm;     // allowed, since readptr points-to-const
//  readptr[0] = 'm';   // still can't write to read-only

    writeptr = fasm;    // allowed in C (with warning), disallowed in C++
    writeptr[0] = 'm';  // might work, might crash - implementation specific
}    

Things can get pretty damn hairy when you need multiple layers of pointed-to types; fortunately you rarely need this, though, and typedefs help a lot when you do.
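A small sketch of the typedef approach for two levels of pointer; the alias names `ConstStr` and `ConstStrTable` and the function `count_entries` are made up for illustration.

```cpp
#include <type_traits>

typedef const char *ConstStr;            // pointer to read-only chars
typedef const ConstStr *ConstStrTable;   // pointer to read-only ConstStr entries

// Written out by hand, ConstStrTable is "const char *const *".
static_assert(std::is_same<ConstStrTable, const char *const *>::value,
              "typedef and hand-written declaration name the same type");

// Counts entries up to a null terminator. Through this parameter type the
// function can modify neither the table entries nor the strings.
int count_entries(ConstStrTable table) {
    int n = 0;
    while (table[n] != nullptr) ++n;
    return n;
}
```

The parameter itself is still a plain (reassignable) pointer; only what it points to is protected at both levels.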
fpga



Joined: 22 Sep 2009
Posts: 36
fpga 24 Jan 2010, 17:24
Good question!
Yes there is.

This code is for updating the screen for a text based simulation and many processed parts/characters will need to be redrawn as a result of their state/position being changed during each screen update.

The parts' char, x, y and fg/bg colour attribute are embedded in nested c++ objects and the different types of the outer objects are stored in various stl vectors.

As such I need to loop through such vectors and interrogate their nested objects for the display information. I don't fancy looping through these structures in asm unless there's an easy way and don't want to incur the cost of a call to asm for each part that needs redisplaying.
ie
Code:
void Update_screen(){
   for each machine in vector
      for each part in machine
            asm bit (
               suck part.char into reg a
               suck part.fg_bg into reg b
               etc
               write char directly to video ram without call overhead     
}
    


I hope this explains why

Edit:
Just seen your further post
Thank you for your advice
I did a .S of an example c "open file" program which used the relevant include files.
It looked a nightmare and I hadn't got a clue what some of the routines were that were being called.
I would have thought the routines were corruption but for the word "call" preceding them.
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 25 Jan 2010, 00:41
fpga wrote:
As such I need to loop through such vectors and interrogate their nested objects for the display information. I don't fancy looping through these structures in asm unless there's an easy way and don't want to incur the cost of a call to asm for each part that needs redisplaying.
Call overhead probably won't matter too much here, but it's a good idea to think about some optimizations nonetheless - and you definitely don't want to attempt accessing STL vectors in assembly.

Instead of focusing on optimizing CALL overhead away, though, you should focus on optimizing the routine - calling a write per char is pretty bad, and the CALL overhead you're optimizing away is going to be negligible compared to the user->kernel->user overhead and the various processing done as well. Unfortunately, afaik mmap()'ing a /dev/vcs* file doesn't work (at least I couldn't get it to work); that would have been a nice way of accessing the textmode buffer.

I suggest you keep your own copy of the textmode memory though, do your drawing directly to this buffer, and when you need a redisplay simply write this entire buffer to the /dev/vcs* file handle you have open - no need to bother about doing direct SYSCALLs then, CALL overhead to write() won't be a problem. If you're frequently only doing very small updates, you can employ partial-refresh optimizations... I suppose an "isDirty" flag per line is enough; doing it for each individual cell is too much overhead (counting storage for the flags, code time for doing detection, and the call overhead for displaying individual characters).
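A rough sketch of the shadow-buffer idea with one dirty flag per row. The class name `ShadowScreen` and its methods are hypothetical. It assumes the layout used by the fasm listing earlier in the thread (a 4-byte header followed by character/attribute byte pairs), and it assumes that positioned writes via pwrite() are accepted by the device; if they are not, lseek() followed by write() as in the fasm listing would be the fallback.

```cpp
#include <unistd.h>
#include <cstddef>
#include <vector>

// Hypothetical shadow copy of a text screen with one dirty flag per row.
class ShadowScreen {
public:
    ShadowScreen(int rows, int cols)
        : rows_(rows), cols_(cols),
          cells_(static_cast<std::size_t>(rows) * cols * 2, 0),
          dirty_(rows, false) {}

    // Store a character and its attribute; mark the row only on a real change.
    void put(int row, int col, unsigned char ch, unsigned char attr) {
        std::size_t i = (static_cast<std::size_t>(row) * cols_ + col) * 2;
        if (cells_[i] == ch && cells_[i + 1] == attr) return;
        cells_[i] = ch;
        cells_[i + 1] = attr;
        dirty_[row] = true;
    }

    // Write each dirty row with a single pwrite().
    // Returns the number of rows written, or -1 on a failed or short write.
    int flush(int fd) {
        const std::size_t row_bytes = static_cast<std::size_t>(cols_) * 2;
        int written = 0;
        for (int r = 0; r < rows_; ++r) {
            if (!dirty_[r]) continue;
            off_t off = kHeaderBytes + static_cast<off_t>(r * row_bytes);
            ssize_t n = pwrite(fd, &cells_[r * row_bytes], row_bytes, off);
            if (n != static_cast<ssize_t>(row_bytes)) return -1;
            dirty_[r] = false;
            ++written;
        }
        return written;
    }

private:
    enum { kHeaderBytes = 4 };  // assumed header: rows, columns, cursor x, cursor y
    int rows_, cols_;
    std::vector<unsigned char> cells_;   // 2 bytes per cell: character, attribute
    std::vector<bool> dirty_;            // one flag per row
};
```

In this sketch a frame that touches three rows costs three write calls regardless of how many cells changed in each row, which is the trade-off described above.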

Also, have you considered looking at ncurses? Manually writing to a vcs is faster, but afaik ncurses isn't that bad... and it's more portable (vcs doesn't even work across SSH) and has a lot of pre-baked functionality (including partial screen refreshes, afaik).

fpga wrote:
I did a .S of an example c "open file" program which used the relevant include files.
It looked a nightmare and I hadn't got a clue what some of the routines were that were being called.
I would have thought the routines were corruption but for the word "call" preceding them.
Well, I suggested tracing with GDB rather than looking at an assembly listing, since what you're after (parameters to the SYSCALL and an idea of trashed registers) isn't going to show from a listing (it's part of libc and not your program code, plus you need to look at runtime behavior). GCC .s output can be somewhat interesting - I often find that disassembling output object files can be easier than reading compiler generated assembly (this goes for GCC as well as MSVC).
fpga



Joined: 22 Sep 2009
Posts: 36
fpga 25 Jan 2010, 10:08
Code:
//My amd64 inline asm version
//this now returns a file descriptor of 3
#include <iostream>
using namespace std;

void my_fn(){
   const char * fl =       "/dev/vcsa1";
   const char * test_str = "          "; //make same size as fl
   int test_int = -1;                    //fails unless replaced by value 
   unsigned char buf[] = {0,0,0,0};

   //1 open /dev/vcsa1 & return file descriptor in rax  
   __asm__ __volatile__(
   ".equ sys_open,2\n"
   ".equ O_RDWR,02\n"
   "movq $sys_open,%%rax\n" 
   "movq $O_RDWR,%%rsi\n" 
   "movq $0600,%%rdx\n" //read/write for user in x86. Not sure for AMD64?
   "syscall\n" 
   :"=D"(test_str),"=a"(test_int)
   :"D"(fl)              //rdi
   :"rsi","rdx","rcx","r10","r11","cc" //Is this enough
   ); 

   //cout << test_str << endl; // /dev/vcsa1    ie pass
   //cout << test_int << endl; // 3             ie pass

   //read 1st 4 bytes of file into buf[]
   __asm__ __volatile__(
   "movq %%rax, %%rsi\n"   //returned file descriptor into rsi
   "movq $0,%%rax\n"      //syscall read into rax
   "movq $4,%%rdx\n"      //qty of bytes to read into rdx
   "syscall\n" 
   :"=a"(test_int)
   :"D"(&buf[0])          //ptr of where to place 1st byte into rdi
   :"rsi","rdx","rcx","r10","r11","cc"
   );
   cout << test_int << endl; //returning -9 expected 4   ie fail 
}

int main(){
   my_fn();
   return 0;
}
    


I appear to have made some progress in that I'm now getting 3 as a file descriptor when I (seem to) open /dev/vcsa1. The problem was that in addition to changing the equates, 64bit Linux seems to use different registers too. I'm now struggling with the read statement which is telling me I'm reading -9 bytes instead of the 4 specified.
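For reference, the x86-64 Linux convention passes the system call number in rax and the arguments in rdi, rsi, rdx, r10, r8 and r9, and the syscall instruction itself overwrites rcx and r11. read(fd, buf, count) therefore wants the descriptor in rdi and the buffer pointer in rsi. -9 is -EBADF ("Bad file descriptor"), which would fit the descriptor and buffer being swapped in the second block. The compiler also does not guarantee that rax survives from one asm statement to the next. A sketch under those assumptions follows; `raw_syscall3` and `read_prefix` are hypothetical helper names, and /dev/zero stands in for /dev/vcsa1 so the example runs without root.

```cpp
#include <fcntl.h>   // O_RDONLY

// Hypothetical helper: issue a three-argument Linux syscall on x86-64.
// Returns the raw value from rax (a negated errno on failure).
static inline long raw_syscall3(long nr, long a1, long a2, long a3) {
    long ret;
    __asm__ __volatile__(
        "syscall"
        : "=a"(ret)                               // result comes back in rax
        : "a"(nr), "D"(a1), "S"(a2), "d"(a3)      // rax, rdi, rsi, rdx
        : "rcx", "r11", "memory", "cc");          // trashed by syscall / kernel writes
    return ret;
}

// x86-64 system call numbers (these differ from the 32-bit int 80h numbers).
enum { SYS_READ_NR = 0, SYS_OPEN_NR = 2, SYS_CLOSE_NR = 3 };

// Open a file read-only, read up to len bytes into buf, close it again.
// Returns the result of the read, or the negative error from open.
long read_prefix(const char *path, unsigned char *buf, long len) {
    long fd = raw_syscall3(SYS_OPEN_NR, reinterpret_cast<long>(path), O_RDONLY, 0);
    if (fd < 0) return fd;
    long n = raw_syscall3(SYS_READ_NR, fd, reinterpret_cast<long>(buf), len);
    raw_syscall3(SYS_CLOSE_NR, fd, 0, 0);
    return n;
}
```

Keeping the descriptor in a C++ variable between calls, instead of expecting it to stay in rax, lets the compiler place it wherever it likes between the asm statements.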

I completely agree with your advice that one write per char is unreasonable and that dirty lines + a single write per screen update is the way to go.

My current display system is in ncurses. It works, but disappointingly compared to my PowerBASIC version. I wrote the graphics module for that one using the Windows API. It was more akin to your above suggestion ie updating a data structure and then blitting the whole lot to the screen. This is not a criticism of ncurses 'cos it's a UI not a fast graphics library.

I'll look into doing things the way that you suggest and thank you very much for your advice.

Perhaps the above thread ie
http://board.flatassembler.net/topic.php?t=5950
will throw some light on where I'm going wrong

perhaps not ie
"and plans for a near future:
- make 64bit version. (most likely will go into separate package. not sure yet.)"
Thx for the thread anyway arafel
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
