flat assembler
Message board for the users of flat assembler.
tthsqe



Joined: 20 May 2009
Posts: 653
Put on your optimization hats, it's time to beat the compiler
I have completed the conversion of a large c++ project to fasm at
https://github.com/tthsqe12/asm
it is supposed to be the same as this
https://github.com/official-stockfish/Stockfish

As you can see, it is a chess engine, and I'm hoping to make it faster. If anyone has any ideas at the instruction level or higher, I would be willing to listen, but I also realize it might take some time to digest the original c++ code.
Post 28 Jun 2016, 03:54
redsock



Joined: 09 Oct 2009
Posts: 243
Location: Australia
Is there profiler information for your codebase? (what specific functions can/should be closely inspected, etc)

Is this windows-only?

_________________
2 Ton Digital - https://2ton.com.au/
Post 28 Jun 2016, 04:26
randall



Joined: 03 Dec 2011
Posts: 149
Location: Poland
And now, is it faster or slower than C++ version?

_________________
https://github.com/michal-z
Post 28 Jun 2016, 14:54
tthsqe



Joined: 20 May 2009
Posts: 653
it is currently faster than the fastest c++ compile, but I want to go faster and also support linux/osx. Randall, do you think you could help with the linux port? I have all of the os specific functions in one file, which would need translation to linux syscalls.
Post 28 Jun 2016, 16:40
randall



Joined: 03 Dec 2011
Posts: 149
Location: Poland

tthsqe wrote:
it is currently faster than the fastest c++ compile, but I want to go faster and also support linux/osx. Randall, do you think you could help with the linux port? I have all of the os specific functions in one file, which would need translation to linux syscalls.



Impressive work. I do not code for Linux any more; I don't even have Linux installed on my machine. Sorry, I won't be able to help with the Linux port.
But I will try to run your program under VTune to look for possible optimizations.

_________________
https://github.com/michal-z
Post 28 Jun 2016, 19:45
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1173
Location: Unknown
Very good work, tthsqe!

I used to play/cheat a lot with "stockfish" in the past (2010-2012). It is my favorite chess engine.

If there is something I can do to help with the Linux port, that would be great.
Post 12 Jul 2016, 00:59
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1173
Location: Unknown

Quote:

I have all of the os specific functions in one file, which would need translation to linux syscalls.


Where is the file?
Post 12 Jul 2016, 04:47
tthsqe



Joined: 20 May 2009
Posts: 653
the relevant commit is
https://github.com/tthsqe12/asm/commit/634c1b1ceabbfa4e3fc512540b44d570663b398c

You will want to have a look at asm/asmFish/OsWindows.asm and try to fill in the functions in asm/asmFish/OsLinux.asm. The ones that are not done yet are simply marked with int3. The main source for linux is asm/asmFish/asmFish_base.ASM, which is currently just a timer test function. The windows version is fully functional and plays great! ( http://spcc.beepworld.de )
Post 12 Jul 2016, 15:07
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1173
Location: Unknown
Thank you!

As github takes a lot of bandwidth, I cannot clone your project here (I just downloaded the commit from zip).

I am currently very limited in internet and can use only 50MB of 3G data daily. I am online only Monday to Wednesday now.

About the missing functions: could you tell me how each function should exit, for example whether any registers must be returned untouched?

For example, I have now done "_FileOpen":

Code:

_FileOpen:
        ; in: rcx path string
        ; out: rax handle from CreateFile (win), fd (linux)
                mov   rdi,rcx
                mov   esi,O_RDWR
                mov   eax,sys_open
            syscall
                ret




Is it OK to return like that? If the function fails, do I have to "translate" the returned error to match the Windows one?

Here is some Linux data that you may include in your Linux files:

Linux 64-bit syscall numbers:

Code:

sys_read                     = $0000
sys_write                    = $0001
sys_open                     = $0002
sys_close                    = $0003
sys_newstat                  = $0004
sys_newfstat                 = $0005
sys_newlstat                 = $0006
sys_stat                     = $0004
sys_fstat                    = $0005
sys_lstat                    = $0006
sys_poll                     = $0007
sys_lseek                    = $0008
sys_mmap                     = $0009
sys_mprotect                 = $000A
sys_munmap                   = $000B
sys_brk                      = $000C
sys_rt_sigaction             = $000D
sys_rt_sigprocmask           = $000E
stub_rt_sigreturn            = $000F
sys_ioctl                    = $0010
sys_pread64                  = $0011
sys_pwrite64                 = $0012
sys_readv                    = $0013
sys_writev                   = $0014
sys_access                   = $0015
sys_pipe                     = $0016
sys_select                   = $0017
sys_sched_yield              = $0018
sys_mremap                   = $0019
sys_msync                    = $001A
sys_mincore                  = $001B
sys_madvise                  = $001C
sys_shmget                   = $001D
sys_shmat                    = $001E
sys_shmctl                   = $001F
sys_dup                      = $0020
sys_dup2                     = $0021
sys_pause                    = $0022
sys_nanosleep                = $0023
sys_getitimer                = $0024
sys_alarm                    = $0025
sys_setitimer                = $0026
sys_getpid                   = $0027
sys_sendfile64               = $0028
sys_socket                   = $0029
sys_connect                  = $002A
sys_accept                   = $002B
sys_sendto                   = $002C
sys_recvfrom                 = $002D
sys_sendmsg                  = $002E
sys_recvmsg                  = $002F
sys_shutdown                 = $0030
sys_bind                     = $0031
sys_listen                   = $0032
sys_getsockname              = $0033
sys_getpeername              = $0034
sys_socketpair               = $0035
sys_setsockopt               = $0036
sys_getsockopt               = $0037
stub_clone                   = $0038
stub_fork                    = $0039
stub_vfork                   = $003A
stub_execve                  = $003B
sys_exit                     = $003C
sys_wait4                    = $003D
sys_kill                     = $003E
sys_newuname                 = $003F
sys_semget                   = $0040
sys_semop                    = $0041
sys_semctl                   = $0042
sys_shmdt                    = $0043
sys_msgget                   = $0044
sys_msgsnd                   = $0045
sys_msgrcv                   = $0046
sys_msgctl                   = $0047
sys_fcntl                    = $0048
sys_flock                    = $0049
sys_fsync                    = $004A
sys_fdatasync                = $004B
sys_truncate                 = $004C
sys_ftruncate                = $004D
sys_getdents                 = $004E
sys_getcwd                   = $004F
sys_chdir                    = $0050
sys_fchdir                   = $0051
sys_rename                   = $0052
sys_mkdir                    = $0053
sys_rmdir                    = $0054
sys_creat                    = $0055
sys_link                     = $0056
sys_unlink                   = $0057
sys_symlink                  = $0058
sys_readlink                 = $0059
sys_chmod                    = $005A
sys_fchmod                   = $005B
sys_chown                    = $005C
sys_fchown                   = $005D
sys_lchown                   = $005E
sys_umask                    = $005F
sys_gettimeofday             = $0060
sys_getrlimit                = $0061
sys_getrusage                = $0062
sys_sysinfo                  = $0063
sys_times                    = $0064
sys_ptrace                   = $0065
sys_getuid                   = $0066
sys_syslog                   = $0067
sys_getgid                   = $0068
sys_setuid                   = $0069
sys_setgid                   = $006A
sys_geteuid                  = $006B
sys_getegid                  = $006C
sys_setpgid                  = $006D
sys_getppid                  = $006E
sys_getpgrp                  = $006F
sys_setsid                   = $0070
sys_setreuid                 = $0071
sys_setregid                 = $0072
sys_getgroups                = $0073
sys_setgroups                = $0074
sys_setresuid                = $0075
sys_getresuid                = $0076
sys_setresgid                = $0077
sys_getresgid                = $0078
sys_getpgid                  = $0079
sys_setfsuid                 = $007A
sys_setfsgid                 = $007B
sys_getsid                   = $007C
sys_capget                   = $007D
sys_capset                   = $007E
sys_rt_sigpending            = $007F
sys_rt_sigtimedwait          = $0080
sys_rt_sigqueueinfo          = $0081
sys_rt_sigsuspend            = $0082
sys_sigaltstack              = $0083
sys_utime                    = $0084
sys_mknod                    = $0085
sys_personality              = $0087
sys_ustat                    = $0088
sys_statfs                   = $0089
sys_fstatfs                  = $008A
sys_sysfs                    = $008B
sys_getpriority              = $008C
sys_setpriority              = $008D
sys_sched_setparam           = $008E
sys_sched_getparam           = $008F
sys_sched_setscheduler       = $0090
sys_sched_getscheduler       = $0091
sys_sched_get_priority_max   = $0092
sys_sched_get_priority_min   = $0093
sys_sched_rr_get_interval    = $0094
sys_mlock                    = $0095
sys_munlock                  = $0096
sys_mlockall                 = $0097
sys_munlockall               = $0098
sys_vhangup                  = $0099
sys_modify_ldt               = $009A
sys_pivot_root               = $009B
sys_sysctl                   = $009C
sys_prctl                    = $009D
sys_arch_prctl               = $009E
sys_adjtimex                 = $009F
sys_setrlimit                = $00A0
sys_chroot                   = $00A1
sys_sync                     = $00A2
sys_acct                     = $00A3
sys_settimeofday             = $00A4
sys_mount                    = $00A5
sys_umount                   = $00A6
sys_swapon                   = $00A7
sys_swapoff                  = $00A8
sys_reboot                   = $00A9
sys_sethostname              = $00AA
sys_setdomainname            = $00AB
stub_iopl                    = $00AC
sys_ioperm                   = $00AD
sys_init_module              = $00AF
sys_delete_module            = $00B0
sys_quotactl                 = $00B3
sys_gettid                   = $00BA
sys_readahead                = $00BB
sys_setxattr                 = $00BC
sys_lsetxattr                = $00BD
sys_fsetxattr                = $00BE
sys_getxattr                 = $00BF
sys_lgetxattr                = $00C0
sys_fgetxattr                = $00C1
sys_listxattr                = $00C2
sys_llistxattr               = $00C3
sys_flistxattr               = $00C4
sys_removexattr              = $00C5
sys_lremovexattr             = $00C6
sys_fremovexattr             = $00C7
sys_tkill                    = $00C8
sys_time                     = $00C9
sys_futex                    = $00CA
sys_sched_setaffinity        = $00CB
sys_sched_getaffinity        = $00CC
sys_io_setup                 = $00CE
sys_io_destroy               = $00CF
sys_io_getevents             = $00D0
sys_io_submit                = $00D1
sys_io_cancel                = $00D2
sys_lookup_dcookie           = $00D4
sys_epoll_create             = $00D5
sys_remap_file_pages         = $00D8
sys_getdents64               = $00D9
sys_set_tid_address          = $00DA
sys_restart_syscall          = $00DB
sys_semtimedop               = $00DC
sys_fadvise64                = $00DD
sys_timer_create             = $00DE
sys_timer_settime            = $00DF
sys_timer_gettime            = $00E0
sys_timer_getoverrun         = $00E1
sys_timer_delete             = $00E2
sys_clock_settime            = $00E3
sys_clock_gettime            = $00E4
sys_clock_getres             = $00E5
sys_clock_nanosleep          = $00E6
sys_exit_group               = $00E7
sys_epoll_wait               = $00E8
sys_epoll_ctl                = $00E9
sys_tgkill                   = $00EA
sys_utimes                   = $00EB
sys_mbind                    = $00ED
sys_set_mempolicy            = $00EE
sys_get_mempolicy            = $00EF
sys_mq_open                  = $00F0
sys_mq_unlink                = $00F1
sys_mq_timedsend             = $00F2
sys_mq_timedreceive          = $00F3
sys_mq_notify                = $00F4
sys_mq_getsetattr            = $00F5
sys_kexec_load               = $00F6
sys_waitid                   = $00F7
sys_add_key                  = $00F8
sys_request_key              = $00F9
sys_keyctl                   = $00FA
sys_ioprio_set               = $00FB
sys_ioprio_get               = $00FC
sys_inotify_init             = $00FD
sys_inotify_add_watch        = $00FE
sys_inotify_rm_watch         = $00FF
sys_migrate_pages            = $0100
sys_openat                   = $0101
sys_mkdirat                  = $0102
sys_mknodat                  = $0103
sys_fchownat                 = $0104
sys_futimesat                = $0105
sys_newfstatat               = $0106
sys_unlinkat                 = $0107
sys_renameat                 = $0108
sys_linkat                   = $0109
sys_symlinkat                = $010A
sys_readlinkat               = $010B
sys_fchmodat                 = $010C
sys_faccessat                = $010D
sys_pselect6                 = $010E
sys_ppoll                    = $010F
sys_unshare                  = $0110
sys_set_robust_list          = $0111
sys_get_robust_list          = $0112
sys_splice                   = $0113
sys_tee                      = $0114
sys_sync_file_range          = $0115
sys_vmsplice                 = $0116
sys_move_pages               = $0117
sys_utimensat                = $0118
sys_epoll_pwait              = $0119
sys_signalfd                 = $011A
sys_timerfd_create           = $011B
sys_eventfd                  = $011C
sys_fallocate                = $011D
sys_timerfd_settime          = $011E
sys_timerfd_gettime          = $011F
sys_accept4                  = $0120
sys_signalfd4                = $0121
sys_eventfd2                 = $0122
sys_epoll_create1            = $0123
sys_dup3                     = $0124
sys_pipe2                    = $0125
sys_inotify_init1            = $0126
sys_preadv                   = $0127
sys_pwritev                  = $0128
sys_rt_tgsigqueueinfo        = $0129
sys_perf_event_open          = $012A
sys_recvmmsg                 = $012B
sys_fanotify_init            = $012C
sys_fanotify_mark            = $012D
sys_prlimit64                = $012E
sys_name_to_handle_at        = $012F
sys_open_by_handle_at        = $0130
sys_clock_adjtime            = $0131
sys_syncfs                   = $0132
sys_sendmmsg                 = $0133
sys_setns                    = $0134
sys_getcpu                   = $0135
sys_process_vm_readv         = $0136
sys_process_vm_writev        = $0137
sys_kcmp                     = $0138
sys_finit_module             = $0139
sys_sched_setattr            = $013A
sys_sched_getattr            = $013B
sys_renameat2                = $013C
sys_seccomp                  = $013D
compat_sys_rt_sigaction      = $0200
stub_x32_rt_sigreturn        = $0201
compat_sys_ioctl             = $0202
compat_sys_readv             = $0203
compat_sys_writev            = $0204
compat_sys_recvfrom          = $0205
compat_sys_sendmsg           = $0206
compat_sys_recvmsg           = $0207
stub_x32_execve              = $0208
compat_sys_ptrace            = $0209
compat_sys_rt_sigpending     = $020A
compat_sys_rt_sigtimedwait   = $020B
compat_sys_rt_sigqueueinfo   = $020C
compat_sys_sigaltstack       = $020D
compat_sys_timer_create      = $020E
compat_sys_mq_notify         = $020F
compat_sys_kexec_load        = $0210
compat_sys_waitid            = $0211
compat_sys_set_robust_list   = $0212
compat_sys_get_robust_list   = $0213
compat_sys_vmsplice          = $0214
compat_sys_move_pages        = $0215
compat_sys_preadv64          = $0216
compat_sys_pwritev64         = $0217
compat_sys_rt_tgsigqueueinfo = $0218
compat_sys_recvmmsg          = $0219
compat_sys_sendmmsg          = $021A
compat_sys_process_vm_readv  = $021B
compat_sys_process_vm_writev = $021C
compat_sys_setsockopt        = $021D
compat_sys_getsockopt        = $021E
compat_sys_io_setup          = $021F
compat_sys_io_submit         = $0220




Linux defines:

Code:

; Signals
SIGHUP                  = 1
SIGINT                  = 2
SIGQUIT                 = 3
SIGILL                  = 4
SIGTRAP                 = 5
SIGIOT                  = 6
SIGABRT                 = 6
SIGBUS                  = 7
SIGFPE                  = 8
SIGKILL                 = 9
SIGUSR1                 = 10
SIGSEGV                 = 11
SIGUSR2                 = 12
SIGPIPE                 = 13
SIGALRM                 = 14
SIGTERM                 = 15
SIGSTKFLT               = 16
SIGCHLD                 = 17
SIGCLD                  = 17
SIGCONT                 = 18
SIGSTOP                 = 19
SIGTSTP                 = 20
SIGTTIN                 = 21
SIGTTOU                 = 22
SIGURG                  = 23
SIGXCPU                 = 24
SIGXFSZ                 = 25
SIGVTALRM               = 26
SIGPROF                 = 27
SIGWINCH                = 28
SIGIO                   = 29
SIGPOLL                 = 29
SIGINFO                 = 30
SIGPWR                  = 30
SIGSYS                  = 31

; Error numbers
EPERM           = 1
ENOENT          = 2
ESRCH           = 3
EINTR           = 4
EIO             = 5
ENXIO           = 6
E2BIG           = 7
ENOEXEC         = 8
EBADF           = 9
ECHILD          = 10
EAGAIN          = 11
ENOMEM          = 12
EACCES          = 13
EFAULT          = 14
ENOTBLK         = 15
EBUSY           = 16
EEXIST          = 17
EXDEV           = 18
ENODEV          = 19
ENOTDIR         = 20
EISDIR          = 21
EINVAL          = 22
ENFILE          = 23
EMFILE          = 24
ENOTTY          = 25
ETXTBSY         = 26
EFBIG           = 27
ENOSPC          = 28
ESPIPE          = 29
EROFS           = 30
EMLINK          = 31
EPIPE           = 32
EDOM            = 33
ERANGE          = 34
EDEADLK         = 35
ENAMETOOLONG    = 36
ENOLCK          = 37
ENOSYS          = 38
ENOTEMPTY       = 39
ELOOP           = 40
EWOULDBLOCK     = EAGAIN
ENOMSG          = 42
EIDRM           = 43
ECHRNG          = 44
EL2NSYNC        = 45
EL3HLT          = 46
EL3RST          = 47
ELNRNG          = 48
EUNATCH         = 49
ENOCSI          = 50
EL2HLT          = 51
EBADE           = 52
EBADR           = 53
EXFULL          = 54
ENOANO          = 55
EBADRQC         = 56
EBADSLT         = 57
EDEADLOCK       = EDEADLK
EBFONT          = 59
ENOSTR          = 60
ENODATA         = 61
ETIME           = 62
ENOSR           = 63
ENONET          = 64
ENOPKG          = 65
EREMOTE         = 66
ENOLINK         = 67
EADV            = 68
ESRMNT          = 69
ECOMM           = 70
EPROTO          = 71
EMULTIHOP       = 72
EDOTDOT         = 73
EBADMSG         = 74
EOVERFLOW       = 75
ENOTUNIQ        = 76
EBADFD          = 77
EREMCHG         = 78
ELIBACC         = 79
ELIBBAD         = 80
ELIBSCN         = 81
ELIBMAX         = 82
ELIBEXEC        = 83
EILSEQ          = 84
ERESTART        = 85
ESTRPIPE        = 86
EUSERS          = 87
ENOTSOCK        = 88
EDESTADDRREQ    = 89
EMSGSIZE        = 90
EPROTOTYPE      = 91
ENOPROTOOPT     = 92
EPROTONOSUPPORT = 93
ESOCKTNOSUPPORT = 94
EOPNOTSUPP      = 95
EPFNOSUPPORT    = 96
EAFNOSUPPORT    = 97
EADDRINUSE      = 98
EADDRNOTAVAIL   = 99
ENETDOWN        = 100
ENETUNREACH     = 101
ENETRESET       = 102
ECONNABORTED    = 103
ECONNRESET      = 104
ENOBUFS         = 105
EISCONN         = 106
ENOTCONN        = 107
ESHUTDOWN       = 108
ETOOMANYREFS    = 109
ETIMEDOUT       = 110
ECONNREFUSED    = 111
EHOSTDOWN       = 112
EHOSTUNREACH    = 113
EALREADY        = 114
EINPROGRESS     = 115
ESTALE          = 116
EUCLEAN         = 117
ENOTNAM         = 118
ENAVAIL         = 119
EISNAM          = 120
EREMOTEIO       = 121
EDQUOT          = 122
ENOMEDIUM       = 123
EMEDIUMTYPE     = 124
ECANCELED       = 125
ENOKEY          = 126
EKEYEXPIRED     = 127
EKEYREVOKED     = 128
EKEYREJECTED    = 129
EOWNERDEAD      = 130
ENOTRECOVERABLE = 131
ERFKILL         = 132
EHWPOISON       = 133

; O_ flags
O_ACCMODE               = 00000003o
O_RDONLY                = 00000000o
O_WRONLY                = 00000001o
O_RDWR                  = 00000002o
O_CREAT                 = 00000100o
O_EXCL                  = 00000200o
O_NOCTTY                = 00000400o
O_TRUNC                 = 00001000o
O_APPEND                = 00002000o
O_NONBLOCK              = 00004000o
O_NDELAY                = O_NONBLOCK
O_SYNC                  = 04010000o
O_FSYNC                 = O_SYNC
O_ASYNC                 = 00020000o
O_DIRECTORY             = 00200000o
O_NOFOLLOW              = 00400000o
O_CLOEXEC               = 02000000o
O_DIRECT                = 00040000o
O_NOATIME               = 01000000o
O_PATH                  = 10000000o
O_DSYNC                 = 00010000o
O_RSYNC                 = O_SYNC
O_LARGEFILE             = 00100000o

; R_ flags
R_OK = 4
W_OK = 2
X_OK = 1
F_OK = 0

; S_ flags
S_IRWXU            = 00000700o
S_IRUSR            = 00000400o
S_IWUSR            = 00000200o
S_IXUSR            = 00000100o
S_IRWXG            = 00000070o
S_IRGRP            = 00000040o
S_IWGRP            = 00000020o
S_IXGRP            = 00000010o
S_IRWXO            = 00000007o
S_IROTH            = 00000004o
S_IWOTH            = 00000002o
S_IXOTH            = 00000001o

; PROT_ flags
PROT_READ               = $01
PROT_WRITE              = $02
PROT_EXEC               = $04
PROT_SEM                = $08
PROT_NONE               = $00
PROT_GROWSDOWN          = $01000000
PROT_GROWSUP            = $02000000

; MAP_ flags
MAP_SHARED              = $01
MAP_PRIVATE             = $02
MAP_TYPE                = $0F
MAP_FIXED               = $10
MAP_ANONYMOUS           = $20
MAP_ANON                = MAP_ANONYMOUS
MAP_FILE                = 0
MAP_HUGE_SHIFT          = 26
MAP_HUGE_MASK           = $3F
MAP_32BIT               = $40
MAP_GROWSUP             = $00200
MAP_GROWSDOWN           = $00100
MAP_DENYWRITE           = $00800
MAP_EXECUTABLE          = $01000
MAP_LOCKED              = $02000
MAP_NORESERVE           = $04000
MAP_POPULATE            = $08000
MAP_NONBLOCK            = $10000
MAP_STACK               = $20000
MAP_HUGETLB             = $40000

; MS_ flags
MS_ASYNC                = 1
MS_SYNC                 = 4
MS_INVALIDATE           = 2

; MCL_ flags
MCL_CURRENT             = 1
MCL_FUTURE              = 2

; MREMAP_ flags
MREMAP_MAYMOVE          = 1
MREMAP_FIXED            = 2

; MADV_ flags
MADV_NORMAL             = 0
MADV_RANDOM             = 1
MADV_SEQUENTIAL         = 2
MADV_WILLNEED           = 3
MADV_DONTNEED           = 4
MADV_REMOVE             = 9
MADV_DONTFORK           = 10
MADV_DOFORK             = 11
MADV_MERGEABLE          = 12
MADV_UNMERGEABLE        = 13
MADV_HUGEPAGE           = 14
MADV_NOHUGEPAGE         = 15
MADV_HWPOISON           = 100

; SEEK_ flags
SEEK_6                  = $0B
SEEK_10                 = $2B
SEEK_SET                = 0
SEEK_CUR                = 1
SEEK_END                = 2
SEEK_DATA               = 3
SEEK_HOLE               = 4
SEEK_MAX                = SEEK_HOLE

; CLONE_ flags
CSIGNAL                 = $000000FF
CLONE_VM                = $00000100
CLONE_FS                = $00000200
CLONE_FILES             = $00000400
CLONE_SIGHAND           = $00000800
CLONE_PTRACE            = $00002000
CLONE_VFORK             = $00004000
CLONE_PARENT            = $00008000
CLONE_THREAD            = $00010000
CLONE_NEWNS             = $00020000
CLONE_SYSVSEM           = $00040000
CLONE_SETTLS            = $00080000
CLONE_PARENT_SETTID     = $00100000
CLONE_CHILD_CLEARTID    = $00200000
CLONE_DETACHED          = $00400000
CLONE_UNTRACED          = $00800000
CLONE_CHILD_SETTID      = $01000000
CLONE_NEWUTS            = $04000000
CLONE_NEWIPC            = $08000000
CLONE_NEWUSER           = $10000000
CLONE_NEWPID            = $20000000
CLONE_NEWNET            = $40000000
CLONE_IO                = $80000000

; CLOCK_ flags
CLOCK_REALTIME           = 0
CLOCK_MONOTONIC          = 1
CLOCK_PROCESS_CPUTIME_ID = 2
CLOCK_THREAD_CPUTIME_ID  = 3
CLOCK_MONOTONIC_RAW      = 4
CLOCK_REALTIME_COARSE    = 5
CLOCK_MONOTONIC_COARSE   = 6
CLOCK_BOOTTIME           = 7
CLOCK_REALTIME_ALARM     = 8
CLOCK_BOOTTIME_ALARM     = 9
CLOCK_SGI_CYCLE          = 10
CLOCK_TAI                = 11
MAX_CLOCKS               = 16
CLOCKS_MASK              = CLOCK_REALTIME or CLOCK_MONOTONIC
CLOCKS_MONO              = CLOCK_MONOTONIC
TIMER_ABSTIME            = 1

; SIGEV_ flags
SIGEV_SIGNAL    = 0
SIGEV_NONE      = 1
SIGEV_THREAD    = 2
SIGEV_THREAD_ID = 4


Post 12 Jul 2016, 16:45
tthsqe



Joined: 20 May 2009
Posts: 653
So part of the design decision involved using the MS 64 bit ABI, which preserves registers rbp, rbx, rsi, rdi, r12, r13, r14, r15; if these are preserved, it should work. I'm not sure if the stack needs to be aligned for the syscall, so I do that also just in case.


Code:
_FileOpen:
        ; in: rcx path string
        ; out: rax handle from CreateFile (win), fd (linux)
               push   rbx rsi rdi
                mov   rdi,rcx
                mov   esi,O_RDWR
                mov   eax,sys_open
            syscall
                pop   rdi rsi rbx
                ret


You will find some scratchwork in OsLinuxOLD.asm related to the more difficult mutex functions, which might require adjustments to some of the structs. Also, if you are ever in a position to actually test the whole thing, note that I do not pass the size (rdx) to _VirtualFree, as it is not required by windows. Every use of _VirtualFree will need to be updated for this to work. Finally, if you want to handle error cases like I do in OsWindows.asm, that would be great but is not necessary at this time.
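
On the earlier question about failed calls: a raw Linux syscall reports failure by returning a negated errno value in rax, in the range -4095 to -1, so one option is for the Linux functions to return rax unchanged and let the caller test for that range. The sketch below is in Python rather than fasm so that it can be run directly; it only illustrates how such a return value decodes. The helper names to_signed64 and decode_raw_return are made up for this example and are not part of asmFish.

```python
import errno

MAX_ERRNO = 4095  # kernel convention: raw returns in [-4095, -1] mean -errno


def to_signed64(rax):
    """Interpret a 64-bit register value as a signed integer."""
    rax &= (1 << 64) - 1
    return rax - (1 << 64) if rax >> 63 else rax


def decode_raw_return(rax):
    """Split a raw syscall return into (value, errno_name).

    Hypothetical helper: exactly one element of the pair is None.
    """
    signed = to_signed64(rax)
    if -MAX_ERRNO <= signed < 0:
        return None, errno.errorcode.get(-signed, str(-signed))
    return signed, None


if __name__ == "__main__":
    print(decode_raw_return(3))                    # a valid fd
    print(decode_raw_return(0xFFFFFFFFFFFFFFFE))   # rax = -2 after a failed open
```

In fasm terms the same test is a single unsigned compare of rax against -4095 after the syscall; whether to map the result onto the Windows-style return values is a separate design choice.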
Post 12 Jul 2016, 18:28
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1173
Location: Unknown
I tried to assemble the file "asmFish.asm" and got several "include file not found" errors. I tried to fix them, and then an "undefined symbol" error popped up.

What is "_GetCommandLine" supposed to return, and how?

I need more information about "_ReadIn" as well. It is not clear enough.

"_ErrorBox" generates a "Windows message box"? Is that really needed? Couldn't it just print to "stderr"?


Quote:

I do not pass the size (rdx) to _VirtualFree as it is not required by windows. Every use of _VirtualFree will need to be updated for this to work.


OK, and in which files are those calls located?

Thank you!
Post 13 Jul 2016, 00:29
tthsqe



Joined: 20 May 2009
Posts: 653
Ah, sorry - the main source is asmFish_base.asm (or ..._popcnt, ..._bmi2).

If an assert fails while playing a game of chess not via cmd line, I wanted to see the error message. I fixed all of the bugs related to chess playing on the windows version, so if you want to make _ErrorBox write to stderr, that is fine.

_ReadIn is supposed to read from stdin and get 'one line', where I define one line as a string of characters where the last and only the last character is < 0x20. It should read into the buffer whose address is stored in InputBuffer and its size in InputBufferSizeB. Upon return, rsi should hold qword[InputBuffer], which is the start of the string.
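
The 'one line' rule can be sketched as follows. This is an illustration in Python, not the asmFish implementation: read_one_line is a hypothetical name, and stopping at end of input or when the buffer is full are assumptions that the description above does not specify.

```python
import io


def read_one_line(stream, buffer_size):
    """Read bytes until one with a value < 0x20 has been stored.

    The terminating control byte is kept as the last byte of the result,
    matching the 'last and only the last character is < 0x20' rule.
    Assumption: reading also stops at EOF or when buffer_size bytes are stored.
    """
    buf = bytearray()
    while len(buf) < buffer_size:
        ch = stream.read(1)
        if not ch:            # end of input (assumed behaviour)
            break
        buf += ch
        if ch[0] < 0x20:      # control character ends the line
            break
    return bytes(buf)


if __name__ == "__main__":
    stdin_like = io.BytesIO(b"uci\nisready\n")
    print(read_one_line(stdin_like, 64))
    print(read_one_line(stdin_like, 64))
```

A byte-at-a-time read keeps the sketch short; a real implementation would more likely read in larger chunks and carry leftover bytes over to the next call.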

_GetCommandLine does exactly what you expect on windows, and should be broken on linux. I will remove this from the linux version so you can ignore it.

Finally, the size parameter of _VirtualFree is going to require a modification by me - let me do this.

BTW, you should be able to assemble asmFish_base.asm and run the program. I get

Code:
Hello!
 time = 2353352ms
one second later
 time = 2354353ms

Post 13 Jul 2016, 18:07
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1173
Location: Unknown

Quote:

BTW, you should be able to assemble asmFish_base.asm and run the program.


Yea, it did.



New symbols:

Code:

; stdio
stdin  = 0
stdout = 1
stderr = 2

; PRIO_ flags
PRIO_PROCESS = 0
PRIO_PGRP    = 1
PRIO_USER    = 2




I did what I could:

    _SetNormalPriority
    _SetRealtimePriority
    _ReadIn
    _ExitThread
    _ThreadCreate
    _FileUnmap
    _FileMap
    _FileClose
    _FileOpen
    _ErrorBox



Code:

align 16
strlen:
                xor   rax, rax
   @@:
                cmp   byte [rdi+rax], $00
                 je   @f
                inc   rax
                cmp   byte [rdi+rax], $00
                 je   @f
                inc   rax
                cmp   byte [rdi+rax], $00
                 je   @f
                inc   rax
                cmp   byte [rdi+rax], $00
                 je   @f
                inc   rax
                cmp   byte [rdi+rax], $00
                 je   @f
                inc   rax
                jmp   @b
   @@:
                ret



align 16
_ErrorBox:
        ; rdi points to null terminated string to write to message box
        ; this may be called from a leaf with no stack alignment
        ; one purpose is a hard exit on failure
        ; loading user32.dll multiple times (i.e. on each call)
        ;   seems to result in a crash in ExitProcess
        ;  so we load only once
               call   strlen
               push   rdi rsi rbx
                mov   rsi, rdi
                mov   edi, stderr
                mov   rdx, rax
                mov   eax, sys_write
            syscall
                pop   rbx rsi rdi
                ret



align 16
_FileOpen:
        ; in: rcx path string
        ; out: rax handle from CreateFile (win), fd (linux)
               push   rbx rsi rdi
                mov   rdi, rcx
                mov   esi, O_RDWR
                mov   eax, sys_open
            syscall
                pop   rdi rsi rbx
                ret



align 16
_FileClose:
        ; in: rcx handle from CreateFile (win), fd (linux)
               push   rbx rsi rdi
                mov   rdi, rcx
                mov   eax, sys_close
            syscall
                pop   rdi rsi rbx
                ret



align 16
_FileMap:
        ; in: rcx handle (win), fd (linux)
        ; out: rax base address
        ;      rdx handle from CreateFileMapping (win), size (linux)
        ; get file size
               push   rbx rsi rdi
               push   rcx
                sub   rsp, 160
                mov   rdi, rcx
                mov   rsi, rsp
                mov   eax, sys_fstat
            syscall
                mov   rdx, [rsp+$30] ; file size
                add   rsp, 160
        ; map file
                pop   r8                            ; fd
                xor   edi, edi                      ; addr
               push   rdx
                mov   rsi, rdx                      ; length
                mov   edx, PROT_READ or PROT_WRITE  ; protection flags
                mov   r10, MAP_PRIVATE or MAP_32BIT ; mapping flags
                mov   r9d, edi                      ; offset
                mov   eax, sys_mmap
            syscall
        ; return size in rdx, base address in rax
                pop   rdx
                pop   rdi rsi rbx
                ret



align 16
_FileUnmap:
        ; in: rcx base address
        ;     rdx handle from CreateFileMapping (win), size (linux)
               push   rbx rsi rdi
                mov   rdi, rcx        ; addr
                mov   rsi, rdx        ; length
                mov   eax, sys_munmap
            syscall
                pop   rdi rsi rbx
                ret



align 16
_ThreadCreate:
        ; rcx: start address
        ; rdx: parameter to pass
               push   rbx rsi rdi
               push   rdx
               push   rcx
        ; allocate memory for the thread stack
                xor   edi, edi                                                         ; addr
                mov   esi, 8192                                                        ; length
                mov   edx, PROT_READ or PROT_WRITE                                     ; protection flags
                mov   r10d, MAP_PRIVATE or MAP_32BIT or MAP_ANONYMOUS or MAP_GROWSDOWN ; mapping flags
                mov   r8d, 0                                                           ; fd
                mov   r9d, 0                                                           ; offset
                mov   eax, sys_mmap
            syscall
        ; create child
                add   rax, 8192
                mov   edi, CLONE_VM or CLONE_FS or CLONE_FILES or CLONE_SIGHAND or CLONE_THREAD ; flags
                mov   rsi, rax                                                                  ; child_stack
                mov   edx, 0                                                                    ; ptid
                mov   r10d, 0                                                                   ; ctid
                mov   r8d, 0                                                                    ; regs
                mov   eax, stub_clone
                pop   r9
                pop   rbx
            syscall
        ; redirect child to function
               test   rax, rax
                jnz  .return
                mov   rcx, rbx
               call   r9
        ; make sure child is terminated if it returns
                xor   ecx, ecx
               call   _ExitThread
   align 8
   .return:
                pop   rdi rsi rbx
                ret



align 16
_ExitThread:
        ; rcx is exit code
                mov   rdi, rcx
                mov   eax, sys_exit
            syscall



align 16
_ReadIn:
        ; out: eax =  0 if not file end
        ;      eax = -1 if file end
        ;      rsi address of string start
        ;      rcx address of string end
        ;
        ; uses global InputBuffer and InputBufferSizeB
        ; reads one line and then returns
        ; any char < ' ' is considered a newline char and
               push   rdi rbx
                mov   rbx, qword [InputBuffer]
        ; "address of string start"
               push   rbx
                mov   r9, rbx
        ; "new line" found? "r8" will tell
                xor   r8d, r8d
                sub   rbx, $01
                add   r9, qword [InputBufferSizeB]
   .read:
                inc   rbx
        ; not sure whether reading in blocks would "lose" other commands, so I read from "stdin" one byte at a time to be safe
        ; not sure I understood you correctly: this reads exactly one line from "stdin" into "buffer" and returns
                mov   edi, stdin
                mov   rsi, rbx
                mov   edx, $01
                mov   eax, sys_read
        ; check for buffer overflow
                cmp   rbx, r9
                jge  .buffer_overflow
            syscall
        ; check for file end
               test   al, al
                 jz  .file_end
        ; check for "new line"
                cmp   byte [rbx], $20
                 je  .new_line_end
                cmp   byte [rbx], $3C
                jne  .read
        ; first "end of line" char found
                mov   r8d, $01
                jmp  .read
   align 8
   .new_line_end:
               test   r8b, r8b
                mov   r8d, $00
                 jz  .read
        ; "end of line" detected, file end not reached
                mov   eax, $00
        ; "address of string end" at "<", end-start = relevant length (i.e. excluding "< ")
                mov   rcx, rbx
                dec   rcx
        ; "address of string start" in "rsi"
                pop   rsi
                pop   rbx rdi
                ret
   align 8
   .buffer_overflow:
                mov   eax, $00
        ; "address of string end" at last valid char+1, end-start = relevant length
                mov   rcx, rbx
        ; "address of string start" in "rsi"
                pop   rsi
                pop   rbx rdi
                ret
   align 8
   .file_end:
        ; end of file reached while reading into buffer
                mov   eax, -1
        ; "address of string end" at last valid char+1, end-start = relevant length (i.e. if "relevant length" is 0, there was nothing to read)
                mov   rcx, rbx
        ; "address of string start" in "rsi"
                pop   rsi
                pop   rbx rdi
                ret



align 16
_SetRealtimePriority:
        ; must be root to set "higher" priority, normal user can only lower priority
               push   rsi rdi rbx
                mov   edi, PRIO_PROCESS    ; which
                mov   esi, 0               ; who
                mov   edx, -15             ; priority
                mov   eax, sys_setpriority
            syscall
                pop   rbx rdi rsi
                ret



align 16
_SetNormalPriority:
        ; must be root to set "higher" priority, normal user can only lower priority
               push   rsi rdi rbx
                mov   edi, PRIO_PROCESS    ; which
                mov   esi, 0               ; who
                mov   edx, 0               ; priority
                mov   eax, sys_setpriority
            syscall
                pop   rbx rdi rsi
                ret



Please pay closer attention to "_ReadIn", "_FileMap", "_FileUnmap", "_ThreadCreate", and "_ExitThread". I am not sure they return or do what the program expects.
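
For cross-checking "_FileMap", here is a rough C sketch of the same fstat + mmap sequence. It is only an illustration, not code from the project: the names map_whole_file and stat_size_offset are made up, MAP_32BIT is left out for brevity, and it assumes x86-64 Linux, where the assembly reads the file size from [rsp+$30], i.e. offset 48 of struct stat.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from the project): open + fstat + mmap in one call.
   Returns the base address, or NULL on failure; *size receives the length. */
void *map_whole_file(const char *path, size_t *size)
{
    assert(path != NULL && size != NULL);
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    struct stat st;
    void *base = NULL;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        /* Same protection and sharing flags as the assembly, minus MAP_32BIT. */
        base = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE, fd, 0);
        if (base == MAP_FAILED)
            base = NULL;
        else
            *size = (size_t)st.st_size;
    }
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return base;
}

/* Offset of st_size inside struct stat; the assembly hard-codes it as $30. */
size_t stat_size_offset(void)
{
    return offsetof(struct stat, st_size);
}
```

Comparing stat_size_offset() against $30 on the target machine is a cheap way to confirm the hard-coded offset, and munmap(base, size) plays the role of "_FileUnmap".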

I will need to learn more about these:

    _MutexDestroy
    _MutexUnlock
    _MutexLock
    _MutexCreate


More info about these:

    _ThreadJoin
    _VirtualAllocNuma
    _SetThreadPoolInfo


And I have no clue about these:

    _EventDestroy
    _EventWait
    _EventSignal
    _EventCreate


Any help and information is appreciated. Thank you!
Post 18 Jul 2016, 03:40
tthsqe



Joined: 20 May 2009
Posts: 653
Thanks HaHa.
I have looked over your functions, edited them, and added the missing ones with which I am more familiar. I will post my changes to github and try to get something that works.
Post 20 Jul 2016, 17:50
tthsqe



Joined: 20 May 2009
Posts: 653
The Linux version seems to be working - please try it and try to play chess with it in whatever gui you have.
Post 21 Jul 2016, 02:26
JohnFound



Joined: 16 Jun 2003
Posts: 3415
Location: Bulgaria
Really great job!
Post 21 Jul 2016, 04:54
tthsqe



Joined: 20 May 2009
Posts: 653
Haha, the current ExitProcess function is broken

Code:
_ExitProcess:
        ; rcx is exit code
               push   rdi
                mov   rdi, rcx
                mov   eax, sys_exit
            syscall


When called from a child that was created via stub_clone, it terminates only that child and not every thread. I would like to stop the program completely. How can I do that?
Post 23 Jul 2016, 00:43
JohnFound



Joined: 16 Jun 2003
Posts: 3415
Location: Bulgaria
sys_exit_group?
Post 23 Jul 2016, 07:31
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1173
Location: Unknown

Quote:

sys_exit_group?


Yea.

Code:

_ExitProcess:
        ; rcx is exit code
                mov   rdi, rcx
                mov   eax, sys_exit_group
            syscall
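
To see the difference from C, here is a small sketch, assuming Linux with glibc and POSIX threads (the names worker and exit_group_demo are made up for illustration). A secondary thread calls exit_group while the main thread just waits; the whole process ends with the thread's exit code.

```c
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Worker thread: ends the whole process, not just itself. */
static void *worker(void *arg)
{
    (void)arg;
    syscall(SYS_exit_group, 7);   /* SYS_exit here would end only this thread */
    return NULL;                  /* not reached */
}

/* Runs the experiment in a forked child so the caller survives.
   Returns the child's exit status, or -1 on failure. */
int exit_group_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        pthread_t t;
        if (pthread_create(&t, NULL, worker, NULL) != 0)
            _exit(1);
        for (;;)
            pause();              /* main thread never exits on its own */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With SYS_exit in place of SYS_exit_group the forked child would keep running, because its main thread stays in pause(), which matches the behaviour described for the broken "_ExitProcess".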





Quote:

The Linux version seems to be working - please try it and try to play chess with it in whatever gui you have.


I do not have one now. I will search for one, and I'll let you know when I test it.

If you know of one, please let me know so I can save time and bandwidth.

Thank you!
Post 25 Jul 2016, 03:14
HaHaAnonymous



Joined: 02 Dec 2012
Posts: 1173
Location: Unknown
I found "pyChess" and it worked fine.

I have just tested "asmFish", and it worked great (apparently). Good to see its performance at "http://spcc.beepworld.de".

Very good work! Thank you!
Post 25 Jul 2016, 17:58
Copyright © 2004-2016, Tomasz Grysztar.