flat assembler
Message board for the users of flat assembler.
CGI example for linux
Tarkin 04 May 2012, 16:05
I know there have been a couple of questions on this, and I've been playing with scripted CGI.
I was wondering just how much faster hand-coded assembler could be than scripted and compiled C. Here is what I came up with:
Code:
; vim: set ft=fasm: set ft=2:
; simple exercise just to see if I can do CGI in assembler...
use32
format elf executable 3
entry _start

; system call numbers copied from /usr/include/asm/unistd_32.h
; 32-bit 2.6-series kernel
; include 'include/unistd-26.inc'
; since you might not have this file or equivalent:
__NR_write = 4
__NR_exit = 1

define CRLF 0x0D,0x0A

macro SYSCALL { int 0x80 }

BUFFER_LENGTH = 1024

segment executable
align 4

;;; this startup code was inspired by:
;;; "Startup state of a Linux/i386 ELF Binary"
;;; http://asm.sourceforge.net/articles/startup.html
_start:
        mov     edx,esp         ; save esp
        pop     eax             ; get ARGC
        mov     [ARGC],eax      ; save ARGC
        mov     [ARGV],esp      ; save ARGV
        shl     eax,2           ; fast eax * 4
        add     eax,esp         ; eax + esp = -> NULL, end of ARGV
        mov     esp,eax         ; update esp
        pop     ecx             ; get NULL off the stack, init counter (ecx)
        mov     [ENVP],esp      ; esp -> ENVP, save it
COUNTENV:                       ; count the environment variables
        pop     eax             ; get *ENVP
        inc     ecx             ; increment counter
        test    eax,eax         ; NULL check
        jnz     COUNTENV        ; repeat while !NULL
        dec     ecx             ; adjust count for NULL
        mov     [ENVC],ecx      ; save count
        mov     esp,edx         ; restore esp

        ; set us up the algorithm
        mov     ebp,[ENVP]      ; ebp -> ENVP
        mov     edi,BUFFER      ; edi = dynamic data buffer
ENVCOPY:                        ; copies <li>(env var string)</li> <br /> to the buffer
        mov     ecx,4
        mov     esi,LIO
        rep     movsb
        mov     esi,[ebp]       ; get next *string
        push    esi             ; save string origin
        mov     ecx,255         ; scan at most 255 bytes per string
        push    ecx
        xor     eax,eax
.scan:
        lodsb
        test    eax,eax
        loopnz  .scan
        pop     eax
        sub     eax,ecx         ; compute length (counts the terminating NUL)
        dec     eax             ; drop the NUL so it is not copied into the page
        mov     ecx,eax         ; ecx = string length
        pop     esi             ; pop string start off stack
        rep     movsb
        mov     esi,LIC         ; closing tag, 8 bytes including CRLF
        mov     ecx,8
        rep     movsb
        times 4 inc ebp         ; advance to the next envp entry
        mov     eax,[ebp]
        test    eax,eax
        jnz     ENVCOPY

        mov     eax,edi         ; buffer end
        mov     ebx,BUFFER      ; buffer start
        sub     eax,ebx         ; compute length
        mov     [DY_L],eax      ; store length

        mov     esi,HD_P
        xor     ebx,ebx
        inc     ebx             ; ebx = 1 (stdout)
DOCLOOP:                        ; load pointers and length, write to stdout, repeat
        lodsd
        mov     ecx,eax         ; ecx = chunk pointer
        lodsd
        mov     edx,eax         ; edx = chunk length
        mov     eax,__NR_write
        SYSCALL
        mov     eax,[esi]       ; peek at the next pointer; 0 (ELST) ends the list
        test    eax,eax
        jnz     DOCLOOP
DONE:
        xor     ebx,ebx
        mov     eax,__NR_exit
        SYSCALL

;;; DATA
segment readable writeable

; program arguments and environment variables
ARGC:   dd 0
ARGV:   dd 0
ENVC:   dd 0
ENVP:   dd 0
CNT0:   dd 0

; buffer pointers and length
HD_P:   dd HEADER
HD_L:   dd HEND - HEADER
S1_P:   dd PAGE_STATIC1
S1_L:   dd S1END - PAGE_STATIC1
DY_P:   dd BUFFER
DY_L:   dd 0
S2_P:   dd PAGE_STATIC2
S2_L:   dd S2END - PAGE_STATIC2
ELST:   dd 0                    ; end of list

; static buffers - html headers, html before & after the env list
HEADER:
        db "HTTP/1.1 200 OK",CRLF
        db "Content-Type: text/html",CRLF
        db "Connection: Close",CRLF
        db "X-Powered-By: ASSEMBLER",CRLF
        db CRLF,CRLF
HEND:
PAGE_STATIC1:
        db "<html>",CRLF
        db "<head>",CRLF
        db "<title>ASM CGI</title>",CRLF
        db "</head>",CRLF
        db "<body>",CRLF
        db "<h1>CGI INFO:</h1>",CRLF
        db "<ul>",CRLF
S1END:
PAGE_STATIC2:
        db "</ul>",CRLF
        db "</body>",CRLF
        db "</html>",CRLF
S2END:
LIO:    db "<li>"
LIC:    db "</li> ",CRLF

;;; BSS
segment readable writeable
align 16
BUFFER: rb BUFFER_LENGTH        ; reserve BUFFER_LENGTH bytes ("rd 0" reserved nothing)

I use thttpd (http://www.acme.com/software/thttpd/) to test my CGI stuff.
AFAICT, this assembler version trounces a libc/C version. I think it takes longer for thttpd/Linux to fork() the CGI process than for the assembled binary to execute!!
Hope this helps someone....
TTFN,
Tarkin
Last edited by Tarkin on 10 May 2012, 15:47; edited 1 time in total
JohnFound 04 May 2012, 17:47
Hm, did you make any performance measurements? Please describe your benchmark setup in more detail.
Tarkin 05 May 2012, 00:48
JohnFound wrote:
Hm, did you make any performance measurements? Please describe your benchmark setup in more detail.

Just off-the-cuff observations. Also, I now realize that the comparison of an assembled binary to interpreted forth or C is really unfair.

My dev server setup:
hp pavilion a730n
Intel Pentium4 530 (P) 3.0 GHz (HT) HyperThreaded, 2 Threads
1 GB ram, 800 MHz FSB
NIC: Integrated 10/100 Base-T Realtek RTL8101L
Linux hostname 2.6.32-5-686 #1 SMP Mon Mar 26 05:20:33 UTC 2012 i686 GNU/Linux
Debian 6.0.3 (32-bit)
thttpd/2.25b 29dec2003
thttpd cgi: CGI_PATTERN=**.4th|**.cgi|**.c
gcc: gcc version 4.4.5 (Debian 4.4.5-8)
fasm version 1.70 for forth interpreter & cgi example posted above
tcc 0.9.25 Fabrice Bellard's TinyC compiler - linked to glibc (vs libtcc)
flow-forth 0.2.1 (custom forth interpreter in assembly) http://sgmtech.homelinux.org/flow/flow.html
Browser: Firefox 12.0 Linux + firebug 1.9.1
Times were observed from the firebug "net" panel

Observations:
forth: 5ms to 10ms, average 6ms
libc, tcc, interp: 13ms to 23ms, average 13ms
libc, tcc, compiled: 5ms to 10ms, average 5ms
libc, gcc, compiled: 4ms to 15ms, average 6ms
same, with -O2: 5ms to 6ms, average 5ms
same, with -O3: 5ms to 23ms, varied wildly
same, with -O4: 5ms to 7ms, average 5ms
asm binary: 4ms to 5ms, average 4ms

With no other tabs & programs running on the client, and no unnecessary processes running on the server, I'd expect to see more consistent results.

Here is the forth version:
Code:
#! /path/to/forth
\ flow-forth binary is named 'flow2'
\ emit CR then LF (EMIT takes the top of the stack, so 13 goes out first)
: NEWL 13 EMIT 10 EMIT ;
: dumpvars
  0 BEGIN
    DUP ENVC = IF DROP EXIT THEN
    DUP 1+ SWAP CELLS ENVS + @ DUP STRLEN
    ." <li>" TYPE ." </li> " CR
  AGAIN ;
." HTTP/1.1 200 OK" CR
." Content-Type: text/html" CR
." Connection: close" CR
." X-Powered-By: flow-forth" CR
NEWL NEWL
." <html>" CR
." <head>" CR
." <title>FORTH CGI</title>" CR
." </head>" CR
." <body>" CR
." <h1>CGI INFO</h1>" CR
." <ul>" CR
dumpvars
." </ul>" CR
." </body>" CR
." </html>" CR
BYE

source of the interpreted C file:
Code:
#!/usr/local/bin/tcc -run -lc
/* vim: set ft=c: set ts=2 */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

extern char **environ;

/* 1024-byte page buffer: 83 bytes of headers, the rest zero-filled
   (same contents as spelling out every byte by hand) */
char page[1024] =
  "HTTP/1.1 200 OK\r\n"                        /* 17 */
  "Content-Type: text/html;charset=utf-8\r\n"  /* 14 + 25 */
  "X-Powered-By: tcc-cgi\r\n"                  /* 23 */
  "\r\n\r\n";                                  /* 4 */

int main(int argc, char **argv)
{
  char **envs;
  strcat(page,"<html>\r\n");
  strcat(page,"<head>\r\n");
  strcat(page,"<title>TCC CGI</title>\r\n");
  strcat(page,"</head>\r\n");
  strcat(page,"<body>\r\n");
  strcat(page,"<h1>CGI INFO</h1>\r\n");
  strcat(page,"<ul>\r\n");
  for (envs = environ; *envs != NULL; envs++) {
    /* stop before the fixed buffer overflows: 12 bytes of <li> markup,
       25 bytes of closing tags and the terminating NUL must still fit */
    if (strlen(page) + strlen(*envs) + 38 > sizeof(page))
      break;
    strcat(page,"<li>");
    strcat(page,*envs);
    strcat(page,"</li> \r\n");
  }
  strcat(page,"</ul>\r\n");
  strcat(page,"</body>\r\n");
  strcat(page,"</html>\r\n");
  printf("%s",page);
  exit(EXIT_SUCCESS);
}

and finally, the compiled C version:
Code:
/* vim: set ft=c: set ts=2: */
#ifdef __TINYC__
#include "stddef.h"
#include "stdarg.h"
#include "tcclib.h"
#else
#include <stdlib.h>
#include <stdio.h>
#endif

#ifndef EXIT_SUCCESS
#define EXIT_SUCCESS 0
#endif

int main(int argc, char *argv[], char *envp[])
{
  int i=0;
  printf("HTTP/1.1 200 OK\r\n");
  printf("Content-Type: text/html\r\n");
  printf("Connection: close\r\n");
  printf("X-Powered-By: gcc-cgi\r\n");
  printf("\r\n\r\n");
  printf("<html>\r\n");
  printf("<head><title>CGIVARS</title></head>\r\n");
  printf("<body>\r\n");
  printf("<h1>CGI INFO</h1>\r\n");
  printf("<ul>\r\n");
  for (i=0; envp[i] != NULL; i++)
    printf("<li>%s</li> \r\n",envp[i]);
  printf("</ul>\r\n");
  printf("</body>\r\n");
  printf("</html>\r\n");
  exit(EXIT_SUCCESS);
}
Tarkin 08 May 2012, 20:01
Perhaps linode?
http://www.linode.com/faq.cfm
A bit pricey @ 20USD/mo (24 month cycle price?), but might be worth it...
TTFN,
Tarkin