flat assembler
Message board for the users of flat assembler.

small PE for 4k intro

keenin 31 Aug 2003, 15:14
Hi,

I am writing a 4k intro for the first time and wonder which size optimizations can be applied to PE files.
So far I have put everything into one section and tried to optimize the many pushes. I got the window creation and the OpenGL initialization (including the screen resolution change) down to 2k, but that is still too much.

I appreciate every byte that I could save.
roticv 31 Aug 2003, 16:43
Post some code and we can see how many more bytes we can cut.
keenin 31 Aug 2003, 18:11
An excerpt:

Code:
; data

  cfg db "intro4k.cfg",0
  dc dd ?
  devmode DEVMODE
  title db "intro4k",0
  msg dd ?
  of rb 136
  pfd PIXELFORMATDESCRIPTOR
  rc dd ?
  screen rd 7 ; bpp,width,height,0,frq,dpp,full
  style dd WS_POPUP or WS_SYSMENU or WS_CAPTION or WS_MINIMIZEBOX,WS_EX_APPWINDOW
  wndclassex dd sizeof.WNDCLASSEX,CS_VREDRAW or CS_HREDRAW or CS_OWNDC,wnd_proc,0,0,0,0,0,0,0,title,0
  wnd dd ?
  instance dd ?

; code

  main:
     xor edi,edi
     mov esi,screen

    ; getting configuration
      ... setting screen

    ; registering class
     ... RegisterClassEx

    ; changing resolution
     mov ebx,style
     cmp [esi+24],edi
     je .create_window

     mov [ebx],dword WS_POPUP
     mov eax,devmode
     mov [eax+36],word sizeof.DEVMODE
     ; dmFields is at offset 40 in DEVMODE (offset 38 is the 16-bit dmDriverExtra)
     mov [eax+40],dword DM_BITSPERPEL or DM_PELSHEIGHT or DM_PELSWIDTH or DM_DISPLAYFLAGS or DM_DISPLAYFREQUENCY
     lea ecx,[edi+5]
     lea edi,[eax+104]
     rep movsd
     sub esi,20
     xor edi,edi
     invoke ChangeDisplaySettings, eax,CDS_FULLSCREEN
     invoke ShowCursor, edi

    ; creating window
    .create_window:
     mov edx,title
     invoke CreateWindowEx, [ebx+4],edx,edx,[ebx],edi,edi,[esi+4],[esi+8],edi,edi,[ebx+20],edi
     mov [wnd],eax
     mov esi,eax
     invoke ShowWindow, eax,SW_SHOW

     mov ebx,msg
    @@:
     invoke PeekMessage, ebx,edi,edi,edi,PM_REMOVE
     or eax,eax
     jz .redraw
     cmp [ebx+4],dword WM_QUIT
     je .quit

     invoke TranslateMessage, ebx
     invoke DispatchMessage, ebx
     jmp @b

    .redraw:
     invoke InvalidateRect, esi,edi,edi
     jmp @b

    .quit:
     ret

...

     invoke GetDC, [hwnd]
     mov [dc],eax
     mov ebx,eax

     mov esi,screen

     xor eax,eax
     mov ecx,10

     mov edi,pfd
     rep stosd
     sub edi,40

     mov [edi],byte 40 ;is word
     mov [edi+2],byte 1 ;is word
     mov [edi+4],byte PFD_DRAW_TO_WINDOW or PFD_SUPPORT_OPENGL or PFD_DOUBLEBUFFER ;is dword
     mov eax,[esi]
     mov [edi+9],al
     mov eax,[esi+20]
     mov [edi+23],al

     invoke ChoosePixelFormat, ebx,edi
     invoke SetPixelFormat, ebx,eax,edi
     invoke wglCreateContext, ebx
     mov [rc],eax
     invoke wglMakeCurrent, ebx,eax
     xor eax,eax
     invoke glViewport, eax,eax,[esi+4],[esi+8]

    


The problem is the large number of invokes (calls).
Is there a way to use near calls instead of far calls for the Win32 API (like loading one address into eax and calling eax+k)?

The PE file contains a lot of DOS-related data; can't I overwrite it and use it for my own data?

Another problem:
The display frequency does not change although it should (I am using W2k).
zjlcc 01 Sep 2003, 01:51
keenin: please post a .ZIP file, thank you.
Vortex 01 Sep 2003, 19:39
To reduce the size of your PE, you can merge sections.

comrade 01 Sep 2003, 20:15
I'm trying to figure out Gouraud shading, and texture mapping for that matter. Any ideas?

keenin 01 Sep 2003, 20:36
Yeah, I have already put everything into one section.

My main questions are:
Can far calls to the WinAPI be translated into near calls?
Can the DOS stuff in PE files be overwritten with data and used?
keenin 01 Sep 2003, 20:37
Which graphics library do you use, comrade?
comrade 01 Sep 2003, 21:09
None. I know it's fairly easy in OpenGL and Direct3D, only a couple of API calls. But I cannot figure it out mathematically. I tried on paper several times, but with no results.

Tomasz Grysztar 01 Sep 2003, 21:44
You can use the following trick to reduce the size of your calls: when you have to call many functions from one library in succession, OPENGL32.DLL for example, you can load the address of the table holding the addresses of those functions into a register (if your import definition for OPENGL32.DLL starts with "import opengl,", the label for that table is "opengl") and call them like this:
Code:
     mov esi,opengl
     invoke esi+wglCreateContext-opengl, ebx 
   ; ...
     invoke esi+wglMakeCurrent-opengl, ebx,eax
    

When the import table is small (no more than 32 entries), the displacement in the "call [esi+...]" instruction generated by such macros always fits in one byte, so the whole call instruction is usually three bytes long. It is only two bytes when you call the first function in the table, so it pays to put the most frequently used function first in the import table. The "mov esi,..." takes five bytes. As the standard "call [...]" is six bytes long, you get a size reduction with as few as two such calls.
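
For reference, the byte counts quoted above correspond to the encodings below. This is only a sketch: the table address 0x00402000 and the placeholder entries func0/func1 are made up for illustration and do not come from a real import table. It assembles as a flat 32-bit binary, so the bytes can be checked in a hex dump.

```asm
; Sketch only: a stand-in "import table" at an assumed address.
use32
org 0x00402000

table:                                  ; hypothetical table of 4-byte pointers
  func0 dd 0                            ; placeholder entry, offset 0
  func1 dd 0                            ; placeholder entry, offset 4

start:
        mov     esi,table               ; BE 00 20 40 00     5 bytes, paid once
        call    dword [esi]             ; FF 16              2 bytes, first entry
        call    dword [esi+func1-table] ; FF 56 04           3 bytes, 8-bit displacement
        call    dword [func1]           ; FF 15 04 20 40 00  6 bytes, absolute address
```

With two register-relative calls the total is 5 + 3 + 3 = 11 bytes against 6 + 6 = 12 bytes for two absolute calls, and every further call saves three more.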
keenin 01 Sep 2003, 22:15
Thanks Privalov, it's really great.
I can save a lot of bytes that way.

comrade, isn't Gouraud shading just linear interpolation between the colors of the vertices? If so, it might not be that hard to implement. Interpolating the color along a line is very easy, but at the moment I cannot think of a high-performance way to interpolate the color across a plane.
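
One common way to write this down is the usual scanline formulation: interpolate the color along the two triangle edges first, then along each horizontal span between them. This is a sketch of that standard approach, not something taken from this thread, and the symbols below are chosen here for illustration.

```latex
% Linear interpolation between two vertex colors C_a and C_b
C(t) = (1 - t)\,C_a + t\,C_b, \qquad 0 \le t \le 1

% Along one scanline from x_l to x_r, with edge colors C_l and C_r
% obtained from the formula above (x_r > x_l assumed):
\Delta C = \frac{C_r - C_l}{x_r - x_l}, \qquad
C(x) = C_l + (x - x_l)\,\Delta C
```

Since the step is constant along a scanline, the inner loop needs only one addition per color channel per pixel.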
Vortex 02 Sep 2003, 20:01
Privalov,

It's a very nice trick.

keenin 03 Sep 2003, 19:06
Hi,

I am using Privalov's nice trick all the time now :), but since the displacement is a signed byte (isn't it?), I wondered whether the negative values could be used as well. In my data I defined a label and access the data defined before and after it through edi+/-X, but for calls I don't see a nice (!) way to do the same. I could only think of moving a procedure's address into a register, but there seem to be gaps between the libraries.

keenin.
Tomasz Grysztar 03 Sep 2003, 19:20
You can put the most frequently used import in the middle of the table, and then do calls like (this example assumes that wglCreateContext import was chosen):
Code:
     mov esi,wglCreateContext
     invoke esi+wglCreateContext-wglCreateContext, ebx 
   ; ... 
     invoke esi+wglMakeCurrent-wglCreateContext, ebx,eax    

This way negative offsets are used as well, allowing up to 64 functions to be reached with a one-byte displacement.
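
The limit of 64 follows from the signed 8-bit displacement, which covers -128 to +127, with 4 bytes per table entry. A small sketch of the boundaries, using an assumed address and placeholder tables (the labels first_half and middle are invented for this example):

```asm
; Sketch only: 64 placeholder entries with the base label in the middle.
use32
org 0x00402000

first_half: times 32 dd 0               ; offsets -128 .. -4 from "middle"
middle:     times 32 dd 0               ; offsets    0 .. +124 from "middle"

start:
        mov     esi,middle
        call    dword [esi-128]         ; FF 56 80           first of the 64 entries
        call    dword [esi+124]         ; FF 56 7C           last of the 64 entries
        call    dword [esi+128]         ; FF 96 80 00 00 00  one entry further: 32-bit
                                        ; displacement, 6 bytes again
```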
keenin 03 Sep 2003, 21:25
Thanks, again.
But I also wondered why there is a gap between two libraries (for example, between user and gdi it was approx. 0xDE in my program).

keenin.
Tomasz Grysztar 03 Sep 2003, 21:40
This is because the import macros put the strings with the API names for each table right after the end of that table. For your needs it will probably be better to build the import data completely manually, as is done in the PEDEMO example in the console package; this way you will have the best control over it.
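
A hand-built import block along those lines might look like the sketch below. It is written from memory in the style of PEDEMO rather than copied from it, and the label import_table, the chosen functions and the message text are assumptions for illustration. The address tables sit back to back, separated only by their zero terminators, and all strings are moved behind them.

```asm
format PE GUI 4.0
entry start

start:
        mov     esi,import_table
        xor     ebx,ebx
        push    ebx                     ; uType = MB_OK
        push    _caption                ; lpCaption
        push    _caption                ; lpText
        push    ebx                     ; hWnd = NULL
        call    dword [esi+MessageBoxA-import_table]
        push    ebx                     ; exit code 0
        call    dword [esi+ExitProcess-import_table]

  _caption db 'manual imports',0

data import

  ; import directory: five dwords per DLL, then an all-zero terminator
  dd 0,0,0,RVA kernel_name,RVA kernel_table
  dd 0,0,0,RVA user_name,RVA user_table
  dd 0,0,0,0,0

  ; address tables, contiguous except for the terminating zero dwords
  import_table:
  kernel_table:
    ExitProcess dd RVA _ExitProcess
    dd 0
  user_table:
    MessageBoxA dd RVA _MessageBoxA
    dd 0

  ; DLL names and hint/name entries, all placed after the tables
  kernel_name db 'KERNEL32.DLL',0
  user_name db 'USER32.DLL',0

  _ExitProcess dw 0
    db 'ExitProcess',0
  _MessageBoxA dw 0
    db 'MessageBoxA',0

end data
```

With this layout the distance between the last kernel entry and the first user entry is only the 4-byte terminator, so one base register can reach functions from both libraries with short displacements.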
keenin 03 Sep 2003, 22:56
Thanks anyway; now I understand the system of creating the tables :)

You are really great. I have never seen a single person doing that awesome, marvellous work.

Regards.
Tomasz Grysztar 04 Sep 2003, 23:39
I have managed to fit the template example into 1024 bytes this way; here's the source:
Code:
format PE GUI 4.0
entry start

include '%include%\win32a.inc'

  start:

        mov     esi,user
        mov     edi,wc
        macro   invoke proc,[arg] {common invoke esi+proc-user,arg}
        virtual at edi
        ediwc   WNDCLASS
        end     virtual

        invoke  GetModuleHandle,ebx
        mov     [ediwc.hInstance],eax
        invoke  LoadIcon,0,IDI_EXCLAMATION
        mov     [ediwc.hIcon],eax
        invoke  LoadCursor,0,IDC_ARROW
        mov     [ediwc.hCursor],eax
        xor     eax,eax
        mov     [ediwc.style],eax
        mov     [ediwc.lpfnWndProc],WindowProc
        mov     [ediwc.cbClsExtra],eax
        mov     [ediwc.cbWndExtra],eax
        mov     [ediwc.lpszMenuName],eax
        mov     [ediwc.hbrBackground],COLOR_BTNFACE+1
        mov     ebx,_title
        mov     [ediwc.lpszClassName],ebx
        invoke  RegisterClass,edi
        invoke  CreateWindowEx,0,ebx,ebx,WS_VISIBLE+WS_DLGFRAME+WS_SYSMENU,64,64,127,127,NULL,NULL,[ediwc.hInstance],NULL

        mov     ebx,msg
  msg_loop:
        invoke  GetMessage,ebx,NULL,0,0
        or      eax,eax
        jz      end_loop
        invoke  TranslateMessage,ebx
        invoke  DispatchMessage,ebx
        jmp     msg_loop

  end_loop:
        invoke  ExitProcess,0

proc WindowProc, hwnd,wmsg,wparam,lparam
        enter
        push    esi
        mov     esi,user
        cmp     [wmsg],WM_DESTROY
        je      wmdestroy
        invoke  DefWindowProc,[hwnd],[wmsg],[wparam],[lparam]
        jmp     finish
  wmdestroy:
        invoke  PostQuitMessage,0
        xor     eax,eax
  finish:
        pop     esi
        return

data import

  library kernel,'KERNEL32.DLL',\
          user,'USER32.DLL'

  import kernel,\
         GetModuleHandle,'GetModuleHandleA',\
         ExitProcess,'ExitProcess'

  import user,\
         RegisterClass,'RegisterClassA',\
         LoadIcon,'LoadIconA',\
         LoadCursor,'LoadCursorA',\
         CreateWindowEx,'CreateWindowExA',\
         DefWindowProc,'DefWindowProcA',\
         GetMessage,'GetMessageA',\
         TranslateMessage,'TranslateMessage',\
         DispatchMessage,'DispatchMessageA',\
         PostQuitMessage,'PostQuitMessage'

end data

  _title db '1024',0

  msg MSG
  wc WNDCLASS
    
keenin 05 Sep 2003, 15:54
Thanks for that really nice example!

Just a few questions :): Which section does FASM create if none is defined? Is it necessary to define the imports as "data import"?

I noticed that uninitialized data consumes space in the executable if it is not at the end of the section (i.e. of the file, as there is only one section). Is that correct? Or do I always have to allocate it manually (for example with GlobalAlloc) to get the smallest size?

keenin.
Tomasz Grysztar 05 Sep 2003, 16:08
When no section is defined, fasm creates the .flat section, for both the data and code.
Each section has two sizes assigned to it: the total size in memory, and the number of bytes to read from the file into that memory. The difference between those two sizes marks the uninitialized data; therefore you should put the uninitialized data at the end of the section (otherwise it will simply be stored in the file as zero-initialized data).
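
A minimal sketch of that placement rule, with made-up names (_title, scratch) and an arbitrary buffer size:

```asm
format PE GUI 4.0
entry start

start:
        ret                             ; placeholder body for the sketch

  _title  db 'intro4k',0                ; initialized data: stored in the file

  ; Reserved data at the very end of the section: it only enlarges the
  ; in-memory size of the section and adds nothing to the file.
  scratch rb 65536

  ; If "scratch rb 65536" were placed above _title instead, the assembler
  ; would have to write 65536 zero bytes into the file so that _title
  ; still lands at the right offset.
```

So a buffer of this kind does not require GlobalAlloc to stay out of the file, as long as every reservation comes after the last initialized byte.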