flat assembler
Message board for the users of flat assembler.

[sug] targeted allocation of memory in Windows fasm

revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20335
Location: In your JS exploiting you and your system
revolution 11 Jan 2015, 18:12
By using exception handling it is possible to make fasm use only the memory that it needs when assembling.

My motivation for this is fourfold:
  1. Fix an existing bug with the allocation algorithm. Currently, and somewhat paradoxically, fasm can use more memory only when you have less memory available. fasm can't allocate more than 1GB of RAM if you have 2GB or more RAM available. Only when you drop below about 1.5GB will fasm be able to allocate all of it.
  2. Fix a problem with Windows showing an annoying dialog box asking to close fasm because of excessive memory use. When later versions of Windows detect low memory availability the OS will show a dialog box to the user suggesting to close a program to free up memory space. This happens when fasm allocates 1GB and triggers the box. But by the time the user has read the message fasm has already finished and the user is left to swat away the annoying box which served no useful purpose.
  3. Be more respectful of other programs by using less memory when it is not needed. We use multitasking OSes and programs need to be more friendly with their memory usage. If you try to run a few instances of fasm simultaneously to take advantage of the extra cores you may find you end up with memory starvation even when compiling very small files. This can be worked around but requires manually setting the RAM usage which is cumbersome and error prone.
  4. Be able to use more memory when needed than is currently possible. Currently even on a system with many gigabytes of available memory it is not possible to use more than 1GB.
So my suggestion is to make some changes to FASM.ASM and SYSTEM.INC in the WIN32 folder (the console version) as detailed below.

Please see further ahead in the thread for updated code with enhancements and bug fixes.

In FASM.ASM:

We start by making fasm large address aware. This allows us to use up to approximately 2.75GB, although the code shown here can use only 2GB. The limit can be raised to the 2.75GB maximum with some changes to the memory reservation code; for now we keep it just under 2GB.
Code:
-       format  PE console
+       format  PE console large    
Next we need to add a link to the SetUnhandledExceptionFilter API. We also eliminate the GlobalMemoryStatus link because it is not needed.
Code:
-    GlobalMemoryStatus dd rva _GlobalMemoryStatus
+    SetUnhandledExceptionFilter dd rva _SetUnhandledExceptionFilter
;...
-  _GlobalMemoryStatus dw 0
-    db 'GlobalMemoryStatus',0
+  _SetUnhandledExceptionFilter dw 0
+    db 'SetUnhandledExceptionFilter',0    
In SYSTEM.INC:

Define some constants for ease of readability.
Code:
+MEMORY_ALLOCATION_BLOCK_SIZE   = 1 shl 16      ;64kB. Higher numbers reduce the number of calls to the allocator. A value of 1 will allocate just the pages where the memory is accessed but will call the allocator more times
+MEMORY_ALLOCATION_MAX_SIZE     = 0x7ffc0000    ;Windows fails all calls that try to allocate more than this    
Change the allocation algorithm to reserve (i.e. not commit) all of the address space. Because later we want to commit pages on demand.
Code:
-       push    buffer
-       call    [GlobalMemoryStatus]
-       mov     eax,dword [buffer+20]
-       mov     edx,dword [buffer+12]
-       cmp     eax,0
-       jl      large_memory
-       cmp     edx,0
-       jl      large_memory
-       shr     eax,2
-       add     eax,edx
-       jmp     allocate_memory
-    large_memory:
-       mov     eax,80000000h
+       mov     eax,MEMORY_ALLOCATION_MAX_SIZE
     allocate_memory:    
Change the API call for reservation.
Code:
-       push    MEM_COMMIT
+       push    MEM_RESERVE    
Add in a call to SetUnhandledExceptionFilter so that the memory can be committed on demand.
Code:
+       push    memory_allocation_handler
+       call    [SetUnhandledExceptionFilter]    
Change the computation slightly for situations where the reservation fails so as to make more memory available if it is needed. Instead of halving the value each iteration we take 3/4 and try again.
Code:
-       shl     eax,1
+       lea     eax,[eax*3]    
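To illustrate the effect of this change, here is a small Python model of the retry sizes (not fasm code). It assumes, as the "halving" wording implies, that EAX holds one quarter of the failed request when this instruction runs, so doubling it gives 1/2 and tripling it gives 3/4. The function name and the `attempts` parameter are made up for the illustration.

```python
MEMORY_ALLOCATION_MAX_SIZE = 0x7FFC0000  # first request, from the constants above


def retry_sizes(first_request, multiplier, attempts):
    """Return the sizes requested on successive failed reservations.

    multiplier=2 models `shl eax,1`        (quarter * 2 = 1/2 of the last request)
    multiplier=3 models `lea eax,[eax*3]`  (quarter * 3 = 3/4 of the last request)
    """
    sizes = [first_request]
    for _ in range(attempts - 1):
        quarter = sizes[-1] >> 2  # assumption: EAX = last request / 4
        sizes.append((quarter * multiplier) & 0xFFFFFFFF)
    return sizes


halving = retry_sizes(MEMORY_ALLOCATION_MAX_SIZE, 2, 4)
three_quarters = retry_sizes(MEMORY_ALLOCATION_MAX_SIZE, 3, 4)
print("1/2 each retry:", [hex(s) for s in halving])
print("3/4 each retry:", [hex(s) for s in three_quarters])
```

After three failed attempts the 3/4 rule is still asking for roughly 860MB, where halving has already dropped to roughly 256MB, so less of the available address space is given up on each retry.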
Add in the code to commit memory on demand when we get an access exception.
Code:
+memory_allocation_handler:
+       mov     eax,[esp+4]     ;get pointer to exception information
+       mov     ecx,[eax]       ;EXCEPTION_POINTERS.ExceptionRecord
+       mov     edx,[ecx]       ;EXCEPTION_RECORD.ExceptionCode
+       cmp     edx,0xc0000005  ;EXCEPTION_ACCESS_VIOLATION
+       jnz     .fail
+       mov     edx,[ecx+24]    ;EXCEPTION_RECORD.ExceptionInformation[1], the data address that faulted
+       cmp     edx,[memory_start]
+       jb      .fail
+       mov     eax,[additional_memory_end]
+       sub     eax,edx         ;don't allocate more than EAX bytes
+       jbe     .fail
+       mov     ecx,MEMORY_ALLOCATION_BLOCK_SIZE
+       cmp     eax,ecx
+       jbe     .allocation_size_defined
+       mov     eax,ecx         ;don't allocate more than ECX bytes
+    .allocation_size_defined:
+       push    PAGE_READWRITE
+       push    MEM_COMMIT
+       push    eax             ;allocate this many bytes
+       push    edx             ;at this address
+       call    [VirtualAlloc]
+       test    eax,eax
+       jz      out_of_memory
+       mov     eax,-1          ;EXCEPTION_CONTINUE_EXECUTION
+       ret     4
+    .fail:                     ;some other exception happened
+       mov     eax,0           ;EXCEPTION_CONTINUE_SEARCH
+       retn    4    
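The range checks and the size clamp in the handler can be sketched in Python as below. This is only a model of the arithmetic, not a replacement for the handler. The function names and the example addresses are hypothetical, and the page count assumes 4kB pages with the committed range expanded to whole pages.

```python
PAGE_SIZE = 0x1000                      # assumption: 4kB pages
MEMORY_ALLOCATION_BLOCK_SIZE = 1 << 16  # 64kB, from the constants above


def commit_request(fault_address, memory_start, additional_memory_end,
                   block_size=MEMORY_ALLOCATION_BLOCK_SIZE):
    """Return (address, size) as pushed for VirtualAlloc, or None for .fail."""
    if fault_address < memory_start:      # cmp edx,[memory_start] / jb .fail
        return None
    remaining = additional_memory_end - fault_address
    if remaining <= 0:                    # sub eax,edx / jbe .fail
        return None
    return fault_address, min(remaining, block_size)


def pages_committed(address, size):
    """Number of pages touched by [address, address+size) after page rounding."""
    first = address // PAGE_SIZE
    last = (address + size - 1) // PAGE_SIZE
    return last - first + 1


# Hypothetical reserved block of 1MB starting at 0x10000000.
start, end = 0x10000000, 0x10100000
print(commit_request(0x10000123, start, end))                # 64kB request
print(commit_request(0x10000123, start, end, block_size=1))  # single page
print(commit_request(end, start, end))                       # outside: None
```

With the block size set to 1 each fault commits exactly the page that was touched, which is the "more calls, less memory" end of the tradeoff described in the constant's comment.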
And lastly modify the read function to commit RAM for each file. Note that this memory is only committed inside the block we reserved earlier, it is not a new section of the address space that needs to be tracked and freed upon exit.
Code:
 read:
+       push    ebx edx ecx
+       push    PAGE_READWRITE
+       push    MEM_COMMIT
+       push    ecx             ;allocate the required number of bytes
+       push    edx             ;at this address
+       call    [VirtualAlloc]
+       pop     ecx edx ebx
+       test    eax,eax
+       jz      file_error    
Attached are the patch differences file and the two modified files, FASM.ASM and SYSTEM.INC from the WIN32 folder, for those who might like to try it out.

During my testing I found that fasm assembling itself needs only ~2.3MB of memory. On my test system fasm will reserve 2GB of address space at startup and commit only 2.3MB during compilation. The Windows dialog box complaining about low memory availability no longer appears, and I can now have multiple instances all compiling simultaneously without running out of RAM.


Description: The complete modified SYSTEM.INC
Filename: SYSTEM.INC
Filesize: 9.72 KB

Description: The complete modified FASM.ASM
Filename: FASM.ASM
Filesize: 8.05 KB

Description: Patch file showing the changes
Filename: Patch.txt
Filesize: 4.25 KB



Last edited by revolution on 13 Apr 2015, 15:48; edited 1 time in total
JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 11 Jan 2015, 19:18
Very good idea! May I use it for Fresh IDE? Also, it would be good to have it for Linux as well.
revolution 11 Jan 2015, 19:28
JohnFound wrote:
Very good idea! May I use it for Fresh IDE? Also, it would be good to have it for Linux as well.
Yes, please do. A Linux version would be good also. And fasmw.
JohnFound 11 Jan 2015, 20:32
After some thought and source reading: the Linux version uses sys_brk which, IIRC, works exactly this way, allocating pages only on access. So there is no need for an update.
l_inc



Joined: 23 Oct 2009
Posts: 881
l_inc 11 Jan 2015, 20:35
revolution
Quote:
fasm can't allocate more than 1GB of RAM if you have 2GB or more RAM available

That is just not true. I always have more and I always let it preallocate 1.5 GB. The rest is used for libraries.
Quote:
Fix a problem with Windows showing an annoying dialog box asking to close fasm because of excessive memory use

I can't remember ever having seen that dialog box.
Quote:
Change the allocation algorithm to reserve (i.e. not commit) all of the address space. Because later we want to commit pages on demand.

In fact, Windows does that for you. Even if you allocate memory with MEM_COMMIT, the kernel does nothing but create the VADs. The page tables remain untouched until you actually try to read or write the allocated memory.

_________________
Faith is a superposition of knowledge and fallacy
revolution 11 Jan 2015, 20:40
l_inc wrote:
revolution
Quote:
fasm can't allocate more than 1GB of RAM if you have 2GB or more RAM available

That is just not true. I always have more and I always let it preallocate 1.5 GB. The rest is used for libraries.
Quote:
Fix a problem with Windows showing an annoying dialog box asking to close fasm because of excessive memory use

I can't remember ever having seen that dialog box.
Quote:
Change the allocation algorithm to reserve (i.e. not commit) all of the address space. Because later we want to commit pages on demand.

In fact, Windows does that for you. Even if you allocate memory with MEM_COMMIT, the kernel does nothing but create the VADs. The page tables remain untouched until you actually try to read or write the allocated memory.
It is true for me. I always get only 1GB allocated, unless I have less than 2GB free then I can get more allocated up to 1.75GB at which point Windows gives me the annoying dialog.

Plus the commit will starve other apps of memory. Windows will not allow the committed RAM to be used by other apps, else where is the guarantee of having committed RAM?
revolution 11 Jan 2015, 20:45
When the current code sees more than 2GB it tries to allocate 0x80000000 which will fail due to the 0x7ffc0000 limit, then it halves the amount to 0x40000000 and succeeds in getting 1GB. This happens for me every time; I always see "(1048576 kilobytes memory)".
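The arithmetic can be checked with a small Python model of the original size computation (the lines removed from SYSTEM.INC in the first post). It is only a sketch: `buffer+12` and `buffer+20` are taken to be the dwAvailPhys and dwAvailPageFile fields of MEMORYSTATUS, the 0x7ffc0000 per-call limit is what I observe here and is treated as an assumption, any request under that limit is assumed to succeed, and the function names and the sample memory figures are invented for the example.

```python
PER_CALL_LIMIT = 0x7FFC0000  # assumption: observed per-call limit
MB = 1 << 20


def default_request(avail_phys, avail_pagefile):
    """Model of the original heuristic on two 32-bit MEMORYSTATUS fields."""
    def is_negative(dword):            # `cmp reg,0` / `jl large_memory`
        return (dword & 0x80000000) != 0

    if is_negative(avail_pagefile) or is_negative(avail_phys):
        return 0x80000000              # large_memory
    return ((avail_pagefile >> 2) + avail_phys) & 0xFFFFFFFF


def first_success(request, limit=PER_CALL_LIMIT):
    """Halve the request (quarter shifted left once) until it fits the limit."""
    while request > limit:
        request = (request >> 2) << 1
    return request


plenty = first_success(default_request(0xC0000000, 0x10000000))  # 3GB free
scarce = first_success(default_request(1200 * MB, 800 * MB))     # 1.2GB free
print(plenty // 1024, "kilobytes with plenty of free memory")
print(scarce // 1024, "kilobytes with less free memory")
```

With 2GB or more free, the request starts at 0x80000000 and the first size that fits is 0x40000000, which is the 1048576 kilobytes reported. With the smaller sample figures the computed request is already under the limit, so the model hands out more than 1GB.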
l_inc 11 Jan 2015, 20:49
revolution
Quote:
else where is the guarantee of having committed RAM?

The guarantee is called "pagefile". If you run out of real RAM by touching too many of the committed pages, the working set of other applications will be reduced, and then you'll run out of your own working set limit and your pages will be swapped out. You can do your own tests and look at the working set performance counter in the process explorer before you allocate memory, after that, but before touching it and after touching the pages. Your working set will increase only after touching the pages. Besides you can read the "Windows Internals" and the WRK.

revolution 11 Jan 2015, 20:53
I seem to not see what you see.

Anyhow, I still feel that committing so much RAM is not the best use of the RAM or the pagefile. Commit on demand reduces the load on both.
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8353
Location: Kraków, Poland
Tomasz Grysztar 11 Jan 2015, 20:55
revolution wrote:
Fix an existing bug with the allocation algorithm. Currently, and somewhat paradoxically, fasm can use more memory only when you have less memory available. fasm can't allocate more than 1GB of RAM if you have 2GB or more RAM available. Only when you drop below about 1.5GB will fasm be able to allocate all of it.
You can allocate more memory in such a case with the right setting of the "-m" parameter. The default allocation is just a simplistic heuristic that I wrote years ago when trying to adapt fasm's memory usage scheme to a multitasking environment. Note that fasm to this day uses two memory blocks that it originally assumed to be the conventional and extended memory of a machine; on a modern system two such blocks need to be provided to "emulate" the memory layout of a classic DOS-like system without paging. There is no command line setting that would allow you to "fine tune" the size of each block separately, though; the size provided with "-m" is divided into two blocks by another heuristic.

In the case of the Linux version I did not implement any eager heuristic; it just has a hard limit of 16M by default, and to allocate more you always need to use the "-m" parameter. It turns out that was not such a bad decision, as I never got any complaints. ;)

revolution wrote:
We start by making fasm large address aware.
By the way: the reason why I never marked the console version with the "large" keyword (though I did it with fasmw) is that I maintain this version using a very basic set of API and PE features, in order to keep it compatible with environments like Win32s, WDOSX/HX, etc. That's why this interface has been stuck almost unchanged since the early 2000s.

Anyway, this is a very interesting modification of fasm, and I think something like this could be suitable for fasmw.
revolution 11 Jan 2015, 21:01
Tomasz Grysztar wrote:
By the way: the reason why I never marked the console version with the "large" keyword (though I did it with fasmw) is that I maintain this version using a very basic set of API and PE features, in order to keep it compatible with environments like Win32s, WDOSX/HX, etc. That's why this interface has been stuck almost unchanged since the early 2000s.
I was taught that the large addresses flag was harmless with PE loaders that don't understand it. If that is not true then all of my other programs might not be compatible with such systems because I always mark my stuff as large.
Tomasz Grysztar 11 Jan 2015, 21:05
revolution wrote:
Tomasz Grysztar wrote:
By the way: the reason why I never marked the console version with the "large" keyword (though I did it with fasmw) is that I maintain this version using a very basic set of API and PE features, in order to keep it compatible with environments like Win32s, WDOSX/HX, etc. That's why this interface has been stuck almost unchanged since the early 2000s.
I was taught that the large addresses flag was harmless with PE loaders that don't understand it. If that is not true then all of my other programs might not be compatible with such systems because I always mark my stuff as large.
I may have been overly cautious there; this is just a general approach I've kept. PEDEMO.ASM is also an example of this approach: it still has the .reloc section with a short comment that it is needed for Win32s. And even with this comment I still sometimes receive questions about the purpose of this seemingly redundant line. 8)


Last edited by Tomasz Grysztar on 11 Jan 2015, 21:06; edited 1 time in total
l_inc 11 Jan 2015, 21:06
revolution
Quote:
Commit on demand reduces the load on both

It does not. The pages are not swapped out unless they've ever been touched, because they just don't exist.
Quote:
it tries to allocate 0x80000000 which will fail due to the 0x7ffc0000 limit

I might have forgotten something, but I can't remember there being such a limit. You are free to allocate more than 2GB if you've configured a 3:1 split. Otherwise you shouldn't even try to allocate that much, because there are libraries that take away a significant amount of the virtual address space.
Quote:
I always get only 1GB allocated, unless I have less than 2GB free then I can get more allocated up to 1.75GB

This behaviour would be interesting to investigate, but my guess would be that the libraries occupy less sparse regions of virtual memory in some cases.

revolution 11 Jan 2015, 21:08
The 0x7ffc0000 limit exists only per call. In total you can get more but it requires more than one call.

Or, at least, it does for me on Win7-64. Perhaps you see something different there also?
revolution 11 Jan 2015, 21:19
l_inc wrote:
revolution
Quote:
Commit on demand reduces the load on both

It does not. The pages are not swapped out unless they've ever been touched, because they just don't exist.
Your pagefile is not of infinite size. At some point Windows will deny commit allocations when the sum of all commits exceeds its capacity.
l_inc 11 Jan 2015, 22:21
revolution
Quote:
The 0x7ffc0000 limit exists only per call. In total you can get more but it requires more than one call

It would be a bit cumbersome for me to do the test, because I always use the standard 2:2 split, and reconfiguring it often results in crashes of some applications. I looked into the WRK and didn't find such a limitation after a superficial reading. I guess your limit is caused by other virtual addressing space reservations (including dlls) that do not allow a virtual address range to be contiguous.
Quote:
Your pagefile is not of infinite size. At some point Windows will deny commit allocations when the sum of all commits exceeds its capacity.

That's totally true. But this rarely happens (don't you btw. have your pagefile disabled? This would explain your annoying low memory message). You should get into low memory conditions just at the moment of compilation, so that another application gets its memory allocation request denied. Because compilation doesn't normally last for long it is preferable for fasm to commit all the required memory at once, because it avoids additional performance penalties this way. As long as this contributes solely to the commit charge, but not to the process working set, there isn't much of a problem here.
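The distinction between commit charge and working set can be put into a toy Python model. It is a deliberate simplification and not how the memory manager is implemented: the commit limit is taken as roughly RAM plus pagefile, committing only raises the charge, and physical pages are counted only when touched. The class, its fields and the sample sizes are all hypothetical.

```python
PAGE = 4096
GB = 1 << 30


class ToySystem:
    """Toy accounting of system commit charge versus touched pages."""

    def __init__(self, ram, pagefile):
        self.commit_limit = ram + pagefile  # rough approximation
        self.commit_charge = 0              # promised, not necessarily used
        self.resident = 0                   # bytes in pages actually touched

    def commit(self, size):
        """Charge a commit request; deny it if the limit would be exceeded."""
        if self.commit_charge + size > self.commit_limit:
            return False
        self.commit_charge += size
        return True

    def touch(self, size):
        """Touch `size` bytes of committed memory, rounded up to whole pages."""
        pages = -(-size // PAGE)
        self.resident += pages * PAGE


# 2GB RAM, no pagefile: fasm commits 1GB but touches only about 2.3MB.
no_pagefile = ToySystem(ram=2 * GB, pagefile=0)
no_pagefile.commit(1 * GB)
no_pagefile.touch(2_300_000)
print(no_pagefile.resident, no_pagefile.commit(GB + GB // 2))

# Same workload with a 2GB pagefile.
with_pagefile = ToySystem(ram=2 * GB, pagefile=2 * GB)
with_pagefile.commit(1 * GB)
with_pagefile.touch(2_300_000)
print(with_pagefile.resident, with_pagefile.commit(GB + GB // 2))
```

In this model the touched memory is the same in both runs; what differs is whether a second large commit request from another application is denied, which depends only on the commit limit.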

revolution 11 Jan 2015, 22:41
l_inc wrote:
I guess your limit is caused by other virtual addressing space reservations (including dlls) that do not allow a virtual address range to be contiguous.
I thought this also but that is not the reason. I can still allocate a contiguous range of ~2.75GB with two commit calls each below 0x7ffc0000.
l_inc wrote:
... (don't you btw. have your pagefile disabled? This would explain your annoying low memory message). You should get into low memory conditions just at the moment of compilation, so that another application gets its memory allocation request denied. Because compilation doesn't normally last for long it is preferable for fasm to commit all the required memory at once, because it avoids additional performance penalties this way. As long as this contributes solely to the commit charge, but not to the process working set, there isn't much of a problem here.
I did see a minor performance hit of a few %, even with the block size set to 1 byte (i.e. one 4kB page per call); the tradeoff is that I am not wasting my memory, I can use my memory more efficiently. I am not a speed junkie anyway; if I have to choose I prefer things to be correct rather than a few % faster. And, yes, I do not use a pagefile.

I want to re-quote this part with a highlight:
l_inc wrote:
... it is preferable for fasm to commit all the required memory at once ...
I completely agree, but the key word there is "required". How can we determine the requirement beforehand?
l_inc 11 Jan 2015, 22:49
revolution
By "required" I meant what the user requested it to allocate with the "-m" switch. :)

Quote:
the tradeoff is that I am not wasting my memory, I can use my memory more efficiently

Again, you are not wasting your memory anyway. You waste the commit charge, which is the system's reservation for the case you actually use the committed memory.

revolution 11 Jan 2015, 22:57
Committed memory is memory I can't give to another process. I think of that as a waste if I don't use all of the amount committed.
l_inc 11 Jan 2015, 23:01
revolution
It's not memory. It's just a number. What you waste is a portion of a number.

Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.