flat assembler
Message board for the users of flat assembler.

[win64] Big buffer in .bss section

cinolt 20 Sep 2014, 18:19
I want a 2GB buffer reserved in virtual memory, and I thought I could do this by just declaring it in the .bss section:
Code:
format pe64 gui
entry start

section '.idata' import data readable writeable
  dd 0, 0, 0, rva kernel32_name, rva kernel32_table
  dd 0, 0, 0, 0, 0
  kernel32_table:
    ExitProcess      dq rva _ExitProcess
    dq 0
  kernel32_name:     db 'kernel32.dll', 0, 0
  _ExitProcess:      db 0, 0, 'ExitProcess', 0

section '.bss' readable writeable
  bigbuffer: rb 1024*1024*1024*2
; bigbuffer: rb 1024*1024*1024
; bigbuffer: rb 1

section '.text' code readable executable
  start:
    sub rsp, 40          ; 32 bytes of shadow space + 8 to realign the stack to 16 bytes
    xor ecx, ecx         ; uExitCode
    call [ ExitProcess ]

I get this output:
Code:
$ fasm test.asm
flat assembler  version 1.71.21  (1048576 kilobytes memory)
error: out of memory.    

So, I tried increasing the available memory with the -m option:
Code:
$ fasm -m 3000000 test.asm
flat assembler  version 1.71.21  (1500000 kilobytes memory)
error: out of memory.    

I thought this was weird, because I don't see why fasm should need memory when I'm just reserving space in the .bss section.

Reducing bigbuffer to 1GB works fine, but fasm takes much longer to assemble it for some reason:
Code:
$ fasm -m 3000000 test.asm
flat assembler  version 1.71.21  (1500000 kilobytes memory)
2 passes, 0.8 seconds, 1536 bytes.    

Anyone know what's going on here? Is it just a PE+ limitation or is it a bug with fasm?



Tomasz Grysztar 20 Sep 2014, 19:29
This is an unfortunate limitation caused by fasm's internal architecture. Every data reservation directive is reflected by an actual block of reserved (and even zeroed) data at assembly time.

I think a possible solution to such a problem would be to allow a special syntax for the SECTION directive to grow the reserved portion without the actual use of RB.



comrade 21 Sep 2014, 06:17
cinolt, FASM limitations aside, why do you need a 2 GB section? Your approach may very well be appropriate, I am just curious what you are trying to do. You should be aware that the entire 2 GB will be charged against the Windows system commit limit, even if you do not touch a single page.

Tomasz Grysztar wrote:
I think a possible solution to such a problem would be to allow a special syntax for the SECTION directive to grow the reserved portion without the actual use of RB.


Do you have a tracker somewhere where we can open a ticket for this request?

_________________
comrade (comrade64@live.com; http://comrade.ownz.com/)



cinolt 21 Sep 2014, 14:44
Thanks for your replies.

Tomasz Grysztar: Oh, that is unfortunate. I would greatly appreciate it if such a solution were implemented by one of the developers. For now, I figure I'll just manually call VirtualAlloc at runtime.

comrade: The 2GB buffer is the main buffer for a "field" in a "sandbox" type game that I'm writing (e.g. Minecraft, Terraria, etc). That is, the buffer is just a big 2D square array of entries, and each entry contains data about the corresponding position within the field.

Typically I think programmers would allocate a modest amount of memory on the heap, then when the player moves "outside" the buffer, it would get reallocated and the buffer contents would have to get reorganized (think about a 2D array of a certain size, having to get copied onto a 2D array of a bigger size). That means extra overhead: calling the API, then reorganizing the data.

Since I'm working with win64, with a 64-bit address space (16 exabytes of memory!), and since I care more about speed than memory usage, it seems like it would be a waste to not just commit 2GB of memory, then use it as the largest possible buffer size. No reallocation, no reorganization.

At first I didn't see why an OS would have a system commit limit like that, when all I'm saying is "in this process, addresses x through x+1024*1024*1024*2 are to be accessible." However after doing some searching it appears that Windows does have a "commit charge". It makes sense now that I think about it, because memory that's committed is going to have to be guaranteed to be available. So the ideal solution actually would be to only MEM_RESERVE a 2GB space, then MEM_COMMIT pages as I use them, would you agree?
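
The reserve-then-commit idea above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not code from the thread: `page_span` and `FIELD_BYTES` are hypothetical names and the offsets are arbitrary. `VirtualAlloc`, `VirtualFree` and `GetSystemInfo` are the real Win32 calls, and the same sequence should translate directly to `call [VirtualAlloc]` in fasm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen the byte range [offset, offset+len) to whole
   pages. page_size must be a power of two. Returns the page-aligned start
   offset and stores the page-aligned byte count in *out_size. */
uint64_t page_span(uint64_t offset, uint64_t len, uint64_t page_size,
                   uint64_t *out_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    uint64_t start = offset & ~(page_size - 1);
    uint64_t end = (offset + len + page_size - 1) & ~(page_size - 1);
    *out_size = end - start;
    return start;
}

#ifdef _WIN32   /* Windows-only part */
#include <windows.h>
#include <stdio.h>

#define FIELD_BYTES (2ULL << 30)   /* 2 GB of address space (assumed size) */

int main(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    /* Reserve address space only; no commit is charged for the data yet. */
    unsigned char *field = VirtualAlloc(NULL, FIELD_BYTES,
                                        MEM_RESERVE, PAGE_NOACCESS);
    if (field == NULL) {
        fprintf(stderr, "reserve failed: %lu\n", GetLastError());
        return 1;
    }

    /* Commit just the pages covering bytes [1000000, 1000100). */
    uint64_t size;
    uint64_t start = page_span(1000000, 100, si.dwPageSize, &size);
    if (VirtualAlloc(field + start, (SIZE_T)size,
                     MEM_COMMIT, PAGE_READWRITE) == NULL) {
        fprintf(stderr, "commit failed: %lu\n", GetLastError());
        VirtualFree(field, 0, MEM_RELEASE);
        return 1;
    }

    field[1000000] = 1;            /* now backed by committed memory */
    VirtualFree(field, 0, MEM_RELEASE);
    return 0;
}
#endif
```

Committing a page that is already committed does not make `VirtualAlloc` fail, so a caller following this pattern does not strictly need to track which pages are already live.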



comrade 21 Sep 2014, 21:23
cinolt wrote:
it would get reallocated and the buffer contents would have to get reorganized (think about a 2D array of a certain size, having to get copied onto a 2D array of a bigger size)

Yes, exactly: a naïve malloc-realloc strategy has this shortcoming. Plus at the transition stage where you do the memory copy, you will be using 2x memory (as both the old and the new buffer will be allocated).

cinolt wrote:
So the ideal solution actually would be to only MEM_RESERVE a 2GB space, then MEM_COMMIT pages as I use them, would you agree?

That's exactly what I was going to recommend. Set up an SEH page-fault handler around your code. Make sure you filter only for EXCEPTION_ACCESS_VIOLATION exceptions in your reserved address range (EXCEPTION_RECORD.ExceptionInformation[1] will give you the faulting address). Then use VirtualAlloc to commit a page (or a range of pages, up to you). The VirtualAlloc call will fail with ERROR_COMMITMENT_LIMIT if the system is out of commit (though typically it can be retried some seconds later after the OS expands the pagefile). Once committed, return EXCEPTION_CONTINUE_EXECUTION from the SEH filter to retry the memory access.

You still have the benefit of a contiguous address range that would require no reorganization or dubious copying when extended.
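
A rough sketch of the on-demand commit scheme just described, in C. It assumes an MSVC-compatible compiler, since `__try`/`__except` is a compiler extension. The names `chunk_for_fault`, `commit_on_fault`, `REGION_BYTES` and `CHUNK_BYTES` are hypothetical, and committing 64 KB per fault is an arbitrary choice. The Windows parts (`VirtualAlloc`, `EXCEPTION_RECORD.ExceptionInformation[1]`, the filter return values) are the ones named in the post.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: if fault_addr lies inside the reserved region, return
   the start of the chunk to commit; otherwise return 0 ("not our fault").
   chunk must be a power of two and base must be chunk-aligned. */
uint64_t chunk_for_fault(uint64_t base, uint64_t region_size,
                         uint64_t fault_addr, uint64_t chunk)
{
    assert(chunk != 0 && (chunk & (chunk - 1)) == 0);
    if (fault_addr < base || fault_addr - base >= region_size)
        return 0;
    return fault_addr & ~(chunk - 1);
}

#ifdef _MSC_VER   /* Windows/MSVC-only part */
#include <windows.h>
#include <stdio.h>

#define REGION_BYTES (2ULL << 30)    /* 2 GB reserved */
#define CHUNK_BYTES  (64ULL << 10)   /* commit 64 KB per fault */

static unsigned char *g_base;

static LONG commit_on_fault(EXCEPTION_POINTERS *ep)
{
    const EXCEPTION_RECORD *er = ep->ExceptionRecord;
    if (er->ExceptionCode != EXCEPTION_ACCESS_VIOLATION)
        return EXCEPTION_CONTINUE_SEARCH;

    /* ExceptionInformation[1] holds the faulting virtual address. */
    uint64_t chunk = chunk_for_fault((uint64_t)(ULONG_PTR)g_base, REGION_BYTES,
                                     er->ExceptionInformation[1], CHUNK_BYTES);
    if (chunk == 0)
        return EXCEPTION_CONTINUE_SEARCH;     /* outside our reservation */

    if (VirtualAlloc((void *)(ULONG_PTR)chunk, CHUNK_BYTES,
                     MEM_COMMIT, PAGE_READWRITE) == NULL)
        return EXCEPTION_EXECUTE_HANDLER;     /* e.g. ERROR_COMMITMENT_LIMIT */

    return EXCEPTION_CONTINUE_EXECUTION;      /* retry the faulting access */
}

int main(void)
{
    g_base = VirtualAlloc(NULL, REGION_BYTES, MEM_RESERVE, PAGE_NOACCESS);
    if (g_base == NULL)
        return 1;

    __try {
        g_base[123456789] = 42;               /* first touch faults, then is retried */
        printf("value: %d\n", g_base[123456789]);
    } __except (commit_on_fault(GetExceptionInformation())) {
        fprintf(stderr, "could not commit memory\n");
    }

    VirtualFree(g_base, 0, MEM_RELEASE);
    return 0;
}
#endif
```

Rounding the fault address down to a 64 KB boundary stays inside the reservation here because `VirtualAlloc` returns addresses aligned to the allocation granularity (64 KB on typical systems) and the region size is a multiple of the chunk size. A different chunk size would need the same check.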

cinolt wrote:
Since I'm working with win64, with a 64-bit address space (16 exabytes of memory!)

If you consider making a 32-bit port of your application, you can use the same trick with pagefile-backed (Linux terminology: anonymous) file mapping objects. Look into CreateFileMapping with the SEC_RESERVE attribute. You will need additional logic in your application to map smaller windows of the large section at a time. You can use the same VirtualAlloc trick to turn MEM_RESERVE pages to MEM_COMMIT.

The pagefile-backed file mapping may also be useful on 64-bit, if you ever want to create a sparse view of a contiguous region. Think of a 1 TB address range that gives you the benefit of writing anywhere in it; though you still have to make sure you commit sane amounts of it.

cinolt wrote:
However after doing some searching it appears that Windows does have a "commit charge". It makes sense now that I think about it, because memory that's committed is going to have to be guaranteed to be available.


Yes. The Windows Internals book by Mark Russinovich and David Solomon goes into depth on these topics. So does the following excellent blog post series by Mark:
http://blogs.technet.com/b/markrussinovich/archive/2008/11/17/3155406.aspx

Make sure you read Pavel's comments below the post for some curious implementation details.




cinolt 22 Sep 2014, 00:48
Very informative post and blog post series. Looks like I'll have to learn SEH for the first time :)

One question I still have: is there any real advantage to using pagefile-backed file mapping on win64? According to Pavel's comments in the blog post, reserving virtual memory does incur some commit charge, and I verified this: reserving 1TB works fine, but reserving 2TB fails on my particular system.
I then tried CreateFileMapping like so:
Code:
CreateFileMapping( INVALID_HANDLE_VALUE, 0, PAGE_READWRITE | SEC_RESERVE, 0x00000100, 0, 0 );    

which creates the mapping with a maximum size of 1TB. And again, this function call succeeds with 1TB, but fails with 2TB, so it's not like I can get more reserved memory this way. Is there any benefit from having a "sparse" view?
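
For reference, the 1TB figure follows from how `CreateFileMapping` takes its maximum size as two 32-bit halves: a high half of 0x100 with a low half of 0 means 0x100 × 2^32 = 2^40 bytes. A small sketch of that split in C (`split_size` is a hypothetical helper name, not part of the Win32 API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 64-bit mapping size into the two 32-bit values that
   CreateFileMapping takes as dwMaximumSizeHigh and dwMaximumSizeLow. */
void split_size(uint64_t size, uint32_t *high, uint32_t *low)
{
    assert(high != NULL && low != NULL);
    *high = (uint32_t)(size >> 32);
    *low  = (uint32_t)(size & 0xFFFFFFFFu);
}
```

With this, a size of `1ULL << 40` (1TB) gives a high half of 0x100 and a low half of 0, matching the call above, and 2TB gives a high half of 0x200.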



comrade 22 Sep 2014, 03:15
cinolt wrote:
Is there any benefit from having a "sparse" view?

Depends on your use case. Entertaining the idea of your sandbox world, could it be developed with several sparse regions, with lots of nothing in between? The populated regions would be MEM_COMMIT, while "nothing" would stay MEM_RESERVE. This is really a question of how your world is organized in memory (a simple 2D array? a list of 2D arrays? etc) - for some organizations you can exploit the memory manager.

Pavel Lebedinsky wrote:

A few more points:
1. Reserved memory does contribute to commit charge, because the memory manager charges commit for pagetable space necessary to map the entire reserved range. On 64 bit this can be a significant number (reserving 1 TB of memory will consume approximately 2 GB of commit).
...


This may have been improved in Windows 8+. I would have to check it out, but from a recent experiment (on Windows 8), I did not observe any commitment charge for a 1 TB reserved mapping.



revolution 22 Sep 2014, 04:31
Each 4k block needs an 8 byte entry in the page table to mark it reserved. So, yes, 1TB (268,435,456 * 4k blocks) will require 2GB (268,435,456 * 8 bytes) of space to reserve it. There isn't any way around that unless the OS delays the reservation until the first byte is accessed or something.
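
The arithmetic above can be checked with a few lines of C. This is only a sketch: `pte_bytes` is a hypothetical name, and it counts just one 8-byte entry per 4 KB page, ignoring the much smaller upper levels of the page-table hierarchy.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of last-level page-table entries needed to map reserved_bytes,
   given the page size and the size of one entry. */
uint64_t pte_bytes(uint64_t reserved_bytes, uint64_t page_size,
                   uint64_t entry_size)
{
    assert(page_size != 0);
    uint64_t pages = (reserved_bytes + page_size - 1) / page_size;
    return pages * entry_size;
}
```

For 1 TB (`1ULL << 40`) with 4096-byte pages and 8-byte entries this gives 268,435,456 pages and 2 GB of entries, as stated above. For the 2 GB buffer from the first post the same formula gives only 4 MB.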

But, on topic, using the section definition to reserve large areas of memory is probably not the best plan IMO. If the system has insufficient memory available then the OS generates a generic error saying the program cannot be loaded and makes it look like your code is broken. So perhaps it is better to include the extra handful of instructions needed to reserve the memory at runtime; that way you can show the user a nice application-specific error in case of any failure.



comrade 26 Sep 2014, 09:11
comrade wrote:
Pavel Lebedinsky wrote:

A few more points:
1. Reserved memory does contribute to commit charge, because the memory manager charges commit for pagetable space necessary to map the entire reserved range. On 64 bit this can be a significant number (reserving 1 TB of memory will consume approximately 2 GB of commit).
...


This may have been improved in Windows 8+. I would have to check it out, but from a recent experiment (on Windows 8), I did not observe any commitment charge for a 1 TB reserved mapping.


I've done an experiment and this seems to have improved in Windows 8.1 compared with Windows 8. On Windows 8.1, when creating a really large file mapping object (in the multi-GB range), no commit charges are initially observed. (I've measured commit charges by observing PERFORMANCE_INFORMATION.CommitTotal via the GetPerformanceInfo Win32 API.) Once the section is continuously mapped (say, using small views), the commit charges gradually accumulate to what Windows 8 charges immediately after the file mapping object is created.
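
A measurement along these lines might look like the following C sketch. It is not the code used for the experiment above: `commit_delta_bytes` and `commit_pages` are hypothetical helper names and the 1 TB size is just an example. `GetPerformanceInfo`, `PERFORMANCE_INFORMATION` and `CreateFileMappingW` are the real Win32 names, and the program is assumed to be linked with psapi.lib.

```c
#include <assert.h>
#include <stdint.h>

/* CommitTotal is reported in pages, so turn a before/after pair of
   readings into a signed difference in bytes. */
int64_t commit_delta_bytes(uint64_t pages_before, uint64_t pages_after,
                           uint64_t page_size)
{
    assert(page_size != 0);
    return ((int64_t)pages_after - (int64_t)pages_before) * (int64_t)page_size;
}

#ifdef _WIN32   /* Windows-only part; link with psapi.lib */
#include <windows.h>
#include <psapi.h>
#include <stdio.h>

/* Read the system-wide commit total (in pages) and the page size. */
static uint64_t commit_pages(uint64_t *page_size)
{
    PERFORMANCE_INFORMATION pi;
    pi.cb = sizeof(pi);
    if (!GetPerformanceInfo(&pi, sizeof(pi)))
        return 0;
    *page_size = pi.PageSize;
    return pi.CommitTotal;
}

int main(void)
{
    uint64_t page = 4096;
    uint64_t before = commit_pages(&page);

    /* 1 TB pagefile-backed section, reserved but not committed. */
    HANDLE section = CreateFileMappingW(INVALID_HANDLE_VALUE, NULL,
                                        PAGE_READWRITE | SEC_RESERVE,
                                        0x100, 0, NULL);
    if (section == NULL) {
        fprintf(stderr, "CreateFileMapping failed: %lu\n", GetLastError());
        return 1;
    }

    uint64_t after = commit_pages(&page);
    printf("commit delta: %lld bytes\n",
           (long long)commit_delta_bytes(before, after, page));

    CloseHandle(section);
    return 0;
}
#endif
```

CommitTotal is a system-wide counter, so other processes add noise. Repeating the measurement on an otherwise idle machine should give a cleaner number.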

cinolt wrote:
Very informative post and blog post series. Looks like I'll have to learn SEH for the first time :)

The method with on-demand commit is also described here:
http://blogs.msdn.com/b/oldnewthing/archive/2012/02/10/10266256.aspx

Hope this helps.




sinsi 26 Sep 2014, 09:27
Maybe use Address Windowing Extensions? There are quite a few restrictions though (Vista+ might be a killer...)

Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
