flat assembler
Message board for the users of flat assembler.

Simple program taking an extremely long time to complete

magicSqr 06 Aug 2015, 18:03
Hi,

I have some really simple code. It calls CreateFile, then CreateFileMapping with the size set to 1 GB, then MapViewOfFile. I store pairs of values until the 1 GB limit is reached, then call UnmapViewOfFile and CloseHandle twice (once for the mapping, once for the file). The program does this 3 times.

The problem is that it creates, fills and closes the first file in 1 second; sometimes it does the same for the 2nd, and then the 3rd can take up to 5 minutes to complete. Other times the 2nd takes this long and the 3rd even longer. While this is happening, the computer is completely unusable for anything else. It seems like a massive memory leak, but if it is, I can't find it.

Any help would be greatly appreciated.

magicSqr

Code:
format PE64 console 
entry start 

include '%fasminc%\win64ax.inc' 

section '.text' code readable executable 

    start: 
        sub     rsp, 08                     ; required for stack alignment
        
        mov     r12, 2                      ; m
        mov     r13, 1                      ; n
        mov     rcx, 3                      ; number of files to fill
    .createFile:
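        int3                                ; debug breakpoint: remove when running outside a debugger
        push    rcx                         ; save file counter (note: rsp is no longer 16-byte aligned for the invokes below)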
        invoke  CreateFile, fileStr, GENERIC_WRITE + GENERIC_READ, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL
        mov     [mnHnd], rax
        invoke  CreateFileMapping, [mnHnd], NULL, PAGE_READWRITE, dword [MN_ListFileSz + 04], dword [MN_ListFileSz], NULL
        mov     [mnMapHnd], rax
        invoke  MapViewOfFile, [mnMapHnd], FILE_MAP_WRITE, 0, 0, 0
        mov     [mnPtr], rax
        push    [MN_ListFileSz]             
        pop     [MN_ListFileEnd]            ; set MN_ListFileEnd to 4000 0000   1 GB
        add     [MN_ListFileEnd], rax       ; end of space for data is MN_ListFileSz + mapped view.
        mov     rdi, rax                    ; set rdi to start of file map
    .loop:
        mov     [rdi], r12                  ; store m
        mov     [rdi + 08], r13             ; store n
        add     rdi, 16                     ; update rdi by 16 bytes
        cmp     rdi, [MN_ListFileEnd]       ; have we reached end of mapped area?
        jae     .nextFile                   ; if yes then unmap and close handles
    .next_n:
        inc     r13
        inc     r13                         ; next n
        cmp     r13, r12                    ; is n > m?
        jb      .loop                       ; if not, continue
        inc     r12                         ; next m
        mov     r13, 1                      ; n = 1
        bt      r12, 0
        adc     r13, 0                      ; make n opposite parity to m
        jmp     .loop
    .nextFile:
        invoke  UnmapViewOfFile, [mnPtr]
        invoke  CloseHandle, [mnMapHnd]
        invoke  CloseHandle, [mnHnd]
        pop     rcx
        loop    @f                          
        jmp     .finished                   ; all files completed, finish
    @@:
        mov     esi, fileStr + 24
    @@:
        inc     byte [esi]                  ; increase number of file name
        cmp     byte [esi], 0x3A
        jne     .next_n_for_new_file
        mov     byte [esi], "0"
        dec     esi
        jmp     @b
    .next_n_for_new_file:
        inc     r13
        inc     r13                         ; next n
        cmp     r13, r12                    ; is n > m?
        jb      .createFile                 ; if not, continue
        inc     r12                         ; next m
        mov     r13, 1                      ; n = 1
        bt      r12, 0                      
        adc     r13, 0                      ; make n opposite parity to m
        jmp     .createFile                 ; start a new file
    .finished:
        invoke  ExitProcess, 0 

;************************************************************************************************** 
section '.data' data readable writeable 

align 8

        mnHnd           dq ?
        mnMapHnd        dq ?
        mnPtr           dq ?
        MN_ListFileSz   dq 1024*1024*1024
        MN_ListFileEnd  dq ?
        fileStr         db "e:\mnList\myMN_List000000.dat", 0

;************************************************************************************************** 
section '.idata' import data readable writeable 

library kernel32,   'KERNEL32.DLL',\ 
        user32,     'USER32.DLL'
    
include '%fasminc%\api\Kernel32.inc' 
include '%fasminc%\api\User32.inc'
    
l_inc 06 Aug 2015, 22:54
magicSqr
Quote:
The problem is, it creates, fills and closes the first file in 1 second, sometimes it does the same for the 2nd and the 3rd can take up to 5 minutes to complete.

Takes 0.6 seconds for all three for me. :) But...

Quote:
It seems like a massive memory leak but if it is I can't find it.

There's no memory leak, but you are putting extreme stress on the cache manager. As soon as it reaches its working set limits, which might happen after you've written one file, or two, or three, the system starts actively paging the file data in relatively small chunks, and not only the data of the files your code writes but also that of other system components, including kernel and driver data. The result is thrashing.

The file mapping solution is quite inefficient here, so there are several things you could do about the lags you experience. The simplest is to specify FILE_FLAG_SEQUENTIAL_SCAN in the CreateFile call. This should reduce the amount of memory the cache manager uses for the files. The most efficient solution, however, would be to generate the data in chunks of 4 or 8 megabytes in a single region of private virtual memory (VirtualAlloc) of that size and write each chunk with WriteFile to a file opened with FILE_FLAG_NO_BUFFERING+FILE_FLAG_WRITE_THROUGH. You could create the files even faster by running data generation and data storage in parallel with overlapped I/O, in which case you'd need at least 2 regions of private virtual memory of 4/8 MB. That would make the code a bit more complicated, though, and wouldn't gain much, because your data generation takes on the order of a fraction of a percent of the time needed to store the data to the drive, especially if it's not an SSD.
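A minimal sketch of the chunked, unbuffered variant described above might look like the following. It is only an illustration under stated assumptions, not tested code from this thread: it writes a single file rather than three, picks a 4 MB chunk, and assumes that the fasm equates define MEM_COMMIT, MEM_RESERVE, FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH. The names CHUNK_SIZE, FILE_SIZE, CHUNKS and bytesWritten are made up for the example. All state lives in non-volatile registers, so nothing is pushed and the stack stays 16-byte aligned for every invoke.

```asm
format PE64 console
entry start

include '%fasminc%\win64ax.inc'

CHUNK_SIZE = 4*1024*1024                ; assumed chunk size: 4 MB, a multiple of the sector size
FILE_SIZE  = 1024*1024*1024             ; 1 GB per file, as in the original code
CHUNKS     = FILE_SIZE / CHUNK_SIZE     ; 256 WriteFile calls per file

section '.text' code readable executable

    start:
        sub     rsp, 8                  ; align the stack to 16 bytes; no pushes follow
        mov     r12, 2                  ; m
        mov     r13, 1                  ; n
        invoke  VirtualAlloc, NULL, CHUNK_SIZE, MEM_COMMIT + MEM_RESERVE, PAGE_READWRITE
        test    rax, rax
        jz      .finished               ; allocation failed
        mov     rsi, rax                ; rsi = chunk buffer (page aligned, as unbuffered I/O needs)
        ; flags are passed in a register so the value 0A0000000h is not sign-extended
        mov     r15d, FILE_FLAG_NO_BUFFERING + FILE_FLAG_WRITE_THROUGH
        invoke  CreateFile, fileStr, GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, r15, NULL
        cmp     rax, -1                 ; INVALID_HANDLE_VALUE
        je      .finished
        mov     rbx, rax                ; rbx = file handle
        mov     r14, CHUNKS             ; r14 = chunks still to write
    .fillChunk:
        mov     rdi, rsi                ; rdi = write cursor inside the buffer
        lea     r15, [rsi + CHUNK_SIZE] ; r15 = end of the buffer
    .store:
        mov     [rdi], r12              ; store m
        mov     [rdi + 8], r13          ; store n
        add     rdi, 16
        add     r13, 2                  ; next n
        cmp     r13, r12
        jb      .nextPair               ; n < m: keep the current m
        inc     r12                     ; next m
        mov     r13, 1
        bt      r12, 0
        adc     r13, 0                  ; make n opposite parity to m
    .nextPair:
        cmp     rdi, r15
        jb      .store                  ; buffer not full yet
        invoke  WriteFile, rbx, rsi, CHUNK_SIZE, bytesWritten, NULL
        test    eax, eax
        jz      .close                  ; write failed, give up
        dec     r14
        jnz     .fillChunk
    .close:
        invoke  CloseHandle, rbx
    .finished:
        invoke  ExitProcess, 0

section '.data' data readable writeable

        bytesWritten    dd ?
        fileStr         db "e:\mnList\myMN_List000000.dat", 0

section '.idata' import data readable writeable

library kernel32, 'KERNEL32.DLL'

include '%fasminc%\api\Kernel32.inc'
```

With FILE_FLAG_NO_BUFFERING the buffer address and every write size must be multiples of the volume sector size, which a VirtualAlloc region and a 4 MB chunk satisfy. The simpler first option needs no restructuring at all: it amounts to adding FILE_FLAG_SEQUENTIAL_SCAN to the flags argument of the original CreateFile call.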

_________________
Faith is a superposition of knowledge and fallacy
magicSqr 10 Aug 2015, 22:06
Thanks l_inc, I hadn't considered the cache angle. I'll check out which of your methods works best for my problem.

Quote:
Takes 0.6 seconds for all three for me. :) But...

ggrrrrr :evil: lol