flat assembler
Message board for the users of flat assembler.
Open file, read by char and count a specific char
RIxRIpt 20 Jan 2014, 20:24
Quote:
add counter, 48

FASM Docs wrote:
When operand is a data in memory, the address of that data (also any numerical expression, but it may contain registers) should be enclosed in square brackets or preceded by ptr operator. For example instruction mov eax,3 will put the immediate value 3 into the EAX register, instruction mov eax,[7] will put the 32-bit value from the address 7 into EAX and the instruction mov byte [7],3 will put the immediate value 3 into the byte at address 7, it can also be written as mov byte ptr 7,3. To specify which segment register should be used for addressing, segment register name followed by a colon should be put just before the address value (inside the square brackets or after the ptr operator).
http://flatassembler.net/docs.php?article=manual#1.2.1

And here's my implementation of "Open file, read by char and count a specific char" (using the Microsoft C Runtime DLL [msvcrt]):

Code:
format PE CONSOLE
entry main

include 'win32a.inc'

section '.code' code readable writeable executable

proc main
        cinvoke fopen, fileName, fileMode
        mov     [fp], eax
        test    eax, eax
        jz      .end
    .loop:
        cinvoke fgetc, [fp]
        cmp     eax, -1 ;EOF
        je      .eof
        cmp     al, [letter]
        jne     .loop
        inc     [count]
        jmp     .loop
    .eof:
        movzx   eax, [letter]
        cinvoke printf, fmt, eax, fileName, [count]
        cinvoke fclose, [fp]
        cinvoke getch
        xor     eax, eax
    .end:
        ret
endp

section '.data' data readable writeable

fp       dd ?
count    dd ?
fileName db 'file.txt', 0
fileMode db 'r', 0
letter   db 't'
fmt      db 'Number of `%c` in %s: %i', 13, 10, 0

section '.idata' import data readable writeable

library msvcrt, 'msvcrt.dll'
import msvcrt,\
       fopen, 'fopen',\
       fclose, 'fclose',\
       fgetc, 'fgetc',\
       printf, 'printf',\
       getch, '_getch' ;for pause

_________________
Hi =3
Admins, please activate my account "RIscRIpt"

Last edited by RIxRIpt on 20 Jan 2014, 20:39; edited 1 time in total
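For comparison, here is a rough C sketch of the same loop, using the same C runtime calls the listing imports. The helper name count_char_in_stream is made up for illustration and is not from the post. The point it shows is why the assembly compares the full EAX with -1 before looking at AL: fgetc returns an int so that EOF (-1 in msvcrt) can be told apart from a data byte.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the post): counts how often `target`
   occurs in an already opened stream, reading with fgetc until EOF. */
long count_char_in_stream(FILE *fp, char target)
{
    assert(fp != NULL);

    long count = 0;
    int c;                      /* int, not char: it must be able to hold EOF */

    while ((c = fgetc(fp)) != EOF) {
        /* Compare only the low byte, like `cmp al, [letter]` does. */
        if ((unsigned char)c == (unsigned char)target)
            ++count;
    }
    return count;
}
```

A caller would open the file with fopen(fileName, "r"), pass the stream in, and fclose it afterwards, as the assembly version does.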
Roman 20 Jan 2014, 20:29
Jakubs11
Your example reads 1 byte from the file. If the byte in the buffer is the text symbol you are counting, do counter + 1, and keep reading the file in a loop. After the whole file has been read, add 48 to the counter (48 is the character code of the digit 0) and print the number stored in memory at counter. counter is a memory address. Do you understand Russian?
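Adding 48 (the ASCII code of '0') only gives a printable digit while the count is between 0 and 9. A minimal sketch of the general case, peeling off decimal digits by repeated division, could look like this in C. The function name and the buffer-size contract are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes `value` as NUL-terminated decimal ASCII text
   into `out` and returns the number of digits written.
   `out_size` must be at least 11 (10 digits for 2^32 - 1 plus the NUL). */
size_t u32_to_decimal(uint32_t value, char *out, size_t out_size)
{
    char tmp[10];
    size_t n = 0;

    assert(out != NULL && out_size >= 11);

    do {
        tmp[n++] = (char)('0' + value % 10);   /* '0' is 48 in ASCII */
        value /= 10;
    } while (value != 0);

    /* The digits were produced least significant first, so reverse them. */
    for (size_t i = 0; i < n; ++i)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}
```

In the msvcrt-based program above, printf with %i already does this conversion, so the manual version is only needed when no formatting routine is available.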
Jakubs11 20 Jan 2014, 21:00
Roman wrote:
No, sorry, I don't speak Russian. RIxRIpt, thank you for your implementation. It helped me understand my errors. Best regards.
AsmGuru62 21 Jan 2014, 15:10
And here it is made into a function with parameters:
Code:
;
; COUNT BYTES IN FILE
;
format PE GUI 4.0
entry start
stack 4000h, 4000h

include 'Win32W.Inc'

; ---------------------------------------------------------------------------
section '.data' data readable writeable

glb_FilePath    db 'C:\Temp\MyFile.txt',0   ; <-- put your test file name in here
glb_BufMsg      rb 80
glb_FmtMsg      db 'There are %d characters found in file.',0
glb_Title       db 'Counting Characters...',0

; ---------------------------------------------------------------------------
section '.code' code readable executable
; ---------------------------------------------------------------------------

virtual at 0
loc1:
    .Buffer         db ?    ; Loaded ANSI character from file
    .FindMe         db ?    ; Character to look for
    .Padding        rb 2    ; For alignment (stack must be aligned to DWORD)
    .CharsLoaded    dd ?    ; Can be 0 (if file ended) or 1 (next character loaded)
    .CountChars     dd ?    ; Character counter
    .size = $
end virtual

align 16
TFile_CountChars:
; ---------------------------------------------------------------------------
; INPUT:
;     ESI = ANSI file name
;     AL  = character code to count
; OUTPUT:
;     EAX = count of characters in file
; ---------------------------------------------------------------------------
    push    ebx esi edi ebp
    ;
    ; Save AL for now (because CreateFile will destroy it)
    ;
    mov     edi, eax
    ;
    ; Open file for reading
    ;
    invoke  CreateFileA, esi, GENERIC_READ, 0, 0, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, 0
    cmp     eax, INVALID_HANDLE_VALUE
    je      .no_file
    ;
    ; File opened OK
    ;
    mov     ebx, eax            ; store file handle into EBX
    ;
    ; At this point some local variables needed, so the
    ; small structure 'loc1' is allocated on stack and
    ; EBP is pointed to this structure.
    ;
    sub     esp, loc1.size
    mov     ebp, esp
    ;
    ; Two buffers are needed for ReadFile:
    ; - buffer into which character is loaded (ESI)
    ; - buffer into which # of bytes is stored (EDI)
    ;
    mov     esi, ebp
    lea     eax, [ebp + loc1.CharsLoaded]
    xchg    eax, edi
    mov     [ebp + loc1.FindMe], al
    and     [ebp + loc1.CountChars], 0      ; Set COUNTER=0
    ;
    ; In the loop read characters from file (1-by-1)
    ;
.read_char:
    invoke  ReadFile, ebx, esi, 1, edi, 0
    ;
    ; See if we got the character
    ;
    mov     ecx, [edi]
    jecxz   .no_more_bytes
    ;
    ; See if the character at ESI is the one we need
    ;
    mov     al, [esi]
    ;
    ; ECX = 1 (if AL is matching the ANSI code parameter) or 0 (if no match)
    ;
    xor     ecx, ecx
    cmp     al, [ebp + loc1.FindMe]
    sete    cl
    ;
    ; Add ECX to the counter
    ;
    add     [ebp + loc1.CountChars], ecx
    jmp     .read_char

.no_more_bytes:
    invoke  CloseHandle, ebx                ; Close file
    mov     eax, [ebp + loc1.CountChars]    ; EAX = return value
    add     esp, loc1.size                  ; 'Forget' local variables
    jmp     .done

.no_file:
    xor     eax, eax                        ; Return zero

.done:
    pop     ebp edi esi ebx
    ret

; ---------------------------------------------------------------------------
; PROGRAM ENTRY POINT
; ---------------------------------------------------------------------------
align 16
start:
    ;
    ; A small test
    ;
    mov     al, 't'
    mov     esi, glb_FilePath
    call    TFile_CountChars
    ;
    ; Show the counter in a message box
    ;
    mov     edi, glb_BufMsg
    cinvoke wsprintfA, edi, glb_FmtMsg, eax
    invoke  MessageBoxA, 0, edi, glb_Title, MB_ICONINFORMATION
    ;
    ; Quit
    ;
    invoke  ExitProcess, 0

; ---------------------------------------------------------------------------
section '.idata' import data readable writeable

library kernel32,'KERNEL32.DLL',user32,'USER32.DLL',gdi32,'GDI32.DLL'

include 'API\Kernel32.Inc'
include 'API\User32.Inc'
include 'API\Gdi32.Inc'
m3ntal 21 Jan 2014, 16:45
It's best to read the entire file once into memory and then iterate through it. Reading individual bytes is strongly discouraged: it could take seconds to load 4MB worth of files (4,194,304 disk reads versus one). Optimized C/C++ compilers will use memory I/O. Pseudo:

Code:
ReadFile(file.h, p, file.size, tmp.rw, 0)

Code:
function count.file.c, file, c
  locals n
  try load.text file
  get n=text.count.c r0, c
  flush
endf n
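The "read once, then iterate in memory" idea can be sketched in portable C roughly as follows. This is only an illustration: the function name is hypothetical, and it assumes a seekable stream opened in binary mode whose size fits in memory and can be found with fseek/ftell.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: loads the whole stream with a single fread request,
   then counts `target` by scanning the buffer in memory.
   Returns the count, or -1 on error. */
long count_char_whole_file(FILE *fp, char target)
{
    assert(fp != NULL);

    if (fseek(fp, 0, SEEK_END) != 0)
        return -1;
    long size = ftell(fp);
    if (size < 0)
        return -1;
    rewind(fp);
    if (size == 0)
        return 0;

    unsigned char *buf = malloc((size_t)size);
    if (buf == NULL)
        return -1;

    size_t got = fread(buf, 1, (size_t)size, fp);   /* one read request */

    long count = 0;
    for (size_t i = 0; i < got; ++i)                /* pure memory scan */
        if (buf[i] == (unsigned char)target)
            ++count;

    free(buf);
    return count;
}
```

The trade-off is memory: the buffer is as large as the file, which is fine for a 4MB example but not for arbitrarily large inputs.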
RIxRIpt 21 Jan 2014, 18:06
m3ntal wrote:
It's best to read the entire file once into memory then iterate through it. Reading individual bytes is strongly discouraged, it could take seconds to load 4MB worth of files (4,194,304 disk reads versus one). Optimized C/C++ compilers will use memory I/O.

I don't think you would read an entire 16GB file at once. I guess you wanted to suggest reading in blocks (for example, 4KB). By the way, msvcrt's fgetc uses its own buffer with a size of 4096 bytes (at least on my system, proof).
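Reading in blocks could be sketched like this in C. The 4096-byte block size simply mirrors the 4KB figure mentioned above, and the function name is invented for the example.

```c
#include <assert.h>
#include <stdio.h>

enum { BLOCK_SIZE = 4096 };     /* assumed block size, matching the 4KB suggestion */

/* Hypothetical helper: counts `target` while reading the stream in
   fixed-size blocks, so memory use stays constant for any file size. */
long count_char_blockwise(FILE *fp, char target)
{
    assert(fp != NULL);

    unsigned char block[BLOCK_SIZE];
    long count = 0;
    size_t got;

    /* fread returns fewer than BLOCK_SIZE bytes on the last block
       and 0 once nothing is left. */
    while ((got = fread(block, 1, sizeof block, fp)) > 0) {
        for (size_t i = 0; i < got; ++i)
            if (block[i] == (unsigned char)target)
                ++count;
    }
    return count;
}
```

This keeps the number of read requests at roughly file size divided by 4096 without needing a buffer as large as the file.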
m3ntal 21 Jan 2014, 20:16
Quote:
I don't think you would read the entire 16GB file at once
revolution 21 Jan 2014, 20:31
m3ntal wrote:
4MB (Megabytes, 4,194,304 bytes) was the example I used. Millions of physical disk reads (only need one) = Extreme hard drive thrashing.
m3ntal 21 Jan 2014, 20:59
Quote:
a user would have to go to extreme lengths to force it to perform poorly
revolution 21 Jan 2014, 21:06
m3ntal wrote:
It would be a good test to disable caching (which isn't actually easy to do, BTW) and note just how much extra time it takes to do 4M individual physical HDD read I/O ops. I'd guess it would take a very long time.
upsurt 21 Jan 2014, 22:58
RIxRIpt wrote:
And here's my implementation of "Open file, read by char and count a specific char" (using Microsoft C Runtime DLL [msvcrt])

Great, thank you! How about searching for a word instead of a char?
neville 22 Jan 2014, 01:49
revolution wrote:
For almost all users Windows will have caching enabled .... I find Windows to be very robust and fast when reading and writing files, and a user would have to go to extreme lengths to force it to perform poorly. ....

So Windows is more than capable of Extreme Hard Drive Thrashing, which I have personally witnessed on many occasions (thankfully always on other people's machines!)

Meanwhile, enjoy your dance on the head of a pin.
_________________
FAMOS - the first memory operating system
Frank 22 Jan 2014, 03:14
neville wrote:
Imo it is misleading to suggest that Windows' disk caching is provided as a "feature". Without it Windows' performance would be even more canine

We heard from "m3ntal" (the former "uart777", if I understood this right) that reading a 4MB file BYTE FOR BYTE takes 5 seconds under Windows 7. You seem to claim that disk caching in Windows serves to hide some kind of deficiency in the operating system, and that reading a 4MB file BYTE FOR BYTE can be done at the same speed (or faster) even without disk caching in other operating systems.

Please provide evidence. For example, how long does it take to read 4MB BYTE FOR BYTE in the operating system FAMOS that you advertise in your signature? From your message I understand that hobbyist operating systems such as FAMOS achieve the same performance or better (4MB in 5 seconds) WITHOUT disk caching.
m3ntal 22 Jan 2014, 03:29
Sorry for kind of changing the subject, but it's true that ReadFile can take noticeable time. Consider that one 1366x768x32 image = 4,196,352 bytes! (1366*768*4)

Jakubs11: .while 1 is not valid. Here is a forever macro that creates an infinite loop:

Code:
macro forever {
  local ..start, ..next, ..end
  ?START equ ..start
  ?NEXT equ ..next
  ?END equ ..end
  ?START:
}

macro endfv {
  ?NEXT:
  jmp ?START
  ?END:
  restore ?START, ?NEXT, ?END
}

Code:
for (;;) {} /* nasty hacks! */
while (1) {}
Frank 22 Jan 2014, 03:43
upsurt wrote:
How about doing your homework assignments yourself, rather than asking others to do them for you?
Melissa 22 Jan 2014, 05:32
On Linux a file can be read directly from disk, but only in multiples of 512 bytes. I guess that is because a disk read operation transfers at least 512 bytes. Reading 8192 blocks of 512 bytes directly from disk takes about half a second on my machine.
Frank 22 Jan 2014, 11:48
So your 8192 read accesses to the disk (sector for sector, 512 bytes at a time) take half a second. Reading a 4M file BYTE FOR BYTE (without caching the sectors!) would mean 4M read accesses to the disk. By extension, that would take half a second X 512 = 256 seconds on your machine.
|
Melissa 22 Jan 2014, 12:42
Yes, 4M disk reads take 4m37s.
|
AsmGuru62 22 Jan 2014, 14:32
It is obvious that reading a file byte by byte is much slower than reading it in larger pieces.
I just showed how the original post CAN BE done as a function. I had no intention of optimizing it in any way. Also, it is possible that the ReadFile() API makes some internal checks, like making sure that the file handle parameter is a valid handle. That will happen ~4M times in this case, so yeah, it will be a slow way to read a file.

As for looking for a word in a file, it is more complex. If you are reading 1 byte at a time as in the original post, you need to keep incrementing a count of matching characters and clear it for every byte which is not a match. Here is pseudo-code:

Code:
long nMatches=0, foundMatches, ofsFile=-1;
char c;
char word[8];

strcpy (word, "text"); // Looking for "text" in file
foundMatches = strlen (word);

while (1)
{
    c = LoadByteFromFile();
    if (no bytes left in file) break;
    //
    // 'c' is a next character from file
    //
    ++ofsFile;
    if (c == word [nMatches])
    {
        ++nMatches;
        if (nMatches == foundMatches)
        {
            long ofsWord = ofsFile - (foundMatches-1);
            //
            // Word has been found at offset 'ofsWord' in file!
            //
            nMatches=0; // start over for the next occurrence
        }
    }
    else
    {
        if ((nMatches != 0) && (c == word [0]))
        {
            nMatches=1;
            continue;
        }
        nMatches=0;
    }
}

This pseudo-code was not tested; it is basically a brute-force scan of the file. I may be missing something, but you should get the idea.

Last edited by AsmGuru62 on 23 Jan 2014, 13:41; edited 1 time in total
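One case a simple "restart at the first letter" scan can miss is a word whose beginning repeats, such as "aab" inside "aaab". A possible way around that, while still reading one byte at a time and never seeking back, is the Knuth-Morris-Pratt failure table. This is a swapped-in technique, not the method from the post, and the names and the 64-byte word limit are assumptions for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { MAX_WORD = 64 };         /* assumed upper bound on the word length */

/* Hypothetical helper: counts occurrences of `word` in the stream
   (overlapping ones included). Returns -1 if the word is empty or too long. */
long count_word_in_stream(FILE *fp, const char *word)
{
    assert(fp != NULL && word != NULL);

    size_t len = strlen(word);
    if (len == 0 || len > MAX_WORD)
        return -1;

    /* fail[i] = length of the longest proper prefix of word[0..i]
       that is also a suffix of word[0..i]. */
    size_t fail[MAX_WORD];
    fail[0] = 0;
    for (size_t i = 1, k = 0; i < len; ++i) {
        while (k > 0 && word[i] != word[k])
            k = fail[k - 1];
        if (word[i] == word[k])
            ++k;
        fail[i] = k;
    }

    long found = 0;
    size_t matched = 0;         /* how many leading characters currently match */
    int c;

    while ((c = fgetc(fp)) != EOF) {
        /* On a mismatch, fall back to the longest prefix that still fits. */
        while (matched > 0 && (unsigned char)word[matched] != (unsigned char)c)
            matched = fail[matched - 1];
        if ((unsigned char)word[matched] == (unsigned char)c)
            ++matched;
        if (matched == len) {
            ++found;
            matched = fail[len - 1];    /* keep going, allowing overlaps */
        }
    }
    return found;
}
```

For a word like "text", where no proper prefix repeats inside the word except the single leading 't', this behaves much like the brute-force idea; the table only matters for words with repeated prefixes.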
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.