flat assembler
Message board for the users of flat assembler.

Decommenter / Parser issues: where did those NULs come from?

fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 16 May 2014, 07:03
> revolution wrote:
> > fasmnewbie wrote:
> > I am just testing some features of int 21h because windoze won't allow me access to some BIOS disk service - for obvious reasons of course
> If you can access things through NTVDM (the DOS machine) then you can also do it with Windows API calls. NTVDM uses Windows to do its stuff.

Took me two hours last weekend just to figure out how CreateFile works. I wonder how those C++ programmers can even read MSDN and those uppercase + lowercase switches. I almost cried just by looking at the docs.
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 16 May 2014, 07:08
> revolution wrote:
> > fasmnewbie wrote:
> > Any idea how to fix it? This is my first time engaging this kind of problem and using int 21h's file service. Show me some macro skill :D
> Yes I know how to fix it. Use a proper line parser. But this is not a task suited to macros.

Line parser?? Hmmm... that sounds cryptic enough. C'mon revolution. I know you have thousands of macros collecting dust somewhere in your folders. It's about time to show some love to needy people :lol:
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20309
Location: In your JS exploiting you and your system
revolution 16 May 2014, 07:26
I have, and use, very few macros. But there are existing parsers on this board. If you just want ready made code then you can probably find it here somewhere.
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 16 May 2014, 07:40
> revolution wrote:
> I have, and use, very few macros. But there are existing parsers on this board. If you just want ready made code then you can probably find it here somewhere.

ehmmm (still googling what does a line parser mean).
JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 16 May 2014, 08:54
> fasmnewbie wrote:
> > revolution wrote:
> > I have, and use, very few macros. But there are existing parsers on this board. If you just want ready made code then you can probably find it here somewhere.
> ehmmm (still googling what does a line parser mean).

It means scanning the line, working out where each quoted text begins and ends, and ignoring any semicolons inside the quotes.

_________________
Tox ID: 48C0321ADDB2FE5F644BB5E3D58B0D58C35E5BCBC81D7CD333633FEDF1047914A534256478D9
AsmGuru62



Joined: 28 Jan 2004
Posts: 1619
Location: Toronto, Canada
AsmGuru62 16 May 2014, 14:36
ummm... I'll just risk it and ask: why exactly remove the comments from the source?
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 17 May 2014, 05:15
Update: blank lines are now deleted as well.
Both the test file and the expected output are shown below for comparison.

Code:
format mz
include 'dbg16.inc'
entry start:main
SIZE = 814  ;Size of source file in bytes

segment info
targetf db "newf.txt",0     ;new file with compacted code (.asm also works)
sourcef db "test.asm",0     ;file to be stripped of comments
buff db SIZE dup(?)         ;buffer read from test.asm
pack db SIZE dup(?)         ;buffer of compacted output for newf.txt
limit dw ?

segment start
;FIRST: Remove side and line comments
main:   dsinit@ info
        fopenr@ sourcef,SIZE,buff
begin:  xor si,si
        xor di,di
next:   mov al,[buff+si]
        cmp al,';'
            je pck
        cmp al,09h  ;if ';' preceded by TAB
            je tst
ok:     mov byte[pack+di],al
        cmp si,SIZE
            je done
        inc si
        inc di
        jmp next
tst:    cmp byte[buff+si+1],';'
            jne ok
pck:    inc si
        mov al,[buff+si]
        cmp al,0dh
            je ok
        cmp al,0ah
            je ok
        jmp pck
done:   mov word[limit],di

;SECOND - Delete blank lines
begin1: xor si,si
        xor di,di
next1:  mov al,[pack+si]
        cmp al,0dh
            je pck1
ok1:    mov byte[buff+di],al ;re-use buff
        inc si
        inc di
        cmp si,SIZE
            je quit
        jmp next1
pck1:   cmp [pack+si+2],0dh
            jne ok1
        add si,2
        sub word[limit],2
        cmp si,SIZE
            je quit
        jmp next1
quit:   ;prtdec@ [limit] ;updated size
        fnew@ targetf
        fopenw@ targetf,[limit],buff
        exitp@     


Only 2 problems remaining.

1. Just like revolution said (Quoted ';')
2. I have no idea how to deal with files larger than 64 KB.

NOTE: fnew@, fopenr@ and fopenw@ are simply calls to DOS INT 21H services (for file processing - create, open/read and open/write respectively).

Test file: "test.asm" (combining both NASM and MASM code, with comments, 814 bytes)

Code:
section .bss
section .data

  string: db "some string",0
  string_l: equ $-string

  N dd 6 ; might use a more meaningful name,,,

section .text

global _start

_start:

mov esi, [N] ; or get it some other way

mov eax, 1 ; sys_exit
mov ebx, 0 ; claim "no error"
int 80h

include \masm32\include\masm32rt.inc

.data ; initialised variables
MyAppName db "Masm32:", 0
MyReal8 REAL8 123.456

.data? ; non-initialised (i.e. zeroed) variables
MyDword dd ?

.code

;This is a test line
;Another test line



start:
  invoke MessageBox, 0, chr$("A box, wow!"), addr MyAppName, MB_OK
  mov eax, 123  ; just an example  launch OllyDbg to see it in action
  exit
end start

mov eax, 4 ; sys_write
mov ebx, 1 ; sdtout
lea ecx, [string + esi]
mov edx, 1 ; just one, please
int 80h     



The output file ("newf.txt") after removing comments and compacting blank lines (493 bytes):

Code:
section .bss
section .data
  string: db "some string",0
  string_l: equ $-string
  N dd 6 
section .text
global _start
_start:
mov esi, [N] 
mov eax, 1 
mov ebx, 0 
int 80h
include \masm32\include\masm32rt.inc
.data 
MyAppName db "Masm32:", 0
MyReal8 REAL8 123.456
.data? 
MyDword dd ?
.code
start:
  invoke MessageBox, 0, chr$("A box, wow!"), addr MyAppName, MB_OK
  mov eax, 123
  exit
end start
mov eax, 4 
mov ebx, 1 
lea ecx, [string + esi]
mov edx, 1 
int 80h    


Waiting for revolution for line parsing :D
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 17 May 2014, 05:20
@AsmGuru62
Maybe for batch processing or restructuring your internal documentation. It can save you lots and lots of keystrokes, particularly if you need to delete side or same-line comments.

What risk? :roll:
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 17 May 2014, 05:24
Any improvement is welcome.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20309
Location: In your JS exploiting you and your system
revolution 17 May 2014, 06:13
> fasmnewbie wrote:
> Waiting for revolution for line parsing

I hope you aren't waiting for me to write your code? If so you will be waiting a while.
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 17 May 2014, 07:02
@revolution, been waiting since last week. Btw, this isn't 'my' code. It's for sharing. Your input is invaluable.

Just for the sake of newbies, here is the same code without external macros:

Code:
org 100h
SIZE=814 ;Size of source file in bytes
main:   push buff
        push SIZE
        push sourcef
        call fopenr

        call Start

        push targetf
        call fnew

        push buff
        push [limit]
        push targetf
        call fopenw

        int 20h
;--------------------------
targetf db "newf.txt",0
sourcef db "test.asm",0
buff db SIZE dup(?)
pack db SIZE dup(?)
limit dw ?
;--------------------------
Start:
;--------------------------
        xor si,si
        xor di,di
next:   mov al,[buff+si]
        cmp al,';'
            je pck
        cmp al,09h
            je tst
ok:     mov byte[pack+di],al
        cmp si,SIZE
            je done
        inc si
        inc di
        jmp next
tst:    cmp byte[buff+si+1],';'
            jne ok
pck:    inc si
        mov al,[buff+si]
        cmp al,0dh
            je ok
        cmp al,0ah
            je ok
        jmp pck
done:   mov word[limit],di
        xor si,si
        xor di,di
next1:  mov al,[pack+si]
        cmp al,0dh
            je pck1
ok1:    mov byte[buff+di],al
        inc si
        inc di
        cmp si,SIZE
            je quit
        jmp next1
pck1:   cmp [pack+si+2],0dh
            jne ok1
        add si,2
        sub word[limit],2
        cmp si,SIZE
            je quit
        jmp next1
quit:   ret
;--------------------------
fnew:
;--------------------------
        push bp
        mov bp,sp
        mov dx,[bp+4]
        mov ah,3ch
        xor cx,cx       ;CX = file attributes (0 = normal file)
        int 21h
        pop bp
        ret
;--------------------------
fopenr:
;--------------------------
        push bp
        mov bp,sp
        mov dx,[bp+4]   ;file to open
        mov al,0        ;read
        mov ah,3dh      ;Function to open file
        int 21h         ;Open file. File handle in AX

        mov bx,ax       ;file handle
        mov cx,[bp+6]   ;number of bytes to read
        mov dx,[bp+8]   ;buffer of data to keep the bytes
        mov ah,3fh      ;function to read
        int 21h         ;Read file

        mov ah,3eh      ;close handle
        int 21h
        pop bp
        ret
;--------------------------
fopenw:
;--------------------------
        push bp
        mov bp,sp
        mov dx,[bp+4]
        mov al,2        ;read and write
        mov ah,3dh      ;Function to open file
        int 21h         ;Open file. File handle in AX

        mov bx,ax       ;file handle
        mov cx,[bp+6]   ;number of bytes to write
        mov dx,[bp+8]   ;address of data to copy from
        mov ah,40h      ;write function
        int 21h         ;Write to file

        mov ah,3eh      ;release handle
        int 21h
        pop bp
        ret    


Hope this would be useful.
sid123



Joined: 30 Jul 2013
Posts: 339
Location: Asia, Singapore
sid123 17 May 2014, 09:49
> fasmnewbie wrote:
> > revolution wrote:
> > Also what happens with this line?
> > Code:
> > text: db 'Here is some text; this is NOT a comment.',13,10,0 ;this is a comment
>
> Any idea how to fix it? This is my first time engaging this kind of problem and using int 21h's file service. Show me some macro skill :D

A worse case:
Code:
db 'This is NOT a comment', CR, LF, NULL ; These 'macros' (Note the quotation marks) aren't defined ;;
    

_________________
"Those who can make you believe in absurdities can make you commit atrocities" -- Voltaire https://github.com/Benderx2/R3X
XD
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 18 May 2014, 05:21
I rearranged the code so it is more manageable and added one more feature: deleting tabs and spaces so that the new file is left-aligned. You can disable it if you don't want it.

Code:
org 100h
include 'dbg16.inc';'proc16.inc'
main:
        push buff
        push SIZE
        push sourcef
        call read_file

        stdcall strcpy,pack,buff,SIZE
        mov [limit],SIZE

        ;These 3 passes can be enabled or disabled
        ;and arranged in any order
        call del_comments
        call del_lines
        call del_tabspace

        ;After compacting, write to new file (newf.asm)
        push pack
        push [limit]
        push targetf
        call write_file

        mov ah,0
        int 16h
        mov ah,4ch
        int 21h

;--------------------------
SIZE=814 ;Actual size of test.asm
targetf db "newf.asm",0
sourcef db "test.asm",0
buff db SIZE dup(?)
pack db SIZE dup(?)
limit dw 0
;--------------------------

;--------------------------
del_tabspace:  ;delete tabs and spaces
;--------------------------
        xor si,si
        xor di,di
next4:  mov al,[pack+si]
        cmp al,09h
            je pck4
ok4:    mov byte[buff+di],al
        cmp si,[limit]
            je done4
        inc si
        inc di
        jmp next4
pck4:   inc si
        jmp next4
done4:  mov word[limit],di
        stdcall strcpy,pack,buff,[limit]
        xor si,si
        xor di,di
next5:  mov al,[pack+si]
        cmp al,20h
            je tst5
        cmp al,0ah
            je tst51
ok5:    mov byte[buff+di],al
        cmp si,word[limit]
            je done5
        inc si
        inc di
        jmp next5
tst51:  cmp byte[pack+si+1],20h
            jne ok5
        inc si
        mov [pack+si],al
        jmp next5
tst5:   cmp byte[pack+si+1],20h
            jne ok5
pck5:   inc si
        jmp next5
done5:  mov word[limit],di
        stdcall strcpy,pack,buff,[limit]
        ret

;--------------------------
;Delete comments
del_comments:
;--------------------------
        xor si,si
        xor di,di
next:   mov al,[pack+si]
        cmp al,';'
            je pck
        cmp al,09h
            je tst
ok:     mov byte[buff+di],al
        cmp si,[limit]
            je done
        inc si
        inc di
        jmp next
tst:    cmp byte[pack+si+1],';'
            jne ok
pck:    inc si
        cmp si,[limit]
            je done
        mov al,[pack+si]
        cmp al,0dh
            je ok
        cmp al,0ah
            je ok
        jmp pck
done:   mov word[limit],di
        stdcall strcpy,pack,buff,[limit]
        ret

;--------------------------------
del_lines:
;--------------------------------
        xor si,si
        xor di,di
next1:  mov al,[pack+si]
        cmp al,0dh
            je pck1
ok1:    mov byte[buff+di],al
        inc si
        inc di
        cmp si,[limit]
            je quit
        jmp next1
pck1:   cmp [pack+si+2],0dh
            jne ok1
        add si,2
        cmp si,[limit]
            je quit
        jmp next1
quit:   mov [limit],di
        stdcall strcpy,pack,buff,[limit]
        ret

;--------------------------
proc strcpy,dest,source,sz
;--------------------------
        mov si,[dest]
        mov di,[source]
        xor bx,bx
go1:    mov al,byte[di+bx]
        mov byte[si+bx],al
        inc bx
        cmp bx,[sz]
            je done1
        jmp go1
done1:  ret
endp

;--------------------------
read_file:
;--------------------------
        push bp
        mov bp,sp
        mov dx,[bp+4]   ;file to open
        mov al,0        ;read
        mov ah,3dh      ;Function to open file
        int 21h         ;Open file. File handle in AX

        mov bx,ax       ;file handle
        mov cx,[bp+6]   ;number of bytes to read
        mov dx,[bp+8]   ;buffer of data to keep the bytes
        mov ah,3fh      ;function to read
        int 21h         ;Read file

        mov ah,3eh      ;close handle
        int 21h
        pop bp
        ret

;--------------------------
write_file:
;--------------------------
        push bp
        mov bp,sp

        mov dx,[bp+4]   ;Create new file
        mov ah,3ch
        xor cx,cx       ;CX = file attributes (0 = normal file)
        int 21h

        mov dx,[bp+4]
        mov al,2        ;read and write
        mov ah,3dh      ;Function to open file
        int 21h         ;Open file. File handle in AX

        mov bx,ax       ;file handle
        mov cx,[bp+6]   ;number of bytes to write
        mov dx,[bp+8]   ;address of data to copy from
        mov ah,40h      ;write function
        int 21h         ;Write to file

        mov ah,3eh      ;release handle
        int 21h
        pop bp
        ret    


The code is not perfect, but the idea is there. If you find bugs, feel free to improve or correct it.

I bet King Tomasz must have a lot better version for internal FASM use. I just can't find where it is from the source. Even if I did find it, I might not know how to interpret it :lol:

Test file (test.asm, exactly 814 bytes, a random combination of NASM and MASM code). For robustness testing, you can duplicate its content or add more random code. Don't forget to change SIZE accordingly.


Attachment: test.asm (814 bytes)

fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 18 May 2014, 05:31
Is it possible to run this kind of program that accepts command line arguments like

Code:
d:\>strip test.asm newf.asm    


What's the specific terminology for this?
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 18 May 2014, 05:35
revolution?

It's ok, take your time. I am in no hurry anyway.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20309
Location: In your JS exploiting you and your system
revolution 18 May 2014, 09:23
> fasmnewbie wrote:
> Is it possible to run this kind of program that accepts command line arguments like
>
> Code:
> d:\>strip test.asm newf.asm
>
> What's the specific terminology for this?

I'd guess this is called a filter program. It is a simple change: just read the input from STDINPUT instead of a file, and send the output to STDOUTPUT instead of a file.
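
A rough sketch of the filter idea, written in Python rather than fasm to keep it short. The names `run_filter` and `transform` are hypothetical and only illustrate the shape of such a program. In DOS the equivalent would be reading from handle 0 and writing to handle 1 with the same INT 21h read and write functions (3Fh and 40h) that the code in this thread already uses for files.

```python
import sys

def run_filter(transform, src=sys.stdin, dst=sys.stdout):
    """Copy src to dst line by line, passing each line through transform.

    transform is any function that takes one line of text and returns
    the text to emit for it (hypothetical hook for the comment stripper).
    """
    for line in src:
        dst.write(transform(line))
```

Built this way, the program needs no file names at all; the shell does the plumbing, for example `strip < test.asm > newf.asm`.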
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20309
Location: In your JS exploiting you and your system
revolution 18 May 2014, 09:24
> fasmnewbie wrote:
> revolution?
>
> It's ok, take your time. I am in no hurry anyway.

What is your question? You appear to be waiting for something but I have nothing on my to do list that has your name attached.
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 18 May 2014, 12:39
> revolution wrote:
> > fasmnewbie wrote:
> > Is it possible to run this kind of program that accepts command line arguments like
> >
> > Code:
> > d:\>strip test.asm newf.asm
> >
> > What's the specific terminology for this?
>
> I'd guess this is called a filter program. It is a simple change: just read the input from STDINPUT instead of a file, and send the output to STDOUTPUT instead of a file.

No. It is simply called "command line program". hehehe :lol:
fasmnewbie



Joined: 01 Mar 2011
Posts: 555
fasmnewbie 18 May 2014, 12:44
> revolution wrote:
> > fasmnewbie wrote:
> > revolution?
> >
> > It's ok, take your time. I am in no hurry anyway.
>
> What is your question? You appear to be waiting for something but I have nothing on my to do list that has your name attached.

It's okay. I anticipated that long ago. Btw that list is called "pride at stake".

Calm down :D
DOS386



Joined: 08 Dec 2006
Posts: 1900
DOS386 22 May 2014, 08:56
> It is simply called "command line program". hehehe

Right ;)

> I bet King Tomasz must have a lot better version for
> internal FASM use. I just can't find where it is from the source.

PARSER.INC ;)

> 1. Just like revolution said (Quoted ';')

The idea is simple: scan the line until you find a semicolon <;>, EOL, a single quote <'> or a double quote <">. If you find a quote, search for the matching closing quote or EOL; once you find the closing quote, return to the main search.
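
A minimal sketch of that scan, written in Python rather than fasm so the logic is easy to follow. The function name `strip_comment` is hypothetical, and the sketch assumes it is handed one line at a time without its EOL characters and that `;` is the only comment character.

```python
def strip_comment(line):
    """Remove a trailing ';' comment, ignoring semicolons inside quotes."""
    quote = None                      # quote character we are inside, if any
    for i, ch in enumerate(line):
        if quote is not None:
            if ch == quote:           # matching closing quote found
                quote = None          # return to the main search
        elif ch in ("'", '"'):
            quote = ch                # opening quote: look for the same one
        elif ch == ";":
            return line[:i].rstrip()  # comment starts here, drop the rest
    return line                       # reached EOL without finding a comment
```

With this approach the two problem lines posted earlier in the thread keep their quoted semicolon and lose only the real comment, because quotes that appear after the comment starts are never examined. An unterminated quote simply runs to EOL and the line is left untouched.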

> 2. I have no idea on how to deal with files larger than 64K

Easy:

1. Set up a reasonable buffer size (for example 32 KiB) and a line length limit (for example 1 KiB)
2. After every line, check whether you are too close to the 32 KiB limit (for example 30 KiB)
3. If so, write the output buffer, move the not yet processed content to the beginning of the buffer and fill the remaining space from the input file
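
The steps above could be sketched roughly as follows, in Python rather than fasm. This is a simplified variant, not the exact buffer juggling described: instead of moving leftover bytes inside one fixed buffer, it carries the incomplete last line over to the next chunk, which has the same effect. The names `process_in_chunks` and `transform` are hypothetical, and the line length limit is not enforced here.

```python
def process_in_chunks(src, dst, transform, chunk_size=32 * 1024):
    """Stream src to dst, applying transform to every complete line.

    src and dst are binary file-like objects; transform takes and
    returns bytes for a single line (including its line ending).
    """
    carry = b""                          # line cut off by the chunk boundary
    while True:
        chunk = src.read(chunk_size)
        if not chunk:                    # end of input
            break
        data = carry + chunk
        cut = data.rfind(b"\n") + 1      # end of the last complete line
        complete, carry = data[:cut], data[cut:]
        for line in complete.splitlines(keepends=True):
            dst.write(transform(line))
    if carry:                            # last line had no trailing newline
        dst.write(transform(carry))
```

Because only one chunk plus one partial line is held in memory at a time, the input file size is no longer tied to the buffer size.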

Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.