flat assembler
Message board for the users of flat assembler.
Grom PE



Joined: 13 Mar 2008
Posts: 102
Location: i@grompe.org.ru
fasmg: output multiple files by writing uncompressed zip
fasmg can have multiple input files, but only one output file. Remembering this discussion, I decided to remove this limitation... by writing an uncompressed zip file!


Code:
; writing ZIP format for fasmg by Grom PE
; allows outputting multiple files in a single (uncompressed) zip file

ZIP::
; todo: calculate crc32 only once instead of on every pass (is that possible?)
; todo: sandbox the contents in add2zip..endadd2zip so it's possible to use org inside and other complex formats

virtual at 0
  crc32table::
  repeat 256
    r = %-1
    repeat 8
      r = r shr 1 xor (0xEDB88320 * r and 1)
    end repeat
    dd r
  end repeat
end virtual

namespace ZIP
  VersionToExtract := 20
  VersionMadeBy := 63
  FILE_INDEX = 0
  HAS_FILES = 0
  FILE_OFFSET = $%
end namespace

macro endadd2zip
  namespace ZIP
    FILE_SIZE = $% - FILE_OFFSET

    if HAS_FILES
      match e,entry
        c = 0xffffffff
        repeat FILE_SIZE, a:0
          ; fixme: fragile, breaks if "org" is used inside the region
          load b:byte from e.Data:a
          load t dword from crc32table:(b xor (c and 0xFF))*4
          c = c shr 8 xor t
        end repeat
        e.Crc32 = c xor 0xffffffff
        e.CompressedSize = FILE_SIZE
        e.UncompressedSize = FILE_SIZE
      end match
      FILE_INDEX = FILE_INDEX + 1
    end if
  end namespace
end macro

macro add2zip name*
  namespace ZIP
    local e
    endadd2zip
    e.GeneralPurpose = 0
    e.CompressionMethod = 0
    e.FileTime = 0
    e.FileDate = 0
    e.FileNameLength = lengthof name
    e.FileAttributes = 0
    e.FileAttributesExt = 0
    e.LocalHeaderOffset = $%%
    e.FileName = name
    entry equ e

    db 'PK',3,4
    dw VersionToExtract
    dw e.GeneralPurpose
    dw e.CompressionMethod
    dw e.FileTime
    dw e.FileDate
    dd e.Crc32
    dd e.CompressedSize
    dd e.UncompressedSize
    dw e.FileNameLength
    dw 0 ; ExtraFieldLength
    db e.FileName
    ;db '' ; ExtraField
    org 0
    e.Data::

    FILE_OFFSET = $%
    HAS_FILES = 1
  end namespace
end macro

postpone
  purge add2zip
  endadd2zip
  namespace ZIP
  org $%%
central:
  irpv e,entry
    db 'PK',1,2
    dw VersionMadeBy
    dw VersionToExtract
    dw e.GeneralPurpose
    dw e.CompressionMethod
    dw e.FileTime
    dw e.FileDate
    dd e.Crc32
    dd e.CompressedSize
    dd e.UncompressedSize
    dw e.FileNameLength
    dw 0 ; ExtraFieldLength
    dw 0 ; CommentLength
    dw 0 ; DiskNumber
    dw e.FileAttributes
    dd e.FileAttributesExt
    dd e.LocalHeaderOffset

    db e.FileName
    ;db '' ; ExtraField
    ;db '' ; Comment
  end irpv
tail:
  db 'PK',5,6
  dw 0 ; Number of this disk
  dw 0 ; Number of the disk with the start of the central directory
  dw NUMBER_OF_FILES
  dw NUMBER_OF_FILES
  dd tail - central
  dd central
  dw 0 ; Comment length
  ;db '' ; Comment

  NUMBER_OF_FILES := FILE_INDEX
  end namespace
end postpone
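
For anyone who wants to cross-check the checksum outside the assembler: below is a minimal Python sketch that mirrors the table generation in the crc32table block and the per-byte update loop in endadd2zip. It is not part of zipwrite.inc; the names make_crc32_table and crc32_bytes are made up for this illustration, and zlib.crc32 serves only as an independent reference.

```python
import zlib

def make_crc32_table():
    """Build the 256-entry table the same way the crc32table block does."""
    table = []
    for n in range(256):
        r = n
        for _ in range(8):
            # Shift right; xor in the reflected polynomial if the low bit was set.
            r = (r >> 1) ^ (0xEDB88320 * (r & 1))
        table.append(r)
    return table

CRC32_TABLE = make_crc32_table()

def crc32_bytes(data):
    """Table-driven CRC-32 with the same update step as the macro loop."""
    c = 0xFFFFFFFF
    for b in data:
        c = (c >> 8) ^ CRC32_TABLE[b ^ (c & 0xFF)]
    return c ^ 0xFFFFFFFF

if __name__ == "__main__":
    sample = b"Hello world!\n"
    # Both values should agree if the table and loop are right.
    print(hex(crc32_bytes(sample)), hex(zlib.crc32(sample)))
```

If the two printed values differ for the bytes of one of your files, the likely suspect is the load address inside the region (see the fixme about "org").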




The macros could be used like so:

Code:
include 'zipwrite.inc'

db 'Some non-zip data that goes in the beginning',10

add2zip 'hello.txt'
db 'Hello world!',10

add2zip 'greetings.txt'
db 'Greetings to all the flat assembler fans!',10
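
To see what layout the macros are aiming for, here is a rough Python sketch that writes the same three structures (local file header, central directory header, end of central directory record) with struct.pack and then reads the result back with the standard zipfile module. The function build_stored_zip and its arguments are hypothetical helpers for this illustration only; the field values (version 20, made by 63, zero time and date) are copied from the macros above.

```python
import io
import struct
import zipfile
import zlib

def build_stored_zip(files, prefix=b""):
    """files: list of (name, data) byte strings; prefix: non-zip bytes placed first."""
    out = bytearray(prefix)
    central = bytearray()
    for name, data in files:
        crc = zlib.crc32(data)
        offset = len(out)  # absolute file offset of this local header
        # Local header: signature, version needed, flags, method, time, date,
        # crc32, compressed size, uncompressed size, name length, extra length.
        out += struct.pack("<4sHHHHHIIIHH", b"PK\x03\x04", 20, 0, 0, 0, 0,
                           crc, len(data), len(data), len(name), 0)
        out += name + data
        # Central header: adds version made by, comment length, disk number,
        # internal and external attributes, and the local header offset.
        central += struct.pack("<4sHHHHHHIIIHHHHHII", b"PK\x01\x02", 63, 20,
                               0, 0, 0, 0, crc, len(data), len(data),
                               len(name), 0, 0, 0, 0, 0, offset)
        central += name
    cd_offset = len(out)
    out += central
    # End of central directory: disk numbers, entry counts, size and offset.
    out += struct.pack("<4sHHHHIIH", b"PK\x05\x06", 0, 0,
                       len(files), len(files), len(central), cd_offset, 0)
    return bytes(out)

if __name__ == "__main__":
    blob = build_stored_zip(
        [(b"hello.txt", b"Hello world!\n"),
         (b"greetings.txt", b"Greetings to all the flat assembler fans!\n")],
        prefix=b"Some non-zip data that goes in the beginning\n")
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        print(zf.namelist())
```

Because the offsets are absolute file offsets, the non-zip data in front does not confuse readers that locate the archive from the end record.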




In the future, this could also prove useful for making .jar files for JVM assembly.


Attachment: fasmg_zipwrite.zip (1.41 KB), zip write macros for fasmg

Post 01 Nov 2016, 16:45
Trinitek



Joined: 06 Nov 2011
Posts: 249
Very clever.
Post 01 Nov 2016, 21:17
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 6253
Location: Kraków, Poland
Re: fasmg: output multiple files by writing uncompressed zip

Grom PE wrote:
todo: calculate crc32 only once instead of on every pass (is that possible?)

While I tried to avoid exposing details of the resolving process to the constructions of the language, the calculation of checksums is an example of a place where it is tempting to introduce a way to compute them only in the final pass. Or, to be more precise: to make it possible to mark some code in such a way that it would get assembled only when everything that came before is already correctly resolved. It is easy to add something like that to fasmg, in the form of a built-in variable that could be checked with "if", but I'm still considering the consequences of such a move on the overall design of the language.
Post 02 Nov 2016, 11:11
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 14586
Location: Planet Dirt
I added CRC natively into fasmarm for this reason. Doing a native CRC is much faster than with a macro, and it makes doing it on every pass almost inconsequential in terms of the extra time needed. If the defining parameters are made flexible enough then it can accommodate all the common bit lengths and polynomials.
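
As a rough illustration of what "flexible defining parameters" can mean, here is a small Python sketch of a bitwise CRC for the reflected (LSB-first) family, parameterised by width, polynomial, initial value and final xor. This is not fasmarm's actual interface; the function name and parameter names are assumptions made up for the example.

```python
def crc_reflected(data, width, poly_reflected, init, xor_out):
    """Bitwise CRC for reflected (LSB-first) algorithms.

    poly_reflected is the bit-reversed polynomial, e.g. 0xEDB88320 for
    CRC-32 or 0xA001 for CRC-16/ARC.
    """
    mask = (1 << width) - 1
    crc = init & mask
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; apply the polynomial if that bit was set.
            crc = (crc >> 1) ^ (poly_reflected if crc & 1 else 0)
    return (crc ^ xor_out) & mask

if __name__ == "__main__":
    check = b"123456789"
    print(hex(crc_reflected(check, 32, 0xEDB88320, 0xFFFFFFFF, 0xFFFFFFFF)))
    print(hex(crc_reflected(check, 16, 0xA001, 0x0000, 0x0000)))
```

Non-reflected (MSB-first) variants would need a second code path, which is left out here to keep the sketch short.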
Post 02 Nov 2016, 11:18
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 6253
Location: Kraków, Poland

revolution wrote:
I added CRC natively into fasmarm for this reason. Doing a native CRC is much faster than with a macro, and it makes doing it on every pass almost inconsequential in terms of the extra time needed. If the defining parameters are made flexible enough then it can accommodate all the common bit lengths and polynomials.

This is an obvious thing to do when constructing some actual targeted assembler, like fasm for x86 or fasmarm (after all, fasm computes PE checksums natively), or any potential assembler based on the fasmg engine. But when everything, including the entire output format generation, is processed by macros, the engine cannot help much with specific implementations, at least not in general. Even if it had some helper functions to calculate the CRC of specified data blocks, in other places we would need the PE checksum algorithm, or SHA, or something different altogether. For the output formatters I see it as an "all macro or no macro" choice.

On the other hand, if in the future there are some specialized assemblers based on the fasmg engine, they will most probably be able to assemble all these "all macro" solutions created currently for fasmg, while also having some slick native implementations of some instruction sets and output formats.
Post 02 Nov 2016, 13:22
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 6253
Location: Kraków, Poland
I thought we could try the same thing with the TAR format, since this should be even simpler. But even though this format is simple, it manages to be a kind of mess at the same time.
Here come my macros:

Code:
macro tar_number? value*,length:8
        local d
        repeat length-1
                d = value shr ((length-1-%)*3)
                if d > 0
                        db '0' + d and 111b
                else
                        db 20h
                end if
        end repeat
        db 20h
end macro

macro tar_record? name

        local data,size,checksum_field,checksum,byte

        org 0

        db string name,(100 - lengthof string name) dup 0
        tar_number 10077o       ; file mode
        tar_number 0            ; owner id
        tar_number 0            ; group id
        tar_number size,12      ; file size
        tar_number %t,12        ; last modification time
        checksum_field db 8 dup 20h
        db '0',100 dup 0        ; normal file

        checksum = 0
        repeat $
                load byte : 1 from $-%
                checksum = checksum + byte
        end repeat
        repeat 6
                byte = checksum shr ((6-%)*3)
                store '0' + byte and 111b : 1 at checksum_field + % - 1
        end repeat
        store 0:1 at checksum_field + 7

        db 512-$ dup 0

        org 0
        data = $%
        define $% ($%-data)
        define $%% ($%%-data)

        macro end?.tar_record?
                size = $%
                db (512 - size and 511) dup 0
                restore $%,$%%
                purge end?.tar_record?
        end macro

end macro

And use them like this:

Code:
tar_record 'hello.txt'

        db 'Hello world!',10

end tar_record

tar_record 'greetings.txt'

        db 'Greetings to all the flat assembler fans!',10

end tar_record

db 2*512 dup 0  ; two null records to mark the end of tarball

The macros emulate the "$%" and "$%%" values inside the contained files.
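
To check a generated tarball independently, the header layout used above can be reproduced in a few lines of Python and fed to the standard tarfile module. This is only a sketch under some assumptions: the helper names tar_number, tar_header and tar_archive are made up to echo the macros, values are assumed to fit in their fields, the modification time defaults to 0 instead of %t, and the data padding is written as (-size) % 512 rather than copied from the macro.

```python
import io
import tarfile

def tar_number(value, length=8):
    """Octal digits right-aligned with leading spaces, then a trailing space."""
    digits = format(value, "o") if value > 0 else ""
    return digits.rjust(length - 1).encode("ascii") + b" "

def tar_header(name, size, mtime=0):
    header = bytearray()
    header += name.ljust(100, b"\0")       # file name
    header += tar_number(0o10077)          # file mode, same value as in the macro
    header += tar_number(0)                # owner id
    header += tar_number(0)                # group id
    header += tar_number(size, 12)         # file size
    header += tar_number(mtime, 12)        # last modification time
    header += b" " * 8                     # checksum field counts as spaces
    header += b"0" + b"\0" * 100           # normal file, empty link name
    checksum = sum(header)                 # remaining header bytes are zero
    header[148:156] = b"%06o\0 " % checksum  # six octal digits, NUL, space
    return bytes(header.ljust(512, b"\0"))

def tar_archive(files, mtime=0):
    out = bytearray()
    for name, data in files:
        out += tar_header(name, len(data), mtime)
        out += data + b"\0" * (-len(data) % 512)  # pad data to a whole record
    out += b"\0" * 1024                    # two null records end the tarball
    return bytes(out)

if __name__ == "__main__":
    blob = tar_archive([(b"hello.txt", b"Hello world!\n")])
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:") as tf:
        print(tf.getnames())
```

If tarfile accepts the archive, the checksum and the space-padded octal fields are at least readable by one common implementation.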

They can also be combined with POSTPONE emulation:

Code:
macro tar_number? value*,length:8
        local d
        repeat length-1
                d = value shr ((length-1-%)*3)
                if d > 0
                        db '0' + d and 111b
                else
                        db 20h
                end if
        end repeat
        db 20h
end macro

macro tar_record? name

        local postponed,data,size,checksum_field,checksum,byte

        org 0

        db string name,(100 - lengthof string name) dup 0
        tar_number 10077o       ; file mode
        tar_number 0            ; owner id
        tar_number 0            ; group id
        tar_number size,12      ; file size
        tar_number %t,12        ; last modification time
        checksum_field db 8 dup 20h
        db '0',100 dup 0        ; normal file

        checksum = 0
        repeat $
                load byte : 1 from $-%
                checksum = checksum + byte
        end repeat
        repeat 6
                byte = checksum shr ((6-%)*3)
                store '0' + byte and 111b : 1 at checksum_field + % - 1
        end repeat
        store 0:1 at checksum_field + 7

        db 512-$ dup 0

        org 0
        data = $%
        define $% ($%-data)
        define $%% ($%%-data)

        macro postponed
        end macro

        macro postpone?!
                esc macro postponed
        end macro

        macro end?.postpone?!
                postponed
                esc end macro
        end macro

        macro end?.tar_record?
                postponed
                size = $%
                db (512 - size and 511) dup 0
                restore $%,$%%
                purge postpone?,end?.postpone?,end?.tar_record?
        end macro

end macro

Then it becomes possible to assemble an entire example program as one of the files:

Code:
tar_record 'hello.txt'

        db 'Hello world!',10

end tar_record

tar_record 'win32.exe'

        include 'win32.asm'     ; example from fasmg package

end tar_record

db 2*512 dup 0  ; two null records to mark the end of tarball

Post 22 Nov 2016, 14:22
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 6253
Location: Kraków, Poland
Re: fasmg: output multiple files by writing uncompressed zip

Tomasz Grysztar wrote:

Grom PE wrote:
todo: calculate crc32 only once instead of on every pass (is that possible?)

While I tried to avoid exposing details of the resolving process to the constructions of the language, the calculation of checksums is an example of a place where it is tempting to introduce a way to compute them only in the final pass. Or, to be more precise: to make it possible to mark some code in such a way that it would get assembled only when everything that came before is already correctly resolved. It is easy to add something like that to fasmg, in the form of a built-in variable that could be checked with "if", but I'm still considering the consequences of such a move on the overall design of the language.

I have finally decided on a way to implement this. There is a new variant of the POSTPONE directive that looks like this:

Code:
postpone ?
    ; ...
end postpone

and it postpones the block until the rest of the source has been resolved. If you put the entire computation of a checksum in such a block, it should generally get assembled just once, at the end of assembly.
Post 06 Mar 2017, 14:21
Copyright © 2004-2016, Tomasz Grysztar.