flat assembler
Message board for the users of flat assembler.
data compression theories and implementation
cod3b453 10 Aug 2011, 19:24
A number of algorithms use lookup tables, trees or "dictionaries" of common sequences. The important questions are what you are compressing and how the algorithm handles it. If your data is random, the "compressed" result is likely to be larger than the input; if your data draws on a small set of values, repeats, or has common patterns, it is easier to compress.
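A rough way to see this difference is to run a general-purpose compressor over both kinds of input. The sketch below uses Python's standard `zlib` module; the `ratio` helper, the seed and the sample buffers are illustrative assumptions, not anything from the thread.

```python
import random
import zlib

def ratio(data: bytes) -> float:
    """Compressed size divided by original size (below 1.0 means it shrank)."""
    return len(zlib.compress(data, 9)) / len(data)

# Repetitive input: the same four bytes over and over (4096 bytes total).
repetitive = b"ABCD" * 1024

# Pseudo-random input of the same length; the fixed seed only makes
# the run repeatable.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(4096))

print("repetitive:", ratio(repetitive))
print("noise:     ", ratio(noise))
```

The repetitive buffer should shrink to a small fraction of its size, while the noise buffer should stay at roughly its original size or grow slightly because of the container overhead.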
typedef 10 Aug 2011, 20:45
I haven't come up with any code yet as I was just thinking. The way I thought of it, it would only work on known file formats and is less likely to work on unstructured binary data. I was thinking that if the file has a known format with pre-defined flags, those flags (e.g. a DWORD) can be put in a look-up table, and in the compressed file each such field is replaced with a byte that points to the location of the flag in the look-up table.
Example: take an .EXE file, and let's take the CPU type field. Here's the look-up table for the EXE header CPU type field:
0. i386 ; WORD, this is our CPU type
1. AMD ; WORD
2. etc... ; WORD
On compression, this field is reduced to a byte, 0 in this case. On decompression, the decompressor replaces the byte with the WORD again, thereby rebuilding a valid PE. But then, I'm trying to think how that can be applied to a block of data such as a floating-point number field? Hmm...
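The field substitution described in this post could be sketched roughly as follows. This is only an illustration under stated assumptions: `MACHINE_TABLE`, `pack_field` and `unpack_field` are hypothetical names, the table holds three PE machine values (i386, AMD64, IA64), and the code handles the 2-byte field in isolation rather than parsing a real PE header.

```python
import struct

# Hypothetical look-up table: index -> 16-bit PE machine value.
# 0x014C = i386, 0x8664 = AMD64, 0x0200 = IA64.
MACHINE_TABLE = [0x014C, 0x8664, 0x0200]

def pack_field(word_bytes: bytes) -> bytes:
    """Replace a little-endian WORD with a one-byte index into the table."""
    (value,) = struct.unpack("<H", word_bytes)
    # list.index raises ValueError when the value is not in the table.
    return bytes([MACHINE_TABLE.index(value)])

def unpack_field(index_byte: bytes) -> bytes:
    """Rebuild the original little-endian WORD from the one-byte index."""
    return struct.pack("<H", MACHINE_TABLE[index_byte[0]])

packed = pack_field(b"\x4c\x01")   # i386 as stored in the file
restored = unpack_field(packed)
print(packed, restored)
```

Each substituted field saves a single byte, and a value missing from the table raises `ValueError` here, so a real format would need some kind of escape code for unknown values.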
Madis731 12 Aug 2011, 07:20
A byte is too coarse a unit. That is why most compressors work with bits. The most frequently used fields get the shortest codes, around 2 bits (so they compress the most), and the less often a field is used, the longer its code becomes. A code can even be longer than a DWORD, depending on your dictionary size...
...this is called Huffman coding: http://en.wikipedia.org/wiki/Huffman_coding
7-zip uses multiple compression types on archives which contain, for example, txt, exe, mp3, wav etc.
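To make the variable-length idea concrete, here is a minimal Huffman sketch. It is an illustration only: the function names are made up for this example, codes are kept as strings of "0"/"1" for readability rather than packed into real bits, and the code table itself is not serialized.

```python
import heapq
from collections import Counter

def huffman_codes(data: bytes) -> dict:
    """Map each byte value to a bit string; frequent bytes get shorter codes."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:                      # a lone symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # the two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

def encode(data: bytes, codes: dict) -> str:
    return "".join(codes[b] for b in data)

def decode(bits: str, codes: dict) -> bytes:
    inverse = {code: sym for sym, code in codes.items()}
    out, current = bytearray(), ""
    for bit in bits:
        current += bit
        if current in inverse:              # prefix-free, so first hit is right
            out.append(inverse[current])
            current = ""
    return bytes(out)

sample = b"aaaaaaaabbbbccd"
codes = huffman_codes(sample)
bits = encode(sample, codes)
print(len(bits), "bits instead of", 8 * len(sample))
```

For this sample, `a` occurs 8 times and ends up with a 1-bit code, while the rare `c` and `d` get 3-bit codes, so the 15 input bytes need 25 bits instead of 120 (not counting the table).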
typedef 12 Aug 2011, 18:01
^^Hmm... I read the Wikipedia page. It seems so complicated the first time.
DOS386 17 Aug 2011, 04:12
typedef wrote:
I was thinking of a data compression method. I don't know if it's already been implemented. The idea is to have a byte lookup table and also a known file format lookup table. Would this even work considering that most files are structured (have headers)?

This won't work. Problems:
1. You are limited to known file types
2. You could "optimize" some inefficient headers (like the PE header, or even the silly "This program can't be run in DOS mode, go online and pirate some Windaube, see http://www.best*********4free.net/index.html, dude"), but not the other file data
3. You can compress such headers using ordinary algos too - no need for your "technology"
4. Check out RLE, LZ77, Huffman, LZW84, Deflate (= LZ77+Huffman), and come back when you understand them all
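Of the algorithms named in this list, RLE is the simplest to try first. A minimal sketch, assuming a naive (count, byte) pair format with runs capped at 255; the function names and the format are illustrative and not taken from any particular tool:

```python
def rle_encode(data: bytes) -> bytes:
    """Encode as (run_length, byte) pairs, each run at most 255 long."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)

def rle_decode(packed: bytes) -> bytes:
    """Expand (run_length, byte) pairs back into the original bytes."""
    out = bytearray()
    for k in range(0, len(packed), 2):
        out += bytes([packed[k + 1]]) * packed[k]
    return bytes(out)

# Header-like input: a long zero-filled gap followed by two signature bytes.
block = b"\x00" * 300 + b"MZ"
packed = rle_encode(block)
print(len(block), "->", len(packed))
```

The 302-byte block packs into 8 bytes here, but input without runs doubles in size with this pair format, which is one reason practical compressors combine several techniques.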
typedef 17 Aug 2011, 04:55
DOS386 wrote:
PS: Your weird link did not work either