flat assembler
Message board for the users of flat assembler.

some questions about fas file format

buzzkill



Joined: 15 Mar 2009
Posts: 111
Location: the nether lands
buzzkill
And I'm back with another question :)

When I generate a symbol file for the fasm tool listing.asm and look at the preprocessed source for lines 5-14 (the "ccall" macro), I see the following:
Line 5 is preprocessed as 3B 05 6D 61 63 72 6F 1A etc., which is what I expected, but lines 6-14 all have a zero byte after the initial 0x3B byte, e.g. line 6 is 3B 00 7B 1A etc. I didn't expect this, because the null byte is used as the end-of-line marker and, if I have understood the documentation, is not used for anything else in the preprocessed source.
So is this (the null byte after 0x3B in preprocessed macro lines) intended, or perhaps a bug? I'm asking because I'm writing a program to process this prep.src info, and I noticed it went wrong here because it treated these null bytes as end-of-line. I can of course have it make an exception for a 3B 00 combination, but I'd like to know whether I've misunderstood the specs (wouldn't be the first time :) )
Post 01 Apr 2009, 21:10
Tomasz Grysztar
Assembly Artist


Joined: 16 Jun 2003
Posts: 7721
Location: Kraków, Poland
Tomasz Grysztar
This is simply a zero-length token (inserted there to mark the beginning of the data ignored by the assembler).
Post 01 Apr 2009, 21:13
buzzkill
I thought the 3B byte was the indicator that the rest of the line is ignored? Do these zero-length tokens appear in other places as well, or just in these 3B lines?
Post 01 Apr 2009, 21:22
Tomasz Grysztar
My answer will not be about what you can actually encounter with the current version of fasm, but rather about how this data structure is defined in general. So: first of all, do not think of it as a "byte indicator". The line structure always consists just of tokens, and you should read it token-wise, not byte-wise. Both the 1Ah and 3Bh tokens have exactly the same structure, so (theoretically) either of them can be zero-length as a special case. The 3Bh token marks the place from which all the following tokens (including this one) will be ignored by the parser/assembler. It can be zero-length, but it can also be a "normal" token.
Post 01 Apr 2009, 21:33
buzzkill
Ah, that clears it up for me. Up until now I _had_ thought of the 3B as a byte indicator (like "if you read 3Bh, just skip everything till the first null byte"), and I hadn't considered the possibility of zero-length tokens. As usual, Tomasz, thank you for a quick and clear answer :)
Post 01 Apr 2009, 21:44
Tomasz Grysztar
Well, skipping till the first null byte would be a really bad idea: even without taking zero-length tokens into consideration, you still have some null bytes in the quoted string tokens, because they have a 32-bit length (which usually has the high bytes zeroed).
Post 01 Apr 2009, 21:46
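To illustrate the token-wise reading described above, here is a minimal Python sketch. It goes only by what is said in this thread, not by a full reading of the fas specification, and assumes that: 1Ah and 3Bh tokens are followed by a one-byte length and that many bytes (as in the 3B 05 6D 61 63 72 6F example), a 22h token is followed by a 32-bit length (taken here to be little-endian) and that many bytes, every other non-zero byte is a one-byte token, and a zero byte in token position ends the line. The function name and the sample bytes beyond those quoted in the thread are made up for illustration.

```python
def read_line_tokens(data, pos=0):
    """Read one preprocessed line token by token.

    Returns (tokens, next_pos), where tokens is a list of
    (token_byte, payload) pairs and next_pos points just past the
    terminating zero byte. Hypothetical helper, not part of fasm.
    """
    tokens = []
    while True:
        kind = data[pos]
        pos += 1
        if kind == 0x00:
            # A zero byte in token position is the end-of-line marker.
            return tokens, pos
        if kind in (0x1A, 0x3B):
            # Name token / ignored-data token: one-byte length, then payload.
            # A length of zero gives the zero-length token from the thread.
            length = data[pos]
            pos += 1
            tokens.append((kind, data[pos:pos + length]))
            pos += length
        elif kind == 0x22:
            # Quoted string: 32-bit length (little-endian assumed), then payload.
            length = int.from_bytes(data[pos:pos + 4], "little")
            pos += 4
            tokens.append((kind, data[pos:pos + length]))
            pos += length
        else:
            # Any other byte stands for itself as a one-byte token.
            tokens.append((kind, bytes([kind])))


# "3B 05 6D 61 63 72 6F 1A" is quoted in the thread; the rest is invented.
line5 = bytes.fromhex("3B 05 6D 61 63 72 6F 1A 05 63 63 61 6C 6C 00")
# "3B 00 7B 1A" is quoted in the thread; the rest is invented.
line6 = bytes.fromhex("3B 00 7B 1A 04 70 75 73 68 00")
# An invented quoted-string line: the length dword contains zero bytes.
quoted = bytes.fromhex("22 02 00 00 00 68 69 00")
```

Because each length is consumed as part of its token, neither the zero in 3B 00 nor the zero high bytes of a string length is mistaken for the end of the line.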
buzzkill
Oh yeah, I forgot to mention that :) I did notice that, and I skip four bytes after a 0x22 byte.
*edit* Inside a line that starts with 0x3B of course :)
Post 01 Apr 2009, 21:51
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
LocoDelAssembly
I have a question too :P

Does the fas format currently have, or is there planned, a means to check whether a fas file is the debug info of a given binary?
Post 01 Apr 2009, 21:55
Tomasz Grysztar
It contains pathnames of output and source files, and that's about all there is.
Post 01 Apr 2009, 22:18
buzzkill
The names of the input and output files (that were used to generate the symbol file) are stored in the string table. But there's no way of knowing whether either of those files has been changed after the symbol file was created; that's the downside of having your symbol info in a separate file. If you wanted to keep a link between the binary and the symbol file, you'd have to, e.g., generate a hash of the binary and store it in the symbol file at the time the symbol file is created, and that's not done now (I don't know if Tomasz has plans for this in the future).
Post 01 Apr 2009, 22:20
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
vid
As for sources, how about just saving date/time of last modify of source file(s) in FAS? Wouldn't it be easier?

As for binary, doesn't FAS contain also the binary output, that you can compare to result?
Post 01 Apr 2009, 22:34
LocoDelAssembly
If not enough info is present then I think that something like SHA1 or whatever would be good to have within the fas file.
Post 01 Apr 2009, 23:44
buzzkill
vid wrote:
As for sources, how about just saving date/time of last modify of source file(s) in FAS? Wouldn't it be easier?


But then if you just change the date or time of a src file, they would be considered out of sync. You could even change the file, and then alter the date/time, and they would be out of sync but you'd think they weren't. You'd really need to calculate a hash of the contents to be sure it hasn't changed, just watching the date/time isn't enough.

vid wrote:
As for binary, doesn't FAS contain also the binary output, that you can compare to result?


It contains the assembly dump, but if you assemble a src file to an object file, and then link that with other files/libraries to generate the final binary, then fasm itself doesn't know anything about that binary.
Post 02 Apr 2009, 14:58
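The difference between checking the date/time and checking a hash of the contents can be shown with a short Python sketch. Nothing here is part of the fas format, which per this thread stores no such hash; the function names, the file name and the choice of SHA-1 are assumptions made for illustration only.

```python
import hashlib
import os


def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of a file's contents, read in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def still_matches(path, recorded_digest, algorithm="sha1"):
    """True if the file's current contents hash to the recorded digest."""
    return file_digest(path, algorithm) == recorded_digest


def demo(directory):
    """Change a file's contents, then restore its old timestamps.

    Returns (mtime_same, digest_same) for the modified file.
    """
    path = os.path.join(directory, "example.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"\x90\x90\xc3")
    recorded_digest = file_digest(path)
    recorded = os.stat(path)

    # Alter the contents, then put the original timestamps back.
    with open(path, "wb") as f:
        f.write(b"\x90\xcc\xc3")
    os.utime(path, ns=(recorded.st_atime_ns, recorded.st_mtime_ns))

    mtime_same = os.stat(path).st_mtime_ns == recorded.st_mtime_ns
    digest_same = still_matches(path, recorded_digest)
    return mtime_same, digest_same
```

In demo() the timestamp comparison is fooled by the restored date/time, while the digest comparison still notices that the contents changed.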
buzzkill
LocoDelAssembly wrote:
If not enough info is present then I think that something like SHA1 or whatever would be good to have within the fas file.


Yes, something like that would be needed then. However, you also don't have these kinds of assurances with source and binary, i.e. you assemble a src file to a binary, then change the src file, and voila: your src and binary are out of sync.
So I don't know how useful it would be to try to force your binary and symbol file to be in sync. If you're the programmer, then it's your job to make sure you're working with the correct versions of the src, object, executable and symbol files. If you ship a binary plus symbol file to a user/customer without the src, you're guaranteed that the files are in sync.
I've never seen a set of compiler+tools that tries to enforce this. Do you have a specific use case for this?
Post 02 Apr 2009, 15:05
LocoDelAssembly
Yes, I was thinking about the possibility of a linker-like tool for ELF executables that would let them import more like PE files do. The executable source would have some specially named symbols so the tool can find and patch them (besides adding the extra segment(s) and header(s)). But before doing this, it must be assured that the binary has not already been patched and is not a completely unmatched one.

I can't find the PDB format spec anywhere :x, but how does the debugger detect when your PDB file doesn't match your binary?
Post 02 Apr 2009, 15:58
buzzkill
I'm not familiar with PDB. Looking at Wikipedia I see "the PDB format is undocumented and proprietary", and Windows-only. I'm on Linux (and not a big fan of "undocumented and proprietary"), so I'm afraid I can't help you there :(
Post 02 Apr 2009, 17:02
LocoDelAssembly
baldr (BTW, are you with us?) said he was trying to do a fas to .pdb converter http://board.flatassembler.net/topic.php?p=85821#85821 , perhaps he was joking? :(

[edit]Mmmh, I have to check this: http://board.flatassembler.net/topic.php?t=9791 [/edit]
Post 02 Apr 2009, 17:31
LocoDelAssembly
OK, I've checked YASM's source, and it seems to generate the MD5 hash of every source file, but I'm not sure about the binary.

That is also explained in the TXT that comes with the source:
Quote:
0x000000F4: source file info
  for each source file:
    4 bytes - offset of filename in source filename string table
    {2 bytes - checksum type/length? (0x0110)
     16 bytes - MD5 checksum of source file} OR
    {2 bytes - no checksum (0)}
    2 bytes - 0 (padding?)
Post 02 Apr 2009, 18:35
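As a reading aid for the record layout quoted above, here is a minimal Python sketch that decodes one "source file info" entry. It is based only on that quoted description (including its question marks), assumes little-endian fields, and is not taken from YASM's code; the function and constant names are hypothetical.

```python
import struct

MD5_CHECKSUM = 0x0110  # "checksum type/length?" value from the quoted text


def parse_source_file_info(data, pos=0):
    """Decode one source file entry starting at pos.

    Returns (name_offset, checksum, next_pos), where checksum is the
    16 MD5 bytes or None when the entry carries no checksum.
    """
    # 4 bytes: offset of filename in the string table; 2 bytes: checksum type.
    name_offset, checksum_type = struct.unpack_from("<IH", data, pos)
    pos += 6

    checksum = None
    if checksum_type == MD5_CHECKSUM:
        checksum = bytes(data[pos:pos + 16])
        if len(checksum) != 16:
            raise ValueError("truncated MD5 checksum")
        pos += 16
    elif checksum_type != 0:
        raise ValueError("unexpected checksum type 0x%04X" % checksum_type)

    pos += 2  # trailing 2 bytes, described as "0 (padding?)"
    return name_offset, checksum, pos
```

Calling it repeatedly with the returned next_pos walks the entries one by one; how many entries there are would have to come from the enclosing section, which the quoted text does not describe.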
buzzkill
But doesn't yasm generate debug/symbolic info integrated into the binary? That means that for yasm, only the src and binary have to be in sync, while for fasm you'd need the src, binary and symbol file to be in sync. And, in addition to my arguments above, if you want to convert fasm's symbol file format into something that you can use in a debugger (e.g. DWARF on Linux), you'd risk losing your hash info in the fas->DWARF conversion.
So for me personally, there wouldn't be much value in integrating these hashes into the symbol file, but if Tomasz wants to implement it, I won't object :)
Post 02 Apr 2009, 18:54
LocoDelAssembly
Perhaps we are forgetting an important reason for the lack of binary hashes: it is the linker that produces the final executable, not the assembler :P

And I just tested it and you are right, it generates the CV8 debug info into the OBJ file.
Post 02 Apr 2009, 19:41

Copyright © 1999-2020, Tomasz Grysztar.

Powered by rwasa.