flat assembler
Message board for the users of flat assembler.

Relative imports and GNU linker... ?

Trojany 31 May 2010, 23:08
Hi all,

I'm planning to implement a new programming language as a gcc-compatible compiler. More precisely: the program I'd like to develop will take a source file in that new language as input and output an object file, which can in turn be linked by the GNU linker (ld).
Since creating object files directly did not look that easy to me, I decided that the compiler will only produce assembly code and use an assembler to convert it to the actual object files, which can then be linked by ld.

My problem is:

The new language has a feature that I'm not sure can be done with the GNU linker, or how:
When importing symbols from other source files, it is allowed to explicitly specify which imported symbol comes from which source file.
This means that the following three source files can be built together into one program:
Code:
;File: source_a
  ;...
  Export my_symbol
  ;...

;File: source_b
  ;...
  Export my_symbol
  ;...

;File: source_both
  ;...
  Import my_symbol From "a/path/to/source_a" As my_symbol_from_a
  Import my_symbol From "a/path/to/source_b" As my_symbol_from_b
  ;...    
In that programming language this will NOT produce a compile error: symbols with identical names may be defined in multiple source files.
There is no duplicate-identifier compile-time error, because the "As" operator was used in the import statements.
In the "source_both" file you can use my_symbol_from_a and my_symbol_from_b to refer to my_symbol in source_a and source_b, respectively.

Now I'm thinking about how to compile such source files to object files which in turn can be linked correctly with the GNU linker.

A first attempt would be to produce the following assembly files:
Code:
;File: assembly_a
  ;...
  public _my_symbol
  ;...

;File: assembly_b
  ;...
  public _my_symbol
  ;...

;File: assembly_both
  ;...
  extrn '_my_symbol' as _my_symbol_from_a ;?
  extrn '_my_symbol' as _my_symbol_from_b ;?
  ;...    
This won't work, because
1. the GNU linker prints "multiple definition of `my_symbol'" and exits with an error, and
2. I don't know how to specify which of the extrn _my_symbol statements should import from which object file.
How else could I create the object files?
I read sections 2.4.3 and 2.4.4 of the manual, about COFF and ELF, but couldn't find anything about telling the linker which extrn directive refers to which linked object file.

I know that this behaviour differs somewhat from that of other languages; e.g. in C/C++ you just declare extern functions (by including the header files) without explicitly specifying which function comes from which source file. However, I'd still like to find a way to use the GNU linker as the last step of compiling programs in that language.

Any help is appreciated, thanks in advance!
revolution 01 Jun 2010, 00:12
Code:
;File: assembly_a
  ;...
  public _my_symbol_from_a
  ;...

;File: assembly_b
  ;...
  public _my_symbol_from_b
  ;...

;File: assembly_both
  ;...
  extrn '_my_symbol_from_a' as _my_symbol_from_a
  extrn '_my_symbol_from_b' as _my_symbol_from_b
  ;...    
Trojany 02 Jun 2010, 02:22
Thanks, revolution,

but how would you compile this?
Code:
;File: source_a
  ;...
  Export my_symbol
  Export my_symbol_from_a ;Assume it's a different one, not a copy of my_symbol
  ;...

;File: source_b
  ;...
  Import my_symbol From "a/path/to/source_a" As my_symbol_from_a
  Export my_symbol
  ;...

;File: source_both
  ;...
  Import my_symbol_from_a From "a/path/to/source_a" As my_symbol_from_a
  Import my_symbol From "a/path/to/source_b" As my_symbol_from_b
  ;...    
Maybe I should say a word about namespaces in that language:
Whenever a function or variable is defined in a source file, its name is added to that file's local namespace; it doesn't interfere with an identical name in another source file.

When functions or variables should be visible to other source files ("public"), the Export statement is used. They're not added to a global namespace; they're just made visible, still in their local namespaces.

When a source file needs to access a function or variable in another source file, it uses the Import statement, providing a (relative) path to the file and the name that was used when exporting the symbol (that is, the name in the other file's local namespace).
The As operator specifies the name under which the symbol is imported into the local namespace.
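The namespace rules described above can be modelled in a few lines. The following is only a sketch of how a compiler front end might track them; the `Module` class and its `export` and `import_as` methods are hypothetical names invented for illustration, and local definitions other than imports are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """One source file with its own local namespace (hypothetical model)."""
    path: str
    exports: set = field(default_factory=set)    # names made visible by Export
    imports: dict = field(default_factory=dict)  # local alias -> (source path, exported name)

    def export(self, name):
        self.exports.add(name)

    def import_as(self, name, source, alias):
        # The source module must actually export the requested name.
        if name not in source.exports:
            raise KeyError(f"{source.path} does not export {name}")
        # The alias lives in this module's local namespace only.
        if alias in self.imports:
            raise KeyError(f"{alias} is already defined in {self.path}")
        self.imports[alias] = (source.path, name)


# The three files from the example in this post.
a = Module("a/path/to/source_a")
a.export("my_symbol")
a.export("my_symbol_from_a")

b = Module("a/path/to/source_b")
b.import_as("my_symbol", a, "my_symbol_from_a")
b.export("my_symbol")

both = Module("source_both")
both.import_as("my_symbol_from_a", a, "my_symbol_from_a")
both.import_as("my_symbol", b, "my_symbol_from_b")
```

Each alias resolves to a (file, name) pair rather than to a bare name, which is exactly the information a flat linker namespace cannot express on its own.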
revolution 02 Jun 2010, 04:44
Why not just avoid having variables with the same name?

I always prefix the module name before each major label:
Code:
proc MODULE1_function
  ;...
endp
public MODULE1_function    
The "MODULE1" prefix will make all names unique.
Tyler 02 Jun 2010, 05:08
Invent your own name mangling scheme. You could prepend the path to the file for example. http://en.wikipedia.org/wiki/Name_mangling

Fasm does this when you use the local directive in a macro.
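As a rough illustration of the path-prefix idea, here is a sketch of how the compiler could derive linker-level names and emit the matching fasm `public ... as` and `extrn ... as` lines. The mangling rule itself (replace every non-alphanumeric path character with `_`, join with `__`) and the helper names `mangle`, `emit_public` and `emit_extrn` are assumptions made up for this example, not an established scheme.

```python
import re


def mangle(module_path, name):
    """Build a linker-level name from a module path and a local name.

    Hypothetical scheme: every character outside [A-Za-z0-9] in the path is
    replaced by '_', and '__' separates the path part from the symbol name.
    """
    prefix = re.sub(r"[^A-Za-z0-9]", "_", module_path)
    return f"_{prefix}__{name}"


def emit_public(module_path, name):
    # fasm form: public <label> as '<external name>'
    return f"public {name} as '{mangle(module_path, name)}'"


def emit_extrn(module_path, name, alias):
    # fasm form: extrn '<external name>' as <local label>
    return f"extrn '{mangle(module_path, name)}' as {alias}"


print(emit_public("a/path/to/source_a", "my_symbol"))
print(emit_extrn("a/path/to/source_a", "my_symbol", "my_symbol_from_a"))
print(emit_extrn("a/path/to/source_b", "my_symbol", "my_symbol_from_b"))
```

Note that this simple rule is not collision-free: the paths `a/b` and `a_b` mangle to the same prefix, so a real scheme would need some form of escaping.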
Trojany 04 Jun 2010, 02:22
Well, I had hoped there would be some option to specify what I want in the object files, maybe by creating the symbol tables manually instead of using the format directive.
I thought these object file formats, ELF or even COFF, would have this feature, and that it's just impossible to use unless you build the object file "from scratch" or use linker scripts.

Otherwise yes, I'd have to do some name mangling.
The problem with name mangling is that object files can be distributed without source:
if someone receives an object file and wants to use it without being able to recompile from source, they have to stick to the symbol names in it. If these names collide with names in another object file received from another person, there's a problem.

Prepending local file paths to the symbol names wouldn't make them unique across a network.

I think I'll have to make the compiler request a UUID from the OS and assign it to the module (using it as a prefix for all symbols in its object file).
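A minimal sketch of the UUID-prefix idea, assuming the compiler generates one random identifier per module and reuses it for every symbol in that module's object file. The function names `new_module_id` and `mangle_with_id`, and the leading `m` (added so the prefix never starts with a digit), are hypothetical choices for illustration.

```python
import uuid


def new_module_id():
    """Return a fresh random identifier usable as a symbol prefix."""
    # uuid4() is random; .hex gives 32 lowercase hex digits without dashes.
    return "m" + uuid.uuid4().hex


def mangle_with_id(module_id, name):
    # Assumed layout: underscore, module ID, underscore, local name.
    return f"_{module_id}_{name}"


module_id = new_module_id()
print(mangle_with_id(module_id, "my_symbol"))
```

One consequence of this design is that an importing file must learn the exporting module's ID from somewhere, for example from a side file shipped with the object file, and the ID has to stay stable across recompilations or every dependent object file would need to be rebuilt.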
Tyler 04 Jun 2010, 02:53
If you trust that every definition of the symbol that ld reports as multiply defined refers to the same variable, you could use the "--allow-multiple-definition" option. It tells ld to ignore multiple definitions and to use the first definition it encounters for all references.
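To see what "first definition wins" means for the example in this thread, here is a toy model of a flat, global symbol table. It is deliberately simplified and is not ld's actual algorithm; the `resolve` function and the object file names are made up for illustration.

```python
def resolve(objects, allow_multiple_definition=False):
    """Toy model of a flat, global symbol table (not ld's real algorithm).

    objects: list of (object_name, [defined symbols]) in link order.
    Returns a dict mapping each symbol to the object that provides it.
    """
    table = {}
    for obj, symbols in objects:
        for sym in symbols:
            if sym in table:
                if not allow_multiple_definition:
                    raise ValueError(f"multiple definition of `{sym}'")
                continue  # keep the first definition, ignore this one
            table[sym] = obj
    return table


objects = [("assembly_a.o", ["_my_symbol"]), ("assembly_b.o", ["_my_symbol"])]
table = resolve(objects, allow_multiple_definition=True)
print(table["_my_symbol"])
```

In this model every reference to `_my_symbol` ends up at the definition from the first object file, so the option only helps when both definitions really are the same thing. It cannot route one reference to source_a and another to source_b.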
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.