flat assembler
Message board for the users of flat assembler.

Sections in binary format

Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 30 Mar 2012, 20:20
Is it possible to use sections in a binary format, to declare data in the same file as code?
bzdashek



Joined: 15 Feb 2012
Posts: 147
Location: Tolstokvashino, Russia
bzdashek 30 Mar 2012, 20:41
I'm sorry, I don't understand half of your question, but the answer to the second half follows:
Code:
format PE console
entry start

include 'win32a.inc'

section '.rofl' code readable writable executable

start:
        mov     [houtput],STD_OUTPUT_HANDLE
        mov     esi,usage
        call    print
        invoke  ExitProcess,0

proc print
        pushf
        push    ebp edi eax
        push    [houtput]
        call    [GetStdHandle]
        mov     ebp,eax
        mov     edi,esi
        or      ecx,-1
        xor     al,al
        repne   scasb
        neg     ecx
        sub     ecx,2
        push    0
        push    bytes_count
        push    ecx
        push    esi
        push    ebp
        call    [WriteFile]
        pop     eax edi ebp
        popf
        ret
endp

usage           db      'Usage: kill_ci <objects.sif>',0
houtput         dd      ?
bytes_count     dd      ?

data import
  library kernel32,'KERNEL32.DLL'

  import kernel32,\
    GetStdHandle,'GetStdHandle',\
    WriteFile,'WriteFile',\
    ExitProcess,'ExitProcess'

end data
    


The print procedure is adapted from display_string in
\fasm\TOOLS\WIN32\SYSTEM.INC
tripledot



Joined: 06 Jan 2009
Posts: 49
tripledot 30 Mar 2012, 20:46
You're declaring data in the code section: bad for caching.
bzdashek



Joined: 15 Feb 2012
Posts: 147
Location: Tolstokvashino, Russia
bzdashek 30 Mar 2012, 20:57
tripledot wrote:
You're declaring data in the code section: bad for caching.

I thought that was what Tyler asked.
tripledot



Joined: 06 Jan 2009
Posts: 49
tripledot 30 Mar 2012, 21:05
His question confused me too! I first thought he was just asking about keeping code and data in the same source file (!)

Tyler, I don't think the flat binary format is aware of sectioning. (I'm happy to be corrected though!)

It would be better to create separate code and data sections, since a modern CPU caches code and data separately and handles them much better that way. If you store variables right next to your code and write to them, the CPU may treat the write as self-modifying code and flush the pipeline along with the affected instruction cache lines, which can cost hundreds of cycles.
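
Since flat binary has no section directive, the same separation can be approximated purely by ordering: all code first, then all data, padded up to a cache line boundary. A minimal sketch of that idea, assuming 32-bit code; the load address, the 64-byte line size and the file names are all made up for illustration:
Code:
; main.asm, assembled as flat binary (fasm's default when no 'format' is given)
use32
org 0x100000            ; hypothetical load address

main:
        call    func1
        call    func2
        ret

; code of every function
include 'func1.code'    ; hypothetical file names
include 'func2.code'

; data of every function, kept away from the code above
align 64                ; assumed cache line size
include 'func1.data'
include 'func2.data'
    

Nothing enforces the split here; it only holds as long as every data definition stays in the files included after the align.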
bzdashek



Joined: 15 Feb 2012
Posts: 147
Location: Tolstokvashino, Russia
bzdashek 30 Mar 2012, 21:23
Maybe he wanted to include a binary inside a binary, so he could run a binary while running a binary?
tripledot



Joined: 06 Jan 2009
Posts: 49
tripledot 30 Mar 2012, 21:48
+01b
Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 31 Mar 2012, 06:46
Okay, I'll clarify.

I'm working on a big project which I have split up into an include file for each function. I want code and data to be kept separate. How do I declare data in the function files without causing data and code to be mixed?

main.asm
Code:
main:
   ;do stuff
   call func1
   call func2

include 'func1.asm'
include 'func2.asm'
    

func1.asm
Code:
func1:
   ret
    

func2.asm
Code:
func2:
   ret    

How do I include a data definition in func2.asm without mixing code and data?
tripledot



Joined: 06 Jan 2009
Posts: 49
tripledot 31 Mar 2012, 07:28
I've no idea whether this is good practice or not, but I generally have one 'master' file called "main.asm", which is divided into sections (data, code, imports, etc...)

To declare sections in FASM, use the 'section' directive, e.g.
Code:
section '.text' code readable executable
; code goes here

section '.data' data readable writeable
; data goes here
    

Within each section I 'include' external files that belong in sections of that type. For example, I keep all my matrix functions in a file called "matrix.code", which gets included in the main code section, and any matrix-related global data goes in a file called "matrix.data", which gets included in the main data section. I find this system helps me to manage source files in large projects.

When working on a project, I run a new instance of FASM for each logical part of the program I'm editing, and open .code and .data files in their own tabs within each instance. That way I don't get lost in a maze of tabs (I usually have a LOT of source files on the go at once!) cos I can quickly hide windows I'm not currently interested in.

Does this approach suit your workflow?

[EDIT]
Examples speak louder than waffle:

Code:
section '.text' code readable executable
include 'func1.code'
include 'func2.code'

section '.data' data readable writeable
include 'func1.data'
include 'func2.data'
    


etc.

Using includes like this in the code section makes it easy to keep related functions located next to one another, which is good for code caching. I have no idea about organising data in terms of position relative to its parent code... in any case, you shouldn't have too many globals in a big project, and I used to leave it up to the linker anyway. Since switching to FASM I stopped caring, and it hasn't caused me any harm yet. The lack of a linking stage is a Godsend, actually.
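
To make the split concrete, here is a sketch of what one such pair of files might contain. The file names follow the example above; the func2_calls variable is purely hypothetical, and both files are shown in one listing only for brevity:
Code:
; ---- func2.code (included inside the '.text' section) ----
func2:
        inc     dword [func2_calls]     ; touches data that lives elsewhere
        ret

; ---- func2.data (included inside the '.data' section) ----
func2_calls     dd 0                    ; hypothetical per-function counter
    

The label in func2.data is visible from func2.code because everything ends up in one source after the includes are resolved, so no extern or public declarations are needed.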

OT, but does anyone know anything deeper about locating global data 'geographically' close to its code? Any worthwhile benefits?
AsmGuru62



Joined: 28 Jan 2004
Posts: 1637
Location: Toronto, Canada
AsmGuru62 31 Mar 2012, 11:29
That is exactly my approach too.
Very good for large projects.

Sorry, I did not get the "geo" question.
For me the only thing that matters is the alignment of data.
It can be mixed with other data... is that what you are asking?
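
By alignment I mean something along these lines. This is only a fragment of a PE source, and the names and values are invented for illustration:
Code:
section '.data' data readable writeable

counter dd 0                    ; a dword only needs 4-byte alignment

align 16                        ; pad up to the next 16-byte boundary
vec4    dd 1.0, 2.0, 3.0, 4.0   ; can now be read with aligned SSE loads (movaps)
    

The section itself starts on a page boundary, so align 16 here simply skips the 12 bytes left after counter.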
edfed



Joined: 20 Feb 2006
Posts: 4347
Location: Now
edfed 31 Mar 2012, 13:29
data are in DS/ES/FS/GS/SS
code in CS.

that's the only thing to remember.

now, you take your data, put it in a data section.
take your code, and put it in the code (text) section.
compile
and you get a .exe file.


about code and data separation,

Code:
include 'header.inc' ;or 'header.ash'
section '.text' code readable executable
include 'functions.inc'
section '.data' data readable writeable
include 'datas.inc'
    
tripledot



Joined: 06 Jan 2009
Posts: 49
tripledot 31 Mar 2012, 15:15
@AsmGuru62: Sure, I know you can mix your data up a bit. I align habitually, but it makes sense to me to group logically-related data close together in memory as well, certainly for small data that can fit in a cache line. If it all comes down to finding magic alignments (to avoid certain 2^n strides clogging the cache), I give up. I think this is a great example of C compilers and linkers outdoing humans pretty much hands-down.

Good tip, edfed. There's usually enough going on in a main.asm without the names of every datatype!
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
rugxulo 18 Apr 2012, 20:33
I think YASM (and/or NASM??) supports multi-section binary, but I've never used it. EDIT: Yup, seems old YASM 0.7.0 added it, see the online doc section here.

As for code and data and self-modifying code slowing things down, yes, it can hurt a lot on modern CPUs, much more so than it used to. Most compilers and OSes these days have read-only data, presumably because of this.

I hate to bring up a dumb example, but I found that the hacked version of BEFI.COM for DOS was several times slower with self-modifying code than without. The penalty is triggered by something like writing within 4 KB of the code you're running, so you really probably shouldn't join code and writable data, and I assume most people (OOP, Forth) have accounted for this (hopefully!).

Quote:

Core i5 3.2 GHz, DOSEMU under PuppyLinux 2.6.33.2

So I sacrificed four bytes for speed, but it's worth it, no?

C:\tmp>runtime befi982 bench3.bef
2147483296 00.44 seconds elapsed
C:\tmp>runtime befi978 bench3.bef
2147483296 01.70 seconds elapsed

C:\tmp>runtime befi982 bench2.bef
2147483596 03.63 seconds elapsed
C:\tmp>runtime befi978 bench2.bef
2147483596 15.94 seconds elapsed
tripledot



Joined: 06 Jan 2009
Posts: 49
tripledot 18 Apr 2012, 21:45
Good link, ta.

I was under the impression that mixing read-only data within code segments was a bad idea too, since the CPU has separate caches for code and data. I don't think it's a good idea to even put string constants amongst code; it's more bytes to chew before the decoders have anything to work with.
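
What I have in mind is roughly the following. It is a layout sketch only, not a useful program; the '.rdata' section name is just a common convention and the string is borrowed from earlier in the thread:
Code:
format PE console
entry start

section '.text' code readable executable
start:
        mov     esi, usage      ; only the pointer is encoded in the instruction stream
        ret

section '.rdata' data readable  ; read-only: no 'writeable' flag
usage   db 'Usage: kill_ci <objects.sif>',0
    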

That 4 byte thing... presumably that's CPU-dependent? Unkie Agner may know something about this; I'll have to have another look. I wouldn't want to write into even a 16 or 32k code cache line, just in case.
JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 19 Apr 2012, 06:58
Tyler wrote:
How do I declare data in the function files without causing data and code to be mixed?


The answer is: use the data definition macros from the FreshLib project (see the _globals.inc reference).

You can use these macros in your code in the following way:
Code:
        mov  eax, [Var1]
        add  eax, [Var2]
        mov  [Var3.left], eax

uglobal
  Var3  RECT
endg 

iglobal
  Var1  dd 1234
  Var2  dd 5678
endg
    


And somewhere else (in the data section) you should place:
Code:
section '.data' data readable writeable 
IncludeAllGlobals
    


Note that the data will be properly ordered, i.e. the uninitialized data will be placed at the end and will not take up space in the compiled binary.
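
For the curious, the mechanism behind such macros can be sketched as follows. This is NOT the actual FreshLib source, only a cut-down illustration of the general idea, assuming fasm 1 preprocessor semantics: each block is captured as a stacked macro and replayed later where the data should go. The internal names (IGlobals, __IGlobalBlock and so on) are invented for the sketch, and it is unguarded: it assumes each kind of block is used at least once.
Code:
macro iglobal {                         ; initialized data block
  IGlobals equ IGlobals,                ; one more comma per block
  macro __IGlobalBlock { }              ; left open on purpose, 'endg' closes it

macro uglobal {                         ; uninitialized data block
  UGlobals equ UGlobals,
  macro __UGlobalBlock { }

endg fix }                              ; 'endg' is literally a closing brace

macro IncludeIGlobals {
  macro IGlobals dummy,[n] \{ __IGlobalBlock    ; replay the newest block...
     purge __IGlobalBlock  \}                   ; ...then uncover the previous one
  match I, IGlobals \{ I \} }

macro IncludeUGlobals {
  macro UGlobals dummy,[n] \{ __UGlobalBlock
     purge __UGlobalBlock  \}
  match U, UGlobals \{ U \} }

macro IncludeAllGlobals {               ; initialized first, uninitialized last
  IncludeIGlobals
  IncludeUGlobals
}
    

In this sketch the blocks come out in reverse order of definition, and because all uninitialized blocks are emitted after the initialized ones, they end up at the tail of the section.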
Tyler



Joined: 19 Nov 2009
Posts: 1216
Location: NC, USA
Tyler 19 Apr 2012, 23:45
JohnFound wrote:
The answer is: "Use data definition macros from freshlib project":_globals.inc reference.
That's exactly what I was looking for. Thanks man.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.