dynamic array

Mino

Joined: 14 Jan 2018
Posts: 163

Mino 23 May 2018, 16:55

Hello,
I don't really understand the concept of a FASM array. I came across https://board.flatassembler.net/topic.php?t=7901 while doing my research, but I don't understand the meaning of all the instructions.
For example, if I would like to do this in a language that accepts dynamic tables:

Code:

int MyArray[];
...
MyArray[0] = 87;
MyArray[1] = 34;
MyArray[3] = 98; // --> causes an error

How to do this in fasm?
In the page I gave above, we present this example code :

Code:

array: 
dd @f-$-4 
dd .xl    ;two-dimensional array 
dd .yl 
rb ? 
@@:

If I understand correctly, a "translation" in C would give:

Code:

int array[][];

Finally... I don't think I understand.
My goal is to write a function in C++ with the following header:

Code:

std::string SetArray(std::string name, int dimension, int bits, std::vector<std::string> values = { "" });

It would therefore define, via the parameters (+ other methods, but it doesn't matter here) a corresponding assembler code, safe and dynamic (as in my 1st example).
Would you explain the principle a little better, and how I could implement this?
I'm obviously not asking for C++ code, just a few pointers that would help me arrive at the expected result.

Thank you very much!

_________________
The best way to predict the future is to invent it.

23 May 2018, 16:55

redsock

Joined: 09 Oct 2009
Posts: 439
Location: Australia

redsock 23 May 2018, 21:47

Quote:

in a language that accepts dynamic tables

A lot of the lazy-programmer-friendly languages map what looks and feels like an array into a key/value map (hashmap, RBtree, etc).

The important bit to keep in mind IMO is that in C for example,

Code:

int MyArray[];

is the same as

Code:

int *MyArray;

Of course if you attempt to declare an array like your desired first entry, you immediately get an error with a C compiler:

Code:

error: storage size of ‘MyArray’ isn’t known

So, some decisions about what your "dynamic array" does about sparse/missing keys/indexes must first be made, and then about general allocation strategies. While you certainly can use data segment directives in fasm like you have also shown (dd, rb), if you truly want the sizing of your array to be dynamic, then you'll need some kind of memory allocation code to go along with it (which is of course OS dependent).
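To make those decisions concrete, here is a minimal C++ sketch of one possible policy (certainly not the only one): indexes must be contiguous, writing at index `length` appends and grows the block, and anything further out is reported as an error, as in the `MyArray[3]` case from the question. The names `IntArray`, `array_set` and `array_free` are hypothetical, and `realloc`/`free` merely stand in for whatever OS-dependent allocator the generated code would really call.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical growable array of ints: a pointer plus two counters.
struct IntArray {
    int*        data     = nullptr; // heap block holding the elements
    std::size_t length   = 0;       // number of elements in use
    std::size_t capacity = 0;       // number of elements the block can hold
};

// Stores val at idx. Returns false on an out-of-range index or a failed
// allocation. idx == length appends, growing the block when it is full.
bool array_set(IntArray& a, std::size_t idx, int val) {
    if (idx > a.length) return false;          // would leave a hole: reject
    if (idx == a.length) {
        if (a.length == a.capacity) {
            std::size_t new_cap = a.capacity ? a.capacity * 2 : 4;
            void* p = std::realloc(a.data, new_cap * sizeof(int));
            if (!p) return false;              // the old block is still valid
            a.data = static_cast<int*>(p);
            a.capacity = new_cap;
        }
        ++a.length;
    }
    a.data[idx] = val;
    return true;
}

// Releases the block and resets the array to its empty state.
void array_free(IntArray& a) {
    std::free(a.data);
    a = IntArray{};
}
```

With this policy, setting indexes 0 and 1 succeeds and setting index 3 right afterwards fails, because index 2 was never written. A different policy (zero-filling the gap, or a key/value map) would be just as legitimate; the point is only that the choice has to be made before any code is generated.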

Finally, I don't understand your C++ function header's intention:

Code:

std::string SetArray(std::string name, int dimension, int bits, std::vector<std::string> values = { "" });

For starters, why does it return a std::string? What is this function supposed to do? You pass it a name, an integer dimension, and then a list of std::string values... I am not sure I understand.

Do you want to create a fasm-produced object file that will link with that function declaration? If so, you'll need to deal with name mangling, and all of the std::string and std::vector calling conventions from your fasm code (less than pleasant but mostly just because of the name mangling that C++ must do). I wrote a fairly long page ages ago about integrating my fasm codebase with gcc/C/C++: https://2ton.com.au/rants_and_musings/gcc_integration.html This covers some of the ground required to do this (on Linux anyway).

Either way, I am interested to know more about your intent here.

_________________
2 Ton Digital - https://2ton.com.au/

23 May 2018, 21:47

Furs

Joined: 04 Mar 2016
Posts: 2685

Furs 24 May 2018, 14:15

Mino wrote:

In the page I gave above, we present this example code :
Code:
array: 
dd @f-$-4 
dd .xl    ;two-dimensional array 
dd .yl 
rb ? 
@@:
    
If I understand correctly, a "translation" in C would give:
Code:
int array[][];
    
Finally... I don't think I understand.

That's a wrong translation. It's actually a struct like:

Code:

struct
{
  uint32_t size;
  uint32_t xl;
  uint32_t yl;
} array;

Which isn't even an array.

And btw they're called "VLA" or "variable length array", not "dynamic". Dynamic is simply an array that is allocated with malloc (in C), but usually has a fixed size calculated at runtime, and is treated as a pointer, not with the name[] construct.

If you use a variable as the length of an array in C on the stack, it subtracts that much from the stack pointer to make room for the array, etc.

24 May 2018, 14:15

Mino

Joined: 14 Jan 2018
Posts: 163

Mino 24 May 2018, 17:22

Thanks for your answers :)

First of all, how do I use malloc in FASM? Is there a procedure, and would you recommend it?

Quote:

For starters, why does it return a std::string? What is this function supposed to do? You pass it a name, an integer dimension, and then a list of std::string values... I am not sure I understand.
...
Either way, I am interested to know more about your intent here.

Actually, I'm implementing a compiler for a language of my own design. FASM is the target language, and C++ is the language the compiler is written in.
I am currently writing a namespace containing classes, methods, functions, constants, ... to ease the transition from high level to low level.
Right now I am working on arrays, so I need a function that can define a FASM array.
The function returns a string because the result is the FASM code corresponding to the array defined by the given parameters. Example:

Code:

std::string SetArray(std::string name, int dimension, int bits, std::vector<std::string> values = { "" });
...
std::string ExampleArray = SetArray("MyArray", 1, sizeof(int), { "76", "43", "84", "20", "92" });

ExampleArray (HL) =

Code:

int MyArray[] = { 76, 43, 84, 20, 92 };

ExampleArray (LL) =

Code:

mov DWORD [ebp-4], 76
mov DWORD [ebp-8], 43
mov DWORD [ebp-12], 84
mov DWORD [ebp-16], 20
mov DWORD [ebp-20], 92

For the moment, it is the latter code that is generated. However, this does not allow a dynamic size table.
That's my question, actually....
By the way, is it a bad thing to define the values "decreasing" in the stack (starts at -4 and ends at -20)?

So there we are. Thanks for your answers; if you have other ideas that could improve the value returned by my function (especially to allow a dynamic array), I'm all ears.
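For a fixed list of values, one alternative to emitting one `mov` per element would be to emit a single data definition. The following is only a sketch under assumptions: the element size is given in bytes (as in the `sizeof(int)` call above), the values are already valid fasm literals, and `EmitDataArray` and `DataDirective` are hypothetical helper names, not part of the compiler described here. `db`, `dw`, `dd` and `dq` are fasm's data directives for 1, 2, 4 and 8 bytes.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Maps an element size in bytes to the matching fasm data directive.
std::string DataDirective(int elem_bytes) {
    switch (elem_bytes) {
        case 1: return "db";
        case 2: return "dw";
        case 4: return "dd";
        case 8: return "dq";
        default: throw std::invalid_argument("unsupported element size");
    }
}

// Hypothetical helper: returns one line of fasm source defining a
// one-dimensional array, e.g. "MyArray dd 76, 43, 84, 20, 92".
std::string EmitDataArray(const std::string& name, int elem_bytes,
                          const std::vector<std::string>& values) {
    if (values.empty())
        throw std::invalid_argument("at least one value is required");
    std::string out = name + " " + DataDirective(elem_bytes) + " ";
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) out += ", ";   // separator between consecutive values
        out += values[i];
    }
    return out;
}
```

Calling `EmitDataArray("MyArray", 4, { "76", "43", "84", "20", "92" })` yields the text `MyArray dd 76, 43, 84, 20, 92`. Data defined this way lives in the executable image rather than on the stack and its size is fixed at assembly time, so it only covers the fixed-size case, not the dynamic one asked about.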

_________________
The best way to predict the future is to invent it.

24 May 2018, 17:22

alexfru

Joined: 23 Mar 2014
Posts: 80

alexfru 24 May 2018, 18:08

redsock wrote:

The important bit to keep in mind IMO is that in C for example,
Code:
int MyArray[];    
is the same as
Code:
int *MyArray;    

Except, these are different in both syntax and meaning.

24 May 2018, 18:08

Furs

Joined: 04 Mar 2016
Posts: 2685

Furs 24 May 2018, 19:37

Mino wrote:

For the moment, it is the latter code that is generated. However, this does not allow a dynamic size table.

Why not? Just subtract the dynamic size from esp to make room for the array?

Code:

sub esp, size_of_local_vars

; ... here code that calculates the dynamic array size in eax

sub esp, eax

; negate eax since addressing doesn't allow subtracting registers
neg eax

mov dword [ebp-size_of_local_vars+eax+0], first_array_element
mov dword [ebp-size_of_local_vars+eax+4], second_array_element

Assuming your array is 32-bits per element. Obviously, you could also calculate the array's position into eax directly and address from that, i.e.:

Code:

neg eax
lea eax, [ebp-size_of_local_vars+eax]

mov dword [eax+0], first_array_element
mov dword [eax+4], second_array_element

This is basic pointer stuff; you ought to learn pointers properly ;)

And lastly, since the array lives at [esp], you could just directly address it from esp... but that only works if you have one "dynamic" array only on the stack. So in this case:

Code:

mov dword [esp+0], first_array_element
mov dword [esp+4], second_array_element

Works just fine.
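The address arithmetic in the three variants above can be checked with a small model. This is a C++ sketch, not real stack manipulation: `FakeStack` and `element_address` are hypothetical names, addresses are plain offsets into a byte vector, and no overflow checking is done. It shows that after the two `sub esp` steps the array base equals both `ebp - size_of_local_vars - array_bytes` and `esp`, and that element 0 sits at the lowest address, which is why the offsets count upward (+0, +4, ...) even though the stack itself grows downward.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model of a downward-growing stack; addresses are offsets
// into `mem`, and `esp` plays the role of the stack pointer.
struct FakeStack {
    std::vector<std::uint8_t> mem;
    std::size_t esp;

    explicit FakeStack(std::size_t bytes) : mem(bytes), esp(bytes) {}

    // Models "sub esp, bytes": returns the new top of stack.
    std::size_t reserve(std::size_t bytes) { esp -= bytes; return esp; }

    // Models "mov dword [addr], value".
    void store32(std::size_t addr, std::uint32_t value) {
        std::memcpy(&mem[addr], &value, sizeof value);
    }

    // Models "mov reg, dword [addr]".
    std::uint32_t load32(std::size_t addr) const {
        std::uint32_t value;
        std::memcpy(&value, &mem[addr], sizeof value);
        return value;
    }
};

// Address of element i of a dword array whose lowest address is base.
std::size_t element_address(std::size_t base, std::size_t i) {
    return base + 4 * i;
}
```

In this model, reserving 8 bytes of locals and then a 12-byte array on a fresh stack leaves the array base 20 bytes below the initial `esp` value, and `element_address(base, 1)` is `base + 4`, matching the `[esp+4]` operand in the last variant.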

24 May 2018, 19:37

Mino

Joined: 14 Jan 2018
Posts: 163

Mino 24 May 2018, 20:04

Ha! Thank you very much :)

I think I have understood, and I will take inspiration from these results.
However, does your sample code take effect while the program is running, rather than at compile time? Otherwise, it wouldn't really be dynamic.

24 May 2018, 20:04

Furs

Joined: 04 Mar 2016
Posts: 2685

Furs 24 May 2018, 21:46

Well in that example "eax" is a register so of course its value is at runtime, you can put anything in it (doesn't have to be compile-time constant).

size_of_local_vars is a constant though, but that was just the local variables, without the dynamic array, which are constant already (in size).

24 May 2018, 21:46

Mino

Joined: 14 Jan 2018
Posts: 163

Mino 24 May 2018, 21:49

Okay, thanks :)

And would you advise me, while we're at it, to define the variables via the stack, or outside the stack? For arrays, in that case?

24 May 2018, 21:49

redsock

Joined: 09 Oct 2009
Posts: 439
Location: Australia

redsock 24 May 2018, 21:55

alexfru wrote:

Except, these are different in both syntax and meaning.

How so?

Code:

#include <cstdio>

static int underlying_array[32];

static inline void SetArrayItem(int *arr, int idx, int val) {
        arr[idx] = val;
}

static inline void SetArrayItem2(int arr[], int idx, int val) {
        arr[idx] = val;
}


int main(int argc, char *argv[]) {
        SetArrayItem(underlying_array, 0, 10);
        SetArrayItem2(underlying_array, 1, 11);
}

In the specific context of what constitutes a "dynamic array", they are syntactically equivalent, no?

_________________
2 Ton Digital - https://2ton.com.au/

24 May 2018, 21:55

Mino

Joined: 14 Jan 2018
Posts: 163

Mino 24 May 2018, 22:05

redsock wrote:

alexfru wrote:

Except, these are different in both syntax and meaning.

How so?

Code:

#include <cstdio>

static int underlying_array[32];

static inline void SetArrayItem(int *arr, int idx, int val) {
        arr[idx] = val;
}

static inline void SetArrayItem2(int arr[], int idx, int val) {
        arr[idx] = val;
}


int main(int argc, char *argv[]) {
        SetArrayItem(underlying_array, 0, 10);
        SetArrayItem2(underlying_array, 1, 11);
}

In the specific context of what constitutes a "dynamic array", they are syntactically equivalent, no?

I think that only changes the syntax. You can see it here (copy-paste).

_________________
The best way to predict the future is to invent it.

24 May 2018, 22:05

alexfru

Joined: 23 Mar 2014
Posts: 80

alexfru 25 May 2018, 07:25

redsock wrote:

alexfru wrote:

Except, these are different in both syntax and meaning.

How so?

Code:

#include <cstdio>

static int underlying_array[32];

static inline void SetArrayItem(int *arr, int idx, int val) {
        arr[idx] = val;
}

static inline void SetArrayItem2(int arr[], int idx, int val) {
        arr[idx] = val;
}


int main(int argc, char *argv[]) {
        SetArrayItem(underlying_array, 0, 10);
        SetArrayItem2(underlying_array, 1, 11);
}

In the specific context of what constitutes a "dynamic array", they are syntactically equivalent, no?

Arrays in function parameter declarations are a special case. For all intents and purposes they are pointers to the array element type.

Outside of parameter declarations arrays are not the same as pointers.
They are really different types. They can't be initialized in the same way and their sizes are different in general. Arrays can't be assigned to.

Code:

static int a[97] = { 1, 2, 3 }; // can't initialize with 0
static int* p = 0; // can't initialize with { 1, 2, 3 }
a = a; // error: can't assign a to a
printf("%d\n", sizeof a != sizeof p); // prints 1

Questions?

25 May 2018, 07:25

Furs

Joined: 04 Mar 2016
Posts: 2685

Furs 25 May 2018, 12:08

Mino wrote:

Okay, thanks
And would you advise me, while we're at it, to define the variables via the stack, or outside the stack? For arrays, in that case?

It depends on the maximum size of your array. I know it's dynamic, but you should have a maximum.

If the maximum is 4096 bytes or more, then you should not use the stack due to stack guard complications (there's no point going into it at this time, learn more basic stuff first :P). Likewise, if you don't know the maximum, then don't use the stack.

So in those cases allocate memory with malloc or HeapAlloc (Windows) or whatever for your dynamic array, and don't forget to free it at the end of the function since it's not "automatically" freed like the stack. Stack space is limited anyway, if you have tens of megabytes of array size as a possibility, definitely do NOT use the stack.

Also, alexfru is right. Arrays are not the same as pointers, even as global variables (pointers are 4 or 8 bytes and point to data, arrays are the data itself embedded). For function parameters, they decay to pointers to the element, but it doesn't mean they're the same.

25 May 2018, 12:08

Mino

Joined: 14 Jan 2018
Posts: 163

Mino 25 May 2018, 14:47

Okay, thanks :)

So I think I'm going to operate like this:
If the array has a defined size, I use the stack; otherwise, if the size is undefined (and therefore dynamic), I would most probably use the structure proposed by FASM to create arrays (as in the first messages).
As far as the dynamic allocation functions are concerned, I think I need to find out a little more about them to know how to integrate them well into the program ;)

25 May 2018, 14:47

Furs

Joined: 04 Mar 2016
Posts: 2685

Furs 25 May 2018, 15:20

No, you should place a single pointer on the stack or in a register, which receives the return value from malloc. Then access the array through this pointer (the pointer is the base of the array).
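In C++ terms, that pattern looks roughly like the sketch below. `sum_of_indices` is a hypothetical function made up for illustration; the only things taken from the advice above are the single local pointer that receives malloc's return value, the indexing relative to that pointer, and the explicit free.

```cpp
#include <cstddef>
#include <cstdlib>

// Fills a heap array of n ints with 0..n-1 and returns their sum,
// or -1 if the allocation fails.
long sum_of_indices(std::size_t n) {
    if (n == 0) return 0;               // nothing to allocate
    // `arr` is the single pointer kept in a local (stack slot or register).
    int* arr = static_cast<int*>(std::malloc(n * sizeof(int)));
    if (arr == nullptr) return -1;
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        arr[i] = static_cast<int>(i);   // address is arr + i * sizeof(int)
        total += arr[i];
    }
    std::free(arr);                     // heap memory is not released automatically
    return total;
}
```

Here `n` can be any value computed at run time; only the pointer itself occupies a fixed-size slot.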

25 May 2018, 15:20

revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20748
Location: In your JS exploiting you and your system

revolution 25 May 2018, 16:05

Mino wrote:

If the array has a defined size, I use the stack ...

I think there should be one more criterion: If the array has a defined size and is "small" then use the stack, else use malloc/HeapAlloc.

The definition of "small" might be something like <=2kB. Don't forget about the 4kB guard page if you want to make it larger.
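For a code generator, that rule could be written down as a small decision function. This is a sketch under assumptions: `Storage`, `kSmallArrayLimit` and `choose_storage` are hypothetical names, and 2048 bytes is simply the "<=2kB" example figure from above, not a hard limit.

```cpp
#include <cstddef>

enum class Storage { Stack, Heap };

// Hypothetical threshold taken from the "<= 2 kB" suggestion.
constexpr std::size_t kSmallArrayLimit = 2048;

// Stack only when the size is known up front and small; heap otherwise.
Storage choose_storage(bool size_known_up_front, std::size_t total_bytes) {
    if (size_known_up_front && total_bytes <= kSmallArrayLimit)
        return Storage::Stack;
    return Storage::Heap;
}
```

Under this policy a five-element dword array (20 bytes) of known size would go on the stack, while a 4096-byte array, or any array whose size is only known at run time, would be allocated with malloc/HeapAlloc.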

25 May 2018, 16:05

Mino

Joined: 14 Jan 2018
Posts: 163

Mino 25 May 2018, 16:39

Very well, thanks for the details :)

I'll apply them!

25 May 2018, 16:39

Mino

Joined: 14 Jan 2018
Posts: 163

Mino 01 Jun 2018, 22:22

Hello again :)

I would like to make sure of the reliability of the answer that Furs gave me (not that I doubt it, but I want to know exactly what I'm doing ;) ). Here is a model of code:

Code:

format PE
entry main

size_of_local_vars dd 4 ; 4 bits (?)

section '.code' code executable ; code
        main:

                neg eax
                lea eax, [ebp-size_of_local_vars+eax]

                mov dword [eax+0], 0
                mov dword [eax+4], 1

So, here, could I add an "infinite" number of values through the EAX register, or am I limited?
And why use EAX? Why not use AL, CX, ...? I guess it's a byte thing, right?
How could I also move a literal character string (one not referenced through a pointer) directly to [EAX]? Something like this:

Code:

                mov word [eax+0], "My string"

Or am I required to first define an area in memory that would contain this string:

Code:

                str db "My string"
                ...
                mov word [eax+0], [str]

____
I would also like to know the advantages and disadvantages of using this method.
And, if we consider this to be an array in a higher level language, how would we define several dimensions?

Thank you very much!

_________________
The best way to predict the future is to invent it.

01 Jun 2018, 22:22

revolution
When all else fails, read the source

Joined: 24 Aug 2004
Posts: 20748
Location: In your JS exploiting you and your system

revolution 02 Jun 2018, 01:08

In X86 dd defines a dword, which is four bytes, 32 bits.

In your code above you reserved four dwords, a total of 16 bytes.

You can use any register capable of addressing as the pointer. But CX and AL cannot be used as pointers. And in 32 bit code using BX would limit you to only the first 65536 bytes of memory.

If you want to transfer a string of bytes (e.g. "My string" is 9 bytes), then you have to do it in parts. In 32 bit mode only four bytes maximum can be transferred using the CPU registers like EAX. One way to copy a string from one place to another is to use a loop. You can transfer one byte at a time. First the 'M', then the 'y', then the space, etc. You can also use rep movsb. Or you can transfer four bytes at a time and have special code to handle the case of the remaining extra single bytes if there are any.
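The last variant (four bytes at a time, then the leftovers) can be sketched in C++ as follows. `copy_by_dwords` is a hypothetical name, and the 32-bit temporary merely stands in for a register such as EAX; in real code `memcpy` or `rep movsb` would do the whole job.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies n bytes from src to dst, four bytes at a time, then finishes
// the remaining 0..3 bytes one by one. The regions must not overlap.
void copy_by_dwords(unsigned char* dst, const unsigned char* src, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        std::uint32_t chunk;                 // stands in for a register like EAX
        std::memcpy(&chunk, src + i, 4);     // load four bytes
        std::memcpy(dst + i, &chunk, 4);     // store four bytes
    }
    for (; i < n; ++i) dst[i] = src[i];      // leftover single bytes
}
```

For the 9-byte "My string", the first loop runs twice (8 bytes) and the second loop copies the final 'g'.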

Last edited by revolution on 02 Jun 2018, 10:30; edited 1 time in total

02 Jun 2018, 01:08

Mino

Joined: 14 Jan 2018
Posts: 163

Mino 02 Jun 2018, 07:54

So I'm limited to 16 bytes?
Is this not possible, for example?

Code:

mov dword [eax+80656]

And if we had to translate that into C, would we have this?

Code:

int MyArray[] = { 0, 1, 2, 3, 4, 5 };

I would also like to know why we "go up" into the positives (+0, +4, +8, ...) rather than into the negatives (-0, -4, -8, ...).
Thank you :)

02 Jun 2018, 07:54
