flat assembler
Message board for the users of flat assembler.
Any way to tell the assembler to use a specific base reg?
JohnFound 06 Jul 2011, 12:53
Read about "virtual" directive.
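A minimal sketch of what the "virtual" suggestion could look like, assuming the field names MYFIELDA and MYFIELDB, ebx as the base register, and a two-dword buffer called some_buffer (all hypothetical, none of them taken from the thread). Labels defined inside a "virtual at ebx" block get register-relative addresses, so later references need no explicit "[ebx+...]":

```fasm
use32

; Hypothetical mapping: two dword fields addressed relative to ebx.
; A virtual block emits no bytes; it only defines labels.
virtual at ebx
  MYFIELDA dd ?                         ; address ebx+0
  MYFIELDB dd ?                         ; address ebx+4
end virtual

        mov     ebx, some_buffer        ; set the base register at run time
        mov     eax, [MYFIELDA]         ; assembles as mov eax, [ebx]
        mov     [MYFIELDB], eax         ; assembles as mov [ebx+4], eax

some_buffer rd 2                        ; hypothetical storage for the two fields
```

Changing the base register then means editing only the "virtual at" line. The labels are ordinary global names, so mapping the same names onto a second register elsewhere in the same source would be rejected as a redefinition.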
JoeCoder1 06 Jul 2011, 13:19
Thanks John. That looks like it might work for me. I'll play around with it a little and update the thread. If anybody else has suggestions feel free to chime in as well.
r22 06 Jul 2011, 13:38
Ahh, IBM S/370 mainframe assembly. How nostalgic.
@JoeCoder1 in x86 you don't have to worry about mapping data to registers to access it. You can read data right from a memory address with the MOV opcode.
Code:
.data
;; 8 bytes of data with first byte set to 0x0 and last byte set to 0x7
myfielda db 0, 1, 2, 3, 4, 5, 6, 7
.code
MOV eax, 0 ;; eax = 0
MOV ebx, dword[myfielda] ;; ebx = 0x03020100
ADD eax, 4 ;; eax = 4 (0 + 4)
MOV ecx, dword[myfielda + eax] ;; ecx = 0x07060504
revolution 06 Jul 2011, 13:49
I doubt virtual is the way to go unless you want to manually copy the generated code into the main code stream later.
Perhaps instead, assuming you still want a symbolic label to represent a register, then you can use equ:
Code:
MYFIELDA equ eax ;assign
;...
restore MYFIELDA ;finished
JoeCoder1 06 Jul 2011, 13:52
Hi r22. Nostalgia hell, I use this stuff every day for my job!
Anyway thanks, yeah I realize that. What I want to do is something I can't yet express in fasm, because I don't know it well enough. In NASM it would look like:
Code:
struc mystruc
fielda resd 1
fieldb resd 1
endstruc
and then
Code:
mov eax,mystruc
mov dword [eax+fielda],fieldc
and then
Code:
mov eax,mystruc ; tell the assembler eax is the base address for mystruc
mov dword fielda,fieldc
I realize this is not the most brilliant example because there is no form of mov storage,storage, but you get the general idea, I hope. Basically the desire is to be able to specify names and have the assembler resolve the base address automagically without me having to code [eax+offset_name]. offset_name by itself would be a lot cleaner where you have a lot of code referring to a structure or control block.
JoeCoder1 06 Jul 2011, 13:53
revolution wrote: I doubt virtual is the way to go unless you want to manually copy the generated code into the main code stream later.
That looks interesting but I'm not sure I can get it to work. If I had a structure or even fields aligned in storage somewhere I could certainly start with that, but I think I would have to code an equate for every symbol, is that correct? How would you code references to the structure in my previous post? Thanks!
revolution 06 Jul 2011, 13:58
If you are using a structure then you should do the assignment inside the structure definition. But your code above does not show a structure, just some random area of memory, so I was working from that. Generally I envisaged this:
Code:
mov eax,some_memory_address
MYFIELDA equ eax ;assign
mov ebx,[MYFIELDA + structure.member]
restore MYFIELDA ;finished
JoeCoder1 06 Jul 2011, 14:03
That's the way structures are used in NASM, I have that working fine now. What I don't have is a way to eliminate having to code the "[eax+" portion in every reference.
I am trying to avoid having to offset at all, other than in the data description. What I am looking for is a way to only code the name, and have the "[EAX+" be understood by the assembler. For example in my System Z example in the first post, I don't have to code MYAREA+MYFIELDA. I just code MYFIELDA and the assembler knows from my USING statement that MYFIELDA is offset from R5. Thanks.
revolution 06 Jul 2011, 14:12
You can do the virtual thing, like JohnFound first mentioned, with a formal structure, but it needs some setup first and is not particularly flexible if you want to change registers in another piece of code using the same structure.
Code:
struc MYSTRUC {};structure definition goes here
virtual at eax
MYSTRUC@eax MYSTRUC
end virtual
;...
mov ebx,[MYSTRUC@eax.member]
JoeCoder1 06 Jul 2011, 15:16
Thanks I'll try the ideas you guys suggested!
Edit: fixed my nonsensical attempt at a sentence
Last edited by JoeCoder1 on 07 Jul 2011, 11:00; edited 1 time in total
r22 06 Jul 2011, 15:26
The USING and DROP are a high level construct that would have to be emulated with a MACRO in FASM or some other pre-processing step.
I'm not very good with FASM's macro language so I don't even know if it would be possible to enumerate through a structure's symbols and re-equate them to REG+symbol_offset. But trying to compare DSECT to a STRUC is a problem itself. DSECT is more primitive without unions or cascading.
Code:
.code
MOV [MYFIELDB], ecx ;; ST r4, MYFIELDA
.data
MYAREA: ;; label
MYFIELDA rd 1 ;; MYAREA + 0
MYFIELDB rd 1 ;; MYAREA + 4
The time this won't be sufficient is when you have multiple DSECTs with the same symbol names in them.
Code:
.data
MYAREA:
FIELDA rd 1
MYAREA2:
FIELDA rd 1 ;; name is already defined
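The name clash can be sidestepped with qualified names, at the cost of the prefix that the unqualified style is meant to avoid. A minimal sketch, assuming hypothetical instance labels area1 and area2 based on ebx and esi (the struc names follow the example above; everything else is made up for illustration):

```fasm
use32

; Both layouts use the same member name. Inside a struc, a name that
; starts with a dot is prefixed with the label of the instance.
struc MYAREA {
  .FIELDA rd 1
}
struc MYAREA2 {
  .FIELDA rd 1
}

virtual at ebx
  area1 MYAREA                          ; defines area1.FIELDA at ebx+0
end virtual

virtual at esi
  area2 MYAREA2                         ; defines area2.FIELDA at esi+0
end virtual

        mov     eax, [area1.FIELDA]     ; assembles as mov eax, [ebx]
        mov     [area2.FIELDA], eax     ; assembles as mov [esi], eax
```

Each instance carries its own base register, so the two FIELDA members no longer collide, but every reference has to spell out the instance name.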
JohnFound 07 Jul 2011, 06:32
I am not sure I fully understand the question, but if the talk is about using structures in assembly - here is what I use.
1. Use "struct/ends" macro instead of "struc {}" directive: Code:
struct MyStructure
.MyField1 dd ?
.MyField2 rb $40
ends
2. Then, when we have an instance of this structure somewhere in memory, we can use the following (for me it is not the best solution - see below):
Code:
; ecx points to the data of type MyStructure
virtual at ecx
MyInstance MyStructure
end virtual
mov [MyInstance.MyField1], 123 ; equal to "mov dword [ecx], 123"
mov al, [MyInstance.MyField2] ; equal to "mov al, byte [ecx+4]"
3. The above solution is IMHO not good at all, because it makes the code obfuscated and not readable. You must know the context in order to understand what MyInstance is. The better solution is always the simple one:
Code:
struct MyStructure
.MyField1 dd ?
.MyField2 rb $40
ends
;assume that ecx points to the data of type MyStructure
mov [ecx+MyStructure.MyField1], 123
mov al, [ecx+MyStructure.MyField2]
In order to understand how the "struct" macro works, here is the above code without using macros:
Code:
struc MyStructure {
.MyField1 dd ?
.MyField2 rb $40
}
virtual at 0
MyStructure MyStructure
end virtual
;assume that ecx points to the data of type MyStructure
mov [ecx+MyStructure.MyField1], 123
mov al, [ecx+MyStructure.MyField2]
JoeCoder1 07 Jul 2011, 07:14
r22 wrote: The USING and DROP are a high level construct that would have to be emulated with a MACRO in FASM or some other pre-processing step.
I believe, but I am not sure, that this has to be done in the assembler proper, although maybe it could be jury-rigged with virtual like John suggested. What it actually has to do is assign offsets and addresses based on user mappings, ideally without any coding considerations other than specifying the base address for the area to be mapped, however that is chosen to be implemented.
r22 wrote: But trying to compare DSECT to a STRUC is a problem itself. DSECT is more primitive without unions or cascading.
I don't understand what you meant because I have probably written about 250 lines of C code in my life! I have no idea what structures in C can do. If you mean STRUC in fasm, I have written less fasm than I have C! Two things I would reply on this topic: USING has been greatly expanded in Sys Z assembler and I don't think we need much of that functionality. I'm focusing on the very basic case of assigning a base register to use in generating instructions, as in my example. And as far as unions go, you can certainly do that with DSECTs and it is common practice to do it. I am not sure what you mean about unions, maybe it's not what I understand a union to be. I have no idea what you meant about cascading. Can you explain these topics a little more?
r22 wrote:
That's a problem, certainly. Because of the tight naming conventions in MVS, if we had a conflict like that I don't think we could deal with it in the assembler. For me it would be acceptable to not support that. Anyway I am not trying to compare DSECT to STRUC, I am suggesting there might be a way to simplify coding when you are dealing with control blocks with many fields. If we can accomplish this then it also helps code maintenance, because if you change the register you use to reference a control block with many fields then you don't have to painstakingly change every reference. You can't do a global change obviously, because that would affect every reference to the register you're offsetting from in the whole source file. The way they do it in MVS, all you have to do is change one assembler directive, and all the references will be changed automagically.
JoeCoder1 07 Jul 2011, 07:27
JohnFound wrote: I am not sure I fully understand the question, but if the talk is about using structures in assembly - here is what I use.
If I understand your example I don't agree it makes the code obfuscated. It's exactly the kind of abstraction necessary to deal with big programs and big control blocks without being concerned that every reference to a field in a control block has to be painstakingly coded on every line. If anything, it doesn't go nearly far enough to doing the job right (see further on). It also makes a change like going from using ebx to base a big structure off of to edx a trivial coding change, where if you did it by coding ebx+ or edx+ you have a lot of work to do and a lot of risk that you either miss an instruction or change something you didn't want to. If you have a way to tell the assembler how to map things, you just made a huge change into a one-liner. I agree that having to qualify names isn't optimal and I personally don't like it. IMHO if there is one structure (or if, as r22 says, the structures can't have members with the same name) then I would not support qualified names. If they are already supported, obviously it is not an option to remove it.
JohnFound wrote:
That's ok for two lines but it's ugly and it doesn't scale. If I know I'm basing MyStructure off of eax, then I have two questions: Why do I need to qualify names, and why do I have to code the structure name in every reference. These two things ought to be automated away so that assuming ecx points to a specific named structure, then I ought to be able to code
Code:
mov [MyField1], 123
mov al, [MyField2]
Let the assembler do the work. That's what it is for!
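One way the unqualified form above might be approximated with fasm's own macro facilities is sketched here. The macro names MyStructureFields and map_at are hypothetical helpers invented for this illustration, not part of fasm or its standard includes, and the sketch assumes ecx holds the address of the structure.

```fasm
use32

; Hypothetical helper: the field list, written without a name prefix.
macro MyStructureFields {
  MyField1 dd ?
  MyField2 rb $40
}

; Hypothetical USING-like helper: define the given fields relative to a register.
macro map_at base, fields {
  virtual at base
    fields
  end virtual
}

map_at ecx, MyStructureFields           ; the only line that names the base register

        mov     [MyField1], 123         ; assembles as mov dword [ecx], 123
        mov     al, [MyField2]          ; assembles as mov al, [ecx+4]
```

Because the labels are global, a second map_at with a different register would fail with a symbol redefinition, which matches the inflexibility revolution mentions for the virtual approach. The sketch also has no equivalent of DROP.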
JohnFound 07 Jul 2011, 09:08
JoeCoder1 wrote: That's ok for two lines but it's ugly and it doesn't scale. If I know I'm basing MyStructure off of eax, then I have two questions: Why do I need to qualify names, and why do I have to code the structure name in every reference. These two things ought to be automated away so that assuming ecx points to a specific named structure, then I ought to be able to code
It is true, but in practice, changing the work register is a very uncommon thing. That is because the registers in x86 architecture are pretty equal, so it is not actually important what register will be used for addressing and what for arithmetic. On the other hand, the proper structural approach to programming simply excludes big monolith blocks of code. Using the register explicitly in the instruction makes every instruction more readable, because it is absolutely clear that the instruction addresses not a static variable defined somewhere in the program, but some address that is calculated at run time.
Quote: If I know I'm basing MyStructure off of eax, then I have two questions: Why do I need to qualify names, and why do I have to code the structure name in every reference. These two things ought to be automated away so that assuming ecx points to a specific named structure, then I ought to be able to code
The above implicit use of local labels is possible in many languages (for example the Pascal "with" clause) but the only advantage it has is that it makes typing easier. It is a compromise with the readability of the code (in HLL as well). This concern was important in the past, but today, with the "auto-completion" features of modern editors, it is better to use full name qualifiers - there is no need to type them anymore. (Actually when I use Fresh to write programs, I hardly type maybe 1/3 of the code - the remaining 2/3 are added by the editor auto-completion.) Anyway, in FASM the above implicit use of labels can be achieved to some limited degree by using macros. But you have to create them yourself.
JoeCoder1 07 Jul 2011, 10:56
JohnFound wrote:
I understood several of the x86 gprs do have specific implied usage for string operations, stack operations, etc. But that is not really my point. The point is that with so few registers available, you are likely to have to use eax, etc. for many different purposes even in a small piece of code. Therefore, you will end up having many lines referring to eax when you have to change something, and only a few of those references may be affected by the change. That takes more effort than simply knowing all references for a given control block are based on a certain gpr. When would you ever change a base register? I imagine porting x86 to exploit x64 would be a good example, since you get 8 more real gprs to choose from. Using them properly is a huge win in performance sensitive code and a big win in program design too, since you don't have to constantly reuse registers or use the stack as much.
JohnFound wrote: On the other hand, the proper structural approach to programming simply excludes big monolith blocks of code.
I'm not sure what that means, but if you mean in general big monolithic blocks of assembly code are bad or inappropriate, I will have to disagree strongly, since I work on large products and I have seen the code for big pieces of MVS, and many if not most of them are structured that way, and it solves many problems over splitting things up in many small source files. That is not to say it is all one program, you can also write many functions and package them in one source file. Not to lose sight of the original issue, the proposition is not that we have huge source files therefore we need a way to get the assembler to help with addressing, but that there are so few gprs available, which means they are likely used for many things even in a small module, and so much extra possibility of error involved in writing or maintaining code could be eliminated by having the assembler aware of what base address was being used by passing it a directive. The idea is to localize base addressing at the storage mapping level, to scope it as it were, and to limit the scope of changes as much as possible.
JohnFound wrote:
Again, in my experience this is simply not true. If it were, then by implication all HLL is also unclear because they free the user from having to constantly code base+displacement address forms. The idea is to map out your control blocks and storage areas, give fields names, and then let the assembler help you as much as possible by associating names to locations you set, rather than you constantly having to code assembly like you were writing object code, which in this case is almost what is happening. This can be done at a higher level, it can make code easier to write, to debug, and to extend, and it can reduce errors overall. The key to assembler is not how much effort it takes to write code, but how much control you have. What I am suggesting does reduce coding effort and errors, but it does not take away any control. After all, nobody says if the feature was in fasm you would have to use it! FWIW since I work on large pieces of code and products that have high LOC and often have to jump in and fix something 20 years old that I never saw before today, I almost never work from the source. I go with what you would call a core dump, and I use listings extensively. They show me the mappings in force for every storage reference on the top of every page, and from the object code I can also tell immediately what's going on. Since listings are not so common in the UNIX world (I don't know about Windows) I know people are used to having things a lot different. I wish I could give you an example but the code is all proprietary. If I can find some old IBM code on the net I will try to assemble it and show you a listing. Maybe if I can do a better job illustrating how this works you will see the value in it.
JoeCoder1 wrote: If I know I'm basing MyStructure off of eax, then I have two questions: Why do I need to qualify names, and why do I have to code the structure name in every reference. These two things ought to be automated away so that assuming ecx points to a specific named structure, then I ought to be able to code
JohnFound wrote: The above implicit use of local labels is possible in many languages (for example the Pascal "with" clause) but the only advantage it has is that it makes typing easier. It is a compromise with the readability of the code (in HLL as well).
It's not a local label. In the assembler I use at work, I can instruct the assembler to map any area of storage with any mapping I want, and that includes making names available in the whole source file. In assembly, it's essential for reducing errors, improving readability, and reducing maintenance. And don't get me wrong, this is not my idea I am suggesting, and it is not new, it's been working since the 1960s in an environment where assembler is the main language! So you can argue it doesn't belong in fasm, or you don't need it on x86, and I can't say you're wrong. But you can't say it doesn't help in assembler because we know it does! And we don't have to use it, but we usually do! About the only time we don't is when we are hopping through a control block chain and only need a pointer field in each one. In that case we do code base displacement, but that is about the only time. The assembler lets you name fields, let it do its job!
JohnFound wrote: Anyway, In FASM the above implicit use of labels can be achieved in some limited degree by using macros. But you have to create them yourself.
I'll look at it, but from my experience the feature I'm talking about is essential to writing clean, maintainable code. I think any time you can get the assembler to automate something you should probably consider it. Not much assembly is written on x86, but on the machines I work on almost everything on the system side is, we don't use C. So we have a pretty long history of how to work and how not to work, at least in that environment. Thanks for the dialog! I will try to get virtual to work as I would like. Thanks for suggesting it.
revolution 07 Jul 2011, 11:12
The problem I foresee with hidden register names is this:
Code:
mov eax,something1
mov ebx,something2
mov ecx,something3
mov edx,something4
mov [structure1.member1],eax
mov [structure2.member1],ebx
mov [structure3.member1],ecx
mov [structure4.member1],edx
IMO using the assembler to hide stuff to save a bit of typing is asking for trouble later. I don't see such code as "clean code" at all, but instead it appears to me like obfuscated code. Indeed the act of hiding something is the definition of obfuscation!
JoeCoder1 07 Jul 2011, 11:55
You're misunderstanding what I am suggesting. From your comments I would conclude you should not use macros, since they "hide" stuff. It's a basic principle of engineering that there is good hiding and bad hiding, but in any case what I have been suggesting has no relation to hiding. That is not what it is about.
What it is about is automating what should be automated and making maintenance easier while strengthening the code at the same time. Those things should be among the basic goals of any assembler. I can tell you, as somebody who has used this exact feature that I have been describing, for the last 30 years in a few dozen million LOC, that it hides nothing. In fact it exposes possible conflicts when used properly by causing the assembler to issue a warning when a name could be resolved to more than one base register. It's one of the first things that bothered me when I started looking at x86, I couldn't believe there wasn't a more convenient and safe notation for structure references than having to code explicit base registers and offsets in every reference. You should know, there is a better way.
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.