flat assembler
Message board for the users of flat assembler.
radarblue 22 Oct 2016, 15:26
Good day. I have another issue.
Inlines. This is a C version of the above-mentioned program, which was assembled in EMU8086. I read that inline ASM in C is compiler-specific, with its own syntax for each compiler. I am using the GCC compiler and the CodeLite IDE for C. The code from my book (page 202) doesn't compile. I looked on the wiki and found out that instead of writing:
    _asm MOV BL , 'A'        // ( I got compiler errors )
I have to write something like this:
    asm "MOV %%BL,'%%A';"    // (GCC compiler tolerated that)
Here is the (NO-GO) program code.
Code:
    //swap_bytes_inline C-ansi program, TDM-GCC-32.
    #include <stdio.h>
    //#include "stdafx.h" compiler error.
    int main (void)
    {
        //declare
        char temp;
        char rslt1,rslt2;
        //switch to assembly
        asm (
            // ("MOV BL,0"); // compiler error
            "MOV %%BL,%%0;" // nulling the register
            "MOV %%BH,%%0;" // nulling the register
            "MOV %%BL,'%%A';"
            "MOV %%BH,'%%B';"
            // swap bytes
            "MOV %%temp,%%BL;" // A in temp
            "MOV %%rslt1,%%BH;" // B in rslt1
            "MOV %%BH,%%temp;" // A in BH
            "MOV %%rslt2,%%BH;" // A in rslt2
        );
        printf ("BL = %c, BH = %c\n",rslt1,rslt2);
        // terminate
        return 0;
    }
The compiler goes through (without red warnings, only yellow warnings). But I got problems displaying the result once the inline ASM was done. For the line printf ("BL = %c, BH = %c\n",rslt1,rslt2); the message is that rslt1 and rslt2 are uninitialised ... the registers are not transmitted out to the C program. Does anybody have an idea?
I obtained some code from this place, LINK: http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html
That gave an idea of how it's done. With the neat name Foobar!
Code:
    #include <stdio.h> //TDM-GCC-32 compiler
    int main()
    {
        int foo = 2, bar = 3, crow = 4;
        asm ("add %%bx,%%ax"        // Define Instruction and Registers
            :"=a"(foo)              // Define Destination
            :"a"(foo), "b"(bar)     // Destination, Source
        );
        asm (                       // New asm instruction for new Register operation
            "add %%dx,%%cx"         // must copy from CX to DX
            :"=c"(crow)             // assign new names for new integers
            :"c"(crow), "d"(bar)
        );
        printf("foo+bar = %d\ncrow+bar = %d", foo, crow );
        return 0;
    }
    //no result with register AL and AH, only AX, BX .
    //no result with BX to CX .
    //no result with ADD AX, BX, must be ADD BX,AX .
    //no result with more than one register MOV or ADD per _asm instruction .
Last edited by radarblue on 22 Oct 2016, 19:13; edited 2 times in total
|||
Trinitek 22 Oct 2016, 18:26
There's a couple of problems here. GCC understands AT&T syntax, which has the operands flipped.
Code:
    ; Intel syntax (used by FASM)
    mov bl, 0x32
    ; AT&T syntax (used by GAS and GCC in-line assembly)
    mov $0x32, %%bl
Also, you need to tell GCC which variables you want to modify. See this: http://ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#s5
I see you're trying to use the BH register. Further down the page is a list of register constraints: http://ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#s6 which only lists E?X, ?X, and ?L registers. So it does not appear that you can assign, for example, the BH register to an output variable without putting the result in another register (like AL) and then working with that.
And, you do not put %% in front of constants in GCC-flavored AT&T syntax. As an example, if you want to zero a register, you would do the following:
Code:
    mov $0, %%bl ; base 10 number
    mov $0x0, %%bl ; base 16 number
Just for the sake of providing another example, here's something I quickly wrote that will add 5 to a variable.
Code:
    #include <stdio.h>
    int main(void)
    {
        char something = 45;
        printf("%i\n", something);
        asm(
            "add $5, %%bl;"
            : "=b" (something) /* output in BL goes in variable 'something' */
            : "b" (something)  /* input variable 'something' goes in BL */
        );
        printf("%i\n", something);
        return 0;
    }
|||
radarblue 22 Oct 2016, 18:50
I see. I noticed something strange about the AT&T and Intel syntax, particularly the %%. The compiler erred. And the destination and source operands really are flipped around! In addition, one has to put the $ prefix in front of constants. That's very interesting.
And I also notice the AH, BH, CH, DH registers are not listed. Maybe try out an Intel compiler ... I see this listed: https://software.intel.com/en-us/intel-parallel-studio-xe priced at $699. That's kind of steep, and beyond my scope. Maybe next Christmas. Thanks for a good example.
Last edited by radarblue on 22 Oct 2016, 21:30; edited 2 times in total
|||
Trinitek 22 Oct 2016, 20:58
Since you're interested in 16-bit assembly, I think OpenWatcom is worth mentioning. It uses Intel syntax for its in-line assembly and can also build DOS and Win16 executables. The in-line is different from the way GCC does it, so you'll have to check the manual, of course.
And it doesn't cost $699.
|||
radarblue 22 Oct 2016, 21:31
Very well, thanks a million!
I will check out OpenWatcom! However, concerning the example: it works with numbers, but what about characters? In this one I get errors on the ASM line.
Code:
    #include <stdio.h>
    int main(void)
    {
        char ch ='A';
        printf("%c\n",ch);
        asm (
            "mov $ch,%%al;"  /* Source 'ch' to destination AL */
            : "=b" (ch)      /* output in AL goes in variable 'ch' */
            : "b" (ch)       /* input variable 'ch' goes in AL */
        );
        printf("%c\n",ch);
        return 0;
    }
|||
Trinitek 22 Oct 2016, 22:32
You cannot access the ch variable like that. Remember that the two ':' lines are responsible for interfacing your assembly code to the C code.
Furthermore, read the register constraints link again. Notice that "a" corresponds to the AX registers, and "b" corresponds to the BX registers, and so on.
|||
radarblue 23 Oct 2016, 10:38
You are correct. I managed to get the character moved into a register like this. When I also do as you say, and put the destination and source concerning AX to "a", it functions! YES
Code:
    #include <stdio.h>
    /* TDM-GCC-32 compiler. AT&T syntax . */
    int main(void)
    {
        char ch1 ='A',ch2,ch3;
        printf ("C printf =\t\t%c\n",ch1);
        asm (
            "mov $'T',%%ax;"        /* MOV $'char' or number to destination %% register */
            : "=a" (ch1)            /* (destination) use a concerning EAX, and b for EBX */
            /* : "a" (Source) is input manually in this case, and can be left out */
        );
        asm (
            "mov %%ax,%%bx;"        /* MOV from AX to BX ( opposite order as Intel Syntax ) */
            "sub $46,%%bx"          /* ASCII "T" (84) - 46 = ASCII "&" (38) */
            : "=b" (ch2)            /* (destination) b for BX register */
            : "a" (ch1), "b" (ch2)  /* (source a for AX) to (destination b for BX) */
        );
        asm (
            "mov %%bx,%%cx;"
            "add $46,%%cx;"
            : "=c" (ch3)
            : "b"(ch2), "c" (ch3)
        );
        printf("ASM printf ch1 =\t%c\n",ch1);
        printf("ASM printf ch2 =\t%c\n",ch2);
        printf("ASM printf ch3 =\t%c\n",ch3);
        return 0;
    }
|||
radarblue 23 Oct 2016, 13:26
And this would be the program that I inquired about, now working properly with the syntax.
Thank you Trinitek, you really helped me quite a bit!
Code:
    #include <stdio.h>
    // TDM - GCC - 32 . AT&T syntax
    // Swap 2 bytes between registers using inline ASM in C.
    int main ()
    {
        char ch1,ch2,temp;
        asm (
            "mov $'M',%%ax;"
            :"=a" (ch1)
        );
        asm (
            "mov $'A',%%bx;"
            :"=b" (ch2)
        );
        asm (
            "mov %%bx,%%cx;"
            :"=c" (temp)
            :"b" (ch2), "c" (temp)
        );
        printf ("Before Swap :\t%c %c\n",ch1,ch2);
        // A to B, then TEMP to A
        asm (
            "mov %%ax,%%bx"
            :"=b" (ch2)
            :"a" (ch1), "b" (ch2)
        );
        asm (
            "mov %%cx,%%ax"
            :"=a" (ch1)
            :"c"(temp),"a" (ch1)
        );
        printf ("After Swap :\t%c %c",ch1,ch2);
        return 0;
    }
This particular inline syntax using GCC has some serious limitations.
1. All the Low Order registers are disabled, AL, BL, CL, DL .
2. Cant copy or do instructions between AX and EAX, both use the same "constraint" .
This results in very few registers that can be accessed or operated on.
Now concerning the suggestion of OpenWatcom, I got a Fortran77 download by mistake. I didn't check the recent builds. My bad.
I think I am on the right track! With limitations in GCC I mean first of all the quirky syntax, and that GCC lacks support for a lot of mnemonics like MOVSX, MOVZX, CMOVC ... Reading all this really makes me pale!
|||
Trinitek 24 Oct 2016, 02:22
Quote: 1. All the Low Order registers are disabled, AL, BL, CL, DL .
Quote: 2. Cant copy or do instructions between AX and EAX, both use the same "constraint" . Resulting in very few registers that can be accessed or operated on .
Which brings me to another example I would like to present for this scenario, using compound bytes in AX as opposed to compound words in EAX, but the concept still applies. Here, a struct encapsulates two chars. Then, the struct itself is passed in as a single argument to the in-line assembly block, into AX. The assembly then performs the logic to extract the char that it needs from the struct.
Code:
    #include <stdio.h>

    /*
     * Here, we define a structure that groups 2 chars together as an
     * addressable unit.
     *
     * Due to the nature of the way x86 processors copy multi-byte values
     * to and from memory (this trait is known as endianness), the high
     * and low bytes are flipped here in this structure. The x86
     * architecture is a "little-endian" architecture, which means that
     * bytes are read from and written to memory backwards. As an example,
     * a value of 0xABCD in the AX register, such that AH contains 0xAB
     * and AL contains 0xCD, when written to memory, will be visible as
     * 0xCD, 0xAB if you were to inspect the program with a debugger.
     */
    typedef struct {
        char low;
        char high;
    } grouped_char;

    /*
     * This function returns the high byte of a grouped char.
     *
     * It uses the SHR instruction instead of accessing AH, but both work
     * just as well.
     */
    char getHigh(grouped_char chars)
    {
        char result;
        asm("shr $8, %%ax;"     // bitshift AH into AL
            "mov %%al, %%bl;"   // BL <-- AL
            : "=b" (result)     // resulting char is found in BL
            : "a" (chars)       // pass char group parameter into AX
        );
        return result;
    }

    /*
     * This function returns the low byte of a grouped char.
     */
    char getLow(grouped_char chars)
    {
        char result;
        asm("mov %%al, %%bl"    // BL <-- AL, ignoring AH
            : "=b" (result)     // resulting char is found in BL
            : "a" (chars)       // pass char group parameter into AX
        );
        return result;
    }

    int main(void)
    {
        grouped_char myChars;
        myChars.high = 'A';
        myChars.low = 'B';
        printf("Group: %c %c\n", myChars.high, myChars.low);
        printf("High: %c\n", getHigh(myChars));
        printf("Low: %c\n", getLow(myChars));
        return 0;
    }
Glad to help!
Last edited by Trinitek on 24 Oct 2016, 15:49; edited 1 time in total
|||
revolution 24 Oct 2016, 13:29
Trinitek wrote: the AH register is inaccessible in long mode.
Code:
    use64
    mov ah,al
Code:
    flat assembler version 1.71.54 (3145344 kilobytes memory)
    1 passes, 2 bytes.
|||
Trinitek 24 Oct 2016, 15:52
Woops. I had one too many things on my mind and recalled the 64-bit AH encoding caveat incorrectly.
|||
revolution 25 Oct 2016, 07:02
Trinitek wrote:
Isn't it simpler to remove SHR completely?
Code:
    char getHigh(grouped_char chars)
    {
        char result;
        asm("mov %%ah, %%bl;"   // BL <-- AH
            : "=b" (result)     // resulting char is found in BL
            : "a" (chars)       // pass char group parameter into AX
        );
        return result;
    }
|||
radarblue 25 Oct 2016, 19:32
Quote: 1. All the Low Order registers are disabled, AL, BL, CL, DL .
Trinitek, you are correct, the low-order bytes are permitted in GCC inline. I got errors when addressing AL and AH with the constraint "a" in the same program, and with MOV from AL to AH in the same bracket or program, and made a premature conclusion. To copy between the low and high byte you advise bit shifting.

Concerning endianness, I interpret it like this. Presume one reads from left to right. In usual numerical values the MSB is on the left and the LSB is on the right. How we normally understand numerical values is little endian, following the usual positional number system. I actually thought big end and little end were on a bit level, but now understand it's on a byte level. The bits are not reversed from little endian ... even though the computer has a big endian architecture.

First thought (wrong, endianness doesn't reverse the bits):
    00000111 = 7 decimal little endian
    11100000 = 7 decimal big endian
Second thought (correct, reverse the address location of a byte):
    REGISTER
    MOV AX, 0xABCD
    ---------------------- AX ----------------------
    AH 00010101 (AB, 15h). AL 00011001 (CD 19h)
    Read on the instruction pointer list
    1. AL 00011001 (CD 19h)
    2. AH 00010101 (AB, 15h)
(This may be a misunderstanding, for I can't even put an immediate value (in EMU8086) into the register AX, just AL then AH, making the instruction pointer display the values in the normal order...)

NB: I know I speak to people who know better than me about the topic, and I by no means try to say how it is. I say what I think is correct, or how I perceive it at the moment.

At first I thought this had something to do with the mechanical CPU architecture of translating a horizontally written line of values into a vertical segment or register stack order. Reading from the top down, it makes sense. And it seems the x86 prioritizes its instructions from the top down.
The displacement is also displaced from the reference base 0, downward in the segment, as far as I have gathered. And the theory explains that the high address is on the bottom, and the low address is on the top of a stack. A stack is "stacked" in x86 like stacking wood: one starts by putting a log on the bottom (high address), then builds the stack upwards toward the top (low address). One must not confuse LOW address with the bottom of the stack!!!

The theory (that I rely on) also explains that the stack segment stores its bytes in big endian (?), with an example in my book that I haven't read through ... however. The following is loosely based on an example from my book, p273.
    SS = 1050h
    SP = 0002h
    SEGMENT
    MOV AX,0xABCD
Presume that the value of AX is moved to the stack segment and the stack pointer points to the value of AX in SS. (Understand that I am striving to understand the concept.)
    ________________Stack Segment_______
    ________________LoByte__HiByte_______
    SS_______10500_________________10501
    SP TOS___10502___CD______AB___10503   Low stack address
    _________10504_________________10505   High stack address (number increases as the list progresses downwards)
Machine philosophy

On the Intel page I see they talk about the benefit of little endian architecture instead of "mainframe" big endianness. Trawling the net, it seems the benefit has to do with binary arithmetic, relating to the carry bit. One has to resolve the LSB to obtain the carry bit for the MSB.
LINK1 : http://softwareengineering.stackexchange.com/questions/95556/what-is-the-advantage-of-little-endian-format/95854#95854
LINK2 : intel

Concerning the lost mnemonic instructions in GCC and EMU8086, I thought this: just proceed with the C, and the inlines with the supported mnemonics in GCC. For the instructions not supported by GCC, translate the inline examples as if they were ordinary assembly programs in EMU8086 to learn the fundamental procedure.
I then discovered that EMU8086 also has no support for the mnemonics MOVSX, MOVZX, and CMOVC. But I see when reading the IA-32 Intel documentation that these are valid instructions.
LINK3 : http://flint.cs.yale.edu/cs422/doc/24547112.pdf
In there see chapter 1.3.1 and compare it to the above. It's all opposite!?
LINK4 : x86 assembly language and C fundamentals PDF. rar. ( see page 272-273 ) upload.evilzone.org/download.php?id=5895803&type=rar

The emulator hasn't got all the instructions coded in. Well, I understand I am not exactly writing directly to the CPU (firstly because I have an Intel Dual Core and I emulate an 8086), but the GCC/EMU compiler/program is programmed by somebody, and they left something out.
Last edited by radarblue on 24 Nov 2016, 20:05; edited 4 times in total
|||
neville 25 Oct 2016, 23:42
Hi radarblue
I hope you're not over-complicating the endianness thing in your mind! Endianness only refers to the order in which bytes are stored in MEMORY. It has nothing to do with CPU registers - the MSB of AX is always AH, the LSB is always AL.
Little-endian is the logical choice, as the address of a value in memory is the same, irrespective of the byte-width of the value. In big-endian the address of the LSB depends on the byte-width, which must therefore be known beforehand, so it is less flexible.
Little-endian is assumed in all X86 (Intel) processors. But if a particular data structure uses a big-endian format, the X86 CPU must re-order the bytes after loading the value into a register, e.g.
Code:
    MOV AX,[BIGENDIAN]   ;the LSB is read to AH, MSB to AL due to the byte order in memory
    XCHG AH,AL           ;LSB now in AL, MSB in AH as required for further processing...
Big-endian is supposedly more "natural" for "humans" to read, but only for those humans who read from left to right!! Ok, that applies to those of us in the English-speaking world etc, but of course many languages are read from right to left. So in Hebrew, for example, little-endian would be considered both logical AND natural!
BTW, I'm not sure why you've translated ABh to 15h, and CDh to 19h ?? In your example, AH holds ABh = 10101011 in binary, AL holds CDh = 11001101 in binary.
Also, MOV AX,ABCDh is an immediate addressing operation which has nothing to do with the SS and SP registers. But the code in memory would be 3 bytes stored in the following order:
    B8H, CDH, ABH
B8H is the opcode for MOV AX,immediate. The immediate value ABCDh is stored in memory in little-endian order of course!
_________________
FAMOS - the first memory operating system
|||
radarblue 26 Oct 2016, 18:57
I think I get it.
It has to do with the direction in which the bytes are read in a line, presuming the MSB (Most Significant Byte) is to the left and the LSB is to the right.
Big endian: reads bytes from register to RAM from left to right (like an English-language text string). The MSB is stored in RAM at the low address, the LSB at the high address.
Little endian: reads bytes from register to RAM from right to left (like the Hebrew language, and how long arithmetic is performed: one starts calculating with the LSB). The LSB is stored in RAM at the low address, the MSB at the high address.
Quote: BTW, I'm not sure why you've translated ABh to 15h, and CDh to 19h ??
I added them for some reason ... A+B then C+D. The text above should have stayed in my own notes. Got to spend some time in the program until I know what I am talking about. Got carried away. Thanks
|||
revolution 26 Oct 2016, 23:14
Try not to think in terms of left and right. Memory has no specific ordering like that. Little endian has the least significant values at the lowest numerical address. The higher the address value then the higher the significance within the number.
|||
revolution 27 Oct 2016, 02:51
radarblue wrote: ... EMU8086 has no support for the mnemonics MOVSX, MOVZX, and CMOVC. But see when reading the IA-32 intel documentation that these are valid instructions.
8086 is 16-bit only. No support for movzx, pusha, setcc, cmovcc, MMX, SSE, AVX, etc. No 32/64-bit registers, FS or GS.
|||
rugxulo 27 Oct 2016, 07:34
radarblue, I feel like you don't have focus on what exactly you're trying to learn. Is it general programming or 8086/386 or C or Windows or ... ?
Honestly, as much as I hate to agree, DOS is indeed not a good first platform unless you really enjoy digging into obsolete software (almost archaeology) like I do. Hence modern AMD64 assembly books (like this, which I haven't read) might be a better (long-term) idea.
Nevertheless, FreeDOS 1.2 is due very very soon, and VirtualBox or QEMU work well, and you can easily run immediate cpu instructions directly when using certain debuggers (e.g. GRDB or D86).
On the non-x86 side, you might be better off learning ARM or PPC (or even MMIX or RISC-V). Though I'd honestly recommend (Free)Pascal if all you want to do is learn general computing principles. In particular, it supports inline asm, and it even has an i8086-msdos (cross-)target nowadays. (Just FYI, "asmmode intel" is default for "mode tp" [turbo].)
Having said that, you can read The 8086 Primer (2nd ed.) online for free, check 80186 differences on Wikipedia, and learn 8086/186 instruction encodings / disassembly here. There's also plenty more stuff out there (e.g. disassemblers), but that should give you some pointers.
I hate to overload you with too much info or suggest that you must absolutely read it all, but indeed it should give you an idea of where to focus.
|||
neville 27 Oct 2016, 09:29
radarblue:
I've just read all the posts in this thread from the beginning and I see a few people have been suggesting you take a different course for various reasons. They are all well-meaning, but you've described your favourite book, and I say, just go for it! And it will most likely be the best option for you anyway.
Every one of us has a different mix of experiences and interests, which gives each of us our own unique view of the world of computing. There is no right way or wrong way! There is just the way that appeals to us individually.
The learning curve you face is quite steep, but it is very easy to make it even steeper through information overload. So there's nothing wrong with limiting your focus initially to a 16-bit real mode environment (which incidentally to this day every X86 CPU still sets up on power-up). Because of all the inconsistencies (idiosyncrasies?) in the X86 architecture, a good grounding in the 16-bit fundamentals won't ever do you any harm!
_________________
FAMOS - the first memory operating system
|||
Copyright © 1999-2025, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.