flat assembler
Message board for the users of flat assembler.

Sometimes putting "h" at end of number doesn't work
Ben321



Joined: 07 Dec 2017
Posts: 70
Ben321 29 Sep 2018, 19:35
To put hexadecimal numbers into the assembler, you can write something like 3Eh or 0x3E. While this works for hex numbers that start with a numerical digit like 3, it doesn't work for ones that start with a letter digit like E. For example, while 0xE3 works, E3h does NOT work in FASM. The popup error message box reports it as an unrecognized label. It seems to think that because the string starts with an E it's a non-numerical string, and it ignores the h at the end.

To fix this bug, you need to make FASM look at the end of the string for an h as one of the first things it does when analyzing a string that otherwise contains only valid hexadecimal digits, so that it can automatically detect it as a hexadecimal number. This is important because in ASM code the standard designator for a hex number is NOT "0x" at the beginning of the number (as it is in C or C++), but rather "h" at the end of the number.
DimonSoft



Joined: 03 Mar 2010
Posts: 1228
Location: Belarus
DimonSoft 29 Sep 2018, 19:58
Ben321 wrote:
To put hexadecimal numbers into the assembler, you can write something like 3Eh or 0x3E. While this works for hex numbers that start with a numerical digit like 3, it doesn't work for ones that start with a letter digit like E. For example, while 0xE3 works, E3h does NOT work in FASM. The popup error message box reports it as an unrecognized label. It seems to think that because the string starts with an E it's a non-numerical string, and it ignores the h at the end.

To fix this bug, you need to make FASM look at the end of the string for an h as one of the first things it does when analyzing a string that otherwise contains only valid hexadecimal digits, so that it can automatically detect it as a hexadecimal number. This is important because in ASM code the standard designator for a hex number is NOT "0x" at the beginning of the number (as it is in C or C++), but rather "h" at the end of the number.

And then you’ll see people complaining that FASM recognizes F3h as hexadecimal literal instead of an identifier:
Code:
F3h:
        ; Implementation of Function 3h goes here    

How would you define the rules for a valid identifier then?
Tomasz Grysztar



Joined: 16 Jun 2003
Posts: 8349
Location: Kraków, Poland
Tomasz Grysztar 29 Sep 2018, 20:04
Very early versions of fasm were universally case-sensitive and allowed numbers like E3h. From this followed that Ah was a valid hexadecimal number while ah was a register name. This was a very BAD idea and I quickly abandoned the experiment. I proceeded to implement the standard rule held by assemblers that a number must always start with a digit in 0-9 range, so you need to write 0Ah (and you are also allowed to write it as 0aH or 0ah, or 0AH).
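
As a rough illustration of that rule, here is a minimal sketch assuming fasm 1.x syntax. The label name E3h is made up for the example; the point is only that the leading character decides whether a token is read as a number or as a name.

```asm
; Sketch only, assuming fasm 1.x syntax; the label name E3h is hypothetical.
        mov     al, 0Ah         ; starts with a digit: the number ten
        mov     al, ah          ; starts with a letter: the register AH
        mov     bl, 0E3h        ; starts with a digit: the number 0xE3
E3h:                            ; starts with a letter: an ordinary label
        jmp     E3h             ; refers to the label, not to a number
```

Because the first character settles the question, the same source can contain both the label E3h and the number 0E3h without ambiguity.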
Ben321 29 Sep 2018, 20:11
Tomasz Grysztar wrote:
Very early versions of fasm were universally case-sensitive and allowed numbers like E3h. From this followed that Ah was a valid hexadecimal number while ah was a register name. This was a very BAD idea and I quickly abandoned the experiment. I proceeded to implement the standard rule held by assemblers that a number must always start with a digit in 0-9 range, so you need to write 0Ah.


What about when you are trying to refer to A0000000h? There's no higher hex digit than the 8th hex digit when working with 32-bit numbers. As far as I know, using "h" at the end of the hex string is standard for 32-bit numbers in assemblers. Or what if you are just using one byte of data at its maximum value of FFh? That should work, but it won't in FASM. I could write 0FFh, but 0FFh suggests that there are 3 nibbles of data being worked with, even though a byte (the data I'm working with) contains only 2 nibbles.

Again, it's important to stick with the industry standard for assemblers, that hex strings always use uppercase letters, and that the last symbol in a hex string is a lowercase h.
Tomasz Grysztar 29 Sep 2018, 20:29
Ben321 wrote:
Again, it's important to stick with the industry standard for assemblers, that hex strings always use uppercase letters, and that the last symbol in a hex string is a lowercase h.
Industry standard?! I was under the impression that fasm 1.0 was the only one insane enough to try this case-sensitive approach. I certainly know of no other "mainstream" x86 assembler that did it.

If you dislike that starting 0 so much, there is also a $FFFFFFFF syntax allowed.
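
For comparison, a short sketch of the hexadecimal spellings mentioned so far in this thread, assuming fasm 1.x. It only restates the notations already named above; the leading 0 marks the token as a number and does not change the size of the stored value.

```asm
; Sketch only, assuming fasm 1.x; the three dd lines define the same doubleword.
        dd      0FFFFFFFFh      ; h suffix, with the leading 0 that it requires
        dd      0xFFFFFFFF      ; C-style 0x prefix
        dd      $FFFFFFFF       ; $ prefix, no leading 0 needed
        db      0FFh            ; still one byte, despite the extra leading 0
```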
DimonSoft 29 Sep 2018, 20:35
Ben321 wrote:
Again, it's important to stick with the industry standard for assemblers, that hex strings always use uppercase letters, and that the last symbol in a hex string is a lowercase h.

Can you give a link to the standard you’re talking about?

Tomasz Grysztar wrote:
From this followed that Ah was a valid hexadecimal number while ah was a register name. This was a very BAD idea and I quickly abandoned the experiment.

Wow! That’s an even better example. Thanks.
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 20298
Location: In your JS exploiting you and your system
revolution 30 Sep 2018, 01:12
Some assemblers use the position of the text to decide if it is a label or a number. So if the text starts on the first column (col=1) then it is a label regardless of which character it begins with. So those assemblers allow labels like "0x3e" AND numbers like "0x3e" all mixed together in the same file.

Industry standard? I don't believe such a thing exists. Often the first assembler we encounter becomes the "normal" syntax and thus should be the "standard" syntax all others should follow. But once we encounter other assemblers and see that they are very different from each other we begin to understand that standards don't really exist.
revolution 30 Sep 2018, 03:24
In fasm we used to be able to specify the floating point number ".e", and the label ".e" was not possible.
Ben321 29 Jul 2024, 20:20
Tomasz Grysztar wrote:
Very early versions of fasm were universally case-sensitive and allowed numbers like E3h. From this followed that Ah was a valid hexadecimal number while ah was a register name. This was a very BAD idea and I quickly abandoned the experiment. I proceeded to implement the standard rule held by assemblers that a number must always start with a digit in 0-9 range, so you need to write 0Ah (and you are also allowed to write it as 0aH or 0ah, or 0AH).


Then using the XXXXXXXXh syntax, how do you recommend that the number FFFFFFFFh be written? The code "dd 0FFFFFFFFh" is not recognized as a valid hex number by the assembler, and it throws an error, because a 9-digit hex number is not valid for a doubleword value. So your recommendation of starting the hex number with a 0 doesn't work. It needs to work, because although 0xXXXXXXXX is valid syntax (and would help in this situation by writing 0xFFFFFFFF), it's not a syntax commonly used by assemblers. That's the C and C++ syntax for a hexadecimal number. So maybe you can fix the bug with an 8-digit letter-starting hex number that requires a 0 before the letter (now 9 hex digits in total) so that FASM doesn't error on it.


Alternatively, I had this idea. A simple way to tell if an identifier is a label or a hex number ending in "h" is to see if the last character is a colon. A label definition must have a colon immediately following the last character of the label. Also, the first character of a label must be the first non-whitespace character on the line it appears on. I think this is pretty standard for assemblers.

For example this is a valid code:
Code:
inc edx
MyLabel:
mov eax,1
add edx,eax
jmp MyLabel
    


This is also valid code:
Code:
inc edx
  MyLabel: mov eax,1
    add edx,eax
    jmp MyLabel
    



This is NOT valid code, due to the label definition not being the first token on the line:
Code:
inc edx MyLabel:
    mov eax,1
    add edx,eax
    jmp MyLabel
    



This is NOT valid code, due to the label definition not ending with a ":":
Code:
inc edx
  MyLabel
    mov eax,1
    add edx,eax
    jmp MyLabel
    



Using these simple rules you can make sure that the number A0h is not mistaken for a label called "A0h".
revolution 29 Jul 2024, 21:00
Ben321 wrote:
The code "dd 0FFFFFFFFh" is not recognized as a valid hex number by the assembler,
Yes it is.
Code:
~ cat test.asm 
dd 0FFFFFFFFh
~ fasm test.asm
flat assembler  version 1.73.31  (16384 kilobytes memory)
1 passes, 4 bytes.
~    
Ben321 30 Jul 2024, 01:16
revolution wrote:
Ben321 wrote:
The code "dd 0FFFFFFFFh" is not recognized as a valid hex number by the assembler,
Yes it is.
Code:
~ cat test.asm 
dd 0FFFFFFFFh
~ fasm test.asm
flat assembler  version 1.73.31  (16384 kilobytes memory)
1 passes, 4 bytes.
~    


Sorry. I thought that last time I tried that (a while ago) it didn't work.
revolution 30 Jul 2024, 02:01
It has always worked.
bitRAKE



Joined: 21 Jul 2003
Posts: 4016
Location: vpcmpistri
bitRAKE 30 Jul 2024, 21:09
Ben321 wrote:
Ben321 wrote:
The code "dd 0FFFFFFFFh" is not recognized as a valid hex number by the assembler,
Sorry. I thought that last time I tried that (a while ago) it didn't work.
Long days, it's fairly easy to get an extra 'F' in there. I like to break up runs: 0FFFF_FFFFh or 0FFFF'FFFFh. At a glance I can see what is what. The partitioning can follow however the data is structured to make it semi-self-documenting.

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup
Copyright © 1999-2024, Tomasz Grysztar.