flat assembler
Message board for the users of flat assembler.

[fasmg] parsing XML or Vulkan scrapings ...

bitRAKE 27 Mar 2021, 00:40
I've been hand-converting the Vulkan API into fasmg-friendly fragments, but I'm at the point of biting off larger pieces and want the full API. To make Vulkan accessible, Khronos publishes an XML spec called the registry:

https://github.com/KhronosGroup/Vulkan-Docs/blob/main/xml/vk.xml?raw=true
(the ?raw=true suffix allows a direct download, without going through the GitHub page)

It's a crazy file - there is C syntax (and LaTeX) mixed in with the XML.

So, here is my parsing template. It will cleanly parse the current revision (1.2.172).
Code:
; generate fasmg support includes from Vulkan API Registry XML spec
;       initial version 2021Mar26, bitRAKE (Rickey Bowers Jr.)
format binary as 'txt'

; 64-bit
define _PTR 'dq ?'

; 32-bit
;define _PTR 'dd ?'

virtual at 0
vk::file 'vk.xml'
file_length := $
end virtual
_POS = 0



macro _enum line&
end macro
macro _end_enum
end macro

macro _enums line&
end macro
macro _end_enums
end macro

macro _extension line&
end macro
macro _end_extension
end macro

macro _extensions line&
end macro
macro _end_extensions
end macro

macro _command line&
end macro
macro _end_command
end macro

macro _commands line&
end macro
macro _end_commands
end macro

macro _platform line&
end macro
macro _end_platform
end macro

macro _platforms line&
end macro
macro _end_platforms
end macro

macro _spirvcapabilities line&
end macro
macro _end_spirvcapabilities
end macro

macro _spirvcapability line&
end macro
macro _end_spirvcapability
end macro

macro _spirvextension line&
end macro
macro _end_spirvextension
end macro

macro _spirvextensions line&
end macro
macro _end_spirvextensions
end macro

macro _tag line&
end macro
macro _end_tag
end macro

macro _tags line&
end macro
macro _end_tags
end macro

macro _type line&
end macro
macro _end_type
end macro

macro _types line&
end macro
macro _end_types
end macro

macro _comment line&
end macro
macro _end_comment
end macro

macro _enable line&
end macro
macro _end_enable
end macro

macro _feature line&
end macro
macro _end_feature
end macro

macro _implicitexternsyncparams line&
end macro
macro _end_implicitexternsyncparams
end macro

macro _member line&
end macro
macro _end_member
end macro

macro _name line&
end macro
macro _end_name
end macro

macro _registry line&
end macro
macro _end_registry
end macro

macro _require line&
end macro
macro _end_require
end macro

macro _param line&
end macro
macro _end_param
end macro

macro _proto line&
end macro
macro _end_proto
end macro

macro _unused line&
end macro
macro _end_unused
end macro



macro dispatch tag,pos,len
while 1
        if tag = '?xml'
                ; special one time initial
                break
        else if tag = 'enum'
        else if tag = 'enums'
        else if tag = 'extension'
        else if tag = 'extensions'
        else if tag = 'command'
        else if tag = 'commands'
        else if tag = 'platform'
        else if tag = 'platforms'
        else if tag = 'spirvcapabilities'
        else if tag = 'spirvcapability'
        else if tag = 'spirvextension'
        else if tag = 'spirvextensions'
        else if tag = 'tag'
        else if tag = 'tags'
        else if tag = 'type'
        else if tag = 'types'

        else if tag = 'comment'
        else if tag = 'enable'
        else if tag = 'feature'
        else if tag = 'implicitexternsyncparams'
        else if tag = 'member'
        else if tag = 'name'
        else if tag = 'registry'
        else if tag = 'require'
        else if tag = 'param'
        else if tag = 'proto'
        else if tag = 'unused'
        else
                display 10,'new tag?:',tag
                break
        end if

        ; Note: keep macro names beyond keyword collision possibility
        if pos = 0 & len = 0
                eval '_end_',tag
        else
                load temp:len from vk:pos
                eval '_',temp ; process attributes with macro
        end if
        break
end while
end macro



i = 0 ; capturing tag
j = 0 ; inside an element
; scan XML and dispatch tags (w/ attributes):
while _POS < file_length
        load char:1 from vk:_POS
        _POS = _POS + 1
        if char = "<"
                if j    ; I probably broke something
;                       load fail:40 from vk:pos-20
;                       display 10,'< inside tag?'
;                       display fail,10
                else
                        pos = _POS
                        i = 1
                        j = 1
                end if
        else if char = " " & i ; first word is tag, rest are attributes
                len = _POS - pos - 1
                load tag:len from vk:pos
                i = 0
        else if char = ">"
                if j
                        j = 0
                        len = _POS - pos - 1 ; <inner> length
                        if i ; no space encountered
                                load tag:len from vk:pos
                                i = 0
                        end if
                        load char_head:1 from vk:pos
                        load char_tail:1 from vk:(_POS-2)
                        ; this is a hack to also run the termination
                        ; macros for tags that self-terminate
                        if char_head = '!' ; ignore comments
                        else if char_head = '/' ; avoid </tag/> (never happens?)
                                load tag:len-1 from vk:pos+1
                                dispatch tag,0,0
                        else if char_tail = '/'
                                dispatch tag,pos,len-1
                                dispatch tag,0,0
                        else
                                dispatch tag,pos,len
                        end if
                else    ; I probably broke something
;                       load fail:40 from vk:pos-20
;                       display '> outside tag?',10
;                       display fail,10
                end if
        end if
end while    
You'll still need to decide on the syntax to output for fasmg (or any other language). Start with one of the elements, like enum, to get a sense of the design. My fasmg language has deviated too greatly, so I'm not including my output - it wouldn't do you any good.
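
To get a feel for what the template hands to the `_tag` / `_end_tag` macros, here is a rough Python sketch of the same scan (Python only so it can be run standalone; it is not part of the fasmg template). The name `scan_tags` and the event tuples are made up for illustration. Like the template, it assumes no `>` appears inside an attribute value, and it skips `<?xml ...?>` and `<!...>` instead of dispatching them.

```python
def scan_tags(text):
    """Yield ('open', tag, attrs) and ('close', tag, '') events for each <...> span."""
    pos = 0
    while True:
        start = text.find("<", pos)
        if start < 0:
            break
        end = text.find(">", start)
        if end < 0:
            break
        inner = text[start + 1:end]
        pos = end + 1
        if inner.startswith("!") or inner.startswith("?"):
            continue  # comment or XML declaration: nothing to dispatch
        if inner.startswith("/"):
            yield ("close", inner[1:].strip(), "")
            continue
        self_closing = inner.endswith("/")
        if self_closing:
            inner = inner[:-1]
        # first word is the tag, the rest are attributes
        tag, _, attrs = inner.strip().partition(" ")
        yield ("open", tag, attrs.strip())
        if self_closing:
            # also run the termination handler for self-terminating tags
            yield ("close", tag, "")
```

Each `open` event corresponds to a `_tag attributes` macro call and each `close` event to `_end_tag`, so a handler table keyed on the tag name plays the role of the empty macro stubs.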

Should come in handy parsing the other XML registries from The Khronos Group.

P.S. The commented-out debugging code only triggers where they have commented out valid XML, in which case it is bypassed correctly. I've left the error checking in for further use when the spec changes.

_________________
¯\(°_o)/¯ “languages are not safe - uses can be” Bjarne Stroustrup


Last edited by bitRAKE on 29 Mar 2021, 07:32; edited 3 times in total
bitRAKE 27 Mar 2021, 08:13
An example of enum, just spits out constants:
Code:
; stack attributes
struc attrs line&
local remains
define remains line
while 1
        match ,remains
                break
        else match name==quote rest,remains
                . equ name=quote
                define remains rest
        else match name==quote,remains
                . equ remains
                break
        else
                display 10,"unknown: ",line
                break ; stop here, otherwise the loop never terminates
        end match
end while
end struc


macro _enum line&
        local taco
        taco attrs line

        macro _end_enum
                local N,V
                irpv x,taco
                        match =value==quoted,x
                                V equ quoted
                        else match =name==quoted,x
                                N equ quoted
                        end match
                end irpv
                if defined N\
                 & defined V
                        db N,'=',V,10
                end if
                purge _end_enum
        end macro
end macro    
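
The attribute stacking done by `attrs` can be sketched outside fasmg as well. The Python below is only an illustration (the helper names `parse_attrs` and `enum_constant` are hypothetical); it assumes every attribute has the form name="value" with double quotes, which is also what the `match` patterns above rely on.

```python
import re

# name="value" pairs; names may contain letters, digits, '_', ':', '.', '-'
_ATTR = re.compile(r'([\w:.-]+)\s*=\s*"([^"]*)"')

def parse_attrs(line):
    """Return the attributes as a list of (name, value) pairs, in source order."""
    return _ATTR.findall(line)

def enum_constant(line):
    """Mimic _end_enum: produce NAME=VALUE only when both attributes exist."""
    attrs = dict(parse_attrs(line))
    if "name" in attrs and "value" in attrs:
        return f'{attrs["name"]}={attrs["value"]}'
    return None
```

An enum that carries `bitpos` or `offset` instead of `value` yields `None` here, just as the macro above emits nothing for it.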
It's much more complex to get all the enum values. There are bitpos, alias, and extension values.

Extension values follow the rule (dir is an optional sign):
value = dir? 1000000000+(extnumber-1)*1000+offset

But sometimes extnumber is not defined on the child, and the number of the parent extension is used instead. Some of the values present are also purely informational and don't exist in the actual API. Structures are a similar bag of marbles.
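
As a worked version of that rule, here is a small Python sketch (not fasmg, so it can be run on its own). The function name `enum_value` and the `parent_extnumber` parameter are assumptions for illustration; it also assumes `value` holds a plain decimal or 0x-prefixed integer, and it leaves `alias` entries unresolved.

```python
EXT_BASE = 1000000000   # first value reserved for extension enums
EXT_BLOCK = 1000        # values reserved per extension number

def enum_value(attrs, parent_extnumber=None):
    """Resolve one <enum> attribute dict to an integer, or None if it has no value of its own."""
    if "value" in attrs:
        return int(attrs["value"], 0)
    if "bitpos" in attrs:
        return 1 << int(attrs["bitpos"])
    if "offset" in attrs:
        # fall back to the enclosing extension's number when the child has none
        extnumber = attrs.get("extnumber", parent_extnumber)
        if extnumber is None:
            raise ValueError("offset given but no extension number in scope")
        value = EXT_BASE + (int(extnumber) - 1) * EXT_BLOCK + int(attrs["offset"])
        return -value if attrs.get("dir") == "-" else value
    return None  # e.g. an alias, which needs a second lookup
```

For example, offset 4 with dir="-" inside extension number 2 gives 1000000000 + 1*1000 + 4, negated.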

(In school, playing marbles, we would each have a small bag and trade to get different marbles. Thus a "bag of marbles" referred to a highly diverse/varied collection. I searched for this phrase online and found little, but I remember hearing it many decades ago.)

bitRAKE 29 Mar 2021, 07:49
Okay, so we have a taste of how quickly the template can produce results with little additional code, but how can we communicate values through the hierarchy?

One example (mentioned above) is the extension numbers:
Code:
macro _extension line&
        local taco
        taco attrs line

        irpv x,taco
                match N==quoted,x
                        ; make these available to child values
                        _LAST.extension.N equ quoted
                end match
        end irpv
        
macro _end_extension
        irpv x,taco
                match N==quoted,x
                        ; restore shadowed values of external scope
                        restore _LAST.extension.N
                end match
        end irpv
        purge _end_extension
end macro
end macro    
We get two things from this little fragment:
Code:
if defined _LAST.extension.supported\
        & _LAST.extension.supported="disabled"    
...to avoid non-API definitions and _LAST.extension.number to calculate values within extensions.
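
The shadow-and-restore pattern that `equ` / `restore` give for free can be sketched in Python as a stack per attribute name. This is only an analogy under assumed names (`Scope`, `push`, `pop`, `get` are all hypothetical), not a translation of the macros.

```python
from collections import defaultdict

class Scope:
    """One stack per name: push shadows the outer value, pop restores it."""

    def __init__(self):
        self._stacks = defaultdict(list)

    def push(self, attrs):
        # entering an element: make its attributes visible to children
        for name, value in attrs.items():
            self._stacks[name].append(value)

    def pop(self, attrs):
        # leaving the element: restore whatever the outer scope had
        for name in attrs:
            self._stacks[name].pop()

    def get(self, name, default=None):
        stack = self._stacks.get(name)
        return stack[-1] if stack else default
```

A handler for a child element would then call something like `scope.get("number")` to pick up the enclosing extension number, and `scope.get("supported") == "disabled"` to skip non-API definitions.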

edfed 30 Mar 2021, 13:33
A very good lib for dealing with encapsulation formats is C++ Boost.Serialization.

This lib deserves a look, and maybe a port to fasm.
bitRAKE 30 Mar 2021, 14:14
Perhaps https://developers.google.com/protocol-buffers first - just because it's "language neutral".

I like XML, but the use of it here is minimal. Getting everything out of the spec requires parsing C and XML unfortunately. I hear this might improve with time, but it's quite hairy currently. It's like saying, I'm going to "serialize x86" so it's language neutral. And then I just compile it, lol.
