flat assembler
Message board for the users of flat assembler.

Tag parsing

Reverend



Joined: 24 Aug 2004
Posts: 408
Location: Poland
Reverend 11 May 2005, 17:20
Hello,
Can you guys please give me some advice on how to write a good, fast algorithm for parsing text with tags in it? I mean something like the BBCode used here on this forum. Even just a link to an external site with an article would do. I checked sites like codeguru, flipcode, etc. but found nothing. In fact I wrote a small engine capable of the job, but as I see now it has many errors, and I have to fix, and fix, and fix... I guess I may have just designed it badly, so it was flawed from the first version. So, does anybody have any interesting resources on this topic? Thanks in advance.

Btw, a small off-topic question. Do you think it's better to use the .if, .elseif, .while, etc. macros, which often generate inefficient code, or to struggle with hundreds of labels, just to avoid situations like:
Code:
.if xxx
 .if yyy
  ...
 .endif
.endif    
Which will be assembled somewhat like:
Code:
cmp x,z
jnz blabla1
 ...
blabla1:
 jmp blabla2    ; redundant jump to the label that immediately follows
blabla2:
    
f0dder



Joined: 19 Feb 2004
Posts: 3175
Location: Denmark
f0dder 11 May 2005, 17:35
For heavy-duty parsing jobs, you might want to have a look at a parsing engine. It can be beneficial since the grammar you write is sorta like a formal specification of the kind of text you are parsing. If you find the resulting parser is too slow or bloated, you can always then hand-write the parser.

www.goldparser.com looks promising, since it has a decent UI for developing the grammar, and the parser routine and the tables it uses are separate, which means one parser can be used with multiple tables, and a table can be used from multiple languages (unlike lex/yacc and the like).

I must admit that my experience with such parsing is very limited, but I'm currently playing around with it a bit, as much as my exams allow me (read: not much :) ).
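
To illustrate the point about a grammar acting as a specification, here is a minimal sketch in Python. It is not GOLD's grammar format and not its table-driven engine: it is a hypothetical three-rule grammar for BBCode-like text, written as a comment, followed by a hand-written recursive-descent parser that mirrors it rule by rule. The `[name]...[/name]` tag syntax and the names `tokenize`, `parse_nodes` and `parse` are assumptions made for illustration.

```python
import re

# Hypothetical grammar for BBCode-like markup (an assumption, not GOLD syntax):
#   document := node*
#   node     := TEXT | element
#   element  := "[" NAME "]" node* "[/" NAME "]"

TOKEN = re.compile(r"\[(/?)([A-Za-z]+)\]")

def tokenize(source):
    """Split source into ('text', s), ('open', name) and ('close', name) tokens."""
    tokens, pos = [], 0
    for match in TOKEN.finditer(source):
        if match.start() > pos:
            tokens.append(("text", source[pos:match.start()]))
        kind = "close" if match.group(1) else "open"
        tokens.append((kind, match.group(2).lower()))
        pos = match.end()
    if pos < len(source):
        tokens.append(("text", source[pos:]))
    return tokens

def parse_nodes(tokens, index, closing=None):
    """Rule node*: parse until the expected closing tag or the end of input."""
    nodes = []
    while index < len(tokens):
        kind, value = tokens[index]
        if kind == "text":
            nodes.append(value)
            index += 1
        elif kind == "open":
            # Rule element: recurse for the children, which consumes the close tag.
            children, index = parse_nodes(tokens, index + 1, closing=value)
            nodes.append((value, children))
        else:
            if value != closing:
                raise ValueError(f"unexpected [/{value}]")
            return nodes, index + 1
    if closing is not None:
        raise ValueError(f"missing [/{closing}]")
    return nodes, index

def parse(source):
    """Rule document: return a list of strings and (name, children) tuples."""
    nodes, _ = parse_nodes(tokenize(source), 0)
    return nodes
```

For example, `parse("a [b]bold [i]both[/i][/b] z")` returns `["a ", ("b", ["bold ", ("i", ["both"])]), " z"]`, while a mismatched or missing closing tag raises `ValueError`. A generated parser would replace `parse_nodes` with table lookups, but the grammar comment would stay the same.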
Reverend



Joined: 24 Aug 2004
Posts: 408
Location: Poland
Reverend 11 May 2005, 17:41
f0dder wrote:
If you find the resulting parser is too slow or bloated, you can always then hand-write the parser
In fact, I am writing such an engine on my own. It's not that I actually have some text with a specific grammar to parse; I'm writing a program to display formatted text. Something like HTML, but very limited (which doesn't make this task any simpler, though :?)
pelaillo
Missing in inaction


Joined: 19 Jun 2003
Posts: 878
Location: Colombia
pelaillo 11 May 2005, 18:06
This is a simple and fast HTML parser I wrote for the Menuet browser. It will suit your purposes well and is simple to configure for other uses.
The parser itself is system-independent code. The helper functions are for Win32.

Madis wrote an improved parser for the browser. Maybe he could post his latest version here.


Attachment: tinyweb.zip (21.35 KB)
Description: Small HTML parser

Reverend



Joined: 24 Aug 2004
Posts: 408
Location: Poland
Reverend 11 May 2005, 22:08
Many thanks for the sources, pelaillo, but I found an interesting way to achieve my goal. It's so simple, but powerful ;) I just used a stack. While scanning the input, when I reach an opening tag I fill a structure with all its info and push it onto a stack. When I then find a closing tag, I automatically pop the stack and fill in the rest of the info (where the tag ends). This gives you many benefits. E.g. after the first pass you know whether all tags were closed (if the stack is unbalanced, i.e. not empty at the end, some tags were left open). When popping the stack you compare the opening tag with the closing tag, so you also know whether they were paired properly. But there are downsides too, as it only works with tags that are 100% correctly nested, as in the XML standard; e.g. <b> <i> </b> </i> won't work, because you have to close the <i> tag first. It also forces you to write a closing tag for every single tag in the input: there can be no standalone tags without a closing one. So I guess these limits won't let you write an HTML browser (as HTML pages are full of incorrect code), but it helped me a lot. Big thanks to both f0dder and pelaillo.
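
The stack approach described above can be sketched roughly as follows. This is a minimal Python illustration and not the actual engine from this post: the `<name>` tag syntax, the `match_tags` function, and the record fields `name`, `start` and `end` are all assumptions.

```python
import re

# Assumed tag syntax: <name> opens, </name> closes, no attributes.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")

def match_tags(source):
    """Return (spans, errors); spans are (name, open_start, close_end) tuples."""
    stack = []    # records for opening tags still waiting for their closing tag
    spans = []    # completed tags, in the order they were closed
    errors = []
    for m in TAG.finditer(source):
        is_close, name = m.group(1) == "/", m.group(2).lower()
        if not is_close:
            # Opening tag: fill a record with what is known so far and push it.
            stack.append({"name": name, "start": m.start()})
        elif not stack:
            errors.append(f"</{name}> has no opening tag")
        else:
            # Closing tag: pop and compare with the most recent opening tag.
            top = stack.pop()
            if top["name"] != name:
                errors.append(f"</{name}> closes <{top['name']}>")
            else:
                top["end"] = m.end()    # fill in where the tag ends
                spans.append((name, top["start"], top["end"]))
    # Anything left on the stack after the pass was never closed.
    for record in stack:
        errors.append(f"<{record['name']}> was never closed")
    return spans, errors
```

With this sketch, `match_tags("<b>x <i>y</i></b>")` returns `([("i", 5, 13), ("b", 0, 17)], [])`. The input `<b> <i> </b> </i>` yields no spans and two mismatch errors, and a standalone `<br>` is reported as never closed, which are exactly the two limitations mentioned above.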
Madis731



Joined: 25 Sep 2003
Posts: 2139
Location: Estonia
Madis731 12 May 2005, 11:28
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
