flat assembler
Message board for the users of flat assembler.
re4asm - regular expression engine
mrpink 31 Oct 2006, 16:11
re4asm is a small and reasonably powerful regular expression engine written completely in assembly language. The regular expression syntax is a proper subset of POSIX ERE with a few minor constraints. Whole-match addressing is supported but submatch addressing is not.
Last edited by mrpink on 12 Dec 2012, 17:53; edited 4 times in total
vid 31 Oct 2006, 18:14
Later, when the FASMLIB core is built, this could be an optional module.
rugxulo 01 Nov 2006, 05:51
I haven't tested it too much, but congrats on everything so far, very cool!
TmX 13 Feb 2007, 12:10
How does this regex engine do compared to PCRE?
Crukko 13 Feb 2007, 15:13
Sure, it's a big piece of work.
I'm reading about regex and I think your contribution will be great for FASM users. Only one thing: can you add more examples? With these, people who don't know how it works or how to use it would have a quick way to start understanding it.
f0dder 13 Feb 2007, 15:20
rugxulo 13 Feb 2007, 19:03
http://www.regular-expression.info
EDIT: Yes, there is still interest in this (or else who's been downloading it?? 174 people, at least!)
Last edited by rugxulo on 14 Feb 2007, 02:42; edited 1 time in total
MichaelH 13 Feb 2007, 21:10
Quote:
Absolutely! Thank you for your current work and any future work you do.
mrpink 14 Feb 2007, 08:15
Thank you all.
Quote: How does this regex engine do compared to PCRE?
Well, PCRE is (in general) not POSIX compatible. The engine it uses is a so-called Traditional NFA, not a POSIX NFA. My implementation is a DFA. For example, PCRE does not necessarily return/find the leftmost-longest match. On the other hand, it provides submatch addressing and backreferencing, which cannot be done by a DFA. See (copy of Mastering Regular Expressions) http://www.mamiyami.com/document/regex/0596002890_mastregex2-chp-4-sect-1.html for a more detailed description.
I'm not sure which features I should implement, because if I implement lots of them, people will say that it is too bloated. On the other hand, most of the features are very handy. Take character classes, for example [:alpha:]. On the one hand this is just syntactic sugar, since it is equivalent to the already implemented [A-Za-z]. On the other hand, [:punct:] is far more convenient than its equivalent. The more features, the more code, and thus the larger the binary. Since AsmRegEx is designed to be embedded into existing applications (it was never meant to become a standalone tool, since far more powerful tools are readily available), it should be small and easy to integrate. More features also increase the complexity of the interface.
The following features have been implemented (total size: code+data = 3KB):
- character classes
The following features will be implemented soon:
- backward searching
Please tell me your opinion on these. Maybe it becomes an optional part of FASMLIB if vid agrees. (When it is done, I will port it to all supported assemblers.)
By the way, what about the license? I thought about changing to LGPL. Should I? I don't have too much time and want to finish this soon.
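The difference between a backtracking (Traditional NFA) engine and POSIX leftmost-longest matching that mrpink describes can be seen with a short sketch. Python's `re` module is a backtracking engine, so it stands in for PCRE-style behaviour here. The helper `leftmost_longest` is a hypothetical brute-force emulation written only for this illustration; it is not how re4asm or any DFA engine works internally.

```python
import re

def leftmost_longest(pattern, text):
    """Return (start, end) of the leftmost-longest match, or None.

    Brute force: for each start position (left to right), try the longest
    candidate substring first and accept the first one that the whole
    pattern matches. For illustration only; a DFA does this in one pass.
    """
    compiled = re.compile(pattern)
    for start in range(len(text) + 1):
        for end in range(len(text), start - 1, -1):
            if compiled.fullmatch(text, start, end):
                return start, end
    return None

text = "xab"
# A backtracking engine takes the first alternative that succeeds: "a".
print(re.search(r"a|ab", text).span())   # (1, 2)
# Leftmost-longest semantics prefer the longer alternative: "ab".
print(leftmost_longest(r"a|ab", text))   # (1, 3)
```

Both results start at the same position; only the length differs, which is exactly the case where the order of alternatives matters to a backtracking engine but not to a POSIX-style one.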
vid 14 Feb 2007, 08:34
I think I will have to develop some engine for optional "modules" for FASMLIB. If your code is error-proof enough, I will certainly consider it.
About the interface: my opinion is that you should not be afraid to implement all you can, ideally the entire standard. It would be great to have a complete implementation maybe 5 times smaller than one written in C. But I think that would be MUCH MUCH more work.
About the license: what do you want to prohibit people from doing with your library? Why do you find the possibility of relinking important?
OzzY 14 Feb 2007, 13:53
RegEx is a very useful feature to have. I use scripting languages because most of them provide RegEx, which makes it easy to parse large amounts of text.
An optimized, fast, and easy-to-use implementation for FASM is a great idea. Maybe you could take a look at the Pelles C standard library, which comes with PCRE and also another, easier-to-use implementation. (But looking at the source files of your implementation, it looks very easy too. I'm going to try it.)
There's also a simpler thing called "glob". It matches wildcards (*, ?, etc.), so F?SM would match FASM, F2SM, FTSM, etc. That would be nice too.
Thank you and keep up the good work.
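The glob matching OzzY mentions can be tried with Python's standard `fnmatch` module. This is only a sketch of the wildcard semantics, using his F?SM example; the list of candidate names is made up for the illustration, and nothing here reflects an interface of re4asm.

```python
import fnmatch
import re

# Hypothetical candidate names; the first three are from the post above.
names = ["FASM", "F2SM", "FTSM", "FSM", "NASM"]
pattern = "F?SM"

# fnmatchcase is case-sensitive on every platform.
# "?" matches exactly one character, so "FSM" is rejected.
matches = [n for n in names if fnmatch.fnmatchcase(n, pattern)]

# A glob is a restricted regular expression, so it can be translated
# into one and handed to a regex engine instead.
regex = re.compile(fnmatch.translate(pattern))
regex_matches = [n for n in names if regex.match(n)]

print(matches)        # ['FASM', 'F2SM', 'FTSM']
print(regex_matches)  # ['FASM', 'F2SM', 'FTSM']
```

The translation step suggests that glob support could be layered on top of an existing regex engine rather than written as a separate matcher.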
rugxulo 14 Feb 2007, 15:01
I'm partial to sed and its regex support, personally, so anything moving closer to that would be fine with me.
P.S. Unless you have a good reason not to, choose the most liberal license.
mrpink 14 Feb 2007, 20:11
Hello vid.
What do you mean by errorproof? This is zero-defect software. About the entire standard: I'm sure you are familiar with it. Can you show me an implementation that supports equivalence classes and collation sequences? What do you mean by the entire standard? Currently only a subset of ERE is implemented. I cannot and will not implement BRE. What do you mean by relinking? I looked this word up in a dictionary, but it does not exist. I just wanted to state that I have no problem changing it. That's all.
Hello Ozzy.
To be honest, although PCRE is probably the most popular regex library under the sun, I'm not a fan of it. If you need a small, yet very powerful and almost POSIX-compliant free third-party regex library, I would highly recommend TRE by Ville Laurikari.
Hello rugxulo.
Unfortunately sed uses BRE. It is (close to) impossible to implement BREs by means of a DFA. I would have to rewrite everything. You might implement your own tool using AsmRegEx that does a sed-like job. (And of course, only FASM rocks.)
I wish you all a nice rest of the week and an even nicer weekend. I'm off until Monday.
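To make the DFA argument concrete, here is a minimal table-driven matcher for the single pattern `ab*c`. The transition table is hand-built and the names `TRANSITIONS`, `ACCEPTING`, and `dfa_search` are hypothetical; this is a sketch of the general technique, not re4asm's actual code. The matcher carries only one integer of state while scanning, so it has nowhere to remember what an earlier group matched, which is why backreferences do not fit this model.

```python
# Hand-built DFA for the pattern ab*c (illustrative assumption).
# State 0: start, state 1: seen "a" followed by any number of "b",
# state 2: seen the final "c" (accepting).
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}
ACCEPTING = {2}

def dfa_search(text):
    """Return (start, end) of the leftmost-longest match, or None."""
    for start in range(len(text)):
        state = 0
        last_accept = None
        for pos in range(start, len(text)):
            state = TRANSITIONS.get((state, text[pos]))
            if state is None:          # dead state: no transition defined
                break
            if state in ACCEPTING:     # remember the longest match so far
                last_accept = pos + 1
        if last_accept is not None:
            return start, last_accept
    return None

print(dfa_search("xxabbbcx"))  # (2, 7)
print(dfa_search("abx"))       # None
```

Each input character costs one table lookup and there is no backtracking inside a scan, which is the usual reason DFA engines are chosen for small, predictable matchers.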
vid 14 Feb 2007, 21:35
Quote: This is zero-defect software.
Nothing is.
Quote: I'm sure you are familiar with it
No, I'm not.
Quote: Can you show me an implementation that
You say there is none?
f0dder 14 Feb 2007, 23:41
mrpink: what's wrong with PCRE? I haven't yet started a project of mine where I will need regexes, but I was intending to use PCRE, mainly because it's so well known. It would be nice to hear about possible defects before I get in too deep.
tantrikwizard 15 Feb 2007, 14:14
Depending on your needs, GoldParser (http://www.devincook.com/goldparser) is very nice. There is even an ASM engine. I briefly looked at the ASM engine and I think it used Visual Basic somehow; it will probably need to be ported for use with FASM.
f0dder 15 Feb 2007, 15:07
tantrikwizard wrote: Depending on your needs, GoldParser (http://www.devincook.com/goldparser) is very nice. There is even an ASM engine. I briefly looked at the ASM engine and I think it used Visual Basic somehow; it will probably need to be ported for use with FASM.
ASM engine using VB? That makes no sense O_o
_________________
- carpe noctem
tantrikwizard 15 Feb 2007, 23:52
f0dder wrote:
Ah, my mistake:
Quote: GOLDx86Engine is written with x86 assembly language to ensure it has decent performance for serious compilers / interpreters. Another important aspect of it is that it is language neutral, that is, the functions contained in the DLL can be called from any programming language that is able to call windows functions such as C/C++, Visual Basic, Delphi etc. The package contains 3 forms of the software:
I haven't looked at the asm engine but have used the VB, C# and C++ engines. Gold is really cool: define a BNF grammar spec and compile the grammar spec into a proprietary grammar table file (.cgt). The CGT gets loaded into the engine, which parses the text that is submitted. The parser then creates an object model hierarchy of the text, which makes compilers or interpreters easy to write. I've used it for parsing IMAP email server protocol messages as well. The most time-consuming part of using this parser is defining the BNF grammar. Here's a BNF grammar for ASM if anyone needs it in Gold: http://tech.groups.yahoo.com/group/GOLDParser/message/2502
f0dder 15 Feb 2007, 23:56
I took a brief look at Gold Parser some years ago, looked interesting, but never got around to using it. Not really a replacement for a RegEx engine either, imho.
Anyway, the assembly engine implementation looked pretty trivial, I wouldn't be surprised if it's actually beaten by a decent compiler; would be interesting to see some speed tests. |
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.
Website powered by rwasa.