flat assembler
Message board for the users of flat assembler.

JavaScript Interpreter


Poll: How small can a JavaScript interpreter be?
<50kb      100%  [ 8 ]
50-90kb      0%  [ 0 ]
90-120kb     0%  [ 0 ]
120-160kb    0%  [ 0 ]
>160kb       0%  [ 0 ]
Total Votes : 8

daniel.lewis



Joined: 28 Jan 2008
Posts: 92
Hi guys. Beautifully packaged assembler.

I had been working on what I thought was an exquisitely written JavaScript interpreter in D until I opened it in a debugger and was shocked and awed at the shit that thing translated my beautiful algorithm into. After some analysis of different optimization features for the compiler, I opted to give up on it.

I'll be porting my almost 2.0 JavaScript Interpreter from D to fasm with analysis being done in IDA Free.

~~~

Phase 1)
Shall start with the "debugger", "template" and "cmd" examples, making variations of main programs that read arguments, standard I/O, and/or a file, in DLL, COM, IScriptable, IUnknown, lib, interactive, GUI and command-line forms, all with interrupt-based error handling and usage.

Phase 2)
Shall be the authoring of a jump-address-table ECMA 262 Edition 3 conformant statement lexer/parser that pushes byte codes, each of which may, depending on the code, be followed by a Value; and the development of a proof of correctness for it by stepping through it with a debugger and doing maths.
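
A rough sketch of the jump-table idea from Phase 2, written in C rather than fasm so it stays portable. Everything here is an assumption for illustration: the byte code values, the `Lexer` struct and the handler names are hypothetical, and only numbers, `+`, `;` and ASCII whitespace are handled. Each input byte indexes a 256-entry table of handlers; a number literal emits its byte code followed by an 8-byte double Value.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte codes; the thread does not list the real ones. */
enum { BC_END = 0, BC_NUMBER = 1, BC_PLUS = 2, BC_SEMI = 3, BC_ERROR = 255 };

typedef struct {
    const char *src;   /* next unread character */
    uint8_t    *out;   /* next free byte in the output buffer */
} Lexer;

typedef void (*Handler)(Lexer *);

static void lex_space(Lexer *lx) { lx->src++; }
static void lex_plus(Lexer *lx)  { lx->src++; *lx->out++ = BC_PLUS; }
static void lex_semi(Lexer *lx)  { lx->src++; *lx->out++ = BC_SEMI; }
static void lex_error(Lexer *lx) { lx->src++; *lx->out++ = BC_ERROR; }

/* A number emits its byte code followed by the double Value itself. */
static void lex_number(Lexer *lx) {
    char *end;
    double v = strtod(lx->src, &end);
    lx->src = end;
    *lx->out++ = BC_NUMBER;
    memcpy(lx->out, &v, sizeof v);
    lx->out += sizeof v;
}

static Handler table[256];

static void init_table(void) {
    int c;
    for (c = 0; c < 256; c++) table[c] = lex_error;
    table[' '] = table['\t'] = table['\n'] = lex_space;
    for (c = '0'; c <= '9'; c++) table[c] = lex_number;
    table['+'] = lex_plus;
    table[';'] = lex_semi;
}

/* Tokenizes src into out and returns the number of bytes written,
   including the trailing BC_END. The caller must provide room for
   up to (1 + sizeof(double)) bytes per input character, plus one. */
size_t lex(const char *src, uint8_t *out) {
    Lexer lx = { src, out };
    init_table();
    while (*lx.src != '\0')
        table[(unsigned char)*lx.src](&lx);   /* one indexed jump per token */
    *lx.out++ = BC_END;
    return (size_t)(lx.out - out);
}
```

In assembly the table would hold code addresses and the dispatch would be a single indexed `jmp`; the function-pointer call is the closest C equivalent.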

Phase 3)
Shall be the authoring of a statement interpreter which switches ESP to the most local heap-based Scope and consumes statements generated by phase 2 while modifying the environment correctly, as analyzed by stepping through it with a debugger and doing maths. Methods and properties will not be made available by this phase.

Phase 4)
Shall be the generation of the ECMA 262 Edition 3 program structure: Global, Object, Array, String, Number, RegExp, Date, Math, and the Error objects. The objects in the structure will be presorted using a level-order binary sort, and all accesses will use this algorithm.
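
One possible reading of "level-order binary sort", sketched in C: the keys are stored as an implicit balanced binary search tree in breadth-first order, so the children of slot i sit at 2i+1 and 2i+2 and a lookup needs no pointers. This is an assumption about what the post means, and the function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* In-order walk over the implicit tree: visiting slots in sorted order
   lets us copy sorted[] straight into level-order positions.
   Returns the next unread index of sorted[]. */
static size_t build(const char **sorted, const char **tree,
                    size_t n, size_t i, size_t next) {
    if (i >= n) return next;
    next = build(sorted, tree, n, 2 * i + 1, next);   /* left subtree  */
    tree[i] = sorted[next++];                         /* this node     */
    return build(sorted, tree, n, 2 * i + 2, next);   /* right subtree */
}

/* Rearranges n sorted keys into level order. */
void level_order_layout(const char **sorted, const char **tree, size_t n) {
    build(sorted, tree, n, 0, 0);
}

/* Returns the slot of key in tree[], or -1 if it is absent. */
long level_order_find(const char **tree, size_t n, const char *key) {
    size_t i = 0;
    while (i < n) {
        int c = strcmp(key, tree[i]);
        if (c == 0) return (long)i;
        i = 2 * i + (c < 0 ? 1 : 2);   /* descend left or right */
    }
    return -1;
}
```

For the seven names Array, Date, Math, Number, Object, RegExp, String the layout puts Number at the root, and any lookup takes at most three comparisons.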

Phase 5)
Shall be the generation of ECMA 262 Edition 3 conformant methods.

~~~

If anyone has template code I can rip for Phase 1 under public domain or BSD, let me know as it just might make the project easier.

However, it's now 12:53am and I work tomorrow.

_________________
dd 0x90909090 ; problem solved.
Post 28 Jan 2008, 15:56
rugxulo



Joined: 09 Aug 2005
Posts: 2341
Location: Usono (aka, USA)
I dunno how big an asm version would be. Lately, I've been approximating that any equivalent assembly program is about 10x smaller than the (unpacked, static) C version (but that's excluding alignment and locales and whatnot).

P.S. Congratulations on your recent SAG award! ;)
Post 28 Jan 2008, 22:03
calpol2004



Joined: 16 Dec 2004
Posts: 110
By using assembly, not only will your interpreter be smaller but a lot faster too. It will probably take longer to build the project in assembly than it would in D, though.

I really like the D language, but I hate the way they made it. It seems to be a refined, cleaned-up C++, but all the code bloat and other stuff that gets added has put me right off. Not only do you get all the heap set-up etc. that you had in C, but you have a garbage collector running in the background too.

From what I've read, you already have the project done in D? What was its size in that language? With FASM you will EASILY make 100kb.
Post 28 Jan 2008, 22:33
daniel.lewis



Joined: 28 Jan 2008
Posts: 92
@work;

As it stands, Walnut 1.0 is the smallest, fastest JavaScript interpreter I have seen out of roughly 14 different engines; with 409kb of source and a 384kb binary.

http://dsource.org/projects/walnut/browser/branches/1.1/

Walnut 2.0 would stand at 240kb when completed. It's 209kb at the time of this writing, with the parser about 90% finished and the AST interpretation code not yet written.

http://dsource.org/projects/walnut/browser/branches/1.9/

I'm thinking of abandoning Walnut 2.0 and starting work on it in x86 assembler, with notes to ease porting it to x86-64. Planning to call the new one "Almond".
Post 29 Jan 2008, 00:09
edfed



Joined: 20 Feb 2006
Posts: 4237
Location: 2018
interpreter <<< 50kb (the program that interprets the JavaScript)
data base ~ ????kb (the database used by the program to interpret the JavaScript)

java is really huge.
Post 29 Jan 2008, 00:24
f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
edfed: javascript != java.

daniel.lewis: D is a pretty interesting language, but the compiler optimizations aren't quite up to the level of MSVC or ICC. I know there's a GCC version as well, but while GCC has improved a lot over the years, MSVC and ICC still tend to beat it.

You might want to try a C++ implementation and look at MSVC, ICC and GCC results, before going all the way and doing it in assembly. Bound to be less work, and you end up with more portable code - while not 100% trivial, it's still less work to produce x86-32 and x86-64 binaries from a single C++ source than from assembly.

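PS: be careful about messing with ESP directly; you will need to modify fields in [FS:xx] (the stack base and limit in the thread information block, at [FS:4] and [FS:8] on 32-bit Windows) if you don't want your thread to be auto-terminated.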
Post 29 Jan 2008, 00:36
daniel.lewis



Joined: 28 Jan 2008
Posts: 92
f0dder:

Nah, I've decided on doing it in fasm.

My reasons are that I plan on using:

- a flow control device for the lexer that's quite illegible, if possible at all, in C-like languages (a modified Duff's device)

- a token stack of bytes which might have 15 more appended to them depending on their value. If it's a '+=', it's unnecessary, but objects, arrays, strings, numbers etc. will be built during tokenization. Variably sized code points are a bitch in C-like languages.

- stack frame style crushing of local variables for Scope, rather than the high level form implied in the spec. Sure I could do it in C, but transparency makes it easier.

- debugging and analysis of produced code in IDA Free or similar; simply because it kicks butt.

- Some code inspired by guys like Agner Fog and Paul Hsieh in places like the Math library. When the examples of good code are in assembler, why use C?

- It's what's inside that counts. : )

Regards,
Dan

Post 29 Jan 2008, 01:45
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
Quote:
- a flow control device for the lexer that's quite illegible, if possible, in C-like languages (a modified duff's device)

you might consider "flex"; it should generate a better lexer than anything you'd write yourself.

Quote:

- a token stack of bytes which might have 15 more appended to them depending on their value. If it's a '+=', it's unnecessary but objects, arrays, strings, numbers etc will be built during tokenization. Variably sized code points are a bitch in C-like languages.

yep, a lot of *(type*)value, etc... and it makes portability a hard fight...

Quote:
Some code inspired by guys like Agner Fog and Paul Hsieh in places like the Math library. When the examples of good code are in assembler, why use C?

a math library is a good example of where asm is best. but personally, i'd stick to something already done, like LibTomMath, which has a version written in assembly but also one in C, so it has both advantages.
Post 29 Jan 2008, 02:04
daniel.lewis



Joined: 28 Jan 2008
Posts: 92
I think I'll manage. ;)

Post 29 Jan 2008, 07:48
daniel.lewis



Joined: 28 Jan 2008
Posts: 92
Update!

:)
I have the DLL binding working.

I added a few features to the example GUI text editor to get that interface working.

The function signatures have been worked out. Each data type uses a different signature, but I plan to use toDouble, toString, toObject etc. as per ECMA spec to coerce types.
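
A minimal C sketch of what a toDouble coercion could look like, loosely following the ToNumber rules of ECMA 262 Edition 3. The `Value` layout and the `to_double` name are assumptions for illustration, not the signatures used in the project. Objects are left out, and string parsing leans on `strtod`, which accepts a few forms the spec does not (for example "inf" and "nan").

```c
#include <ctype.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical tagged value; the real layout is not shown in the thread. */
typedef enum { T_UNDEFINED, T_NULL, T_BOOLEAN, T_NUMBER, T_STRING } Tag;

typedef struct {
    Tag tag;
    union { int b; double n; const char *s; } u;
} Value;

double to_double(Value v) {
    switch (v.tag) {
    case T_UNDEFINED: return NAN;                /* undefined -> NaN */
    case T_NULL:      return 0.0;                /* null -> +0       */
    case T_BOOLEAN:   return v.u.b ? 1.0 : 0.0;
    case T_NUMBER:    return v.u.n;
    case T_STRING: {
        const char *s = v.u.s;
        char *end;
        double d;
        while (isspace((unsigned char)*s)) s++;  /* leading whitespace       */
        if (*s == '\0') return 0.0;              /* empty or blank string    */
        d = strtod(s, &end);
        if (end == s) return NAN;                /* nothing numeric was read */
        while (isspace((unsigned char)*end)) end++;
        return *end == '\0' ? d : NAN;           /* trailing junk -> NaN     */
    }
    }
    return NAN;
}
```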

I've managed to get Math's methods fleshed out using SSE2, with Duff's device for handling multiple arguments. The convention for doubles is that you pass argc in ECX, and the rest goes in XMM0-7. Still gotta think out how that'll work with the stack.
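
For readers unfamiliar with Duff's device, here is a hedged C sketch of the idea applied to a Math.max over a variable number of arguments. It is not the SSE2 code described above: the arguments arrive in an array rather than in XMM0-7, and `math_max` and `max2` are hypothetical names. The `switch` jumps into the middle of a four-way unrolled loop so that the leftover `argc % 4` arguments are handled on the first pass.

```c
#include <math.h>
#include <stddef.h>

/* Pairwise max with the ECMA 262 special cases: NaN wins, +0 beats -0. */
static double max2(double a, double b) {
    if (isnan(a) || isnan(b)) return NAN;
    if (a == b) return signbit(a) ? b : a;   /* prefers +0 over -0 */
    return a > b ? a : b;
}

/* Math.max over argc doubles, unrolled four ways with Duff's device. */
double math_max(size_t argc, const double *argv) {
    double m = -INFINITY;                    /* Math.max() is -Infinity */
    size_t rounds;
    if (argc == 0) return m;                 /* the device needs argc >= 1 */
    rounds = (argc + 3) / 4;
    switch (argc % 4) {
    case 0: do { m = max2(m, *argv++);
    case 3:      m = max2(m, *argv++);
    case 2:      m = max2(m, *argv++);
    case 1:      m = max2(m, *argv++);
            } while (--rounds > 0);
    }
    return m;
}
```

With argc = 5 the switch enters at `case 1`, consumes one argument, and the loop then runs once more for the remaining four.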

I'm still choking in a few places where I can't think which instructions to use, so I have to fix a few items and I have to figure out the proofs still.

My assembler-based lexer/parser now consumes ECMA defined whitespace, comments, Program, BlockStatement, EmptyStatement, ExpressionStatement, PrimaryExpression, StringLiteral and Identifier. Most of it's still not fleshed out and output generation is commented out.
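
As an illustration of the first item in that list, a small C sketch of skipping whitespace and comments. It is an assumption-laden simplification, not the assembler code: it covers only the ASCII whitespace and line terminators (ECMA 262 also lists NBSP, other Unicode space separators, and the U+2028/U+2029 line terminators), and `skip_trivia` is a hypothetical name.

```c
#include <stddef.h>

/* Skips ASCII whitespace, line terminators, and both comment forms.
   Returns a pointer to the first significant character, or NULL if a
   block comment is never closed. */
const char *skip_trivia(const char *p) {
    for (;;) {
        switch (*p) {
        case '\t': case '\v': case '\f': case ' ':   /* WhiteSpace     */
        case '\n': case '\r':                        /* LineTerminator */
            p++;
            continue;
        case '/':
            if (p[1] == '/') {                       /* single-line comment */
                p += 2;
                while (*p != '\0' && *p != '\n' && *p != '\r') p++;
                continue;
            }
            if (p[1] == '*') {                       /* multi-line comment */
                p += 2;
                while (*p != '\0' && !(p[0] == '*' && p[1] == '/')) p++;
                if (*p == '\0') return NULL;         /* unterminated */
                p += 2;
                continue;
            }
            return p;   /* a lone '/' is a real token (division or regexp) */
        default:
            return p;
        }
    }
}
```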

~~

:?
I had trouble getting cmd line and stdio figured out. I noticed there's no license or release, and there's a copyright on the examples, so I never read them.

I also haven't started on using int 0x03 or debugger related stuff.

Post 05 Feb 2008, 00:02
TheRaven



Joined: 22 Apr 2008
Posts: 89
Location: U.S.A.
I thought of doing the same thing!

FASM is cross-platform with few differences between targets, comparatively speaking of course. Assembly does produce cleaner code, and the speed gains over any HLL equivalent become more noticeable as asm executables grow in size. Parsers can really benefit from an assembly-constrained design, as they are on-the-fly, JIT/real-time oriented.

ECMA spec across the boards to boot. Excellent!

FLEX is cool for building compilers, but as far as parsers are concerned I have heard the opposite on Flex. I have also heard that Flex is similar to C which isn't bad but not as good for parser development.

Lewis, you are the man!

Kick a** man!

Any whoo, keep us posted dude as this is awesome, awesome, awesome.
:o

_________________
Nothing so sought and avoided more than the truth.
I'm not insane, I know the voices in my head aren't real!
Post 24 Apr 2008, 03:18
AlexP



Joined: 14 Nov 2007
Posts: 561
Location: Out the window. Yes, that one.
How about this:

Instead of interpreting the Java language -> Assembly, how about having some sort of store of commonly used Java programs already translated, and just running those? Or just use a JIT compiler, like C# does.
Post 24 Apr 2008, 12:50

Copyright © 1999-2020, Tomasz Grysztar.
