Database engine

Tomasz Grysztar

Joined: 16 Jun 2003
Posts: 8535
Location: Kraków, Poland

Tomasz Grysztar 05 May 2006, 11:35

In the context of the SQLite discussion on this board, the idea of a similar project written in assembly was once mentioned. This idea interested me from the beginning, since for some of my projects it would be a great thing to have. Even more: I'm thinking about a database engine written in assembly with a binary-level access language, which could then have an external wrapper to support SQL syntax, but which would itself perhaps be more efficient and easier to use from assembly language programs than text-processing-based SQL (which is better suited to scripting languages like Perl or PHP). The other nice thing would be to make it portable between any x86-based systems, just like fasm is.

I may not be able to start and maintain such a complex project myself, but perhaps by cooperating we would be able to make it happen. So please let me know here if anyone is interested in cooperating on such a project, or has some interesting ideas.

Also, if you've got some interesting links about database system design and algorithms, please post them here.

Mota

Joined: 29 Dec 2004
Posts: 22

Mota 07 May 2006, 23:25

This is very, very, very funny for you to post, for a reason.

Yesterday, I got the idea of making a database OS, for the various advantages that such an OS would bring. I realised that the way current databases work would make it hard for such an OS to be efficient, so obviously I'd have to do two things:
- Devise my own database design, so that it is able to work with an OS efficiently.
- Actually make a database implementation, with the OS built-in.

Well, I spent all of yesterday designing the database mechanics and I must say it looks promising enough to warrant a try.

So, today, I was going to start designing how everything should be stored in memory when suddenly I had an idea: what if I made the database flexible in its uses, and rather than use it merely for an OS, make it usable for other things? It was already extremely flexible because of the database design I had made, but I knew that for it to be used as other things, being mechanically flexible wasn't enough: it'd have to be binarily flexible. (is binarily a word?)

This soon brought me to a second goal:
- The database has to be OS independent.

Of course, something can never be 100% OS independent, but I think I got along quite well. I sort of devised a plugin system, by which almost everything in the database system can be controlled by a plugin.

This way, porting from one OS to another is basically a matter of changing the right plugins and formatting the database executable. It is also possible to load plugins during run-time.

Basically, the way the plugin system works is:
- Each plugin has a name associated with it. The name must not exceed 4 bytes in length and can therefore be stored in a single dword (and passed on the stack: always a plus).
- The database system looks for a plugin with the right name and calls one of its functions, which are also indexed by 4-byte names.

That's rather simple, isn't it? Well, the problem is obviously "how am I going to act as a pseudo-kernel for plugins if I don't have access to the manipulation of virtual addresses?" Basically, I copy the whole plugin to a certain address (0x10000 at the moment). Plugins get 0x10000 to 0x1FFFF to themselves, while 0x20000 upwards is shared memory (up to a maximum of 0xFFFFF at the moment).

These values are good because they would allow the database to be portable even to 16-bit systems without too much hassle, although that's a bit more drastic than changing between OSes.
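
The 4-byte naming scheme described above can be sketched in C. This is only an illustration under assumptions: the byte order (first character in the low byte), the `NAME4` macro, the table layout and the `FILE`/`OPEN` names are all hypothetical, and nothing here is taken from the actual plugin system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a 4-character name as a compile-time dword constant.
   Assumption: the first character goes into the low byte. */
#define NAME4(a, b, c, d) \
    ((uint32_t)(a) | (uint32_t)(b) << 8 | (uint32_t)(c) << 16 | (uint32_t)(d) << 24)

/* Pack a string of up to 4 characters at run time; shorter names are
   padded with zero bytes, longer ones are cut after the 4th character. */
uint32_t pack_name(const char *s)
{
    assert(s != NULL);
    uint32_t v = 0;
    for (int i = 0; i < 4 && s[i] != '\0'; i++)
        v |= (uint32_t)(unsigned char)s[i] << (8 * i);
    return v;
}

typedef int (*plugin_fn)(int arg);

struct plugin_func { uint32_t name; plugin_fn fn; };
struct plugin { uint32_t name; const struct plugin_func *funcs; size_t nfuncs; };

/* Find a function by plugin name and function name; NULL if absent.
   Comparing names is a single dword compare, no string handling. */
plugin_fn find_fn(const struct plugin *plugins, size_t n,
                  uint32_t plugin_name, uint32_t fn_name)
{
    for (size_t i = 0; i < n; i++) {
        if (plugins[i].name != plugin_name)
            continue;
        for (size_t j = 0; j < plugins[i].nfuncs; j++)
            if (plugins[i].funcs[j].name == fn_name)
                return plugins[i].funcs[j].fn;
    }
    return NULL;
}

/* Hypothetical plugin function, present only for the demonstration. */
int file_open(int arg) { return arg + 1; }

static const struct plugin_func file_funcs[] = {
    { NAME4('O', 'P', 'E', 'N'), file_open },
};
static const struct plugin plugins[] = {
    { NAME4('F', 'I', 'L', 'E'), file_funcs, 1 },
};
```

A caller would look up `find_fn(plugins, 1, pack_name("FILE"), pack_name("OPEN"))` and check the result for NULL before calling it. The linear search is a placeholder; with dword keys, a sorted array or a hash table would work just as well.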

That was a bit beside the point, but it serves to illustrate how far I'm willing to optimize a system (the 4-byte thing, that is).

Now, I have already designed a method of querying the database in a text-processing way but, as you point out, this is a saturated market. I'm sure that designing a binary query of sorts is perfectly possible (and necessary, too). I will turn my attention to the design of such a binary system. Last, but not least, you're free to join me in my adventure if you want. I sent a project application to sourceforge.net. The unix-name of the project is dbsystem :)

Cya there. (If the project gets approved.)

Madis731

Joined: 25 Sep 2003
Posts: 2138
Location: Estonia

Madis731 10 May 2006, 21:39

I think we can start simple: INSERT / DELETE. Hmm, I feel like I've suggested this before, but hear me out:
SELECT would of course be very necessary, but that is it. UPDATE would be like a macroinstruction and WHERE would be a macro of multiple SELECTs. When we have it working, we can improve it further by adding indexes, rewriting the syntax for a more optimized approach, and implementing algorithms that we may accidentally find on the net. I had a relatively thorough course on DBs at university, so I can give you some hints along the way, such as the different types of indexes and their uses...
...I myself am busy with other projects, but I'll see how I can help this dbengine.flatassembler.net become reality ;)

Mota 11 May 2006, 21:49

I'm currently looking into hashtables and seeing if they are at all useful to db.

Code:

; Proposed calls (C-style pseudocode):
;   elem* INSERT(id, size);                       does this sound reasonable?
;   void DELETE(id);  or  void DELETE(elem*);     which is better? Both?

struc elem_header {
    .id    dd 0
    .size  dd 0
    .type  db 0    ; data type
    .flags dw 0    ; flags: they depend on the data type
    .data:         ; element data follows the header
}

Might not make much sense now, but it'll probably be clearer later on.
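
To make the INSERT/DELETE pseudocode and the hashtable idea a little more concrete, here is a minimal C sketch. It is only one possible reading: the chained hash table, the `NBUCKETS` constant, the `next` link and the `db_*` names are assumptions added for illustration, and the C struct only loosely mirrors `elem_header` (the compiler may pad it differently).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 64u   /* assumption: small fixed number of buckets */

/* Loose C counterpart of elem_header; data[] plays the role of .data */
struct elem {
    struct elem *next;   /* assumption: bucket chaining, not in the original */
    uint32_t id;
    uint32_t size;       /* payload size in bytes */
    uint8_t  type;
    uint16_t flags;
    unsigned char data[];
};

struct db { struct elem *buckets[NBUCKETS]; };

/* Look up an element by id; NULL if it is not stored. */
struct elem *db_find(struct db *db, uint32_t id)
{
    assert(db != NULL);
    for (struct elem *e = db->buckets[id % NBUCKETS]; e != NULL; e = e->next)
        if (e->id == id)
            return e;
    return NULL;
}

/* INSERT(id, size): allocate a zeroed element with `size` payload bytes.
   Returns NULL if the id already exists or memory runs out. */
struct elem *db_insert(struct db *db, uint32_t id, uint32_t size)
{
    if (db_find(db, id) != NULL)
        return NULL;
    struct elem *e = calloc(1, sizeof *e + size);
    if (e == NULL)
        return NULL;
    e->id = id;
    e->size = size;
    e->next = db->buckets[id % NBUCKETS];
    db->buckets[id % NBUCKETS] = e;
    return e;
}

/* DELETE(id): unlink and free the element; 1 if removed, 0 if not found. */
int db_delete(struct db *db, uint32_t id)
{
    assert(db != NULL);
    for (struct elem **pp = &db->buckets[id % NBUCKETS]; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            struct elem *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}
```

On the "which is better?" question: in this sketch DELETE(id) has to walk one bucket anyway, so a DELETE(elem*) variant would save the id comparison but not the unlinking, unless the chains were doubly linked. Offering both seems harmless.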

LocoDelAssembly
Your code has a bug

Joined: 06 May 2005
Posts: 4623
Location: Argentina

LocoDelAssembly 12 May 2006, 00:16

Tomasz (and everyone interested in the project), will this be a relational database or an object-oriented database?

http://en.wikipedia.org/wiki/Object_database

Getting this project to maturity could take a long time, so I think it's important to consider which kind of database will be in use in the future (RDBMS or ODBMS).

http://www.odbms.org/introduction.html

Regards

Tomasz Grysztar 12 May 2006, 09:44

Which one, in your opinion, is assembly language more suitable for?

LocoDelAssembly 12 May 2006, 13:53

Well, I'm not a DB expert, but O-O seems more suitable for assembly: since you make queries using template objects, it seems that you don't need to send text (or opcodes, as in your idea).

However, O-O languages implement objects quite differently from each other, so you would need to create multiple interfaces instead of a single one (SQL).

Hope someone with more knowledge can answer your question more accurately.

Regards

Tomasz Grysztar 12 May 2006, 14:07

Another nice idea that I once saw on this board was to try rewriting SQLite in assembly - that would perhaps at least be instructive.

LocoDelAssembly 12 May 2006, 14:42

Since SQLite is an embeddable database system, it is a better candidate for conversion to an ODBMS, because it shares the same memory space as the client.
ODBMS.ORG says this in "When to Use an ODBMS":

ODBMS.ORG wrote:

Embedded DBMS Applications

Embedded DBMS Applications might involve a database on partially connected mobile devices or as a OO data cache in an application that demands a super-fast response time. If you are using Java or .NET, this requires a self-contained, non-intrusive, and easy-to-deploy persistence solution for the client-side or in the middleware.

Why? Storing Java or .NET objects 'just as they are in memory' is always the leanest and least intrusive way to implement a persistence solution. Using an RDBMS requires the overhead of object-relational mapping, resulting in an increased demand on resources. In addition, an RDBMS approach requires greater administration involvement, especially when you must deploy updated class schemes to your installed base.

Of course I understand that an assembly language programmer usually dislikes writing assembly code in an OO way, but I think that FASM is powerful enough to write friendly macros. The struct macro already supports inheritance; it just needs some improvements to support method binding and allow virtual methods (polymorphism).

Mota 12 May 2006, 19:43

The model I'm using for db isn't too far off from object databases, but is much looser. I guess I'll release the documentation on the site once the thing gets approved... (4 days in the pending queue... is it really a queue?)

shoorick

Joined: 25 Feb 2005
Posts: 1617
Location: Ukraine

shoorick 15 May 2006, 04:27

i'm also not an expert in db internals, but i think some general OOP support should be added as an initial stage (embedded and/or via macros) (it would be nice not only for db)

Madis731 15 May 2006, 12:02

Are we thinking of a db that will be used by the general public, or are we aiming at being different (take MenuetOS as an example) and writing something totally different that others will hate at the beginning? :)

I think doing something that others have done over and over again isn't too interesting. Assembly-style SQL sounds fun... at least for now:

Code:

push mydata at [database+esi] ; INSERT 
pop ; ... would be DELETE
mov ; ... means UPDATE or SELECT depending on the direction

...or maybe it will become too obfuscated. I hear a poll coming up.

YONG

Joined: 16 Mar 2005
Posts: 7997
Location: 22° 15' N | 114° 10' E

YONG 17 May 2006, 06:55

I like Madis731's idea - ASM style SQL. Most importantly, it will be something new and unique!

YONG

zubi

Joined: 27 Apr 2006
Posts: 25
Location: Turkey

zubi 17 May 2006, 10:00

Concerning the syntax, I think a 'middle way' would attract more people. The one that is suggested seems too confusing for complex statements and queries... I would also like to participate in the project if it is decided to go ahead (or at least get started :). I think I can contribute to language parsing, as I already have some experience and interest in that. I don't have much knowledge about the internals of a db engine, though.

Mota 17 May 2006, 22:29

heh. well, i decided to abandon db because sourceforge takes ages and eventually i lost interest in said database. So, i'm interested in Madis's idea... care to elaborate a bit more on it?

Madis731 18 May 2006, 20:27

Well, it was just an idea, but if enough people are interested, then we can think of a logical way to let asm programmers express their way of thinking and their syntax in the garment of SQL (actually, I meant vice versa :) )

shoorick 22 May 2006, 09:54

i think there should be some c-call functions like

Code:

ccall _select,[what],[from],pwhere_callback,...

which can be automatically converted to macros like:

Code:

select [what],[from],pwhere_callback,...

where "what","from" are some supported structures to explain wide parameters. [from] can be a direct handle to the table, etc.

it should start as a single-table toy, and may then be extended to a more complex structure.

Mota 22 May 2006, 22:06

Yes, I think that giving it C-compatibility would be a wise move. Very much so.

As for [what] and [from]...

couldn't we just do a

Code:

ccall _select, table, evaluative_func
ccall _mselect, evaluative_func, number of tables, table 1, table 2....

These would return a pointer to a table with all the selected entries.

You could have a function that merges tables together to make another table:

Code:

ccall _merge, n, table 1, table 2, ... , table n

This way, '_mselect' is just a selection on a merger of tables.
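
A minimal C sketch of how `_select`, `_merge` and `_mselect` could relate. Everything concrete here is an assumption: the `struct table` layout, the `table_*` names, the `is_even` callback, and in particular the reading of "merge" as plain concatenation of tables with the same row layout (the post does not say whether a join was meant).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory table: nrows fixed-size rows stored back to back. */
struct table {
    size_t row_size;
    size_t nrows;
    unsigned char *rows;
};

/* The evaluative function: nonzero means "keep this row". */
typedef int (*eval_fn)(const void *row);

/* Append a copy of one row; 0 on success, -1 on allocation failure. */
int table_append(struct table *t, const void *row)
{
    unsigned char *p = realloc(t->rows, (t->nrows + 1) * t->row_size);
    if (p == NULL)
        return -1;
    memcpy(p + t->nrows * t->row_size, row, t->row_size);
    t->rows = p;
    t->nrows++;
    return 0;
}

/* _select: copy every row accepted by keep() into a new table. */
struct table table_select(const struct table *src, eval_fn keep)
{
    struct table out = { src->row_size, 0, NULL };
    for (size_t i = 0; i < src->nrows; i++) {
        const unsigned char *row = src->rows + i * src->row_size;
        if (keep(row))
            table_append(&out, row);   /* error handling omitted in this sketch */
    }
    return out;
}

/* _merge: concatenate n tables that share one row layout. */
struct table table_merge(size_t n, const struct table *const *tables)
{
    struct table out = { n ? tables[0]->row_size : 0, 0, NULL };
    for (size_t k = 0; k < n; k++) {
        assert(tables[k]->row_size == out.row_size);
        for (size_t i = 0; i < tables[k]->nrows; i++)
            table_append(&out, tables[k]->rows + i * out.row_size);
    }
    return out;
}

/* _mselect: a selection on the merger of the tables. */
struct table table_mselect(eval_fn keep, size_t n, const struct table *const *tables)
{
    struct table merged = table_merge(n, tables);
    struct table out = table_select(&merged, keep);
    free(merged.rows);
    return out;
}

/* Hypothetical evaluative function: rows are plain ints, keep the even ones. */
int is_even(const void *row)
{
    int v;
    memcpy(&v, row, sizeof v);
    return v % 2 == 0;
}
```

Building the merged table first is the simplest way to express the idea, but it copies every row twice; a real engine would more likely run the evaluative function over each source table in turn.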

To make tables from scratch...

Code:

ccall _proto

Makes a table prototype.

Code:

ccall _proadd, prot, elemname, elemsize

Would add an element to the prototype.

Code:

ccall _mktable, name, prot

Would create a table from the prototype. Now you can add entries to the table, but this can be figured out later...
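
The prototype calls above could look roughly like this from the C side. This is a sketch under assumptions: the `proto_*` and `mktable` names, the `MAX_FIELDS` limit, and the choice to pack fields back to back without padding are all hypothetical, since the post only names the three calls and their arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FIELDS 16   /* assumption: arbitrary small limit */

struct field { const char *name; size_t offset; size_t size; };

struct proto {
    struct field fields[MAX_FIELDS];
    size_t nfields;
    size_t row_size;     /* sum of all field sizes so far */
};

/* _proto: start an empty prototype. */
struct proto proto_new(void)
{
    struct proto p;
    memset(&p, 0, sizeof p);
    return p;
}

/* _proadd: append a field of elemsize bytes; returns its byte offset
   within a row, or -1 if the prototype is full. */
long proto_add(struct proto *p, const char *elemname, size_t elemsize)
{
    assert(p != NULL && elemname != NULL);
    if (p->nfields == MAX_FIELDS)
        return -1;
    struct field *f = &p->fields[p->nfields++];
    f->name = elemname;
    f->offset = p->row_size;
    f->size = elemsize;
    p->row_size += elemsize;
    return (long)f->offset;
}

/* Look up a field by name; NULL if the prototype has no such field. */
const struct field *proto_field(const struct proto *p, const char *name)
{
    for (size_t i = 0; i < p->nfields; i++)
        if (strcmp(p->fields[i].name, name) == 0)
            return &p->fields[i];
    return NULL;
}

struct table {
    const char *name;
    struct proto proto;  /* copied, so the prototype can be reused */
    size_t nrows;
    unsigned char *rows;
};

/* _mktable: create an empty table from a prototype. */
struct table mktable(const char *name, const struct proto *prot)
{
    struct table t = { name, *prot, 0, NULL };
    return t;
}

/* Hypothetical C structure whose layout matches the prototype
   id(4) + name(16) + age(4), because it happens to need no padding. */
struct person { uint32_t id; char name[16]; uint32_t age; };
```

Because fields are packed back to back, a row matches a C structure only when that structure has no padding, as `struct person` does here. A prototype that must match arbitrary C structures would need to take alignment into account.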

....

Advantages of this system? Well, it works perfectly with C, which is portable; it doesn't require the use of strings (but you *can* use strings); and the structure of entries can match a C structure (or an asm one, for that matter (use virtual)). It would be faster than SQL by a lot, I would think. But then again, it doesn't REALLY match asm as such.

LocoDelAssembly 22 May 2006, 22:29

Mota, with "ccall _select, table, evaluative_func" do you mean that the database engine will call the evaluative_func procedure? In that case it will not work very well with a server database. Another problem is that, since the database engine can't know what evaluative_func does, it will be forced to iterate over all of the table's records.

Note that it is preferable to spend time parsing an SQL statement rather than make one disk read access; CPUs can do a lot of work in a few milliseconds.
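
The difference between an opaque callback and a predicate the engine can inspect can be shown with a small C sketch. It is only an illustration under assumptions: rows are reduced to sorted int keys standing in for an index, and the `select_callback`, `select_equal` and `is_42` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Opaque evaluative function: the engine cannot see what it tests. */
typedef int (*eval_fn)(int key);

/* Callback-based select: every row must be visited.
   Returns the number of matches; *visited receives the rows examined. */
size_t select_callback(const int *keys, size_t n, eval_fn keep, size_t *visited)
{
    assert(visited != NULL);
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (keep(keys[i]))
            hits++;
    *visited = n;
    return hits;
}

/* Declarative predicate "key == value": the engine knows what is asked,
   so it can binary-search the ascending keys instead of scanning them. */
size_t select_equal(const int *keys, size_t n, int value, size_t *visited)
{
    assert(visited != NULL);
    size_t lo = 0, hi = n;           /* candidate range is [lo, hi) */
    *visited = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        (*visited)++;
        if (keys[mid] < value)
            lo = mid + 1;
        else
            hi = mid;
    }
    /* lo is now the first position whose key is >= value */
    size_t hits = 0;
    while (lo + hits < n && keys[lo + hits] == value) {
        hits++;
        (*visited)++;
    }
    return hits;
}

/* Hypothetical evaluative function asking the same question opaquely. */
int is_42(int key) { return key == 42; }
```

With a thousand sorted keys, the callback version examines all of them while the declarative version needs on the order of ten probes. If each probe were a disk read, that gap would dwarf the cost of parsing a query, which is the point made above.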

Regards

shoorick 23 May 2006, 04:21

the idea is to make an sql-compatible set of functions. so, these functions could be used directly on a local base or a server, but could also easily be used when parsing an sql statement. anyway, sql will be needed if you plan to give your application to a user and let him make changes without recompiling the main module.
