flat assembler
Message board for the users of flat assembler.

Index > OS Construction > Idea for OS file system.

JohnFound



Joined: 16 Jun 2003
Posts: 3499
Location: Bulgaria
JohnFound 18 Jun 2003, 13:34
Let's give this forum its first post. :D

Here is one of my ideas about new OSes, for discussion. Everyone knows that string operations are the plague of assembly programming: they are slow and hard to program. On the other hand, filenames and paths in every FS are strings, so every time a file is opened the OS must work with strings. So the idea:

What if every file in the operating system had a unique number instead of a string file name? In addition, for human readability, the OS could maintain a database of string aliases for the files.

Everyone knows that a big OS continuously opens hundreds of system files, libraries, etc. If the OS identifies these files by numbers rather than strings, this process will be very fast.

Second: it will be possible to move files without changing their numbers, so applications can be moved freely between directories; the actual number of the file stays the same and only its alias changes.

Third: if the list of files is sorted (that's not a problem), searching for a file with a given number will be very fast.

So, what do you think about this idea?
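
To make the proposal concrete, here is a minimal sketch in Python (chosen only for brevity; nothing in the thread prescribes it). The names `locations`, `aliases`, `create`, `open_by_id`, `open_by_alias` and `move` are all hypothetical, and the "location" is just an integer standing in for whatever the file system would really store.

```python
# Hypothetical sketch: programs use numbers, people use aliases.
locations = {}   # file ID -> location on disk (an integer placeholder here)
aliases = {}     # human-readable path -> file ID


def create(file_id, location, alias):
    locations[file_id] = location
    aliases[alias] = file_id


def open_by_id(file_id):
    # The fast path: no string handling at all.
    return locations[file_id]


def open_by_alias(alias):
    # The slow path, needed only when a human typed a name.
    return open_by_id(aliases[alias])


def move(old_alias, new_alias):
    # "Moving" a file touches only the alias table; the ID is unchanged.
    aliases[new_alias] = aliases.pop(old_alias)


create(42, location=1000, alias="/apps/editor/editor.bin")
move("/apps/editor/editor.bin", "/tools/editor.bin")
```

After the move, a program that remembered ID 42 still finds the file, while only the human-facing name has changed.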
scientica
Retired moderator


Joined: 16 Jun 2003
Posts: 689
Location: Linköping, Sweden
scientica 18 Jun 2003, 18:38
I have an idea; you may discard it if you like. :)

JohnFound wrote:
Everyone knows that a big OS continuously opens hundreds of system files, libraries, etc. If the OS identifies these files by numbers rather than strings, this process will be very fast.

Sounds like a good idea. How about a file identifier structure (FIS) that looks something like this:
Code:
struc __file_id_node {
 .fileID dq ? ; file ID
 .folderID dq ? ; ID of the folder in which the file is placed
 .fileSize dq ? ; file size
 .fileNodeSize dq ? ; size of the FIS node
 .fileCreationTime FILETIME
 .fileAccessTime FILETIME
 .fileModifiedTime FILETIME
 .filedata FILE_DATA_ARRAY ; an array of pointers to places on the storage medium; an array allows file fragmentation, even if we don't like it :/
}

The .filedata array could be terminated with a null FILE_DATA_ARRAY_NODE element.
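
A rough Python sketch of how such a node could be serialised, under loudly stated assumptions: FILETIME is taken to be a single 64-bit value, each FILE_DATA_ARRAY_NODE is taken to be one 64-bit pointer, and pointer 0 is the null terminator. The helpers `pack_fis` and `unpack_fis` are made up for illustration.

```python
import struct

# fileID, folderID, fileSize, fileNodeSize, then three 64-bit timestamps (assumed).
HEADER = struct.Struct("<7Q")


def pack_fis(file_id, folder_id, size, created, accessed, modified, data_ptrs):
    # Pointer 0 is reserved as the terminator, so it cannot be a real location.
    if 0 in data_ptrs:
        raise ValueError("0 is reserved as the array terminator")
    count = len(data_ptrs) + 1                      # +1 for the null element
    node_size = HEADER.size + 8 * count
    header = HEADER.pack(file_id, folder_id, size, node_size,
                         created, accessed, modified)
    return header + struct.pack("<%dQ" % count, *data_ptrs, 0)


def unpack_fis(blob):
    fields = HEADER.unpack_from(blob, 0)
    ptrs = []
    offset = HEADER.size
    while True:
        (ptr,) = struct.unpack_from("<Q", blob, offset)
        if ptr == 0:                                # null element ends the array
            break
        ptrs.append(ptr)
        offset += 8
    return fields, ptrs


node = pack_fis(1, 0, 1500, 10, 20, 30, [4096, 8192, 12288])
fields, ptrs = unpack_fis(node)
```

Because the array is null-terminated, the reader does not need the node size to find the end of the pointer list, although .fileNodeSize still allows skipping a node without parsing it.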
JohnFound wrote:
Second: it will be possible to move files without changing their numbers, so applications can be moved freely between directories; the actual number of the file stays the same and only its alias changes.

With the above, simply changing the .folderID member should be enough, plus updating the dir table (decreasing/increasing the directory file counts).

JohnFound wrote:
Third: if the list of files is sorted (that's not a problem), searching for a file with a given number will be very fast.

Yes, and sorting the lists can be done via some type of defragmentation. It can be done dir by dir (each dir contains a list of the files in it), simply by swapping the .fileID numbers of the files.
Accessing files/dirs should be done via some system API that keeps a table of active files (with their real .fileID and a file handle). The "users" are only given access to the handle, so that the system can "defragment" the database, skipping files marked "in use" in the system table of active files. The system active file table (AFT) could contain a member that flags files/dirs as locked, and of course a member that says whether the entry is a file or a dir. I think it would be best to have two tables, one for files and one for dirs; thus there could be 2^64 files and 2^64-1 dirs (dir 0 is the root).
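
The handle indirection can be sketched as follows in Python. This is only an illustration: the class `ActiveFileTable` and the function `renumber` are hypothetical names, and the renumbering policy (compact the IDs, but leave IDs of open files alone) is one possible reading of "skipping files marked in use", not something the post specifies.

```python
import itertools


class ActiveFileTable:
    """Maps opaque handles to real file IDs; users only ever see handles."""

    def __init__(self):
        self._next_handle = itertools.count(1)
        self._by_handle = {}                 # handle -> real file ID

    def open(self, file_id):
        handle = next(self._next_handle)
        self._by_handle[handle] = file_id
        return handle

    def close(self, handle):
        del self._by_handle[handle]

    def in_use(self, file_id):
        return file_id in self._by_handle.values()


def renumber(file_ids, aft):
    """Compact IDs to 0, 1, 2, ... while leaving open files untouched."""
    pinned = {f for f in file_ids if aft.in_use(f)}
    # Fresh numbers skip every pinned ID so no two files share a number.
    fresh = (n for n in itertools.count(0) if n not in pinned)
    return {f: (f if f in pinned else next(fresh)) for f in sorted(file_ids)}


aft = ActiveFileTable()
handle = aft.open(9)
mapping_while_open = renumber([5, 9, 2, 40], aft)
aft.close(handle)
mapping_after_close = renumber([5, 9, 2, 40], aft)
```

While file 9 is open it keeps its number; once the handle is released, the next pass is free to renumber it like any other file.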

Here's a suggestion for a dir struct (dir identifier structure (DIS)):
Code:
struc __dir_id_node {
 .folderID dq ? ; folder ID
 .parent_folderID dq ? ; ID of the folder in which this folder is placed; if null, it's under the root
 .dirSize dq ? ; number of file nodes in the dir
 .dirNodeSize dq ? ; size of the DIS node
 .dirCreationTime FILETIME ; creation time of the dir
 .fileAccessTime FILETIME ; last time the dir node was accessed, not any of the files
 .fileModifiedTime FILETIME ; last time the dir node was modified
 .dirs DIR_ARRAY ; an array of 64-bit dirIDs
 .files FILE_ARRAY ; an array of 64-bit fileIDs
}

Imagine how easy it would be to move a dir: just update the parent ID, and all the files are moved. :D
However, it's not that easy. This would require a lot of work with system API functions, keeping track of active files. It's easy in the mind, but in reality (headache) one must make sure two processes don't each add a file to the same dir at exactly the same time.
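
Both points can be illustrated with a small Python sketch. The class `Dir`, the helper `path_ids` and the per-directory lock are assumptions made for the example; a real kernel would use its own synchronisation primitives rather than `threading.Lock`.

```python
import threading


class Dir:
    def __init__(self, folder_id, parent_id):
        self.folder_id = folder_id
        self.parent_id = parent_id          # 0 means "directly under the root"
        self.files = []
        self.lock = threading.Lock()        # serialises changes to this node

    def add_file(self, file_id):
        with self.lock:
            self.files.append(file_id)


def path_ids(dirs, folder_id):
    """Walk parent links up to the root (ID 0) and return the chain of IDs."""
    chain = []
    while folder_id != 0:
        chain.append(folder_id)
        folder_id = dirs[folder_id].parent_id
    return [0] + chain[::-1]


dirs = {1: Dir(1, 0), 2: Dir(2, 0), 3: Dir(3, 1)}
before = path_ids(dirs, 3)      # root -> 1 -> 3
dirs[3].parent_id = 2           # the whole "move": one field changes
after = path_ids(dirs, 3)       # root -> 2 -> 3

# Many writers adding files to the same dir at once; the lock keeps the list consistent.
workers = [threading.Thread(target=dirs[3].add_file, args=(n,)) for n in range(100)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

The move itself is a single assignment; the locking around `add_file` is where the real work of the API would go.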

I hope you understand my idea, because honestly I'm having a hard time following my own thinking. :D :P

/me feels a little dizzy, maybe because I haven't drunk enough water, or is it this post? ;)

_________________
... a professor saying: "use this proprietary software to learn computer science" is the same as English professor handing you a copy of Shakespeare and saying: "use this book to learn Shakespeare without opening the book itself."
- Bradley Kuhn
JohnFound 18 Jun 2003, 19:18
Hm, it's too complex for me too at the moment. :D
My idea was simpler: a sorted array serving as the file index, with small records:

Code:
struc TFileItem {
  .FileID dd ? ; ID number of the file
  .Sector dd ? ; physical sector on the disk
}
    


For 500000 files this structure will take about 4 MB, and I think it can be kept loaded in RAM.

On the disk there will be another, bigger data structure in which every index entry has a corresponding long description: "name", "parent dir", date, etc.
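
As a quick check of the arithmetic and of the lookup, here is a Python sketch that packs the same 8-byte records into one flat buffer and binary-searches it. `build_index` and `find_sector` are illustrative names, and little-endian byte order is an assumption.

```python
import struct

RECORD = struct.Struct("<II")            # FileID, Sector: two 32-bit values


def build_index(items):
    """Pack (file_id, sector) pairs, sorted by file_id, into one bytes object."""
    return b"".join(RECORD.pack(fid, sec) for fid, sec in sorted(items))


def find_sector(index, file_id):
    """Binary search over the packed records; returns None if absent."""
    lo, hi = 0, len(index) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        fid, sector = RECORD.unpack_from(index, mid * RECORD.size)
        if fid == file_id:
            return sector
        if fid < file_id:
            lo = mid + 1
        else:
            hi = mid
    return None


# 500000 files with made-up sector numbers.
index = build_index((fid, fid * 8 + 100) for fid in range(500_000))
```

The buffer comes out at exactly 500000 x 8 = 4,000,000 bytes, and a lookup needs about 19 probes at most.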
scientica 18 Jun 2003, 20:06
JohnFound wrote:
Hm, it's too complex for me too at the moment. :D

OK, this reminds me of a lesson in school. (Where my answer to the question of what determines the speed of a computer was met by the teacher with "Well, my explanation is enough...")

I admit that, in comparison with your idea, mine is a bit overkill. :)

crc



Joined: 21 Jun 2003
Posts: 637
Location: Penndel, PA [USA]
crc 01 Aug 2003, 18:51
Other than having fewer file information fields, how does this differ from an inode-based file system?
Asaf Karagila



Joined: 28 Oct 2003
Posts: 8
Asaf Karagila 29 Oct 2003, 02:08
@scientica and JohnFound:
I am not much of a system coder,
nor a good enough programmer to say I know exactly how a filesystem works,
but as I wish to someday write my own OS (slowly, but safely),
scientica's idea of an ID per file sounds great.
This way you can do several things easily (I think):

1) defrag, as stated before
2) recognize certain files, like system files
3) execute faster dir/ls commands (you just need it to have a certain ID, that's all)
The thing is, you take more space on the hard drive.
Also, for that huge table of names and data, you need to allocate space;
where would you do it?
And how can you know certain things, like how big the file will be?
For instance, you can say "uhm, I can take 15% of the hard drive size",
but isn't that a waste on certain systems, and barely enough on others?

Those are things you have to consider when writing a filesystem.

While John's idea is much simpler, it explains only the first level of the system.
Although I think it is a nice idea, I think it can also be a bad one:
re-sorting the entire allocation table in order to do certain actions that could be done without that.
Of course, that's what you have to think about when you design: the positive and negative sides of each idea.
Up to you whether you take it or not.

_________________
Zero and one are everything.
scientica 29 Oct 2003, 08:08
Asaf Karagila wrote:

2) recognize certain files, like system files
3) execute faster dir/ls commands (you just need it to have a certain ID, that's all)
The thing is, you take more space on the hard drive.
Also, for that huge table of names and data, you need to allocate space;
where would you do it?
And how can you know certain things, like how big the file will be?
For instance, you can say "uhm, I can take 15% of the hard drive size",
but isn't that a waste on certain systems, and barely enough on others?

(Looking back on my post I see lots of details that my idea misses, like the mentioned flags (system file, etc.).)
2) Simply add some more overhead: a dword/qword for file flags/permissions (*nix style "rwxrwxrwx").
3) The overhead, that's an issue. On the other hand, if one has a table with a bit more overhead, say 2 qwords for the offset on the disc (so all that's needed is to seek to that address to get the file), then you could accomplish a 0-byte waste. (But it would require either fragmenting or defragmenting.)
How big a file will be, that's a tricky one. Maybe one should try to keep as much as possible in memory and only write to disc when needed (tradeoff: very failure-prone; a power loss means massive data loss). Or the apps could allocate (via the system/FS API) a chunk of space near the end of the disc, where temporary data is stored, and upon "release" of the file handle the system moves the data to the beginning of the disc, trying to place it optimally.

(Oops, gotta go, I'll finish the post later.)

Dryobates



Joined: 13 Jul 2003
Posts: 46
Location: Poland
Dryobates 29 Oct 2003, 11:54
If you want to use only numbers, just... use the number of the cell in the FAT. ;)
I'm not interested in OS construction, but looking from the user's or the programmer's side I think it would be nice, but...
As a user I would like to use names, not numbers.
As a programmer I would like to use numbers (as I do when using a handle), but I wouldn't want to translate the user's filename to a number myself, so the OS would have to do this.
But the problem is... where will you store the information about the files? All those strings?
I've heard about some file system that was like a database, but as I said, I'm not interested in OSes, so I can't tell you its name. I think you should read a little about different file systems and probably use some idea that is already in use.
Asaf Karagila 29 Oct 2003, 12:30
I was sitting earlier debating this problem with myself,
and I think I found a solution, a temporary one at least.
The string table will be stored as a reversed data structure at the end of the drive,
so when you want to write something to it, you just add data growing back toward the start of the drive.
Of course, at a certain point it will reach the file data; the OS must watch out for that
and recommend a cleanup/defrag/etc.

This idea is very flawed in its current state, but I think it has some potential.
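
A small Python sketch of the two regions growing toward each other. Everything here is hypothetical: the class `Volume`, the sector-granular allocation, and the `warn_gap` threshold at which the OS would suggest a cleanup.

```python
class Volume:
    def __init__(self, total_sectors, warn_gap):
        self.data_end = 0                   # file data occupies [0, data_end)
        self.names_start = total_sectors    # string table occupies [names_start, total)
        self.warn_gap = warn_gap

    def _check_room(self, sectors):
        if self.names_start - self.data_end < sectors:
            raise OSError("volume full: data and string table would collide")

    def alloc_data(self, sectors):
        """File data grows upward from the start of the drive."""
        self._check_room(sectors)
        start = self.data_end
        self.data_end += sectors
        return start

    def alloc_names(self, sectors):
        """The string table grows downward from the end of the drive."""
        self._check_room(sectors)
        self.names_start -= sectors
        return self.names_start

    def needs_cleanup(self):
        """True when the gap is small enough that a defrag should be suggested."""
        return self.names_start - self.data_end <= self.warn_gap


vol = Volume(total_sectors=1000, warn_gap=100)
a = vol.alloc_data(600)      # sectors 0..599
b = vol.alloc_names(250)     # sectors 750..999
```

The appeal of this layout is that neither region needs a size fixed in advance; the cost is that both compete for the same shrinking gap.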

eet_1024



Joined: 22 Jul 2003
Posts: 59
eet_1024 30 Oct 2003, 22:29
Why not have an OS that provides APIs for handling Perl-like data types ($calars, @rrays, %Hashes)?

While you're at it, throw in a nice RegEx API.
Asaf Karagila 30 Oct 2003, 22:54
Heh, did you even READ the stuff we wrote?
We just complained about strings being problematic in asm,
and we are trying to find a way around that with a filesystem.

Why would you even consider writing a regex routine in asm?!
Also, what do regex and APIs have to do with filesystems?

Ralph



Joined: 04 Oct 2003
Posts: 86
Ralph 30 Oct 2003, 23:30
This might be a little off-topic, but I'm curious as to why there is so much discussion about file systems these days. To me it looks like people think file systems are the only archaic thing left in current operating systems and take it upon themselves to modernize them. Gnome Storage, WinFS, object-oriented file systems, etc. While that might all be good, I don't see why we need a new file system paradigm. I'm perfectly happy with the standard directory/file structure we're all using now. In my opinion, file systems should not be overcomplicated like this. There is really no need for an SQL-based system, or a Perl-like system, or an object-based system. If you need to store data in a more complex way, then you create your own application-specific way of doing so within your operating system's rather minimal existing system. Read: if you want to store SQL-based data, then stick it into an SQL database.
Asaf Karagila 30 Oct 2003, 23:59
I might try to answer for myself,
'cause I don't really know what others think.

Programming a function, at least for console mode, is not TOO hard.
I mean, once you have a filesystem working and you have its frontend APIs,
nothing is too hard;
it all becomes simpler.
So I can't answer for other OSes, but for what I am trying to write eventually,
the file system is a major part of the planning, making and writing of the OS.

JohnFound 31 Oct 2003, 14:55
Hi all.
My idea is based on some assumptions:
1. For the CPU it is easier/faster/less bug-prone to work with numbers than with strings.
2. Most programs use files very actively, so they need a very fast way to reach files.
3. For humans it is easier to identify files by strings within a directory tree structure, but this is not so important for programs, as long as the program knows exactly which file it needs.
4. FAT-based file systems use one structure (the FAT) to describe every file in the system. This is not very smart, because any damage to the FAT leads to the loss of all files on the disk (in spite of the fact that the file data is still on the disk).

More detailed description of my idea:

1. There are no clusters: every physical sector is addressed directly. A 32-bit sector number is enough for a 2 TB hard disk (with 512-byte sectors).

2. File structure:
Every file contains two types of sectors:
a. SL sectors (Sector List) - they contain the description of the data sectors.
b. DATA sectors - they contain the file data.
SL format:
Code:
 
Structure of sector list: (doubly linked list of sectors)
+--------+------------------------------------------+
|offset  |  Description                             |
+--------+------------------------------------------+
| 0..3   | DWORD sector number of previous          |
|        | sector in SL. -1 if the sector is first. |
+--------+------------------------------------------+
| 4..7   | DWORD sector number of next sector in SL |
|        | -1 if the sector is last.                |
+--------+------------------------------------------+
| 8..15  | Only for the first sector in SL - QWORD  |
|        | File size in bytes.                      |
+--------+------------------------------------------+
| 8..511 | Array of 126 (124) DWORD - sector numbers|
|16..511 | of consecutive DATA sectors of the file. |
+--------+------------------------------------------+
    

Every file is identified by the first sector of its SL. For better identification and error checking, it may be good to put some ID bytes, a checksum, etc. in the first sector of the SL.

The advantage of this file structure is that every file is independent of the other files: even if every system structure is destroyed, every file can still be recovered (even without its human-readable name).
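
The SL layout above can be exercised with a short Python sketch. The byte order (little-endian) and the way unused array slots are handled are assumptions: the post does not say how a partly filled array is marked, so here the file size from the first SL sector decides how many entries are valid. `parse_sl_sector`, `data_sectors` and the in-memory `disk` are made-up names.

```python
import struct

SECTOR_SIZE = 512
END = 0xFFFFFFFF                      # -1 stored as an unsigned DWORD


def parse_sl_sector(raw, first):
    """Return (prev, next, file_size or None, list of DATA sector numbers)."""
    prev_sec, next_sec = struct.unpack_from("<II", raw, 0)
    if first:
        (size,) = struct.unpack_from("<Q", raw, 8)
        entries = struct.unpack_from("<124I", raw, 16)   # offsets 16..511
    else:
        size = None
        entries = struct.unpack_from("<126I", raw, 8)    # offsets 8..511
    return prev_sec, next_sec, size, list(entries)


def data_sectors(read_sector, first_sl):
    """Follow the SL chain and list the DATA sectors that hold file bytes."""
    _, next_sec, size, entries = parse_sl_sector(read_sector(first_sl), True)
    while next_sec != END:
        _, next_sec, _, more = parse_sl_sector(read_sector(next_sec), False)
        entries += more
    needed = -(-size // SECTOR_SIZE)                     # ceiling division
    return size, entries[:needed]


# A fake 70000-byte file: 137 DATA sectors, so the SL needs two sectors.
numbers = list(range(1000, 1137))
disk = {
    10: struct.pack("<IIQ124I", END, 11, 70000, *numbers[:124]),
    11: struct.pack("<II126I", 10, END, *(numbers[124:] + [0] * 113)),
}
size, sectors = data_sectors(disk.__getitem__, 10)
```

Since the file is named by the sector number of its first SL sector (10 here), everything needed to read it hangs off that one number.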

3. VTOS - volume table of sectors. This is the method of searching for free space when it is needed for writing files:
Every volume (disk, partition) will contain some sectors (at a fixed position) holding a 2-bit mask for every sector on the volume. The size of this structure is not very big. There are 4 possible states for every sector. For example:
00 - free
01 - used
10 - system (fixed, not movable)
11 - bad
The integrity of the VTOS should be checked with a checksum.
When the system needs a free sector, it simply searches the VTOS array for a 00 bit mask.
The advantage is that this structure can be loaded into RAM and updated on disk only sector by sector, on change. Even if this structure is destroyed, it can be rebuilt using the SL sectors of the files.
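
A minimal Python sketch of such a 2-bit table. The class name `VTOS`, the bit order inside each byte (sector 0 in the lowest two bits) and the trivial additive checksum are all assumptions made for illustration; a real implementation would likely use a proper CRC.

```python
FREE, USED, SYSTEM, BAD = 0b00, 0b01, 0b10, 0b11


class VTOS:
    def __init__(self, sector_count):
        self.count = sector_count
        # Four sectors per byte; a zeroed table means every sector is FREE.
        self.bits = bytearray((sector_count + 3) // 4)

    def get(self, sector):
        return (self.bits[sector // 4] >> (sector % 4 * 2)) & 0b11

    def set(self, sector, state):
        shift = sector % 4 * 2
        cleared = self.bits[sector // 4] & ~(0b11 << shift) & 0xFF
        self.bits[sector // 4] = cleared | (state << shift)

    def find_free(self):
        """Linear scan for the first sector whose mask is 00; None if full."""
        for sector in range(self.count):
            if self.get(sector) == FREE:
                return sector
        return None

    def checksum(self):
        # Placeholder integrity check, not a real CRC.
        return sum(self.bits) & 0xFFFFFFFF


table = VTOS(10)
table.set(0, SYSTEM)
table.set(1, USED)
table.set(2, BAD)
first_free = table.find_free()
```

At two bits per sector the table costs one byte for every four sectors, which is 1/2048 of the volume when sectors are 512 bytes.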

4. String names: this is a simple tree of strings, written in a standard file (maybe with a fixed first SL sector), that keeps human-readable names associated with the sector numbers of the files, for use in file browsers and user dialogs.

This file system is not entirely my idea. The file system of DOS for Apple II computers has a similar structure (simpler, of course).
It is true that everything "new" is well-forgotten "old".

Regards.

[EDIT] A further development of the idea:[/EDIT]
To allow fast and easy search by string filename, one more structure may be created, an array of structures:
Code:
struc TFilePlace {
.Hash dd ? ; hash value computed from the full filename
.Sector dd ? ; sector of the disk where the file resides
}

If this array is sorted by hash value, a very fast binary search becomes possible, so when a file is given by name, the first sector of its SL can be found very quickly.
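
A Python sketch of the hash lookup. The post does not name a hash function, so `zlib.crc32` is used here purely as a stand-in 32-bit hash; the post also does not discuss collisions, so this sketch returns every entry with a matching hash and leaves the final name check to the caller. `name_hash`, `build` and `candidates` are illustrative names.

```python
import zlib
from bisect import bisect_left


def name_hash(path):
    return zlib.crc32(path.encode("utf-8"))      # stand-in 32-bit hash


def build(entries):
    """entries: iterable of (path, first SL sector). Returns parallel sorted lists."""
    pairs = sorted((name_hash(path), sector) for path, sector in entries)
    return [h for h, _ in pairs], [s for _, s in pairs]


def candidates(hashes, sectors, path):
    """All sectors whose stored hash equals the path's hash (usually 0 or 1)."""
    target = name_hash(path)
    i = bisect_left(hashes, target)
    found = []
    while i < len(hashes) and hashes[i] == target:
        found.append(sectors[i])
        i += 1
    return found


hashes, sectors = build([
    ("/system/kernel.bin", 10),
    ("/apps/editor.bin", 500),
    ("/docs/readme.txt", 777),
])
```

The string is hashed once per lookup; after that the search compares only 32-bit numbers, which is the point of the original proposal.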
CodeWorld



Joined: 15 Nov 2003
Posts: 69
CodeWorld 19 Nov 2003, 04:51
Are you writing an OS?

_________________
FASM & RUS OSDEV at WWW.SYSBIN.COM (EN: ww2.sysbin.com)
JohnFound 20 Nov 2003, 11:16
CodeWorld wrote:
Are you writing an OS?


Hm, actually not. I simply have some ideas about OSes. ;)
Long, long ago I wrote an OS for Apple II computers, and I have a few small drafts for a PM OS, but nothing serious. Only ideas about the general structure.
Maybe some day... :)
Regards.
CodeWorld 20 Nov 2003, 11:23
Help me with multiprocessing... The way I do it, with the TSSs in the GDT, the maximum is 8192 processes, but in Linux it is 65000... he-he... how? By organising tasks in software? But if so, protection and speed are low...

JohnFound 20 Nov 2003, 11:45
Hm, I don't know how Linux achieves this (are you sure?), but first:
It is possible to use multiple GDTs. Second: do you actually need 65000 processes? In most cases you should write a multi-threaded program, not a multi-process one.

Regards.
CodeWorld 20 Nov 2003, 13:16
JohnFound wrote:
Hm, I don't know how Linux achieves this (are you sure?), but first:
It is possible to use multiple GDTs. Second: do you actually need 65000 processes? In most cases you should write a multi-threaded program, not a multi-process one.

Regards.


How is this handled in Menuet?

1800askgeek



Joined: 04 Apr 2004
Posts: 10
Location: Hawaii
1800askgeek 06 Apr 2004, 01:17
I've never tried writing an OS, or a file structure for that matter, so I'm not the most educated on the issue, but...

The problem is slow boot-ups due to the handling of strings (and searching for those strings on the disk???). So why not have your OS run a program at shutdown time that records exactly where the files you need are on the disk (in the order you need them)? Then, when your computer boots up, the OS reads that one file (assuming it is always stored in the same spot on the disk), and that gives it the exact location of each file it needs. That way, the slowdown moves from start-up to shutdown. And if the file is out of date (set the first bit of the file when you start up, and clear it as the last thing before shutdown?), the OS locates the files slowly, the old-fashioned way (more or less the way it does now).
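
A rough Python sketch of that boot list, with assumptions stated up front: the stale marker is a whole flag byte rather than a single bit, locations are 32-bit sector numbers, and the function names `write_boot_list`, `mark_dirty` and `read_boot_list` are invented for the example.

```python
import struct

CLEAN, DIRTY = 0, 1


def write_boot_list(sectors):
    """At shutdown: store the boot file locations, in load order, marked clean."""
    header = struct.pack("<BI", CLEAN, len(sectors))
    return header + struct.pack("<%dI" % len(sectors), *sectors)


def mark_dirty(blob):
    """At start-up, right after reading the list: flag it as possibly stale."""
    return bytes([DIRTY]) + blob[1:]


def read_boot_list(blob):
    """Return the sector list, or None if it is stale and a slow lookup is needed."""
    flag, count = struct.unpack_from("<BI", blob, 0)
    if flag != CLEAN:
        return None
    return list(struct.unpack_from("<%dI" % count, blob, 5))


blob = write_boot_list([10, 500, 777])
fast_path = read_boot_list(blob)
after_crash = read_boot_list(mark_dirty(blob))
```

If the machine loses power before a clean shutdown rewrites the list, the flag is still dirty at the next boot and the OS falls back to the ordinary lookup.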