flat assembler
Message board for the users of flat assembler.

Index > OS Construction > file system

tom tobias 12 Oct 2007, 11:01
Thanks, Edfed, for starting a very useful thread. Mac2004: I did read about Stallings' book; some reviewers were unimpressed. What do you think of this article, which I find useful:
http://en.wikipedia.org/wiki/Hierarchical_File_System
I remain curious to learn more about the 256 character allocation for a file name. Assuming 26 symbols in the Roman alphabet, that's an awfully large quantity of unique file names, especially if one adds ten Arabic numerals. 32 characters, each of which can be any one of 36 different symbols, give 36^32 (about 6 x 10^49) variations, vastly more than the number of stars in our galaxy, and a lot more than we can ever use.
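
The arithmetic behind this estimate is easy to check. The following is a minimal sketch, assuming a 36-symbol alphabet (26 letters plus 10 digits) and names of exactly 32 characters. The function name `name_count` and the star count used for scale are illustrative assumptions, not figures from the thread.

```python
def name_count(symbols: int, length: int) -> int:
    """Number of distinct names of exactly `length` characters."""
    return symbols ** length

# Rough upper estimate of stars in the Milky Way, used only for scale (assumption).
STARS_IN_GALAXY = 4 * 10**11

names_32 = name_count(36, 32)
print(f"36^32 is about {names_32:.2e}")
print(f"names per star: about {names_32 // STARS_IN_GALAXY:.2e}")
```

Even much shorter names leave plenty of room: the same function can be called with a length of 7 or 8 to see how quickly the count grows.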
Mac2004 12 Oct 2007, 16:30
tom tobias wrote:
Quote:
What do you think of this article, which I find useful:
http://en.wikipedia.org/wiki/Hierarchical_File_System


The article was nice and pretty thorough. One problem is that HFS seems to be proprietary.

About the 256-byte name length: I write documents, among other things, at work, and I have run into the need for longer file names many times. So the 256-byte name length originates from my experience writing docs. Documents must be named so clearly that they are easy to distinguish after several years. 32 chars is just not enough for me, but I can understand your point as well. :)

Besides, the remaining 256 bytes of the header are enough for the rest of the file variables. Shorter filenames would leave more unused space within the header, and there's no point in leaving unused areas there.
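
A one-sector header of this shape could be sketched as follows. Only the split into a 256-byte name field and 256 bytes for everything else comes from the post; the individual fields (first sector, size, creation time, flags), their order and their widths are hypothetical, as are the function names.

```python
import struct

SECTOR_SIZE = 512
NAME_SIZE = 256

# Hypothetical layout for the second half of the header (not from the post):
# first sector, size in bytes, creation time, attribute flags.
FIELDS = struct.Struct("<QQQI")

def pack_header(name: str, first_sector: int, size: int, created: int, flags: int) -> bytes:
    raw_name = name.encode("utf-8")
    if len(raw_name) > NAME_SIZE:
        raise ValueError("name does not fit in the 256-byte field")
    fields = FIELDS.pack(first_sector, size, created, flags)
    header = raw_name.ljust(NAME_SIZE, b"\0") + fields
    return header.ljust(SECTOR_SIZE, b"\0")   # pad the unused tail with zeros

def unpack_header(sector: bytes):
    name = sector[:NAME_SIZE].rstrip(b"\0").decode("utf-8")
    return (name, *FIELDS.unpack_from(sector, NAME_SIZE))

header = pack_header("design review notes 2007-10-12 final.doc", 2048, 70000, 1192206600, 0)
print(len(header), unpack_header(header)[0])
```

Because the whole header is exactly one sector, a single read is enough to get both the name and the file variables.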

I hope this helps you to understand my reasoning. :)

regards,
Mac2004


Last edited by Mac2004 on 12 Oct 2007, 16:34; edited 1 time in total
edfed 12 Oct 2007, 16:34
some files require names longer than 256 bytes

MP3 music that holds the lyrics in the name, for example,
or the names of each member of the group, and so on
and this can be used to store simple files, like text, in the file name alone

in my opinion, file names need to be unlimited in size
and the location of the full name is indexed by the second dword in the file allocation chain; the first dword is the file type, so up to 2^32 file types can be handled by a single OS on a single machine


the big problem is how to access the fs when it's in RAM
how to show that a part of the file is in RAM, and where it is?
an MS-DOS-like handle is not a good idea
i prefer a bit that signals whether the indexed zone is in RAM or on disk
if the bit signals a loaded zone, the index becomes a pointer in RAM
and the index on disk is there to handle the position on disk
if the bit signals a non-loaded zone, the index is a sector location
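
One way to read this idea is as a tagged 32-bit index. The sketch below assumes that the top bit is the "loaded in RAM" flag and that the low 31 bits hold either a RAM pointer or a sector number; the post does not say which bit is used, so the constants and function names here are assumptions.

```python
IN_RAM = 1 << 31            # assumed: top bit set means "zone is loaded in RAM"
VALUE_MASK = IN_RAM - 1     # low 31 bits hold the pointer or the sector number

def make_index(value: int, in_ram: bool) -> int:
    if not 0 <= value <= VALUE_MASK:
        raise ValueError("value does not fit in 31 bits")
    return value | (IN_RAM if in_ram else 0)

def decode_index(index: int):
    """Return ('ram', pointer) or ('disk', sector)."""
    kind = "ram" if index & IN_RAM else "disk"
    return kind, index & VALUE_MASK

print(decode_index(make_index(0x1234, in_ram=False)))
print(decode_index(make_index(0x00100000, in_ram=True)))
```

The cost of this encoding is one bit of range: pointers and sector numbers are limited to 31 bits each.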

i want to implement support for several drives in the same file system

with the partition table, you cut the drive into pieces of pie
without the partition table, i'll group the drives into one virtual drive
like chips on a RAM module
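
Grouping drives into one virtual drive needs a mapping from a virtual sector number to a physical drive and a sector on it. The sketch below assumes simple concatenation (drive 0 first, then drive 1, and so on); interleaving would be another valid reading of the RAM-module analogy. The names `build_spans` and `locate` are made up for illustration.

```python
from bisect import bisect_right
from itertools import accumulate

def build_spans(drive_sizes):
    """Cumulative end sector of each drive within the virtual drive."""
    return list(accumulate(drive_sizes))

def locate(virtual_sector, spans):
    """Map a virtual sector number to (drive index, sector on that drive)."""
    if not 0 <= virtual_sector < spans[-1]:
        raise ValueError("sector is outside the virtual drive")
    drive = bisect_right(spans, virtual_sector)
    start = spans[drive - 1] if drive else 0
    return drive, virtual_sector - start

spans = build_spans([1000, 4000, 2000])   # three drives, sizes in sectors
print(locate(999, spans), locate(1000, spans), locate(6999, spans))
```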
Mac2004 12 Oct 2007, 16:43
edfed wrote:
Quote:

in my opinion, file names need to be unlimited in size
and the location of the full name is indexed by the second dword in the file allocation chain; the first dword is the file type, so up to 2^32 file types can be handled by a single OS on a single machine


The design I was describing is flexible and open to different adaptations. It's up to the programmer to decide. :)
edfed 12 Oct 2007, 17:28
i know
let's go now
the definition of the structure is the most important part
it decides how the functions will work

a 256-byte field is too much for short file names and not enough for very long file names

so i will write the file name at a fixed location, but a pointer will signal the end of the name, which is the beginning of the other parameters (aligned on dwords)
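
The dword alignment mentioned here comes down to rounding an offset up to a multiple of 4. A minimal sketch, assuming the name is zero-terminated and starts right after a 4-byte end pointer (both are assumptions about a layout the post only outlines):

```python
def align_up(offset: int, boundary: int = 4) -> int:
    """Round offset up to the next multiple of boundary (a power of two)."""
    return (offset + boundary - 1) & ~(boundary - 1)

def params_offset(name: bytes, name_start: int = 4) -> int:
    """Offset of the first parameter after a zero-terminated name."""
    end_of_name = name_start + len(name) + 1   # +1 for the terminating zero
    return align_up(end_of_name)

for name in (b"ab", b"abcd", b"12 Asian Dub Foundation.MP3"):
    print(len(name), params_offset(name))
```

A 2-byte name then costs 8 bytes before the parameters start, instead of a fixed 256.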
Mac2004 12 Oct 2007, 17:53
edfed wrote:
Quote:

let's go now
the definition of the structure is the most important part
it decides how the functions will work


That's why the structure needs to be as stable as possible. It's easier to maintain (to code, to test, to debug) especially at the later stages of the file system implementation.

regards
Mac2004
Mac2004 12 Oct 2007, 18:00
Hayden wrote:
Quote:

A garbage-collector file system is probably a good first free file system to code. Each file/directory has a header and data space. The header links files and directories forwards and back and tracks free/used space. As files are deleted they're marked as free. When a new file is written the algo starts at the root and searches the linked list for a big enough space; while doing this you could be doing a semi-defrag on the fly, etc... blah blah...

nb. you could also interlink directories, etc., i.e. a particular directory may be reached via several different paths starting from the root.



Agreed :)

regards,
Mac2004
ManOfSteel 12 Oct 2007, 18:03
@edfed:
Quote:
MP3 music that holds the lyrics in the name, for example, or the names of each member of the group, and so on

MP3 has ID3 tags for that (including a large comment/note section for random data). Why not use them when they are already available? Personally, I find the "%artist/band - %name" combination quite sufficient as a filename.

Quote:
and this can be used to store simple files, like text, in the file name alone

This is wrong in principle, since the filename is "metadata" just like date/time attributes, permissions, etc. A filename is just a title, an identifier, not an essay!

Quote:
in my opinion, file names need to be unlimited in size

... requiring multiple sector read/write operations just to retrieve the filename. Not what I would call a very effective design.
Mac2004 12 Oct 2007, 18:08
ManOfSteel:
Quote:

... requiring multiple sector read/write operations just to retrieve the filename. Not what I would call a very effective design.


That's why I chose the file name length of 256 bytes. The header can be read just by reading 1 sector from the disk.

regards,
Mac2004
edfed 12 Oct 2007, 18:24
yes id3 exists

it was just a little idea

in fact i don't want to freeze the file name structure
why??
because i don't want to waste memory
if a file name is 2 bytes long, only 2 bytes will be used
if a file name is 257 bytes long, only 257 bytes will be used

an index will show the end of the name / beginning of the params
the index is 4 bytes long, to use the speed of 32-bit addressing
ok, i waste 4 bytes
but i gain up to 252 bytes for the name and other stuff

file names need to be unlimited in size, whether the name is long or very short
Mac2004 12 Oct 2007, 19:36
edfed wrote:
Quote:

in fact i don't want to freeze the file name structure
why??
because i don't want to waste memory


Please note:

The header is loaded into a 512-byte temporary buffer when the file is opened. When I want to handle a new file/folder, I just clear the temporary buffer, load the next header from the disk, and place it in the temporary buffer. In fact we are using only a little memory while handling the files/folders.

Quote:
file names need to be unlimited in size, whether the name is long or very short


Well, nothing stops you from doing so in your own implementation of the file system. :) Just be aware of the pros and cons of your implementation.

regards,
Mac2004
edfed 12 Oct 2007, 20:17
this is a diagram of how files are laid out on disk
Code:
org 0
rootdir:
dd @f-$-4                ; size in bytes of the entry table below
.fil00:
dd moile,0,0,0           ; name pointer, first sector, size in sectors, size in bytes
.fil01:
dd edfed,0,0,0
.fil02:
dd nota,0,0,0
.fil03:
dd track12,0,0,0
.fil04:
dd fasmdir!,fasmdir,0,0  ;???? seek a method to handle the real sector number during compilation
@@:
; zero-terminated names, reached through the name pointers above
moile db 'why are there three dwords set to zero? because the first is the first sector of the file',0
edfed db 'and the second is for the full size in sectors and the last is for full size of file in bytes',0
nota db "very long name you know it's not a real problem in this file system cause there is all the bytes you want there",0
track12 db '12 Asian Dub Foundation-RETURN TO JERICHO (dub version).MP3',0
fasmdir! db 'flat assembler',0

align 512                ; pad the folder to a sector boundary
org 0                    ; pointers are relative to the start of each folder
fasmdir:
dd @f-$-4
.fil00:
dd fasm.asm,0,0,0        ; same four-dword entry format as in rootdir
.fil01:
dd fasm.exe,0,0,0
.fil02:
dd fasm.inc,0,0,0
@@:
fasm:
.exe db 'fasm.exe',0
.asm db 'fasm.asm',0
.inc db 'fasm.inc',0
align 512
    

in case of file fragmentation, the file link in the folder points to a header. this case is signaled by size_in_sectors being one (the header's first sector) and size_in_bytes being the effective size of the file, saturated at 4 gigabytes - 1, or 0FFFFFFFFh.
files that are only one sector (mostly 512 bytes) in size, or less, will be accessed directly through the file pointer in the folder.
note that if the file pointer and sizes indicate that the file is one block, instead of fragments, the header will not exist
files that are larger than 4G - 1 are always fragmented and always have a header.
the header is transparent for applications.
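
The folder layout from the listing and the fragmentation rule can be tried out with a small sketch. It assumes little-endian dwords, a 512-byte sector, and one particular reading of the rule: an entry with size_in_sectors equal to one but size_in_bytes larger than a sector points to a header, and anything else is a single contiguous block. The helper names are invented for illustration.

```python
import struct

SECTOR = 512
ENTRY = struct.Struct("<4I")   # name pointer, first sector, size in sectors, size in bytes

def build_folder(files):
    """files: list of (name, first_sector, size_in_sectors, size_in_bytes)."""
    table_size = ENTRY.size * len(files)
    names, entries = b"", b""
    for name, first, sectors, size in files:
        name_ptr = 4 + table_size + len(names)       # offset from folder start
        entries += ENTRY.pack(name_ptr, first, sectors, size)
        names += name.encode("ascii") + b"\0"
    image = struct.pack("<I", table_size) + entries + names
    return image.ljust(-(-len(image) // SECTOR) * SECTOR, b"\0")   # like 'align 512'

def read_folder(image):
    (table_size,) = struct.unpack_from("<I", image, 0)
    for offset in range(4, 4 + table_size, ENTRY.size):
        name_ptr, first, sectors, size = ENTRY.unpack_from(image, offset)
        end = image.index(b"\0", name_ptr)
        yield image[name_ptr:end].decode("ascii"), first, sectors, size

def kind(sectors, size):
    """Guess how an entry is stored, under the reading described above."""
    if sectors == 1 and size > SECTOR:
        return "header"        # fragmented: the pointer leads to a header sector
    return "direct"            # one contiguous block, no header

folder = build_folder([("fasm.asm", 10, 1, 300), ("fasm.exe", 11, 1, 90000)])
for name, first, sectors, size in read_folder(folder):
    print(name, first, kind(sectors, size))
```

Reading every name in the folder touches only the folder's own sectors, which is the property the layout is designed for.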

for the file system, we need other structures, such as the sector occupation bitmap (if a sector is free, bit = 0; if occupied or dead, bit = 1)
dead sectors mustn't appear in the file pointers.
dead sectors are detected during drive format and sector read/write/verify.
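
A sector occupation bitmap of this kind can be sketched in a few lines. The bit meaning (0 = free, 1 = occupied or dead) follows the post; the class name, the bit order within each byte and the first-fit search are assumptions added for illustration.

```python
class SectorBitmap:
    """One bit per sector: 0 = free, 1 = occupied or dead."""

    def __init__(self, sector_count: int):
        self.sector_count = sector_count
        self.bits = bytearray((sector_count + 7) // 8)

    def is_used(self, sector: int) -> bool:
        return bool(self.bits[sector >> 3] & (1 << (sector & 7)))

    def mark(self, sector: int) -> None:
        """Mark a sector as occupied (or dead; the bitmap does not distinguish)."""
        self.bits[sector >> 3] |= 1 << (sector & 7)

    def release(self, sector: int) -> None:
        self.bits[sector >> 3] &= ~(1 << (sector & 7)) & 0xFF

    def find_free_run(self, length: int):
        """First-fit search for `length` contiguous free sectors, or None."""
        run_start, run_len = 0, 0
        for sector in range(self.sector_count):
            if self.is_used(sector):
                run_start, run_len = sector + 1, 0
            else:
                run_len += 1
                if run_len == length:
                    return run_start
        return None

bitmap = SectorBitmap(16)
bitmap.mark(0)      # occupied by a file
bitmap.mark(3)      # dead sector found during format
print(bitmap.find_free_run(2), bitmap.find_free_run(3))
```

Since a dead sector is simply never released, it can never be handed out to a file, which keeps it out of the file pointers.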

dead sectors inside files induce fragmentation.
fragmentation is not a big problem if the fragments are on the same track or on neighbouring tracks.
fragmentation is a problem when there are many small fragments located on various random tracks.


Last edited by edfed on 05 Dec 2007, 02:13; edited 1 time in total
Mac2004 13 Oct 2007, 07:26
tom tobias wrote:
Mac2004: I did read about Stallings' book; some reviewers were unimpressed.


Personally I like it. (Maybe one reason is that I was lucky to find the book at a local bookstore for 5 euros (something like 7 US dollars).)

regards,
Mac2004
tom tobias 13 Oct 2007, 08:20
Mac2004 wrote:
Personally I like it.
Was that the fourth, or fifth edition?
http://www.amazon.com/Operating-Systems-Internals-Design-Principles/dp/0131479547/ref=pd_bbs_5/103-0010915-7579862?ie=UTF8&s=books&qid=1192260829&sr=8-5
Another book, which seems to garner lots of votes from reviewers, was published about 8 years ago by Giampaolo, with an interesting title:
"Practical File System Design". Any impressions? I like that title; it is just what I seek, though reading about the book leaves me wondering: it seems to be focused on the Be file system and other Unix-like file systems.
edfed wrote:
...file names need to be unlimited in size, whether the name is long or very short...
I am sorry to confess to being lost here. To my way of thinking, the name is simply an entry to some data, part of which includes the length of the data, time of creation, last revision, prior revisions, and so on, with an array of pointers to various locations representing these different chunks of data. In my limited-scope imagination, I try to imagine having more than 4 billion different entries in my database (i.e. a full 32-bit index), and I cannot succeed. But 4 billion (i.e. 2^32), while far larger than I, with my feeble imagination, can envision requiring, is infinitesimally TINY compared with 36^32 (26 letters plus ten digits, 32 characters long). For example, 36^7 yields about seventy-eight billion different entries. One field in the entry can correspond to a name of unlimited, but quantifiable, length, it seems to me....
Mac2004 wrote:
The header is loaded into a 512-byte temporary buffer ...
Oops, I am perseverating here (repeating myself, a sure sign of senility) I am hung up on this 512 byte business....
Why not 8192 bytes, for example, as the element of "granularity"? If a typical hard drive has 1 gigabyte (thinking of the situation a decade ago, i.e. about three-four decades ahead of my current mental status), that is, roughly 25 times more capacity than was the case in the late 70's when this 512 byte business erupted, shouldn't we have a commensurate increase in the size of the smallest unit of storage? What is so special about 512 bytes? Sorry to be completely dense.
:)
Mac2004 13 Oct 2007, 10:06
tom tobias wrote:

Quote:

Was that the fourth, or fifth edition?

It is the 3rd edition... I guess that's one reason the book was so cheap...

Quote:

Oops, I am perseverating here (repeating myself, a sure sign of senility) I am hung up on this 512 byte business....
Why not 8192 bytes, for example, as the element of "granularity"?


Probably I'm stuck with thinking of disks (HDD, floppy, etc.) as arrays of sectors. In many cases sectors consist of 512 bytes, so 512 bytes is a 'natural' boundary. Nothing more, nothing less.
For instance, if you use the BIOS int 13h services, you access sectors, not bytes or anything else.
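
The granularity question can be put in numbers: a larger allocation unit means fewer units to track per file, but more unused space in the last unit. A minimal sketch, with a 10,000-byte file chosen purely as an example and function names invented for illustration:

```python
def units_needed(file_size: int, unit: int) -> int:
    """Allocation units needed to hold file_size bytes (ceiling division)."""
    return -(-file_size // unit)

def slack(file_size: int, unit: int) -> int:
    """Bytes allocated but unused in the last unit."""
    return units_needed(file_size, unit) * unit - file_size

for unit in (512, 4096, 8192):
    print(unit, units_needed(10_000, unit), slack(10_000, unit))
```

For this example file, 512-byte units waste 240 bytes across 20 units, while 8192-byte units waste 6384 bytes across 2 units.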

Please note that in my FS design the rest of the file/folder comes right after the header. So the header is just a small portion of a file/folder. A tool, I should say.

regards,
Mac2004
edfed 13 Oct 2007, 11:34
Mac2004 wrote:

The header is loaded into a 512-byte temporary buffer when the file is opened. When I want to handle a new file/folder, I just clear the temporary buffer, load the next header from the disk, and place it in the temporary buffer. In fact we are using only a little memory while handling the files/folders.

Quote:
file names need to be unlimited in size, whether the name is long or very short


Well, nothing stops you from doing so in your own implementation of the file system. :) Just be aware of the pros and cons of your implementation.

regards,
Mac2004


and what do you do if an opened file is reopened?
do you load its header into the temporary buffer one more time?
an opened file needs its header constantly in RAM
as if RAM were an image of the disk

nothing stops me, except one thing:
i want to write a really good fs
and for that i need various opinions

your header is one of the opinions that i'll use
but not with 256-byte file names

and the buffer is a pretty good idea, but when i want to know the file header, it needs to be reloaded from disk

so i need asm coders' opinions
because my fs is oriented toward asm development

and i need the opinions of as many asm coders as possible
:)
Octavio 13 Oct 2007, 14:37
>so i need asm coders' opinions
My opinion is that a filesystem must be hardware independent, not based on a 512-byte sector size; the filesystem should work well on HD, CD, RAM, etc.
> my fs is oriented toward asm development
and especially programming-language independent.

Long filenames are not needed; a name is just a reference, not a description of the file.
Start by implementing the most basic features, but leave it ready for new features to be added in the future.
Also learn how other filesystems work.
Mac2004 13 Oct 2007, 14:55
edfed wrote:
Quote:
and what do you do if an opened file is reopened?


There are at least two solutions I can think of:

1) We can add a check that tests whether the file has already been opened. If it has, there's no need to reload the header.
One must make sure that the file is opened for writing no more than once at a time. Other instances should be read-only.
Otherwise we could damage the data of the file.

2) Use multiple temporary buffers, which allow multiple file instances to be open at the same time.
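
The two solutions combine naturally into an open-file table: one cached header per open file, loaded once, with at most one writer. The sketch below is only one possible shape for it; the class name, the `read_header` callback and the use of the header's sector number as the key are all assumptions.

```python
class OpenFileTable:
    """Keeps one cached header per open file and allows a single writer."""

    def __init__(self, read_header):
        self.read_header = read_header   # callable: header sector -> bytes
        self.entries = {}                # header sector -> [header, readers, has_writer]

    def open(self, header_sector: int, write: bool = False) -> bytes:
        entry = self.entries.get(header_sector)
        if entry is None:                # first open: load the header once
            entry = [self.read_header(header_sector), 0, False]
            self.entries[header_sector] = entry
        if write:
            if entry[2]:
                raise PermissionError("file is already open for writing")
            entry[2] = True
        else:
            entry[1] += 1
        return entry[0]

    def close(self, header_sector: int, write: bool = False) -> None:
        entry = self.entries[header_sector]
        if write:
            entry[2] = False
        else:
            entry[1] -= 1
        if entry[1] == 0 and not entry[2]:
            del self.entries[header_sector]   # last instance closed: drop the cache
```

Reopening a file that is already in the table costs no disk access, and a second attempt to open it for writing is refused rather than risking damage to the data.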


regards,
Mac2004
edfed 14 Oct 2007, 03:13
yes, the sector size is variable, but 512 bytes is the most used

4096 is used in PM paging
65536 is used in the RM segment limit
16 is the RM segment granularity
etc...
so there will be the possibility of a variable cluster size, from 0 to max (2^32-1?)

i always run into a problem when navigating a file system
doing several cd <dir> and cd .. is slow because the next and previous headers are reloaded, and it's frustrating, for example when you look for a lost file in the tree

PCs have at least many megabytes of RAM, so i don't worry about RAM use, and it allows improving the speed

if the file system is asm oriented, it means that it is simple to use in asm, and therefore fast
and if it is simple in asm, it is even simpler in other languages

an auto update is possible in the multiple-instance case: if the file is modified in one instance, the other instances that use this file update their copy in memory
or they may not,
if the file is an audio stream in RAM? or an image? it needs to be updated
if the file is a folder, and other instances modify this folder? it must be updated

one application is possible with very long file names:
using the file system as a list of possible sentences, sorted in alphabetical order, each with its meaning written in the file
i am thinking, for example, of AI and language comprehension
these applications need a powerful database to exist, otherwise they will remain a pure dream

a database that uses the full 64-bit address space needs to manage very long names to classify them by family, IP, local machine, network, etc., like a path but inside the name

there are a lot of possible uses for unlimited name size
edfed 18 Oct 2007, 10:56
36^32 names possible

but...

some names are impossible

like DJKSQGHKFEUZBFHQJKSNAZAUIOFPANUF
LIOIUOZEB898B8D979210707E02B0Z70
MMMMMMMMMMMMMMMMMMMMMMMMM
00009990900808080279406784378729
why? because they are unreadable
no human can use them
only human-readable letter combinations are possible, assuming that it's for human use

very long variable file name examples:
"Metallica-Black album Track 01 ENTER SANDMAN.wav" 48 characters
"Fallout 3 van burren demo technique.zip" 39 characters
"FasmW 1 63 23.zip" 17 characters
"MichaelAbrashBlackBookSource.zip" 32 characters

and why keep the file names in folders, with variable 'unlimited' size?
"fdisk.exe" 9 characters
"put.inc" 7 characters
"box.inc"
"2D.inc" 6 characters
"3D.inc"

r/asmx86/graphic library/>dir
>ls

to read all the file names, you only need to read the folder
a folder is a system file
this happens only in system memory segments
and a sector can contain approximately 24 names,
depending on the name size
a file normally occupies a contiguous zone of sectors
so it is very fast to load a folder
and it doesn't make a lot of hard disk noise

with names stored only in 256-byte file headers, you will make ~329 single-sector reads at various CHS positions
and only use a few bytes for names, about 10 out of 256
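
The comparison can be roughed out in code. The sketch assumes the folder format from the earlier listing (a 4-byte table size, then 16 bytes of entry plus a zero-terminated name per file) and that a per-file header costs exactly one sector read; the file count of 329 is taken from the post, while the 9-character name length is just an example.

```python
SECTOR = 512
ENTRY_SIZE = 16          # four dwords per file entry, as in the folder listing

def folder_sectors(name_lengths):
    """Sectors occupied by a folder holding packed, zero-terminated names."""
    total = 4 + sum(ENTRY_SIZE + n + 1 for n in name_lengths)
    return -(-total // SECTOR)

def header_reads(file_count):
    """One sector read per file when each name lives in its own header."""
    return file_count

names = [9] * 329        # 329 files with 9-character names, e.g. "fdisk.exe"
print(folder_sectors(names), "contiguous sectors versus", header_reads(len(names)), "scattered reads")
```

Under these assumptions the packed folder needs 17 contiguous sectors, against 329 separate reads when every name sits in its own header.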
Copyright © 1999-2024, Tomasz Grysztar. Also on GitHub, YouTube.

Website powered by rwasa.