flat assembler
Message board for the users of flat assembler.

Index > Heap > Petabox and the death of NTFS

Goto page Previous  1, 2, 3 ... 6, 7, 8 ... 10, 11, 12  Next
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17279
Location: In your JS exploiting you and your system
Borsuc: You seem to keep using the same strawman to suggest that none of this can work. 16 EB is not under discussion here.

Things like bad sector scanning are in the usage model. This would clearly have to change. Bad sectors can be scanned on the fly. I'll give you an example that is in use today: NAND flash is delivered from the factory to you, and after some time it starts to get errors, but the computer is not told to constantly scan for them; they are discovered upon usage and remapped on the fly. And there is no reason this cannot be extended to large SM. Just format and assume all sectors are okay, and later mark any bad spots as they are encountered.
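The on-the-fly remapping described above can be sketched as a toy model, loosely in the spirit of what NAND flash controllers do. Everything here (the class, the block counts, the failure set) is invented for illustration; real flash translation layers are far more involved.

```python
class RemappingStore:
    """Toy store: no scan at format time; a block is remapped to a
    spare only when an access actually fails."""

    def __init__(self, n_blocks, n_spares, failing=()):
        self.data = {}                              # physical block -> payload
        self.remap = {}                             # logical -> spare block
        self.spares = list(range(n_blocks, n_blocks + n_spares))
        self.failing = set(failing)                 # blocks that error on write

    def write(self, block, payload):
        target = self.remap.get(block, block)
        if target in self.failing:                  # error discovered on use...
            target = self.spares.pop(0)             # ...remapped on the fly
            self.remap[block] = target
        self.data[target] = payload

    def read(self, block):
        return self.data[self.remap.get(block, block)]

store = RemappingStore(n_blocks=8, n_spares=2, failing={3})
store.write(0, b"fine")
store.write(3, b"hello")      # block 3 is bad: transparently remapped, no scan needed
assert store.read(3) == b"hello"
assert store.remap == {3: 8}  # logical block 3 now lives in spare block 8
```

The point of the sketch is just the access pattern: the full-media scan disappears because defects are handled lazily, at the moment of use.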

The interface is not an issue, there are so many ways to get high quality and high speed access to device data. This is also in the usage model. If we need a PCI/AGP/whatever bus to give a desired access speed then it can be incorporated. Don't worry about it. There is nothing to say we have to have all SM connected by a long flexible cable to the mobo, maybe we plug it in like an SDRAM module into a PCI-X bus. But all this is also assuming we need to gain access to the entire data within some brief period. This is not a given fact. The usage model will be the most important change. And this is also a reason NTFS will not be suitable. NTFS won't match the new usage model requirements.


Last edited by revolution on 07 Apr 2009, 02:04; edited 1 time in total
Post 07 Apr 2009, 01:43
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
Quote:

There is nothing to say we have to have all SM connected by a long flexible cable to the mobo, maybe we plug it in like an SDRAM module into a PCI-X bus.

And in fact something like that existed in the past: http://en.wikipedia.org/wiki/Hardcard (I have seen one of them working on a 386 long ago)
Post 07 Apr 2009, 01:51
f0dder



Joined: 19 Feb 2004
Posts: 3170
Location: Denmark
Borsuc: SATA is currently limited to 300 MB/s (3 Gbit/s SATA link), with 6 Gbit/s on the way (which should allow for 600 MB/s, I guess).

Revolution: I'm not sure the mapping is trivial to do. Consider an application using a 2048-byte buffer for file reading, the typical 4 KiB page size, and a bunch of other data in the same 4 KiB page as the buffer...
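The difficulty can be made concrete: pages can only be remapped wholesale, so zero-copy I/O into a user buffer is only safe when the buffer exactly covers whole pages. The helper below is a hypothetical illustration of that alignment check, not any OS's actual policy.

```python
import mmap

PAGE = mmap.PAGESIZE  # typically 4096

def can_map_zero_copy(buf_offset, buf_len):
    """True only if the buffer starts on a page boundary and spans
    whole pages, so remapping cannot disturb neighbouring data."""
    return buf_offset % PAGE == 0 and buf_len % PAGE == 0

# f0dder's case: a 2048-byte buffer at some odd offset inside a page,
# sharing that page with unrelated data. Remapping the page would clobber
# the neighbours, so the OS has to fall back to a copy.
assert not can_map_zero_copy(1024, 2048)

# A page-aligned, page-multiple buffer could be mapped directly.
assert can_map_zero_copy(0, 2 * PAGE)
```

This is exactly why the "just map the device like RAM" idea degenerates into a copy for typical application buffers.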
Post 07 Apr 2009, 14:06
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17279
Location: In your JS exploiting you and your system
f0dder wrote:
I'm not sure the mapping is trivial to do. Consider an application using a 2048-byte buffer for file reading, the typical 4 KiB page size, and a bunch of other data in the same 4 KiB page as the buffer...
I expect it would be interface specific. If we use an interface as if it were normal RAM then the OS would likely do a copy rather than a mapping, since not many apps expect that writing to RAM is also writing to an HDD; the apps expect to control when the writes happen, so mapping won't work in this case. But the copy is trivial. Either way it is trivial to get data from a RAM-based device. All that is needed is a wide enough address bus.
Post 07 Apr 2009, 14:13
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
Hmm, the thing with the bad sector scanning was an example of total drive scanning. If it takes 1 year to access all your storage, then it's not very useful except for archiving. You can have 1 TB of data, but if the speed is 1 KB/s, trust me, you are not going to make use of it (unless you're archiving).

As for SATA yeah my bad, maybe I confused bytes with bits.

Also, data transfer depends on clock speed, and even the laws of physics (the speed of light) may pose a challenge: at gigahertz clocks, a nanosecond isn't that much time.
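A back-of-the-envelope check of that point, assuming a 3 GHz clock (the clock figure is an illustrative assumption):

```python
C = 299_792_458.0                  # speed of light in vacuum, m/s
CLOCK_HZ = 3e9                     # assumed 3 GHz -> one cycle is ~0.33 ns
cm_per_cycle = C / CLOCK_HZ * 100

# Even light in vacuum covers only ~10 cm per clock cycle; electrical
# signals in board traces propagate at maybe half to two-thirds of that.
assert 9.9 < cm_per_cycle < 10.0
```

So at multi-gigahertz rates, the physical length of the signal path genuinely matters.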
Post 07 Apr 2009, 23:23
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17279
Location: In your JS exploiting you and your system
Borsuc wrote:
Hmm the thing with the bad sector scanning was an example of total drive scanning. If it takes 1 year to access all your storage, then it's not very useful, except for archiving. You can have 1 TB of data, but if the speed is 1KB/s, trust me, you are not going to make use of it (unless you're archiving).

As for SATA yeah my bad, maybe I confused bytes with bits.

Also data transfer works with clock speed. Even the laws of physics (=speed of light) may pose a challenge. (nanoseconds aren't that fast in this respect, just gigahertz).
So, you still think inside the box!

A single bit line cannot transfer data very fast (as you have mentioned), but put 128 of them together and now speed is starting to get very good. Put 256 lines and now the CPU cannot keep up. However, this still shows inside-the-box thinking. It is not a given fact that we have to scan entire drives or that we have to be able to see the whole drive in some small amount of time. All that depends upon how you want to use it.
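The lines-in-parallel arithmetic above can be sketched quickly. The 1 GHz clock and the single transfer per clock are assumptions chosen for easy numbers, not a description of any real bus:

```python
def bus_bandwidth_gbps(lines, clock_hz, transfers_per_clock=1):
    """Peak bandwidth in GB/s: bus width in bits, times clock rate,
    times transfers per clock, converted to bytes."""
    return lines / 8 * clock_hz * transfers_per_clock / 1e9

assert bus_bandwidth_gbps(1, 1e9) == 0.125    # a single line is slow
assert bus_bandwidth_gbps(128, 1e9) == 16.0   # 128 lines: "very good"
assert bus_bandwidth_gbps(256, 1e9) == 32.0   # 256 lines: more than most CPUs can absorb
```

The widths scale linearly, which is the whole of revolution's point; the practical limits (skew, interference) are a separate question.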

If there were exabyte drives (your favourite example, not mine) available I would buy them because then I never have to worry about running out of disc space.

If petabyte drives (my example) can't be made then my prediction is wrong, no matter, it is only a prediction. BUT, if they can be made they will be made I assure you. And no amount of claims saying "You can't make good use of it because ..." will stop them being made. You can't hold back technology.
Post 09 Apr 2009, 14:09
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
revolution wrote:
A single bit line cannot transfer data very fast (as you have mentioned), but put 128 of them together and now speed is starting to get very good.
We tried, that's called IDE. Razz

revolution wrote:
If there were exabyte drives (your favourite example, not mine) available I would buy them because then I never have to worry about running out of disc space.

If petabyte drives (my example) can't be made then my prediction is wrong, no matter, it is only a prediction. BUT, if they can be made they will be made I assure you. And no amount of claims saying "You can't make good use of it because ..." will stop them being made. You can't hold back technology.
The average joe won't. Plus there won't be advertising. "Buy this HD, you'll never run out of space!"? Then what's the point of the NEXT one over that one, with the same advertising line? Since obviously, live media (which is, after all, the sole thing that CAN consume such huge space) is limited by bandwidth. Even games, since they're real-time. Heck, ANYTHING real-time is limited by bandwidth, ignoring computational costs for the moment (which would also have to be very huge to handle all that bandwidth...).

Not many people buy it for archiving. If that were the case, tapes would have been VERY popular some time ago. But they weren't, and they were also expensive, due to lower demand.

Also, revolution, distinguish between theory, prototype, and practice. There have always been prototypes of stuff; that doesn't mean they entered mass production, even if "it was possible". There's a "holographic versatile disc" or HVD with 1 TB of data that should have arrived back in 2006, and while prototypes have been made, something must have kept it from mass production or day-to-day use -- I read somewhere that it was because of heating issues, that it changed properties very quickly under small variations and thus was not suitable.

Just look at CDs and how people still use them. People decide what "exists" in stores. And it is only MY prediction that the average joe won't need that huge space (since he doesn't do archiving anyway), because the "huge movies" or "huge real-time media" aren't possible without adequate bandwidth.

Plus like I said, it would take a whole lot of time to maintain such a drive Razz

Like the saying goes: "What good is storage if you can't access it [in time]?"
Or rather, the entire Universe is already storage (after all...), but since we can't read it (unknown format), we might as well not call it one.

_________________
Previously known as The_Grey_Beast
Post 09 Apr 2009, 17:38
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17279
Location: In your JS exploiting you and your system
IDE runs on a long flexible cable, this is why it has problems. Still thinking inside the box? Remove the cable and many problems go away.

Borsuc wrote:
Plus there won't be advertising. "Buy this HD, you'll never run out of space!"?
A bold claim. I would not have been brave enough to predict how marketing-type execs do their job. And I would have thought the opposite of what you say: if it helps to sell products then they will say whatever is necessary to achieve that.

Why are you so focussed on bandwidth? Why do you think a lack of bandwidth (still not a given fact, but just following your argument) will make drive manufacturers stop developing larger SMs? That just doesn't make sense. They will make them bigger if they can. My claim is "they can", and the "will be made" part naturally follows from that. "Why does a dog lick its balls? Because it can." Razz
Post 09 Apr 2009, 23:21
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
People don't usually buy shit they don't need, and being able to make something only gets you a prototype; those are made all the time. If you want it to succeed in the market it has to sell, though.

Again, this is only my prediction (that it won't sell, unless it has the same price as a smaller-capacity drive, which is unlikely) -- it could very easily turn out to be wrong simply because the average joe might turn out to be an idiot and buy bigger, expensive drives while not using them to their full potential, simply for "the effect".

And bandwidth is important. Storage without bandwidth is like developing a rocket and then blasting it out into space (without doing anything with it). My desk can already hold too much data in its atoms, but since I can't read it, nor write to it, it's worthless.

And of course, PCI Express for example uses a combination of serial + parallel (or only serial???). Although parallel gains are usually small, just small multiples, instead of the several orders of magnitude required for this to work out. (By the way, another problem with parallel buses at high clock speeds is interference and synchronization!)

Also remember, if the huge storage won't be 'justified' by real-time media (the ONLY candidate to waste so much space, except for archiving!), then it's hard to find something to put on it -- unless we get huge bandwidth somehow.


Personally, and I won't lie to you, I think it's better that the trend nowadays, compared to the past, is focusing more on reliable data storage such as SSDs and flash (well, compared to hard drives anyway), and not stupidly huge storage. They are still expensive, though. But hoping people concentrate less on that stupid huge-storage factor and more on improving what you DO with the current storage (reliability, durability, energy consumption, etc., etc.) is probably as much wishful thinking as people actually fixing bugs in their products rather than "covering them up" with 'new and buggier (and bloated) versions'.

We don't even utilize our current storage properly and yet we want more? :rolleyes:
That's like giving a baby who is barely able to walk a motorcycle... in a way Razz

_________________
Previously known as The_Grey_Beast
Post 11 Apr 2009, 00:21
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17279
Location: In your JS exploiting you and your system
So because you don't know of a "good" way to use it, it won't be built? I don't think that is a good reason. The public are stupid; they buy things they don't need. I would buy large SMs because it frees me up to think about other things and not have to concern myself with where to put my data or what I can safely delete. Sure, I could be smarter about what I do with my data, but why should I go to the trouble of being smarter about it when I can just go down to the shop and buy a bigger drive?
Post 11 Apr 2009, 01:02
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
Quote:

People don't usually buy shit they don't need

Really? I see lots of people buying over-featured cellphones only to use them in the same way as a basic one. And this is just an example you can see yourself daily; there are many more, including purchasing computers just because they are decorated with an apparently nice fruit (even though they are clearly more expensive than others for no reason at all...).

Also, over-sized drives would let manufacturers say "almost unlimited space" without being sued. The ability to restore the file system state (at file, directory or volume level) to any point in time (feasible because of the tremendous size) while still having "unlimited" capacity would be greatly welcomed.
Post 11 Apr 2009, 01:03
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
Really, you can do simple math. Even for backup, it would take a shitload of time if you think you're going to do it frequently. Let's say you have a hard drive, and you use only 1% of it during a period of 2 years. Now, does buying one 10x the capacity make you feel any safer about running out of space? Really? (Compared to the first one.)

Maybe you should worry about that in a decade though, when you will reach 10% or 20% or 30% of the original Wink

and even then still waay unjustified

as for people buying stuff, well they do just to brag most times. though this will lose its appeal after the "hey, brag all you want, you still don't use more memory than me!"...

also notice internet connection speed. That's a huge factor as well, seeing as how most people these days fill up their hard drives from the net... Rolling Eyes

_________________
Previously known as The_Grey_Beast
Post 11 Apr 2009, 01:33
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17279
Location: In your JS exploiting you and your system
Borsuc wrote:
... it would take a shitload of time ...
This has not been proven. The interface is not a fixed entity.

And are you referring to peta- or exa-? Because exa- is a whole different thing.
Post 11 Apr 2009, 01:39
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
Quote:

Really, you can do simple math. Even for backup, it would take a shitload of time if you think you're going to do it frequently.

I suppose you are saying this about the restore-the-filesystem-to-any-state-in-time feature? That would just be adjusting the FS data structures to start using the old clusters again. For backups, I don't see why you would do a mirror copy of the entire surface. Also, the free space could be used by the OS to store statistical data or logging of whatever, just because there is a lot of space to use there (for those who want to do extreme tracking of system activity).

And as for the calculations: with a 600 MB/s interface it would take 20 days to transfer the entire disk surface, so excellent, I expect to be able to write data for a long time without problems, including data generated for caching (algorithms that take longer to compute than reading the previous result from media), logging, etc. (i.e. data not generated by me but still beneficial for system speed and/or stability, and that wouldn't need to be backed up).
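For what it's worth, the 20-day figure checks out if the drive in question is the thread's petabyte example (both the 1 PB size and the 600 MB/s rate are taken from the posts above):

```python
PETABYTE = 1e15             # 1 PB, decimal bytes
RATE = 600e6                # 600 MB/s, the SATA 6 Gbit/s payload rate quoted earlier
days = PETABYTE / RATE / 86400

assert 19 < days < 20       # ~19.3 days, i.e. roughly the "20 days" stated
```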

Don't consider just data generated or downloaded by you; there are programs out there that could start to use that space in the same way that programs now tend to use more than 640 KB of RAM.

[edit]If you meant 1 EB media, then let's suppose that 8 GB/s will be possible (isn't it possible already with PCI-E?): in less than 5 years you would transfer the entire surface. But again, I expect that the media would be used for more things than just storing my data + OS and program files, and the only thing that must be backed up is MY data, skipping all the rest in the same way I'm currently skipping pagefile.sys and the TEMP folder, for instance.[/edit]
Post 11 Apr 2009, 02:12
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17279
Location: In your JS exploiting you and your system
LocoDelAssembly: Interesting that you use the word "surface". Might it be possible that the data are not stored on a "surface" any more?
Post 11 Apr 2009, 02:31
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
Let's consider "surface" here as an abstract concept Razz

BTW, I forgot one important scenario: there are situations in which wasting space to store nothing is greatly appreciated (because of the speed gained, not for masochism). There is one thing that even today would make good use of it: hashing. The ability to have lots of buckets would make collisions almost impossible, so accesses would be uniformly fast for any record you want to retrieve. Look at databases with 32-bit primary keys: now you could have a separate bucket for every table row, and you could index some views this way too. Clearly indexing works better when space is huge, and again this is still the kind of data you don't need to back up, so it doesn't matter how many years it takes to copy the entire media. The filesystem could also forget about balancing speed against space efficiency in its internal data structures; now it can use (and waste) a lot of space mercilessly and focus only on being fast.
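The "separate bucket for every key" idea is classic direct addressing: with abundant space, a table keyed by a 32-bit primary key needs no collision handling at all. The sketch below is invented for illustration; the dict stands in for what would really be a huge preallocated bucket array on the storage medium.

```python
class DirectAddressTable:
    """One bucket per possible key: a lookup is a single indexed
    access, with no collisions, probing, or chaining."""

    def __init__(self, key_bits=32):
        self.capacity = 2 ** key_bits   # ~4.3 billion buckets for 32-bit keys
        self.buckets = {}               # surrogate for the preallocated array

    def put(self, key, row):
        assert 0 <= key < self.capacity
        self.buckets[key] = row         # bucket index == key: collision-free

    def get(self, key):
        return self.buckets.get(key)    # always a single probe

t = DirectAddressTable()
t.put(123456, "row A")
t.put(123457, "row B")                  # adjacent keys, still no collision
assert t.get(123456) == "row A"
assert t.get(999) is None               # empty bucket means the key is absent
```

The space cost is enormous and mostly empty, which is exactly Loco's point: it only makes sense when capacity is effectively free.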
Post 11 Apr 2009, 02:45
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 17279
Location: In your JS exploiting you and your system
LocoDelAssembly wrote:
Lets consider "surface" here as an abstract concept Razz
Okay, I'll let you get away with that one. Wink
LocoDelAssembly wrote:
The filesystem could also forget about being faster and space efficient with its internal data structures, now it can use (and waste) a lot of space mercilessly and focus only in being fast.
So this is the "death of NTFS" part that I mentioned.
Post 11 Apr 2009, 02:52
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
revolution wrote:
This has not been proven. The interface is not a fixed entity.
Yeah, but I'm talking about the idea behind serial communications, which needs higher clock frequencies, and those have limits. Of course there might be something new, but I can't even wrap my head around an idea (whether practical or not): if not serial and/or parallel, then how does it do it?

LocoDelAssembly wrote:
I suppose you are saying this for the restore the filesystem to any state in time feature? That would be just adjusting the FS datastructures to start using the old clusters. For backups I don't see why you would do a mirror copy of the entire surface. Also, the free space could be used by the OS to store statistical data or logging whatever thing just because there is a lot of space to use there (for those who want to do extreme tracking of the system activity).
Small, insignificant. The only thing that can be bigger (orders of magnitude bigger) would be MEDIA, of any form (games, movies, etc...). Not statistical or computing programs, unless you are a scientist or something like that. Not for the average joe, that is.

You don't see defragmenting programs getting bigger, or other system-related tasks (several orders of magnitude, remember). You will never need 500 GB of space wasted on a tool to format your hard disk (an example of such a tool), no matter how much memory you have. Except for media, which also must be real-time or it's worthless.

Also, I think you overlooked an important point. What about writing the filesystem (i.e. making the actual backup)? Does that not take up bandwidth, and also eat into the precious "optimizations" and speed improvements claimed for the caching excuse?

LocoDelAssembly wrote:
Don't consider just data generated or downloaded by you, there are programs out there that could start to use that space in the same way that programs now tend to use more than 640 KB of RAM memory.
But I download a lot more than 640 KB.

Caching/logging is a weak excuse, unless of course we keep hard disks forever instead of switching to solid-state media, which has essentially no access-time delays. Better to make good use of the memory we actually NEED than to make up excuses for stupidly bigger media, which, mind you, needs "workarounds" to be 'fast', as can be seen. I think we should focus more on "optimizing" the bytes we already use rather than making more and then trying to find excuses to use them.

_________________
Previously known as The_Grey_Beast
Post 11 Apr 2009, 21:19
LocoDelAssembly
Your code has a bug


Joined: 06 May 2005
Posts: 4633
Location: Argentina
Quote:

Caching/logging is a weak excuse. Unless of course we'll keep on harddisks forever instead of switch to solid-state media, which has absolutely no delays in access times.

Really, what was your point here??? Did you understand what was to be cached? I never talked about caching to avoid extra media accesses; rather, I was talking about using the media to store precomputed things that are slower to recalculate than to read back from a storage device.

Quote:
Also, I think you overlooked an important point. What about writing the filesystem? (i.e making the actual backup) Does that not take up bandwidth, and also, the precious "optimizations" and speed improvements made up by the cache excuse?

You have to back up your data + enough data to reconstruct the FS (which does not include a raw copy of the state as it was). Then, since you supposed (and I also believe) that user data would be a small percentage, it wouldn't take much. The way backups are made will have to change, since in the case of databases, for instance, I don't need to back up the empty buckets, so copying the entire file would be stupid most of the time.

I see you have said nothing about hashing; is that because you passed over that post or because you agree?

Finally, since you are so worried about bandwidth, let me show you what we have now: PC3-16000 DDR3 SDRAM (triple channel): 384.0 Gbit/s = 48.0 GB/s.

So, around 9 months to do a full (and unnecessary, as I've told you already) copy? You can have a baby in the meantime if you are so bored waiting for the copy to complete Laughing (Or realize that in the future the bandwidth will be several times bigger.)
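The "around 9 months" matches a binary exabyte (EiB) at that rate; a decimal 10^18 bytes comes out nearer 8 months. Either way the order of magnitude is what matters here:

```python
RATE = 48e9                        # 48 GB/s triple-channel DDR3, as quoted above

eib_months = 2**60 / RATE / 86400 / 30    # binary exabyte, 30-day months
assert 9.0 < eib_months < 9.5             # ~9.3 months

eb_months = 1e18 / RATE / 86400 / 30      # decimal exabyte
assert 8.0 < eb_months < 8.1              # ~8.0 months
```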

(Sorry for quoting with random sorting)
Quote:
But I download a lot more than 640 KB.


The point is that programs today use more and more memory even though they could do the same with less. Some do so because they are plain shit, and others for storing LUTs to speed up calculations. Databases (even those for "personal use") preallocate storage space to be faster and/or to ensure availability of service (and also for journalling to support transactions?), but currently it is not possible to preallocate huge amounts of space, which prevents you from using a file somewhat like an array, using the key you are searching for as the index into that array.

The only "invented excuses" were the statistical and logging things, which were perhaps somewhat fancy and of no real need, but the scenarios of "over-allocation" of space exist even today, and the more space you have the better they work.

PS: BTW, when I say media I'm referring to the storage media (e.g. magnetic disk, solid state memory, etc), without saying any particular technology because I don't know which one will be.
Post 11 Apr 2009, 23:08
Borsuc



Joined: 29 Dec 2005
Posts: 2466
Location: Bucharest, Romania
LocoDelAssembly wrote:
Really, what was your point here??? Have you understood what was to be cached? I never talked about caching to avoid extra media accesses but rather I was talking about using the media to store precomputed things that are slower to recalculate than accessing them from a storage device.
Sorry, I thought you were talking about hash functions speeding up random access on "weird files" that required it (and possibly defragmenting as well). My bad.

Although, what would that caching actually be? What specific examples? The reason I ask is not because they don't exist, but because they would be small and negligible -- compared even to today's media like movies or games.

LocoDelAssembly wrote:
You have to backup data + enough data to reconstruct the FS (which does not include raw copy of the state as it was). Then, since it was supposed by you (and I also believe) that user data would be a small percentage it wouldn't take much. The way backups are made will have to change since in the case of databases for instance I don't need to backup the empty buckets, so copying the entire file would be stupid most of the time.
But then it wouldn't eat much space either. If you reduce the bandwidth you use, you also reduce the storage needed.

LocoDelAssembly wrote:
I see you have said nothing about hashing, it is because you passed over that post or because you agree?
Well seeing as I have misunderstood you before, can you give me a simple example to make it clearer for me please? Embarassed

LocoDelAssembly wrote:
Finally, since you are over worried about bandwidth let me show you what we have now: PC3-16000 DDR3-SDRAM (triple channel) 384.0 Gbit/s 48.0 GB/s
That's triple channel; you can achieve a similar boost with 3 hard disks, for example (also, that's RAM; even flash memory, which is solid state, is slower... so it's not just because of mechanical parts).

But that's not the point I was making or the way to go. Of course, you can always buy 2 computers this way too to achieve double the storage plus double the bandwidth, but that is not an improvement at all.

2 processors side-by-side don't make an improvement over 1.
1 processor twice as powerful as the previous one does.

see the difference?

LocoDelAssembly wrote:
The only "invented excuses" were the statistical and logging thing that those were perhaps somewhat fancy and of no real need, but the scenarios of "over allocation" of space exists even today and the more space you have the better it work.

PS: BTW, when I say media I'm referring to the storage media (e.g. magnetic disk, solid state memory, etc), without saying any particular technology because I don't know which one will be.
Yes I understand that, but what I meant was that this storage "caching" or whatever needs to be written. Ironically, since it's meant for speed, it SLOWS DOWN your hard disk because it eats up bandwidth.

(ps: replace hard disk with whatever else you want Wink).

_________________
Previously known as The_Grey_Beast
Post 13 Apr 2009, 01:35



Copyright © 1999-2020, Tomasz Grysztar.

Powered by rwasa.