flat assembler
Message board for the users of flat assembler.

How can I detect if there is an HDD failure?

vivik



Joined: 29 Oct 2016
Posts: 485
I learned that I can read from or write to a disk or partition directly by opening it with the CreateFile function. I want to try storing information on an HDD without a file system, or with a primitive self-written one (just to make data recovery a bit simpler).

The question is: how can I detect an HDD failure? According to this https://www.pcreview.co.uk/threads/fat32-crc-implementation.3920429/ , there is no CRC or similar in the file system; it's implemented in the driver somehow.
Post 02 Sep 2018, 17:48
revolution
When all else fails, read the source


Joined: 24 Aug 2004
Posts: 16057
Location: 112 Ocean Avenue, Amityville
HDDs store checksum data for each sector. When you read a sector the HDD checks the data integrity and reports back any failures.

You can also use the SMART data to see if an HDD is failing.
Post 02 Sep 2018, 19:36
DimonSoft



Joined: 03 Mar 2010
Posts: 419
Location: Belarus
As for the API, your read function might return a failure code at some point. Say, ReadFile might return 0 and GetLastError will then return something like ERROR_CRC, or maybe some other code like ERROR_SECTOR_NOT_FOUND, depending on the driver logic and the nature of the error. Not that you can do much in such cases.
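
A minimal sketch of that kind of error handling, written in Python rather than assembly to keep it short. It reads a file or device chunk by chunk and records the offsets where a read fails instead of stopping at the first error. The helper name `scan_for_read_errors` and the chunk size are made up for illustration. The two constants are the Windows system error codes named above; on Windows, Python reports them through `OSError.winerror`.

```python
import io

# Windows system error codes named in the post above.
ERROR_CRC = 23
ERROR_SECTOR_NOT_FOUND = 27


def scan_for_read_errors(f, total_size, chunk_size=1024 * 1024):
    """Read `f` in chunks and return a list of (offset, error_code) failures.

    `f` is any seekable binary file object. This is a hypothetical helper,
    not part of any library.
    """
    failures = []
    offset = 0
    while offset < total_size:
        try:
            f.seek(offset)
            data = f.read(min(chunk_size, total_size - offset))
        except OSError as exc:
            # On Windows `winerror` holds the GetLastError value;
            # elsewhere fall back to errno.
            code = getattr(exc, "winerror", None) or exc.errno
            failures.append((offset, code))
            offset += chunk_size  # skip the bad chunk and keep scanning
            continue
        if not data:
            break  # the file or device ended earlier than expected
        offset += len(data)
    return failures
```

For a raw disk on Windows, the file object would presumably come from something like `open(r"\\.\PhysicalDrive1", "rb", buffering=0)`. That is an assumption here: it needs administrator rights, the chunk size has to be a multiple of the sector size, and the device size has to be obtained separately.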
Post 03 Sep 2018, 03:42
revolution
Another thing is that an HDD will usually only detect errors during reads. There are some errors that a write can detect, but that is less common.

So if you want good assurance that your data was written correctly, you can read it back after writing. But be sure to bypass any caching so that you are reading from the actual surface.
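
A rough sketch of the write-then-read-back idea in Python. The helper name `write_and_verify` is hypothetical. Note the big limitation: `os.fsync` only asks the OS to flush its buffers, and the read-back here can still be served from the OS cache or the drive's own cache. Truly bypassing the cache on Windows means opening the handle with CreateFile and flags such as FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH, which plain `open()` does not expose.

```python
import os


def write_and_verify(path, offset, data):
    """Write `data` at `offset`, flush it, then read it back and compare.

    Returns True when the bytes read back match what was written.
    Hypothetical helper for illustration; it does NOT bypass caches.
    """
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to push its cache to the device
    # Reopen so the read does not reuse the writer's buffer.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(len(data)) == data
```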

It is also good practice to read the entire surface occasionally so that the drive's ECC can detect any transient errors and rewrite the affected sectors. This helps to keep the data in a readable state for longer before unrecoverable errors start to accumulate.
Post 03 Sep 2018, 08:19
DimonSoft



As a side note, some bad hard drives might lie about actual writes to the surface: see Raymond’s post plus comments. In the years of crazy marketing it’s quite possible for a hard drive to read back written data from cache instead of the surface for performance reasons.
Post 03 Sep 2018, 08:50
revolution
All of the above comments apply to SSDs (and those cheaper versions of SSDs commonly known as thumb drives) just as much as to HDDs.
Post 04 Sep 2018, 07:52
Furs



Joined: 04 Mar 2016
Posts: 1260
If you want recovery ability, you should use something like what par2cmdline does.

https://github.com/Parchive/par2cmdline

I don't really understand it, seems magic to me. But it works.
Post 04 Sep 2018, 15:48
revolution
Furs wrote:
"If you want recovery ability, you should use something like what par2cmdline does.
https://github.com/Parchive/par2cmdline"

Or you could just use normal backups on separate media and not rely on opaque recovery software ;)
Post 04 Sep 2018, 21:32
Furs



revolution wrote:
"Or you could just use normal backups on separate media and not try to rely upon opaque recovery software ;)"

Well, it's open source. I was thinking he could use the same algorithm and apply it to chunks of sectors in his filesystem (Reed-Solomon coding or whatever it is called, voodoo to me...).

For example, for every 8192-byte block, store an extra 512-byte sector of redundancy. That is 512/8192 = 6.25% redundancy, which should be plenty to repair the block.

Even backups suffer from silent bit rot and the like ;) This would protect them, because it not only detects corruption but also repairs it. Even if you copy silently corrupted data to the backup, as long as you copy the repair data too, chances are you will be able to recover it at some point.
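
To make the "one extra sector per 8192 bytes" layout concrete, here is a small Python sketch. It is not Reed-Solomon and not what par2cmdline does: it uses a CRC32 per sector to find the damaged sector and a plain XOR parity sector to rebuild it, so it can repair at most one bad sector per group. The names `protect`, `repair` and `xor_sectors` are hypothetical, and the per-sector CRCs (4 bytes each) would need to be stored somewhere as well.

```python
import zlib

SECTOR = 512
GROUP = 16  # 16 data sectors = 8192 bytes, plus 1 parity sector (6.25%)


def xor_sectors(sectors):
    """XOR equal-length byte strings together."""
    out = bytearray(len(sectors[0]))
    for s in sectors:
        for i, b in enumerate(s):
            out[i] ^= b
    return bytes(out)


def protect(block):
    """Split an 8192-byte block; return (sectors, crcs, parity)."""
    if len(block) != SECTOR * GROUP:
        raise ValueError("block must be exactly 8192 bytes")
    sectors = [block[i:i + SECTOR] for i in range(0, len(block), SECTOR)]
    crcs = [zlib.crc32(s) for s in sectors]
    return sectors, crcs, xor_sectors(sectors)


def repair(sectors, crcs, parity):
    """Rebuild at most one corrupted sector and return the whole block.

    Raises ValueError when more than one sector fails its CRC, because
    a single parity sector cannot recover that.
    """
    bad = [i for i, s in enumerate(sectors) if zlib.crc32(s) != crcs[i]]
    if len(bad) > 1:
        raise ValueError("more than one bad sector, cannot repair")
    if bad:
        others = [s for i, s in enumerate(sectors) if i != bad[0]]
        sectors = list(sectors)
        # XOR of the parity with all good sectors yields the missing one.
        sectors[bad[0]] = xor_sectors(others + [parity])
    return b"".join(sectors)
```

A real Reed-Solomon code could tolerate more damage for the same overhead, at the cost of considerably more complicated arithmetic.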
Post 04 Sep 2018, 23:08
revolution
Furs wrote:
"For example for every 8192 bytes block, store an extra 512-byte sector of redundancy. This gives you 6.25% redundancy which is plenty to repair it."

The HDDs already do that. They use ECC to recover bad data. That is why doing a full surface read can keep the HDD in a good state. Bad data is repaired and rewritten to spare locations.
Post 04 Sep 2018, 23:43
DimonSoft



Furs wrote:
I was thinking he could use the same algorithm and do it on chunks of sectors for his filesystem (Reed-Solomon coding or whatever it is called, voodoo to me...).

<SillyPun>And they’d better be Reed-Right-Solomon codes, not Reed-only.</SillyPun>
Post 05 Sep 2018, 03:50
vivik



revolution wrote:
"Bad data is repaired and rewritten to spare locations."

Sounds like a file system feature. This wouldn't work with raw linear disk access? Is that NTFS or FAT? I'd prefer data to be written back to the same place it was, because reads are faster when they are sequential.

Also, what is SMART? Is that a program, or additional information that the HDD provides?

Also, is there a ReadFile alternative that can notify me when the file is 50% read? Let's say I read 500 MB into memory in a single read, but I want to notify another thread when the first 1 MB and 10 MB are loaded, so that I can at least check the header while the rest of the file is being read. I'm not sure whether 3 separate ReadFile calls are as efficient as a single one. Async I/O is probably what I'm looking for.
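
One way to picture the notification part, sketched in Python. This is not Windows overlapped I/O (ReadFile with an OVERLAPPED structure), which is the native asynchronous mechanism; it is a portable stand-in that reads in chunks on a worker thread and sets an event once each threshold has been passed. The name `read_with_milestones` and its parameters are hypothetical.

```python
import threading


def read_with_milestones(f, total, milestones, chunk_size=1024 * 1024):
    """Read `total` bytes from `f` on a background thread.

    `milestones` maps a byte count to a threading.Event that is set as
    soon as at least that many bytes are in the buffer.
    Returns (buffer, thread).
    """
    buf = bytearray(total)
    view = memoryview(buf)

    def worker():
        done = 0
        while done < total:
            end = min(done + chunk_size, total)
            n = f.readinto(view[done:end])
            if not n:
                break  # end of file reached early
            done += n
            for threshold, event in milestones.items():
                if done >= threshold:
                    event.set()

    t = threading.Thread(target=worker)
    t.start()
    return buf, t
```

A consumer would call `wait()` on the event for the first 1 MB, inspect the header in `buf`, and later `join()` the thread. Waiting with a timeout is safer, since an event is never set if the file turns out to be shorter than expected.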
Post 06 Sep 2018, 06:33
revolution
vivik wrote:
"Sounds like a file system feature. This wouldn't work with raw linear disk access?"

It is all handled by the HDD/SSD automatically, below the file system, so it works with raw access too.

SMART (Self-Monitoring, Analysis and Reporting Technology) data is maintained by the drive itself. There are programs you can use to view it.
Post 06 Sep 2018, 07:22
Copyright © 1999-2018, Tomasz Grysztar.
