Using free space for more fault-tolerant file systems.

Discussion in 'C Programming' started by Skybuck Flying, Sep 27, 2011.

  1. Hello,

    Here is an idea which might have some value for some people:

    The file system of, for example, Microsoft Windows could use free space to
    store redundant data/copies.

    The free space is not used anyway so it could be used to recover from bad
    sectors more easily.

    The free space remains available as free space but is secretly also used as
    redundant data.

    In the event of a bad sector perhaps the file system can recover from it
    more easily.

    Perhaps this idea is not needed for hard disks and current file systems,
    since they already seem pretty stable; perhaps they already do it? But I
    don't think so.

    However, for newer technologies like SSDs, which might be more error prone,
    it could be an interesting idea to design a new file system, or to modify an
    existing one, so that it can take more advantage of free space for more
    redundancy...

    I have read about SSDs spreading data across their chips to prevent quick
    wear of the same sections (wear leveling), but do they also use free space
    for more redundancy? (I don't think so, but I could be wrong.)

    So, in case this is a new idea with some value, I thought I'd mention it.

    Of course users can also do this manually by making multiple copies of
    folders and important data.

    Perhaps the file system could also be extended with an "importance tag".

    Users could then "tag" certain folders as "highly important".

    The more important the folder is, the more redundancy it would get ;)

    The system itself could have an importance of 2, so that it can survive a
    single bad sector.

    Small folders with "super high importance" could even receive a redundancy
    of 4, 10, maybe even 100.

    (Each level of redundancy means one copy, so 100 would mean 100 copies in
    total: 1 real copy and 99 extra copies in free space.)
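
    To make the "importance tag" a bit more concrete, here is a minimal C
    sketch of how such a tag could work. All names here (redundancy_tag,
    extra_copies_possible, and so on) are made up purely for illustration;
    they are not part of any real Windows/NTFS interface.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical per-folder metadata: how many copies the file system
       should try to keep (1 real copy + N-1 copies hidden in free space). */
    struct redundancy_tag {
        uint32_t copies;              /* total copies wanted, e.g. 2, 4, 100 */
    };

    /* How many extra copies currently fit in free space. The extra copies
       never count against free space: they are the first thing to be
       overwritten when a normal allocation needs the blocks. */
    static uint32_t extra_copies_possible(struct redundancy_tag tag,
                                          uint64_t folder_bytes,
                                          uint64_t free_bytes)
    {
        uint32_t wanted = (tag.copies > 1) ? tag.copies - 1 : 0;
        uint64_t fit    = (folder_bytes > 0) ? free_bytes / folder_bytes : 0;
        return (fit < wanted) ? (uint32_t)fit : wanted;
    }

    int main(void)
    {
        struct redundancy_tag important = { 4 };       /* "highly important" */
        printf("extra copies: %u\n",
               (unsigned)extra_copies_possible(important,
                                               100u * 1024 * 1024,
                                               10ull * 1024 * 1024 * 1024));
        return 0;
    }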

    Bye,
    Skybuck.
     
    Skybuck Flying, Sep 27, 2011
    #1

  2. Now that the main idea/concept has been described, I shall also go into
    some of the more obvious sub-details:

    1. First, a word about performance: this probably shouldn't be too bad...
    the system could make the redundant copies while the system is idling.

    2. I also apologize to the comp.lang.c newsgroup for being off topic, but
    there are a lot of C programmers in there, perhaps some with file system
    skills, or perhaps newbies who want to give it a try! This is an important
    topic, so I think it warrants the off-topicness ;) Don't worry, I won't do
    it again any time soon, since good ideas like these come by rarely ;)

    And that's what I have to say about it for now.

    Goodbye,
    May god bless you ! ;) =D

    Bye,
    Bye,
    Skybuck ! ;) =D
     
    Skybuck Flying, Sep 27, 2011
    #2

  3. "Skybuck Flying" <> writes:

    > Hello,
    >
    > Here is an idea which might have some value for some people:
    >
    > The file system of for example Microsoft Windows could use free space
    > to store redundant data/copies.
    >
    > The free space is not used anyway so it could be used to recover from
    > bad sectors more easily.
    >
    > The free space remains available as free space but is secretly also
    > used as redundant data.
    >
    > In the event of a bad sector perhaps the file system can recover from
    > it more easily.


    Bad sector correction these days is done within the hard drive; the OS
    is not involved.

    If the OS used "empty" parts of the disc for redundant data storage, it
    would fill up. If there was one redundant copy of every sector, the
    drive would hold only half as much.

    People to whom data integrity is that important are already using
    RAID arrays. What you describe would save you in the event of a bad
    sector, but wouldn't save you if the entire drive died. RAID can.

    You also want off-line backups in a separate location, of course, so that a
    fire can't wipe out the drive, the RAID array, and the backups all at
    once.

    -- Patrick
     
    Patrick Scheible, Sep 27, 2011
    #3
  4. John Gordon Guest

    In <> Patrick Scheible <> writes:

    > What you describe would save you in the event of a bad sector


    It would only save you if there's enough free space to clone the
    entire drive, or if the bad sector happened to be chosen for duplication
    in the remaining free space.

    --
    John Gordon A is for Amy, who fell down the stairs
    B is for Basil, assaulted by bears
    -- Edward Gorey, "The Gashlycrumb Tinies"
     
    John Gordon, Sep 27, 2011
    #4
  5. Paul Guest

    John Gordon wrote:
    > In <> Patrick Scheible <> writes:
    >
    >> What you describe would save you in the event of a bad sector

    >
    > It would only save you if there's enough free space to clone the
    > entire drive, or if the bad sector happened to be chosen for duplication
    > in the remaining free space.
    >


    I bet you could make use of the space.

    Think "QuickPar". (This article doesn't do it justice, but you have to
    start somewhere.)

    http://en.wikipedia.org/wiki/Quickpar

    QuickPar was proposed as a "belt and suspenders" method of storing
    data on CDs. For example, you'd write 500MB worth of files and
    store an additional 200MB of parity blocks. If you took a nail and scratched
    out 200MB of data on the CD, the remaining parity blocks could be used to
    reconstitute the original data.

    The PAR method is typically used on USENET, in binary groups, for pirating
    movies. A movie might be chopped up into a thousand USENET postings to a
    binary group. USENET servers may have poor retention, or lose some of the
    postings. If the poster also injects a significant percentage of parity blocks,
    recipients on the other end, downloading the movie, could download the available
    eight hundred data blocks plus two hundred or more parity blocks, and get the
    entire movie to show up on their desktop (as the parity blocks can be used to
    replace the missing data). So the concept was popularized in pirating
    circles, and most of the testing was done there (as to what works and
    what doesn't).
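
    If you just want the flavour of how parity blocks can rebuild missing
    data, here is a stripped-down C sketch using plain single XOR parity
    (the kind of thing RAID-4/5 does). It tolerates one missing block;
    QuickPar's actual Reed-Solomon maths can tolerate as many missing blocks
    as you have parity blocks, so treat this only as an illustration.

    #include <stdio.h>
    #include <string.h>

    #define NBLOCKS 4
    #define BLKSIZE 8

    /* XOR all blocks except 'skip' into 'out'. With one parity block, any
       single missing data block equals the XOR of everything that is left. */
    static void xor_others(unsigned char blocks[][BLKSIZE], int nblocks,
                           int skip, unsigned char *out)
    {
        memset(out, 0, BLKSIZE);
        for (int b = 0; b < nblocks; b++) {
            if (b == skip) continue;
            for (int i = 0; i < BLKSIZE; i++)
                out[i] ^= blocks[b][i];
        }
    }

    int main(void)
    {
        unsigned char blocks[NBLOCKS + 1][BLKSIZE] = {
            "dataAAA", "dataBBB", "dataCCC", "dataDDD"
        };

        /* Build the parity block (stored at index NBLOCKS). */
        xor_others(blocks, NBLOCKS, NBLOCKS, blocks[NBLOCKS]);

        /* "Scratch" block 2, then rebuild it from the survivors + parity. */
        memset(blocks[2], 0, BLKSIZE);
        xor_others(blocks, NBLOCKS + 1, 2, blocks[2]);

        printf("recovered: %s\n", (char *)blocks[2]);   /* prints dataCCC */
        return 0;
    }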

    One of the problems with PAR was that the implementation was less than perfect.
    There is a "maths" problem with the tools in popular circulation, such that
    you can't always recover the data. There were some proposals on how to fix
    that (some kind of sparse matrix), but I stopped following
    the conversation on the subject. I did some testing, i.e. remove a block
    of data, grab a parity block, and see the tool recover the data, so it
    did work in very limited testing. But there are reports from people
    who have enough parity blocks but can't get the data back.

    Paul
     
    Paul, Sep 27, 2011
    #5
  6. Nobody Guest

    On Tue, 27 Sep 2011 18:05:19 -0400, Paul wrote:

    > Think "QuickPar". (This article doesn't do it justice, but you have to
    > start somewhere.)
    >
    > http://en.wikipedia.org/wiki/Quickpar


    > One of the problems with PAR, was the implementation was less than
    > perfect. There is a "maths" problem with the tools in popular
    > circulation, such that you can't always recover the data. There were
    > some proposals on how to fix that (some kind of sparse matrix of some
    > sort), but I stopped following the conversation on the subject. I did
    > some testing, i.e. remove a block of data, grab a parity block, and see
    > the tool recover the data, so it did work in very limited testing. But
    > there are reports, from people who have enough parity blocks but can't
    > get the data back.


    The "maths problem" was in the first version of PAR (and in the academic
    paper on which it was based), which has been deprecated for years. The
    problem was fixed in PAR2, which QuickPAR uses.

    Specifically: the algorithm relies upon generating an (n+m) x n matrix
    (where n is the number of data blocks and m is the number of recovery
    blocks) with the properties that the first n rows are an identity matrix,
    and that any combination of n rows results in an invertible matrix. The
    original algorithm didn't always satisfy the second constraint.

    The fix is to start with a Vandermonde matrix (which inherently satisfies
    the second constraint) and manipulate it using equivalence-maintaining
    operations (swapping columns, multiplying a column by a scalar, adding a
    multiple of another column) until the first constraint is satisfied.
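
    For the curious, here is a small self-contained C sketch of that
    construction: build a Vandermonde matrix over a Galois field and
    column-reduce it until the first n rows are the identity. To keep it
    short it uses GF(2^8) with the common 0x11d polynomial and tiny n and m;
    PAR2 itself uses a larger field and different parameters, so this only
    illustrates the structure, it is not PAR2.

    #include <stdio.h>
    #include <stdint.h>

    enum { N = 4, M = 3, ROWS = N + M };  /* 4 data blocks, 3 recovery blocks */

    /* GF(2^8) multiply, reduction polynomial x^8+x^4+x^3+x^2+1 (0x11d). */
    static uint8_t gf_mul(uint8_t a, uint8_t b)
    {
        uint8_t p = 0;
        while (b) {
            if (b & 1) p ^= a;
            b >>= 1;
            a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        }
        return p;
    }

    /* Multiplicative inverse: a^254 == 1/a for nonzero a in GF(2^8). */
    static uint8_t gf_inv(uint8_t a)
    {
        uint8_t r = 1;
        for (int i = 0; i < 254; i++) r = gf_mul(r, a);
        return r;
    }

    int main(void)
    {
        uint8_t m[ROWS][N];

        /* Start with a Vandermonde matrix: row r is 1, x, x^2, ... for x = r.
           Any N of these rows are linearly independent. */
        for (int r = 0; r < ROWS; r++) {
            uint8_t pow = 1;
            for (int c = 0; c < N; c++) {
                m[r][c] = pow;
                pow = gf_mul(pow, (uint8_t)r);
            }
        }

        /* Column operations (swap, scale, add a multiple of another column)
           until the first N rows form the identity. Column operations keep
           the "any N rows are invertible" property intact. */
        for (int i = 0; i < N; i++) {
            if (m[i][i] == 0)                       /* find a nonzero pivot */
                for (int j = i + 1; j < N; j++)
                    if (m[i][j]) {
                        for (int r = 0; r < ROWS; r++) {
                            uint8_t t = m[r][i];
                            m[r][i] = m[r][j];
                            m[r][j] = t;
                        }
                        break;
                    }
            uint8_t inv = gf_inv(m[i][i]);          /* scale pivot column   */
            for (int r = 0; r < ROWS; r++)
                m[r][i] = gf_mul(m[r][i], inv);
            for (int j = 0; j < N; j++) {           /* clear other columns  */
                if (j == i || m[i][j] == 0) continue;
                uint8_t f = m[i][j];
                for (int r = 0; r < ROWS; r++)
                    m[r][j] ^= gf_mul(f, m[r][i]);
            }
        }

        for (int r = 0; r < ROWS; r++) {   /* identity rows, then coding rows */
            for (int c = 0; c < N; c++) printf("%4u", (unsigned)m[r][c]);
            printf("\n");
        }
        return 0;
    }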

    Apart from the maths problem (which, in practice, meant that you might
    occasionally need one or two more blocks than should have been needed), there
    were more significant limitations, the main ones being a limit of 255
    blocks and the block size being equal to or larger than the largest file
    in the set (i.e. each file was a single block).

    > The PAR method is typically used on USENET, in binary groups, for
    > pirating movies.


    While the PAR/PAR2 implementation is mostly a usenet thing, the underlying
    technology (Reed-Solomon error correction) is used far more widely:
    CDs (audio and data), DSL, QR-codes, and RAID-6 all use it.
     
    Nobody, Sep 28, 2011
    #6
  7. "Patrick Scheible" wrote in message news:...

    "Skybuck Flying" <> writes:

    > Hello,
    >
    > Here is an idea which might have some value for some people:
    >
    > The file system of for example Microsoft Windows could use free space
    > to store redundant data/copies.
    >
    > The free space is not used anyway so it could be used to recover from
    > bad sectors more easily.
    >
    > The free space remains available as free space but is secretly also
    > used as redundant data.
    >
    > In the event of a bad sector perhaps the file system can recover from
    > it more easily.


    "
    Bad sector correction these days is done within the hard drive, the OS
    is not involved.
    "

    True, however from what I remember reading about that, the hard disk only has
    some extra spare space which is used to recover from bad sectors, so not the
    entire drive/platters could be used?

    So there could still be some value in doing it in software ;)

    "
    If the OS used "empty" parts of the disc for redundant data storage, it
    would fill up.
    "

    Not really: the redundant parts are tagged as "free space" but also tagged
    as "redundant data" in case of emergency.

    "
    If there was one redundant copy of every sector, the
    drive would hold only half as much.
    "

    Not all data is equally important; more important files could be made more
    redundant.

    The files which are least redundant could be "emptied" first...

    I do now see a little problem with fragmentation, but perhaps that can be
    solved too ;)
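
    To sketch what I mean in C (all the types and names below are made up
    for illustration, not any real file system structure): a block can be
    free and at the same time hold a shadow copy, and when the allocator
    needs space it simply sacrifices the shadow copies of the least
    important data first.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical per-block bookkeeping: a block is either really free,
       really used, or "free but currently holding a shadow copy".         */
    enum block_state { BLOCK_FREE, BLOCK_USED, BLOCK_SHADOW };

    struct block {
        enum block_state state;
        uint8_t importance;       /* importance of the data shadowed here   */
    };

    /* Allocate one block: prefer truly free blocks, otherwise sacrifice
       the shadow copy of the least important data. Returns -1 if full.    */
    static int alloc_block(struct block *blocks, int nblocks)
    {
        int victim = -1;
        for (int i = 0; i < nblocks; i++) {
            if (blocks[i].state == BLOCK_FREE) { victim = i; break; }
            if (blocks[i].state == BLOCK_SHADOW &&
                (victim < 0 || blocks[i].importance < blocks[victim].importance))
                victim = i;
        }
        if (victim >= 0) blocks[victim].state = BLOCK_USED;
        return victim;
    }

    int main(void)
    {
        struct block disk[4] = {
            { BLOCK_USED,   0 }, { BLOCK_SHADOW, 5 },
            { BLOCK_SHADOW, 1 }, { BLOCK_USED,   0 },
        };
        printf("allocated block %d\n", alloc_block(disk, 4));  /* block 2 */
        return 0;
    }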

    "
    People who to whom data integrity is that important are already using
    RAID arrays. What you describe would save you in the event of a bad
    sector, but wouldn't save you if the entire drive died. RAID can.
    "

    True, there are many other ways of adding redundancy; this (perhaps new) idea
    could add to that, and it probably doesn't cost you a thing except for some
    extra software ;)

    Though perhaps it would wear the hard disk out a little bit sooner (probably
    insignificant for HDs), and perhaps SSDs might also wear a lot faster (could
    be an issue).

    "
    Although you also want off-line backups in a separate location so that a
    fire can't wipe out the drive, the RAID array, and the backups all at
    once.
    "

    Yeah, multiple places, even a data safe ! ;) :)

    Bye,
    Skybuck.
     
    Skybuck Flying, Sep 28, 2011
    #7
  8. Interesting points; they can be used to make the system even more
    interesting.

    When the system has problems reading certain sectors, the system itself
    could mark those as "unreadable".

    However, care should be taken to make sure they really remain "unreadable";
    strange electrical problems could perhaps cause "temporary unreadability".

    So sometimes the system should briefly try again, and perhaps mark the sector
    with a certain percentage of reliability.

    This way, by counting the number of bad sectors, it becomes possible to
    indicate to the user that the hard disk is slowly failing.
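
    A rough C sketch of that retry-and-score idea (read_sector() below is a
    made-up stand-in for a real low-level read, not an actual OS call):

    #include <stdbool.h>
    #include <stdio.h>

    #define MAX_RETRIES 3

    /* Stand-in for a real low-level read (made up for illustration):
       here sector 7 is pretended to be bad, everything else reads fine. */
    static bool read_sector(unsigned long lba, void *buf)
    {
        (void)buf;
        return lba != 7;
    }

    /* Per-sector health score: 100 = fine, lower = increasingly suspect.
       A failed read lowers the score and a later success raises it a bit,
       so a one-off electrical glitch doesn't condemn the sector forever. */
    static int read_with_retries(unsigned long lba, void *buf, int *score)
    {
        for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
            if (read_sector(lba, buf)) {
                if (*score < 100) *score += 5;
                return 0;                            /* success             */
            }
            *score -= 25;                            /* every failure hurts */
        }
        if (*score < 0) *score = 0;
        return -1;   /* give up; the caller should restore from a shadow copy */
    }

    int main(void)
    {
        char buf[512];
        int score = 100;
        int rc = read_with_retries(7, buf, &score);
        printf("result %d, sector health now %d%%\n", rc, score);  /* -1, 25 */
        return 0;
    }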

    Currently such features rely on "SMART", which might not be enabled on all
    systems.

    For example, in my BIOS, SMART is off by default?!?

    I'm not sure why it's off... maybe it can cause hardware problems, or maybe
    it's not supported by Windows.

    Whatever the case may be... a software-based system would work around any
    hardware issues or incompatibilities and might make it easier to report
    pending failures.

    This gives the user some more time to remedy the problems. One unrecoverable
    sector is already enough to cause major headaches... so in the event of a
    pending failure, every possible technology which could prevent a single
    sector from failing would definitely be worth it.

    And I totally agree with you: as soon as the disk starts failing, it should
    be replaced.

    In no way is this system idea meant as a long-term solution; it's mostly meant
    to allow recovery of data from bad hard disks before they totally fail.

    So this system idea gives more time to move data off the disk to something
    else.

    Bye,
    Skybuck.
     
    Skybuck Flying, Sep 28, 2011
    #8
  9. However, another interesting point could be that the system could be used as
    a sort of "second class" hard disk.

    The hard disk is not used for anything important; it's just used to speed up
    reading...

    As long as the hard disk is working, there is no real reason to ditch it...
    it could still be useful for reading.

    As long as the real data is backed up/moved to a new hard disk by the user.

    So this system idea could also make "unreliable disks" still somewhat
    useful! ;)

    Until it totally fails or causes hardware problems, like perhaps hard disk
    intercommunication/bus/protocol problems or something.

    ^ Risky though... it might be difficult to diagnose until it's physically
    disconnected; it could also be a rare electrical situation.

    Bye,
    Skybuck.
     
    Skybuck Flying, Sep 28, 2011
    #9
