A bit resistant to disruption

Discussion in 'C Programming' started by Francois Grieu, Feb 7, 2010.

  1. /*
    Disclaimer: This is a hard problem, only minimally disguised into
    something on-topic for comp.lang.c

    You are programming a C99 conforming freestanding implementation.
    Disruption (reset or power off/on cycle) can occur at *ANY* time without
    advance warning, and stops whatever operation is underway. The program
    is re-executed from main() after a disruption.

    The objective is to implement a single bit resistant to disruption,
    accessed using two procedures to be written:
    int bRead(void); // Return 0 or 1 according to bit state.
    void bToggle(void); // Change the state of the bit.

    These two properties shall be met:
    a) The bit is stable in the absence of Toggle. That is, the result given
    by any two undisrupted bRead() is the same unless bToggle() was invoked
    in-between.
    b) The bit is changed by Toggle. That is, the result given by any two
    undisrupted bRead() is different if bToggle() was invoked exactly once
    in-between and no disruption occurred during that execution of bToggle().


    The only way to store information across disruptions is using a number
    of independent physical EEPROM cells, designated by index j. Access to
    the EEPROM cells is by the following three functions. The properties
    stated for theses functions apply for any fixed non-negative integer j
    less than the number of cells, and "for that cell" is implied everywhere.

    // Read a designated EEPROM cell; always returns 0 or 1.
    extern int eRead(int j);

    // Erase a designated EEPROM cell.
    // After undisrupted eErase, eRead returns 0 until eProgram is invoked.
    extern void eErase(int j);

    // Program a designated *ERASED* EEPROM cell.
    // After undisrupted eProgram, eRead returns 1 until eErase is invoked.
    // Programming a cell that is not erased cause undefined behavior.
    extern void eProgram(int j);

    An EEPROM cell can be left neither erased nor programmed by a disrupted
    eErase unless that cell was already erased, or by a disrupted eProgram.
    For a cell in such halfway state, eRead returns a value specified only
    to be 0 or 1. That otherwise unspecified value may in particular change
    after disruption (due e.g. to uncontrollable random noise and/or
    temperature and/or voltage drift in the hardware used by eRead), even
    though the sate of cells is unchanged by disruption or/and eRead.

    Note: if a cell is not erased (left in a halfway state or programmed),
    an eProgram of that cell is prohibited. A workaround is to use the
    following rather than eProgram():
    void mySafeProgram(int j)
    {
    eErase(j);
    eProgram(j);
    }

    Before the first run, all EEPROM cells have been erased. eRead, eErase,
    and eProgram terminate unless a disruption occurs, assuming constraints
    on j and eProgram are met. bRead and bToggle must similarly terminate
    unless a disruption occurs.


    Can it be done? If yes, what is the minimum number of cells needed? Give
    a proof.

    Depending on the answer, try to minimize the expected failure rate or
    maximize the expected LED flash rate in the following use case:
    */

    #define NCELLS 10 // number of cells

    extern void sLed(int x); // 0: LED on green then off
    // 1: LED on red then off.
    extern int eRead(int j); // Read a cell.
    extern void eErase(int j); // Erase a cell to 0.
    extern void eProgram(int j); // Program an erased cell to 1.

    int bRead(void); // Return 0 or 1 according to bit state.
    void bToggle(void); // Change the state of the bit.

    // Declarations and code for bRead/bToggle go here

    int main(void)
    {
    int vState;
    vState = bRead();
    while(1)
    {
    sLed(vState);
    bToggle();
    vState = !vState;
    }
    return 0;
    }

    /*
    Assume eProgram, eErase, sLed each last 1 second; everything else is
    comparatively instant; a disruption occurs with odds p each second;
    unspecified eRead values are assigned at random with mean 0.5; the LED
    is off following disruption; a failure is a violation of the
    requirement: if a disruption occurs with the LED on in some color, then
    the first time the program lights a LED it must be with that color.


    Notice that disruption without warning is the standard under which most
    real-life hardware actually operates (with the exception of systems with
    physical protection against adversarial reset and a power reserve with
    means to sense imminent power loss), even though this is often left out
    of the specification. And that real disruption-surviving media (EEPROM,
    Flash, CDRW, though not battery-backed SRAM) do have the limitation that
    read after interrupted change of state sometime gives an unstable
    result. While "programming a cell that is not erased cause undefined
    behavior" is not the rule, it can exist due to a poorly specified API,
    an overcautious hardware specification, or in some cases due to tue
    hardware limitations.

    The problem is derived from actual requirements in a Smart Card. This is
    neither homework, nor an attempt to gather brainpower for free: I'm the
    author of the problem and have conclusively determined the feasibility,
    with a concise proof. A similar problem "A bit in bottles" is in
    rec.puzzles, some of the replies contain answers, and I committed mine.
    http://groups.google.com/group/rec.puzzles/browse_thread/thread/ab892b3cf88af0de

    Francois Grieu
    */
    Francois Grieu, Feb 7, 2010
    #1
    1. Advertising

  2. Francois Grieu

    Stefan Ram Guest

    Francois Grieu <> writes:
    >The objective is to implement a single bit resistant to disruption,
    >accessed using two procedures to be written:
    > int bRead(void); // Return 0 or 1 according to bit state.
    > void bToggle(void); // Change the state of the bit.


    This is a complete sentence defining the »objective«, which
    actually does not allow for any further requirements to be
    listed later. Otherwise it would need to include a clause such as:

    »... to implement a single bit according to requirements
    given below.«.

    For example, if I would write »The objective /is/ to
    calculate the sum of 2 and 3.« The structure of this
    sentence means the objective is defined once and for all.
    I can not add any requirements (»propertier«) to be met later.
    Otherwise, I would need to write »One /part/ of the objective
    is to calculate the sum of 2 and 3.« or »The objective is to
    calculate the sum of 2 and 3 fulfilling the following properties.«
    or so.

    But the sentence cannot be understood as it is written, because
    what it means for a bit to be »resistant to disruption« is unknown.
    (At least to me, that is.)

    >These two properties shall be met:
    >a) The bit is stable in the absence of Toggle. That is, the result given
    >by any two undisrupted bRead() is the same unless bToggle() was invoked
    >in-between.
    >b) The bit is changed by Toggle. That is, the result given by any two
    >undisrupted bRead() is different if bToggle() was invoked exactly once
    >in-between and no disruption occurred during that execution of bToggle().

    ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
    I thought the hard case was a disruption during the execution of bToggle().
    If I can assume that it is not being interrupted, why can't I then
    implement bToggle in the most obvious way?
    Stefan Ram, Feb 7, 2010
    #2
    1. Advertising

  3. Francois Grieu

    Kaz Kylheku Guest

    On 2010-02-07, Francois Grieu <> wrote:
    > The objective is to implement a single bit resistant to disruption,
    > accessed using two procedures to be written:
    > int bRead(void); // Return 0 or 1 according to bit state.
    > void bToggle(void); // Change the state of the bit.


    Note that the power can go out at any time while bToggle
    is executing, creating an obvious ambiguity: did this outage occur
    before or after the bit was toggled?

    What if the power goes out while the the expression bToggle() is
    evaluating, but before the call has taken place? What if the power
    outage occurs just after bToggle has been called, and is returning?

    In the event of a system disruption such as a power loss, the best that
    we can ensure is that the data is /consistent/. There is no way to
    ensure that the data has the absolutely most recent value.

    For instance, a database transaction to update some records either
    did happen or did not happen; the database is not in some in-between
    inconsistent or corrupt state.

    A single bit cannot ever be inconsistent!
    Kaz Kylheku, Feb 7, 2010
    #3
  4. Stefan Ram a écrit :
    > Francois Grieu <> writes:
    >> The objective is to implement a single bit resistant to disruption,
    >> accessed using two procedures to be written:
    >> int bRead(void); // Return 0 or 1 according to bit state.
    >> void bToggle(void); // Change the state of the bit.

    >
    > This is a complete sentence defining the »objective«, which
    > actually does not allow for any further requirements to be
    > listed later. Otherwise it would need to include a clause such as:
    >
    > »... to implement a single bit according to requirements
    > given below.«.


    These requirements are properties a) b). I stand corrected.

    > For example, if I would write »The objective /is/ to
    > calculate the sum of 2 and 3.« The structure of this
    > sentence means the objective is defined once and for all.
    > I can not add any requirements (»propertier«) to be met later.
    > Otherwise, I would need to write »One /part/ of the objective
    > is to calculate the sum of 2 and 3.« or »The objective is to
    > calculate the sum of 2 and 3 fulfilling the following properties.«
    > or so.
    >
    > But the sentence cannot be understood as it is written, because
    > what it means for a bit to be »resistant to disruption« is unknown.
    > (At least to me, that is.)
    >
    >> These two properties shall be met:
    >> a) The bit is stable in the absence of Toggle. That is, the result given
    >> by any two undisrupted bRead() is the same unless bToggle() was invoked
    >> in-between.
    >> b) The bit is changed by Toggle. That is, the result given by any two
    >> undisrupted bRead() is different if bToggle() was invoked exactly once
    >> in-between and no disruption occurred during that execution of bToggle().

    > ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
    > I thought the hard case was a disruption during the execution of bToggle().


    No. If there is a disruption during bToggle, what the next undisrupted
    bRead() returns is left unspecified.

    > If I can assume that it is not being interrupted, why can't I then
    > implement bToggle in the most obvious way?


    Implement bToggle as you wish. The difficult requirement is a), given
    that eRead may return unstable data.

    Francois Grieu
    Francois Grieu, Feb 7, 2010
    #4
  5. Francois Grieu

    Stefan Ram Guest

    Francois Grieu <> writes:

    Given:

    >extern int eRead(int j);
    >extern void eErase(int j);
    >extern void eProgram(int j);
    >void mySafeProgram(int j)


    Wanted:

    >the result given by any two undisrupted bRead() is the same
    >unless bToggle() was invoked in-between.


    So, what about the following implementation?

    int bRead( int const j )
    { int const value = eRead( j );
    if( value )mySafeProgram( j ); else eErase( j );
    return value; }
    Stefan Ram, Feb 7, 2010
    #5
  6. On 2010-02-07, Francois Grieu <> wrote:
    >
    > The objective is to implement a single bit resistant to disruption,
    >


    As has been said, a single bit cannot be "inconsistent" in the
    sense that a database or complex structure might be. If you want
    it to hold across power failures, this is likely a hardware issue.

    bit ^= 1;

    is likely to be a single instruction on your hardware so if you
    #define Toggle() as such, there's nothing more that can be done
    via software.
    Andrew Poelstra, Feb 7, 2010
    #6
  7. Kaz Kylheku wrote:
    > On 2010-02-07, Francois Grieu <> wrote:
    >> The objective is to implement a single bit resistant to disruption,
    >> accessed using two procedures to be written:
    >> int bRead(void); // Return 0 or 1 according to bit state.
    >> void bToggle(void); // Change the state of the bit.

    >
    > Note that the power can go out at any time while bToggle
    > is executing, creating an obvious ambiguity: did this outage occur
    > before or after the bit was toggled?


    This is why after bToggle() is disrupted, the result of the next
    undisrupted bRead() is left unspecified.

    > What if the power goes out while the the expression bToggle() is
    > evaluating, but before the call has taken place? What if the power
    > outage occurs just after bToggle has been called, and is returning?


    Count that as disrupted bToggle().

    If you want a "falsifiable" test, consider the use case given. Any
    correct bRead/bToggle pass this test. Any bRead/bToggle that pass this
    test can be turned into a correct bRead/bToggle2 by having bToggle2
    invoke bRead and discard the result then invoke bToggle.

    > In the event of a system disruption such as a power loss, the best that
    > we can ensure is that the data is /consistent/. There is no way to
    > ensure that the data has the absolutely most recent value.


    Agreed.

    > For instance, a database transaction to update some records either
    > did happen or did not happen; the database is not in some in-between
    > inconsistent or corrupt state.


    Agreed.

    > A single bit cannot ever be inconsistent!


    With actual physical EEPROM or Flash, after a change of a cell has been
    disrupted, two readings without a change are not guaranteed to return
    the same value. This is because a cell is read by comparing the charge
    in the cell to a reference value, with some uncertainty, and that charge
    only changes slowly (in the order of a millisecond, and a power loss, at
    least a deliberate one as in short circuit, occurs faster than that).
    Note this is one area where SRAM + battery backup is better (positive
    feedback between the two sides of a SRAM cell makes it inherently
    stable, or at least makes metastability plain unobservable). All this
    stuff is very often overlooked, and gives rise to all kind of subtle and
    hard to track issues, especially when temperature changes and when
    disruption is frequent or/and adversarial (which must be assumed in
    Smart Cards, especially when remotely powered).

    One usual solution is that after reading a bit as erased, you erase it
    again; and after reading it as programmed, you program it again
    (positive feedback with software); so that next time you read, you'll
    get a stable value. A relatively easy (I would not say trivial) tweak
    using a second bit avoids premature wear out after things have
    stabilized. You then have a "bit" in the usual sense and can use it to
    perform e.g. atomic database transaction using classical algorithms.

    But in the present problem things are made complicated by prohibition of
    over-program of cells.


    Francois Grieu
    Francois Grieu, Feb 7, 2010
    #7
  8. Stefan Ram a écrit :
    > Francois Grieu <> writes:
    >
    > Given:
    >
    >> extern int eRead(int j);
    >> extern void eErase(int j);
    >> extern void eProgram(int j);
    >> void mySafeProgram(int j)

    >
    > Wanted:
    >
    >> the result given by any two undisrupted bRead() is the same
    >> unless bToggle() was invoked in-between.

    >
    > So, what about the following implementation?
    >
    > int bRead( int const j )
    > { int const value = eRead( j );
    > if( value )mySafeProgram( j ); else eErase( j );
    > return value; }
    >


    Assume bRead() just has returned 1.
    A disruption occurs.
    bRead() is called again.
    A disruption occurs during that bRead(), while it is doing
    mySafeProgram(), sometime during eErase() or eProgram(); say, in between
    to simplify.
    bRead() is called again without disruption, and returns 0.

    Property a) is violated, regardless of how you implement bToggle, which
    you did not specify BTW.

    Francois Grieu
    Francois Grieu, Feb 7, 2010
    #8
  9. Andrew Poelstra wrote :
    > As has been said, a single bit cannot be "inconsistent" in the
    > sense that a database or complex structure might be.


    A bit, agreed. The state of a cell, disagreed. See my answer in the
    sub-thread that you mention.

    If you want a useful model of an EEPROM cell and don't want to dive into
    how hardware works, think of a bottle that can only be emptied or filled
    slowly, and is read by grossly weighting it on a scale giving only
    light/heavy reading. This is how classic EEPROM and Flash work
    (multilevel Flash store e.g. 2 bits per cell with a scale having 4
    readings instead of 2, we don't consider that).

    (snip)
    Francois Grieu
    Francois Grieu, Feb 7, 2010
    #9
  10. On 2010-02-07, Francois Grieu <> wrote:
    > Andrew Poelstra wrote :
    >> As has been said, a single bit cannot be "inconsistent" in the
    >> sense that a database or complex structure might be.

    >
    > A bit, agreed. The state of a cell, disagreed. See my answer in the
    > sub-thread that you mention.
    >
    > If you want a useful model of an EEPROM cell and don't want to dive into
    > how hardware works, think of a bottle that can only be emptied or filled
    > slowly, and is read by grossly weighting it on a scale giving only
    > light/heavy reading. This is how classic EEPROM and Flash work
    > (multilevel Flash store e.g. 2 bits per cell with a scale having 4
    > readings instead of 2, we don't consider that).
    >
    > (snip)
    > Francois Grieu


    I know how Flash and EEPROM work, on a basic level - which is why
    I think you need a hardware solution. You can buffer and checksum
    to validate the data as best you can, but in the end if you lose
    power in the middle of a flash write you may very well be screwed.
    Andrew Poelstra, Feb 7, 2010
    #10
  11. Francois Grieu

    Stefan Ram Guest

    Francois Grieu <> writes:
    >A disruption occurs during that bRead(), while it is doing


    I do not see how to implement a solution, but by storing
    an error detecting code into more than one bit for each
    of the values for ON or OFF one could at least /detect/ an
    indefinite bit with a certain probability.
    Stefan Ram, Feb 7, 2010
    #11
  12. Andrew Poelstra wrote:
    > On 2010-02-07, Francois Grieu <> wrote:
    >> Andrew Poelstra wrote :
    >>> As has been said, a single bit cannot be "inconsistent" in the
    >>> sense that a database or complex structure might be.

    >> A bit, agreed. The state of a cell, disagreed. See my answer in the
    >> sub-thread that you mention.
    >>
    >> If you want a useful model of an EEPROM cell and don't want to dive into
    >> how hardware works, think of a bottle that can only be emptied or filled
    >> slowly, and is read by grossly weighting it on a scale giving only
    >> light/heavy reading. This is how classic EEPROM and Flash work
    >> (multilevel Flash store e.g. 2 bits per cell with a scale having 4
    >> readings instead of 2, we don't consider that).
    >>
    >> (snip)
    >> Francois Grieu

    >
    > I know how Flash and EEPROM work, on a basic level - which is why
    > I think you need a hardware solution. You can buffer and checksum
    > to validate the data as best you can, but in the end if you lose
    > power in the middle of a flash write you may very well be screwed.


    I count that as one "I think the answer is: impossible". All the persons
    that I know have solved the problem (that's three including me) seem to
    have changed their mind at least once.

    Francois Grieu
    Francois Grieu, Feb 7, 2010
    #12
  13. Stefan Ram wrote:
    > Francois Grieu <> writes:
    >> A disruption occurs during that bRead(), while it is doing

    >
    > I do not see how to implement a solution, but by storing
    > an error detecting code into more than one bit for each
    > of the values for ON or OFF one could at least /detect/ an
    > indefinite bit with a certain probability.


    Now we are moving into the right direction!

    Hints: try to prove impossibility. Any hole in your proof may lead to a
    solution. That, or prove a procedure, any hole...

    Francois Grieu
    Francois Grieu, Feb 7, 2010
    #13
  14. Francois Grieu

    Moi Guest

    On Sun, 07 Feb 2010 18:51:25 +0100, Francois Grieu wrote:

    > /*
    > Disclaimer: This is a hard problem, only minimally disguised into
    > something on-topic for comp.lang.c
    >
    > You are programming a C99 conforming freestanding implementation.
    > Disruption (reset or power off/on cycle) can occur at *ANY* time without
    > advance warning, and stops whatever operation is underway. The program
    > is re-executed from main() after a disruption.
    >
    > The objective is to implement a single bit resistant to disruption,
    > accessed using two procedures to be written:
    > int bRead(void); // Return 0 or 1 according to bit state. void
    > bToggle(void); // Change the state of the bit.
    >
    > These two properties shall be met:
    > a) The bit is stable in the absence of Toggle. That is, the result given
    > by any two undisrupted bRead() is the same unless bToggle() was invoked
    > in-between.
    > b) The bit is changed by Toggle. That is, the result given by any two
    > undisrupted bRead() is different if bToggle() was invoked exactly once
    > in-between and no disruption occurred during that execution of
    > bToggle().
    >
    >
    > The only way to store information across disruptions is using a number
    > of independent physical EEPROM cells, designated by index j. Access to
    > the EEPROM cells is by the following three functions. The properties
    > stated for theses functions apply for any fixed non-negative integer j
    > less than the number of cells, and "for that cell" is implied
    > everywhere.
    >
    > // Read a designated EEPROM cell; always returns 0 or 1. extern int
    > eRead(int j);
    >
    > // Erase a designated EEPROM cell.
    > // After undisrupted eErase, eRead returns 0 until eProgram is invoked.
    > extern void eErase(int j);
    >
    > // Program a designated *ERASED* EEPROM cell. // After undisrupted
    > eProgram, eRead returns 1 until eErase is invoked. // Programming a cell
    > that is not erased cause undefined behavior. extern void eProgram(int
    > j);
    >
    > An EEPROM cell can be left neither erased nor programmed by a disrupted
    > eErase unless that cell was already erased, or by a disrupted eProgram.
    > For a cell in such halfway state, eRead returns a value specified only
    > to be 0 or 1. That otherwise unspecified value may in particular change
    > after disruption (due e.g. to uncontrollable random noise and/or
    > temperature and/or voltage drift in the hardware used by eRead), even
    > though the sate of cells is unchanged by disruption or/and eRead.
    >
    > Note: if a cell is not erased (left in a halfway state or programmed),
    > an eProgram of that cell is prohibited. A workaround is to use the
    > following rather than eProgram():
    > void mySafeProgram(int j)
    > {
    > eErase(j);
    > eProgram(j);
    > }
    >
    > Before the first run, all EEPROM cells have been erased. eRead, eErase,
    > and eProgram terminate unless a disruption occurs, assuming constraints
    > on j and eProgram are met. bRead and bToggle must similarly terminate
    > unless a disruption occurs.
    >
    >
    > Can it be done? If yes, what is the minimum number of cells needed? Give
    > a proof.


    First shot: two hits in my cerebral hash-table
    1) alternating bit-protocol (difficult with only one bit at a time ...)
    2) Leslie Lamport's reading <--> writing in opposite directions.

    NB, I like the problem (though it is not strictly c.l.c stuff ;-)

    AvK
    Moi, Feb 7, 2010
    #14
  15. Moi wrote :

    >> Can it be done? If yes, what is the minimum number of cells needed? Give
    >> a proof.

    >
    > First shot: two hits in my cerebral hash-table
    > 1) alternating bit-protocol (difficult with only one bit at a time ...)


    Nice. However in the present problem there are no errors, only those
    pesky disruptions.

    > 2) Leslie Lamport's reading <--> writing in opposite directions.


    When I wrote "hard problem", I did not mean quite that hard...

    > NB, I like the problem (though it is not strictly c.l.c stuff ;-)


    At least "freestanding implementation" is on topic and exactly describes
    the context: digital computer, no error, single process, all variables
    lost on disruption, no standard library for persistent storage. And
    c.l.c may be one of the best place to reach programmers facing that
    issue (minus perhaps the prohibition of over-program, which is what
    makes the problem challenging).

    Francois Grieu
    Francois Grieu, Feb 8, 2010
    #15
  16. Francois Grieu

    Moi Guest

    On Mon, 08 Feb 2010 07:26:45 +0100, Francois Grieu wrote:

    > Moi wrote :
    >
    >>> Can it be done? If yes, what is the minimum number of cells needed?
    >>> Give a proof.

    >>
    >> First shot: two hits in my cerebral hash-table 1) alternating
    >> bit-protocol (difficult with only one bit at a time ...)

    >
    > Nice. However in the present problem there are no errors, only those
    > pesky disruptions.
    >
    >> 2) Leslie Lamport's reading <--> writing in opposite directions.

    >
    > When I wrote "hard problem", I did not mean quite that hard...


    I think that your failed eeprom_reset_bit() and eeprom_set_bit()
    operations are more or less equivalent to Lamport's "sending messages to another process".
    Messages can be lost in transit, your eeprom-operations can fail
    because of a crash.
    The only difference is that in the eeprom-case there is a sort of tri-state
    logic; in the case of lost messages the domain stays restricted
    to just two cases.

    If I understood the problem correct:
    1) the machine can be expected to crash *any moment*, so even while "in recovery mode"
    2) an undefined cell (either rewritten or partially cleared) produces a random
    value; subsequent reads may yield different values.

    My current approach uses a state machine with a kind of graycodes
    (maybe necklaces: http://www.theory.csc.uvic.ca/~cos/inf/neck/NecklaceInfo.html ; I am not
    sure yet how this should be named)
    The guiding idea is that in any state there is a maximal number of
    bits that could be undefined (tri-state): the bits that could have been affected
    by the previous or next state in the graph.
    While recovering this number is two, in "normal operation" it may perhaps be just one.

    My guess is I need three or four bits / bottles.
    .... still working on it ...

    AvK
    Moi, Feb 8, 2010
    #16
  17. Moi wrote :
    > On Mon, 08 Feb 2010 07:26:45 +0100, Francois Grieu wrote:
    >
    >> Moi wrote :
    >>
    >>>> Can it be done? If yes, what is the minimum number of cells needed?
    >>>> Give a proof.
    >>> First shot: two hits in my cerebral hash-table 1) alternating
    >>> bit-protocol (difficult with only one bit at a time ...)

    >> Nice. However in the present problem there are no errors, only those
    >> pesky disruptions.
    >>
    >>> 2) Leslie Lamport's reading <--> writing in opposite directions.

    >> When I wrote "hard problem", I did not mean quite that hard...

    >
    > I think that your failed eeprom_reset_bit() and eeprom_set_bit()
    > operations are more or less equivalent to Lamport's "sending messages to another process".
    > Messages can be lost in transit, your eeprom-operations can fail
    > because of a crash.
    > The only difference is that in the eeprom-case there is a sort of tri-state
    > logic; in the case of lost messages the domain stays restricted
    > to just two cases.


    Yes I see your point. One redeeming difference is that things are
    strictly sequential / singe-process in my case.

    > If I understood the problem correct:
    > 1) the machine can be expected to crash *any moment*, so even while "in recovery mode"


    Yes. Notice this is very real-world.

    > 2) an undefined cell (either rewritten or partially cleared) produces a random
    > value; subsequent reads may yield different values.


    Yes, with one exception: if a cell/bottle was erased/empty, and you
    erase/empty it again while a disruption occurs, it remains erased/empty.
    The makers of EEPROM will reluctantly accept to give this insurance,
    which is self-evident with bottles. There is no equivalent in the other
    direction, for you can't program/fill a cell/bottle unless it is fully
    erased/empty [or so insisted the makers of my EEPROM].

    > My current approach uses a state machine with a kind of graycodes
    > (maybe necklaces:http://www.theory.csc.uvic.ca/~cos/inf/neck/NecklaceInfo.html ; I am not
    > sure yet how this should be named)
    > The guiding idea is that in any state there is a maximal number of
    > bits that could be undefined (tri-state): the bits that could have been affected
    > by the previous or next state in the graph.
    > While recovering this number is two, in "normal operation" it may perhaps be just one.
    >
    > My guess is I need three or four bits / bottles.
    > ... still working on it ...


    You are the first poster with a proposition that I won't contradict, and
    are so far untouched by the "at least one public change of mind" curse!

    > AvK


    Francois Grieu
    Francois Grieu, Feb 9, 2010
    #17
  18. Francois Grieu

    Seebs Guest

    On 2010-02-09, Francois Grieu <> wrote:
    > You are the first poster with a proposition that I won't contradict, and
    > are so far untouched by the "at least one public change of mind" curse!


    The tricky part, so far as I can tell, is that we can't tell what a
    partially-written bit might look like. On the other hand, in theory,
    only one write can be interrupted, as I understand it.

    Hmm.

    Okay, here's an approach. Imagine without loss of generality that our
    goal is to toggle the bit in cell 0, and that we have additional bits to
    toggle. The bit is "T".

    Let's see. Okay, when toggle() is called, we'll start by clearing a working
    area. So we have some unspecified number of bits which is our working area,
    call them W[0] ... W[N].

    The first thing we need to do is be sure that we can recover the previous
    value in the event of a disruption. So, necessarily, we need a bit to hold
    the known-previous-value. Let's call that W0. But how do we know whether
    W0 holds a bit we care about? We need a flag for whether W0 has been written.
    Let's call that W1. So, read() should do something like:
    if (W1)
    return W0
    else
    return T

    What about toggle()? If W1 is set, then it was interrupted during an
    operation. It should try to extract the value from W0 and push it back
    into T. Once it has done that, it can clear W1 (because it no longer
    needs to be able to remember the value in W0), then W0. That reduces us
    to the default state, which is that T holds the current bit and we wish
    to toggle it.

    Given that, we then:
    * Copy T into W0
    * Set W1
    * Erase T
    * if T was previously 0, program T
    * erase W1
    * erase W0

    The rationale of this ordering is that, at every time:
    * if W1 is set, W0 holds the correct value of T
    * if W1 is unset, T is known to be correct

    If we are interrupted while setting W1, we were interrupted before we modified
    T, so that's not a problem -- T is still at its correct old value. If we
    are interrupted while setting W0, we were interrupted before we set W1, so
    that's not a problem -- T is still at its correct old value. If we are
    interrupted during any of the erasing or programming of T, that's not a
    problem -- W0 and W1 are set, so we'll pick up the right value next time
    we come in. If we are interrupted during the erase of W1, that's fine. If
    it gets erased, it's clear and T has the right value. If it doesn't get
    erased, it's still set, and W0 still holds the correct "previous value",
    so we'll be in one of the trustworthy intermediate states.

    So I think that's it.

    So!

    // Read a designated EEPROM cell; always returns 0 or 1.
    extern int eRead(int j);
    // Erase a designated EEPROM cell.
    // After undisrupted eErase, eRead returns 0 until eProgram is invoked.
    extern void eErase(int j);
    // Program a designated *ERASED* EEPROM cell.
    // After undisrupted eProgram, eRead returns 1 until eErase is invoked.
    // Programming a cell that is not erased cause undefined behavior.
    extern void eProgram(int j);

    /* default state:
    * bit 0 holds the current Durable Bit
    * bit 1 is undefined
    * bit 2 is clear
    *
    * during a toggle:
    * bit 2 is set
    * bit 1 holds the current Durable Bit
    * bit 0 is undefined.
    *
    */

    int
    bRead(void) {
    if (eRead(2)) {
    return eRead(1);
    } else {
    return eRead(0);
    }
    }

    void
    bToggle(void) {
    int t;
    if (eRead(2)) {
    t = eRead(1);
    eErase(0);
    if (t)
    eProgram(0);
    eErase(2);
    }
    t = eRead(0);
    eErase(1);
    if (t)
    eProgram(1);
    eProgram(2);
    eErase(0);
    if (!t)
    eProgram(0);
    eErase(2);
    }

    I realized on consideration that I don't need to finish this with
    eErase(1), because bit 1 is undefined. Thus, if eErase(2) completes
    at the end of bToggle(), the toggle is complete and bit 0 holds the
    newly toggled bit. If it doesn't complete, 1 still holds the right
    value.

    Note that there exists at least one case (previous value was zero, and
    a disruption occurred) where 0 is erased twice without having been
    programmed.

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
    Seebs, Feb 9, 2010
    #18
  19. Seebs wrote:
    > On 2010-02-09, Francois Grieu <> wrote:
    >> You are the first poster with a proposition that I won't contradict, and
    >> are so far untouched by the "at least one public change of mind" curse!

    >
    > The tricky part, so far as I can tell, is that we can't tell what a
    > partially-written bit might look like. On the other hand, in theory,
    > only one write can be interrupted, as I understand it.


    Only one write can be interrupted. But a series of disruptions can lead
    to multiple partially-written cells. See below.

    >
    > Hmm.
    >
    > Okay, here's an approach. Imagine without loss of generality that our
    > goal is to toggle the bit in cell 0, and that we have additional bits to
    > toggle. The bit is "T".
    >
    > Let's see. Okay, when toggle() is called, we'll start by clearing a working
    > area. So we have some unspecified number of bits which is our working area,
    > call them W[0] ... W[N].
    >
    > The first thing we need to do is be sure that we can recover the previous
    > value in the event of a disruption. So, necessarily, we need a bit to hold
    > the known-previous-value. Let's call that W0. But how do we know whether
    > W0 holds a bit we care about? We need a flag for whether W0 has been written.
    > Let's call that W1. So, read() should do something like:
    > if (W1)
    > return W0
    > else
    > return T
    >
    > What about toggle()? If W1 is set, then it was interrupted during an
    > operation. It should try to extract the value from W0 and push it back
    > into T. Once it has done that, it can clear W1 (because it no longer
    > needs to be able to remember the value in W0), then W0. That reduces us
    > to the default state, which is that T holds the current bit and we wish
    > to toggle it.
    >
    > Given that, we then:
    > * Copy T into W0
    > * Set W1
    > * Erase T
    > * if T was previously 0, program T
    > * erase W1
    > * erase W0
    >
    > The rationale of this ordering is that, at every time:
    > * if W1 is set, W0 holds the correct value of T
    > * if W1 is unset, T is known to be correct
    >
    > If we are interrupted while setting W1, we were interrupted before we modified
    > T, so that's not a problem -- T is still at its correct old value. If we
    > are interrupted while setting W0, we were interrupted before we set W1, so
    > that's not a problem -- T is still at its correct old value. If we are
    > interrupted during any of the erasing or programming of T, that's not a
    > problem -- W0 and W1 are set, so we'll pick up the right value next time
    > we come in. If we are interrupted during the erase of W1, that's fine. If
    > it gets erased, it's clear and T has the right value. If it doesn't get
    > erased, it's still set, and W0 still holds the correct "previous value",
    > so we'll be in one of the trustworthy intermediate states.
    >
    > So I think that's it.
    >
    > So!
    >
    > // Read a designated EEPROM cell; always returns 0 or 1.
    > extern int eRead(int j);
    > // Erase a designated EEPROM cell.
    > // After undisrupted eErase, eRead returns 0 until eProgram is invoked.
    > extern void eErase(int j);
    > // Program a designated *ERASED* EEPROM cell.
    > // After undisrupted eProgram, eRead returns 1 until eErase is invoked.
    > // Programming a cell that is not erased cause undefined behavior.
    > extern void eProgram(int j);
    >
    > /* default state:
    > * bit 0 holds the current Durable Bit
    > * bit 1 is undefined
    > * bit 2 is clear
    > *
    > * during a toggle:
    > * bit 2 is set
    > * bit 1 holds the current Durable Bit
    > * bit 0 is undefined.
    > *
    > */
    >
    > int
    > bRead(void) {
    > if (eRead(2)) {
    > return eRead(1);
    > } else {
    > return eRead(0);
    > }
    > }
    >
    > void
    > bToggle(void) {
    > int t;
    > if (eRead(2)) {
    > t = eRead(1);
    > eErase(0);
    > if (t)
    > eProgram(0); //
    > eErase(2); // [A]
    > }
    > t = eRead(0);
    > eErase(1);
    > if (t)
    > eProgram(1);
    > eProgram(2);
    > eErase(0);
    > if (!t)
    > eProgram(0);
    > eErase(2);
    > }
    >
    > I realized on consideration that I don't need to finish this with
    > eErase(1), because bit 1 is undefined. Thus, if eErase(2) completes
    > at the end of bToggle(), the toggle is complete and bit 0 holds the
    > newly toggled bit. If it doesn't complete, 1 still holds the right
    > value.
    >
    > Note that there exists at least one case (previous value was zero, and
    > a disruption occurred) where 0 is erased twice without having been
    > programmed.


    This one is incorrect. Assume a disruption where I inserted [A]; now
    cell 2 is uncertain. Assume cell 2 is read as 1 during the next bToggle,
    and a disruption occurs at ; now cells 0 and 2 are uncertain, and
    bRead can return 0 (if cells 2 and 0 are read as 0) or 1 (if cell 2 is
    read as 0 and cell 0 is read as 1).

    Hint: there's nothing to prevent bRead from invoking eErase or/and eProgram.

    Francois Grieu
    Francois Grieu, Feb 9, 2010
    #19
  20. Francois Grieu

    Seebs Guest

    On 2010-02-09, Francois Grieu <> wrote:
    >> void
    >> bToggle(void) {
    >> int t;
    >> if (eRead(2)) {
    >> t = eRead(1);
    >> eErase(0);
    >> if (t)
    >> eProgram(0); //
    >> eErase(2); // [A]
    >> }
    >> t = eRead(0);
    >> eErase(1);
    >> if (t)
    >> eProgram(1);
    >> eProgram(2);
    >> eErase(0);
    >> if (!t)
    >> eProgram(0);
    >> eErase(2);
    >> }


    > This one is incorrect. Assume a disruption where I inserted [A]; now
    > cell 2 is uncertain. Assume cell 2 is read as 1 during the next bToggle,
    > and a disruption occurs at ; now cells 0 and 2 are uncertain, and
    > bRead can return 0 (if cells 2 and 0 are read as 0) or 1 (if cell 2 is
    > read as 0 and cell 0 is read as 1).


    Okay, I misunderstood the spec. I thought that an erase necessarily
    either worked or failed to work, and couldn't leave a cell indeterminate.
    Re-reading the spec, I see that isn't correct. That does make this a
    lot trickier.

    > Hint: there's nothing to prevent bRead from invoking eErase or/and eProgram.


    True. I'm not sure how it matters, though; a hypothetical user can just run
    bToggle() over and over without ever hitting bRead(), so I can't usefully
    rely on anything it does, I don't think. I'm assuming this has to work
    in the general case, not just in the specific example case given.

    I suspect this could be solved with another bit, but I'm too sleepy to figure
    out how right now.

    -s
    --
    Copyright 2010, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
    Seebs, Feb 9, 2010
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Replies:
    27
    Views:
    731
    Andrew Thompson
    Aug 7, 2007
  2. Replies:
    0
    Views:
    317
  3. cyber news

    Drug resistant TB on the rise

    cyber news, Aug 11, 2009, in forum: Java
    Replies:
    0
    Views:
    333
    cyber news
    Aug 11, 2009
  4. cyber news

    Drug resistant TB on the rise

    cyber news, Aug 11, 2009, in forum: Java
    Replies:
    0
    Views:
    255
    cyber news
    Aug 11, 2009
  5. Replies:
    1
    Views:
    116
    Tad McClellan
    Oct 18, 2007
Loading...

Share This Page