A bit resistant to disruption

Francois Grieu · Feb 7, 2010

/*
Disclaimer: This is a hard problem, only minimally disguised into
something on-topic for comp.lang.c

You are programming a C99 conforming freestanding implementation.
Disruption (reset or power off/on cycle) can occur at *ANY* time without
advance warning, and stops whatever operation is underway. The program
is re-executed from main() after a disruption.

The objective is to implement a single bit resistant to disruption,
accessed using two procedures to be written:
int bRead(void); // Return 0 or 1 according to bit state.
void bToggle(void); // Change the state of the bit.

These two properties shall be met:
a) The bit is stable in the absence of Toggle. That is, the result given
by any two undisrupted bRead() is the same unless bToggle() was invoked
in-between.
b) The bit is changed by Toggle. That is, the result given by any two
undisrupted bRead() is different if bToggle() was invoked exactly once
in-between and no disruption occurred during that execution of bToggle().

The only way to store information across disruptions is using a number
of independent physical EEPROM cells, designated by index j. Access to
the EEPROM cells is by the following three functions. The properties
stated for theses functions apply for any fixed non-negative integer j
less than the number of cells, and "for that cell" is implied everywhere.

// Read a designated EEPROM cell; always returns 0 or 1.
extern int eRead(int j);

// Erase a designated EEPROM cell.
// After undisrupted eErase, eRead returns 0 until eProgram is invoked.
extern void eErase(int j);

// Program a designated *ERASED* EEPROM cell.
// After undisrupted eProgram, eRead returns 1 until eErase is invoked.
// Programming a cell that is not erased cause undefined behavior.
extern void eProgram(int j);

An EEPROM cell can be left neither erased nor programmed by a disrupted
eErase unless that cell was already erased, or by a disrupted eProgram.
For a cell in such halfway state, eRead returns a value specified only
to be 0 or 1. That otherwise unspecified value may in particular change
after disruption (due e.g. to uncontrollable random noise and/or
temperature and/or voltage drift in the hardware used by eRead), even
though the sate of cells is unchanged by disruption or/and eRead.

Note: if a cell is not erased (left in a halfway state or programmed),
an eProgram of that cell is prohibited. A workaround is to use the
following rather than eProgram():
void mySafeProgram(int j)
{
eErase(j);
eProgram(j);
}

Before the first run, all EEPROM cells have been erased. eRead, eErase,
and eProgram terminate unless a disruption occurs, assuming constraints
on j and eProgram are met. bRead and bToggle must similarly terminate
unless a disruption occurs.

Can it be done? If yes, what is the minimum number of cells needed? Give
a proof.

Depending on the answer, try to minimize the expected failure rate or
maximize the expected LED flash rate in the following use case:
*/

#define NCELLS 10 // number of cells

extern void sLed(int x); // 0: LED on green then off
// 1: LED on red then off.
extern int eRead(int j); // Read a cell.
extern void eErase(int j); // Erase a cell to 0.
extern void eProgram(int j); // Program an erased cell to 1.

int bRead(void); // Return 0 or 1 according to bit state.
void bToggle(void); // Change the state of the bit.

// Declarations and code for bRead/bToggle go here

int main(void)
{
int vState;
vState = bRead();
while(1)
{
sLed(vState);
bToggle();
vState = !vState;
}
return 0;
}

/*
Assume eProgram, eErase, sLed each last 1 second; everything else is
comparatively instant; a disruption occurs with odds p each second;
unspecified eRead values are assigned at random with mean 0.5; the LED
is off following disruption; a failure is a violation of the
requirement: if a disruption occurs with the LED on in some color, then
the first time the program lights a LED it must be with that color.

Notice that disruption without warning is the standard under which most
real-life hardware actually operates (with the exception of systems with
physical protection against adversarial reset and a power reserve with
means to sense imminent power loss), even though this is often left out
of the specification. And that real disruption-surviving media (EEPROM,
Flash, CDRW, though not battery-backed SRAM) do have the limitation that
read after interrupted change of state sometime gives an unstable
result. While "programming a cell that is not erased cause undefined
behavior" is not the rule, it can exist due to a poorly specified API,
an overcautious hardware specification, or in some cases due to tue
hardware limitations.

The problem is derived from actual requirements in a Smart Card. This is
neither homework, nor an attempt to gather brainpower for free: I'm the
author of the problem and have conclusively determined the feasibility,
with a concise proof. A similar problem "A bit in bottles" is in
rec.puzzles, some of the replies contain answers, and I committed mine.
http://groups.google.com/group/rec.puzzles/browse_thread/thread/ab892b3cf88af0de

Francois Grieu
*/

Stefan Ram · Feb 7, 2010

Francois Grieu said:
The objective is to implement a single bit resistant to disruption,
accessed using two procedures to be written:
int bRead(void); // Return 0 or 1 according to bit state.
void bToggle(void); // Change the state of the bit.

This is a complete sentence defining the »objective«, which
actually does not allow for any further requirements to be
listed later. Otherwise it would need to include a clause such as:

»... to implement a single bit according to requirements
given below.«.

For example, if I would write »The objective /is/ to
calculate the sum of 2 and 3.« The structure of this
sentence means the objective is defined once and for all.
I can not add any requirements (»propertier«) to be met later.
Otherwise, I would need to write »One /part/ of the objective
is to calculate the sum of 2 and 3.« or »The objective is to
calculate the sum of 2 and 3 fulfilling the following properties.«
or so.

But the sentence cannot be understood as it is written, because
what it means for a bit to be »resistant to disruption« is unknown.
(At least to me, that is.)

These two properties shall be met:
a) The bit is stable in the absence of Toggle. That is, the result given
by any two undisrupted bRead() is the same unless bToggle() was invoked
in-between.
b) The bit is changed by Toggle. That is, the result given by any two
undisrupted bRead() is different if bToggle() was invoked exactly once
in-between and no disruption occurred during that execution of bToggle().

¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
I thought the hard case was a disruption during the execution of bToggle().
If I can assume that it is not being interrupted, why can't I then
implement bToggle in the most obvious way?

Kaz Kylheku · Feb 7, 2010

The objective is to implement a single bit resistant to disruption,
accessed using two procedures to be written:
int bRead(void); // Return 0 or 1 according to bit state.
void bToggle(void); // Change the state of the bit.

Note that the power can go out at any time while bToggle
is executing, creating an obvious ambiguity: did this outage occur
before or after the bit was toggled?

What if the power goes out while the the expression bToggle() is
evaluating, but before the call has taken place? What if the power
outage occurs just after bToggle has been called, and is returning?

In the event of a system disruption such as a power loss, the best that
we can ensure is that the data is /consistent/. There is no way to
ensure that the data has the absolutely most recent value.

For instance, a database transaction to update some records either
did happen or did not happen; the database is not in some in-between
inconsistent or corrupt state.

A single bit cannot ever be inconsistent!

Francois Grieu · Feb 7, 2010

Stefan Ram a écrit :

This is a complete sentence defining the »objective«, which
actually does not allow for any further requirements to be
listed later. Otherwise it would need to include a clause such as:

»... to implement a single bit according to requirements
given below.«.

These requirements are properties a) b). I stand corrected.

For example, if I would write »The objective /is/ to
calculate the sum of 2 and 3.« The structure of this
sentence means the objective is defined once and for all.
I can not add any requirements (»propertier«) to be met later.
Otherwise, I would need to write »One /part/ of the objective
is to calculate the sum of 2 and 3.« or »The objective is to
calculate the sum of 2 and 3 fulfilling the following properties.«
or so.

But the sentence cannot be understood as it is written, because
what it means for a bit to be »resistant to disruption« is unknown.
(At least to me, that is.)

¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
I thought the hard case was a disruption during the execution of bToggle().

No. If there is a disruption during bToggle, what the next undisrupted
bRead() returns is left unspecified.

If I can assume that it is not being interrupted, why can't I then
implement bToggle in the most obvious way?

Implement bToggle as you wish. The difficult requirement is a), given
that eRead may return unstable data.

Francois Grieu

Stefan Ram · Feb 7, 2010

Given:

extern int eRead(int j);
extern void eErase(int j);
extern void eProgram(int j);
void mySafeProgram(int j)

Wanted:

the result given by any two undisrupted bRead() is the same
unless bToggle() was invoked in-between.

So, what about the following implementation?

int bRead( int const j )
{ int const value = eRead( j );
if( value )mySafeProgram( j ); else eErase( j );
return value; }

Andrew Poelstra · Feb 7, 2010

The objective is to implement a single bit resistant to disruption,

As has been said, a single bit cannot be "inconsistent" in the
sense that a database or complex structure might be. If you want
it to hold across power failures, this is likely a hardware issue.

bit ^= 1;

is likely to be a single instruction on your hardware so if you
#define Toggle() as such, there's nothing more that can be done
via software.

Francois Grieu · Feb 7, 2010

Kaz said:
Note that the power can go out at any time while bToggle
is executing, creating an obvious ambiguity: did this outage occur
before or after the bit was toggled?

This is why after bToggle() is disrupted, the result of the next
undisrupted bRead() is left unspecified.

What if the power goes out while the the expression bToggle() is
evaluating, but before the call has taken place? What if the power
outage occurs just after bToggle has been called, and is returning?

Count that as disrupted bToggle().

If you want a "falsifiable" test, consider the use case given. Any
correct bRead/bToggle pass this test. Any bRead/bToggle that pass this
test can be turned into a correct bRead/bToggle2 by having bToggle2
invoke bRead and discard the result then invoke bToggle.

In the event of a system disruption such as a power loss, the best that
we can ensure is that the data is /consistent/. There is no way to
ensure that the data has the absolutely most recent value.
Agreed.

For instance, a database transaction to update some records either
did happen or did not happen; the database is not in some in-between
inconsistent or corrupt state.
Agreed.

A single bit cannot ever be inconsistent!

With actual physical EEPROM or Flash, after a change of a cell has been
disrupted, two readings without a change are not guaranteed to return
the same value. This is because a cell is read by comparing the charge
in the cell to a reference value, with some uncertainty, and that charge
only changes slowly (in the order of a millisecond, and a power loss, at
least a deliberate one as in short circuit, occurs faster than that).
Note this is one area where SRAM + battery backup is better (positive
feedback between the two sides of a SRAM cell makes it inherently
stable, or at least makes metastability plain unobservable). All this
stuff is very often overlooked, and gives rise to all kind of subtle and
hard to track issues, especially when temperature changes and when
disruption is frequent or/and adversarial (which must be assumed in
Smart Cards, especially when remotely powered).

One usual solution is that after reading a bit as erased, you erase it
again; and after reading it as programmed, you program it again
(positive feedback with software); so that next time you read, you'll
get a stable value. A relatively easy (I would not say trivial) tweak
using a second bit avoids premature wear out after things have
stabilized. You then have a "bit" in the usual sense and can use it to
perform e.g. atomic database transaction using classical algorithms.

But in the present problem things are made complicated by prohibition of
over-program of cells.

Francois Grieu

Francois Grieu · Feb 7, 2010

Stefan Ram a écrit :

Given:

So, what about the following implementation?

int bRead( int const j )
{ int const value = eRead( j );
if( value )mySafeProgram( j ); else eErase( j );
return value; }

Assume bRead() just has returned 1.
A disruption occurs.
bRead() is called again.
A disruption occurs during that bRead(), while it is doing
mySafeProgram(), sometime during eErase() or eProgram(); say, in between
to simplify.
bRead() is called again without disruption, and returns 0.

Property a) is violated, regardless of how you implement bToggle, which
you did not specify BTW.

Francois Grieu

Francois Grieu · Feb 7, 2010

Andrew Poelstra wrote :

As has been said, a single bit cannot be "inconsistent" in the
sense that a database or complex structure might be.

A bit, agreed. The state of a cell, disagreed. See my answer in the
sub-thread that you mention.

If you want a useful model of an EEPROM cell and don't want to dive into
how hardware works, think of a bottle that can only be emptied or filled
slowly, and is read by grossly weighting it on a scale giving only
light/heavy reading. This is how classic EEPROM and Flash work
(multilevel Flash store e.g. 2 bits per cell with a scale having 4
readings instead of 2, we don't consider that).

(snip)
Francois Grieu

Andrew Poelstra · Feb 7, 2010

Andrew Poelstra wrote :

A bit, agreed. The state of a cell, disagreed. See my answer in the
sub-thread that you mention.

If you want a useful model of an EEPROM cell and don't want to dive into
how hardware works, think of a bottle that can only be emptied or filled
slowly, and is read by grossly weighting it on a scale giving only
light/heavy reading. This is how classic EEPROM and Flash work
(multilevel Flash store e.g. 2 bits per cell with a scale having 4
readings instead of 2, we don't consider that).

(snip)
Francois Grieu

I know how Flash and EEPROM work, on a basic level - which is why
I think you need a hardware solution. You can buffer and checksum
to validate the data as best you can, but in the end if you lose
power in the middle of a flash write you may very well be screwed.

Stefan Ram · Feb 7, 2010

Francois Grieu said:
A disruption occurs during that bRead(), while it is doing

I do not see how to implement a solution, but by storing
an error detecting code into more than one bit for each
of the values for ON or OFF one could at least /detect/ an
indefinite bit with a certain probability.

Francois Grieu · Feb 7, 2010

Andrew said:
I know how Flash and EEPROM work, on a basic level - which is why
I think you need a hardware solution. You can buffer and checksum
to validate the data as best you can, but in the end if you lose
power in the middle of a flash write you may very well be screwed.

I count that as one "I think the answer is: impossible". All the persons
that I know have solved the problem (that's three including me) seem to
have changed their mind at least once.

Francois Grieu

Francois Grieu · Feb 7, 2010

Stefan said:
I do not see how to implement a solution, but by storing
an error detecting code into more than one bit for each
of the values for ON or OFF one could at least /detect/ an
indefinite bit with a certain probability.

Now we are moving into the right direction!

Hints: try to prove impossibility. Any hole in your proof may lead to a
solution. That, or prove a procedure, any hole...

Francois Grieu

Moi · Feb 7, 2010

/*
Disclaimer: This is a hard problem, only minimally disguised into
something on-topic for comp.lang.c

You are programming a C99 conforming freestanding implementation.
Disruption (reset or power off/on cycle) can occur at *ANY* time without
advance warning, and stops whatever operation is underway. The program
is re-executed from main() after a disruption.

The objective is to implement a single bit resistant to disruption,
accessed using two procedures to be written:
int bRead(void); // Return 0 or 1 according to bit state. void
bToggle(void); // Change the state of the bit.

These two properties shall be met:
a) The bit is stable in the absence of Toggle. That is, the result given
by any two undisrupted bRead() is the same unless bToggle() was invoked
in-between.
b) The bit is changed by Toggle. That is, the result given by any two
undisrupted bRead() is different if bToggle() was invoked exactly once
in-between and no disruption occurred during that execution of
bToggle().

The only way to store information across disruptions is using a number
of independent physical EEPROM cells, designated by index j. Access to
the EEPROM cells is by the following three functions. The properties
stated for theses functions apply for any fixed non-negative integer j
less than the number of cells, and "for that cell" is implied
everywhere.

// Read a designated EEPROM cell; always returns 0 or 1. extern int
eRead(int j);

// Erase a designated EEPROM cell.
// After undisrupted eErase, eRead returns 0 until eProgram is invoked.
extern void eErase(int j);

// Program a designated *ERASED* EEPROM cell. // After undisrupted
eProgram, eRead returns 1 until eErase is invoked. // Programming a cell
that is not erased cause undefined behavior. extern void eProgram(int
j);

An EEPROM cell can be left neither erased nor programmed by a disrupted
eErase unless that cell was already erased, or by a disrupted eProgram.
For a cell in such halfway state, eRead returns a value specified only
to be 0 or 1. That otherwise unspecified value may in particular change
after disruption (due e.g. to uncontrollable random noise and/or
temperature and/or voltage drift in the hardware used by eRead), even
though the sate of cells is unchanged by disruption or/and eRead.

Note: if a cell is not erased (left in a halfway state or programmed),
an eProgram of that cell is prohibited. A workaround is to use the
following rather than eProgram():
void mySafeProgram(int j)
{
eErase(j);
eProgram(j);
}

Before the first run, all EEPROM cells have been erased. eRead, eErase,
and eProgram terminate unless a disruption occurs, assuming constraints
on j and eProgram are met. bRead and bToggle must similarly terminate
unless a disruption occurs.

Can it be done? If yes, what is the minimum number of cells needed? Give
a proof.

First shot: two hits in my cerebral hash-table
1) alternating bit-protocol (difficult with only one bit at a time ...)
2) Leslie Lamport's reading <--> writing in opposite directions.

NB, I like the problem (though it is not strictly c.l.c stuff ;-)

AvK

Francois Grieu · Feb 8, 2010

Moi wrote :

First shot: two hits in my cerebral hash-table
1) alternating bit-protocol (difficult with only one bit at a time ...)

Nice. However in the present problem there are no errors, only those
pesky disruptions.

2) Leslie Lamport's reading <--> writing in opposite directions.

When I wrote "hard problem", I did not mean quite that hard...

NB, I like the problem (though it is not strictly c.l.c stuff ;-)

At least "freestanding implementation" is on topic and exactly describes
the context: digital computer, no error, single process, all variables
lost on disruption, no standard library for persistent storage. And
c.l.c may be one of the best place to reach programmers facing that
issue (minus perhaps the prohibition of over-program, which is what
makes the problem challenging).

Francois Grieu

Moi · Feb 8, 2010

Moi wrote :

Nice. However in the present problem there are no errors, only those
pesky disruptions.

When I wrote "hard problem", I did not mean quite that hard...

I think that your failed eeprom_reset_bit() and eeprom_set_bit()
operations are more or less equivalent to Lamport's "sending messages to another process".
Messages can be lost in transit, your eeprom-operations can fail
because of a crash.
The only difference is that in the eeprom-case there is a sort of tri-state
logic; in the case of lost messages the domain stays restricted
to just two cases.

If I understood the problem correct:
1) the machine can be expected to crash *any moment*, so even while "in recovery mode"
2) an undefined cell (either rewritten or partially cleared) produces a random
value; subsequent reads may yield different values.

My current approach uses a state machine with a kind of graycodes
(maybe necklaces: http://www.theory.csc.uvic.ca/~cos/inf/neck/NecklaceInfo.html ; I am not
sure yet how this should be named)
The guiding idea is that in any state there is a maximal number of
bits that could be undefined (tri-state): the bits that could have been affected
by the previous or next state in the graph.
While recovering this number is two, in "normal operation" it may perhaps be just one.

My guess is I need three or four bits / bottles.
.... still working on it ...

AvK

Francois Grieu · Feb 9, 2010

Moi wrote :

I think that your failed eeprom_reset_bit() and eeprom_set_bit()
operations are more or less equivalent to Lamport's "sending messages to another process".
Messages can be lost in transit, your eeprom-operations can fail
because of a crash.
The only difference is that in the eeprom-case there is a sort of tri-state
logic; in the case of lost messages the domain stays restricted
to just two cases.

Yes I see your point. One redeeming difference is that things are
strictly sequential / singe-process in my case.

If I understood the problem correct:
1) the machine can be expected to crash *any moment*, so even while "in recovery mode"

Yes. Notice this is very real-world.

2) an undefined cell (either rewritten or partially cleared) produces a random
value; subsequent reads may yield different values.

Yes, with one exception: if a cell/bottle was erased/empty, and you
erase/empty it again while a disruption occurs, it remains erased/empty.
The makers of EEPROM will reluctantly accept to give this insurance,
which is self-evident with bottles. There is no equivalent in the other
direction, for you can't program/fill a cell/bottle unless it is fully
erased/empty [or so insisted the makers of my EEPROM].

My current approach uses a state machine with a kind of graycodes
(maybe necklaces:http://www.theory.csc.uvic.ca/~cos/inf/neck/NecklaceInfo.html ; I am not
sure yet how this should be named)
The guiding idea is that in any state there is a maximal number of
bits that could be undefined (tri-state): the bits that could have been affected
by the previous or next state in the graph.
While recovering this number is two, in "normal operation" it may perhaps be just one.

My guess is I need three or four bits / bottles.
... still working on it ...

You are the first poster with a proposition that I won't contradict, and
are so far untouched by the "at least one public change of mind" curse!

AvK

Francois Grieu

Seebs · Feb 9, 2010

You are the first poster with a proposition that I won't contradict, and
are so far untouched by the "at least one public change of mind" curse!

The tricky part, so far as I can tell, is that we can't tell what a
partially-written bit might look like. On the other hand, in theory,
only one write can be interrupted, as I understand it.

Hmm.

Okay, here's an approach. Imagine without loss of generality that our
goal is to toggle the bit in cell 0, and that we have additional bits to
toggle. The bit is "T".

Let's see. Okay, when toggle() is called, we'll start by clearing a working
area. So we have some unspecified number of bits which is our working area,
call them W[0] ... W[N].

The first thing we need to do is be sure that we can recover the previous
value in the event of a disruption. So, necessarily, we need a bit to hold
the known-previous-value. Let's call that W0. But how do we know whether
W0 holds a bit we care about? We need a flag for whether W0 has been written.
Let's call that W1. So, read() should do something like:
if (W1)
return W0
else
return T

What about toggle()? If W1 is set, then it was interrupted during an
operation. It should try to extract the value from W0 and push it back
into T. Once it has done that, it can clear W1 (because it no longer
needs to be able to remember the value in W0), then W0. That reduces us
to the default state, which is that T holds the current bit and we wish
to toggle it.

Given that, we then:
* Copy T into W0
* Set W1
* Erase T
* if T was previously 0, program T
* erase W1
* erase W0

The rationale of this ordering is that, at every time:
* if W1 is set, W0 holds the correct value of T
* if W1 is unset, T is known to be correct

If we are interrupted while setting W1, we were interrupted before we modified
T, so that's not a problem -- T is still at its correct old value. If we
are interrupted while setting W0, we were interrupted before we set W1, so
that's not a problem -- T is still at its correct old value. If we are
interrupted during any of the erasing or programming of T, that's not a
problem -- W0 and W1 are set, so we'll pick up the right value next time
we come in. If we are interrupted during the erase of W1, that's fine. If
it gets erased, it's clear and T has the right value. If it doesn't get
erased, it's still set, and W0 still holds the correct "previous value",
so we'll be in one of the trustworthy intermediate states.

So I think that's it.

So!

// Read a designated EEPROM cell; always returns 0 or 1.
extern int eRead(int j);
// Erase a designated EEPROM cell.
// After undisrupted eErase, eRead returns 0 until eProgram is invoked.
extern void eErase(int j);
// Program a designated *ERASED* EEPROM cell.
// After undisrupted eProgram, eRead returns 1 until eErase is invoked.
// Programming a cell that is not erased cause undefined behavior.
extern void eProgram(int j);

/* default state:
* bit 0 holds the current Durable Bit
* bit 1 is undefined
* bit 2 is clear
*
* during a toggle:
* bit 2 is set
* bit 1 holds the current Durable Bit
* bit 0 is undefined.
*
*/

int
bRead(void) {
if (eRead(2)) {
return eRead(1);
} else {
return eRead(0);
}
}

void
bToggle(void) {
int t;
if (eRead(2)) {
t = eRead(1);
eErase(0);
if (t)
eProgram(0);
eErase(2);
}
t = eRead(0);
eErase(1);
if (t)
eProgram(1);
eProgram(2);
eErase(0);
if (!t)
eProgram(0);
eErase(2);
}

I realized on consideration that I don't need to finish this with
eErase(1), because bit 1 is undefined. Thus, if eErase(2) completes
at the end of bToggle(), the toggle is complete and bit 0 holds the
newly toggled bit. If it doesn't complete, 1 still holds the right
value.

Note that there exists at least one case (previous value was zero, and
a disruption occurred) where 0 is erased twice without having been
programmed.

-s

Francois Grieu · Feb 9, 2010

Seebs said:
The tricky part, so far as I can tell, is that we can't tell what a
partially-written bit might look like. On the other hand, in theory,
only one write can be interrupted, as I understand it.

Only one write can be interrupted. But a series of disruptions can lead
to multiple partially-written cells. See below.

Hmm.

Okay, here's an approach. Imagine without loss of generality that our
goal is to toggle the bit in cell 0, and that we have additional bits to
toggle. The bit is "T".

Let's see. Okay, when toggle() is called, we'll start by clearing a working
area. So we have some unspecified number of bits which is our working area,
call them W[0] ... W[N].

The first thing we need to do is be sure that we can recover the previous
value in the event of a disruption. So, necessarily, we need a bit to hold
the known-previous-value. Let's call that W0. But how do we know whether
W0 holds a bit we care about? We need a flag for whether W0 has been written.
Let's call that W1. So, read() should do something like:
if (W1)
return W0
else
return T

What about toggle()? If W1 is set, then it was interrupted during an
operation. It should try to extract the value from W0 and push it back
into T. Once it has done that, it can clear W1 (because it no longer
needs to be able to remember the value in W0), then W0. That reduces us
to the default state, which is that T holds the current bit and we wish
to toggle it.

Given that, we then:
* Copy T into W0
* Set W1
* Erase T
* if T was previously 0, program T
* erase W1
* erase W0

The rationale of this ordering is that, at every time:
* if W1 is set, W0 holds the correct value of T
* if W1 is unset, T is known to be correct

If we are interrupted while setting W1, we were interrupted before we modified
T, so that's not a problem -- T is still at its correct old value. If we
are interrupted while setting W0, we were interrupted before we set W1, so
that's not a problem -- T is still at its correct old value. If we are
interrupted during any of the erasing or programming of T, that's not a
problem -- W0 and W1 are set, so we'll pick up the right value next time
we come in. If we are interrupted during the erase of W1, that's fine. If
it gets erased, it's clear and T has the right value. If it doesn't get
erased, it's still set, and W0 still holds the correct "previous value",
so we'll be in one of the trustworthy intermediate states.

So I think that's it.

So!

// Read a designated EEPROM cell; always returns 0 or 1.
extern int eRead(int j);
// Erase a designated EEPROM cell.
// After undisrupted eErase, eRead returns 0 until eProgram is invoked.
extern void eErase(int j);
// Program a designated *ERASED* EEPROM cell.
// After undisrupted eProgram, eRead returns 1 until eErase is invoked.
// Programming a cell that is not erased cause undefined behavior.
extern void eProgram(int j);

/* default state:
* bit 0 holds the current Durable Bit
* bit 1 is undefined
* bit 2 is clear
*
* during a toggle:
* bit 2 is set
* bit 1 holds the current Durable Bit
* bit 0 is undefined.
*
*/

int
bRead(void) {
if (eRead(2)) {
return eRead(1);
} else {
return eRead(0);
}
}

void
bToggle(void) {
int t;
if (eRead(2)) {
t = eRead(1);
eErase(0);
if (t)
eProgram(0); //
eErase(2); // [A]
}
t = eRead(0);
eErase(1);
if (t)
eProgram(1);
eProgram(2);
eErase(0);
if (!t)
eProgram(0);
eErase(2);
}

I realized on consideration that I don't need to finish this with
eErase(1), because bit 1 is undefined. Thus, if eErase(2) completes
at the end of bToggle(), the toggle is complete and bit 0 holds the
newly toggled bit. If it doesn't complete, 1 still holds the right
value.

Note that there exists at least one case (previous value was zero, and
a disruption occurred) where 0 is erased twice without having been
programmed.

This one is incorrect. Assume a disruption where I inserted [A]; now
cell 2 is uncertain. Assume cell 2 is read as 1 during the next bToggle,
and a disruption occurs at ; now cells 0 and 2 are uncertain, and
bRead can return 0 (if cells 2 and 0 are read as 0) or 1 (if cell 2 is
read as 0 and cell 0 is read as 1).

Hint: there's nothing to prevent bRead from invoking eErase or/and eProgram.

Francois Grieu

Seebs · Feb 9, 2010

void
bToggle(void) {
int t;
if (eRead(2)) {
t = eRead(1);
eErase(0);
if (t)
eProgram(0); //
eErase(2); // [A]
}
t = eRead(0);
eErase(1);
if (t)
eProgram(1);
eProgram(2);
eErase(0);
if (!t)
eProgram(0);
eErase(2);
}

Click to expand...

This one is incorrect. Assume a disruption where I inserted [A]; now
cell 2 is uncertain. Assume cell 2 is read as 1 during the next bToggle,
and a disruption occurs at ; now cells 0 and 2 are uncertain, and
bRead can return 0 (if cells 2 and 0 are read as 0) or 1 (if cell 2 is
read as 0 and cell 0 is read as 1).

Click to expand...

Okay, I misunderstood the spec. I thought that an erase necessarily
either worked or failed to work, and couldn't leave a cell indeterminate.
Re-reading the spec, I see that isn't correct. That does make this a
lot trickier.

Hint: there's nothing to prevent bRead from invoking eErase or/and eProgram.

Click to expand...

True. I'm not sure how it matters, though; a hypothetical user can just run
bToggle() over and over without ever hitting bRead(), so I can't usefully
rely on anything it does, I don't think. I'm assuming this has to work
in the general case, not just in the specific example case given.

I suspect this could be solved with another bit, but I'm too sleepy to figure
out how right now.

-s

Why does 1 represent a negative sign bit?	0	Aug 10, 2022
Comparison of Integer and Pointer (that's supposed to be an Integer). Where did I go wrong?	0	Nov 19, 2022
WIN32 - Update Text in a Window in order to show its size in Pixels and coordinates	0	Oct 4, 2023
bit operations	6	Nov 16, 2009
64 bit design	2	Oct 15, 2010
How to accept text and put each letter into a 2d matrix?	0	Jun 3, 2022
Filter sober in c++ don't pass test	0	Dec 2, 2023
How to try a range of hex values in C# code ?	0	Nov 19, 2022

A bit resistant to disruption

Francois Grieu

Stefan Ram

Kaz Kylheku

Francois Grieu

Stefan Ram

Andrew Poelstra

Francois Grieu

Francois Grieu

Francois Grieu

Andrew Poelstra

Stefan Ram

Francois Grieu

Francois Grieu

Moi

Francois Grieu

Moi

Francois Grieu

Seebs

Francois Grieu

Seebs

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads