L
Lee Fesperman
Earlier, I intimated that PostgreSQL was brain-dead because it supported ((foo = NULL)
IS NULL). I still think (foo = NULL) is useless (because it always produces
NULL/UNKNOWN) and redundant (because the appropriate construct is (foo IS NULL)).
However, due to your cites of PostgreSQL docs and my own research (see below), I will
admit there is some 'logic' to their support of ((foo = bar) IS NULL).
There are some better quotes on MySQL but no matter. By chance, I downloaded MySQL
4.1.10 a few weeks ago for an entirely unrelated reason (before this sub-thread and my
current 'battle' with MySQL denizens on comp.databases). I can verify that MySQL
supports ((foo = NULL) IS NULL) with the same semantics as PostgreSQL and doesn't
support IS UNKNOWN, though at one point, it refers to the truth value as "NULL
(UNKNOWN)".
I'd be interested. I have "Oracle Unleashed", but it doesn't touch on the subject as far
as I can tell.
I'm a little afraid that others may not be interested in this discussion, in fact bored
by it! But, who cares, they can just skip the whole thing.
Very widely understood, as regards to the two major open-source DBMSs. I'm pretty sure
the lesser ones which are more SQL Standard compliant have a different view. Who knows
about the major commercial ones though I'd bet that DB2 follows the standard in this
area since the original standard was based on DB2. Our product does follow the standard,
though we've fixed the EXISTS problem.
Certainly I can assert that the standard and RM itself (according to Codd ... Date would
demur since he now is opposed to 3VL and to NULLs in general) do distinguish between
regular expressions (arithmetic, string, ...) and logical expressions (predicates).
Regular expressions can be IS NULL or IS NOT NULL. Logical expressions can be IS TRUE,
IS FALSE or IS UNKNOWN.
I was totally unaware of this dichotomy of opinions, thus my somewhat brash statements
earlier in the thread. However, this is purely a syntax issue; the semantics are the
same (I think).
On that issue, I'll ask you to test the semantics of the logical operators AND and OR
for PostgreSQL. Here's some tests:
+ ((1 = 2) AND (1 = NULL))
+ ((1 = 1) OR (1 = NULL))
Under 3VL, the first should be FALSE not NULL/UNKNOWN and the second should be TRUE. You
might reverse the operands as an additional check since the operators are supposed to be
commutative. I used literals, but you might wish to try column values just to make sure.
Also, since NULL/UNKNOWN are treated as FALSE in a WHERE clause, you might add IS NULL
to the first to really verify. Note: I'd be pretty sure NOT has the right semantics ...
(NOT (1 = NULL)) is NULL/UKNOWN.
The MySQL documentation states that it provides the above semantics, though it didn't in
earlier versions.
Finally, you might be interested in the several white papers about NULLs and 3VL on
www.firstsql.com. I won't list them to save bandwidth. I've had many debates with Date
and his cohorts (Fabian Pascal, David McGoveran, Hugh Darwen) on the subject. They're
completely opposed to NULLs and 3VL; I consider them an essential part of RM (like Codd
did). Even so, I'm still on reasonable terms with them. Some of the papers came out of
my debates with them. You can see their views on www.dbdebunk.com.
IS NULL). I still think (foo = NULL) is useless (because it always produces
NULL/UNKNOWN) and redundant (because the appropriate construct is (foo IS NULL)).
However, due to your cites of PostgreSQL docs and my own research (see below), I will
admit there is some 'logic' to their support of ((foo = bar) IS NULL).
Whereas both PostgreSQL and MySQL (*shudder*) evaluate predicates to TRUE,
FALSE, or NULL and have no truth value referred to as UNKNOWN. If we
suppose that NULL IS UNKNOWN is true and that UNKNOWN IS NULL is true
(this seems to be the sticking point) then the SQL definition and both of
those DBMS servers are compatible.
Interesting quotes:
PostgreSQL:
"SQL uses a three-valued Boolean logic where the null value represents
"unknown"."
<http://www.postgresql.org/docs/8.0/interactive/functions.html>
MySQL:
"The NULL value means ``no data.''"
<http://dev.mysql.com/doc/mysql/en/null-values.html>
There are some better quotes on MySQL but no matter. By chance, I downloaded MySQL
4.1.10 a few weeks ago for an entirely unrelated reason (before this sub-thread and my
current 'battle' with MySQL denizens on comp.databases). I can verify that MySQL
supports ((foo = NULL) IS NULL) with the same semantics as PostgreSQL and doesn't
support IS UNKNOWN, though at one point, it refers to the truth value as "NULL
(UNKNOWN)".
I'll have a look at Oracle and possibly MSSQL at some point. This is
turning out to be more interesting than expected.
I'd be interested. I have "Oracle Unleashed", but it doesn't touch on the subject as far
as I can tell.
I'm a little afraid that others may not be interested in this discussion, in fact bored
by it! But, who cares, they can just skip the whole thing.
Personally, I find PostgreSQL's definition of NULL far more useful,
conceptually; a NULL in a tuple simply means that the corresponding piece
of information is missing or unknown (the value is not NULL, the value is
*not there*), and a NULL result from a comparison means (a) that at least
one value was NULL and (b) that the result of the comparison is likewise
unknown. Perhaps UNKNOWN is a better label for the concept, but NULL is
succinct and very widely understood at this point.
Very widely understood, as regards to the two major open-source DBMSs. I'm pretty sure
the lesser ones which are more SQL Standard compliant have a different view. Who knows
about the major commercial ones though I'd bet that DB2 follows the standard in this
area since the original standard was based on DB2. Our product does follow the standard,
though we've fixed the EXISTS problem.
Certainly I can assert that the standard and RM itself (according to Codd ... Date would
demur since he now is opposed to 3VL and to NULLs in general) do distinguish between
regular expressions (arithmetic, string, ...) and logical expressions (predicates).
Regular expressions can be IS NULL or IS NOT NULL. Logical expressions can be IS TRUE,
IS FALSE or IS UNKNOWN.
I was totally unaware of this dichotomy of opinions, thus my somewhat brash statements
earlier in the thread. However, this is purely a syntax issue; the semantics are the
same (I think).
On that issue, I'll ask you to test the semantics of the logical operators AND and OR
for PostgreSQL. Here's some tests:
+ ((1 = 2) AND (1 = NULL))
+ ((1 = 1) OR (1 = NULL))
Under 3VL, the first should be FALSE not NULL/UNKNOWN and the second should be TRUE. You
might reverse the operands as an additional check since the operators are supposed to be
commutative. I used literals, but you might wish to try column values just to make sure.
Also, since NULL/UNKNOWN are treated as FALSE in a WHERE clause, you might add IS NULL
to the first to really verify. Note: I'd be pretty sure NOT has the right semantics ...
(NOT (1 = NULL)) is NULL/UKNOWN.
The MySQL documentation states that it provides the above semantics, though it didn't in
earlier versions.
Finally, you might be interested in the several white papers about NULLs and 3VL on
www.firstsql.com. I won't list them to save bandwidth. I've had many debates with Date
and his cohorts (Fabian Pascal, David McGoveran, Hugh Darwen) on the subject. They're
completely opposed to NULLs and 3VL; I consider them an essential part of RM (like Codd
did). Even so, I'm still on reasonable terms with them. Some of the papers came out of
my debates with them. You can see their views on www.dbdebunk.com.