A
A Serious Moment
AN ESSAY ABOUT RESEARCH (fl) ON SPARSE
NP COMPLETE SETS
By
M. Musatov
The purpose of this paper is to review the origins and motivation for
the conjecture sparse NP complete sets do not exist (unless ? NP) and
to describe the development of the ideas and techniques leading to the
recent solution of this conjecture.
The research in theoretical computer science and computational
complexity theory has been strongly influenced by the study of such
feasibly computable families of languages as P, NP and PTAPE. This
research has revealed deep and unsuspected connections between
different classes of problems and it has provided completely new means
for classifying the computational complexity of problems.
Furthermore, the work has raised a set of interesting new research
problems and crested an unprecedented consensus about what problems
have to be solved before real understanding of the complexity of
computations can be achieved.
In the research on feasible computations the central role has been
played by the families of deterministic and nondeterministic
polynonial time computable languages. P and NP(1) respectively [MIU.
c, cj, K]. In particular(.) the NP complete(?te) languages have been
studied intensively and virtually hundreds of natural NP complete
(rnplete) problems have been found in many different areas of
applications (AHu. cJ).
Though we do not yet know whether P (?) NP(.) we accept today a proof
a problem is NP complete as convincing evidencethe problem may be
polynonial time computable (and feasibly computable); a proof a
problem is complete for PTAPE is viewed as even stronger evidence the
problem is feasibly computable (even though there is no proof that P
(?) NP (?) PTAPE).
As part ot the general study of similarities among NP complete
problems it is conjectured by Berman and Hartmanis(1) for reasons
given in the next section(1) all NP complete problems are (?re)
isomorphic (norphic) under polynomial time mappings and therefore
there may exist (sparse) NP complete sets with considerably fewer
elements than the known classic complete problems (e.g. SAT. CLIQUE(1)
etc. EBH)).
When the conjecture is first formulated in 1975(.) the understanding
of NP complete(-plete) problems was more limited and several energetic
frontal assaults on this problem(-lem) failed (?ailed). As a matter of
fact, the problem looked quite hopeless after a considerable (-erable)
initial effort to solve it. Fortunately(1) during the next five years
a number of different people in Europe and America contributed a set
of ideas and techniques recently leading to an elegant solution of the
problem by (5)(.) Mahaney of Cornell University (M).
The purpose of this paper is to describe the origins of the sparseness
conjecture (-ture) about NP complete sets(1) to relate the information
flow about this problem and to describe the development of the crucial
ideas finally leading to the proof sparse NP complete sets exist(1)
then P = NP (M).
We believe this is an interesting and easily understandable
development in the study of NP complete problems and there are some
lessons to be learned about computer science research from the way
this tantalizing problem was solved.
Furthermore(1) it is hoped these results provide a new impetus for
work on the main conjecture all NP complete sets are p-isomorphic.
(?.) Preliminaries and the Sparseness (narseness)
Let P and NP denote(.) respectively(.) the families of languages
accepted by deterministic (-?inistic) and nondeterministic Turing
machines in polynomial time.
A language C is said to be (?) complete (niete) if C is in NP and if
for any other language 3 in NP there exists a polynomial time
computable function f such x ( 3f(x) c C.
The importance of the family of languages P stems from the fact they
provide (-vide) a reasonable model for the feasibly computable
(nutable) problems. The family NP contains (-tains) many important
practical problems and a large number of problems from different (-
ferent) areas of applications in computer science and mathematics have
been shown to be complete for NP (AHU. C. BJ1 K). Since today it is
widely conjectured P (?) NP(.) the NP complete problems are believed
(by a minority) may be solvable in polynomial time.
Currently one of the most fascinating problems (toblems) in
theoretical computer science is to understand better the structure of
feasibly computable problems and(.) in particular(.) to resolve the p
NP question. For an extensive study of P and NP see EARu. GJ].
A close study of the classic NP complete sets(.) such as SAT(.) the
satisfiable Boolean formulas in conjunctive normal form(.) RAM(.)
graphs with Hamiltonian circuits(.0) or CLIQUE(.) graphs with cliques
of specified size(.) revealed they are very similar (-lar) in a strong
technical sense (BH). Not only may they be reduced to each other(.)
they are actually isomorpbic under polynomial time mappings (nappings)
as defined below:
* *
Two languages A and 3. A ? ? and 3 r . are ?-iso?or?hic:isomrophic
iff:if an only if there exists a bijection f:?*???:f:question times
three question (i.e. a one-to-one and onto mapping) such
1. f and f?1:f question one are polynomial time computable.
--H1
2. f is a reduction of A to 3 and f is a reduction of 3 to A.
Further study reveals all the "known" NP complete sets are p-
isomorphic and one may formulate (after a number of technical lemmas)
a very simple (nple) condition (-dition) for NP complete sets to be p-
isonorphic to SAT in terms of two padding functions (-tions) (BH).
Theorem L: An NP complete set B is p-isomorphic to SAT iff:if and only
if there exists two polynomial (-?ial) time computable functions D and
5 such
1. (Vx?y) (D(x.y) c B ?xcB)
2. (Vx*y) (SoD(x,y) :
All the known NP complete sets have these padding functions and in
most cases they are easy to find. A good example is SAT(.) for which y
can easily be encoded in any given formula in terms of new variables
which do (may) change the satisfiability of the formula [Bli].
From these studies grew the conviction all NP complete sets are p--H
isomorphic and this conjecture is explicitly stated in [EBH].
Clearly(.) if all NP complete sets are p-isomorphic then they all must
be infinite (-ite) and therefore P X NP. Thus it is realized this
conjecture may be very hard to prove(.) but the possibility is left
open it may be easier to disprove it. One way of disproving the p-
isomorphism conjecture suggests the fact p-isomorpflic sets have quite
similar densities. To make a precise we define sparseness below:
*
A set B. B c ? is said to be sparse if there exists a polynomial p(n)
such
I Bn(c+?) I `
Thus p(n) bounds the number of elements in B up to size n.
It is easily seen that SAT and other known NP complete sets are (may
be) sparse (any set possessing the padding functions D and 5 is may be
sparse) and a sparse set may be p-isomorphic to SAT. These
considerations lead to the conjecture (Bil) there may exist sparse NP
complete sets (unless p : NP). In particular(.) it is (*) conjectured
one set over a single letter alphabet say B?a may be NP complete.
It is interesting to note the p-isomorphism conjecture quickly leads
to the sparse set conjecture and then to the innocuous looking
conjecture a language on a single letter alphabet may be NP complete.
We return to this last conjecture in the next section(.) it is the
first to be solved.
A more indirect motivation for the p-isomorphism conjecture comes from
the sugggested (-gested) analogy between recursive and recursively
enumerable languages and P and NP as their feasibly computable
counterparts. This analogy becomes particularly intrigueing (-gueing)
and suggestive when it is extended to the Kleene Hierarchy and the
polynomial time hierarchy (5). The NP complete sets correspond in this
analogy to the r.e. complete sets(1) known to be the same as the
creative sets and they are all recursively isomorphic. This suggests
by analogy the NP complete (?omplete) sets should be (p--H)
isomorphic(.) as conjectured in (Bli).
Lastly(.) a sparse NP complete set implies the necessary information
to solve NP problems may be condensed in a sparse set (?et). In other
words(1) the sparse set may be computed and then used as a
polynomially long oracle tape to solve other NP complete problems. At
the time of stating (Ating) the sparseness conjecture this looked very
unlikely, and now we know it is not impossible when p NP. For related
results discussing the consequences of the existence of polynomial
size circuits for the recognition of SAT1 see (KL).
1. Ranges (?es) afl4 SLA Languages (?ua?es)
The p-isomorphism and sparseness conjecture and the more specialized
conjecture one language on a single letter alphabet may be NP complete
may receive a fairly wide exposure at conferences and journal
publications in the United States and Europe (BH. uBl, HB2).
Fortunately(1) in spite of different attempts(1) now progress may be
made on this problem for several years and it starts to look like an
interesting (-ing) problem about NP complete sets which may be likely
to be solved in the near future.
The situation changes suddenly when Piotr Bernan (?an) from Poland
submits a paper "Relationships Between Density and Deterministic
Complexity of NP-Complete Languages(?) to iCALP (?7?). In this
paper(.) motivated by the sparseness conjecture(.) P(.) Berman
considers the consequences of P-time (c) reductions with sparse
range(.)(*)particularly NP complete subsets of a (.) One of the
authors was on the program comittee (n-mittee) for ICAL(?) `78 and the
paper(.) which in its first version is not easy to understand, is
studied at Cornell with great interest. After some (ne) effort, with
the help of 5(.) Fortune(.) we convince ourselves indeed P. Berman's
result is correct. In retrospect it is surprising how elegant and
simple P. Berman's proof is and why so many other people who had
thought about this problem missed it.
The paper is, as it amply deserves, accepted for ICALP `78 and
receives considerable (-siderable) attention. Unfortunately, P. Berman
did not attend ICALP `78 himself and the paper is read at the
conference by Ron Book, who had also worked on single letter alphabet
languages (BWSD).
We state P. Berman's Theorem below and outline a proof:
Theorem ?: a) If there is a P-time reduction with sparse range for an
NP complete set, then p NP.
*
b) If there is an NP complete subset of a , then p NP.
Note carefully P. Berman's hypothesis of part a) is that is a
reduction (-tion) g so l(g(x) : ;?I ? n)l is polynonially bounded.
Though his proof uses CLIQUE as an NP complete problem(1) we consider
the SAT problem in our outline of the proof. Part b), of course(1) is
immediate from part a).
Proof: Let g be a p-time reduction of SAT to a sparse range. We
outline am algorithm (-rithm) to determine if a booleam formula F(x1
l???lXn)s is satisfiable (and if so, find an assignment). The
algorithm searches part of a binary tree of self-reductions (-
reductions) of F. The root is F(Xi?????xn) Each node will correspond
to F with certain (-tain variables instantiated by 0 or I as follows:
if F(b1 1????bi?? sXil???lXn) is at a node, then its offspring are:
F(b?1???lbi?I1OlXi+l? ??1Xn)
F(bil????bi?1 lllXi+I l???sX
n
and
We construct the tree depth first, computing a label g(F) at each
formula F encountered. The algorithm determines certain formulas. F(.)
correspond to unsatisfiable (-tisfiable) formulas and their labels(.)
g(F)(.) are marked as follows: a leaf with formula (-mula) 0 (i.e.
FALSE) is marked unsatisfiable; if both offspring of a node are marked
unsatisfiable then the label at the node is marked unsatisfiable
also.
When a label is marked unsatisfjable other nodes occurring with the
same label are similarly (-larly) marked.
A careful analysis shows whenever a bottom-most node is selected(.)
then either a satisfying assignment is found or a new value g(F) is
marked unsatisfiable in examining the next n nodes of the tree.
Thus(.) the running time is polynomial in the size of F(x1 .....xn).
QED.
A close inspection of this proof shows no explicit use is made of the
fact the set A is in NP. Thus we have actually proved:
3: If SAT may be reduced to a sparse set. then P = NP.
Even more fully formalized(.) P. Berman's proof is quite simple(.) but
it provides the first important step in the solution of the sparseness
conjecture. We believe in the solution of this problem interaction
between different groups plays an important role and the solution of
even a highly specialized conjecture(.) like the sparse range case(.)
provides the necessary impetus for further work.
?-? Complete (pmnlete ?.)
In the attempt to understand P. Berman's proof of the single letter
case(.) Steve Fortune(.) who at that time (nie) is a graduate student
at Cornell(.) notices in ??????5 proof the negative answers yield
valuable information(.) when a formula F is found to be
unsatisfiable(.) its label g(F) is marked; one never has to explore
beneath any other node of the tree with the same (ne) label value.
Furthermore(.) such negative answers are found only polynomially often
before the possible values from g( SATc) are exhausted.
This insight leads 5(.) Fortune to a proof the complete sets in CO-NP
may now be reduced to sparse sets, when P ? NP EF).
Theorem A: If a CO-Np complete set may is reduced to a sparse set 5,
then P : NP.
Proof: Apply the same tree search method as before, observe only
negative answers are propogated up the tree by conjunctive self-
reducability (-reducibility) (i.e., a node is satisfiable if and only
if one or both sons are satisfiable). Since only the negative (-tive)
answers are used to prune the tree search, the polynomial running time
is preserved (o) under a weaker hypothesis.
For a casual observer of theoretical computer science research the
above result may look artificial how it answers the sparseness
question, but instead inolves a strange new problem about complete
sparse sets for CONP(,). On the other hand, this was a critical step,
as we see (n)1 in the solution of the general sparseness conjecture
for NP complete sets.
The Census
Early in 1980, while working on his Ph.D. dissertation under Juris
Hartmanis at Cornell, Steve Mahaney observes the exact number of
elements in a sparse NP complete set may be computed in polynomial
time, then some very interesting consequences (-quences) follow, as
stated below (?!).
For a set 5 let the census function C5 be defined by
C5(n) 1 5 n (?+?)?
(??) Mahaney's observation leads to the following result.
Theorem (?): If there exists a sparse NP complete set 5 with a
polynomial time computable (-putable) census function, C51 then NP CO-
Np.
Proof: We show under the hypothesis we may recognize the complement of
5
c
in nondeterministic polynomial time. Since 5 is complete for CO-Np
this guarantees
NP CO-NP.
Given a string WI compute the census function C5(1w1):k. Using a
nondeterministic (-ministic) polynomial time machine guess k different
sequences w1 1w21.. 1WkI such Iw.I?InI, for i:l.2,....k and verify
they all are in 5 using the NP recognizer of 5(.) if the guessing and
verification succeeds then v is in SC iff:if and only if w?w?Ii:
1.2.....k. Thus. S? is in NP and therefore NP = CO-NP.
Combining (sbining) the above result with Fortune's theorem we get the
following.
QED.
Corollary (ilarv ?): if there exists a sparse NP complete set(.) 5(.)
with a polynomial time computable census function then P = NP.
Proof: From the previous theorem, under the hypothesis of the
corollary, we get NP = CC-NP. But then every set complete for NP is
complete for CO-NP and then(1) because 5 is a sparse complete set for
Co-NP, by Fortune's result we get P = NP.
QED.
4 a in(.) the assumption we have a sparse NP complete set with an
easily computable (rn-putable) census function may appear like
imposing unnatural and restrictive conditions (-tions) just to be able
to derive a result. Surprisingly, the careful exploitation of the
census functions lead a step closer to the solution of the sparseness
conjecture (-ture)(.)
the s
During the spring of 1980 Karp and Lipton made available to us a draft
of their forthcoming SIGACT (CACT) paper "Some Connections Between
Nonuniform and Uniform Complexity Classes" (KL). This paper
investigates the consequences of having "advice functions" (or
oracles) which give values depending only on the length cf the input
to be decided. Karp and Lipton develop uniform algorithms to utilize
the existence(.) but not the easy computability(.) of such advice.
Two results in this paper are relevant to the sparseness conjecture.
The first considers the consequence of having a Turing reduction of
SAT to a sparse set or, equivalently (Lently*) the existence of
polynomial size circuits to solve NP (see Discussion below). The
second result considers advice functions yield O(log(n)) bits of
advice for inputs of size n.
Theorern 1: Suppose h(.) is an advice function for NP satisfying
1. for some c0 h(n) I ? c log(n), and
2. there is a deterministic polynomial time algorithm using
c log(n) bits of advice correctly deciding SAT with advice h(.).
Then P = NP.
The proof of this theorem shows all potential values of the c log(n)
bits may be examined and the correct answer determined uniformly in
polynomial time.
The deciphering of the Karp and Lipton paper(1) though it did not deal
directly with the sparseness conjecture(1) suggests to Mahaney a new
approach to the sparseness (-ness?) conjecture which combines the
previously developed methods and leads to its solution (-tion).
The intuitive link between Theorem (n) 7 and the sparseness conjecture
is found in the census results (Theorem 5 and Corollary 6). The
unnatural hypothesis of the census results is the census function(.)
C5(n)(.) is easily computable. Instead of (1) observing C5(n) is
bounded by a polynomial(.) we see the census may be written (-ten) in
O(log(n)) bits. The census results suggest a method to construct an
algorithm (-ritbin) uniformly tries potential values of the census.
The essence of Mahaney's idea is to apply a census-like method
(without knowing the exact census) to a sparse NP complete set to
construct a p-time (-time) reduction of a CONP set to the sparse set,
and then to use a Berman-Fortune depth first search method to solve
SAT. The lack of knowledge about the census function is overcome by
trying all of the polynomially many (nany) values for the census
function and proving the incorrect values may either be detected or
they may not give a wrong answer.
In the proof below the ignorance about the census function is overcome
by constructing (-structing) a pseudo-complement of the sparse NP
complete set 5. The pseudo-complement incorporates guesses about what
the corresponding census is and it is used to construct (-atruct) the
desired sparse set of labels for the depth first search.
The outline of the proof below is as follows: We first give an NP
recognizer for the ?pseudo-complenent" of the sparse set 5. A
reduction of this set to the
c
sparse set 5 is used to provide the sparse set of labels for SAT;
however, the certain (-tain) computation of the set of labels requires
knowing the census of 5(.) Finally(.) the depth first search is
modified to determine satisfiability of a formula (without "c" exact
knowledge of how to generate the sparse set of labels for SAT ).
For the following discussion let 5 c (o.l)* be a sparse complete set
for NP.
Let M5 be a nondeterminiztic polynomial time recognizer of 5 and let
C5(n) : 5 fl (+?) I ? p(n)
where c5c.) is the true census function of 5, and p(.) is a polynomial
bound the size of the census.
We begin by constructing a Turing machine to recognize the pseudo-
complement (-complement) of 5 in nondeterministic polynomial time(.)
inputs include a padding #? and an integer k is a possible value of
C5(n). Define the non-deterninistic recognizer M by the following
procedure:
M(tn,s.k):
Check Is ? n; otherwise reject.
Check k s p(n); otherwise reject.
Guess ?i'?? `5k ??
1. for all i, 1s.I ?
ii. for all i and j, i?j ? s??s?
iii. for all i, check 5(.) is accepted by M5,
the recognizer of 5.
iv. check for all i, a ? a..
L=?a ?: Let Is ? n and k ? p(n)(.) Then on input (in155k) the machine
M will:
1. accept if k < c(n);
2. reject if k ) c(n); and
3. if k : C5(n), then N accepts if and only if M5 rejects a.
(?) Lemma: We show part 3. If M accepts. then it viii have enumerated
the(.) elements of 5 up to size n, verified they belong to 5, and
shown that 5 is dintinct (-tinct) from these elenents. Since k is the
true census(1) M accepts i? and only if (*) is not in 5.
QED.
Intuitively(.) for k C (n)(.) X is a recognizer of 5 complement.
Moreover, M5 accepts its language in non-deterministic polynomial time
(the input ?n is a padding to ensure this).
We (viii) require labeling:labeling functions for pruning tree
searches. The following discussion shows how to construct such
functions from the sparse set 5 and many-fine (line) reductions of
L(M).
Since M is an NP machine and 5 is NP complete(.) there is a p-time
many-one (-one) reduction g:LCN) ? 5 so for some monotonic (nonotonic)
polynomial q(.), inputs to M of size n are reduced tovstrings of size
at most q(n) (cf. Ec) and EK)). Similarly. for the NP-complete problem
SAT, there is a P-time many-one reduction f:SAT ? 5 and a monotonic
polynonial r(.) bounding the increase in size.
Let F of size m be a formula to be decided and let n r(m). Then any
formula F' occurring in the tree of all self reductions will have
size ? m and f(F') will have size at most n. Regarding k as a possible
value for C5(n)(.) we define Ln,k (F') : g(#n.f(F?).k) as the
labeling:labeling function.
..L=uL ?: Let F be a formula of size m and let ? r(m).
Furthermore, let kCs(n) be the true census. Then the function L (F')
u*k for formulas (nulas) 7, of size at nost n satisfies:
1. F' is not satisfiable if and only if L (F) is in 5;
n*k
2. The unsatisfiable formulas (nulas) of size at nost n are mapped
(napped) by Lu*k to at nost p(q(2n+clog(n))) ? p(qC3n)) distinct
strings of 5 where c is a constant depending only on:
Proof: Part 1 is immediate (innediate) from (fr?) Theorem (ren) 5. For
part 2 observe '2n+tlogCn) S 3n is a bound on the size of ???* f(F1)
(.) (k). Applying p 0 q gives an upper bound on the census of strings
the triple could map (nap) to.
We now know a suitable labeling:labeling function exists for k :
c5(n); and we are aware c (n)* is the true census! The algorithn in
the following theorem (n) shows how we
5 may try Ln*k for all k ?
Theorem 1?: If NP has a sparse conplete set* then ? : NP.
Proof: We give a deterninistic procedure to recognize SAT. Let F be a
formula (nula) of size n. Apply the following algorithn:
begin
For k : ? to p(r(m)) do
Execute the depth first search algorithm using
label:labeling function: Lm.k(F')
at each node F' encounters in the pruned search tree.
If a satisfying assignment is found(.)
then halt; F is satisfiable.
If a tree search visits more than
m + rn * p(q(3 r(m))) internal nodes(1)
then halt the search for this k.
end:
F is now satisfiable;
end
The algorithm clearly runs in polynomial time since the loop is
executed at most p(r(m)) times and each iteration of the loop visits a
polynomially (nially) bounded in m number of nodes.
The correctness of the algorithm is established in the following
result.
J=ma 1: If F is satisfiable(1) then for k : cs(r(m)) the search will
find a satisfying assignment.
Proof: By Theorem 5(.) this k gives a labeling:labeling function maps
the unsatisfiable formulas of size at most m to a polynomially bounded
set. Fortune shows the depth first search will find a satisfying
assignment visiting at most
m + m * p(q(3r(m)))internal nodes.
QED.
It is interesting to note we have computed the census: a satisfying
assignment is found with a number of k's; similarly(.) if no
satisfying assignment existed(1) of many of the trees may be searched
but the tree with kCs(r(m)) is now distinguished.
The method of conducting many tree searches is parallelled in the
uniform algorithm (-rithm) technique by Karp and Lipton (KL). They
show when NP is accepted in P with log( ) advice(1) then ? : NP. The
census function might be compared to a log( )-advisor to the
polynomial information in the set 5.
It is not necessary to assume an NP recognizer for the sparse set: 5
is NP-hard.
Lemma (na) 11: If 5 is sparse and NP-hard(1) then a set St is
sparse(.)NP complete(.) and has a P-time reduction: SAT --> St is
length increasing.
Proof: Let f: SAT 5 be a p-time reduction and let # be a new symbol.
Define f#: SAT ? St by where p : max(O. lf(F)1 - Fl). Clearly St is
sparse. The mapping f# reduces SAT to St(.) Membership of 5 in St is
verified by guessing a satisfiable formula maps to a and verifying (?
ifying) satisfiability.
Corollary (?): If NP is sparse reducible, then ? : NP.
1. s
QED.
Although the isomorphism results (EBH) are the direct ancestors of the
work discussed (-cussed) here(1) the concept of sparseness has another
motivation as stated in the Introduction: Can a "sparse amount of
information" be used to solve NP problems in polynomial time? The
approach here assumes the information is given as a many-one reduction
to a sparse set.
For Turing reductions(.) the information is given as a sparse oracle
set. A.
Meyer has shown a sparse oracle for NP is equivalent to the existence
of polynomial (-nomial) size circuit to solve NP (Bfl]. The recent
work by Karp(.) Lipton and Sipser (KL) has shown if NP has polynomial
size circuits(1) then the polynomial time P hierarchy (s) collapses
(??) Their result is weaker than Theorem 10(.) but it also has a
weaker hypothesis. It is an interesting open problem to determine if
polynomial (-mial) size circuits for NP implies ? : NP.
Similarly(.) now we know sparse NP complete sets cannot exist and P ?
NP(.) it is interesting to determine there may exist sparse sets in NP
- P. By Ladner's result (EL) we know if P ? NP then there exist
incomplete sets in NP - P; the proof of this result yields sparse sets
and we have found a way to modify it to yield sparse sets.
For a related study of the structure of NP complete sets(.) see (LLR).
In this paper Landweber(.) Lipton(.) and Robertson explore the
possibility of having large gaps in NP complete sets.
Finally(.) it is hoped the success in solving the sparseness
conjecture will initiate a new attack on the p-isomorphism conjecture
for NP complete sets.
In conclusion(.) it is interesting to see how many people have
directly or indirectly worked and contributed to the solution of the
sparseness conjecture(.) among then(.) referenced in this paper are L.
Berman. P. Berman. R. Book. D. Dobkin, 5. Fortune. J. (II) Hartmanis.
R. Karp. L. Landweber. R. Lipton. 5. Mahaney. A. Xeyer. N.Patterson.
E. Robertson. A. Selman. M. Sipser. and C. Wrathall.
(AHu) Aho. A.V., Hopcroft. J.E.(.) and Ulinian. J.D.(.) The Design and
Analysis of Computer (rn-puter) Algorithms. Addison-Wesley (1974).
B) Berman. P. "Relationship Between Density and Deterministic
Complexity of NP-Complete (-Complete) Languages." Fifth Int.
Colloquium (ollocujum) on Automata. Languages and Programming (?ing).
Italy (July 1978). Springer Verlag Lecture Notes in Computer Science
Vol. 62. pp. 63-71.
(BH) Berman. L. and (R) Hartmanis, J., "On Isomorphisms and Density of
NP and Other Complete (rn-plete) Sets." SlAM J. Comput(.). 6 (1977).
pp. 305-322. See also Proceedings 8th Annual ACH Synposium on Theory
of Computing (titing). (1976) pp 3040.
(BWSD) Book, R.. Wrathall. C.. Selnan. A.. and Dobkin. D.. "Inclusion
Complete Tally Languages and the (U) Hartnanis-Bernan Conjecture."
(C) Cook. S.A.. "The Complexity of Theorem (n) Proving Procedures."
Proc. 3rd Annual ACn Symposium on Theory of Computing. (1977) pp.
151-158.
(F) Fortune. 5., "A Note on Sparse Complete Sets." SlAM J. Comput(.).
(1979). pp. 431-433.
(GJ) Carey. M.R.. and Johnson. D.S(.). "Computers and intractability.
A Guide to the Theory of NP-Completeness." W.H. Freeman and Co(.). San
Francisco. 1979.
(HBl) Hartmanis. J., and Berman. L(.). "On Polynomial Time
Isomorphisms of Complete Sets." Theoretical Computer Science. 3rd CI
Conference. March. 1977. Lecture Note in Computer Science. Vol. 48.
Springer-Verlag. Heidelberg. pp. 1-15.
(uB2) Hartmanis. J(.). and Berman. L(.). (w) On Polynomial Time
Isomorphisms of Some New Complete Sets." j. of Computer and System
Sciences. Vol. 16 (1978). pp. 418-422.
(HM) Hartruanis. J(.). and (?t) Mahaney. S.R(.). "On Census Complexity
and Sparseness of NP-Complete (-Complete) Sets." Department of
Computer Science. Cornell University. Technical Report TR 80-416
(April 1980).
(EK) Karp. R.. "Reducibility Among Combinatorial (nbinatorial)
Problems." in Complexity of Computer Computations (R.E. Miller and
J.W. Thatcher. eds.). Plenum. New York (1972).
(KL) Karp. R.M.* and Lipton. R.J.. "Some Connections between
Nonuniform and Uniform Complexity Classes." Proc. 12th ACM Symposium
on Theory of Computing. (May (Nay) 1980).
(EL) Ladner, R.E.. "On the Structure of Polynomial Time Reducibility."
J. Assoc. Computing Machinery. Vol. 22 (1975). pp. 135--H171.
(ELLR) Landweber. L.ll(.). Lipton. R.J(.). and Robertson. E.L.1 "On
the Structure of Sets in NP and Other Complexity Classes." Computer
Science Tech. Report 342 (December (nber)(19,8). University of
Wisconsin-Madison.
(EM) Mahaney. S.R(.). "Sparse Complete Sets for NP: Solution of a
Conjecture by Berman and (fl) Hartmanis. (W) Department of Computer
Science. Cornell University. Technical Report TIL 80-417 (April 1980).
(EMP) Patterson. M. and Meyer. A.R(.). ("?ith) With What Frequency are
Apparently Intractable Problens Difficult?", Laboratory for Computer
Science. M.I.T. Tech. Report(.). February 1919.
(5) Stockmeyer. L.J(.). "The Polynomial-(nial)Time Hierarchy."
Theoretical Computer Science Vol. 3. (1977). pp. 1-22.
NP COMPLETE SETS
By
M. Musatov
The purpose of this paper is to review the origins and motivation for
the conjecture sparse NP complete sets do not exist (unless ? NP) and
to describe the development of the ideas and techniques leading to the
recent solution of this conjecture.
The research in theoretical computer science and computational
complexity theory has been strongly influenced by the study of such
feasibly computable families of languages as P, NP and PTAPE. This
research has revealed deep and unsuspected connections between
different classes of problems and it has provided completely new means
for classifying the computational complexity of problems.
Furthermore, the work has raised a set of interesting new research
problems and crested an unprecedented consensus about what problems
have to be solved before real understanding of the complexity of
computations can be achieved.
In the research on feasible computations the central role has been
played by the families of deterministic and nondeterministic
polynonial time computable languages. P and NP(1) respectively [MIU.
c, cj, K]. In particular(.) the NP complete(?te) languages have been
studied intensively and virtually hundreds of natural NP complete
(rnplete) problems have been found in many different areas of
applications (AHu. cJ).
Though we do not yet know whether P (?) NP(.) we accept today a proof
a problem is NP complete as convincing evidencethe problem may be
polynonial time computable (and feasibly computable); a proof a
problem is complete for PTAPE is viewed as even stronger evidence the
problem is feasibly computable (even though there is no proof that P
(?) NP (?) PTAPE).
As part ot the general study of similarities among NP complete
problems it is conjectured by Berman and Hartmanis(1) for reasons
given in the next section(1) all NP complete problems are (?re)
isomorphic (norphic) under polynomial time mappings and therefore
there may exist (sparse) NP complete sets with considerably fewer
elements than the known classic complete problems (e.g. SAT. CLIQUE(1)
etc. EBH)).
When the conjecture is first formulated in 1975(.) the understanding
of NP complete(-plete) problems was more limited and several energetic
frontal assaults on this problem(-lem) failed (?ailed). As a matter of
fact, the problem looked quite hopeless after a considerable (-erable)
initial effort to solve it. Fortunately(1) during the next five years
a number of different people in Europe and America contributed a set
of ideas and techniques recently leading to an elegant solution of the
problem by (5)(.) Mahaney of Cornell University (M).
The purpose of this paper is to describe the origins of the sparseness
conjecture (-ture) about NP complete sets(1) to relate the information
flow about this problem and to describe the development of the crucial
ideas finally leading to the proof sparse NP complete sets exist(1)
then P = NP (M).
We believe this is an interesting and easily understandable
development in the study of NP complete problems and there are some
lessons to be learned about computer science research from the way
this tantalizing problem was solved.
Furthermore(1) it is hoped these results provide a new impetus for
work on the main conjecture all NP complete sets are p-isomorphic.
(?.) Preliminaries and the Sparseness (narseness)
Let P and NP denote(.) respectively(.) the families of languages
accepted by deterministic (-?inistic) and nondeterministic Turing
machines in polynomial time.
A language C is said to be (?) complete (niete) if C is in NP and if
for any other language 3 in NP there exists a polynomial time
computable function f such x ( 3f(x) c C.
The importance of the family of languages P stems from the fact they
provide (-vide) a reasonable model for the feasibly computable
(nutable) problems. The family NP contains (-tains) many important
practical problems and a large number of problems from different (-
ferent) areas of applications in computer science and mathematics have
been shown to be complete for NP (AHU. C. BJ1 K). Since today it is
widely conjectured P (?) NP(.) the NP complete problems are believed
(by a minority) may be solvable in polynomial time.
Currently one of the most fascinating problems (toblems) in
theoretical computer science is to understand better the structure of
feasibly computable problems and(.) in particular(.) to resolve the p
NP question. For an extensive study of P and NP see EARu. GJ].
A close study of the classic NP complete sets(.) such as SAT(.) the
satisfiable Boolean formulas in conjunctive normal form(.) RAM(.)
graphs with Hamiltonian circuits(.0) or CLIQUE(.) graphs with cliques
of specified size(.) revealed they are very similar (-lar) in a strong
technical sense (BH). Not only may they be reduced to each other(.)
they are actually isomorpbic under polynomial time mappings (nappings)
as defined below:
* *
Two languages A and 3. A ? ? and 3 r . are ?-iso?or?hic:isomrophic
iff:if an only if there exists a bijection f:?*???:f:question times
three question (i.e. a one-to-one and onto mapping) such
1. f and f?1:f question one are polynomial time computable.
--H1
2. f is a reduction of A to 3 and f is a reduction of 3 to A.
Further study reveals all the "known" NP complete sets are p-
isomorphic and one may formulate (after a number of technical lemmas)
a very simple (nple) condition (-dition) for NP complete sets to be p-
isonorphic to SAT in terms of two padding functions (-tions) (BH).
Theorem L: An NP complete set B is p-isomorphic to SAT iff:if and only
if there exists two polynomial (-?ial) time computable functions D and
5 such
1. (Vx?y) (D(x.y) c B ?xcB)
2. (Vx*y) (SoD(x,y) :
All the known NP complete sets have these padding functions and in
most cases they are easy to find. A good example is SAT(.) for which y
can easily be encoded in any given formula in terms of new variables
which do (may) change the satisfiability of the formula [Bli].
From these studies grew the conviction all NP complete sets are p--H
isomorphic and this conjecture is explicitly stated in [EBH].
Clearly(.) if all NP complete sets are p-isomorphic then they all must
be infinite (-ite) and therefore P X NP. Thus it is realized this
conjecture may be very hard to prove(.) but the possibility is left
open it may be easier to disprove it. One way of disproving the p-
isomorphism conjecture suggests the fact p-isomorpflic sets have quite
similar densities. To make a precise we define sparseness below:
*
A set B. B c ? is said to be sparse if there exists a polynomial p(n)
such
I Bn(c+?) I `
Thus p(n) bounds the number of elements in B up to size n.
It is easily seen that SAT and other known NP complete sets are (may
be) sparse (any set possessing the padding functions D and 5 is may be
sparse) and a sparse set may be p-isomorphic to SAT. These
considerations lead to the conjecture (Bil) there may exist sparse NP
complete sets (unless p : NP). In particular(.) it is (*) conjectured
one set over a single letter alphabet say B?a may be NP complete.
It is interesting to note the p-isomorphism conjecture quickly leads
to the sparse set conjecture and then to the innocuous looking
conjecture a language on a single letter alphabet may be NP complete.
We return to this last conjecture in the next section(.) it is the
first to be solved.
A more indirect motivation for the p-isomorphism conjecture comes from
the sugggested (-gested) analogy between recursive and recursively
enumerable languages and P and NP as their feasibly computable
counterparts. This analogy becomes particularly intrigueing (-gueing)
and suggestive when it is extended to the Kleene Hierarchy and the
polynomial time hierarchy (5). The NP complete sets correspond in this
analogy to the r.e. complete sets(1) known to be the same as the
creative sets and they are all recursively isomorphic. This suggests
by analogy the NP complete (?omplete) sets should be (p--H)
isomorphic(.) as conjectured in (Bli).
Lastly(.) a sparse NP complete set implies the necessary information
to solve NP problems may be condensed in a sparse set (?et). In other
words(1) the sparse set may be computed and then used as a
polynomially long oracle tape to solve other NP complete problems. At
the time of stating (Ating) the sparseness conjecture this looked very
unlikely, and now we know it is not impossible when p NP. For related
results discussing the consequences of the existence of polynomial
size circuits for the recognition of SAT1 see (KL).
1. Ranges (?es) afl4 SLA Languages (?ua?es)
The p-isomorphism and sparseness conjecture and the more specialized
conjecture one language on a single letter alphabet may be NP complete
may receive a fairly wide exposure at conferences and journal
publications in the United States and Europe (BH. uBl, HB2).
Fortunately(1) in spite of different attempts(1) now progress may be
made on this problem for several years and it starts to look like an
interesting (-ing) problem about NP complete sets which may be likely
to be solved in the near future.
The situation changes suddenly when Piotr Bernan (?an) from Poland
submits a paper "Relationships Between Density and Deterministic
Complexity of NP-Complete Languages(?) to iCALP (?7?). In this
paper(.) motivated by the sparseness conjecture(.) P(.) Berman
considers the consequences of P-time (c) reductions with sparse
range(.)(*)particularly NP complete subsets of a (.) One of the
authors was on the program comittee (n-mittee) for ICAL(?) `78 and the
paper(.) which in its first version is not easy to understand, is
studied at Cornell with great interest. After some (ne) effort, with
the help of 5(.) Fortune(.) we convince ourselves indeed P. Berman's
result is correct. In retrospect it is surprising how elegant and
simple P. Berman's proof is and why so many other people who had
thought about this problem missed it.
The paper is, as it amply deserves, accepted for ICALP `78 and
receives considerable (-siderable) attention. Unfortunately, P. Berman
did not attend ICALP `78 himself and the paper is read at the
conference by Ron Book, who had also worked on single letter alphabet
languages (BWSD).
We state P. Berman's Theorem below and outline a proof:
Theorem ?: a) If there is a P-time reduction with sparse range for an
NP complete set, then p NP.
*
b) If there is an NP complete subset of a , then p NP.
Note carefully P. Berman's hypothesis of part a) is that is a
reduction (-tion) g so l(g(x) : ;?I ? n)l is polynonially bounded.
Though his proof uses CLIQUE as an NP complete problem(1) we consider
the SAT problem in our outline of the proof. Part b), of course(1) is
immediate from part a).
Proof: Let g be a p-time reduction of SAT to a sparse range. We
outline am algorithm (-rithm) to determine if a booleam formula F(x1
l???lXn)s is satisfiable (and if so, find an assignment). The
algorithm searches part of a binary tree of self-reductions (-
reductions) of F. The root is F(Xi?????xn) Each node will correspond
to F with certain (-tain variables instantiated by 0 or I as follows:
if F(b1 1????bi?? sXil???lXn) is at a node, then its offspring are:
F(b?1???lbi?I1OlXi+l? ??1Xn)
F(bil????bi?1 lllXi+I l???sX
n
and
We construct the tree depth first, computing a label g(F) at each
formula F encountered. The algorithm determines certain formulas. F(.)
correspond to unsatisfiable (-tisfiable) formulas and their labels(.)
g(F)(.) are marked as follows: a leaf with formula (-mula) 0 (i.e.
FALSE) is marked unsatisfiable; if both offspring of a node are marked
unsatisfiable then the label at the node is marked unsatisfiable
also.
When a label is marked unsatisfjable other nodes occurring with the
same label are similarly (-larly) marked.
A careful analysis shows whenever a bottom-most node is selected(.)
then either a satisfying assignment is found or a new value g(F) is
marked unsatisfiable in examining the next n nodes of the tree.
Thus(.) the running time is polynomial in the size of F(x1 .....xn).
QED.
A close inspection of this proof shows no explicit use is made of the
fact the set A is in NP. Thus we have actually proved:
3: If SAT may be reduced to a sparse set. then P = NP.
Even more fully formalized(.) P. Berman's proof is quite simple(.) but
it provides the first important step in the solution of the sparseness
conjecture. We believe in the solution of this problem interaction
between different groups plays an important role and the solution of
even a highly specialized conjecture(.) like the sparse range case(.)
provides the necessary impetus for further work.
?-? Complete (pmnlete ?.)
In the attempt to understand P. Berman's proof of the single letter
case(.) Steve Fortune(.) who at that time (nie) is a graduate student
at Cornell(.) notices in ??????5 proof the negative answers yield
valuable information(.) when a formula F is found to be
unsatisfiable(.) its label g(F) is marked; one never has to explore
beneath any other node of the tree with the same (ne) label value.
Furthermore(.) such negative answers are found only polynomially often
before the possible values from g( SATc) are exhausted.
This insight leads 5(.) Fortune to a proof the complete sets in CO-NP
may now be reduced to sparse sets, when P ? NP EF).
Theorem A: If a CO-Np complete set may is reduced to a sparse set 5,
then P : NP.
Proof: Apply the same tree search method as before, observe only
negative answers are propogated up the tree by conjunctive self-
reducability (-reducibility) (i.e., a node is satisfiable if and only
if one or both sons are satisfiable). Since only the negative (-tive)
answers are used to prune the tree search, the polynomial running time
is preserved (o) under a weaker hypothesis.
For a casual observer of theoretical computer science research the
above result may look artificial how it answers the sparseness
question, but instead inolves a strange new problem about complete
sparse sets for CONP(,). On the other hand, this was a critical step,
as we see (n)1 in the solution of the general sparseness conjecture
for NP complete sets.
The Census
Early in 1980, while working on his Ph.D. dissertation under Juris
Hartmanis at Cornell, Steve Mahaney observes the exact number of
elements in a sparse NP complete set may be computed in polynomial
time, then some very interesting consequences (-quences) follow, as
stated below (?!).
For a set 5 let the census function C5 be defined by
C5(n) 1 5 n (?+?)?
(??) Mahaney's observation leads to the following result.
Theorem (?): If there exists a sparse NP complete set 5 with a
polynomial time computable (-putable) census function, C51 then NP CO-
Np.
Proof: We show under the hypothesis we may recognize the complement of
5
c
in nondeterministic polynomial time. Since 5 is complete for CO-Np
this guarantees
NP CO-NP.
Given a string WI compute the census function C5(1w1):k. Using a
nondeterministic (-ministic) polynomial time machine guess k different
sequences w1 1w21.. 1WkI such Iw.I?InI, for i:l.2,....k and verify
they all are in 5 using the NP recognizer of 5(.) if the guessing and
verification succeeds then v is in SC iff:if and only if w?w?Ii:
1.2.....k. Thus. S? is in NP and therefore NP = CO-NP.
Combining (sbining) the above result with Fortune's theorem we get the
following.
QED.
Corollary (ilarv ?): if there exists a sparse NP complete set(.) 5(.)
with a polynomial time computable census function then P = NP.
Proof: From the previous theorem, under the hypothesis of the
corollary, we get NP = CC-NP. But then every set complete for NP is
complete for CO-NP and then(1) because 5 is a sparse complete set for
Co-NP, by Fortune's result we get P = NP.
QED.
4 a in(.) the assumption we have a sparse NP complete set with an
easily computable (rn-putable) census function may appear like
imposing unnatural and restrictive conditions (-tions) just to be able
to derive a result. Surprisingly, the careful exploitation of the
census functions lead a step closer to the solution of the sparseness
conjecture (-ture)(.)
the s
During the spring of 1980 Karp and Lipton made available to us a draft
of their forthcoming SIGACT (CACT) paper "Some Connections Between
Nonuniform and Uniform Complexity Classes" (KL). This paper
investigates the consequences of having "advice functions" (or
oracles) which give values depending only on the length cf the input
to be decided. Karp and Lipton develop uniform algorithms to utilize
the existence(.) but not the easy computability(.) of such advice.
Two results in this paper are relevant to the sparseness conjecture.
The first considers the consequence of having a Turing reduction of
SAT to a sparse set or, equivalently (Lently*) the existence of
polynomial size circuits to solve NP (see Discussion below). The
second result considers advice functions yield O(log(n)) bits of
advice for inputs of size n.
Theorern 1: Suppose h(.) is an advice function for NP satisfying
1. for some c0 h(n) I ? c log(n), and
2. there is a deterministic polynomial time algorithm using
c log(n) bits of advice correctly deciding SAT with advice h(.).
Then P = NP.
The proof of this theorem shows all potential values of the c log(n)
bits may be examined and the correct answer determined uniformly in
polynomial time.
The deciphering of the Karp and Lipton paper(1) though it did not deal
directly with the sparseness conjecture(1) suggests to Mahaney a new
approach to the sparseness (-ness?) conjecture which combines the
previously developed methods and leads to its solution (-tion).
The intuitive link between Theorem (n) 7 and the sparseness conjecture
is found in the census results (Theorem 5 and Corollary 6). The
unnatural hypothesis of the census results is the census function(.)
C5(n)(.) is easily computable. Instead of (1) observing C5(n) is
bounded by a polynomial(.) we see the census may be written (-ten) in
O(log(n)) bits. The census results suggest a method to construct an
algorithm (-ritbin) uniformly tries potential values of the census.
The essence of Mahaney's idea is to apply a census-like method
(without knowing the exact census) to a sparse NP complete set to
construct a p-time (-time) reduction of a CONP set to the sparse set,
and then to use a Berman-Fortune depth first search method to solve
SAT. The lack of knowledge about the census function is overcome by
trying all of the polynomially many (nany) values for the census
function and proving the incorrect values may either be detected or
they may not give a wrong answer.
In the proof below the ignorance about the census function is overcome
by constructing (-structing) a pseudo-complement of the sparse NP
complete set 5. The pseudo-complement incorporates guesses about what
the corresponding census is and it is used to construct (-atruct) the
desired sparse set of labels for the depth first search.
The outline of the proof below is as follows: We first give an NP
recognizer for the ?pseudo-complenent" of the sparse set 5. A
reduction of this set to the
c
sparse set 5 is used to provide the sparse set of labels for SAT;
however, the certain (-tain) computation of the set of labels requires
knowing the census of 5(.) Finally(.) the depth first search is
modified to determine satisfiability of a formula (without "c" exact
knowledge of how to generate the sparse set of labels for SAT ).
For the following discussion let 5 c (o.l)* be a sparse complete set
for NP.
Let M5 be a nondeterminiztic polynomial time recognizer of 5 and let
C5(n) : 5 fl (+?) I ? p(n)
where c5c.) is the true census function of 5, and p(.) is a polynomial
bound the size of the census.
We begin by constructing a Turing machine to recognize the pseudo-
complement (-complement) of 5 in nondeterministic polynomial time(.)
inputs include a padding #? and an integer k is a possible value of
C5(n). Define the non-deterninistic recognizer M by the following
procedure:
M(tn,s.k):
Check Is ? n; otherwise reject.
Check k s p(n); otherwise reject.
Guess ?i'?? `5k ??
1. for all i, 1s.I ?
ii. for all i and j, i?j ? s??s?
iii. for all i, check 5(.) is accepted by M5,
the recognizer of 5.
iv. check for all i, a ? a..
L=?a ?: Let Is ? n and k ? p(n)(.) Then on input (in155k) the machine
M will:
1. accept if k < c(n);
2. reject if k ) c(n); and
3. if k : C5(n), then N accepts if and only if M5 rejects a.
(?) Lemma: We show part 3. If M accepts. then it viii have enumerated
the(.) elements of 5 up to size n, verified they belong to 5, and
shown that 5 is dintinct (-tinct) from these elenents. Since k is the
true census(1) M accepts i? and only if (*) is not in 5.
QED.
Intuitively(.) for k C (n)(.) X is a recognizer of 5 complement.
Moreover, M5 accepts its language in non-deterministic polynomial time
(the input ?n is a padding to ensure this).
We (viii) require labeling:labeling functions for pruning tree
searches. The following discussion shows how to construct such
functions from the sparse set 5 and many-fine (line) reductions of
L(M).
Since M is an NP machine and 5 is NP complete(.) there is a p-time
many-one (-one) reduction g:LCN) ? 5 so for some monotonic (nonotonic)
polynomial q(.), inputs to M of size n are reduced tovstrings of size
at most q(n) (cf. Ec) and EK)). Similarly. for the NP-complete problem
SAT, there is a P-time many-one reduction f:SAT ? 5 and a monotonic
polynonial r(.) bounding the increase in size.
Let F of size m be a formula to be decided and let n r(m). Then any
formula F' occurring in the tree of all self reductions will have
size ? m and f(F') will have size at most n. Regarding k as a possible
value for C5(n)(.) we define Ln,k (F') : g(#n.f(F?).k) as the
labeling:labeling function.
..L=uL ?: Let F be a formula of size m and let ? r(m).
Furthermore, let kCs(n) be the true census. Then the function L (F')
u*k for formulas (nulas) 7, of size at nost n satisfies:
1. F' is not satisfiable if and only if L (F) is in 5;
n*k
2. The unsatisfiable formulas (nulas) of size at nost n are mapped
(napped) by Lu*k to at nost p(q(2n+clog(n))) ? p(qC3n)) distinct
strings of 5 where c is a constant depending only on:
Proof: Part 1 is immediate (innediate) from (fr?) Theorem (ren) 5. For
part 2 observe '2n+tlogCn) S 3n is a bound on the size of ???* f(F1)
(.) (k). Applying p 0 q gives an upper bound on the census of strings
the triple could map (nap) to.
We now know a suitable labeling:labeling function exists for k :
c5(n); and we are aware c (n)* is the true census! The algorithn in
the following theorem (n) shows how we
5 may try Ln*k for all k ?
Theorem 1?: If NP has a sparse conplete set* then ? : NP.
Proof: We give a deterninistic procedure to recognize SAT. Let F be a
formula (nula) of size n. Apply the following algorithn:
begin
For k : ? to p(r(m)) do
Execute the depth first search algorithm using
label:labeling function: Lm.k(F')
at each node F' encounters in the pruned search tree.
If a satisfying assignment is found(.)
then halt; F is satisfiable.
If a tree search visits more than
m + rn * p(q(3 r(m))) internal nodes(1)
then halt the search for this k.
end:
F is now satisfiable;
end
The algorithm clearly runs in polynomial time since the loop is
executed at most p(r(m)) times and each iteration of the loop visits a
polynomially (nially) bounded in m number of nodes.
The correctness of the algorithm is established in the following
result.
J=ma 1: If F is satisfiable(1) then for k : cs(r(m)) the search will
find a satisfying assignment.
Proof: By Theorem 5(.) this k gives a labeling:labeling function maps
the unsatisfiable formulas of size at most m to a polynomially bounded
set. Fortune shows the depth first search will find a satisfying
assignment visiting at most
m + m * p(q(3r(m)))internal nodes.
QED.
It is interesting to note we have computed the census: a satisfying
assignment is found with a number of k's; similarly(.) if no
satisfying assignment existed(1) of many of the trees may be searched
but the tree with kCs(r(m)) is now distinguished.
The method of conducting many tree searches is parallelled in the
uniform algorithm (-rithm) technique by Karp and Lipton (KL). They
show when NP is accepted in P with log( ) advice(1) then ? : NP. The
census function might be compared to a log( )-advisor to the
polynomial information in the set 5.
It is not necessary to assume an NP recognizer for the sparse set: 5
is NP-hard.
Lemma (na) 11: If 5 is sparse and NP-hard(1) then a set St is
sparse(.)NP complete(.) and has a P-time reduction: SAT --> St is
length increasing.
Proof: Let f: SAT 5 be a p-time reduction and let # be a new symbol.
Define f#: SAT ? St by where p : max(O. lf(F)1 - Fl). Clearly St is
sparse. The mapping f# reduces SAT to St(.) Membership of 5 in St is
verified by guessing a satisfiable formula maps to a and verifying (?
ifying) satisfiability.
Corollary (?): If NP is sparse reducible, then ? : NP.
1. s
QED.
Although the isomorphism results (EBH) are the direct ancestors of the
work discussed (-cussed) here(1) the concept of sparseness has another
motivation as stated in the Introduction: Can a "sparse amount of
information" be used to solve NP problems in polynomial time? The
approach here assumes the information is given as a many-one reduction
to a sparse set.
For Turing reductions(.) the information is given as a sparse oracle
set. A.
Meyer has shown a sparse oracle for NP is equivalent to the existence
of polynomial (-nomial) size circuit to solve NP (Bfl]. The recent
work by Karp(.) Lipton and Sipser (KL) has shown if NP has polynomial
size circuits(1) then the polynomial time P hierarchy (s) collapses
(??) Their result is weaker than Theorem 10(.) but it also has a
weaker hypothesis. It is an interesting open problem to determine if
polynomial (-mial) size circuits for NP implies ? : NP.
Similarly(.) now we know sparse NP complete sets cannot exist and P ?
NP(.) it is interesting to determine there may exist sparse sets in NP
- P. By Ladner's result (EL) we know if P ? NP then there exist
incomplete sets in NP - P; the proof of this result yields sparse sets
and we have found a way to modify it to yield sparse sets.
For a related study of the structure of NP complete sets(.) see (LLR).
In this paper Landweber(.) Lipton(.) and Robertson explore the
possibility of having large gaps in NP complete sets.
Finally(.) it is hoped the success in solving the sparseness
conjecture will initiate a new attack on the p-isomorphism conjecture
for NP complete sets.
In conclusion(.) it is interesting to see how many people have
directly or indirectly worked and contributed to the solution of the
sparseness conjecture(.) among then(.) referenced in this paper are L.
Berman. P. Berman. R. Book. D. Dobkin, 5. Fortune. J. (II) Hartmanis.
R. Karp. L. Landweber. R. Lipton. 5. Mahaney. A. Xeyer. N.Patterson.
E. Robertson. A. Selman. M. Sipser. and C. Wrathall.
(AHu) Aho. A.V., Hopcroft. J.E.(.) and Ulinian. J.D.(.) The Design and
Analysis of Computer (rn-puter) Algorithms. Addison-Wesley (1974).
B) Berman. P. "Relationship Between Density and Deterministic
Complexity of NP-Complete (-Complete) Languages." Fifth Int.
Colloquium (ollocujum) on Automata. Languages and Programming (?ing).
Italy (July 1978). Springer Verlag Lecture Notes in Computer Science
Vol. 62. pp. 63-71.
(BH) Berman. L. and (R) Hartmanis, J., "On Isomorphisms and Density of
NP and Other Complete (rn-plete) Sets." SlAM J. Comput(.). 6 (1977).
pp. 305-322. See also Proceedings 8th Annual ACH Synposium on Theory
of Computing (titing). (1976) pp 3040.
(BWSD) Book, R.. Wrathall. C.. Selnan. A.. and Dobkin. D.. "Inclusion
Complete Tally Languages and the (U) Hartnanis-Bernan Conjecture."
(C) Cook. S.A.. "The Complexity of Theorem (n) Proving Procedures."
Proc. 3rd Annual ACn Symposium on Theory of Computing. (1977) pp.
151-158.
(F) Fortune. 5., "A Note on Sparse Complete Sets." SlAM J. Comput(.).
(1979). pp. 431-433.
(GJ) Carey. M.R.. and Johnson. D.S(.). "Computers and intractability.
A Guide to the Theory of NP-Completeness." W.H. Freeman and Co(.). San
Francisco. 1979.
(HBl) Hartmanis. J., and Berman. L(.). "On Polynomial Time
Isomorphisms of Complete Sets." Theoretical Computer Science. 3rd CI
Conference. March. 1977. Lecture Note in Computer Science. Vol. 48.
Springer-Verlag. Heidelberg. pp. 1-15.
(uB2) Hartmanis. J(.). and Berman. L(.). (w) On Polynomial Time
Isomorphisms of Some New Complete Sets." j. of Computer and System
Sciences. Vol. 16 (1978). pp. 418-422.
(HM) Hartruanis. J(.). and (?t) Mahaney. S.R(.). "On Census Complexity
and Sparseness of NP-Complete (-Complete) Sets." Department of
Computer Science. Cornell University. Technical Report TR 80-416
(April 1980).
(EK) Karp. R.. "Reducibility Among Combinatorial (nbinatorial)
Problems." in Complexity of Computer Computations (R.E. Miller and
J.W. Thatcher. eds.). Plenum. New York (1972).
(KL) Karp. R.M.* and Lipton. R.J.. "Some Connections between
Nonuniform and Uniform Complexity Classes." Proc. 12th ACM Symposium
on Theory of Computing. (May (Nay) 1980).
(EL) Ladner, R.E.. "On the Structure of Polynomial Time Reducibility."
J. Assoc. Computing Machinery. Vol. 22 (1975). pp. 135--H171.
(ELLR) Landweber. L.ll(.). Lipton. R.J(.). and Robertson. E.L.1 "On
the Structure of Sets in NP and Other Complexity Classes." Computer
Science Tech. Report 342 (December (nber)(19,8). University of
Wisconsin-Madison.
(EM) Mahaney. S.R(.). "Sparse Complete Sets for NP: Solution of a
Conjecture by Berman and (fl) Hartmanis. (W) Department of Computer
Science. Cornell University. Technical Report TIL 80-417 (April 1980).
(EMP) Patterson. M. and Meyer. A.R(.). ("?ith) With What Frequency are
Apparently Intractable Problens Difficult?", Laboratory for Computer
Science. M.I.T. Tech. Report(.). February 1919.
(5) Stockmeyer. L.J(.). "The Polynomial-(nial)Time Hierarchy."
Theoretical Computer Science Vol. 3. (1977). pp. 1-22.