Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Hedman. A First Course in Logic, 2004 (Oxford)

.pdf
Скачиваний:
140
Добавлен:
10.08.2013
Размер:
7.17 Mб
Скачать

210

First-order theories

Theorem 5.21 (Morley) Let T be a countable theory. If T is κ-categorical for some uncountable κ, then T is κ-categorical for all uncountable κ.

Morley’s proof of this theorem introduced methods and concepts to model theory that would bear fruit far beyond Morley’s theorem itself. The proof gave rise to the subject of stability theory. We touch upon some of the ingredients of this proof in Chapter 6 (see Exercise 6.33). However, we do not prove Morley’s theorem. Instead, we refer the reader to books devoted solely to model theory such as [29] and [39] and also to more advanced books on stability theory such as [1] and [6]. Also, for the serious student of model theory, Morley’s original proof in [32] remains essential reading.

We conclude this section by stating without proof two results regarding categoricity and finite axiomatizability. Naturally, a theory is said to be finitely axiomatizable if it is axiomatized by a finite set of sentences. We have seen several examples of finitely axiomatizable theories including the theory of graphs, the theory of equivalence relations, the theory of groups, and others. All of these theories are incomplete. In the next section, we shall see examples of finitely axiomatizable complete theories having infinite models. Such theories necessarily contain a sentence that has only infinite models (see Exercises 2.37 and 2.38 for examples of such sentences). As a rule, most complete theories having infinite models are not finitely axiomatizable. If we restrict our attention to totally categorical theories, then we can be more precise.

Theorem 5.22 (Zil’ber) Totally categorical theories are not finitely axiomatizable.

Recall that the theory of cliques from Example 5.17 is finitely axiomatizable and κ-categorical for all κ. Since this theory has finite models, it is not totally categorical. In contrast, the theory Tclique of infinite cliques is totally categorical, but is not finitely axiomatizable. To axiomatize Tclique we must include sentences saying that there exist more than n elements for each n N. Using counting quantifiers as defined in Exercise 2.20, we can express each of these sentences as

≥nx(x = x). This is an example of a quasi-finite axiomatization.

Definition 5.23 A theory T is quasi-finitely axiomatizable if there exists a finite set F of formulas in one free variable such that T is axiomatized by sentences of the form ≥n(x) with ϕ(x) F .

Theorem 5.24 (Hrushovski) If T is totally categorical and has a finite vocabulary, then T is quasi-finitely axiomatizable.

Zil’ber’s and Hrushovski’s theorems are actually corollaries to results regarding the general structure of models of totally categorical theories. Their proofs of are beyond the scope of this book (these theorems are proved in [38]). We focus

First-order theories

211

on more elementary properties of these theories. In the next section, we prove a fundamental result regarding countably categorical theories.

5.3 Countably categorical theories

We investigate some specific countably categorical structures. We consider structures in the vocabulary V< consisting of a single binary relation <. Each of the examples we consider interprets < as a linear order. In the second part of this section, we prove a fundamental result that holds for all countably categorical theories.

5.3.1 Dense linear orders. Consider the closed unit interval of real numbers. Let R[0,1] be the structure {[0, 1]| <} having the closed interval [0, 1] of real numbers as an underlying set and interpreting < in the usual way. We list some V-sentences that hold in R[0,1]. For reference, we label these sentences as δ1δ7.

δ1: x y((x < y) → ¬(y < x))

δ2: x(¬(x < x))

δ3: x y((x < y) (y < x) (x = y))

δ4: x y z(((x < y) (y < z)) (x < z))

δ5: x y((x < y) → z(x < z z < y))

δ6: x y((x = y) (x < y))

δ7: x y((x = y) (y < x)).

The first four of these sentences say that < is a linear order. Recall from Example 5.8 that the theory TLO is defined as the set of all consequences of these four sentences. The sentence δ5 says that between any two elements, there exists another element. That is, the linear order is dense (see Section 2.4.3). Finally, δ6 says that there exists a smallest element and δ7 says that there exists a largest element.

Let TDLOE denote the set of V<-sentences that can be derived from the above seven sentences. This is the theory of dense linear orders with endpoints. Clearly, R[0,1] |= TDLOE . We claim that T h(R[0,1]) = TDLOE . To show this, we must verify that TDLOE , unlike TLO, is a complete theory. We prove something stronger.

Proposition 5.25 TDLOE is 0-categorical.

Proof Let M and N be two models of TDLOE of size 0. We show that M and N are isomorphic. Let UM and UN denote the underlying sets of M and N

212 First-order theories

respectively. Enumerate these sets as follows:

UM = {a1, a2, a3, . . .} and UN = {b1, b2, b3, . . .}.

Since M and N model δ6 and δ7, these sets must contain a smallest and a largest element (with respect to the order <). We may assume that a1 and b1 are the smallest elements in each set and a2 and b2 are the largest elements.

We construct an isomorphism f : M → N step-by-step. In each step we define f for two elements of UM :

Step 1: Let f (a1) = b1 and f (a2) = b2. For n > 1, step n has two parts.

Step n: Part a. Let An be the set of all ai UM for which f (ai) has been defined in some previous step. Let j be least such that aj is not in An. We define f (aj ). Since An is finite, we can find elements c and d of An so that no element of An is between these two elements and aj is (that is, c < aj < d or d < aj < c). In this case, let f (aj ) be any element of UN that lies between f (c) and f (d). Since N |= δ5, such f (aj ) exists.

Part b. Let Bn be the set of all bi UN for which f 1(bi) has been defined in some previous step (including f (aj ) from part a). Let j be least such that bj is not in Bn. Since Bn is finite, we can find elements c and d of Bn so that no element of Bn is between these two elements and bj is (that is, c < bj < d or d < bj < c). In this case, let f 1(bj ) be any element of UM that lies between f 1(c) and f 1(d). Since M |= δ5, such a f 1(bj ) exists.

After completing step n for all n N, the function f is completely defined. Since f (ai) is defined in Step i (part a) if not before, each ai is in the domain of f . Moreover, f (ai) is defined exactly once. So f has domain UM and is one-to- one. Also, since f 1(bi) is defined in Step i (part b) if not before, f is onto. By design, f preserves the order (ai < aj implies f (ai) < f (aj )). So f is a literal embedding. By Proposition 2.57, f is an isomorphism as was desired.

Corollary 5.26 TDLOE is complete.

Proof By Proposition 5.15, any 0-categorical theory having only infinite models is complete. So to show that TDLOE is complete, it su ces show that any dense linear order is necessarily infinite. If a linear order is finite, then it can be listed as a1 < a2 < · · · < an for some n N. Such a linear order is not dense since there is no element between a1 and a2. So TDLOE has only infinite models and is complete.

It follows from this corollary that any model of TDLOE is elementarily equivalent to R[0,1]. For example, suppose that we restrict the underlying set to the set of rational numbers in the interval [0, 1]. Let Q[0,1] denote the V<-structure having this set of rationals as its underlying set and interpreting < as the usual

First-order theories

213

order. This is a countable model of TDLOE . Since this theory is 0-categorical, it is essentially the only countable model of this theory. Any other countable model must be isomorphic to Q[0,1]. Moreover, since TDLOE is complete, Q[0,1] and R[0,1], although not isomorphic, are elementarily equivalent.

Proposition 5.27 For any uncountable cardinal κ, TDLOE is not κ-categorical.

Proof We define a model H[0,2] of TDLOE that has the same size as R[0,1], but is not isomorphic to R[0,1]. The structure H[0,2] is a hybrid of Q[0,1] and R[0,1]. Its universe is the union of the set of all rational numbers in the interval [0, 1] and the set of all real numbers in the interval [1, 2]. Again, this structure interprets < in the usual way. This structure models TDLOE . Since this theory is complete,

H[0,2] R[0,1]. Moreover,

|H[0,2]| = |Q[0,1]| + |R[0,1]| = 0 + 2 0 = 2 0 .

So H[0,2] has the same size as R[0,1]. To see that it is not isomorphic to R[0,1], note that, in R[0,1], there exist uncountably many elements between any two elements. This is not true in H[0,2]. So an isomorphism between these two models is impossible and TDLOE is not 2 0 -categorical. By Morley’s theorem 5.21, TDLOE is not κ-categorical for any uncountable κ.

Recall the V<-structures Q< and R< from 2.4.3. Since they have no endpoints, these are not models of TDLOE . We define the theory TDLO of dense linear orders without endpoints as the set of all V<-sentences that can be derived from the sentences δ1, δ2, δ3δ4, δ5, ¬δ6, and ¬δ7. We negate the sentences saying there exist a smallest and largest element. Both Q< and R< are models of this theory.

Corollary 5.28 TDLO is 0-categorical.

Proof Any model of TDLO can be extended to a model of TDLOE by adding smallest and largest elements to the underlying set. If there were non-isomorphic countable models of TDLO, then these could be extended to non-isomorphic countable models of TDLOE . Since TDLOE is 0-categorical, so is TDLO.

Corollary 5.29 TDLO is complete.

Proof This is the same as the proof of Corollary 5.26. Since dense linear orders are necessarily infinite, TDLO has no finite models. By Proposition 5.15, TDLO is complete.

Corollary 5.30 Q< R<.

Proof This follows immediately from the fact that Q< and R< are both models of the complete theory TDLO.

In light of these examples, and specifically of the proof of Proposition 5.25, we now investigate arbitrary countably categorical theories.

214

First-order theories

5.3.2 Ryll-Nardzewski et al. Categoricity is a property of theories that is defined in terms of the models of the theory. A theory T is countably categorical if and only if there is exactly one countable model (up to isomorphism) in Mod(T ). As we shall prove, there is a purely syntactic characterization of these theories. We show that a theory T is 0-categorical if and only if there are only finitely many formulas in n free variables up to T -equivalence. Equivalently, a V-structure M having universe U is 0-categorical if and only if, for each n N, only finitely many subsets of U n are V-definable.

Proposition 5.31 Let T be a complete V-theory. If there are only finitely many formulas in n free variables up to T -equivalence for each n, then T is0-categorical.

We prove this proposition using a back-and-forth argument. This method of proof constructs an isomorphism between two structures by alternating back and forth between the elements of each of the two underlying sets. An example of a back-and-forth argument is provided by the proof of Proposition 5.25, where it was shown that any two countable models of TDLOE are isomorphic. The proof of Proposition 5.31 resembles the proof of Proposition 5.25. In fact, Proposition 5.25 is a special case of this proposition.

Proof of Proposition 5.31 Suppose that there are only finitely many formulas in n free variables up to T -equivalence for each n. We show that T is0-categorical.

Let M and N be two models of T of size 0. Let UM and UN denote the underlying sets of M and N respectively. Enumerate these sets as

UM = {a1, a2, a3, . . .} and UN = {b1, b2, b3, . . .}.

We construct an isomorphism f : M → N step-by-step. In each step, we define f (ai) for two elements ai of UM .

For n N, let An be the set of all ai UM for which f (ai) has been defined in some step prior to step n. Since An is finite, we may regard it as a tuple of elements of UM . There are many ways to arrange the elements of a large finite set into a tuple. For any ai and aj in An, one of the two elements f (ai) and f (aj ) of UN must have been defined before the other. Let a¯n be the tuple obtained by arranging the elements of An in the order in which f was defined. Likewise, let

¯

Bn be the corresponding set of all f (ai) UN for ai An. Let bn be the tuple obtained by arranging the elements of Bn in the order in which f was defined.

Note that A1 = B1 = . Note too that, since T is complete, N |= ϕ if and only if M |= ϕ for any V-sentence ϕ.

For n N, assume that An and Bn have been defined in such a way that

| | ¯ V | |

M = ϕan) if and only if N = ϕ(bn) for any -formula ϕ in An free variables.

First-order theories

215

Step n: Part a. Let j be least such that aj is not in An. We define f (aj ).

Let k = |An| = 2(n − 1). By hypothesis, there exists a finite set F of V- formulas in k +1 free variables so that every V-formula having k +1 free variables is T -equivalent to a formula in F .

Let Φ(¯x, y) be the conjunction of those formulas ϕx, y) in F such that M |= ϕan, aj ). Then M |= yΦ(¯an, y).

This formula has |An| = k free variables. By the definitions of An and Bn,

¯

¯

N |= yΦ(bn, y). It follows that N |= yΦ(bn, bi) for some bi UN .

Let f (aj ) = bi.

be the tuple (¯an, aj ) and let ¯bn be the tuple (¯bn, bi), where

Part b. Let a¯n

¯

aj and bi are as defined in part (a). Let l be least such that bl is not in bn. As

in part (a), we can find an element a

i

of U

M

so that M = ϕa

, a

) if and only

|

n

l

 

 

|

n

i

 

if N = ϕ(¯b

, b

) for any ϕ in 2n free variables.

 

 

 

Define f (ai) to be bl.

After completing step n for each n N, the function f is completely defined. Since f (ai) is defined in Step i (part a) if not before, each ai is in the domain of f . Moreover, f (ai) is defined exactly once. So f has domain UM and is one-to- one. Also, since f 1(bi) is defined in Step i (part b) if not before, f is onto. By design, f preserves all V-formulas and is an isomorphism as was desired.

We now have two ways to show that a given theory is countably categorical. We can give a back-and-forth argument as we did for TDLOE in Proposition 5.25. Alternatively, we can show that there are only finitely many formulas in n free variables up to T -equivalence for each n. This may not seem practical. However, in Section 5.5 we discuss quantifier elimination and provide a systematic approach to understanding the definable subsets of certain structures.

The converse of Proposition 5.31 is also true. The countably categorical theories are precisely those theories that are complete and have few formulas (finitely many in n free variables for each n). Since this was proved by Ryll-Nardzewski in a 1959 paper, it is commonly referred to as the Ryll-Nardzewski theorem. Since it also appeared in 1959 in separate papers by Engeler and Svenonius, it is sometimes referred to as the Engeler–Ryll-Nardzewski–Svenonius theorem. We opt for brevity and refer to it as Theorem 5.32.

Theorem 5.32 A

complete

theory

T is

0-categorical if and only if,

for

each n N, there

are only

finitely

many

formulas in n free variables up

to

T -equivalence.

 

 

 

 

 

One direction of this theorem was proved as Proposition 5.31. We postpone the proof of the other direction until Chapter 6 where we shall see several equivalent characterizations of 0-categorical theories.

216 First-order theories

5.4 The Random graph and 0–1 laws

A random graph is a graph constructed by some random process such as rolling a die or flipping a coin. The idea of implementing random processes in graph theory was conceived by Paul Erd¨os and has served as a powerful tool for this and other areas of discrete mathematics. In this section, we discuss this idea and show how it gives rise to a complete first-order theory TRG in the vocabulary of graphs. We prove that TRG is 0-categorical. Whereas there are many possible finite random graphs, there is only one denumerable random graph. From this fact we deduce a 0–1 law for first-order logic.

We assume basic knowledge of probability.

Suppose that we have a set of vertices and want to build a graph. For example, suppose that we have five vertices v1, v2, v3, v4, and v5. To define the graph, we must decide which pairs of vertices share an edge. Let us take the random approach and flip a coin to make our decisions. Given any two vertices (v1 and v2, say) we flip a coin. If the coin lands heads up, then v1 and v2 share an edge. If the coin lands tails up, then they do not share an edge. We repeat this for every pair of vertices. Since there are five vertices, there are (5 · 4)/2 = 10 pairs of vertices to consider. After flipping the coin 10 times, we will have completed the graph.

Any graph having vertices v1, v2, v3, v4, and v5 is a possible outcome of this process. Since each of the ten flips of the coin has two possible outcomes, there are 210 possible graphs. If the coin is fair (landing heads up as frequently as tails up), then each of these 210 graphs is equally likely. However, two or more of the outcomes may be isomorphic graphs. So, up to isomorphism, some graphs are more likely than others. For example, the 5-clique is an unlikely outcome. To obtain this result, each of our 10 flips of the coin must land heads up. The probability of this happening is 1/210. It is more likely that the outcome will have exactly one edge. The probability of this happening is 10/210 (so this will happen roughly 1% of the time).

There are two ways to compute the probabilities in the previous paragraph. Suppose we want to compute the probability that the resulting graph has exactly m edges for some m ≤ 10. Using the formula for binomial probability distribu-

tions, this probability is

10

( 1 )10

(where

10

=

10!

is the number of

m

m

(10−m)!m!

 

2

 

 

 

ways that m of the 10 edges can be chosen). Alternatively, since each of the

 

 

 

 

210 graphs are equally likely, this probability can be computed by counting the number of graphs having exactly m edges and dividing this number by 210. For example the 5-clique is the only one of the 210 possible outcomes that has 10 edges. So the probability that this happens is 1/210.

More generally, suppose that we randomly construct a graph having vertices {v1, v2, . . . , vn} for some n N. There are n(n−1)/2 pairs of vertices to consider. To ease notation, denote n(n − 1)/2 by e(n) (this is the number of edges in

First-order theories

217

the n-clique). If we construct this graph by flipping a fair coin as before, then there are 2e(n) possible outcomes each of which is equally likely. Let ϕ be a VR-sentence. Let Pn(ϕ) be the probability that our randomly constructed graph models the sentence ϕ. This probability can be computed by counting the number of outcomes that model ϕ and dividing by the total number of possibilities 2e(n).

Example 5.33 Let ϕ be the sentence x y(x = y R(x, y)). For each n N, this sentence holds in only one of the 2e(n) graphs having vertices {v1, . . . , vn} (namely, the n-clique).

So Pn(ϕ) = 1/2e(n).

(In particular P5(ϕ) = 1/2e(5) = 1/210 as previously noted.) If n is big, then this probability is close to zero.

Example 5.34 Let ϕ be the VR-sentence saying that the graph has exactly one edge. For each n N, the number of graphs having vertices {v1, . . . , vn} that model this sentence is the number of possible edges e(n).

So Pn(ϕ) = e(n)/2e(n).

If n is big, then this probability is close to zero.

For any V-sentence ϕ and any n N, Pn(ϕ) + Pn(¬ϕ) = 1. This is because every graph either models ϕ or ¬ϕ (and not both). So Pn(¬ϕ) = 1 − Pn(ϕ). In the previous two examples, since Pn(ϕ) approaches zero, Pn(¬ϕ) approaches 1 as n gets large. We express this by using limit notation:

nlim Pn(ϕ) = 0

and

nlim Pn(¬ϕ) = 1.

→∞

 

→∞

The 01 Law for Graphs states

that,

for every VR-sentence θ, either

limn→∞ Pn(θ) = 0 or limn→∞ Pn(θ) = 1. So either θ or its negation almost certainly holds in any large finite graph. This fact imposes limitations on what can be expressed by a VR-sentence. For example, the 0–1 Law for Graphs implies that there is no VR-sentence that holds only in those finite graphs having an even number of vertices.

We have verified the 0–1 Law for Graphs for a couple of particular sentences in the above examples. To prove this law, we must consider some other (more complicated) VR-sentences. For each m N, let ρm be the VR-sentence

x1 · · · xm y1 · · · ym

m m

xi = yj

 

 

 

i

 

 

 

 

 

=1

j=1

 

 

→ z l

R(xi, z)

m

¬ (R(z, yj ) z = yj ) .

i

 

 

 

 

 

 

 

 

 

 

 

 

=1

 

 

j=1

 

 

 

218 First-order theories

This sentence asserts that given any two sets of vertices X = {x1, . . . , xm} and Y = {y1, . . . , ym} such that X ∩ Y = , there exists a vertex z not in X Y that shares an edge with each vertex in X and with no vertex in Y . Note that ρm2 implies ρm1 for m1 ≤ m2. We examine ρm for various values of m.

m = 1: The sentence ρ1 states that, for any vertices x and y there exists a vertex z that shares an edge with x but not y. Since x and z share an edge, they cannot be equal. Moreover, ρ1 asserts there exists such a z that is not equal to y. So any model of ρ1 must have at least three vertices corresponding to x, y, and z. Moreover, reversing the roles of x and y, there must exist a vertex w that shares an edge with y but not x. So any graph that models ρ1 must have at least four vertices. In fact, the smallest example has five vertices (take the sides of a pentagon as edges).

m = 2: Let G be a graph that models ρ2. Let a and b be two vertices of G. Then, there must exist a vertex c that shares and edge with a but not b and a vertex d that shares an edge with b but not a. Moreover, there must exist a vertex e that shares a vertex with both a and b and a vertex f that shares a vertex with neither a nor b. So there must be at least six vertices, but we are not done. There must also exist a vertex g that shares an edge with both e and d and with neither c nor d, and so forth. In contrast to ρ1, it is not an easy task to draw a graph that models ρ2 (nor is it easy to determine the minimal number of vertices for such a graph).

m > 2: It becomes increasingly di cult to demonstrate a finite graph that models ρm as m gets larger. We will not attempt to compute the precise value for Pn(ρm) for given m and n. However, there are some things that we can say with certainty regarding this value. As a first observation, a graph that models ρm must have many vertices. In particular, Pn(ρm) = 0 for n ≤ m. So if m is small, then so is Pn(ρm). Less obvious is the fact that, for big m, Pn(ρm) is close to 1. That is,

lim Pn(ρm) = 1.

n→∞

So although it is hard to give a concrete demonstration of a finite graph that models, say, ρ8, we have a process that will produce such a graph with high probability. If we construct a graph on the vertices {v1, . . . , vn} by flipping a coin, then we will most likely obtain a graph that models ρ8 provided that n is su ciently large. We prove this key fact as the following lemma.

Lemma 5.35 For each m N, limn→∞ Pn(ρm) = 1.

Proof Fix m N.

Given N N, we compute Pn(¬ρm) where n = N + 2m.

Let G be a graph having n vertices. Let X = {x1, . . . , xm} and Y = {y1, . . . , ym} be two sets containing m vertices of G such that X ∩ Y = . If

First-order theories

219

G |= ρm, then there exists a vertex z not in X or Y

such that z shares an

edge with each vertex in X and with no vertex in Y . We want to compute the probability that this is not the case.

For any vertex z of G that is not in X or Y , say that z “works” for X and Y if z shares and edge with each vertex of X and no vertex of Y . For this to happen, each flip of the coin must land heads up for the m pairs of vertices (z, xi) and tails up for the m pairs of vertices (z, yi). The probability of this happening is 1/22m. So given a particular z, it is unlikely that z works for X and Y . However, there are N possible vertices we may choose for z. Whereas the probability that any one of these does not work is (1 1/22m), the probability that all N of the vertices do not work is (1 1/22m)N .

Let k = (1 1/22m). Since k is between 0 and 1, limN→∞ kN = 0. So if N is large, then it is likely that there exists a vertex z that works for X and Y even though the probability that any particular z works is small.

This indicates that it may be likely that a large graph will model ρm. However, we have not finished the computation. For the graph G to model ρm, there must exist a vertex z that works for X and Y for all possible X and Y . Since

there are n = N + 2m vertices in G, there are

n

 

ways to choose the vertices

2m

 

in X Y . There are then

2m

 

 

 

m of these vertices for the set X

m

 

 

 

 

ways to choose

 

 

 

set Y ). In total, there are

 

(and the remaining m for

 

 

= N !m!m! < m!m! ≤ n2m

2m m

n

2m

 

n!

n2m

 

possible choices for X and Y . For each choice, the probability that no z works for X and Y is only kN . For G to model ¬ρm, this must happen for only one of these choices. Thus,

Pn(¬ρm) ≤ n2mkN = n2mkn/k2m.

Since many of the possible choices for X and Y overlap, this is definitely an overestimate of this probability. However, this estimate serves our purpose.

Fact lim n2mkn = 0.

n→∞

 

 

 

 

 

 

This

fact

follows solely from the fact

that k

<

1. Using

calculus,

it is easy

to

see that the function x2mkx

reaches

a

maximum

at x =

2m/ ln k. To see that this function then decreases to zero, repeatedly apply L’Hopital’s rule (2m times) to the expression x2m/k−x having indeterminate

form .

Finally, since limn→∞ Pn(¬ρm) = 0, limn→∞ Pn(ρm) = 1.