Graphs and Combinatorics DOI 10.1007/s00373-013-1347-3 ORIGINAL PAPER
Connector Families of Graphs Gérard Cohen · Emanuela Fachini · János Körner
Received: 28 July 2011 / Revised: 4 January 2013 © Springer Japan 2013
Abstract For every pair of fixed natural numbers k > l we consider families of subgraphs of the complete graph K n such that each graph in the family has at least k connected components while the union of any two has at most l. We show that the cardinality of such a family is at most exponential in n and determine the exact exponential growth of the largest such families for every value of k and l = 1. Keywords Connected graph · Pairwise union · Graph capacity · Information theory 1 Introduction The union of two graphs with the same vertex set is a graph on their common vertex set having as edge set the union of the edge sets of the two. Let F be a family of disconnected graphs with fixed vertex set [n] = {1, 2, . . . , n} and with the property that the pairwise union of any two different graphs from F is connected. It is immediately seen that F has at most 2n−1 − 1 member graphs and this maximum can be achieved by the family of the complements of all the complete bipartite graphs on [n] as it will be shown below (2). If, however, we ask the analogous question for a family
G. Cohen ENST, Paris, France e-mail:
[email protected] E. Fachini “La Sapienza” University of Rome, Rome, Italy e-mail:
[email protected] J. Körner (B) Department of Computer Science, University of Rome, La Sapienza, via Salaria 113, 00198 Rome, Italy e-mail:
[email protected]
123
Graphs and Combinatorics
of even more disconnected graphs, for example those having at least three connected components, then the answer is far less obvious. Somewhat surprisingly, questions of this kind can be answered in the framework of Shannon (and Sperner) capacity of families of graphs, as developed in the series of papers leading from [2] to [6], cf. also [3]. In other words, these purely combinatorial questions have close ties with Shannon’s information theory. In this introductory part of our paper let us be more specific about this interesting relationship. The problems just mentioned belong to the following general framework. Let F and D be two disjoint families of graphs on the same vertex set [n]. We are interested in the largest cardinality M(F, D) of a subfamily G ⊆ F for which the union of any two different member graphs of G is in D. The study of this class of problems was initiated quite recently by Messuti, Simonyi and the third author in [10]. In the central problem of [10] the family F = Fn consisted of all the Hamilton paths in K n with various choices of D. It is shown in [10] that if D1 is the family of all the subgraphs of K n containing an odd cycle, then n M(Fn , D1 ) = n/2 if n is odd, and M(Fn , D1 ) =
1 n 2 n/2
if n is even. However, things change when we consider D0 , the family of those subgraphs of K n that contain an even cycle. In fact, in this case n! M(Fn , D0 ) ≥ n . 2
(In [10] a slightly better lower bound is proved by a greedy algorithm.) In more general questions of the same kind we always experience this second kind of situation, superexponential growth. More precisely, in [10] it is shown that if D(c) is the family of those subgraphs of K n that contain a cycle of length divisible by the natural number c, then for some d ∈ (0, 1) depending on c we have M(Fn , D(c)) ≥ (dn)!. (Note that D0 = D(2).) Further, if D(c) is the family of those subgraphs of K n that contain a cycle of length not divisible by c, then again, for c > 2, perhaps surprisingly, M(Fn , D(c)) ≥ (dn)!. (Note that D1 = D(2).)
123
Graphs and Combinatorics
Finally, if Dˆ 4 is the family of subgraphs of K ncontaining a clique on 4 vertices, n then it is easily seen by Fekete’s lemma [11] that M(Fn , Dˆ 4 ) has a limit we denote by q4 . It was shown in [10] that √ 4
2 ≤ q4 ≤
3 . 2
(1)
As these examples suggest, it is not always clear from the outset in which cases the growth of the largest family is only exponential. Even when it is, the determination of the exponent can be very hard. The central problem discussed and solved in this paper represents an exception. Note that logarithms and exponentials are to the base 2. In particular, the binary entropy function h : [0, 1] → [0, 1] is defined as h(t) = −t log t − (1 − t) log(1 − t) when t ∈ (0, 1) and is extended to the endpoints of its domain by continuity to yield t (0) = t (1) = 0. 2 Intersection Problems? At a first glance our framework might seem just a reformulation of that of Simonovits and Sós, for the famous intersection problems introduced in [13]. Within their framework, in 1976 these authors formulated a beautiful conjecture about triangle intersecting graph families. A graph family is called triangle intersecting if the intersection of any two if its member graphs contains a triangle. According to the conjecture, the triangle intersecting graph family of maximum cardinality is unique (up to isomorphism) and has a so-called kernel structure, meaning that all of its member graphs contain a fixed triangle. This conjecture was proved in a brilliant paper by Ellis, Filmus and Friedgut [4]. In order to make comparison easier with our present framework, we prefer to formulate the general setup for (graph) intersection problems of Simonovits and Sós in the complementary terminology of unions rather than intersections. Given a family F of graphs, we ask for the largest cardinality of a subfamily C ⊆ F for which the pairwise union of its different member graphs is also in F. In case of triangleintersection F is the family of those graphs on [n] which contain a stable set of size 3. So these problems are one-family problems, and a kernel structure is always available as a possible candidate for the (often unique) optimal family. A fundamental feature of these problems is that the defining relation is one of similarity. The requested relation between two graphs, that of having their union in F, is true for every graph from F and itself; it is reflexive. This makes it possible to define kernel structured families, and these in turn often are candidate solutions. Formally, in the present terminology, intersection problems ask for the determination of M(F, D) in the special case F = D. In our problems, however, the defining relation is one of diversity and needs two disjoint graph families for its formulation. Our condition implies that for the different member graphs of C ⊆ F their pairwise union is outside of F, and in a sense, even “far” from it. This condition is not satisfied by any member of F in pair with itself. Thus our relation is irreflexive. For this precise reason, in our case one cannot define
123
Graphs and Combinatorics
kernel structures, and in particular, no candidate solution can be expected in these terms. Our problem is neither equivalent nor more general than that of Simonovits and Sós, rather, it is of a fundamentally different nature. In fact, kernel structures are concentrated, whereas in our problems the optimal solution must be scattered, “code-like”. We feel that this area of extremal combinatorics, initiated in [10] is as rich as that of intersection problems. It is our intention to create through them a bridge between extremal combinatorics and information theory. We expect that for many of these new problems information theory helps bringing us closer to the solution. 3 Connector Families Let C(k) = C(k, n) be the family of those subgraphs of K n which have at least k > 1 connected components. Thus, in particular, these graphs are disconnected. We say that a family G ⊆ C(k, n) is a connector family if the union of any two of its members is connected. We are interested in the largest cardinality of a connector family, asymptotically in n and as a function of k. Clearly, this problem can be phrased within our previous framework. Let D be the family of all the connected graphs. Then the largest cardinality of a connector family is just M(C(k, n), D). We have Theorem 1 For every k > 0 the largest size M(C(k, n), D) of a connector family is exponential in n and the asymptotic exponent is 1 n , lim log M(C(k, n), D) = h n→∞ k where h : [0, 1] → [0, 1] is the binary entropy function. Proof We immediately see that a connector family of maximum cardinality M(C(k, n), D) may entirely consist of complements of complete k-partite graphs. In fact, let us fix k > 1 and a connector family of maximum cardinality. If the family does not entirely consist of complements of complete k-partite graphs, then let G be a member of it that fails to have this structure. Clearly, if we add edges to it so as to have all its connected components complete, the resulting new graph can not be in the family yet, as it would not satisfy the union condition with the previous graph. On the other hand, with all the remaining graphs in the family it has a union which is a connected graph, (since it contains as a subgraph the previous union, itself a connected graph.) If the resulting new graph has more than k connected components, then we can add new edges so as to arrive at the complement of a complete k-partite graph that continues to yield connected unions with the remaining members of the family. But then we can replace the original G with this new graph. Iterating the process, we prove our claim. Notice that if k = 2, then the union of the complements of two complete bipartite graphs is connected. This shows that M(C(2, n), D) = 2n−1 − 1 as claimed.
123
(2)
Graphs and Combinatorics
The previous observation allows us to represent each member graph of our maximum size connector family by a string of length n over the alphabet [k] in the obvious manner. As a matter of fact, the member graphs have cliques (complete graphs) as connected components, and thus are uniquely determined by a k-partition of their vertex set [n]. Namely, we number (order) the connected components of our graphs in an arbitrary manner and associate to the graph G the k-ary characteristic vector x1 x2 . . . xn of the underlying partition so as to have xi = j if the vertex i ∈ [n] belongs to the class j of the partition. Even though our definition seems arbitrary (because the ordering was arbitrary), yet this will do no harm. Let us first prove that lim inf log n→∞
n
M(C(k, n), D) ≤ h
1 k
(3)
To this end, just notice that two different partitions representing two different graphs cannot have a class in common, otherwise the underlying set would be a connected component in the union graph which therefore could not be connected. In particular, the smallest classes of each of the different partitions must all differ. Thus to each partition we can associate a set of size at most n/k so that the sets associated to different partitions are all different.Thus M(C(k, n), D) is bounded from above by the number i≤n/k ni ≤ exp[nh k1 ] which establishes the upper bound. (For a proof of the last inequality, the reader might consult Chapter 2 of [3]). To prove the lower bound, we will construct, for every n, a set of sequences Fˆn ⊆ [k]n with the property that Fˆn induces a clique of maximal size in the n’th power of every graph in the family G of complete bipartite graphs on [k]. We will use a slightly modified version of the sequences in Fˆn to define, as before, graphs whose connected components are cliques, and show that the resulting family of graphs is a connector family. As a straightforward application of the main result from [6] we can find a set Fˆn with the desired property and a size at least as in the claimed lower bound. Next we give the details. Let Fˆn−k ⊆ [k]n−k be a set of strings with the property that for any two of its different elements, xˆ and yˆ , and any non-empty subset A ⊂ [k] there is a coordinate i for which exactly one of xi and yi is in A. For each xˆ ∈ [k]n−k we construct a new sequence x ∈ [k]n by suffixing to it the k elements of [k] in an increasing order. Let Fn ⊆ [k]n be the set of the so obtained new sequences. We claim that the pairwise unions of the corresponding graphs are connected. As a matter of fact, the union of the two graphs represented by the sequences x and y is disconnected if and only if there is a bipartition of the coordinate set [n] which is refined by both of the partitions represented by x and y, respectively. We have to show that such a bipartition does not exist. For this purpose suppose that {B, B} is such a (disconnecting) bipartition for the coordinates of x and let A ⊆ [k] be the set of elements of [k] present in the coordinates in B. This implies that none of the elements of A is present in the coordinates belonging to B. On the other hand, by construction of our set of sequences there is an i ∈ [n] for which {xi , yi } = {a, b} for some a ∈ A and b ∈ A. By construction, we also have coordinates s ∈ [n] and t ∈ [n] for which xs = ys = a and xt = yt = b. Then our hypothesis on x implies that s ∈ B and t ∈ B. Since {xi , yi } = {a, b}, we have
123
Graphs and Combinatorics
either xi = a and yi = b or vice versa. Let us first look at the first case. Recall that by the definition of the set of coordinates B the equality xi = a implies i ∈ B. By assumption, t ∈ B. But then the vertices i and t are adjacent in the graph represented by the sequence y and thus in the union graph there is an edge connecting B and B. In the second case we have xi = b and yi = a. This implies that i ∈ B. But in the sequence y the value of the coordinate i is a just as that of the coordinate s, so that the corresponding vertices i and s are adjacent in the graph defined by y and thus they are adjacent in the union of our two graphs. This means that also in this case there is an edge in the union between vertices in B and B. It remains to explain how the main result from [6] on the Shannon capacity of a family of graphs allows us to construct a suitably large set Fˆn−k with the claimed property. For every non-empty A ⊂ [k] we consider the complete bipartite graph G A with vertex set [k] and partition classes (maximal stable sets) A and [k] − A. The n’th power of this graph G nA has vertex set [k]n . By the definition of power graphs two vertices x ∈ [k]n and y ∈ [k]n are adjacent in the graph G nA if there exists a coordinate i ∈ [n] for which xi and yi are adjacent in G A . In other words the set Fn induces a clique in the n’th power of every bipartite graph with vertex set [k]. We can rephrase this in the language of capacity of graph families. Let G be the family of all bipartite graphs with vertex set [k]. The set Fn is simply a clique in (the n’th power of) each member graph of G. The exponential asymptotics of the largest possible cardinality of such a clique is by definition [6] the capacity of the graph family G. For the reader’s convenience we define the capacity of a family of graphs in the sense of [6] and [7]. For a detailed recent exposition of this problem area we refer to [3]. A finite graph G is a couple of sets {V (G), E(G)} where E(G), the edge set of G, is a set of unordered couples of distinct elements of the vertex set V (G). We associate with every graph the sequence of its power graphs, where the vertex set of G n is the set [V (G)]n of n-length sequences from V (G). Two such sequences, x = x1 x2 . . . xn and y = y1 y2 . . . yn form an edge {x, y} in G n if for at least one coordinate i ∈ [n] the edge {xi , yi } belongs to E(G). Let now G be a family of graphs with common vertex set V . Let N (G, n) be the largest cardinality of a subset of V (G n ) that induces a clique in each of the power graphs G n with G ∈ G. Here a clique is a graph in which every pair of vertices is connected√by an edge. Then, by the already cited lemma of Fekete, the (binary) logarithm of n N (G, n) has a limit, called the Shannon capacity of the family of graphs G and denoted by C(G). In [6] a max–min formula for this limit is proved. This formula plays a central role in our proof. In stating it we will skip some technicalities. For details, the reader is advised to consult [6] or [3]. Let P be an arbitrary probability distribution on V = V (G) and let N (G, n, P, ) be the largest cardinality of a clique the graph G n induces on the set of sequences from V (G n ) in which every element v ∈ V occurs “approximately” n P(v) times in the sense of the number of occurrences being not more than n away from n P(v). We define C(G, P) = lim lim log →0 n→∞
The main theorem in [6] asserts that
123
n
N (G, n, P, ).
Graphs and Combinatorics
C(G) = max min C(G, P).
(4)
G∈G
P
For the complete bipartite graph Q with partition classes B and B, for every probability distribution P on its vertex set [k] we have C(Q, P) = h(P(B)) where h is the binary entropy function. It follows by (4) that C(G) = max min h(P(B)) = h P
B⊆[k]
1 k
since, by symmetry, the optimizing distribution P is the equidistribution on [k]. This then implies the existence of a clique Fn ⊆ [k]n in G with |Fn | ≥ exp2
1 n(h − n ) k
with n → 0. This completes the proof of the upper bound (3).
(5)
4 Related Questions We have solved the problem about connector families relying on the concept of capacity for graph families, introduced in [2] as well as on the corresponding results from [6]. There might be many other cases where these tools can help. This is of interest as it clearly shows how close our new problems are to Shannon’s zero-error information theory. We will introduce here a second example. As before, let us fix the natural numbers k < l, with l ≤ k 2 . Let Bk be the family of all graphs with vertex set [n] whose chromatic number is at most k and let Dl be the family of those graphs on the same vertex set whose chromatic number is at least l. We will say that a family G ⊆ Bk is k-to-l-jumping if the union of any two of its graphs belongs to Dl . Once again, we are interested in the maximum cardinality M(Bk , Dl ) of such a graph family. Similarly to the case of connector families, it is immediately seen that the largest family can consist entirely of complete k-partite graphs. As a matter of fact, it is clear that for no two graphs in the family can one be a subgraph of the other or else the union of the two would be equal to the larger graph, and have its chromatic number. On the other hand, replacing a graph in an optimal graph family with a complete k-partite graph corresponding to one of its k-colorings cannot destroy the condition on the chromatic numbers of the pairwise union graphs. This establishes our claim. Notice next that complete k-partite graphs can be thought of as k-partitions of the ground set [n]. Take two such graphs from a family that satisfies the union condition. The condition just means that the common refinement of the two partitions has at least l non-empty classes. This problem was explicitly stated and in some cases solved in [8]. (A better exposition of the general problem is [7]). In particular, one of the central results in [6] shows that
123
Graphs and Combinatorics
lim log
n→∞
n
M(Bk , Dk 2 ) =
k . 2
(6)
This answered a long-standing open question commonly attributed to Alfréd Rényi about the largest cardinality of a family of qualitatively independent k-partitions of an n-set. A trivial observation can be rephrased as kn n! , D ) ≤ ≤ M(B . k k+1 [(n/k)!]k k! k! In fact, since the edge sets of complete multi-partite graphs are maximal with respect to graphs with the same chromatic number, the union of any two of them has a larger chromatic number than any of the two. The precise value of M(Bk , Dk+1 ) is a Stirling number of the second kind. Our hitherto results are concerned with graph families defined on the common vertex set [n] but lying, with their pairwise unions on the two different sides of a finite (independent of n) threshold for some numerical invariant (number of connected components or chromatic number). One might think that this is the reason why the cardinality of the largest families for each of these problems grows only exponentially with n. This is not the case, however, as shown by the following example. We define, as usual, the radius of a graph in terms of the distance dG (x, y) between vertices x ∈ V (G) and y ∈ V (G), where dG (x, y) is the number of edges in the shortest path between these vertices. Then, for every vertex x ∈ V (G) we define r (G, x) as the largest distance any vertex of G has from x. The radius r (G) is simply the minimum of r (G, x) as x runs over all the vertices of G. In particular, the radius is finite if and only if the graph is connected. Let now R2 be the family of all the graphs with vertex set [n] and radius at least 2 and let D∗ be the family of all graphs on the same vertex set but with radius 1. We claim Proposition 1 For n even M(R2 , D∗ ) ≥
n! . (n/2)!2n/2
Proof Once again in an optimal family F ⊆ R2 we cannot have two graphs containing one another, and also, any graph G ∈ F can be replaced by a graph from R2 − F containing it. Thus we can suppose that F consists of maximal graphs from R2 . But then, by the very nature of maximality, the union of any two maximal elements of R2 has radius 1. To complete the proof, we just have to notice that those graphs on vertex set [n] which are the complements of a perfect matching are maximal with respect to n!
having radius 2. The number of perfect matchings in [n] is (n/2)!2 n/2 . It is not hard to see that the above bound is tight, but this is not our point here. The results of [10] reported in the introduction show that some of these questions can be far more intriguing if we restrict attention to a particular class of graphs such as Hamilton paths.
123
Graphs and Combinatorics Acknowledgments We are extremely thankful to an anonymous referee who suggested the simple proof of the upper bound in Theorem 1 in replacement of our complicated original argument and, perhaps even more importantly, saved us from three annoying errors in our original manuscript. Interesting discussions with Silvia Messuti are gratefully acknowledged.
References 1. Brightwell, G., Cohen, G., Fachini, E., Fairthorne, M., Körner, J., Simonyi, G., Tóth, Á.: Permutation capacities of families of oriented infinite paths. SIAM J. Discrete Math. 2(24), 441–456 (2010) 2. Cohen, G., Körner, J., Simonyi, G.: Zero-error capacities and very different sequences. In: Capocelli, R.M. (ed.) Sequences. Combinatorics, Security and Transimission, Advanced International Workshop on Sequences, Positano, Italy, June 1988, pp. 144–155. Springer, New York (1990) 3. Csiszár, I., Körner, J.: Information Theory: Coding Theorems for Discrete Memoryless Systems, 2nd edn. Cambridge University Press, Cambridge (2011) 4. Ellis, D., Filmus Y., Friedgut, E.: Triangle-intersecting families of graphs. Eur. J. Math. 3(14), 841–885 5. Erd˝os, P., Ko, C., Rado, R.: Intersection theorems for systems of finite sets. Quart. J. Math. Oxford Ser. 2 12, 313–320 (1961) 6. Gargano, L., Körner, J., Vaccaro, U.: Sperner capacities. Graphs Comb. 9, 31–46 (1993) 7. Gargano, L., Körner, J., Vaccaro, U.: Capacities: from information theory to extremal set theory. J. Comb. Theory Ser. A 68(2), 296–315 (1994) 8. Gargano, L., Körner, J., Vaccaro, U.: On the capacity of Boolean graph formulae. Graphs Comb. 11, 29– 48 (1995) 9. Körner, J., Malvenuto, C.: Pairwise colliding permutations and the capacity of infinite graphs. SIAM J. Discrete Math. 20, 203–212 (2006) 10. Körner, J., Messuti, S., Simonyi, G.: Families of graph-different Hamilton paths. SIAM J. Discrete Math. 26, 321–329 (2012) 11. van Lint, J.H., Wilson, R.M.: A Course in Combinatorics. 2nd edn. Cambridge University Press, Cambridge (2001) 12. Shannon, C.E.: The zero-error capacity of a noisy channel. IRE Trans. Inform. Theory 2, 8–19 (1956) 13. Simonovits, M., Sós, V.T.: Intersection theorems on structures. Annals Discrete Math. 6, 301– 313 (1980)
123