In 17.4 Connectivity and Recursive Graph Traversal, we studied an algorithm for checking whether two vertices in a graph are connected. We can adapt this algorithm to determining whether an entire graph is connected—that is, whether every pair of vertices in the graph is connected. However, running such an algorithm might not be necessary for every graph. Intuitively, since connectivity is based on paths between vertices, which in turn are built from edges, it is natural to think that we can “force” a graph to be connected by simply adding more edges to it. In the extreme case, a complete graph of \(n\) vertices with \(\frac{n(n-1)}{2}\) edges is guaranteed to be connected, since all vertices are adjacent to each other. On the other extreme, a graph with \(n \geq 2\) vertices and 0 edges is guaranteed to not be connected.
So in this section, we’ll perform some mathematical analysis on graphs to answer the following question: how many edges does it take to ensure that a graph is connected?
You might wonder what we mean precisely by the previous question. One way to better understand it is to translate it into predicate logic. Here is our translation, with a blank indicating a missing part:
\[ \forall n \in \Z^+,~ \forall G = (V, E),~ |V| = n \land |E| \geq \underline{\quad\quad\quad} \Rightarrow \text{$G$ is connected} \]
Now, we could fill in the blank with \(\frac{n(n-1)}{2}\), since that would force the graph \(G\) to be complete, and therefore connected. But that would be avoiding the deeper question—we want to find the smallest expression (in terms of \(n\)) that makes the statement True.
Now you might wonder how we could come up with the right expression to fill in the blank. Here’s one first step: let’s consider a graph that has many edges, but has one vertex with no edges at all.
Let \(n \in \Z^+\), and assume \(n > 1\). Then there exists a graph \(G = (V, E)\), such that \(|V| = n\) and \(|E| = \frac{(n-1)(n-2)}{2}\), and \(G\) is not connected.
\[\forall n \in \Z^+,~ n > 1 \Rightarrow \left(\exists G = (V, E),~ |V| = n \land |E| = \frac{(n-1)(n-2)}{2} \land \text{$G$ is not connected}\right).\]
Here, we are asked to show that for any \(n\), there is a graph with \(n\) vertices and \(\frac{(n-1)(n-2)}{2}\) edges, but which is still not connected.
So how do we prove this? Because of the existential quantifier, we can choose the graph, though we are constrained by the number of vertices and edges the graph must have. The expression \(\frac{(n-1)(n-2)}{2}\) is a big hint, as it looks suspiciously like the maximum number of edges on \(n - 1\) vertices…
Let \(n \in \Z^+\), and assume \(n > 1\). Let \(G = (V, E)\) be the graph defined as follows:This is the first time we’re defining a concrete graph in a proof, rather than introducing an arbitrary graph.
We need to show three things:
For (i), we have explicitly labelled the \(n\) vertices in \(V\), and so it is clear that \(|V| = n\).
For (ii), we have chosen all possible pairs of vertices from \(\{v_1, v_2, \dots, v_{n-1}\}\) for the edges. There are exactly \(\frac{(n-1)(n-2)}{2}\) such edges.
For (iii), because \(v_n\) is not adjacent to any other vertex, it cannot be connected to any other vertex. So \(G\) is not connected.
We have now proved that a graph with a fairly large number of edges can still not be connected. It is worth noting that \(\frac{(n-1)(n-2)}{2} = \frac{n(n-1)}{2} - (n - 1)\). That is, there is a graph that is missing only \(n - 1\) edges from the set of all possible of edges, but is still not connected. The question becomes: can we go higher still? Is it possible for a graph on \(n\) vertices to have more than \(\frac{(n-1)(n-2)}{2}\) edges and yet still be not connected? Or is the best possible expression from our original question indeed \(\frac{(n-1)(n-2)}{2} + 1\)?
It turns out that the latter is true, and this will be the last, and most challenging, proof we do in this section.
Let \(n \in \Z^+\). For all graphs \(G = (V, E)\), if \(|V| = n\) and \(|E| \geq \frac{(n-1)(n-2)}{2} + 1\), then \(G\) is connected.
\[\forall n \in \Z^+,~ \forall G = (V, E),~ \left(|V| = n \land |E| \geq \frac{(n-1)(n-2)}{2} + 1\right) \Rightarrow \text{$G$ is connected}.\]
It is tempting for us to base our proof on the previous example: after all, if we start with a graph that has \(n - 1\) of its vertices all adjacent to each other, and then add one more edge to the remaining vertex, the new graph is certainly connected. However, this line of thinking relies on a particular starting point for the structure of \(G\), which we cannot assume anything about (other than the number of vertices and edges, of course).
The problem is that even with these restrictions on the number of edges and vertices, it is hard to conceptualize enough common structure among such graphs to use in a proof.If that’s too abstract, just imagine trying to complete the statement “Every graph with \(n\) vertices and at least \(\frac{(n-1)(n-2)}{2} + 1\) edges is/has…”
What is more promising, though, is trying to take a graph which satisfies the constraints on its number of edges and vertices, and then remove a vertex to make the graph smaller, and argue two things:
This idea of “removing a vertex” from a graph to make the problem smaller and simpler is reminiscent of one of our proofs in 17.2 Some Properties of Graphs. We’ll proceed by a proof by induction on the number of vertices of the graph. You’ll notice that the inductive step in this proof is more complicated, and is split up into cases, and involves a sub-proof inside. As you read through this proof, look for both the structure as well as content of the proof: both are vital to understand.
We will proceed by induction on \(n\). More precisely, define the following predicate over the positive integers: \[P(n): \forall G = (V, E),~ \left(|V| = n \land |E| \geq \frac{(n-1)(n-2)}{2} + 1\right) \Rightarrow \text{$G$ is connected}.\] In words, \(P(n)\) says that every graph \(G\) with \(n\) vertices and at least \(\frac{(n-1)(n-2)}{2} + 1\) edges must be connected. We want to prove that \(\forall n \in \Z^+,~ P(n)\) using induction.
Base Case: Let \(n = 1\).
This is a good exercise in substitution: \[P(1): \forall G = (V, E),~ (|V| = 1 \land |E| \geq 1) \Rightarrow \text{$G$ is connected}\]
This statement is vacuously true: no graph exists that has only one vertex and at least one edge, since an edge requires two vertices.
Inductive Step: Let \(k \in \Z^+\), and assume that \(P(k)\) holds. We need to prove that \(P(k+1)\) also holds, i.e.: \[P(k+1): \forall G = (V, E),~ \left(|V| = k + 1 \land |E| \geq \frac{k(k-1)}{2} + 1\right) \Rightarrow \text{$G$ is connected}.\]
Let \(G = (V, E)\), and assume that \(|V| = k + 1\) and \(|E| \geq \frac{k(k-1)}{2} + 1\). We now need to prove that \(G\) is connected. We will split up this proof into two cases.
Case 1: Assume \(|E| = \frac{(k+1)k}{2}\), i.e., \(G\) has all possible edges. In this case, \(G\) is certainly connected.
Case 2: Assume \(|E| < \frac{(k+1)k}{2}\). We now need to prove the following claim.
\(G\) has a vertex in \(G\) with between one and \(k - 1\) neighbours, inclusive.Since there are \(k+1\) vertices total, this claim is saying that there exists a vertex that has at least one neighbour, but not the maximum number of neighbours.
Since \(G\) has fewer than the maximum number of possible edges, there exists a vertex pair \((u, v)\) which is not an edge. Both \(u\) and \(v\) have at most \(k - 1\) neighbours, since there are \(k - 1\) vertices in \(G\) other than these two.
We leave showing that both \(u\) and \(v\) have at least one neighbour as an exercise.
Using this claim, we let \(v\) be a vertex which has at most \(k - 1\) neighbours. Let \(G' = (V', E')\) be the graph which is formed by taking \(G\) and removing \(v\) from \(V\), and all edges in \(E\) which use \(v\). Then \(|V'| = |V| - 1 = k\), i.e., we’ve decreased the number of vertices by one. This is good because we’re trying to do induction on the number of vertices.
However, in order to use \(P(k)\), we need not just that the number of vertices to be \(k\), but that the number of edges is at least \(\frac{(k-1)(k-2)}{2} + 1\).Remember that \(P(k)\) is an implication: if the graph has the appropriate number of vertices and edges, then it is connected. This is what we’ll show next. \[\begin{align*} |E'| &= |E| - \text{number of removed edges} \\ &\geq |E| - (k - 1) \tag{at most $k - 1$ edges removed} \\ &\geq \frac{k(k-1)}{2} + 1 - (k - 1) \tag{assumption on $|E|$} \\ &= \frac{(k-2)(k-1)}{2} + 1 \end{align*}\]
Now that we have this, we can finally use the induction hypothesis: since \(|V'| = k\) and \(|E'| \geq \frac{(k-2)(k-1)}{2} + 1\), we conclude that \(G'\) is connected.
Finally, let us use the fact that \(G'\) is connected to show that \(G\) is also connected. First, any two vertices not equal to \(v\) are connected in \(G\) because they are connected in \(G'\). What about \(v\), the vertex we removed from \(G\) to get \(G'\)? Recall our claim: \(v\) has at least one neighbour, so call it \(w\). Then \(v\) is connected to \(w\), but because \(G'\) is connected, \(w\) is connected to every other vertex in \(G\). By the transitivity of connectednessproved in 17.2!, we know that \(v\) must be connected to all of these other vertices.
Now that we’ve gone through the proof of that statement, we’ll take a moment to step back to see what we’ve done. We started this section by observing that we can implement an algorithm for determining whether a graph is connected or not. But now, we’ve just proved that every graph with \(n\) vertices and at least \(\frac{(n-1)(n-2)}{2} + 1\) edges must be connected.
So for example, if we have a graph with 1000 vertices and 498505 edges, then we can conclude that it must be connected, because \(498505 \geq \frac{999 \cdot 998}{2} + 1\). This is true even if we know nothing else about the graph, and we don’t need to run a recursive algorithm (or any other algorithm) to determine this. It is similarly possible to prove that every graph with \(n\) vertices and at most \(n - 2\) edges must not be connected. And so if we are given a graph with 1000 vertices and only 800 edges, we know that it cannot be connected, even if we know nothing else about it.
Now, do these types of statements invalidate the work we did in the previous section to develop a recursive algorithm for checking connectivity? Absolutely not! For a graph of \(n\) vertices, there is a big range of possible edge numbers between \(n - 1\) and \(\frac{(n-1)(n-2)}{2}\), and our above observations cannot tell us anything about such graphs. The key takeaway is that theoretical, mathematical analysis of graphs can lead to more efficient algorithms—at least some of the time, depending on the purpose of the algorithm.This lesson is very reminiscent of our work all the way back in Chapter 4.