
In network theory, a giant component is a connected component of a given random graph that contains a significant fraction of the entire graph's vertices.
More precisely, in graphs drawn randomly from a probability distribution over arbitrarily large graphs, a giant component is a connected component whose fraction of the overall number of vertices is bounded away from zero. In sufficiently dense graphs distributed according to the Erdős–Rényi model, a giant component exists with high probability.
Giant component in Erdős–Rényi model
Giant components are a prominent feature of the Erdős–Rényi model (ER) of random graphs, in which each possible edge connecting pairs of a given set of n vertices is present, independently of the other edges, with probability p. In this model, if
p
≤
1
−
ϵ
n
{\displaystyle p\leq {\frac {1-\epsilon }{n}}}
for any constant
ϵ
>
0
{\displaystyle \epsilon >0}
, then with high probability (in the limit as
n
{\displaystyle n}
goes to infinity) all connected components of the graph have size O(log n), and there is no giant component. However, for
p
≥
1
+
ϵ
n
{\displaystyle p\geq {\frac {1+\epsilon }{n}}}
there is with high probability a single giant component, with all other components having size O(log n). For
p
=
p
c
=
1
n
{\displaystyle p=p_{c}={\frac {1}{n}}}
, intermediate between these two possibilities, the number of vertices in the largest component of the graph,
P
inf
{\displaystyle P_{\inf }}
is with high probability proportional to
n
2
/
3
{\displaystyle n^{2/3}}
.[1]
Giant component is also important in percolation theory.[1][2] When a fraction of nodes,
q
=
1
−
p
{\displaystyle q=1-p}
, is removed randomly from an ER network of degree
⟨
k
⟩
{\displaystyle \langle k\rangle }
, there exists a critical threshold,
p
c
=
1
⟨
k
⟩
{\displaystyle p_{c}={\frac {1}{\langle k\rangle }}}
. Above
p
c
{\displaystyle p_{c}}
there exists a giant component (largest cluster) of size,
P
inf
{\displaystyle P_{\inf }}
.
P
inf
{\displaystyle P_{\inf }}
fulfills,
P
inf
=
p
(
1
−
exp
(
−
⟨
k
⟩
P
inf
)
)
{\displaystyle P_{\inf }=p(1-\exp(-\langle k\rangle P_{\inf }))}
. For
p
<
p
c
{\displaystyle p<p_{c}}
the solution of this equation is
P
inf
=
0
{\displaystyle P_{\inf }=0}
, i.e., there is no giant component.
At
p
c
{\displaystyle p_{c}}
, the distribution of cluster sizes behaves as a power law,
n
(
s
)
{\displaystyle n(s)}
~
s
−
5
/
2
{\displaystyle s^{-5/2}}
which is a feature of phase transition.
Alternatively, if one adds randomly selected edges one at a time, starting with an empty graph, then it is not until approximately
n
/
2
{\displaystyle n/2}
edges have been added that the graph contains a large component, and soon after that the component becomes giant. More precisely, when t edges have been added, for values of t close to but larger than
n
/
2
{\displaystyle n/2}
, the size of the giant component is approximately
4
t
−
2
n
{\displaystyle 4t-2n}
.[1] However, according to the coupon collector's problem,
Θ
(
n
log
n
)
{\displaystyle \Theta (n\log n)}
edges are needed in order to have high probability that the whole random graph is connected.
Graphs with arbitrary degree distributions
A similar sharp threshold between parameters that lead to graphs with all components small and parameters that lead to a giant component also occurs in tree-like random graphs with non-uniform degree distributions
P
(
k
)
{\displaystyle P(k)}
. The degree distribution does not define a graph uniquely. However, under the assumption that in all respects other than their degree distribution, the graphs are treated as entirely random, many results on finite/infinite-component sizes are known. In this model, the existence of the giant component depends only on the first two (mixed) moments of the degree distribution. Let a randomly chosen vertex have degree
k
{\displaystyle k}
, then the giant component exists[3] if and only if
⟨
k
2
⟩
−
2
⟨
k
⟩
>
0.
{\displaystyle \langle k^{2}\rangle -2\langle k\rangle >0.}
This is known as the Molloy and Reed condition.[4] The first moment of
P
(
k
)
{\displaystyle P(k)}
is the mean degree of the network. In general, the
n
{\displaystyle n}
-th moment is defined as
⟨
k
n
⟩
=
E
[
k
n
]
=
∑
k
n
P
(
k
)
{\displaystyle \langle k^{n}\rangle =\mathbb {E} [k^{n}]=\sum k^{n}P(k)}
.
When there is no giant component, the expected size of the small component can also be determined by the first and second moments and it is
1
+
⟨
k
⟩
2
2
⟨
k
⟩
+
⟨
k
2
⟩
.
{\displaystyle 1+{\frac {\langle k\rangle ^{2}}{2\langle k\rangle +\langle k^{2}\rangle }}.}
However, when there is a giant component, the size of the giant component is more tricky to evaluate.[2]
Criteria for giant component existence in directed and undirected configuration graphs
Similar expressions are also valid for directed graphs, in which case the degree distribution is two-dimensional.[5] There are three types of connected components in directed graphs. For a randomly chosen vertex:
- out-component is a set of vertices that can be reached by recursively following all out-edges forward;
- in-component is a set of vertices that can be reached by recursively following all in-edges backward;
- weak component is a set of vertices that can be reached by recursively following all edges regardless of their direction.
Let a randomly chosen vertex has
k
in
{\displaystyle k_{\text{in}}}
in-edges and
k
out
{\displaystyle k_{\text{out}}}
out edges. By definition, the average number of in- and out-edges coincides so that
c
=
E
[
k
in
]
=
E
[
k
out
]
{\displaystyle c=\mathbb {E} [k_{\text{in}}]=\mathbb {E} [k_{\text{out}}]}
. If
G
0
(
x
)
=
∑
k
P
(
k
)
x
k
{\displaystyle G_{0}(x)=\textstyle \sum _{k}\displaystyle P(k)x^{k}}
is the generating function of the degree distribution
P
(
k
)
{\displaystyle P(k)}
for an undirected network, then
G
1
(
x
)
{\displaystyle G_{1}(x)}
can be defined as
G
1
(
x
)
=
∑
k
k
⟨
k
⟩
P
(
k
)
x
k
−
1
{\displaystyle G_{1}(x)=\textstyle \sum _{k}\displaystyle {\frac {k}{\langle k\rangle }}P(k)x^{k-1}}
. For directed networks, generating function assigned to the joint probability distribution
P
(
k
i
n
,
k
o
u
t
)
{\displaystyle P(k_{in},k_{out})}
can be written with two valuables
x
{\displaystyle x}
and
y
{\displaystyle y}
as:
G
(
x
,
y
)
=
∑
k
i
n
,
k
o
u
t
P
(
k
i
n
,
k
o
u
t
)
x
k
i
n
y
k
o
u
t
{\displaystyle {\mathcal {G}}(x,y)=\sum _{k_{in},k_{out}}\displaystyle P({k_{in},k_{out}})x^{k_{in}}y^{k_{out}}}
, then one can define
g
(
x
)
=
1
c
∂
G
∂
x
|
y
=
1
{\displaystyle g(x)={\frac {1}{c}}{\partial {\mathcal {G}} \over \partial x}\vert _{y=1}}
and
f
(
y
)
=
1
c
∂
G
∂
y
|
x
=
1
{\displaystyle f(y)={\frac {1}{c}}{\partial {\mathcal {G}} \over \partial y}\vert _{x=1}}
.
The criteria for giant component existence in directed and undirected random graphs are given in the table below:
| Type | Criteria |
|---|---|
| undirected: giant component |
E
[
k
2
]
−
2
E
[
k
]
>
0
{\displaystyle \mathbb {E} [k^{2}]-2\mathbb {E} [k]>0}
|
| directed: giant in/out-component |
E
[
k
in
k
o
u
t
]
−
E
[
k
in
]
>
0
{\displaystyle \mathbb {E} [k_{\text{in}}k_{out}]-\mathbb {E} [k_{\text{in}}]>0}
|
| directed: giant weak component |
2
E
[
k
in
]
E
[
k
in
k
out
]
−
E
[
k
in
]
E
[
k
out
2
]
−
E
[
k
in
]
E
[
k
in
2
]
+
E
[
k
in
2
]
E
[
k
out
2
]
−
E
[
k
in
k
out
]
2
>
0
{\displaystyle 2\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{in}}k_{\text{out}}]-\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{out}}^{2}]-\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{in}}^{2}]+\mathbb {E} [k_{\text{in}}^{2}]\mathbb {E} [k_{\text{out}}^{2}]-\mathbb {E} [k_{\text{in}}k_{\text{out}}]^{2}>0}
|
See also
- Complex network – Network with non-trivial topological features
- Erdős–Rényi model – Two closely related models for generating random graphs
- Fractals – Infinitely detailed mathematical structurePages displaying short descriptions of redirect targets
- Graph theory – Area of discrete mathematics
- Interdependent networks – Subfield of network science
- Network science – Academic field
- Percolation theory – Mathematical theory on behavior of connected clusters in a random graph
- Percolation – Filtration of fluids through porous materials
- Scale-free network – Network whose degree distribution follows a power law
References
- Bollobás, Béla (2001), "6. The Evolution of Random Graphs—The Giant Component", Random Graphs, Cambridge studies in advanced mathematics, vol. 73 (2nd ed.), Cambridge University Press, pp. 130–159, ISBN 978-0-521-79722-1.
- Newman, M. E. J. (2010). Networks: an introduction. New York: Oxford University Press. OCLC 456837194.
- Molloy, Michael; Reed, Bruce (1995). "A critical point for random graphs with a given degree sequence". Random Structures & Algorithms. 6 (2–3): 161–180. doi:10.1002/rsa.3240060204. ISSN 1042-9832.
- Molloy, Michael; Reed, Bruce (March 1995). "A critical point for random graphs with a given degree sequence". Random Structures & Algorithms. 6 (2–3): 161–180. doi:10.1002/rsa.3240060204. ISSN 1042-9832.
- Newman, M. E. J.; Strogatz, S. H.; Watts, D. J. (2001-07-24). "Random graphs with arbitrary degree distributions and their applications". Physical Review E. 64 (2) 026118. arXiv:cond-mat/0007235. Bibcode:2001PhRvE..64b6118N. doi:10.1103/physreve.64.026118. ISSN 1063-651X. PMID 11497662.
- Kryven, Ivan (2016-07-27). "Emergence of the giant weak component in directed random graphs with arbitrary degree distributions". Physical Review E. 94 (1) 012315. arXiv:1607.03793. Bibcode:2016PhRvE..94a2315K. doi:10.1103/physreve.94.012315. ISSN 2470-0045. PMID 27575156. S2CID 206251373.