Clustering coefficient

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
34,200pages on
this wiki

Duncan J. Watts and Steven Strogatz (1998) introduced the clustering coefficient1 graph measure to determine whether or not a graph is a small-world network.

First, let us define a graph in terms of a set of $n$ vertices $V={v_1,v_2,...v_n}$ and a set of edges $E$, where $e_{ij}$ denotes an edge between vertices $v_i$ and $v_j$. Below we assume $v_i$, $v_j$ and $v_k$ are members of V.

We define the neighbourhood N for a vertex $v_i$ as its immediately connected neighbours as follows:

$N_i = \{v_j\} : e_{ij} \in E.$

The degree $k_i$ of a vertex is the number of vertices, $|N_i|$, in its neighbourhood $N_i$.

The clustering coefficient $C_i$ for a vertex $v_i$ is the proportion of links between the vertices within its neighbourhood divided by the number of links that could possibly exist between them. For a directed graph, $e_{ij}$ is distinct from $e_{ji}$, and therefore for each neighbourhood $N_i$ there are $k_i(k_i-1)$ links that could exist among the vertices within the neighbourhood. Thus, the clustering coefficient is given as:

$C_i = \frac{|\{e_{jk}\}|}{k_i(k_i-1)} : v_j,v_k \in N_i, e_{jk} \in E.$

An undirected graph has the property that $e_{ij}$ and $e_{ji}$ are considered identical. Therefore, if a vertex $v_i$ has $k_i$ neighbours, $\frac{k_i(k_i-1)}{2}$ edges could exist among the vertices within the neighbourhood. Thus, the clustering coefficient for undirected graphs can be defined as:

$C_i = \frac{2|\{e_{jk}\}|}{k_i(k_i-1)} : v_j,v_k \in N_i, e_{ij} \in E.$

Let $\lambda_G(v)$ be the number of triangles on $v \in V(G)$ for undirected graph $G$. That is, $\lambda_G(v)$ is the number of subgraphs of $G$ with 3 edges and 3 vertices, one of which is $v$. Let $\tau_G(v)$ be the number of triples on $v \in G$. That is, $\tau_G(v)$ is the number of subgraphs (not necessarily induced) with 2 edges and 3 vertices, one of which is $v$ and such that $v$ is incident to both edges. Then we can also define the clustering coefficient as:

$C_i = \frac{\lambda_G(v)}{\tau_G(v)}$

It is simple to show that the two preceding definitions are the same, since $\tau_G(v) = C({k_i},2) = \frac{1}{2}k_i(k_i-1)$.

These measures are 1 if every neighbour connected to $v_i$ is also connected to every other vertex within the neighbourhood, and 0 if no vertex that is connected to $v_i$ connects to any other vertex that is connected to $v_i$.

The clustering coefficient for the whole system is given by Watts and Strogatz as the average of the clustering coefficient for each vertex:

$\overline{C} = \frac{1}{n}\sum_{i=1}^{n} C_i.$

ReferencesEdit

1 Watts, D. J. and Strogatz, S. H. "Collective dynamics of 'small-world' networks." [1]

de:Clusterkoeffizient
he:מקדם התקבצות