Slide 4
Slide 4 text
Overlapping Communities 4
These networks possess rich metadata that al
University Home and work
Family
Buildings in same
neighborhood
a b
c
Figure 1 | Overlapping communities lead to dense n
the discovery of a single node hierarchy. a, Local s
networks is simple: an individual node sees the com
b, Complex global structure emerges when every no
displayed in a. c, Pervasive overlap hinders the disc
organization because nodes cannot occupy multiple
dendrogram, preventing a single tree from encodin
d, e, An example showing link communities (colours
matrix (e; darker entries show more similar pairs o
dendrogram (e). f, Link communities from the full w
around the word ‘Newton’. Link colours represent
regions provide a guide for the eye. Link communit
related to science and allow substantial overlap. No
produced by experiment participants during free w
LETTERS NATURE|
kinship collaboration
friendship
school
Y.-Y. Ahn et al., Nature, 466, 761 (2010)
Real networks often have “community” structure.
- community = “densely connected components”
- extensively studied topic in network science
S. Fortunato / Physics Reports 486 (2010) 75–174
Fig. 1. A simple graph with three communities, enclosed by the dashed circles. Reprinted figure with permi
© 2009, by Springer.
structure [12], or clustering, and is the topic of this review (for earlier reviews see Refs. [13–17]).
clusters or modules, are groups of vertices which probably share common properties and/or pla
graph. In Fig. 1 a schematic example of a graph with communities is shown.
Society offers a wide variety of possible group organizations: families, working and friendsh
nations. The diffusion of Internet has also led to the creation of virtual groups, that live on the Web
Indeed, social communities have been studied for a long time [18–21]. Communities also occur in
from biology, computer science, engineering, economics, politics, etc. In protein–protein interactio
are likely to group proteins having the same specific function within the cell [22–24], in the grap
they may correspond to groups of pages dealing with the same or related topics [25,26], in metabo
related to functional modules such as cycles and pathways [27,28], in food webs they may ident
and so on.
Communities can have concrete applications. Clustering Web clients who have similar interes
near to each other may improve the performance of services provided on the World Wide Web, in
could be served by a dedicated mirror server [31]. Identifying clusters of customers with simila
of purchase relationships between customers and products of online retailers (like, e.g., www.a
to set up efficient recommendation systems [32], that better guide customers through the list o
enhance the business opportunities. Clusters of large graphs can be used to create data structu
store the graph data and to handle navigational queries, like path searches [33,34]. Ad hoc network
networks formed by communication nodes acting in the same region and rapidly changing (beca
instance), usually have no centrally maintained routing tables that specify how nodes have to com
Grouping the nodes into clusters enables one to generate compact routing tables while the cho
paths is still efficient [36].
Community detection is important for other reasons, too. Identifying modules and their
classification of vertices, according to their structural position in the modules. So, vertices with
clusters, i.e. sharing a large number of edges with the other group partners, may have an imp
and stability within the group; vertices lying at the boundaries between modules play an import
Usually they are “overlapping”
“Link community” detection method
in. Notable previous work removed currency metabolites before
identifying meaningful community structure. The statistics presented
here match current knowledge about the two systems, further con-
firming the communities’ relevance.
Having established that link communities at the maximal partition
density are meaningful and relevant, we now show that the link
dendrogram reveals meaningful communities at different scales.
Figure 4a–c shows that mobile phone users in a community are
spatially co-located. Figure 4a maps the most likely geographic loca-
tions of all users in the network; several cities are present. In Fig. 4b,
we show (insets) several communities at different cuts above the
optimum threshold, revealing small, intra-city communities. Below
the optimum threshold, larger, yet still spatially correlated, com-
munities exist (Fig. 4c). Because we expect a tight-knit community
to have only small geographical dispersion, the clustered structures
on the map indicate that the communities are meaningful. The geo-
graphical correlation of each community does not suddenly break
down, but is sustained over a wide range of thresholds. In Fig. 4d, we
look more closely at the social network of the largest community in
Fig. 4c, extracting the structure of its largest subcommunity along
with its remaining hierarchy and revealing the small-scale structures
encoded in the link dendrogram. This example provides evidence for
the presence of spatial, hierarchical organization at a societal scale. To
validate the hierarchical organization of communities quantitatively
sented in Supplementary Information, section 7.
Many cutting-edge networks are far from complete. For example,
an ambitious project to map all protein–protein interactions in yeast
is currently estimated to detect approximately 20% of connections14.
As the rate of data collection continues to increase, networks become
unities
106
105
104
of users
103
102
101
106
105
100
101 102 103
Number of communities
Number of metabolites per community
103
102
101
100
0 50 100 150 200
Number of
metabolites
Number of communities
per metabolite
Metabolic
H
2
O, H+
ATP
ADP
P
i
Threshold, t = 0.20
t =
0.24
t = 0.27
t = 0.27
50 km
a
0.4
D
0.6 0.8 1
d Largest community
Largest
subcommunity
Remaining
hierarchy
t
e
b
c
Word association
Metabolic
0.8
1
Phone
Largest
community
Second
largest
Third largest