Thư viện tri thức trực tuyến
Kho tài liệu với 50,000+ tài liệu học thuật
© 2023 Siêu thị PDF - Kho tài liệu học thuật hàng đầu Việt Nam

Advances in network science
Nội dung xem thử
Mô tả chi tiết
Adam Wierzbicki · Ulrik Brandes
Frank Schweitzer · Dino Pedreschi (Eds.)
123
LNCS 9564
12th International Conference and School, NetSci-X 2016
Wroclaw, Poland, January 11-13, 2016
Proceedings
Advances in
Network Science
Lecture Notes in Computer Science 9564
Commenced Publication in 1973
Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison
Lancaster University, Lancaster, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Friedemann Mattern
ETH Zurich, Zürich, Switzerland
John C. Mitchell
Stanford University, Stanford, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max Planck Institute for Informatics, Saarbrücken, Germany
More information about this series at http://www.springer.com/series/7409
Adam Wierzbicki • Ulrik Brandes
Frank Schweitzer • Dino Pedreschi (Eds.)
Advances in
Network Science
12th International Conference and School, NetSci-X 2016
Wroclaw, Poland, January 11–13, 2016
Proceedings
123
Editors
Adam Wierzbicki
Polish-Japanese Academy
Warzaw
Poland
Ulrik Brandes
Universität Konstanz
Konstanz, Baden-Württemberg
Germany
Frank Schweitzer
ETH Zürich
Zürich
Switzerland
Dino Pedreschi
University of Pisa
Pisa
Italy
ISSN 0302-9743 ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-319-28360-9 ISBN 978-3-319-28361-6 (eBook)
DOI 10.1007/978-3-319-28361-6
Library of Congress Control Number: 2015958855
LNCS Sublibrary: SL3 – Information Systems and Applications, incl. Internet/Web, and HCI
© Springer International Publishing Switzerland 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, express or implied, with respect to the material contained herein or for any errors or
omissions that may have been made.
Printed on acid-free paper
This Springer imprint is published by SpringerNature
The registered company is Springer International Publishing AG Switzerland
Preface
Network science is an emerging discipline concerned with the study of network models
in domains ranging from biology and physics to computer science, from financial
markets to cultural integration, and from social media to infectious diseases. It is also
an essential tool in the understanding of many kinds of big data, leading to numerous
practical applications. Network models help researchers and practitioners make sense
of an increasingly complex world, especially regarding social phenomena mediated
through information technology. This volume contains several contributions to research
in the area of network science, selected from the best submissions to the NetSci-X 2016
conference. The conference acceptance rate for full papers was 20 %. The International
Conference and School of Network Science (NetSci) is an interdisciplinary event,
gathering all researchers interested in network science. After 11 editions, the conference is the largest and best known event in the area. Published for the first time in the
Lecture Notes in Computer Science series, the volume preserves the interdisciplinary
character of network science, while emphasizing its connection to computer science.
Works of researchers of various backgrounds, such as the social sciences, biology,
economics, and computer science, unite in the aim for a better understanding of
complex networks. The development of better models of complex phenomena, such as
complex networks, is in itself an important contribution to computer science. The use
of such computational models can enhance existing information technology, as well as
expand the scope of applications of information technology into new areas. For this
reason, the study of network science can be beneficial to computer scientists, and
advances in network science can be considered as advances in computer science.
November 2016 Adam Wierzbicki
Ulrik Brandes
Frank Schweitzer
Dino Pedreschi
Organization
NetSci-X 2016 was organized by the Department of Computational Intelligence,
Faculty of Computer Science and Management, Wrocław University of Technology.
Executive Committee
General Chairs
Przemysław Kazienko Wrocław University of Technology, Poland
Bolesław Szymański Rensselaer Polytechnic Institute, USA
Steering Committee
Albert-László Barabási Northeastern University, USA
Raissa D’Souza University of California, USA
Ronaldo Menezes Florida Institute of Technology, USA
Program Chairs
János Kertész Central European University, Hungary
Renaud Lambiotte University of Namur, Belgium
Maxi San Miguel University of the Balearic Islands, Spain
Ulrik Brandes University of Konstanz, Germany
Dino Pedreschi University of Pisa, Italy
Frank Schweitzer ETH Zurich, Switzerland
Adam Wierzbicki Polish-Japanese Institute for Information Technology,
Poland
Tao Jia Southwest University, China
Michael Mäs University of Groningen, The Netherlands
Radosław Michalski Wrocław University of Technology, Poland
School and Tools Chair
Mikołaj Morzy Poznań University of Technology, Poland
Exhibition Chair
Ronaldo Menezes Florida Institute of Technology, USA
Executive Chair
Piotr Bródka Wrocław University of Technology, Poland
Publicity Chair
Radosław Nielek Polish-Japanese Institute of Information Technology,
Poland
Publication Chair
Paulina Adamska Polish-Japanese Institute of Information Technology,
Poland
Organizing Committee
Tomasz Kajdanowicz Wrocław University of Technology, Poland
Radosław Michalski Wrocław University of Technology, Poland
Program Committee
Maria Bielikova Slovak University of Technology in Bratislava,
Slovakia
Mingming Chen Rensselaer Polytechnic Institute, RPI, USA
Freddy Chong Tat Chua HP Labs
Michele Coscia National Research Council, Pisa, Italy
Ernesto Damiani University of Milan, Italy
Pasquale De Meo VU University, Amsterdam, The Netherlands
Schahram Dustdar TU Wien, Austria
Elena Ferrari University of Insubria, Italy
David Garcia ETH Zurich, Switzerland
Jarosław Jankowski West Pomeranian University of Technology, Poland
Mark Jelasity University of Szeged, Hungary
Jarosław Kozlak AGH University of Science and Technology, Poland
Konstantin Kuzmin Rensselaer Polytechnic Institute, RPI, USA
Cheng-Te Li Research Center for IT Innovation, Academia Sinica,
Taiwan
Gang Li School of Information Technology, Deakin University
Matteo Magnani Uppsala University, Sweden
Konstantinos Mersinas Royal Holloway University of London, UK
Katarzyna Musial King’s College London, UK
Stanisław Saganowski Wrocław University of Technology, Poland
Arun Sen Arizona State University, USA
Rajesh Sharma University of Bologna, Italy
Vaclav Snasel VSB-Technical University of Ostrava, Czech Republic
Toyohide Watanabe Nagoya Industrial Science Research Institute, Japan
Katarzyna Wegrzyn-Wolska ESIGETEL
Katharina Anna Zweig University of Technology and Science Kaiserslautern,
Germany
Anna Zygmunt AGH
Sen Wu Stanford University, USA
Michael Szell Northeastern University, USA
Jeff Johnson Open University, UK
Chenliang Li Wuhan University, China
Jari Saramäki Aalto University, Finland
VIII Organization
Sven Kosub Universität Konstanz, Germany
Tim Evans Imperial College London, UK
Bettina Berendt K.U. Leuven, Belgium
Kristian Kersting TU Dortmund University, Germany
Ingo Scholtes ETH Zurich, Switzerland
John Yen The Pennsylvania State University, USA
Sergio Gomez Universitat Rovira i Virgili, Spain
Matthieu Latapy CNRS, France
Derek Greene University College Dublin, Ireland
Marton Karsai ENS de Lyon, France
Ginestra Bianconi Queen Mary University, UK
Yamir Moreno Universidad de Zaragoza, Spain
Javier Borge-Holthoefer QCRI - Qatar Computing Research Institute, Qatar
Bernie Hogan University of Oxford, UK
Szymon Jaroszewicz Polish Academy of Sciences, Poland
Marek Kopel Wrocław University of Technology, Poland
Additional Reviewers
M. Ortmann
S. Tavassoli
M. Kleinbauer
M. Bockholt
S. Liu
E. De Panafieu
J. Lerner
M. Abufouda
Sponsoring Institutions
This conference was organized within the European Union’s Seventh Framework
Programme for research, technological development and demonstration under grant
agreement no. 316097 [ENGINE].
Organization IX
Contents
Quad Census Computation: Simple, Efficient, and Orbit-Aware . . . . . . . . . . 1
Mark Ortmann and Ulrik Brandes
Posting Topics 6¼ Reading Topics: On Discovering Posting and Reading
Topics in Social Media . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Wei Gong, Ee-Peng Lim, and Feida Zhu
Going Beyond GDP to Nowcast Well-Being Using Retail Market Data . . . . . 29
Riccardo Guidotti, Michele Coscia, Dino Pedreschi,
and Diego Pennacchioli
To Trust or Not to Trust Lurkers?: Evaluation of Lurking and
Trustworthiness in Ranking Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Roberto Interdonato and Andrea Tagarelli
Modelling Trend Progression Through an Extension of the Polya Urn
Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Marijn ten Thij and Sandjai Bhulai
Exploiting Content Quality and Question Difficulty in CQA Reputation
Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Adrian Huna, Ivan Srba, and Maria Bielikova
Analysis of Co-authorship Ego Networks . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Valerio Arnaboldi, Robin I.M. Dunbar, Andrea Passarella,
and Marco Conti
Studying the Role of Diversity in Open Collaboration Network:
Experiments on Wikipedia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Katarzyna Baraniak, Marcin Sydow, Jacek Szejda,
and Dominika Czerniawska
On the Evaluation Potential of Quality Functions in Community Detection
for Different Contexts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Jean Creusefond, Thomas Largillier, and Sylvain Peyronnet
Predicting User Participation in Social Media . . . . . . . . . . . . . . . . . . . . . . . 126
Fredrik Erlandsson, Anton Borg, Henric Johnson, and Piotr Bródka
Computing Information Integration in Brain Networks. . . . . . . . . . . . . . . . . 136
Xerxes D. Arsiwalla and Paul Verschure
Embracing n-ary Relations in Network Science. . . . . . . . . . . . . . . . . . . . . . 147
Jeffrey H. Johnson
Studying Successful Public Plazas in the City of Murcia (Spain) Using a
Ranking Nodes Algorithm and Foursquare Data . . . . . . . . . . . . . . . . . . . . . 161
Taras Agryzkov, Pablo Martí, Almudena Nolasco-Cirugeda,
Leticia Serrano-Estrada, Leandro Tortosa, and José F. Vicent
A Comparison of Fundamental Network Formation Principles Between
Offline and Online Friends on Twitter . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Felicia Natali and Feida Zhu
Social Ties as Predictors of Economic Development . . . . . . . . . . . . . . . . . . 178
Buster O. Holzbauer, Boleslaw K. Szymanski, Tommy Nguyen,
and Alex Pentland
Large Scale Graph Representations for Subgraph Census . . . . . . . . . . . . . . . 186
Pedro Paredes and Pedro Ribeiro
An Exploration of Fetish Social Networks and Communities . . . . . . . . . . . . 195
Damien Fay, Hamed Haddadi, Michael C. Seto, Han Wang,
and Christoph Kling
Synergy Landscapes: A Multilayer Network for Collaboration in Biological
Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Konstantin Kuzmin, Christopher Gaiteri, and Boleslaw K. Szymanski
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
XII Contents
Quad Census Computation: Simple, Efficient,
and Orbit-Aware
Mark Ortmann(B) and Ulrik Brandes(B)
Department of Computer and Information Science,
University of Konstanz, Konstanz, Germany
{Mark.Ortmann,Ulrik.Brandes}@uni-konstanz.de
Abstract. The prevalence of select substructures is an indicator of network effects in applications such as social network analysis and systems
biology. Moreover, subgraph statistics are pervasive in stochastic network
models, and they need to be assessed repeatedly in MCMC sampling and
estimation algorithms. We present a new approach to count all induced
and non-induced 4-node subgraphs (the quad census) on a per-node and
per-edge basis, complete with a separation into their non-automorphic
roles in these subgraphs. It is the first approach to do so in a unified
manner, and is based on only a clique-listing subroutine. Computational
experiments indicate that, despite its simplicity, the approach outperforms previous, less general approaches.
1 Introduction
The F-census of a graph is the frequency distribution of subgraphs from a family
F of non-isomorphic graphs in an input graph. In this work we focus on four
node subgraphs, i.e. quads.
Discrimination of graphs by a subgraph census was proposed already by
Holland and Leinhardt [7,8] in the context of social networks and it is of utmost
importance for the effects of exponential random graph models [20]. While there
is extensive work on determining the subgraph census for varying subgraph
sizes [9,10,12] and also for directed graphs [4], the focus is almost always on
the global distribution, i.e., say, the number of triangles a graph contains but
not how often a given node is part of a triangle. However, for many properties
describing nodes and edges, respectively, it is necessary to know the subgraph
census on a node or edge level basis. For example to calculate a node’s clustering coefficient we need to know in how many triangles it is contained. The same
holds for the Jaccard index computed with respect to an edge. Although, for
these two examples it is not necessary to calculate the frequencies of all nonisomorphic induced 3-node subgraphs, the triad census, there exist edge weights
that take different subgraph configurations into account [1] and the running
time for most edge metrics [14] is dominated by calculating the frequencies of
particular subgraphs.
We gratefully acknowledge financial support from Deutsche Forschungsgemeinschaft
under grant Br 2158/11-1.
c Springer International Publishing Switzerland 2016
A. Wierzbicki et al. (Eds.): NetSci-X 2016, LNCS 9564, pp. 1–13, 2016.
DOI: 10.1007/978-3-319-28361-6 1
2 M. Ortmann and U. Brandes
While k-subgraph censuses specific for nodes and edges are not used widely
in social network analysis, this is different already for bioinformatics. So far,
however, even here the use is restricted to connected k-node subgraphs, so called
graphlets [19] or motifs [17].
A further differentiation of subgraph censuses consist in the distinction of
node and edge automorphism classes (orbits) in each graphlet. For example,
in a diamond (i.e. a complete 4-node graph with one missing edge), there are
two node and edge orbits, see Fig. 1. The node orbits are defined by the nodes
with degree 2, and those with degree 3, respectively. The edge orbits are determined by the edge connecting the nodes with degree 3, and all remaining edges,
respectively. Milenkovi´c and Prˇzulj [16] and Solova et al. [21] utilize the characterization of each node and edge respectively by it’s orbit-aware connected
subgraph census, which they call graphlet degree vector, for clustering purposes.
Due to the importance of subgraph enumeration and censuses in bioinformatics, various computational methods [6,13,15,23] were proposed.
The general approach to determine a subgraph census on the global level is to
solve a system of equations that relates the non-induced subgraph frequency of
each non-isomorphic k-node subgraph with the number of occurrences in other
k-node subgraphs [4,5,9,10,12]. It is known that the time needed to solve the
system of equations for the 4-node subgraph census, which we refer to as the
quad census, on a global level is O(a(G)m + i(G)) [12], where i(G) is the time
needed to calculate the frequency of a given 4-node induced subgraph in G,
and a(G) is the arboricity, i.e., the minimum number of spanning forest needed
to cover E. Following the idea of relating non-induced and induced subgraph
counts, Marcus and Shavitt [13] present a system of equations for the orbitaware connected quad census on a node level that runs in O(Δ(G)m + m2) time
with Δ(G) denoting the maximum degree of G. Because of the larger number of
algorithms invoked by Marcus and Shavitt’s approach, Hoˇcevar and Demˇsar [6]
present a different system of equations, again restricted to connected quads, that
requires fewer counting algorithms and runs in O(Δ(G)2m) time, but does not
determine the non-induced counts.
Contribution: We present the first algorithm to count both induced and noninduced occurrences of all 4-node subgraphs (quads). It is based on a fast algorithm for listing a single quad type and capable of distinguishing the various
roles (orbits) of nodes and edges. While this simplifies and generalizes previous
approaches, our experimental evaluation indicates that it is also more efficient.
In the following section we provide basic notation followed by an introduction
of the system of equations and the algorithm utilized in Sect. 3. In Sect. 4 we
present a running time comparison on real world and synthetic graphs showing
that our approach is more efficient than related methods. We conclude in Sect. 5.
Quad Census Computation: Simple, Efficient, and Orbit-Aware 3
2 Preliminaries
We consider finite simple undirected graphs G = (V,E) and denote the number
of nodes by n = n(G) = |V | and the number of edges by m = m(G) = |E|.
The neighborhood of a node v ∈ V is the set N(v) = {w : {v, w} ∈ E} of
all adjacent nodes, its cardinality d(v) = |N(v)| is called the degree of v, and
Δ(G) = maxv∈V {d(v)} denotes the maximum degree of G.
For finite simple directed graphs G = (V,E) we denote the outgoing neighborhood of a node v ∈ V by N +(v) = {w : (v, w) ∈ E}. The incoming neighborhood
N −(v) is defined analogously.
A complete graph with k nodes is called Kk and K3 is also called a triangle.
We use T(u) = {
N(u)
2
∩E} to refer to the set of node pairs completing a triangle
with u and T({u, v}) = N(u) ∩ N(v) for the set of nodes completing a triangle
with the edge e = {u, v}. For the cardinality of these sets we write t(u) = |T(u)|
and t(e) = |T(e)|. A triad or a quad are any graphs on exactly three or four
nodes, respectively.
A subgraph G = (V
, E
) of G = (V,E), V ⊆ V , is called (node-)induced, if
E = V
2
∩ E, and it is called non-induced, if E ⊆ V
2
∩ E.
Two undirected graphs G and G are said to be isomorphic, if and only if
there exists a bijection π : V (G) → V (G
) such that {v, w} ∈ E(G) ⇐⇒
{π(v), π(w)} ∈ E(G
). Each permutation, including identity, of the node set V ,
such that the resulting graph is isomorphic to G is called an automorphism and
the groups formed by the set of automorphisms is denoted automorphism class
or orbit.
3 Determining the Orbit-Aware Quad Census
The k-node subgraph census is usually computed via a system of linear equations
relating the non-induced and induced k-subgraph frequencies, as the non-induced
frequencies are easier to compute. Lin et al. [12] show that for k = 4 all noninduced frequencies, except for K4, can be computed in O(a(G)m) time. This
implies that the total running time to calculate the quad census at the level
of the entire graph is in O(a(G)m + i(G)), where i(G) is the time needed to
compute the induced frequencies for any induced quad.
The approach of Lin et al., however, is not suitable to answer questions
such as how often a node or an edge is contained in a K4. Furthermore, the
automorphism class of the node/edge in the quad is sometimes of interest. All
non-isomorphic graphs with four nodes are shown in Fig. 1 and the node/edge
labels refer to their automorphism classes (orbits). For example in a diamond
all edges of the C4 belong to the same orbit while the diagonal edge belongs to
another. Analogously the orbits of the nodes can be distinguished.
As our approach also relies on relating non-induced and induced frequencies
we will start by presenting how the non-induced frequencies for a node/edge in
a given orbit relate to the induced counts. Thereafter, we will present formulas
to compute the respective non-induced frequencies and prove that our approach
4 M. Ortmann and U. Brandes
0
0
0
0
co-K4
1
2
1
2
0
co-diamond
3
3
3
3
1
1
co-C4
5
4
6
5
2
2
co-paw
7
7
8
7
3 3
3
co-claw
9
9
10
10
5
4
4
P4
11
12
12
12
6
6
6
claw
13
14
15
14
8
7
8
9
paw
16
16
16
16
10
10
10
10
C4
17
18
18
17
11
11
11
11
12
diamond
19
19
19
19
13
13
13
13
13
13
K4
Fig. 1. All non-isomorphic subgraphs with four nodes (quads). Node and edge labels refer
to the orbits and were enumerated such that each orbit is identified with a single quad
matches the running time of Lin et al., implying that it is asymptotically as fast
as the fastest algorithm to compute the frequencies on a node and edge level for
any induced quad. Note that in the following when we talk about non-induced
frequencies we exclude those of the K4, as it equals the induced frequency.
3.1 Relation of Induced and Non-induced Frequencies
To establish the relation between induced and non-induced frequencies, the number of times G is non-induced in some other graph G with the same number
of nodes has to be known. For instance, let us assume that G is a P3 and G
a K3 (co-paw and -claw without isolated node cf. Fig. 1). Having the definition
of the edge set for non-induced subgraphs in mind we see that G contains three
non-induced P3, as each edge can be removed from a K3 to create a P3. Consequently, if we know the total number of non-induced P3 and we subtract three
times the number of K3 we obtain the number of induced P3 of the input graph.
Similarly, we can establish systems of equations relating induced and noninduced frequencies on a node and edge level distinguishing the orbits for quads,
see Figs. 3 and 4. Note that both systems of equations are needed since we cannot
compute the node from the edge frequencies and vice versa, but from both we
can compute the global distribution. In the following we show the correctness
for ei10(e).
Induced Orbit 10 Edge Census. Let us assume we want to know how often edge e
is in orbit 10 or in other words part of a C4. We know that a C4 is a non-induced
subgraph of a diamond, K4 and of itself, cf. Fig. 2, and that there is no other
quad containing a non-induced C4. Let us first concentrate on the diamond. In a
diamond we have two different edge orbits; orbit 11, i.e. the edges on the C4, and
orbit 12, i.e. the diagonal edge. Figure 2 shows that for every diamond where e
is in orbit 12 there is no way to remove an edge, such that this graph becomes
a C4, but for each diamond where e is in orbit 11 we can remove the diagonal