Empirical comparison of network sampling techniques

Blagus, Neli; Šubelj, Lovro; Bajec, Marko

arXiv:1506.02449 (cs)

[Submitted on 8 Jun 2015 (v1), last revised 9 Jun 2015 (this version, v2)]

Abstract: In the past few years, the storage and analysis of large-scale and fast-evolving networks have presented a great challenge. Therefore, a number of different techniques have been proposed for sampling large networks. In general, network exploration techniques approximate the original networks more accurately than random node and link selection. Yet, link selection with an additional subgraph induction step outperforms most other techniques. In this paper, we also apply subgraph induction to random walk and forest-fire sampling. We analyze different real-world networks and the changes in their properties introduced by sampling. We compare several sampling techniques based on the match between the original networks and their sampled variants. The results reveal that the techniques with subgraph induction underestimate the degree and clustering distributions, while they overestimate the average degree and density of the original networks. Techniques without the subgraph induction step exhibit exactly the opposite behavior. Hence, the performance of random selection techniques does not differ significantly from that of network exploration techniques, while clear differences exist between the techniques with a subgraph induction step and those without it.
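
To make the subgraph induction step in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names (`random_link_sample`, `random_walk_sample`, `induce`), the toy random graph, and the sample sizes are all assumptions chosen for illustration, and forest-fire sampling is left out for brevity. Induction is taken here to mean keeping every original link whose two endpoints both ended up in the sample.

```python
import random


def make_toy_graph(n_nodes, n_edges, rng):
    """Hypothetical test network: n_edges distinct undirected links chosen at random."""
    edges = set()
    while len(edges) < n_edges:
        u, v = rng.sample(range(n_nodes), 2)
        edges.add((min(u, v), max(u, v)))  # canonical form: smaller id first
    return edges


def random_link_sample(edges, n_links, rng):
    """Random link selection: pick links uniformly; the sample holds their endpoints."""
    chosen = set(rng.sample(sorted(edges), n_links))
    nodes = {u for edge in chosen for u in edge}
    return nodes, chosen


def random_walk_sample(edges, n_nodes, rng, max_steps=100_000):
    """Simple random walk: collect visited nodes and the links actually traversed."""
    neighbours = {}
    for u, v in sorted(edges):
        neighbours.setdefault(u, []).append(v)
        neighbours.setdefault(v, []).append(u)
    current = rng.choice(sorted(neighbours))
    nodes, walked = {current}, set()
    for _ in range(max_steps):  # step cap guards against small components
        if len(nodes) >= n_nodes:
            break
        nxt = rng.choice(neighbours[current])
        walked.add((min(current, nxt), max(current, nxt)))
        nodes.add(nxt)
        current = nxt
    return nodes, walked


def induce(edges, nodes):
    """Subgraph induction: keep every original link with both endpoints sampled."""
    return {(u, v) for (u, v) in edges if u in nodes and v in nodes}


def average_degree(nodes, edges):
    return 2 * len(edges) / len(nodes) if nodes else 0.0


def density(nodes, edges):
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1)) if n > 1 else 0.0


if __name__ == "__main__":
    rng = random.Random(42)
    graph = make_toy_graph(200, 800, rng)
    samplers = {
        "random link": lambda: random_link_sample(graph, 60, rng),
        "random walk": lambda: random_walk_sample(graph, 60, rng),
    }
    for name, sampler in samplers.items():
        nodes, sampled = sampler()
        induced = induce(graph, nodes)
        print(name,
              "| avg degree:", round(average_degree(nodes, sampled), 2),
              "->", round(average_degree(nodes, induced), 2),
              "| density:", round(density(nodes, sampled), 4),
              "->", round(density(nodes, induced), 4))
```

On a fixed set of sampled nodes, induction can only add links, so the average degree and density of the induced sample are never lower than those of the plain sample. Whether either variant over- or underestimates the original network is the empirical question the paper studies, and this toy graph does not settle it.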

Subjects:	Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)
Cite as:	arXiv:1506.02449 [cs.SI]
	(or arXiv:1506.02449v2 [cs.SI] for this version)
	https://doi.org/10.48550/arXiv.1506.02449

Submission history

From: Neli Blagus
[v1] Mon, 8 Jun 2015 11:39:13 UTC (304 KB)
[v2] Tue, 9 Jun 2015 13:45:27 UTC (304 KB)
